Cancellable turns: /clear aborts an in-flight streaming response #42

Closed
opened 2026-07-06 01:39:22 +08:00 by weiwen · 1 comment
Owner

What to build

Make an in-flight pi turn cancellable, and prove it end-to-end by wiring /clear to cancel a streaming response cleanly. Today teloxide serializes updates per chat, so a /clear sent mid-stream cannot run until the response finishes, and killing the process mid-stream surfaces a spurious "crashed" error. After this slice, /clear during streaming stops the turn cleanly and resets the session.

This is the foundation for mid-stream abort-and-resend (follow-up slice). It introduces the cancel primitive and the concurrency needed to act on a chat while it is streaming.

End-to-end behavior across all layers:

  • Dispatch: drop teloxide's default per-chat distribution serialization so a second same-chat update can be handled while the first is still streaming. (Single-user bot — full concurrency is acceptable.)
  • RPC (pi): the response read loop becomes cancellable. On cancel it writes an abort command to pi (see RPC reference in CONTEXT.md), keeps reading until the aborted agent_end so the process returns to idle, and reports an Aborted outcome distinct from a normal response or a crash.
  • Session state: a per-chat "current turn" handle (cancel signal + join) guarded by a per-chat async mutex. Stop holding the sessions-map write lock for the whole turn — acquire briefly, then run the turn without holding it. Guard the idle-cleanup task so it never kills a session that has a live turn.
  • Telegram: /clear issued mid-stream routes through the cancel primitive: abort the turn, delete the in-flight placeholder/answer message(s), then recreate the session. /help and /start leave any in-flight turn untouched.
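The RPC bullet above (abort on cancel, drain to agent_end, report a distinct outcome) can be sketched roughly as follows. This is a minimal std-only illustration, not the project's code: `Event`, `TurnOutcome`, `read_turn`, and the `send_abort` callback are hypothetical names, the real pi event shapes are whatever `CONTEXT.md` defines, and a real implementation would presumably use async primitives rather than a polled flag and a blocking channel.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical stand-ins for events read from the pi process.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    TextDelta(String),
    AgentEnd,
}

/// Three distinct outcomes, so a cancel is never reported as a crash.
#[derive(Debug, PartialEq)]
pub enum TurnOutcome {
    Response(String),
    Aborted,
    Crashed,
}

/// Reads one turn. `cancel` is the cancel signal; `send_abort` stands in for
/// writing the abort command to pi (assumed to be a single write).
pub fn read_turn(
    events: &Receiver<Event>,
    cancel: &AtomicBool,
    mut send_abort: impl FnMut(),
) -> TurnOutcome {
    let mut text = String::new();
    let mut abort_sent = false;
    loop {
        // Send abort at most once, then keep reading so the process
        // reaches agent_end and returns to idle.
        if !abort_sent && cancel.load(Ordering::SeqCst) {
            send_abort();
            abort_sent = true;
        }
        match events.recv_timeout(Duration::from_millis(20)) {
            Ok(Event::TextDelta(chunk)) => {
                // Output that arrives after the abort is drained and dropped.
                if !abort_sent {
                    text.push_str(&chunk);
                }
            }
            Ok(Event::AgentEnd) => {
                return if abort_sent {
                    TurnOutcome::Aborted
                } else {
                    TurnOutcome::Response(text)
                };
            }
            // No event yet: loop around and re-check the cancel signal.
            Err(RecvTimeoutError::Timeout) => continue,
            // The event source went away before agent_end: a real crash.
            Err(RecvTimeoutError::Disconnected) => return TurnOutcome::Crashed,
        }
    }
}
```

With this shape the /clear handler only has to set the cancel signal and wait for the turn to finish. It can then delete the in-flight messages and recreate the session when the outcome is `Aborted`, and show an error only for `Crashed`.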

Acceptance criteria

  • A second same-chat message is handled while a response is still streaming (per-chat serialization no longer blocks it)
  • /clear sent while a response is streaming stops the turn cleanly with no "crashed"/error message shown to the user
  • The in-flight placeholder/answer message(s) are removed on cancel (all pages, if the answer overflowed)
  • After a mid-stream /clear, the session is reset and the next message starts a fresh session
  • Cancel path sends abort to pi and drains to agent_end; the process is reused (not killed) where a session still exists
  • The idle-cleanup task does not kill a session that has a live in-flight turn
  • The sessions-map write lock is no longer held for the duration of a turn
  • Existing tests pass; cancel/abort outcome has coverage
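The two session-state criteria (no map write lock held for the whole turn, and no idle cleanup of a session with a live turn) could look something like the sketch below. It is an assumption-laden illustration: `Session`, `Sessions`, `checkout`, and `reap_idle` are hypothetical names, and std's `RwLock` plus an `AtomicBool` stand in for the per-chat async mutex and turn handle that the issue actually calls for.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

/// Hypothetical per-chat session record.
pub struct Session {
    /// True while a turn is streaming for this chat.
    pub turn_live: AtomicBool,
    /// When the session was last used (here: when it was created).
    pub last_used: Instant,
}

pub type Sessions = RwLock<HashMap<i64, Arc<Session>>>;

/// Takes the map write lock only long enough to find or create the session.
/// The caller runs the turn on the returned `Arc`, with no map lock held.
pub fn checkout(sessions: &Sessions, chat_id: i64) -> Arc<Session> {
    let mut map = sessions.write().unwrap();
    map.entry(chat_id)
        .or_insert_with(|| {
            Arc::new(Session {
                turn_live: AtomicBool::new(false),
                last_used: Instant::now(),
            })
        })
        .clone()
    // The write guard is dropped here, before any turn starts.
}

/// Removes sessions idle for at least `max_idle`, skipping any session
/// that has a live turn. Returns how many sessions were removed.
pub fn reap_idle(sessions: &Sessions, max_idle: Duration) -> usize {
    let mut map = sessions.write().unwrap();
    let before = map.len();
    map.retain(|_, s| {
        s.turn_live.load(Ordering::SeqCst) || s.last_used.elapsed() < max_idle
    });
    before - map.len()
}
```

A turn would set `turn_live` before streaming and clear it afterwards, so the cleanup task sees the session as busy even when its idle timer has expired.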

Blocked by

None - can start immediately

Author
Owner

Implementation complete on branch sandcastle/issue-42 (commit 6c30405). No Rust toolchain is available in the sandbox, so CI must validate compilation and tests.

weiwen 2026-07-06 02:39:49 +08:00
Reference
weiwen/evie#42