fix(screentracker): make writeStabilize Phase 1 non-fatal when agents don't echo input by johnstcn · Pull Request #208 · coder/agentapi

johnstcn · 2026-03-31T11:31:17Z

Fixes #123.

Changes

Make Phase 1 (echo detection) of writeStabilize non-fatal on timeout — proceed to Phase 2 instead of returning HTTP 500
Guard non-fatal path with errors.Is(err, util.WaitTimedOut) so context cancellation still propagates
Reduce Phase 1 timeout from 15s to 2s (terminal echo is near-instant)
Extract writeStabilizeEchoTimeout and writeStabilizeProcessTimeout constants
Log at Info level (not Warn) since non-echoing agents hit this on every message
Add send-message-no-echo-agent-reacts test: agent does not echo but reacts to Enter → success
Add send-message-no-echo-no-react test: agent is unresponsive → error from Phase 2
Add send-message-no-echo-context-cancelled test: context cancellation during Phase 1 propagates as fatal (validates errors.Is guard)
Add doc comment on formatPaste in claude.go documenting the ESC limitation with TUI selection prompts

Known limitation

For TUI selection prompts (numbered/arrow-key lists), this fix eliminates the 500 but does not deliver the correct selection — the \x1b (ESC) in bracketed paste cancels the selection widget. The correct approach is MessageTypeRaw. Documented via a comment on formatPaste in lib/httpapi/claude.go.
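To make the limitation concrete, here is a minimal sketch of bracketed-paste wrapping. The `bracketedPaste` and `containsESC` helpers are hypothetical and are not the actual `formatPaste` implementation; only the standard bracketed-paste start and end markers (`ESC[200~` and `ESC[201~`) are assumed.

```go
package main

import (
	"fmt"
	"strings"
)

// Standard bracketed-paste delimiters. Both begin with ESC (0x1b).
const (
	pasteStart = "\x1b[200~"
	pasteEnd   = "\x1b[201~"
)

// bracketedPaste is a hypothetical helper that wraps msg in paste markers.
// It illustrates the idea only; the real formatPaste may differ.
func bracketedPaste(msg string) string {
	return pasteStart + msg + pasteEnd
}

// containsESC reports whether s carries an ESC byte, which a TUI
// selection widget may interpret as "cancel".
func containsESC(s string) bool {
	return strings.ContainsRune(s, '\x1b')
}

func main() {
	pasted := bracketedPaste("2")
	fmt.Printf("pasted bytes: %q, contains ESC: %v\n", pasted, containsESC(pasted))
	fmt.Printf("raw bytes:    %q, contains ESC: %v\n", "2", containsESC("2"))
}
```

Under these assumptions, a selection of `2` sent as a paste carries two ESC bytes, while the same keystroke sent unwrapped carries none, which is why a raw message type is the better fit for selection prompts.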

Also discovered a separate issue during smoke-testing: #209

Implementation plan and decision log

Root cause

writeStabilize Phase 1 assumes the screen will change after writing message text (echo detection). TUI agents using bracketed paste buffer input internally and do not render until Enter. Phase 1 waited 15s for a change that never came → timeout → HTTP 500.
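The failure mode can be reproduced with a small simulation. This is a sketch under stated assumptions: `bufferingScreen`, `waitForChange`, `simulate`, and `errWaitTimedOut` are hypothetical stand-ins invented for illustration, and the timeouts are shortened so the example runs quickly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errWaitTimedOut is a hypothetical stand-in for util.WaitTimedOut.
var errWaitTimedOut = errors.New("timeout waiting for condition")

// bufferingScreen is a hypothetical model of a TUI agent that buffers
// pasted input and only re-renders once it receives a carriage return.
type bufferingScreen struct {
	pending  string
	rendered string
}

// Write feeds input to the agent; nothing is rendered until '\r' arrives.
func (s *bufferingScreen) Write(input string) {
	for _, r := range input {
		if r == '\r' {
			s.rendered += s.pending + "\n"
			s.pending = ""
			continue
		}
		s.pending += string(r)
	}
}

// waitForChange polls until the rendered screen differs from before,
// returning errWaitTimedOut if that does not happen within timeout.
func waitForChange(s *bufferingScreen, before string, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for s.rendered == before {
		if time.Now().After(deadline) {
			return errWaitTimedOut
		}
		time.Sleep(interval)
	}
	return nil
}

// simulate writes the message text and waits for an echo (Phase 1), then
// sends a carriage return and waits for a reaction (Phase 2).
func simulate() (phase1, phase2 error) {
	s := &bufferingScreen{}
	before := s.rendered
	s.Write("hello")
	phase1 = waitForChange(s, before, 20*time.Millisecond, time.Millisecond)
	s.Write("\r")
	phase2 = waitForChange(s, before, 20*time.Millisecond, time.Millisecond)
	return phase1, phase2
}

func main() {
	p1, p2 := simulate()
	fmt.Println("phase 1:", p1)
	fmt.Println("phase 2:", p2)
}
```

In this model Phase 1 always times out because the screen cannot change before the carriage return, while Phase 2 succeeds as soon as Enter is delivered. Treating the Phase 1 timeout as fatal therefore rejects an agent that is working normally.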

Key decisions

| Decision | Rationale |
| --- | --- |
| Non-fatal only for `WaitTimedOut` | `ctx.Err()` must still propagate; otherwise context cancellation logs a misleading warning and writes a spurious `\r` |
| 2s timeout (down from 15s) | Echo is near-instant; WaitFor polls at 50ms intervals (5+ checks/s). 2s is generous. |
| Info level, not Warn | Non-echoing agents hit this on every message. Warn implies something a human should investigate. |
| "echo detection timed out" log message | Matches codebase style (short, descriptive). Structured `timeout` field carries the duration. |
| Doc comment on formatPaste instead of screentracker test | Per review feedback (mafredri P3): the ESC limitation lives in the formatting layer, not the screentracker layer. A comment is cheaper and equally durable. |
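The first decision can be sketched as follows. This is not the code from the PR: `waitFor`, `echoPhase`, `outcome`, and `errWaitTimedOut` are hypothetical stand-ins for `util.WaitFor`, the Phase 1 branch of `writeStabilize`, and `util.WaitTimedOut`, with a shortened timeout for demonstration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errWaitTimedOut is a hypothetical stand-in for util.WaitTimedOut.
var errWaitTimedOut = errors.New("timeout waiting for condition")

// waitFor is a simplified, hypothetical stand-in for util.WaitFor: it polls
// cond until it returns true, the timeout elapses, or ctx is done.
func waitFor(ctx context.Context, timeout, interval time.Duration, cond func() bool) error {
	deadline := time.NewTimer(timeout)
	defer deadline.Stop()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if cond() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-deadline.C:
			return errWaitTimedOut
		case <-ticker.C:
		}
	}
}

// echoPhase runs Phase 1. A timeout is swallowed so the caller can proceed
// to Phase 2; any other error (such as context cancellation) is returned.
func echoPhase(ctx context.Context, timeout time.Duration, screenChanged func() bool) error {
	err := waitFor(ctx, timeout, time.Millisecond, screenChanged)
	if err == nil || errors.Is(err, errWaitTimedOut) {
		return nil
	}
	return fmt.Errorf("echo detection: %w", err)
}

// outcome reports how echoPhase treats a non-echoing agent, optionally
// with a context that was cancelled before the wait started.
func outcome(cancelled bool) string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelled {
		cancel()
	}
	neverEchoes := func() bool { return false }
	if err := echoPhase(ctx, 20*time.Millisecond, neverEchoes); err != nil {
		return "fatal: " + err.Error()
	}
	return "proceed to phase 2"
}

func main() {
	fmt.Println(outcome(false))
	fmt.Println(outcome(true))
}
```

The point of `errors.Is` here is that only the timeout sentinel is downgraded. A cancelled context takes the other branch, so the caller never writes a carriage return on behalf of a request that has already been abandoned.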

Behavioral changes

Slow-echo agents (>1s): may now trigger the non-fatal timeout. Benign — Phase 2 still succeeds.
Unresponsive agents: total timeout increases from 15s to ~17s (2s + 15s). Carriage return is now sent before failing, leaving PTY in a more consistent state.
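The timeout arithmetic for an unresponsive agent can be written out as a small sketch. The constant names `writeStabilizeEchoTimeout` and `writeStabilizeProcessTimeout` come from the change list above, with the values stated in this description; `previousEchoTimeout` and the two helper functions are hypothetical names added for illustration.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Values as stated in this PR description.
	writeStabilizeEchoTimeout    = 2 * time.Second
	writeStabilizeProcessTimeout = 15 * time.Second

	// previousEchoTimeout is a hypothetical name for the old Phase 1 timeout.
	previousEchoTimeout = 15 * time.Second
)

// worstCaseBefore: the old code failed as soon as Phase 1 timed out.
func worstCaseBefore() time.Duration {
	return previousEchoTimeout
}

// worstCaseAfter: Phase 1 times out, then Phase 2 runs to its own timeout.
func worstCaseAfter() time.Duration {
	return writeStabilizeEchoTimeout + writeStabilizeProcessTimeout
}

func main() {
	fmt.Println("before:", worstCaseBefore())
	fmt.Println("after: ", worstCaseAfter())
}
```

The sum is a nominal figure; the description gives it as approximately 17s.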

🤖 Written by a Coder Agent. Will be reviewed by a human.

…agents don't echo input

Phase 1 of writeStabilize waited 15s for the screen to change after writing message text (echo detection), returning HTTP 500 if it didn't. Many TUI agents using bracketed paste don't echo input until Enter is pressed, causing every message send to fail.

Phase 1 timeout is now non-fatal (2s): if the screen doesn't change, we log at Info level and proceed to Phase 2 (send carriage return). Phase 2 remains the real indicator of agent responsiveness.

Key changes:
- Guard non-fatal path with errors.Is(err, util.WaitTimedOut) so context cancellation still propagates as a fatal error
- Reduce Phase 1 timeout from 15s to 2s (echo is near-instant)
- Extract named constants for both timeouts
- Add tests for no-echo-success and no-echo-no-react-failure
- Add documentation test for TUI selection prompt ESC limitation

Closes #123

github-actions · 2026-03-31T11:33:29Z

✅ Preview binaries are ready!

To test with modules: agentapi_version = "agentapi_208" or download from: https://github.com/coder/agentapi/releases/tag/agentapi_208

…tests

Restructure test comments to follow Cucumber-style Given/When/Then pattern for clarity. Also fix send-no-echo-agent-reacts assertion to scan for the user message instead of assuming it's the last message in the conversation (the snapshot loop may append an agent message after Send returns).

mafredri

Clean design. Making Phase 1 non-fatal for WaitTimedOut while preserving context cancellation as fatal is the right call. The errors.Is guard, extracted constants, and 2s timeout are all well-calibrated. Two P2 findings (missing test coverage for the key invariant, gofmt failure), two P3s (doc accuracy, test layering), and a handful of notes.

"Oh, this test suite looks lovely! Fifty rows, full coverage, green across the board. It's fake. Every row hits the same code path. You dressed up one test in fifty outfits." -- Bisky, on a different test. The new tests here are mostly genuine.

Severity count: 0 P0, 0 P1, 2 P2, 2 P3, 3 Notes.

🤖 This review was automatically generated with Coder Agents.

- Add send-message-no-echo-context-cancelled test: verifies the errors.Is(WaitTimedOut) guard by cancelling context during Phase 1 and asserting context.Canceled propagates (P2)
- Fix gofmt: correct indentation, proper brace placement (P2)
- Fix constant comment: describe WaitFor timeout semantics accurately, note 1s stability check can extend past timeout, add TODO tag (P3)
- Drop send-tui-selection-esc-cancels test from screentracker, add ESC limitation comment to formatPaste in claude.go instead (P3)
- Shorten log message to match codebase style (Note)
- Rename tests to send-message-* prefix, use newConversation helper with opts callbacks (Note)

The test had a race: advanceFor could complete before the Send() goroutine enqueued, so the stableSignal never fired, and sendCancel ran while the message was still queued (never reaching writeStabilize). Fix: use onWrite callback as a synchronization point. advanceUntil waits for writeStabilize to start writing (onWrite fires), then cancels. This guarantees Phase 1 WaitFor is running when ctx is cancelled, and its sleep select sees ctx.Done() immediately.
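The synchronization idea can be sketched in isolation. This is a simplified illustration, not the test from the PR: it uses real time rather than a mock clock, and `send`, `cancelDuringPhase1`, and `isCanceled` are hypothetical names. The only thing it shares with the described fix is the use of an `onWrite` callback as the point after which cancellation is safe to issue.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// send is a hypothetical model of a message send: onWrite fires once
// writing has started, then the function blocks in a Phase 1 style wait.
func send(ctx context.Context, onWrite func()) error {
	onWrite()
	timer := time.NewTimer(2 * time.Second)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-timer.C:
		return errors.New("echo detection timed out")
	}
}

// cancelDuringPhase1 cancels the context only after onWrite has fired,
// so the cancellation is guaranteed to land while send is waiting.
func cancelDuringPhase1() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	writing := make(chan struct{})
	result := make(chan error, 1)
	go func() {
		result <- send(ctx, func() { close(writing) })
	}()

	<-writing // synchronization point: the write has started
	cancel()
	return <-result
}

// isCanceled reports whether err is, or wraps, context.Canceled.
func isCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}

func main() {
	err := cancelDuringPhase1()
	fmt.Println("error:", err, "canceled:", isCanceled(err))
}
```

Waiting on the `writing` channel replaces a time-based guess about when the goroutine has reached the wait, which is what made the earlier version of the test racy.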

Copilot

Pull request overview

This PR updates the screentracker PTY conversation send pipeline to avoid failing requests when an agent doesn’t echo typed input during writeStabilize Phase 1 (echo detection), and adds tests to codify the new behavior.

Changes:

Make writeStabilize Phase 1 (echo detection) timeout non-fatal and proceed to Phase 2 (processing detection), while still propagating context cancellation.
Reduce Phase 1 timeout and extract Phase 1/2 timeouts into constants.
Add new test coverage for non-echoing agents (reacting vs unresponsive) and context-cancellation behavior; add clarifying documentation comment about bracketed paste and TUI selection cancellation.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| lib/screentracker/pty_conversation.go | Makes echo-detection timeout non-fatal, adds timeouts as constants, and adjusts logging/error handling. |
| lib/screentracker/pty_conversation_test.go | Adds tests for non-echoing agents, unresponsive agents, and context cancellation during Phase 1. |
| lib/httpapi/claude.go | Documents bracketed-paste ESC interaction and suggests `MessageTypeRaw` for TUI selection prompts. |

johnstcn self-assigned this Mar 31, 2026

johnstcn requested a review from mafredri March 31, 2026 11:58

mafredri reviewed Mar 31, 2026

johnstcn mentioned this pull request Mar 31, 2026

Send() blocks indefinitely when ReadyForInitialPrompt returns false despite Status() reporting "stable" #209

Open

johnstcn added 2 commits March 31, 2026 13:59

johnstcn marked this pull request as ready for review March 31, 2026 15:28

Copilot AI review requested due to automatic review settings March 31, 2026 15:28

Copilot started reviewing on behalf of johnstcn March 31, 2026 15:29

Copilot AI reviewed Mar 31, 2026

johnstcn requested a review from 35C4n0r March 31, 2026 16:49

This was referenced Mar 31, 2026

bug: potential goroutine leak in util.WaitFor #210

Closed

fix: exorcise goroutine-leaking util.After from the codebase #211

Merged

johnstcn requested a review from mafredri March 31, 2026 22:18

johnstcn wants to merge 4 commits into main from fix/write-stabilize-non-fatal-phase1

3 participants
