feat: native race / first-completed wins for multiple child runs #3301
Description
Is your feature request related to a problem? Please describe.
Today we run two improvement strategies as separate child tasks and poll them ourselves (trigger, then runs.retrieve in a loop with wait.for), because we need to return as soon as one child finishes successfully, without waiting for the other. triggerAndWait waits for only a single run, batch.triggerAndWait waits for all runs, and Promise.race is not a safe pattern with checkpoint-based child waits.
Polling works but is noisy in traces, adds latency between polls, and requires us to re-parse outputs and handle cancellation ourselves. We would prefer a first-class API that expresses “run these child tasks in parallel and resume the parent when the first one satisfies a condition (or all complete).”
Describe the solution you'd like to see
A supported primitive such as:
- race / triggerAndRace: trigger N child runs and checkpoint until one run reaches a terminal state (or, optionally, until a predicate on run status/output is true), then either cancel the other runs or leave them running.
- wait.any over multiple run IDs, with the same checkpoint semantics as triggerAndWait.
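To make the requested semantics concrete, here is a minimal sketch of the decision the runtime would make each time a child run changes state. Everything in it is hypothetical: `decideRace`, `RunSnapshot`, `until`, `cancelLosers`, and the simplified status set are illustrative names, not existing SDK APIs. It only models "resume when one run satisfies the condition, or when all have completed"; the checkpointing itself would live in the platform.

```typescript
// Hypothetical types for illustration; none of these names exist in the SDK today.
// The status set is deliberately simplified.
type RunStatus = "EXECUTING" | "COMPLETED" | "FAILED" | "CANCELED";

interface RunSnapshot<T> {
  id: string;
  status: RunStatus;
  output?: T;
}

interface RaceDecision<T> {
  resume: boolean;          // should the parent leave its checkpoint?
  winner?: RunSnapshot<T>;  // a run that satisfied the condition, if any
  toCancel: string[];       // still-running losers, when cancellation is requested
}

const isTerminal = (status: RunStatus): boolean => status !== "EXECUTING";

function decideRace<T>(
  runs: RunSnapshot<T>[],
  opts: { until?: (run: RunSnapshot<T>) => boolean; cancelLosers?: boolean } = {}
): RaceDecision<T> {
  // Default condition: any terminal state counts as a win.
  const until = opts.until ?? ((run: RunSnapshot<T>) => isTerminal(run.status));

  // Within one snapshot, ties are broken by list order.
  const winner = runs.find((run) => isTerminal(run.status) && until(run));
  if (winner) {
    const toCancel = opts.cancelLosers
      ? runs
          .filter((run) => run.id !== winner.id && !isTerminal(run.status))
          .map((run) => run.id)
      : [];
    return { resume: true, winner, toCancel };
  }

  // No winner yet: resume only once every child has finished.
  return { resume: runs.every((run) => isTerminal(run.status)), toCancel: [] };
}
```

With `until: (run) => run.status === "COMPLETED"`, a failed child does not end the race; the parent stays checkpointed until another child succeeds or all children are terminal.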
Describe alternate solutions
- Keep polling (runs.retrieve in a loop + wait.for) — works today but is verbose, harder to read in the dashboard, and easy to get wrong (e.g. output parsing vs parseRunData).
- Inline execution — run both code paths in one task with Promise.race; loses per-child run logs, separate machines, and clean cancellation of the loser.
- Sequential runs — no race, longer wall-clock time.
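For reference, the polling alternative looks roughly like the sketch below. It is an approximation of our workaround, not the exact code: `retrieve` and `sleep` are injected stand-ins for runs.retrieve and wait.for so the loop can be read and tested in isolation, and `pollForFirstSuccess`, `PollDeps`, and the simplified status set are hypothetical names.

```typescript
// Simplified status set, assumed for illustration.
type RunStatus = "EXECUTING" | "COMPLETED" | "FAILED" | "CANCELED";

interface RunSnapshot {
  id: string;
  status: RunStatus;
  output?: unknown;
}

interface PollDeps {
  retrieve: (runId: string) => Promise<RunSnapshot>; // stand-in for runs.retrieve
  sleep: (seconds: number) => Promise<void>;         // stand-in for wait.for
}

async function pollForFirstSuccess(
  runIds: string[],
  deps: PollDeps,
  intervalSeconds = 5,
  maxPolls = 100
): Promise<RunSnapshot | undefined> {
  const pending = new Set(runIds);

  for (let poll = 0; poll < maxPolls && pending.size > 0; poll++) {
    for (const id of Array.from(pending)) {
      const run = await deps.retrieve(id);
      if (run.status === "COMPLETED") return run;          // first success wins
      if (run.status !== "EXECUTING") pending.delete(id);  // failed or canceled
    }
    // Latency cost: a child that finishes right after this check
    // is not noticed until the next poll.
    if (pending.size > 0) await deps.sleep(intervalSeconds);
  }

  return undefined; // every child failed, or the poll budget ran out
}
```

Even in this reduced form, the caller still has to cancel the losing run and parse the winner's output, which is the boilerplate a native primitive would remove.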
Additional information
No response