feat: native race / first-completed wins for multiple child runs #3301

@WoutDeRijck

Is your feature request related to a problem? Please describe.

Today we run two improvement strategies as separate child tasks and coordinate them by hand: we trigger both, then poll with runs.retrieve and wait.for, because we need to return as soon as one child finishes successfully, without waiting for the other. triggerAndWait waits for only a single run, and batch.triggerAndWait waits for all runs. Promise.race is not a safe pattern with checkpoint-based child waits.

Polling works but is noisy in traces, adds latency between polls, and requires us to re-parse outputs and handle cancellation ourselves. We would prefer a first-class API that expresses “run these child tasks in parallel and resume the parent when the first one satisfies a condition (or all complete).”
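A minimal sketch of the polling pattern described above, to show where the noise and latency come from. It is not our exact code: the SDK calls are replaced by injected functions so the snippet is self-contained, and the names `RaceDeps`, `pollForFirstSuccess`, `pause`, and the reduced status list are assumptions made for illustration.

```typescript
// Simplified, hypothetical shapes; the real SDK run objects and statuses are richer.
type RunStatus = "QUEUED" | "EXECUTING" | "COMPLETED" | "FAILED" | "CANCELED";

interface RunSnapshot<T> {
  id: string;
  status: RunStatus;
  output?: T;
}

// Injected dependencies so the sketch runs without the SDK.
interface RaceDeps<T> {
  retrieve(runId: string): Promise<RunSnapshot<T>>; // stands in for runs.retrieve
  cancel(runId: string): Promise<void>; // stands in for cancelling a run
  pause(seconds: number): Promise<void>; // stands in for wait.for
}

const TERMINAL = new Set<RunStatus>(["COMPLETED", "FAILED", "CANCELED"]);

// Polls every pending run until one completes successfully, then cancels the rest.
// Returns undefined if every run ends without success or the poll budget runs out.
async function pollForFirstSuccess<T>(
  runIds: string[],
  deps: RaceDeps<T>,
  pollSeconds = 5,
  maxPolls = 120,
): Promise<RunSnapshot<T> | undefined> {
  const pending = new Set<string>(runIds);
  for (let poll = 0; poll < maxPolls && pending.size > 0; poll++) {
    for (const id of Array.from(pending)) {
      const run = await deps.retrieve(id);
      if (run.status === "COMPLETED") {
        pending.delete(id);
        // Cancellation of the losers is our responsibility here.
        await Promise.all(Array.from(pending).map((other) => deps.cancel(other)));
        return run;
      }
      if (TERMINAL.has(run.status)) pending.delete(id);
    }
    // Every pause is latency the parent pays even if a child finished right after the last poll.
    if (pending.size > 0) await deps.pause(pollSeconds);
  }
  return undefined;
}
```

Each loop iteration shows up as a retrieve call plus a wait in the parent's trace, which is the noise a native primitive would remove.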

Describe the solution you'd like to see

A supported primitive such as:

  • race / triggerAndRace: trigger N child runs and checkpoint until one run reaches a terminal state (or optionally until a predicate on run status/output is true), then optionally cancel or leave the others running.
  • Or wait.any over multiple run IDs with the same checkpoint semantics as triggerAndWait.
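To make the requested semantics concrete, here is a rough sketch of the decision such a primitive would need to make each time a child run changes state. Nothing here exists in the SDK: `RaceOptions`, `RaceDecision`, `decideRace`, the `until` and `onSettled` options, and the default of cancelling the other runs are all hypothetical names and choices for discussion.

```typescript
// Hypothetical, simplified status list for illustration only.
type RunStatus = "QUEUED" | "EXECUTING" | "COMPLETED" | "FAILED" | "CANCELED";

interface ChildRun<T> {
  id: string;
  status: RunStatus;
  output?: T;
}

// Hypothetical options for a race-style wait.
interface RaceOptions<T> {
  // When to resume the parent; defaults to "any run reached a terminal state".
  until?: (run: ChildRun<T>) => boolean;
  // What to do with the remaining runs once a winner exists (default is an assumption).
  onSettled?: "cancel-others" | "leave-running";
}

interface RaceDecision<T> {
  winner?: ChildRun<T>; // first run satisfying the condition, if any
  toCancel: string[]; // ids of still-active runs to cancel
  allDone: boolean; // true when every run is terminal, so the parent must resume anyway
}

const isTerminal = (status: RunStatus): boolean =>
  status === "COMPLETED" || status === "FAILED" || status === "CANCELED";

function decideRace<T>(runs: ChildRun<T>[], options: RaceOptions<T> = {}): RaceDecision<T> {
  const until = options.until ?? ((run: ChildRun<T>) => isTerminal(run.status));
  const mode = options.onSettled ?? "cancel-others";
  const winner = runs.find((run) => until(run));
  const allDone = runs.every((run) => isTerminal(run.status));

  let toCancel: string[] = [];
  if (winner !== undefined && mode === "cancel-others") {
    const winnerId = winner.id;
    // Only runs that are still active need cancelling.
    toCancel = runs
      .filter((run) => run.id !== winnerId && !isTerminal(run.status))
      .map((run) => run.id);
  }
  return { winner, toCancel, allDone };
}
```

With this shape, our use case would pass `until: (run) => run.status === "COMPLETED"`, so a failed strategy does not end the race, and the parent resumes either on the first success or when `allDone` becomes true.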

Describe alternate solutions

  • Keep polling (runs.retrieve in a loop + wait.for) — works today but is verbose, harder to read in the dashboard, and easy to get wrong (e.g. output parsing vs parseRunData).
  • Inline execution — run both code paths in one task with Promise.race; loses per-child run logs, separate machines, and clean cancellation of the loser.
  • Sequential runs — no race, longer wall-clock time.

Additional information

No response
