Non-determinism is not a failure mode

The execution layer of the software supply chain was built on one assumption. That assumption no longer holds.

The assumption is simple: same input, same output. Anything else is treated as a failure to be debugged, retried, or quarantined. CI was built around it, and so were build systems, deployment pipelines, and most of the tooling between a commit and a running service, because for a long time the inputs were code and configuration, and the outputs were artifacts produced by deterministic processes.

That is no longer the dominant case.

A growing share of what flows through the supply chain is generated, modified, or evaluated by models, and once you put models in the loop, non-determinism is not an edge case but a property of the system. The same prompt produces different patches, the same diff gets different reviews, and two agents asked to summarize the same change return answers that do not fully agree. None of this is a defect of any individual model — it is a consequence of the system you built.

What CI was built for

CI solved a real problem. Before it existed, integration happened late, painfully, and at the worst possible moment. Running builds and tests automatically on every change, in a controlled environment, made an entire class of failures visible early instead of catastrophic later.

The implicit model was simple: humans write code, machines verify it, humans review the result. Automation sat between two human acts, and the contract between them was deterministic.

That model has held, with variations, for the better part of two decades.

What is changing now is not the value of automation but the shape of the workflow it runs through.

The human is no longer guaranteed to be at the start of the loop. Code arrives from agents, reviews arrive from agents, and the human becomes one participant among others — sometimes the author, sometimes the reviewer, sometimes the escalation path when other participants disagree. The shape of the graph changes with the position of the human in it.

At the same time, agents are not just users of the pipeline. They are software components inside it, and they behave statistically rather than deterministically. That is not the same as flaky. The output is a function of the input, the model, the prompt version, the tools attached, and parameters, such as sampling settings, that change from run to run. That is not a tooling failure to be patched out. It is a different kind of participant, and CI’s execution model has no place to put it.

CI is not broken. It has reached the boundary of what it was designed for, and the workflow on the other side of that boundary needs a different execution model.

The gap in current tools

Modern graph-based systems such as Dagger, Earthly, and pipeline-as-code frameworks have already moved beyond YAML glue and take reproducibility of deterministic inputs seriously, which is useful and well implemented.

The gap shows up at the meta-level.

These systems can execute non-deterministic steps, but they do not model them as such. They do not capture provenance in a way the graph can reason about, they do not compare multiple runs of the same step as a first-class operation, and they do not let you express what should happen when two agents disagree or when the same agent produces different outputs over time.

So the logic moves outside the graph: comparison becomes a script, provenance becomes a side-channel, and divergence is either dropped or escalated elsewhere, while the execution model continues to pretend that all steps are deterministic.

What the system needs to know

Once you treat non-determinism as a property of the system rather than a failure of it, a few requirements follow.

The graph needs to distinguish between deterministic and non-deterministic nodes, because a static analysis step and a model-generated patch are not the same kind of thing, and treating them identically forces the system to ignore information it could otherwise act on.
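A minimal sketch of that distinction, in plain Elixir rather than Sykli's actual API. Each node carries a kind, and the scheduler can ask whether a cached result may be reused. The module name `NodeKind`, the `:kind` field, and `cacheable?/1` are all hypothetical names chosen for illustration.

```elixir
defmodule NodeKind do
  # Hypothetical illustration; not part of Sykli's API.
  # A node is a plain map tagged with the kind of step it represents.
  def deterministic(name), do: %{name: name, kind: :deterministic}
  def non_deterministic(name), do: %{name: name, kind: :non_deterministic}

  # A deterministic step with unchanged inputs can be served from cache.
  def cacheable?(%{kind: :deterministic}), do: true
  # A model-backed step is sampled again (or replayed from a recorded output).
  def cacheable?(%{kind: :non_deterministic}), do: false

  # Split a list of nodes into {cacheable, sampled}.
  def partition(nodes), do: Enum.split_with(nodes, &cacheable?/1)
end

nodes = [
  NodeKind.deterministic(:static_analysis),
  NodeKind.non_deterministic(:draft_patch)
]

{cacheable, sampled} = NodeKind.partition(nodes)
IO.inspect(Enum.map(cacheable, & &1.name), label: "cacheable")
IO.inspect(Enum.map(sampled, & &1.name), label: "sampled")
```

Here the static analysis step lands in the cacheable group and the model-generated patch in the sampled group, which is the information an undifferentiated graph would have to throw away.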

It needs to capture enough context for a step to be replayed or audited later, including which model produced the output, against which prompt version, with which tools, and against which inputs.
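As a rough sketch, the context for one step could be recorded as a small map. The field names and the `Provenance` module are assumptions for illustration, not Sykli's schema. Inputs and outputs are assumed to be binaries and are stored as SHA-256 digests.

```elixir
defmodule Provenance do
  # Hypothetical record shape; field names are assumptions, not Sykli's schema.

  # Build a provenance record for one execution of a step.
  def record(model, prompt_version, tools, inputs, output)
      when is_binary(inputs) and is_binary(output) do
    %{
      model: model,
      prompt_version: prompt_version,
      # Sort tools so that attachment order does not affect comparison.
      tools: Enum.sort(tools),
      input_digest: digest(inputs),
      output_digest: digest(output)
    }
  end

  # Two records share a context when everything except the output matches.
  def same_context?(a, b) do
    Map.delete(a, :output_digest) == Map.delete(b, :output_digest)
  end

  # Hex-encoded SHA-256 of a binary.
  defp digest(data) do
    :crypto.hash(:sha256, data) |> Base.encode16(case: :lower)
  end
end

a = Provenance.record("model-x", "review-v3", [:read_file, :grep], "diff --git a b", "LGTM")
b = Provenance.record("model-x", "review-v3", [:grep, :read_file], "diff --git a b", "needs changes")

IO.inspect(Provenance.same_context?(a, b), label: "same context")
IO.inspect(a.output_digest == b.output_digest, label: "same output")
```

The two records describe the same model, prompt version, tools, and input but different outputs, which is exactly the case an auditor needs to tell apart from a change in context.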

It needs first-class comparison between runs, not as an assertion that fails when outputs differ, but as a routing primitive, where convergence and divergence are both outcomes the graph can act on.
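One way to picture comparison as a routing primitive, sketched in plain Elixir: the comparison returns a tagged outcome instead of raising, and each outcome maps to a next node. The `Compare` module, the `:approve`/`:reject` verdicts, and the `:merge` and `:return_to_author` targets are hypothetical.

```elixir
defmodule Compare do
  # Hypothetical routing primitive; names are illustrative only.

  # Compare the verdicts of several runs. Both outcomes are ordinary
  # return values; nothing is raised when the runs disagree.
  def route(verdicts) when is_list(verdicts) and verdicts != [] do
    case Enum.uniq(verdicts) do
      [single] -> {:converged, single}
      several -> {:diverged, several}
    end
  end

  # Map each outcome to the next node in the graph.
  # Only :approve and :reject verdicts are handled in this sketch.
  def next_node({:converged, :approve}), do: :merge
  def next_node({:converged, :reject}), do: :return_to_author
  def next_node({:diverged, _verdicts}), do: :route_to_human
end

IO.inspect(Compare.route([:approve, :approve]) |> Compare.next_node())
IO.inspect(Compare.route([:approve, :reject]) |> Compare.next_node())
```

The first call yields `:merge` and the second `:route_to_human`; disagreement selects an edge rather than failing the run.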

And it needs explicit contracts between participants, so that when humans, agents, and services interact, their assumptions are visible in the graph rather than buried inside individual steps.

Why BEAM

Sykli runs on the BEAM, the runtime behind Erlang and Elixir, because the problem looks like a BEAM problem.

A model invocation is a long-running, failure-prone interaction with an external system, which on the BEAM maps naturally to a process with isolated state, supervision, and restart semantics. Two agents reviewing the same patch are simply two processes whose outputs are compared via message passing, rather than a pattern you have to simulate on top of the runtime.
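The two-reviewer case can be sketched with nothing but `spawn`, `send`, and `receive`. The reviewer functions below are hypothetical stand-ins for model calls, and `ParallelReview` is not part of Sykli; the sketch only shows the process and message-passing shape.

```elixir
defmodule ParallelReview do
  # Hypothetical stand-ins for model calls; a real reviewer would call
  # an external API here.
  def reviewer_a(_patch), do: :approve

  def reviewer_b(patch) do
    if String.contains?(patch, "TODO"), do: :reject, else: :approve
  end

  # Run each reviewer in its own isolated process and collect the
  # verdicts as messages. A reviewer that crashes or stalls shows up
  # as :timeout instead of taking the caller down.
  def run(patch, reviewers, timeout \\ 5_000) do
    parent = self()
    ref = make_ref()

    Enum.each(reviewers, fn {name, fun} ->
      spawn(fn -> send(parent, {ref, name, fun.(patch)}) end)
    end)

    for {name, _fun} <- reviewers, into: %{} do
      receive do
        {^ref, ^name, verdict} -> {name, verdict}
      after
        timeout -> {name, :timeout}
      end
    end
  end
end

reviewers = [
  review_a: &ParallelReview.reviewer_a/1,
  review_b: &ParallelReview.reviewer_b/1
]

result = ParallelReview.run("fix: handle nil # TODO add tests", reviewers)
IO.inspect(result)
```

With this patch text the result is `%{review_a: :approve, review_b: :reject}`, a plain map that a comparison step can consume.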

Divergent outputs are data, not crash conditions. Backpressure, retries, and timeouts are runtime primitives, not libraries layered on later. On most other substrates, a significant amount of effort would go into building this scaffolding before addressing the actual problem.
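A small sketch of what that looks like with Elixir's standard library: `Task.Supervisor.async_nolink/2` runs the call in a supervised process, and `Task.yield/2` with `Task.shutdown/2` turns a crash or a timeout into a value. The `Invocation` module and the `:model_unavailable` reason are hypothetical.

```elixir
defmodule Invocation do
  # Hypothetical wrapper; only the Task and Task.Supervisor calls are
  # standard library.

  # Run one model call under a supervisor and classify the result as data.
  def call(sup, fun, timeout) do
    # async_nolink: a crash in the task does not crash the caller.
    task = Task.Supervisor.async_nolink(sup, fun)

    # yield waits up to `timeout`; on nil the task is still running,
    # so shut it down and use whatever shutdown reports.
    case Task.yield(task, timeout) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:crashed, reason}
      nil -> :timed_out
    end
  end
end

{:ok, sup} = Task.Supervisor.start_link()

ok = Invocation.call(sup, fn -> :approve end, 1_000)
slow = Invocation.call(sup, fn -> Process.sleep(:infinity) end, 50)
crashed = Invocation.call(sup, fn -> exit(:model_unavailable) end, 1_000)

IO.inspect({ok, slow, crashed})
```

The first call returns `{:ok, :approve}`, the second `:timed_out`, and the third a `{:crashed, reason}` tuple. In each case the caller keeps running and can route on the result.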

What it looks like

Pipelines are Elixir code, not YAML:

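defmodule SupplyChain.AgentReview do
  use Sykli.Graph

  # One agent drafts the patch.
  task :draft,    run: {ClaudeAgent, :propose_patch}

  # Two reviewers depend only on :draft, so they can run in parallel.
  task :review_a, run: {OpenAIAgent, :review},  deps: [:draft]
  task :review_b, run: {ClaudeAgent, :review},  deps: [:draft]

  # Divergence between the two reviews is an edge in the graph,
  # routed to a human rather than raised as an error.
  contract :reviews_must_converge,
    over: [:review_a, :review_b],
    when_diverged: :route_to_human
end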

Two reviewers run in parallel against the same draft. The contract is part of the graph, so divergence is not an exception handled after the fact but an explicit edge that routes execution.

The runtime underneath the forge

There is a larger story behind all of this, which is worth naming even if this post is not the place to develop it in full.

Git is distributed. The forge built on top of it — pull requests, CI, identity, artifacts, the whole control plane — is not. That centralization is usually treated as a product decision, but it is more accurately a runtime constraint. The dominant forge was built on a stack that does not handle distribution, supervision, or partition gracefully, so coordination got pushed into a control plane and the control plane got pushed into one region.

Centralization in developer infrastructure is a runtime failure, not a product failure.

The BEAM has handled distribution, supervision, and partition as ordinary operations since 1996. Build the forge on a runtime that treats those properties as primitives and the gravity changes: CI is a node, code review is a node, artifact storage is a node, and federation stops being a feature you add and becomes the default shape of the system.

Sykli is the first node.

It is still very experimental.

The repository is here:
https://github.com/false-systems/sykli