KAI OS Proof Pack

Five claims, five proofs.

The evidence is portable and local-first: no API key, hosted dashboard, or provider log is required for the first verification path.

No-key first run: Deterministic mock provider powers `kaios tour`.

Agent = Process: PID, state, tokens, context, syscalls, worker, and events.

Tool = Syscall: Ledger records permission, allowed/denied status, args, time, and cost.

Run = Evidence: Capsule, trace, provenance hashes, and replay commands travel together.

CI = Gate: Baseline diff catches stable runtime behavior drift.

Claim-to-artifact map.

Each row points to a file or command that exists in the repository today.

Claim	Proof today	Check it
No API key is required for the first run	deterministic mock provider and disposable tour	`kaios tour`
Agent runs produce process evidence	checked-in process trace and review JSON	`examples/evidence-sample/change-review.trace.json`
Tool use is syscall-bounded	syscall ledger with permission, duration, cost, and redacted args	`examples/evidence-sample/change-review.trace.json`
Run capsules can replay offline	capsule embeds snapshot and trace for deterministic replay checks	`kaios replay --file examples/evidence-sample/change-review.capsule.json`
CI can catch runtime drift	baseline/current capsules produce a stable nonzero diff under `--check`	`./scripts/evidence-samples-smoke.sh`
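
To make the syscall-bounded claim concrete, here is a minimal sketch of how a ledger row might be produced. It is not the KAI OS implementation or its JSON schema: `LedgerEntry`, `record_syscall`, the permission names, and the `SENSITIVE_KEYS` set are all hypothetical, chosen only to show one row carrying permission, allowed/denied status, redacted args, duration, and cost.

```python
import time
from dataclasses import dataclass

# Hypothetical: argument names treated as sensitive and redacted before logging.
SENSITIVE_KEYS = {"token", "password", "api_key"}


@dataclass(frozen=True)
class LedgerEntry:
    """One audit row per tool call (illustrative fields, not the real schema)."""
    pid: int
    tool: str
    permission: str
    allowed: bool
    args: dict
    duration_ms: float
    cost: float


def redact(args: dict) -> dict:
    """Replace sensitive argument values so they never reach the ledger."""
    return {k: ("[REDACTED]" if k in SENSITIVE_KEYS else v) for k, v in args.items()}


def record_syscall(ledger, granted, pid, tool, permission, args, handler, cost=0.0):
    """Run `handler` only if `permission` was granted; log the call either way."""
    allowed = permission in granted
    start = time.perf_counter()
    result = handler(**args) if allowed else None
    duration_ms = (time.perf_counter() - start) * 1000.0
    ledger.append(LedgerEntry(pid, tool, permission, allowed, redact(args),
                              duration_ms, cost if allowed else 0.0))
    return result


ledger = []
granted = {"fs.read"}  # the only permission this hypothetical process holds

record_syscall(ledger, granted, 1, "read_file", "fs.read",
               {"path": "README.md"}, lambda path: f"contents of {path}")
record_syscall(ledger, granted, 1, "http_get", "net.fetch",
               {"url": "https://example.com", "token": "s3cret"},
               lambda url, token: "never called")

for entry in ledger:
    print(entry.tool, "allowed" if entry.allowed else "denied", entry.args)
```

The point of the design is that denied calls are logged too, so the ledger shows what an agent attempted as well as what it did.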

Verify it in three paths.

Choose the lowest-friction path for the environment you are in.

Browser only

Open the Evidence Viewer and inspect the same checked-in run without installing Java, Gradle, Docker, or an API key.


Installed CLI

Run the local tour. It creates a disposable Git workspace and writes review, trace, capsule, evidence, and recovery artifacts.

curl -fsSL https://morning-verlu.github.io/KAI/install.sh | sh
export PATH="$HOME/.kaios/bin:$PATH"
kaios tour

Source checkout

Run deterministic smoke checks against checked-in capsules and baseline gate artifacts.

./scripts/evidence-samples-smoke.sh
./scripts/repository-ci-smoke.sh

Community trust signal.

Small external contributions are now part of the product proof, not a separate vanity metric.

PR #24 added the Evidence Glossary across the Proof Pack, evaluator path, and checked-in evidence sample. It was verified with `git diff --check` and `./scripts/evidence-samples-smoke.sh`.

Evidence glossary.

The main artifacts KAI OS produces, explained without requiring the JSON schema docs first.

Term	Meaning
Review artifact	A Markdown summary from `kaios review`; the human-readable version of the run evidence.
Process trace	A structured JSON record of processes, state transitions, token counts, syscalls, cost, and lifecycle events.
Syscall ledger	The audit log for tool calls, including allowed or denied status, duration, redacted args, and cost.
Replay capsule	A portable package that bundles snapshot, trace, provenance hashes, and replay commands for offline checks.
Baseline diff	A stable comparison of two capsules that ignores timestamp noise and focuses on runtime behavior changes.
Evidence summary	A compact Markdown report for PRs and CI summaries: verdict, changed behavior, fix-first notes, and process table.
Recovery dry-run	A read-only report that explains crashed processes and recovery evidence without restarting anything.
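
The baseline diff idea can be sketched in a few lines. This is an illustration under stated assumptions, not the comparison KAI OS actually runs: the noise keys (`started_at`, `ended_at`, `duration_ms`), the capsule shape, and the function names `stable_view` and `diff_capsules` are hypothetical.

```python
# Hypothetical: keys treated as run-to-run noise rather than behavior.
NOISE_KEYS = {"started_at", "ended_at", "duration_ms"}


def stable_view(value):
    """Recursively drop noise keys so only behavior-relevant fields remain."""
    if isinstance(value, dict):
        return {k: stable_view(v) for k, v in sorted(value.items())
                if k not in NOISE_KEYS}
    if isinstance(value, list):
        return [stable_view(v) for v in value]
    return value


def _walk(a, b, path):
    """Yield one line per difference between two already-stabilized values."""
    if isinstance(a, dict) and isinstance(b, dict):
        for key in a.keys() | b.keys():
            child = f"{path}/{key}"
            if key not in a:
                yield f"added {child}"
            elif key not in b:
                yield f"removed {child}"
            else:
                yield from _walk(a[key], b[key], child)
    elif isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        for i, (x, y) in enumerate(zip(a, b)):
            yield from _walk(x, y, f"{path}/{i}")
    elif a != b:
        yield f"changed {path}: {a!r} -> {b!r}"


def diff_capsules(baseline, current):
    """Return a sorted, reproducible list of behavior differences."""
    return sorted(_walk(stable_view(baseline), stable_view(current), ""))


baseline = {"syscalls": [{"tool": "read_file", "allowed": True, "started_at": "t1"}]}
rerun = {"syscalls": [{"tool": "read_file", "allowed": True, "started_at": "t2"}]}
drifted = {"syscalls": [{"tool": "read_file", "allowed": False, "started_at": "t3"}]}

print(diff_capsules(baseline, rerun))    # only timestamps differ
print(diff_capsules(baseline, drifted))  # a permission outcome changed
```

A CI gate built on this shape would exit nonzero whenever the returned list is non-empty, so timestamp-only reruns pass and behavior changes fail.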

FAQ for skeptical developers.

Short answers to the first questions people usually ask before trusting a new agent runtime category.

How is this different from Koog or LangChain4j?

Koog and LangChain4j are better fits for application-level agent and provider integration. KAI OS focuses on the evidence layer around a run: traces, ledgers, capsules, recovery evidence, and CI gates.

Is the mock provider just a demo trick?

No. The deterministic mock provider makes first runs, examples, capsules, and CI checks reproducible without API keys, network access, or provider billing.
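
As a rough illustration of why determinism matters here, the sketch below shows one way a mock provider could return the same reply for the same prompt with no network or key. The class `MockProvider`, its `complete` method, and the canned replies are hypothetical and are not taken from the KAI OS codebase.

```python
import hashlib


class MockProvider:
    """Hypothetical deterministic stand-in for a model provider."""

    def __init__(self, canned_replies):
        self.canned_replies = list(canned_replies)

    def complete(self, prompt: str) -> dict:
        # Hash the prompt so the chosen reply depends only on the input text.
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        reply = self.canned_replies[int(digest, 16) % len(self.canned_replies)]
        # Token counts are derived from the text, so they are reproducible too.
        return {
            "text": reply,
            "prompt_tokens": len(prompt.split()),
            "completion_tokens": len(reply.split()),
            "prompt_sha256": digest,
        }


replies = ["LGTM with one nit.", "Needs a test for the error path."]
first = MockProvider(replies).complete("Review this change")
second = MockProvider(replies).complete("Review this change")
print(first == second)
```

Because nothing in the response depends on time, randomness, or a remote service, traces and capsules built on top of it can be compared byte for byte across machines.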

What does offline replay replay?

Offline replay checks saved evidence: snapshots, trace shape, artifact contracts, replay metadata, and stable behavior comparisons. It does not re-call a hosted model.
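
One piece of such a check, verifying provenance hashes against the embedded evidence, can be sketched as follows. The capsule layout, the key names (`snapshot_sha256`, `trace_sha256`), and the helper functions are assumptions made for illustration; the real capsule format is defined by the project's JSON schema, not by this sketch.

```python
import copy
import hashlib
import json


def canonical_hash(obj) -> str:
    """SHA-256 over a canonical JSON encoding, so key order cannot change the hash."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_capsule(snapshot, trace) -> dict:
    """Bundle evidence with hashes of each part (hypothetical layout)."""
    return {
        "snapshot": snapshot,
        "trace": trace,
        "provenance": {
            "snapshot_sha256": canonical_hash(snapshot),
            "trace_sha256": canonical_hash(trace),
        },
    }


def verify_capsule(capsule) -> list:
    """Return a list of problems; an empty list means the evidence is intact."""
    problems = []
    for part in ("snapshot", "trace"):
        expected = capsule["provenance"][f"{part}_sha256"]
        if canonical_hash(capsule[part]) != expected:
            problems.append(f"{part} hash mismatch")
    return problems


capsule = build_capsule({"files": ["README.md"]},
                        {"syscalls": [{"tool": "read_file", "allowed": True}]})
tampered = copy.deepcopy(capsule)
tampered["trace"]["syscalls"][0]["allowed"] = False  # edit evidence after the fact

print(verify_capsule(capsule))
print(verify_capsule(tampered))
```

Nothing in this check contacts a model or a network, which matches the answer above: replay validates saved evidence rather than regenerating it.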

Does KAI OS prove the agent answer is correct?

No. It proves runtime evidence: what ran, which tools were allowed or denied, what changed from baseline, and what can be replayed offline.

Why Kotlin/JVM?

JVM teams already run a lot of backend, CI, build, and internal automation infrastructure. Kotlin gives the runtime typed APIs, DSL ergonomics, and coroutine-friendly scheduling.

What this proves, and what it does not.

KAI OS proof is runtime proof, not model-answer truth.

It proves

what agent processes ran.
which tools were requested.
which syscalls were allowed or denied.
whether a capsule can replay offline.
whether stable runtime behavior drifted.

It does not prove

that every agent answer is correct.
that real provider calls are replayed.
that hosted observability is unnecessary for every team.
that v0.3 is a mature managed platform.

Honest gaps

Public GitHub Actions CI is blocked on a missing workflow token scope.
Full Docker smoke depends on external image download speed.
Real model providers are optional outside the default proof path.

If this evidence model should exist, star it.

The project is small, but the direction is crisp: local-first runtime evidence for JVM/Kotlin agents.
