This document answers one question: what should happen next, in order.
For landed milestones, see CHANGELOG.md. For compiler structure, see docs/ARCHITECTURE.md and docs/PASSES.md. For identity and safety, see docs/IDENTITY.md and docs/SAFETY.md. For subsystem references, see docs/FFI.md, docs/ABI.md, docs/DIAGNOSTICS.md, docs/STDLIB.md, docs/VALUE_MODEL.md, and docs/EXECUTION_MODEL.md.
Concrete should stay small enough to remain readable, auditable, and mechanically understandable. New work should be judged by grammar cost, audit cost, and proof cost, not only by expressiveness.
The Lean 4 compiler implements the full pipeline:
Parse → Resolve → Check → Elab → CoreCheck → Mono → Lower → EmitSSA → LLVM IR
The core language, stdlib foundation, report surfaces, and project workflow are real. Phase H (real-program pressure testing) is nearly complete: discovery is done, cleanup is wrapping up. The main missing structural pieces are package/artifact architecture, broader formalization, backend plurality, and a fuller runtime story.
Status: active — discovery complete, cleanup wrapping up.
The remaining work is narrow and evidence-backed. Do not reopen H as open-ended exploration.
Landed during cleanup:
- match-as-expression — value-producing
matchfor branch-produced values, including linear cases
Tasks:
- clean up stdlib output surface so examples stop using builtin-shaped
print_string/print_char— done when stdlib output reads like coherent library code rather than builtin vocabulary string.splitandstring.trim— parser examples reimplement split by hand repeatedly; no workaround exists — done whenStringhassplit,trim,trim_left,trim_rightmethods- path decomposition:
parent,file_name,extension— path construction exists but decomposition is completely absent — done whenPathorPathBufhas all three methods - minimal FFI pressure test — FFI is implemented but has zero small end-to-end validations — done when there is one minimal example that calls C from Concrete with
with(Unsafe)at the boundary andtrustedwrappers - write a classification of remaining runtime/stack pressure findings into language, runtime, stdlib, or tooling — done when there is a document in
research/that assigns each finding to exactly one owner
Not next unless Phase H evidence still demands it:
- string
==operator —.eq()works but is friction at scale — done when==and!=work onStringvalues
References: phase-h-findings, text-and-output-design, cleanup-ergonomics
Status: not started. This is the next major architectural build-out once H cleanup is done.
Tasks:
- incremental compilation: serialize pipeline artifacts, cache by source hash, skip unchanged modules — done when unchanged modules are skipped on rebuild
- split interface artifacts from body artifacts — done when interface/body boundaries are clean enough for separate caching (needed before dependency resolution can be fast)
- third-party dependency model: version constraints, lockfile, resolution — done when package/dependency semantics are explicit
- workspace and multi-package support — done when a multi-package project builds and tests from a single root (depends on dependency model)
- package-aware testing tooling — done when tests can target individual packages (depends on workspaces)
- cross-target FFI/ABI validation — done when validation is empirical, not hand-wavy (depends on package boundaries being real)
- first authority-budget path at module/package scope — done when the authority-budget path is structurally possible (depends on package graph)
- package manifest parsing and version-constraint support — done when the toolchain can parse its own manifest format and version constraints without ad hoc logic
- provenance-aware publishing direction — done when the package graph is not heading toward a trust-model redesign (design-only, no implementation yet)
References: artifact-driven-compiler, package-model, package-manager-design, package-testing-tooling
Status: not started. Do this after package/artifact boundaries are cleaner.
Tasks:
- broaden the pure Core proof fragment — done when the provable subset covers more than the current narrow pure fragment
- stabilize the provable subset as an actual target — done when users can know what is and isn't provable
- source-to-Core and Core-to-proof traceability — done when proof claims trace back to source
- proof-backed authority reports as real artifacts — done when reports are artifacts, not just a research direction
- user-program proof workflow, artifact-driven — done when a user can prove a property end-to-end
- push selected compiler-preservation work where tractable — done when preservation proofs cover the highest-value passes
References: formalization-breakdown, formalization-roi, proving-concrete-functions-in-lean, proof-addon-architecture
Status: not started. Only after the package model and the biggest ergonomics gaps are under control.
The showcase corpus should deliberately rebalance away from mostly text-heavy examples and toward binary parsing, ownership-heavy structures, capability-separated tools, FFI boundaries, and no-allocation-friendly systems code.
Tasks:
- define domains where Concrete should be unusually strong — done when signature strengths are written down
- curate public showcase corpus — done when there are polished examples for each signature domain
- improve onboarding and example presentation — done when a newcomer can build something in under an hour
- define stability / experimental surface — done when users know what is stable and what is not
- sharpen positioning vs neighboring systems languages — done when the pitch is one paragraph, not a lecture
Examples to build (ranked by what they prove about the language):
- Packet parser — binary protocol decoding with capability-controlled I/O, shows
with()separation between parser and network - ELF inspector — structured binary parsing with
#[repr(C)],packed, raw pointers; noUnsafein user code - FFI showcase — C library interop (e.g., zlib or sqlite) with
with(Unsafe)at the boundary andtrustedwrappers - Ownership-heavy data structure — linked list or tree using
Heap<T>, linear ownership, and deterministic cleanup - Privilege-separated tool — hasher can't touch network, reporter can't read files (capability demo)
- No-alloc example — fixed-buffer state machine or ring buffer (depends on Phase 8 allocation-profile work being far enough along)
Presentation formats (ranked by reach):
- Live audit of a real dependency —
with()signatures reveal what code touches - Capability escalation attack (blocked) — compiler says no
- Formal proof demo — correct because proved, not because tested
- "Spot the bug" side-by-side — C/Rust/Concrete, C has a hidden capability leak
- Performance benchmark against C — SHA-256, JSON parsing
- Interactive playground / REPL — highest reach, highest cost
- Package ecosystem demo — practical stdlib usage
- Conference talk with storytelling — narrative-driven
References: adoption-strategy, showcase-workloads
Status: not started. This turns the compiler into a durable reviewable operational system.
Tasks:
- machine-readable reports — done when report output is structured and parseable
- verified FFI envelopes and reportable FFI boundary facts — done when FFI boundaries are auditable from report output
- trust bundles and report-first review workflows — done when reviews can be driven by compiler-emitted trust reports
- semantic query/search over compiler facts — done when you can ask questions about the program and get structured answers
- compatibility checks and trust-drift diffing — done when version bumps surface semantic/trust changes automatically
- review-policy gates — done when CI can enforce authority, trust, FFI, and proof-facing policies
- coverage tooling over tests, reports, and proof artifacts — done when coverage gaps across all three are visible
- editor/LSP baseline — done when there is basic editor support with go-to-definition and diagnostics
- dependency auditing — done when dependencies can be audited for capability and trust properties
- release/compatibility discipline — done when there is a versioning policy and it is enforced
References: evidence-review-workflows, proof-evidence-artifacts, trust-multipliers, developer-tooling
Status: not started. Keep explicit and late.
Tasks:
- stabilize SSA as the backend contract — done when SSA is the only interface between front and back end in practice
- evaluate QBE as first lightweight second backend — done when there is a working QBE path or a clear rejection with reasons
- cross-backend validation and emitted-code inspection — done when two backends produce equivalent output for the test suite
- debug-info and codegen maturity — done when debug builds produce usable source-level debugging
References: qbe-backend, qbe-in-concrete, mlir-backend-shape, optimization-policy
Status: not started. Keep the model explicit, small, and late.
Tasks:
- structured concurrency as semantic center — done when concurrency primitives enforce structured lifetimes
- OS threads + message passing as base primitive — done when thread + channel programs work end-to-end; likely first pieces are
std.thread, typed channels, and only the minimumstd.syncsurface needed to make that model usable - evented I/O only as later specialized model — done when the async story is explicit and opt-in, not default
References: concurrency, long-term-concurrency
Status: not started. Do this after the broader compiler/runtime structure is more stable.
Tasks:
- strengthen
--report alloc— done when the report accurately attributes every allocation to its source - enforceable
NoAlloc— done whenNoAllocfunctions that allocate fail to compile - structural boundedness reports where explainable — done when the compiler can report which functions have bounded allocation
BoundedAlloc(N)only where structurally explainable — done when bounded allocation is enforced without requiring user annotation on every call
References: allocation-budgets, arena-allocation, execution-cost
Status: not started. Do this after allocation profiles and the first explicit concurrency model exist.
Tasks:
- define a restricted analyzable execution profile — done when there is a documented profile covering a recursion ban, no unrestricted allocation, loop-bound rules, concurrency limits, blocking-operation limits, and FFI boundaries
- define the reported operational/trust effect set — done when the compiler has a clear taxonomy and report model for
may_block,crosses_ffi,uses_trusted, recursion/call-cycle status, unknown loop bounds, concurrency usage, and allocation class - implement boundedness and timing-relevant reports from that model — done when the compiler can surface unknown loop bounds, recursion, blocking operations, FFI timing boundaries, and other sources of execution uncertainty
- make the restricted profile enforceable where structurally possible — done when recursion, unknown-bound loops, unrestricted allocation, blocking operations, unrestricted FFI, and disallowed concurrency fail clearly at compile time rather than relying on convention
- define the concurrency subset for analyzable systems — done when the project has a clear answer on whether this profile is single-threaded first or uses a Ravenscar-style restricted concurrency model
- define a tighter bounded-allocation subprofile — done when there is a clear next-stage profile for structurally bounded allocation rather than only a binary no-allocation rule
- define the backend and target assumptions — done when it is explicit what can be claimed at the source/compiler level versus what requires target-specific timing models
- validate the model with bounded examples — done when there are small examples such as a fixed-buffer parser, bounded-state controller, or ring buffer that fit the profile cleanly
References: predictable-execution, effect-taxonomy, allocation-budgets, execution-cost, concurrency, long-term-concurrency, backend-traceability, failure-semantics, trusted-code-policy, interrupt-signal-model
Status: not started. Keep visible without forcing premature language growth.
Candidates:
- typestate
- arena allocation
- target-specific timing models
- layout reports
- binary-format DSLs
- ghost/proof-only syntax
- hardware capability mapping
- capability sandbox profiles
- Miri-style interpreter
References: high-leverage-systems-ideas, ten-x-improvements, typestate
- keep the parser LL(1)
- keep SSA as the only backend boundary
- prefer stable storage for mutable aggregate loop state over phi transport
- avoid parallel semantic lowering paths
- keep builtins minimal and implementation-shaped; keep stdlib clean and user-facing
- keep trust, capability, and foreign boundaries explicit and auditable
- mutable aggregate lowering can still be too backend-sensitive if promoted storage is incomplete
- formalization scope is still narrow
- type-coercion completeness is not proved, only hardened
- the linearity checker is tested heavily but not formally audited
- proof-backed trust claims
- stronger audit outputs
- a smaller trusted computing base
- a better capability/sandboxing story
References: ten-x-improvements, capability-sandboxing, trust-multipliers, ai-assisted-optimization