
00 - Start here

Status. Living index. v1.0 · First draft: 17-03-2026 · Reflects the knowledge base after the first complete pass. Updated when sections are added, removed, or substantively revised.

This is the front door to the knowledge base. It exists to orient a reader who has no prior context, point them to the right reading path for their role, and serve as a map back to specific sections when they come looking for something later.

What this knowledge base is for

Everyone touching SIDK and GREMI - internally, on the investor side, on partner and integrator sides - should share the same mental model of what the platform is, how it works, what it is honestly capable of today, and what it is built to become. This knowledge base is that shared model, written down once, kept up to date.

It is not marketing material, not sales collateral, not a technical specification. The marketing message is downstream of this; the technical specifications it draws from are in Docs/. This is the layer in between - the explainer that lets a reader pick up the canonical specs with the right framing, and lets a non-technical reader (an investor, a board member, a new joiner) build a working mental model without having to read every spec themselves.

Who this is for, in priority order

Internal team and new hires. Primary audience. Reading the knowledge base end-to-end should give any new joiner a working mental model in one focused afternoon.
Investors, board, and aligned partners. Secondary. They need the moat, the regulatory tailwind, and the structural reason the platform’s position is unusually defensible.
Technical partners and integrators. Tertiary. The knowledge base gives them enough context to know whether the canonical source documents in Docs/ are worth their time - and the right framing to read them with once they are.

If you are a customer (a Tenant, a Verifier, a Buyer), the knowledge base is open to you, but it is not yet tuned to your specific workflow. That is a later layer of product documentation that builds on this one.

The shape of the platform, in one paragraph

The world is moving to a regime where embodied carbon in industrial goods - steel, cement, aluminium, chemicals - must be measured, attested, and trusted across organisational boundaries. CBAM is the first binding regulatory regime that turns carbon accounting from voluntary disclosure into a customs gate; more regimes are coming. SIDK is a vertical-agnostic industrial data kernel built by Sphuran, a research organisation, to anchor data against physical reality - asset topologies, lot lineages, transformation events - so that downstream tooling, models, and verifiable claims have something concrete to attach to. Sphuran has used SIDK internally across multiple research domains for six years; carbon, via GREMI, is the first product taken to market on it. GREMI is built by Grevoro, a separate company whose mission aligns with Sphuran’s around sustainable manufacturing. The two companies are structurally independent - separate legal entities, separate cap tables, separate investors, no overlap in leadership - which gives the trust chain an unusually clean answer to the structural-conflict-of-interest question. The platform underneath GREMI is wired to support a multi-phase release trajectory; the first phase is in PoC integration at Topworth Steel today. AI sits on the SIDK boundary, never inside it - which makes AI built on SIDK different in kind from AI built on operational spreadsheets, because every input the AI reads has full audit lineage to a sensor reading, a lab result, or an approved Submission.

That paragraph is the elevator pitch. The rest of the knowledge base is its argument.
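
For readers who think in code, the lineage claim at the end of that paragraph can be pictured roughly as follows. This is a minimal sketch, not SIDK's data model: the `Record` class, its field names, the `has_full_lineage` helper, and the kind strings are all hypothetical, chosen only to mirror the three anchors the paragraph names (a sensor reading, a lab result, an approved Submission).

```python
from dataclasses import dataclass

# Hypothetical kind labels mirroring the three anchors named in the
# paragraph above. These are assumptions, not SIDK's real vocabulary.
ROOT_KINDS = {"sensor_reading", "lab_result", "approved_submission"}


@dataclass(frozen=True)
class Record:
    """Illustrative record; the fields are assumptions for this sketch."""
    record_id: str
    kind: str
    parents: tuple["Record", ...] = ()


def has_full_lineage(record: Record) -> bool:
    """Return True if every upstream path from `record` ends in a root kind."""
    if not record.parents:
        # A record with no parents must itself be one of the anchors.
        return record.kind in ROOT_KINDS
    # A derived record is only as trustworthy as all of its inputs.
    return all(has_full_lineage(parent) for parent in record.parents)


reading = Record("r1", "sensor_reading")
lab = Record("l1", "lab_result")
derived = Record("d1", "derived_value", (reading, lab))
orphan = Record("x1", "spreadsheet_cell")  # no anchored source behind it
mixed = Record("m1", "derived_value", (derived, orphan))

print(has_full_lineage(derived))  # True
print(has_full_lineage(mixed))    # False
```

The point of the sketch is the contrast: an input that traces back only to an unanchored spreadsheet cell fails the check, while one that traces back to anchored records passes. Doc 05 is where the real kernel's primitives and boundary test are described.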

Reading paths

Three suggested paths depending on time available and the reader’s role.

Path A - Internal team or new joiner (≈ 2–3 hours, full picture)

Read in this order, which is numerical with one exception: 01 → 02 → 02a → 03 → 03a → 04 → 04a → 05 → 06 → 07 → 08b → 08 → 99 (FAQ). The exception is that 08b (the destination picture of GREMI) comes before 08 (the trajectory) - reading the destination first makes the phasing language easier to follow. Stop and ask questions whenever something is unclear; the knowledge base has gaps, and your question probably points to a doc that should exist or be expanded.

Path B - Investor, board member, or aligned partner (≈ 45 minutes, the substantive case)

01 - Embodied carbon, from first principles - what problem the world is forcing on us
03 - Regulations: the landscape and 03a - CBAM, in depth - the regulatory tailwind, with sources
04 - Carbon as trust infrastructure - the seven properties any serious carbon platform must deliver
04a - Sphuran, Grevoro, and what their separation means - why the structural answer here is unusually clean
05 - SIDK, and what it is NOT, especially §6 (six years of refinement) and §8 (SIDK as substrate for trustworthy AI) - what specifically makes this platform credible
08b - GREMI at maturity - the destination picture: five apps, five actors, three loops, the trust chain composed end-to-end
08 - GREMI, phased - what is actually being shipped, and in what order, on the way to 08b’s destination

The 45-minute version skips the engine internals (doc 06) and inputs (doc 07) on the assumption that the structural and commercial argument is what matters to this audience. Read those if curiosity demands.

Path C - Technical partner or integrator (≈ 60 minutes)

05 - SIDK, and what it is NOT - especially §2 (the anti-list), §4 (the boundary test), §5 (pack-as-data)
08b - GREMI at maturity - the five-app, three-loop, five-actor architecture you’re building toward or integrating with
06 - The Carbon Engine - the deterministic calculation subsystem
07 - Inputs and sensors - how real data reaches the kernel
02a - Boundaries - the vocabulary, before reading the source specs

After that, go directly to the canonical source documents in Docs/SIDK Handoff Docs/ - architecture.md first, then classification-methodology.md. The source set is Apache-2.0-licensed.

Section map

#	Section	What it covers
00	Start here	This page.
01	Embodied carbon, from first principles	What carbon accounting is about. Why steel is hard. The basic equation and why it’s already a simplification.
02	Scope 1, 2, 3	How emissions get categorised by who controls the source. The dual Scope 2 problem.
02a	Boundaries	The other axis. Cradle-to-gate, cradle-to-grave, cradle-to-cradle, gate-to-gate, EN 15804 modules, CBAM production process. How to read any published carbon number.
03	Regulations: the landscape	GHG Protocol, ISO 14064/14067, EN 15804, CBAM, EU ETS, CSRD, ISSB, California SB 253/261, India CCTS, SBTi, CDP. With “why it matters to us” framing.
03a	CBAM, in depth	The first binding regime. Per-shipment-vs-annual declarations, lot-level traceability, defaults vs actuals, verifier accreditation, the 50-tonne de minimis. Sourced answers to specific questions.
04	Carbon as trust infrastructure	The seven properties any serious carbon platform must deliver. The bridge between “what carbon is” and “how the platform does it”. Includes the AI-substrate consequence.
04a	Sphuran, Grevoro, and what their separation means	The structural argument. Separate entities, no leadership overlap, mission-aligned. What it means for the trust chain. What to watch as both companies scale.
05	SIDK, and what it is NOT	The vertical-agnostic data-anchoring kernel. The seven primitives. The boundary test. Pack-as-data. Six years of refinement. Determinism. SIDK as substrate for trustworthy AI.
06	The Carbon Engine	The deterministic calculation subsystem inside SIDK. Four methods, fallback hierarchy, boundary as first-class input, mass balance for integrated routes, GUM + Monte Carlo uncertainty, shadow calc, locked snapshots, restatement.
07	Inputs and sensors	How real data reaches the kernel. Edge agent, sensor lifecycle, lab results, ERP integration, supplier PCFs, quality flags, the provisioning/operating seam.
08	GREMI, phased	The first product on SIDK. Current state honestly named, six-phase release trajectory, AI overlay across all phases, what is and is not in scope today.
08b	GREMI at maturity, the complete shape	The destination picture. Five apps, five actors, three loops, eleven architectural principles, the trust chain composed end-to-end. The vocabulary doc 08 builds its phasing language on.
99	FAQ	Living. Grows as new questions come up. Organised by topic.

What’s not yet here

The known gaps, stated plainly:

Customer-facing documentation. This knowledge base is for stakeholders inside the platform’s orbit. Tenant-facing onboarding material, verifier-facing engagement guides, and buyer-facing passport-reading guides are separate documentation layers that build on this one.
A second Product Owner’s product on SIDK. The architecture contemplates more than one. None exists today. When and if it does, doc 04a §5 (Pattern 3) flags it as needing dedicated treatment.
Live numbers from Topworth. Doc 08 is honest that integration is being wired and no real production carbon numbers are flowing yet. Once they are, doc 08 will be revised and the worked-example sections of doc 06 will gain real-data versions alongside the spec-derived ones.
Vertical packs beyond steel. The platform is designed for cement, aluminium, pharma, agri, and others, but the only mature pack today is steel. When a second pack ships, this knowledge base will gain a doc on cross-pack patterns and the doc-set will become genuinely vertical-neutral in voice.
A chat interface over the knowledge base. Mentioned in the original framing as an eventual product on top of this material. Not built. Once the knowledge base is stable, this is a reasonable next thing to scope.

If you find a gap not listed here, file it - that is the kind of feedback the knowledge base improves on.

The source documents this draws from

This knowledge base is the explainer layer. The source documents in Docs/ are normative - where they disagree with anything in here, they are correct and the knowledge base is the bug.

Docs/GREMI-App-Ecosystem/ - Foundation (the architectural root), the three loop specs (Quantification, Trust, Stewardship), the five app specs (GREMI, Grevoro Console, Verifier Workspace, Public Verifier, Sphuran Console), the design system, the entity-additions spec.
Docs/SIDK Handoff Docs/ - the SIDK developer documentation set. Start with README.md, then architecture.md, then classification-methodology.md.
Docs/05-carbon-engine.md - the Carbon Engine canonical specification.

Apache-2.0 licensing applies to the SIDK developer documentation set; the canonical Foundation and app specs sit under their own internal-canonical status. See each document’s status line for specifics.

A note on tone and convention

A few patterns held consistently across the knowledge base:

Every numeric or factual claim is cited inline with footnote-style references resolved at the bottom of each doc. The references include primary authoritative sources (IPCC, GHG Protocol, ISO standards, EU regulation text) and further-reading material. Where a claim is interpretive or based on internal context rather than a published source, the doc says so explicitly.
Where the source-of-truth document and the explainer diverge, the explainer flags the divergence (e.g., doc 08 explicitly notes that the three loops in Foundation are vision, not current state).
Deferred and aspirational items are named, not hidden. Doc 08 §3 lists capabilities deliberately not in any current phase. Doc 00 §“What’s not yet here” lists the knowledge base’s own gaps.
No verbatim quotes from EU regulation text. EUR-Lex blocks programmatic fetching; verbatim quotes have not been verified against the official text and so are not used in any doc. Article numbers and regulation identifiers are used instead. This will be revisited when verbatim text can be reliably pulled.

This index is the living entry point. Send corrections, gaps, or suggestions for new docs to the team’s working channel. The knowledge base improves by being read and challenged.
