Chapter · 05 · Sovereign Edge AI for HADR

Human in the loop is the product

The curation factory. ML proposes; humans approve. The approval gate is the product, not the ML.


A lot of AI products treat the model as the asset. The harder the model is to train, the deeper the moat — that is the conventional shape of an AI business. In the rooms this series is about, we took a different view of where the moat lives.

The model is commodity input. The reconciliation logic that decides which model output is allowed to reach the audience — that is the product.

This chapter is about the human curation pipeline that produces, vets, and maintains the only translations the audience ever sees in the HADR mode of our system. The pipeline runs entirely on the same machine as the live runtime. It is the back office that makes the front-of-house dignity parity possible.

The four-state approval lifecycle

Every translation in the lexicon exists in one of four states. Draft is the linguist's personal workspace. Proposed is a candidate awaiting peer review — either a draft promoted by the linguist or an LLM-generated candidate the system has surfaced for human evaluation. Approved is the operational state; only approved translations get used at runtime. Rejected is the terminal state for candidates that fail review.

Every state transition writes to an immutable audit log. Who proposed it. Who approved it. When. What the prior state was. What the new state is. Every approved translation in the lexicon has a complete provenance trail. The trail is the trust.

The hybrid authoring workflow

The linguist's UI is a local web application served by the deployment machine itself. It does not require a network. The workflow has three input paths, used in combination.

Bulk imports come from existing terminology lists — government glossaries, established HADR vocabulary, prior-event lexicons. Each imported entry lands in the proposed state, ready for the linguist to confirm or edit.

LLM proposals fire when the content-package builder encounters a line it can't match against an existing approved entry. The system calls a local translation model and surfaces the candidate to the linguist with provenance marked ML-proposed. The candidate is not used until the linguist has reviewed it.

Manual entries are the linguist's direct work — terms they add from their own knowledge of the domain, the region, or the speaker's register. These enter the draft state until the linguist submits them for review.

In every path, the same gate applies: nothing reaches the audience until a human linguist has actively approved it for that specific language pair.

Per-language-pair independence

The lexicon is organized by language pair, and approval is per-pair. A term approved for English-to-Japanese is independent of the same source term's English-to-Mandarin entry. Different linguists may handle different pairs. The system does not assume one approved translation in one direction implies anything about any other direction.

This is operationally important. A linguist working on Japanese disaster terminology is not the same person as a linguist working on Tagalog or Yup'ik. The system respects that. The audit trail captures it. The lexicon stays consistent within a language pair without forcing cross-pair coupling.
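The per-pair rule falls out of the keying. A sketch, assuming approval is keyed on (source language, target language, source term) — the field names are illustrative:

```python
# Approval is keyed by the full language pair plus the source term.
# Approving one direction leaves every other direction untouched,
# and the stored value records which linguist signed off on which pair.
approved_by: dict[tuple[str, str, str], str] = {}

def approve_pair(src: str, tgt: str, term: str, linguist: str) -> None:
    approved_by[(src, tgt, term)] = linguist

approve_pair("en", "ja", "evacuation center", "linguist_jp")

# The same source term in en->zh is still unapproved:
assert ("en", "zh", "evacuation center") not in approved_by
```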

Sneakernet updates

The pipeline produces, at the end of each curation cycle, an encrypted package containing the updated lexicon, the new content packages, and any software updates. Distribution happens on physical media — an encrypted USB drive carried by hand to the deployment site. There is no cloud sync. The deployment site has no path to a vendor server. Updates are merged with the local lexicon using a conflict-resolution UI when the same entry has been modified in both places.

This is not "we didn't have internet so we used a USB drive." It is a designed data lifecycle. The sneakernet cycle has provenance tracking, conflict resolution, and approval gates at every stage. The architectural commitment to no-egress applies to the curation pipeline as fully as it applies to the live runtime.

Why this is the moat

Translation models get cheaper every quarter. Apple Translation framework, NLLB-200, Megatron NMT, the next open-weight model — each is a candidate for the underlying translation engine, and each will be displaced by the one after it. The choice of translation model is a commodity decision.

The human curation workflow — the lexicon database, the approval lifecycle, the audit trail, the sneakernet cycle, the partner-organization integration — is the harder thing to build. It is the harder thing to maintain. It is the harder thing to replicate.

A competitor could acquire the same translation model tomorrow. They cannot acquire the relationship with the coalition language cell, the prefecture's multilingual disaster information consortium, or the Tribal language office. They cannot acquire the audit-trail provenance that lets a procurement officer trace every translation back to the linguist who approved it. They cannot acquire the discipline to refuse to display what hasn't been through the workflow.

ML proposes. Humans approve. The approval gate is the product, not the ML.

The next chapter widens the lens. The same architecture that runs the HADR briefing room runs the theater stage and the executive boardroom — and is designed to port cleanly to ruggedized hardware for tactical-edge deployment.

Human in the loop. On the edge. In the room.