Blog · May 7, 2026 · 5 min read

Adding a LOINC validator to your import pipeline

A practical look at validating LOINC codes in a lab-result import, with examples drawn from the AdaptivMapr lab_result_catalog template.

By AdaptivMapr Team

LOINC codes look like 718-7 or 4548-4. They are stable and well curated, yet almost no clinical CSV in production carries them in the column the spec asks for. What you get instead is "HbA1c", "Hämoglobin A1c", "GlykHb", or, on a bad day, a free-text description like "blood sugar long-term average".

A LOINC validator on its own does not fix that. It tells you whether 718-7 is a syntactically valid LOINC code; it does not tell you whether "Hämoglobin A1c" should be 4548-4. You need a validator for the syntactic check and a mapping layer for the semantic one. AdaptivMapr keeps the two jobs separate.

The validator

On the lab_result_catalog template, the loinc_code field carries a loinc validator. The check is:

/^[0-9]{1,7}-[0-9]$/

plus an optional check digit step. If the customer wrote 718.7 or LP718-7, the validator rejects the row and the cascade flags the column for human review.
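
If you want to reproduce the syntactic check outside AdaptivMapr, a minimal Python sketch could look like the following. The regular expression is the one shown above. The check-digit step assumes the Mod 10 (Luhn-style) scheme that LOINC documents for its codes; the article does not say which algorithm the built-in validator uses, and the function names here are illustrative only.

```python
import re

# Pattern from the article: 1-7 digits, a hyphen, one check digit.
LOINC_PATTERN = re.compile(r"^[0-9]{1,7}-[0-9]$")


def luhn_check_digit(body: str) -> int:
    """Return the Mod 10 (Luhn) check digit for a string of ASCII digits."""
    total = 0
    # Walk right to left and double every digit in an odd position (1st, 3rd, ...).
    for position, char in enumerate(reversed(body), start=1):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # equivalent to adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def is_valid_loinc(code: str, verify_check_digit: bool = True) -> bool:
    """Syntactic check only: it cannot tell whether the code is the *right* one."""
    if not LOINC_PATTERN.fullmatch(code):
        return False
    if not verify_check_digit:
        return True  # assumption: the optional step can simply be skipped
    body, check = code.split("-")
    return luhn_check_digit(body) == int(check)
```

With the check-digit step enabled, 718-7 and 4548-4 pass, while 718.7 and LP718-7 fail on the pattern and 718-6 fails on the check digit.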

The mapping

The harder problem is the semantic one. The heuristic layer of the cascade compares the customer's column header (and, in schema-only mode, three sample rows) against the hint list on the LOINC field. The hint list ships with the common synonyms in the five supported languages.
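
To make the header comparison concrete, here is a rough Python sketch of a hint-list lookup. It is not the AdaptivMapr implementation: the hint entries, the normalization rules, and the names `LOINC_FIELD_HINTS`, `normalize`, and `match_field` are all assumptions for illustration, and the sketch ignores the sample rows entirely.

```python
import unicodedata
from typing import Dict, List, Optional

# Hypothetical hint list; the real one ships with the template and is not shown here.
LOINC_FIELD_HINTS: Dict[str, List[str]] = {
    "loinc_code": ["LOINC", "LOINC code", "LOINC-Code", "LOINC Nr", "Code LOINC"],
}


def normalize(header: str) -> str:
    """Case-fold, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", header.casefold())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents)
    return " ".join(cleaned.split())


def match_field(header: str, hints: Dict[str, List[str]]) -> Optional[str]:
    """Return the template field whose hint list contains the header, if any."""
    wanted = normalize(header)
    for field, synonyms in hints.items():
        if wanted in {normalize(s) for s in synonyms}:
            return field
    return None
```

An exact match after normalization is the simplest possible rule; a real heuristic layer would presumably also score partial matches and look at the sample values.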

When a column hits the auto-accept threshold ({minN:100, minRatio:0.95} against the persisted statistics table, or {minN:20, minRatio:1.00} for first-time matches), the LLM never runs. That is the single biggest cost lever in the system, and we go out of our way to preserve it.
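
One way to read those two thresholds is sketched below in Python. The numbers come from the text above; the interpretation is an assumption: minN is taken as the minimum number of observed decisions for a header-to-field pairing, and minRatio as the minimum share of those decisions that agreed with the mapping. `Threshold` and `auto_accept` are illustrative names, not part of the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    min_n: int        # assumed meaning: minimum number of observations
    min_ratio: float  # assumed meaning: minimum share that agreed with the mapping


# Values quoted in the article.
PERSISTED = Threshold(min_n=100, min_ratio=0.95)
FIRST_TIME = Threshold(min_n=20, min_ratio=1.00)


def auto_accept(agreeing: int, total: int, threshold: Threshold) -> bool:
    """True when the mapping can be accepted without calling the LLM."""
    if total < threshold.min_n:
        return False  # not enough evidence yet
    return agreeing / total >= threshold.min_ratio
```

Under this reading, 95 agreeing decisions out of 100 clear the persisted threshold, while a first-time match needs at least 20 observations with no disagreement at all.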

Wiring it into your pipeline

If you already have an ingestion job and want to add LOINC validation in front of it, the call shape is:

POST /v1/uploads
{
  "file": "...",
  "template": "lab_result_catalog",
  "mode": "schema_only"
}

The response carries the proposed mapping plus, per row, a list of validation errors keyed by field. Rows that fail the LOINC check are not committed; they land in the review queue so a human can either correct the source or accept a fallback LP* code.
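
On the client side, you may want to see which rows were held back before your own ingestion job continues. The Python sketch below shows one way to do that. The response layout it expects (a `rows` list whose items carry an `index` and an `errors` object keyed by field) is a guess based on the description above, not the documented schema, so adjust the keys to whatever the API actually returns.

```python
from typing import Any, Dict, List, Tuple


def split_by_field_errors(
    response: Dict[str, Any], field: str = "loinc_code"
) -> Tuple[List[int], List[int]]:
    """Return (clean, needs_review) row indices for one field.

    Assumed shape: {"rows": [{"index": 0, "errors": {"loinc_code": ["..."]}}]}
    """
    clean: List[int] = []
    needs_review: List[int] = []
    for row in response.get("rows", []):
        errors = row.get("errors") or {}
        if errors.get(field):
            needs_review.append(row["index"])
        else:
            clean.append(row["index"])
    return clean, needs_review


# Usage with a hand-written response in the assumed shape.
example = {
    "rows": [
        {"index": 0, "errors": {}},
        {"index": 1, "errors": {"loinc_code": ["does not match pattern"]}},
    ]
}
clean_rows, review_rows = split_by_field_errors(example)
```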

What this does not solve

Free-text descriptions ("HbA1c percentage in whole blood") are still a human-curation problem. Full-data mode can take a stab at it via the LLM, but the right move for high-volume catalogues is to maintain a customer-specific synonyms list and let the statistics layer learn it over time.

Browse the lab_result_catalog template for the full field list, or jump to the dashboard to test it against your data.

Try it in 30 seconds

Schema-only mode is free and unlimited. No DPA, no card, no signup required for the MCP free tier.
