
CSV to FHIR: Mapping Patient Demographics Step by Step

Walk a messy partner CSV through the patient_demographics template and out the other side as a clean FHIR Patient resource — name, gender, birthDate, and identifier — without hand-writing a mapping file.

The AdaptivMapr Team · Healthcare Integrations · June 11, 2026 · 8 min read

Almost every healthcare integration starts the same way: a partner sends you a CSV of patients, and not one of its column names matches what your system (or the FHIR Patient resource) actually expects. The header says Pat_Nachname where you want a family name and Geschl where you want gender, and the date column could be ISO, could be US MM/DD/YYYY, and is probably both depending on the row.

This post walks one realistic file through AdaptivMapr’s patient_demographics template and shows how the mapping resolves to a valid FHIR Patient resource — without you writing a column-by-column mapping file by hand.

The CSV we were handed

Here is a trimmed, anonymised version of a partner export. Note the mixed-language headers — this is the norm, not the exception, in cross-border European healthcare data:

Pat_ID,Pat_Nachname,Pat_Vorname,Geschl,Geburtsdatum,Krankenkasse_Nr
P-10293,Müller,Anna,W,1984-03-02,80391847200
P-10294,Rossi,Marco,M,07/11/1990,80391847201

Six columns, five of them with German names, one date column whose format changes between rows, and an identifier whose meaning is only implied by its column name. A regex-based importer fails on the first of these.

Step 1 — Send the schema, not the data

You do not need to ship the whole file to get a mapping proposal. In schema-only mode AdaptivMapr sees only the column headers and up to three short sample rows. That is enough for the cascade to propose a complete mapping, and it means the request does not need to be covered by a data protection agreement. Point the upload at the template and let the engine do the matching:

curl -X POST https://api.adaptivmapr.com/v1/uploads \
  -H "Authorization: Bearer $ADAPTIVMAPR_API_KEY" \
  -F "template=patient_demographics_v1" \
  -F "file=@partner_patients.csv"

The full quickstart, including how to read back and confirm the proposed mapping, lives in the docs.
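If you want to see exactly what a schema-only request would expose before sending anything, you can cut the file down locally. The following is a minimal sketch, not AdaptivMapr's request format: the function name schema_preview and the headers/sample_rows keys are assumptions made for illustration.

```python
import csv
import io


def schema_preview(csv_text: str, max_rows: int = 3) -> dict:
    """Return the header row plus at most max_rows sample rows.

    The dictionary shape is hypothetical; it only illustrates how little
    of the file a schema-only request needs.
    """
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])  # empty list if the file has no rows at all
    samples = []
    for row in reader:
        if len(samples) >= max_rows:
            break
        samples.append(row)
    return {"headers": headers, "sample_rows": samples}
```

Running this over the partner export above would keep the six headers and both data rows, since the file has fewer than three rows.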

Step 2 — How the cascade resolves each header

AdaptivMapr runs a five-layer cascade, cheapest first, and stops the moment a column resolves. For this file most columns never reach the metered LLM layer at all:

Pat_Nachname → family_name — caught at the heuristic layer. The template carries nachname as a German hint, and normalisation strips the Pat_ prefix and the underscore before comparing.
Pat_Vorname → given_name — same layer, German hint vorname.
Geschl → gender — an abbreviation of Geschlecht. The heuristic match is partial, so this one is settled by the fuzzy layer’s token-set comparison, then sanity-checked by the semantic layer.
Geburtsdatum → birth_date — heuristic hit on the German hint geburtsdatum.
Krankenkasse_Nr → insurance_id — “health insurer number.” This is the kind of domain-specific phrasing the semantic (embeddings) layer is built for.

Because the earlier layers absorb most of the work, the LLM layer is only consulted on genuinely ambiguous columns, and even then it is constrained to the template’s allowed column set so it cannot invent a field that does not exist.
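To make the "cheapest first, stop when resolved" idea concrete, here is a rough sketch of the first two layers only. It is not AdaptivMapr's implementation: the HINTS table, the 0.7 threshold, and the function names are assumptions, and the fuzzy step uses Python's difflib character-level ratio as a stand-in for the token-set comparison described above. The semantic and LLM layers are left out, so an unresolved header simply returns None.

```python
from difflib import SequenceMatcher

# Hypothetical hint table; the real template's hints are not shown in this post.
HINTS = {
    "nachname": "family_name",
    "vorname": "given_name",
    "geschlecht": "gender",
    "geburtsdatum": "birth_date",
}


def normalise(header: str) -> str:
    """Lower-case, drop a leading Pat_ prefix, and remove underscores."""
    h = header.strip().lower()
    if h.startswith("pat_"):
        h = h[len("pat_"):]
    return h.replace("_", "")


def resolve(header: str, fuzzy_threshold: float = 0.7):
    """Return (field, layer) for the first layer that resolves, else None."""
    key = normalise(header)
    # Layer 1: heuristic hit on a known hint.
    if key in HINTS:
        return HINTS[key], "heuristic"
    # Layer 2: fuzzy similarity against every hint (difflib stand-in).
    best_hint, best_score = None, 0.0
    for hint in HINTS:
        score = SequenceMatcher(None, key, hint).ratio()
        if score > best_score:
            best_hint, best_score = hint, score
    if best_hint is not None and best_score >= fuzzy_threshold:
        return HINTS[best_hint], "fuzzy"
    # Later layers (semantic, LLM) are out of scope for this sketch.
    return None
```

With these assumed hints, Pat_Nachname resolves at the heuristic layer, Geschl resolves at the fuzzy layer, and Krankenkasse_Nr falls through, which mirrors the walkthrough above.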

Step 3 — Validation before commit

Mapping a header to a field is only half the job; the value still has to be plausible. patient_demographics attaches field-level validators that fire on commit. The birth_date column is validated as a date and range-checked so a typo’d 2904 birth year is rejected rather than silently written. The gender column is normalised against an allowed set rather than free-text. You can read how each validator behaves in the validator reference.
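The two checks described here can be approximated in a few lines. This is a sketch under stated assumptions rather than the template's actual validators: the 1900 lower bound, the rejection of unrecognised gender values, and both function names are hypothetical.

```python
from datetime import date

# FHIR administrative-gender values, as listed later in this post.
ALLOWED_GENDERS = {"female", "male", "other", "unknown"}
# Assumed source codes: German W (weiblich) and M (männlich).
GENDER_CODES = {"w": "female", "m": "male"}


def validate_birth_date(value: str, earliest_year: int = 1900) -> date:
    """Parse an ISO date and reject implausible years.

    date.fromisoformat raises ValueError for non-ISO input such as
    07/11/1990; the range check rejects future dates such as 2904-03-02.
    """
    parsed = date.fromisoformat(value)
    if parsed.year < earliest_year or parsed > date.today():
        raise ValueError(f"birth date out of range: {value}")
    return parsed


def normalise_gender(value: str) -> str:
    """Map a source code onto the allowed set, or raise ValueError."""
    v = value.strip().lower()
    if v in ALLOWED_GENDERS:
        return v
    if v in GENDER_CODES:
        return GENDER_CODES[v]
    raise ValueError(f"unrecognised gender value: {value!r}")
```

Raising on an unknown value, instead of quietly mapping it to unknown, is a design choice in this sketch so that unexpected codes reach a reviewer.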

patient_demographics is a medium-risk template. That means the mapping response flags requires_hitl: true so your own pipeline can route the proposed mapping to a human reviewer before anything is committed. AdaptivMapr does not decide that for you — it surfaces the flag and lets you gate.
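On your side, the gate can be as small as one branch. The sketch below assumes the mapping response has already been parsed into a dictionary; only the requires_hitl key comes from this post, while the function name and the "review"/"commit" labels are made up for illustration.

```python
def route_mapping(response: dict) -> str:
    """Decide whether a proposed mapping goes to a human reviewer.

    A missing flag is treated as requiring review, so the pipeline
    fails closed. That default is an assumption of this sketch.
    """
    if response.get("requires_hitl", True):
        return "review"
    return "commit"
```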

Step 4 — The FHIR Patient resource that comes out

With the columns mapped and the values validated, the row for Anna Müller projects onto a FHIR Patient resource. The template carries a fhir_resource mapping of Patient, so the canonical fields line up directly with the spec:

{
  "resourceType": "Patient",
  "identifier": [
    {
      "system": "urn:partner:patient-id",
      "value": "P-10293"
    },
    {
      "system": "urn:health-insurer:member-no",
      "value": "80391847200"
    }
  ],
  "name": [
    { "use": "official", "family": "Müller", "given": ["Anna"] }
  ],
  "gender": "female",
  "birthDate": "1984-03-02"
}
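For readers who want to see the projection spelled out, here is a minimal sketch that builds the same structure from one mapped, validated row. It is illustrative only: the patient_id key and the whitespace split on given names are assumptions, and the two system URIs are copied from the example above.

```python
def to_fhir_patient(row: dict) -> dict:
    """Project one mapped row onto a FHIR Patient-shaped dictionary.

    Expects canonical keys: patient_id (assumed name), insurance_id,
    family_name, given_name, gender, birth_date.
    """
    return {
        "resourceType": "Patient",
        "identifier": [
            {"system": "urn:partner:patient-id", "value": row["patient_id"]},
            {"system": "urn:health-insurer:member-no", "value": row["insurance_id"]},
        ],
        "name": [
            {
                "use": "official",
                "family": row["family_name"],
                # given is an array; splitting on whitespace is an assumption.
                "given": row["given_name"].split(),
            }
        ],
        "gender": row["gender"],
        "birthDate": row["birth_date"],
    }
```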

A few details worth calling out, because they trip people up:

name is an array of HumanName objects. family is a single string; given is an array (a person can have several given names).
gender is a coded value. The German W/M codes are normalised to FHIR’s administrative-gender values (female, male, other, unknown); see the administrative-gender value set.
birthDate is a FHIR date, which is why Marco’s 07/11/1990 has to be coerced to an ISO date before it is valid: 1990-11-07 if the column is day-first, 1990-07-11 if it is month-first. The validator catches the ambiguity rather than guessing silently across the batch.
identifier is also an array. A patient legitimately carries more than one — here a partner-assigned ID and an insurer member number — each under its own system URI so the two never collide.
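The date ambiguity in particular is easy to demonstrate. The sketch below is not the template's validator; it is a hypothetical helper that lists every ISO date a slash-separated value could mean, so a value with more than one candidate can be flagged for review.

```python
from datetime import datetime


def iso_candidates(value: str) -> set[str]:
    """Return every ISO date that a DD/MM/YYYY or MM/DD/YYYY reading allows."""
    candidates = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            candidates.add(datetime.strptime(value, fmt).date().isoformat())
        except ValueError:
            # This reading is impossible, for example a month of 25.
            pass
    return candidates
```

For 07/11/1990 this yields both 1990-11-07 and 1990-07-11, whereas 25/12/1990 has only one valid reading. A batch-level check could use rows like the second kind to infer the column's convention, but that inference is beyond this sketch.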

Why this beats a hand-written mapping

The first partner CSV is easy enough to map by hand. The tenth one, with slightly different headers, in a slightly different language, is where bespoke mapping code rots. The template encodes the canonical shape once — multilingual hints, validators, and the FHIR resource mapping — and the cascade re-derives the column mapping for every new file. You review the diff; you do not rewrite the parser.

Start from the patient_demographics template, read the FHIR Patient definition alongside it, and send a schema-only request with your own headers to see the proposed mapping before any data leaves your system.