CSV to FHIR: Mapping Patient Demographics Step by Step
Walk a messy partner CSV through the patient_demographics template and out the other side as a clean FHIR Patient resource — name, gender, birthDate, and identifier — without hand-writing a mapping file.
Almost every healthcare integration starts the same way: a partner sends you a CSV of patients, and not one of its column names matches what your system, or the FHIR Patient resource, actually expects. The header says Pat_Nachname where you want a family name and Geschl where you want gender, and there is a date column that could be ISO, could be US MM/DD/YYYY, and is probably both depending on the row.
This post walks one realistic file through AdaptivMapr’s patient_demographics template and shows how the mapping resolves to a valid FHIR Patient resource — without you writing a column-by-column mapping file by hand.
The CSV we were handed
Here is a trimmed, anonymised version of a partner export. Note the mixed-language headers — this is the norm, not the exception, in cross-border European healthcare data:
Pat_ID,Pat_Nachname,Pat_Vorname,Geschl,Geburtsdatum,Krankenkasse_Nr
P-10293,Müller,Anna,W,1984-03-02,80391847200
P-10294,Rossi,Marco,M,07/11/1990,80391847201
Six columns, five of them with German names, one date column that disagrees with itself between rows, and an identifier whose meaning is entirely implicit in the column name. A regex-based importer falls over at the first of these.
Step 1 — Send the schema, not the data
You do not need to ship the whole file to get a mapping proposal. In schema-only mode AdaptivMapr sees only the column headers and up to three short sample rows. That is enough for the cascade to propose a complete mapping, and it keeps the bulk of the patient data out of the request. Point the upload at the template and let the engine do the matching:
curl -X POST https://api.adaptivmapr.com/v1/uploads \
  -H "Authorization: Bearer $ADAPTIVMAPR_API_KEY" \
  -F "template=patient_demographics_v1" \
  -F "file=@partner_patients.csv"
The full quickstart, including how to read back and confirm the proposed mapping, lives in the docs.
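To make "schema-only" concrete, here is a minimal Python sketch of what such a request would contain: the header row plus at most three sample rows, and nothing else. The function name schema_preview and the shape of the returned dictionary are assumptions for illustration, not part of AdaptivMapr's API.

```python
import csv
import io
from itertools import islice

def schema_preview(csv_text: str, max_rows: int = 3) -> dict:
    """Return only the header row and up to max_rows sample rows of a CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])               # first line is the header row
    samples = list(islice(reader, max_rows))  # stop reading after max_rows
    return {"headers": headers, "sample_rows": samples}

# The two-row partner export from this post.
partner_csv = (
    "Pat_ID,Pat_Nachname,Pat_Vorname,Geschl,Geburtsdatum,Krankenkasse_Nr\n"
    "P-10293,Müller,Anna,W,1984-03-02,80391847200\n"
    "P-10294,Rossi,Marco,M,07/11/1990,80391847201\n"
)

preview = schema_preview(partner_csv)
print(preview["headers"])
```

Everything past the third data row is never read, so a large export costs the same to preview as a small one.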
Step 2 — How the cascade resolves each header
AdaptivMapr runs a five-layer cascade, cheapest first, and stops the moment a column resolves. For this file most columns never reach the metered LLM layer at all:
Pat_Nachname → family_name: caught at the heuristic layer. The template carries nachname as a German hint, and normalisation strips the Pat_ prefix and the underscore before comparing.
Pat_Vorname → given_name: same layer, German hint vorname.
Geschl → gender: an abbreviation of Geschlecht. The heuristic match is partial, so this one is settled by the fuzzy layer’s token-set comparison, then sanity-checked by the semantic layer.
Geburtsdatum → birth_date: heuristic hit on the German hint geburtsdatum.
Krankenkasse_Nr → insurance_id: “health insurer number.” This is the kind of domain-specific phrasing the semantic (embeddings) layer is built for.
Because the earlier layers absorb most of the work, the LLM layer is only consulted on genuinely ambiguous columns, and even then it is constrained to the template’s allowed column set so it cannot invent a field that does not exist.
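The first two layers of a cascade like this can be sketched in a few lines of Python. This is an illustration of the idea, not AdaptivMapr's implementation: the HINTS table, the patient_id field, the 0.6 threshold, and the use of difflib's similarity ratio in place of a token-set comparison are all assumptions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical hint table: canonical field -> normalised header spellings.
HINTS = {
    "patient_id": ["id", "patientid"],
    "family_name": ["familyname", "lastname", "nachname"],
    "given_name": ["givenname", "firstname", "vorname"],
    "gender": ["gender", "sex", "geschlecht"],
    "birth_date": ["birthdate", "dob", "geburtsdatum"],
    "insurance_id": ["insuranceid", "membernumber"],
}

def normalise(header: str) -> str:
    """Lowercase, drop a leading 'Pat_' prefix, remove separators."""
    h = re.sub(r"^pat[_\s-]+", "", header.strip().lower())
    return re.sub(r"[^a-z0-9äöüß]", "", h)

def heuristic(header: str):
    """Layer 1: exact match of the normalised header against the hints."""
    key = normalise(header)
    for field, hints in HINTS.items():
        if key in hints:
            return field
    return None

def fuzzy(header: str, threshold: float = 0.6):
    """Layer 2: best similarity ratio against any hint, if above threshold."""
    key = normalise(header)
    best_field, best_score = None, 0.0
    for field, hints in HINTS.items():
        for hint in hints:
            score = SequenceMatcher(None, key, hint).ratio()
            if score > best_score:
                best_field, best_score = field, score
    return best_field if best_score >= threshold else None

def resolve(header: str):
    """Run the layers cheapest first and stop at the first one that resolves."""
    for name, layer in (("heuristic", heuristic), ("fuzzy", fuzzy)):
        field = layer(header)
        if field is not None:
            return field, name
    return None, "unresolved"  # would fall through to the semantic/LLM layers

def accept_llm_proposal(field: str):
    """Reject any proposed target that is not in the allowed field set."""
    return field if field in HINTS else None

for h in ["Pat_Nachname", "Geschl", "Krankenkasse_Nr"]:
    print(h, resolve(h))
```

With these hints, Pat_Nachname resolves at the heuristic layer, Geschl at the fuzzy layer, and Krankenkasse_Nr resolves at neither, which is the case the later layers exist for.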
Step 3 — Validation before commit
Mapping a header to a field is only half the job; the value still has to be plausible. patient_demographics attaches field-level validators that fire on commit. The birth_date column is validated as a date and range-checked so a typo’d 2904 birth year is rejected rather than silently written. The gender column is normalised against an allowed set rather than accepted as free text. You can read how each validator behaves in the validator reference.
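As a rough picture of what these two validators do, here is a minimal Python sketch. The 130-year bound, the function names, and the code table (W for weiblich, M for männlich, D for divers) are assumptions for illustration; the template's actual rules are in the validator reference.

```python
from datetime import date
from typing import Optional

MAX_AGE_YEARS = 130  # assumed plausibility bound, not taken from the template

# Assumed code table: German and English abbreviations plus FHIR's own values.
GENDER_CODES = {
    "w": "female", "f": "female", "female": "female",
    "m": "male", "male": "male",
    "d": "other", "other": "other",
    "u": "unknown", "unknown": "unknown",
}

def validate_birth_date(value: str, today: Optional[date] = None) -> date:
    """Parse an ISO date and reject future or implausibly old birth dates."""
    today = today or date.today()
    parsed = date.fromisoformat(value.strip())  # ValueError on non-ISO input
    if parsed > today:
        raise ValueError(f"birth date {value!r} is in the future")
    if today.year - parsed.year > MAX_AGE_YEARS:
        raise ValueError(f"birth date {value!r} is over {MAX_AGE_YEARS} years ago")
    return parsed

def normalise_gender(value: str) -> str:
    """Map a partner code onto the allowed set, or fail loudly."""
    code = value.strip().lower()
    if code not in GENDER_CODES:
        raise ValueError(f"unrecognised gender code {value!r}")
    return GENDER_CODES[code]

print(validate_birth_date("1984-03-02"), normalise_gender("W"))
```

Both functions raise instead of substituting a default, so a bad value stops the commit rather than landing in the record.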
patient_demographics is a medium-risk template. That means the mapping response flags requires_hitl: true so your own pipeline can route the proposed mapping to a human reviewer before anything is committed. AdaptivMapr does not decide that for you — it surfaces the flag and lets you gate.
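On the consuming side, the gate can be as small as one branch. The sketch below assumes the mapping response has been parsed into a dictionary; only the requires_hitl key comes from this post, and the rest of the response shape and the two route names are hypothetical.

```python
def next_step(mapping_response: dict) -> str:
    """Route a proposed mapping based on its requires_hitl flag.

    A missing flag is treated as True, so the pipeline fails towards review.
    """
    if mapping_response.get("requires_hitl", True):
        return "human_review"
    return "auto_commit"

# Hypothetical parsed response for the medium-risk template.
proposal = {"template": "patient_demographics_v1", "requires_hitl": True}
print(next_step(proposal))
```

Defaulting to review when the flag is absent is a design choice of this sketch: an unexpected response shape should never be the reason a mapping commits unreviewed.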
Step 4 — The FHIR Patient resource that comes out
With the columns mapped and the values validated, the row for Anna Müller projects onto a FHIR Patient resource. The template carries a fhir_resource mapping of Patient, so the canonical fields line up directly with the spec:
{
  "resourceType": "Patient",
  "identifier": [
    {
      "system": "urn:partner:patient-id",
      "value": "P-10293"
    },
    {
      "system": "urn:health-insurer:member-no",
      "value": "80391847200"
    }
  ],
  "name": [
    { "use": "official", "family": "Müller", "given": ["Anna"] }
  ],
  "gender": "female",
  "birthDate": "1984-03-02"
}
A few details worth calling out, because they trip people up:
name is an array of HumanName objects. family is a single string; given is an array (a person can have several given names).
The German W/M codes are normalised to FHIR’s administrative-gender values (male, female, other, unknown); see the administrative-gender value set.
birthDate is a FHIR date, which is why Marco’s 07/11/1990 has to be coerced to ISO form before it is valid: 1990-11-07 if the partner writes DD/MM/YYYY, 1990-07-11 if it writes MM/DD/YYYY. The validator catches the ambiguity rather than guessing silently across the batch.
identifier is also an array. A patient legitimately carries more than one, here a partner-assigned ID and an insurer member number, each under its own system URI so the two never collide.
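The projection itself is mechanical once the row is keyed by canonical field names. The Python sketch below is an assumption-laden illustration rather than the template's code: patient_id as a canonical field, the W/M/D code table, and the slash_format parameter are hypothetical, while the two system URIs are the ones used in the example above.

```python
from datetime import datetime
from typing import Optional

GENDER = {"W": "female", "M": "male", "D": "other"}  # assumed code table

def to_iso_date(value: str, slash_format: Optional[str] = None) -> str:
    """Return YYYY-MM-DD; refuse non-ISO dates unless a format is stated."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date().isoformat()
    except ValueError:
        pass
    if slash_format is None:
        # 07/11/1990 could be 7 November or 11 July, so do not guess.
        raise ValueError(f"non-ISO date {value!r}: day/month order must be given")
    return datetime.strptime(value, slash_format).date().isoformat()

def to_fhir_patient(row: dict, slash_format: Optional[str] = None) -> dict:
    """Project one mapped row (canonical keys) onto a FHIR Patient dict."""
    return {
        "resourceType": "Patient",
        "identifier": [
            {"system": "urn:partner:patient-id", "value": row["patient_id"]},
            {"system": "urn:health-insurer:member-no", "value": row["insurance_id"]},
        ],
        "name": [{
            "use": "official",
            "family": row["family_name"],
            "given": row["given_name"].split(),  # several given names -> list
        }],
        "gender": GENDER.get(row["gender"].strip().upper(), "unknown"),
        "birthDate": to_iso_date(row["birth_date"], slash_format),
    }

anna = {
    "patient_id": "P-10293", "family_name": "Müller", "given_name": "Anna",
    "gender": "W", "birth_date": "1984-03-02", "insurance_id": "80391847200",
}
patient = to_fhir_patient(anna)
print(patient["birthDate"], patient["gender"])
```

Marco's row raises until the caller states the partner's convention, for example slash_format="%d/%m/%Y", which is the behaviour you want for a whole batch: one explicit decision instead of a per-row guess.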
Why this beats a hand-written mapping
The first partner CSV is easy enough to map by hand. The tenth one, with slightly different headers, in a slightly different language, is where bespoke mapping code rots. The template encodes the canonical shape once — multilingual hints, validators, and the FHIR resource mapping — and the cascade re-derives the column mapping for every new file. You review the diff; you do not rewrite the parser.
Start from the patient_demographics template, read the FHIR Patient definition alongside it, and send a schema-only request with your own headers to see the proposed mapping before any data leaves your system.