Blog · May 21, 2026 · 5 min read

Multilingual patient roster import (DE/FR/IT)

Swiss healthcare data arrives in three languages, often from the same customer. Here is how the cascade resolves "Vorname", "Prénom", and "Nome" to the same field.

By AdaptivMapr Team

On a normal day, a Swiss hospital's roster files arrive with German column headers from the HR system, French headers from the bilingual cantonal payroll feed, and the occasional Italian header from the Ticino branch. The same logical field, given name, appears as "Vorname", "Prénom", and "Nome" in three different files from the same customer.

Generic CSV importers handle this by asking the user to map manually, file by file. That works for one tenant. It does not scale to a multi-tenant healthcare SaaS where the import widget is in the critical path of every onboarding.

How the cascade handles it

Each field on each template carries a multilingual hint list. For given_name on the patient_demographics template the hints include — among others — "first name", "given name", "Vorname", "Prénom", "Prenom", "Nome", "Nombre". The heuristic layer normalises both the incoming header and the hint list (strip accents, lowercase, collapse punctuation/whitespace) before comparing.
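The normalisation step can be sketched in a few lines of Python. This is a minimal illustration of the three operations named above, not AdaptivMapr's actual implementation; the function names and the shortened hint list are assumptions made for the example.

```python
import re
import unicodedata

def normalise_header(text: str) -> str:
    """Strip accents, lowercase, and collapse punctuation/whitespace."""
    # NFKD splits "é" into "e" plus a combining accent; drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower(); then every run of
    # punctuation, underscores, or whitespace becomes a single space.
    return re.sub(r"[\W_]+", " ", no_accents.casefold()).strip()

# Illustrative subset of the given_name hints quoted in the text.
GIVEN_NAME_HINTS = ["first name", "given name", "Vorname", "Prénom", "Prenom", "Nome", "Nombre"]
NORMALISED_HINTS = {normalise_header(h) for h in GIVEN_NAME_HINTS}

def matches_given_name(header: str) -> bool:
    # Hints are normalised once up front, so this is a plain set lookup.
    return normalise_header(header) in NORMALISED_HINTS
```

Because both sides go through the same function, a header such as "First_Name" or " VORNAME " matches without any fuzzy scoring.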

"Prénom" and "PRENOM" both normalise to prenom. The comparison is a direct equality check at that point. No LLM involved.

Three layers, in order

  1. Statistics — if this customer (or the global learning table) has accepted "Prénom" as given_name on the patient template more than 100 times with >95% agreement, the cascade auto-accepts without ever checking the hints.
  2. Heuristic — the normalised header is compared against the column key, the human label, and every hint. Multilingual hints live here.
  3. LLM — only invoked for the columns that neither statistics nor heuristics resolved. Constrained to return one of the allowed columns from the target template.
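The three layers above can be sketched as a single function that falls through from one layer to the next. This is a hedged illustration only: the Acceptance record, the resolve signature, and the llm callback are hypothetical names, and the human label is folded into the hint list for brevity. The thresholds are the ones quoted in the list.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

MIN_ACCEPTS = 100      # "more than 100 times"
MIN_AGREEMENT = 0.95   # ">95% agreement"

@dataclass
class Acceptance:
    field: str     # target field the header was mapped to
    accepted: int  # how many times that mapping was accepted
    seen: int      # how many times the header was seen in total

def normalise(text: str) -> str:
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[\W_]+", " ", no_accents.casefold()).strip()

def resolve(
    header: str,
    history: Dict[str, Acceptance],        # keyed by normalised header
    hints: Dict[str, List[str]],           # field key -> hint list
    llm: Callable[[str, List[str]], str],  # hypothetical LLM callback
) -> Tuple[Optional[str], str]:
    key = normalise(header)

    # Layer 1: statistics. Auto-accept without looking at the hints.
    past = history.get(key)
    if past and past.accepted > MIN_ACCEPTS and past.accepted / past.seen > MIN_AGREEMENT:
        return past.field, "statistics"

    # Layer 2: heuristic. Equality against the field key and every hint.
    for field, field_hints in hints.items():
        if key in {normalise(c) for c in [field, *field_hints]}:
            return field, "heuristic"

    # Layer 3: LLM. The answer only counts if it names an allowed field.
    guess = llm(header, list(hints))
    if guess in hints:
        return guess, "llm"
    return None, "unresolved"
```

The second element of the returned tuple records which layer answered, which makes it easy to measure how often an import reaches the LLM at all.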

On a typical Swiss patient roster, layers 1 and 2 resolve every column. The LLM call cost is zero, and the latency is dominated by the file parse.

What about the values

Headers are the easy half. The values are messier — a "language" column might carry "DE", "Deutsch", "Allemand", "Tedesco", or "German" depending on the source system. For schema-only callers, AdaptivMapr returns the proposed mapping and the customer's code applies it; value-level normalisation is up to the customer, with the validators on the target field constraining the accepted shape.

For full-data callers, the LLM cleanup pass runs through PHI Gateway and normalises the values inline — "Deutsch" becomes de, "Allemand" becomes de. The rules are deterministic where they can be and learned where they cannot.
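The deterministic part of value normalisation amounts to a lookup table, and schema-only callers who normalise values in their own code can use the same approach. The sketch below is an assumption about how such a table might look, not the rule set AdaptivMapr ships; only the German variants come from the text above, and the French and Italian entries are added for illustration.

```python
import unicodedata
from typing import Optional

# Illustrative table: every known spelling maps to a lowercase language code.
LANGUAGE_CODES = {
    "de": "de", "deutsch": "de", "allemand": "de", "tedesco": "de", "german": "de",
    "fr": "fr", "franzosisch": "fr", "francais": "fr", "francese": "fr", "french": "fr",
    "it": "it", "italienisch": "it", "italien": "it", "italiano": "it", "italian": "it",
}

def normalise_language(value: str) -> Optional[str]:
    """Return a language code, or None when no deterministic rule applies."""
    # Same idea as header matching: strip accents and case before the lookup.
    decomposed = unicodedata.normalize("NFKD", value)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return LANGUAGE_CODES.get(no_accents.casefold().strip())
```

A None result marks a value that no deterministic rule covers, which is where a learned rule or a validator rejection would take over.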

Adding your own hints

If your tenant population uses a vocabulary the templates do not already cover — a regional dialect, an internal code, a vendor abbreviation — adding hints to the field definition is the highest-leverage change you can make. A single new hint can move thousands of imports from the LLM layer down to the heuristic layer, which is free and instant.
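To make the effect concrete, here is a small sketch of a header that misses the heuristic layer until one hint is added. The field definition is a hypothetical dictionary shape chosen for the example, and "Rufname" stands in for whatever vocabulary your tenants use; the real template schema may look different.

```python
import re
import unicodedata

def normalise(text: str) -> str:
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[\W_]+", " ", no_accents.casefold()).strip()

# Hypothetical field definition with a shortened hint list.
given_name = {
    "key": "given_name",
    "label": "Given name",
    "hints": ["first name", "Vorname", "Prénom", "Nome"],
}

def heuristic_match(header: str, field: dict) -> bool:
    # Compare against the column key, the human label, and every hint.
    candidates = [field["key"], field["label"], *field["hints"]]
    return normalise(header) in {normalise(c) for c in candidates}

# Before: the header is unknown, so it would fall through to the LLM layer.
before = heuristic_match("Rufname", given_name)

# After: one added hint resolves every casing and spacing variant of it.
given_name["hints"].append("Rufname")
after = heuristic_match("RUFNAME", given_name)
```

Here before is False and after is True: the only change between the two calls is the appended hint.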

Browse the patient_demographics template to see the current hint list, or jump to the dashboard to test against a multilingual fixture.

Try it in 30 seconds

Schema-only mode is free and unlimited. No DPA, no card, no signup required for the MCP free tier.
