17 February 2026 - Xavier Flanagan, Founder

How exora’s AI pipeline works

What happens when you upload a document to exora?

From your perspective, it is straightforward: you drop in a PDF or photo of a medical document, wait a few moments, and your health data appears - structured, searchable, and linked back to the original. But behind that simplicity is a multi-pass AI pipeline that reads your documents the way a clinician would, not just scanning for keywords but understanding clinical meaning.

Here is how it works.

Why multiple passes?

The naive approach to document processing is to throw the whole thing at an AI model and say “extract everything.” That works poorly for medical documents. A discharge summary might contain medication lists, vital signs, lab results, diagnoses, procedure notes, and follow-up instructions all woven together across multiple pages. Asking one model to do everything at once leads to missed information, confused context, and lower accuracy.

Instead, exora breaks the work into focused stages. Each pass has a specific job and builds on the results of the previous one. Think of it as a team of specialists rather than one generalist, each focusing on what they do best.
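To make the idea of passes that build on each other concrete, here is a minimal sketch. It is not exora's actual code: the stage function names, the dictionary used to carry results, and the placeholder values are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical stage type: each stage receives everything found so far
# and returns it with its own contribution added.
Stage = Callable[[dict], dict]

def analyse_document(state: dict) -> dict:
    # Stage 1 (placeholder logic): record the encounters found in the text.
    return {**state, "encounters": ["placeholder encounter"]}

def detect_entities(state: dict) -> dict:
    # Stage 2 builds on stage 1: entities are found per encounter.
    entities = [f"entity in {e}" for e in state["encounters"]]
    return {**state, "entities": entities}

def extract_clinical_data(state: dict) -> dict:
    # Stage 3 builds on stage 2: each entity becomes a structured record.
    records = [{"source_entity": e} for e in state["entities"]]
    return {**state, "records": records}

def assign_codes(state: dict) -> dict:
    # Stage 4 builds on stage 3: each record gets a code slot (left empty here).
    coded = [{**r, "code": None} for r in state["records"]]
    return {**state, "coded": coded}

PIPELINE: list[Stage] = [
    analyse_document,
    detect_entities,
    extract_clinical_data,
    assign_codes,
]

def run_pipeline(document_text: str) -> dict:
    state = {"text": document_text}
    for stage in PIPELINE:
        state = stage(state)  # each pass sees the output of the one before it
    return state
```

The point of the shape is that no stage has to do everything: each one reads what earlier stages produced and adds one layer.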

Stage 1: Document analysis and encounter discovery

The first thing the pipeline does is understand what it is looking at. Is this a pathology report? A discharge summary? A specialist letter? A prescription? The document type determines how it should be read.

Then the pipeline identifies the healthcare encounters within the document. A single PDF might describe several: a hospital admission that included surgery, then a follow-up appointment and a series of blood tests. Each encounter is logged with its date, provider, and facility, building the chronological backbone of your health timeline.
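As a rough sketch of what an encounter record and timeline could look like, assuming a simple record type (the field names, the `build_timeline` helper, and the sample values are hypothetical, not exora's schema):

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Encounter:
    encounter_date: date
    provider: str
    facility: str
    kind: str  # e.g. "hospital admission", "follow-up appointment"

def build_timeline(encounters: list[Encounter]) -> list[Encounter]:
    # Order encounters chronologically, whatever order they appeared in the PDF.
    return sorted(encounters, key=lambda e: e.encounter_date)

# Made-up example data: two encounters found in one document, out of order.
found = [
    Encounter(date(2025, 3, 14), "Dr B", "Example Clinic", "follow-up appointment"),
    Encounter(date(2025, 3, 2), "Dr A", "Example Hospital", "hospital admission"),
]
timeline = build_timeline(found)
```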

Stage 2: Entity detection

With the document structure understood, the pipeline scans for health entities - the individual facts that make up your medical record. This includes:

Conditions and diagnoses - from “Type 2 Diabetes Mellitus” to “mild osteoarthritis of the left knee”
Medications - drug names, doses, frequencies, routes of administration
Vital signs - blood pressure, heart rate, temperature, oxygen saturation
Laboratory results - blood tests, urine tests, pathology findings with reference ranges
Procedures - surgeries, imaging studies, biopsies
Allergies and adverse reactions
Immunisations

This is not simple keyword matching. When the pipeline sees “BP 120/80,” it understands that this represents two distinct measurements: a systolic blood pressure of 120 mmHg and a diastolic of 80 mmHg. When it sees “Amoxicillin 500mg TDS,” it knows that is amoxicillin, 500 milligrams, three times daily. Clinical context matters, and the pipeline is built to understand it.
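To show the kind of output this interpretation produces, here is a small sketch that turns “BP 120/80” into two separate observations. The regular expression is only a stand-in so the example can run; the pipeline itself relies on AI models rather than patterns like this, and the `Observation` type and function name are assumptions.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    name: str
    value: int
    unit: str

# Stand-in pattern for illustration only: "BP", then systolic/diastolic.
BP_PATTERN = re.compile(r"\bBP\s*(\d{2,3})\s*/\s*(\d{2,3})\b")

def split_blood_pressure(text: str) -> list[Observation]:
    match = BP_PATTERN.search(text)
    if match is None:
        return []
    systolic, diastolic = int(match.group(1)), int(match.group(2))
    # One mention in the document becomes two distinct measurements.
    return [
        Observation("systolic blood pressure", systolic, "mmHg"),
        Observation("diastolic blood pressure", diastolic, "mmHg"),
    ]
```

Calling `split_blood_pressure("BP 120/80")` yields a systolic observation of 120 mmHg and a diastolic observation of 80 mmHg, while text with no blood pressure mention yields an empty list.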

Stage 3: Clinical extraction and structuring

Detected entities are then extracted into structured clinical data. This is where a mention of “metformin 500mg BD” becomes a proper medication record, with the drug name, dose, and frequency separated into discrete fields, along with the route where the document states it. Lab results get structured with their test names, values, units, and reference ranges.

Every extracted fact is linked to the healthcare encounter it belongs to, building a complete clinical picture organised by time and context rather than by document.
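A minimal sketch of such a record, assuming a simple data class and a tiny abbreviation table. The field names, the `structure_medication` helper, the pattern, and the `encounter_id` value are hypothetical; the real extraction is done by AI models, not by a regular expression.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Common dosing abbreviations used in the examples in this post.
FREQUENCIES = {"BD": "twice daily", "TDS": "three times daily"}

@dataclass(frozen=True)
class MedicationRecord:
    drug_name: str
    dose_value: float
    dose_unit: str
    frequency: str
    route: Optional[str]  # None when the document does not state a route
    encounter_id: str     # links the fact to the encounter it belongs to

# Stand-in pattern: "<drug> <number><unit> <ABBREVIATION>"
MED_PATTERN = re.compile(
    r"^\s*([A-Za-z][A-Za-z\- ]*?)\s+(\d+(?:\.\d+)?)\s*(mg|g|mcg)\s+([A-Z]+)\s*$"
)

def structure_medication(mention: str, encounter_id: str) -> Optional[MedicationRecord]:
    m = MED_PATTERN.match(mention)
    if m is None or m.group(4) not in FREQUENCIES:
        return None  # leave anything unrecognised for a later review step
    return MedicationRecord(
        drug_name=m.group(1).lower(),
        dose_value=float(m.group(2)),
        dose_unit=m.group(3),
        frequency=FREQUENCIES[m.group(4)],
        route=None,
        encounter_id=encounter_id,
    )

record = structure_medication("metformin 500mg BD", encounter_id="enc-001")
```

Here `record` holds metformin, 500.0, mg, and “twice daily” as separate fields, tied to the placeholder encounter `enc-001`.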

Stage 4: Medical coding

The final processing stage assigns internationally recognised medical codes to your health data. This is what makes the data truly interoperable - usable across different health systems, not just readable by humans.

Three coding systems are used:

SNOMED CT - the global standard for clinical terminology. It gives every condition, procedure, and finding a unique code that means the same thing in any health system that uses it. “Type 2 Diabetes Mellitus” becomes SNOMED code 44054006, unambiguous regardless of language or country.
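RxNorm - a standard medication terminology maintained by the US National Library of Medicine. It normalises drug names across brands and generics, so “Panadol,” “Tylenol,” and “paracetamol 500mg tablet” all trace back to the same active ingredient (listed in RxNorm under its US name, acetaminophen).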
LOINC - the standard for laboratory and clinical observations. It ensures that a “fasting blood glucose” test means the same thing whether it was ordered in Melbourne or Montreal.

Medical coding matters because it turns human-readable notes into machine-comparable data. When you want to see all your blood glucose results over time - across different labs, different doctors, different years - coding is what makes that possible.
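Here is a small sketch of why a shared code makes that possible. The record type, the `trend_by_code` helper, and all sample values are made up for illustration, and the code string is a placeholder rather than a real LOINC code.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class LabResult:
    code: str             # placeholder standing in for a LOINC code
    label_on_report: str  # what the lab actually printed
    value: float
    unit: str
    collected: date

def trend_by_code(results: list[LabResult]) -> dict[str, list[LabResult]]:
    # Group by code, not by label, so differently worded reports line up.
    grouped: dict[str, list[LabResult]] = defaultdict(list)
    for r in sorted(results, key=lambda r: r.collected):
        grouped[r.code].append(r)
    return dict(grouped)

# Made-up results from different labs with different wording.
GLUCOSE = "PLACEHOLDER-FASTING-GLUCOSE"  # not a real LOINC code
results = [
    LabResult(GLUCOSE, "Glucose (fasting)", 5.9, "mmol/L", date(2025, 6, 1)),
    LabResult(GLUCOSE, "Fasting blood glucose", 5.4, "mmol/L", date(2024, 1, 15)),
    LabResult(GLUCOSE, "FBG", 5.1, "mmol/L", date(2025, 11, 20)),
]
trend = trend_by_code(results)[GLUCOSE]
```

The three reports use three different labels, but because they share one code they fall into a single series ordered by collection date.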

Source provenance: every fact has a receipt

Throughout every stage, the pipeline tracks exactly where each piece of information came from. Not just which document, but the specific page and location within that page.

When you see a medication in your exora record, you can tap it and be taken directly to the exact spot in the original document where that medication was mentioned. This is not a summary or a paraphrase - it is a direct link to the source.

In healthcare, this matters enormously. AI systems can make mistakes. Documents can contain errors. The ability to verify any fact against its source is not optional - it is essential. We call it “every fact has a receipt” because that is exactly what it is: proof.
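One way to picture a receipt is as a small pointer stored alongside each fact. The sketch below assumes the location is kept as a bounding box in page fractions; that representation, the type names, and the sample values are assumptions, not a description of exora's storage format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SourceLocation:
    document_id: str
    page: int
    # Assumed representation: left, top, right, bottom as fractions of the page.
    box: tuple[float, float, float, float]

@dataclass(frozen=True)
class Fact:
    description: str
    source: SourceLocation

def receipt(fact: Fact) -> str:
    # Human-readable pointer back to where the fact was found.
    left, top, right, bottom = fact.source.box
    return (
        f"{fact.source.document_id}, page {fact.source.page}, "
        f"region ({left:.2f}, {top:.2f}) to ({right:.2f}, {bottom:.2f})"
    )

# Made-up example: a medication mentioned on page 2 of a document.
fact = Fact(
    description="metformin 500mg BD",
    source=SourceLocation("doc-001", 2, (0.10, 0.40, 0.60, 0.45)),
)
```

Because the pointer travels with the fact, any screen that shows the fact can also open the original page at that region.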

The AI providers

exora uses AI models from Google (Gemini) and OpenAI, selected per stage based on which performs best for that specific task. We continuously evaluate model performance and update our selections as providers release improvements.
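Per-stage selection can be sketched as picking the best scorer for each stage from evaluation results. Everything here is hypothetical: the stage names, the model names “model-a” and “model-b”, and the scores are invented for illustration and say nothing about which provider exora uses for which stage.

```python
def select_models(scores: dict[str, dict[str, float]]) -> dict[str, str]:
    # For each stage, choose the model with the highest evaluation score.
    return {
        stage: max(by_model, key=by_model.get)
        for stage, by_model in scores.items()
    }

# Invented evaluation scores, per stage and per candidate model.
evaluation = {
    "entity_detection": {"model-a": 0.91, "model-b": 0.88},
    "medical_coding": {"model-a": 0.84, "model-b": 0.90},
}
selection = select_models(evaluation)
```

With these invented numbers, `selection` maps entity detection to model-a and medical coding to model-b; rerunning the evaluation after a provider update could change either choice.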

An important note on data handling: your documents are processed through these providers’ commercial APIs. Under their paid API terms, your data is not used to train their AI models. It is processed and returned to us. The providers may temporarily retain data for safety monitoring (up to 30 days), but it is not stored long-term or used for any purpose beyond serving your request.

AI is a tool, not a doctor

The pipeline is powerful, but it is not infallible. AI-extracted information can contain errors or miss nuances that a human clinician would catch. Medical codes are assigned algorithmically and have not been verified by a healthcare professional.

That is why source provenance is so central to exora. We do not ask you to trust the AI blindly. We give you the tools to verify everything it produces. The AI does the heavy lifting of reading, extracting, and organising. You and your healthcare team make the clinical decisions.

exora is a tool that helps you understand and manage your health information. It does not diagnose, does not recommend treatment, and does not replace professional medical advice. It helps you be a more informed participant in your own care.
