Storage Footprint of 4 FHIR Servers After Loading the Same Dataset

Storage size is the metric that quietly shapes hosting cost, backup windows, and disaster-recovery RTO over the life of a FHIR platform. A new public benchmark from Health Samurai loaded the same Synthea dataset (1,000 patients, around 2 million resources) into four FHIR servers running on identical hardware, and the storage footprint after import ranged from 4.24 GB to 22.6 GB. That spread is large enough to change the conversation about which server fits which deployment.

The Storage Numbers After Identical Data Load

The four servers, ranked by on-disk size after loading the same Synthea data:

  • Microsoft FHIR Server: 4.24 GB
  • Aidbox: 6.83 GB
  • Medplum: 11.8 GB
  • HAPI FHIR: 22.6 GB

That is a 5.3x spread on the same dataset. The trade-offs behind the numbers matter more than the ranking itself.
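The spread can be checked directly from the four figures listed above. The following is a minimal sketch; the function name `relative_footprint` is illustrative and not part of the benchmark.

```python
# On-disk sizes in GB, copied from the list above.
sizes_gb = {
    "Microsoft FHIR Server": 4.24,
    "Aidbox": 6.83,
    "Medplum": 11.8,
    "HAPI FHIR": 22.6,
}


def relative_footprint(sizes):
    """Return each server's size as a multiple of the smallest footprint."""
    smallest = min(sizes.values())
    return {name: round(size / smallest, 2) for name, size in sizes.items()}


ratios = relative_footprint(sizes_gb)
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the smallest footprint")
```

The largest-to-smallest ratio, 22.6 / 4.24, is where the 5.3x figure comes from.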

What Drives the Spread

The spread is not really about how efficiently each server encodes a FHIR resource. The dominant factor is how each server treats search indexes at write time. HAPI, Medplum, and the Microsoft FHIR Server pre-build search indexes as resources arrive. The index tables, supporting structures, and metadata they maintain on write are what fill most of the disk. Aidbox ships without default search indexes; the operator decides which indexes to create based on the queries the application actually issues.

That single design choice produces the bulk of the footprint difference. The Microsoft FHIR Server lands smallest in this snapshot even though it indexes on write, which reflects SQL Server 2022's storage layout for the indexed columns it builds. HAPI's footprint reflects the JPA schema's broader pre-built index coverage.
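The effect of write-time indexes on disk size can be seen in a toy experiment. The sketch below uses SQLite purely as a stand-in; the `resource` table, its columns, and the index names are hypothetical and do not reflect the schema of any of the four servers. It loads identical rows into two fresh database files, one with indexes created up front and one without, and compares the file sizes.

```python
import os
import sqlite3
import tempfile


def size_after_load(with_indexes, rows=20000):
    """Load the same toy rows into a fresh SQLite file; return its size in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.db")
        conn = sqlite3.connect(path)
        # Hypothetical, flattened stand-in for a FHIR resource table.
        conn.execute(
            "CREATE TABLE resource ("
            "id INTEGER PRIMARY KEY, patient TEXT, code TEXT, effective TEXT, body TEXT)"
        )
        if with_indexes:
            # Indexes created before the load are maintained on every insert.
            conn.execute("CREATE INDEX idx_patient ON resource(patient)")
            conn.execute("CREATE INDEX idx_code ON resource(code)")
            conn.execute("CREATE INDEX idx_effective ON resource(effective)")
        data = (
            (f"Patient/{i % 1000}", f"code-{i % 50}", f"2024-01-{i % 28 + 1:02d}", "x" * 200)
            for i in range(rows)
        )
        conn.executemany(
            "INSERT INTO resource (patient, code, effective, body) VALUES (?, ?, ?, ?)",
            data,
        )
        conn.commit()
        conn.close()
        return os.path.getsize(path)


plain = size_after_load(with_indexes=False)
indexed = size_after_load(with_indexes=True)
print(f"no indexes: {plain} bytes, with indexes: {indexed} bytes")
print(f"ratio: {indexed / plain:.2f}x")
```

The exact ratio depends on row shape and index count, so it says nothing about the benchmark's absolute numbers; it only illustrates the mechanism of index structures adding to the stored data.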

Why the Indexing Default Matters in Production

A FHIR platform that pre-builds every plausible index ships ready for any query the application might issue. The cost is on disk, in import time, and in write throughput. A FHIR platform that ships with no default indexes is faster to import and smaller on disk, but the operator carries a real responsibility: an unindexed search at production load will degrade in ways that are hard to debug after the fact.
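The operator's side of that trade-off can be illustrated with a query plan. This is a sketch using SQLite's `EXPLAIN QUERY PLAN` as an assumption-laden stand-in for whatever database a given FHIR server runs on; the `resource` table and `idx_patient` index are hypothetical. It shows the same search going from a full table scan to an index lookup once the index exists.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical, flattened stand-in for a FHIR resource table.
conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, patient TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO resource (patient, body) VALUES (?, ?)",
    ((f"Patient/{i % 100}", "x" * 50) for i in range(1000)),
)

search = "SELECT id FROM resource WHERE patient = ?"

# Without an index, the planner has to scan every row.
before = query_plan(conn, search, ("Patient/7",))

# After the operator adds an index for this query pattern, it can seek instead.
conn.execute("CREATE INDEX idx_patient ON resource(patient)")
after = query_plan(conn, search, ("Patient/7",))

print("before:", before)
print("after: ", after)
```

The plan wording varies between SQLite versions, but the first plan is a scan and the second is a search that uses `idx_patient`. Checking plans like this before go-live is one way to catch an unindexed search early rather than under production load.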

Neither default is wrong. They serve different operator profiles. Teams that want the platform to handle search performance automatically benefit from the pre-indexed default. Teams that own their search patterns and want explicit control prefer the no-default-indexes posture. The benchmark report makes the trade-off readable, which is the useful contribution.

What the Storage Footprint Implies for Hosting Cost

Storage cost itself is rarely the limiting factor in a CMS-0057-F deployment. What scales with storage is everything around it: backup volume, replication bandwidth, restore time, snapshot storage on cloud disks, and the working-set size that has to fit in memory for hot paths to stay fast. At the same throughput, a FHIR platform that is 5x larger than its peers takes roughly 5x as long to back up, needs roughly 5x the replication catch-up window after a node loss, and incurs roughly 5x the cold-storage spend.

That is not a reason to pick the smallest footprint; it is a reason to understand what the footprint implies operationally before committing.
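The proportional scaling described above can be sketched with simple arithmetic. The throughput values below are hypothetical placeholders, not measurements from the benchmark; the point is that the ratio between the largest and smallest footprint carries over to transfer time regardless of the throughput chosen.

```python
def transfer_seconds(size_gb, throughput_mb_per_s):
    """Seconds to move size_gb at a sustained throughput (decimal units: 1 GB = 1000 MB)."""
    return size_gb * 1000 / throughput_mb_per_s


# Smallest and largest footprints reported in the benchmark.
SMALLEST_GB = 4.24
LARGEST_GB = 22.6

# Hypothetical sustained backup throughputs in MB/s.
for mb_per_s in (50, 200):
    small = transfer_seconds(SMALLEST_GB, mb_per_s)
    large = transfer_seconds(LARGEST_GB, mb_per_s)
    print(f"{mb_per_s} MB/s: {small:.0f} s vs {large:.0f} s ({large / small:.2f}x)")

ratio = transfer_seconds(LARGEST_GB, 100) / transfer_seconds(SMALLEST_GB, 100)
```

At the benchmark's 1,000-patient scale the absolute times are small. The same ratio applied to a production-sized dataset is what turns into longer backup windows and restore times.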

A Note on the Source

The benchmark was authored by Marat Surmashev, VP of Engineering at Health Samurai, which is also the company that develops Aidbox. The repository is open source and reruns daily, which makes the storage numbers re-verifiable on the live dashboard, but the obvious caveat applies: this is a vendor-run benchmark, and readers evaluating any of the four servers should re-run it against their own workload before drawing conclusions.

What to Do With This Number

Storage footprint is one input among many. For the broader question of how PostgreSQL-based FHIR engines compare on storage and operational tooling, "Top 5 PostgreSQL-based FHIR engines for 2026" covers the substrate layer. For how storage and indexing choices feed into the data store patterns a reusable FHIR investment depends on, "Top 6 FHIR data store patterns for reusable compliance investment" covers the architectural side.
