AI-Driven Digitization of Paper ECG Records - Comprehensive Overview

  • Writer: Kasturi Murthy

Tavily Deep Research(*) is an AI-powered search engine designed specifically to act as the "eyes" for autonomous agents and large language models (LLMs). Unlike traditional search engines, which prioritize visual layouts and advertisements for human browsing, Tavily focuses on parallel data extraction: it scours hundreds of web sources simultaneously to identify and aggregate the most relevant, high-quality information. By stripping away messy HTML "noise" and providing structured, token-efficient data, it enables AI systems to perform multi-hop reasoning and generate reports with high factual accuracy and verifiable citations.

The title of this blog post comes from Tavily Research, which was asked to research the topic "I want an overview on the usage of AI in the digitization of ECG paper records". The results produced by the tool are presented below.

1. AI Techniques & Algorithms

| Technique | Typical Use in ECG Digitisation | Representative Studies |
| --- | --- | --- |
| Object-detection CNNs (Faster R-CNN) | Locate the region that contains all leads on a scanned sheet, robust to rotation, cropping, creases and text artefacts. | Shang et al. combined Faster R-CNN with a ResNet-101 backbone for precise region proposal and achieved an SNR of 0.893 on the challenge data [1]. |
| Segmentation U-Net | Pixel-level extraction of the waveform trace after the region has been detected; yields clean binary masks for each lead. | The same hybrid pipeline used U-Net to segment waveforms with high accuracy [1]. |
| End-to-end deep-learning pipelines | Two-stage models that first detect grid intersections and correct geometric distortions, then generate waveform masks; fully automated from mobile-captured images. | Demolder et al. described such a pipeline that processes raw photos without manual steps [4]. |
| Pure CNN-based signal recovery | Direct mapping from image patches to sampled voltage values; often combined with post-processing to enforce physiological constraints. | Wu et al. presented a fully-automated tool that learns this mapping and removes gridlines internally [2]. |
| Hybrid pixel-based methods (Otsu / Sauvola + DL) | Classical thresholding (Otsu for high-quality scans, Sauvola for noisy images) is used to pre-process images before a DL refinement stage. | ECGtizer integrates these methods and offers three extraction algorithms (Full, Fragmented, Lazy) that outperform earlier tools [3]. |
| Transformer-based or foundation models | Emerging research applies attention mechanisms to jointly model multi-lead spatial relationships and temporal waveform reconstruction, improving label-efficiency. | Recent surveys note transformer-based ECG models achieving state-of-the-art performance on large datasets [27]. |
| Optical Character Recognition (OCR) modules | Extract demographic and clinical metadata (patient ID, lead labels) from the printed header, enabling automatic annotation of the digitised signal. | Some pipelines augment Faster R-CNN with OCR to capture text before waveform extraction [1]. |
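
To make the classical pixel-based route in the table more concrete, here is a minimal sketch of Otsu thresholding followed by a column-wise trace read-out. It is not the method of ECGtizer or of any cited paper. The function names, the tiny test image, the `baseline_row` and `px_per_mm` parameters, and the 10 mm/mV gain (the common paper-ECG calibration) are all assumptions made for illustration.

```python
from typing import List, Optional


def otsu_threshold(pixels: List[int]) -> int:
    """Return the 8-bit grey level that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0 or w_light == 0:
            continue  # one class is empty, variance undefined
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def extract_trace(image: List[List[int]], threshold: int) -> List[Optional[float]]:
    """For each column, return the mean row index of ink pixels (None if empty)."""
    n_rows, n_cols = len(image), len(image[0])
    trace: List[Optional[float]] = []
    for c in range(n_cols):
        rows = [r for r in range(n_rows) if image[r][c] <= threshold]
        trace.append(sum(rows) / len(rows) if rows else None)
    return trace


def rows_to_millivolts(trace: List[Optional[float]], baseline_row: float,
                       px_per_mm: float, mm_per_mv: float = 10.0) -> List[Optional[float]]:
    """Convert row positions to mV; rows above the baseline are positive."""
    return [None if r is None else (baseline_row - r) / px_per_mm / mm_per_mv
            for r in trace]


# Hypothetical 7x5 greyscale image: dark ink (20) on light paper (230).
PAPER, INK = 230, 20
ink_rows = [5, 5, 1, 5, 5]
image = [[INK if ink_rows[c] == r else PAPER for c in range(5)] for r in range(7)]

t = otsu_threshold([p for row in image for p in row])
mv = rows_to_millivolts(extract_trace(image, t), baseline_row=5, px_per_mm=2.0)
print(t, mv)  # 20 [0.0, 0.0, 0.2, 0.0, 0.0]
```

A real scan would also need gridline removal, lead separation and a calibration-pulse check before this read-out is trustworthy; those are the steps the deep-learning stages in the table are meant to handle.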

2. Datasets & Benchmarks

| Dataset | Size / Composition | Availability | Notable Use |
| --- | --- | --- | --- |
| PTB-Image (new 2025 release) | Thousands of scanned 12-lead ECGs with diverse artefacts; paired with original digital waveforms. | Publicly released on arXiv with code and data links. | Primary benchmark for the 2024 PhysioNet Challenge; used to train Faster R-CNN/U-Net models [9]. |
| PTB-XL (and PTB-XL+) | 21,799 12-lead recordings; includes both raw signals and synthetic ECG images generated via ECG-Image-Kit. | Open access via PhysioNet. | Basis for many digitisation challenges; provides ground-truth waveforms for image-to-signal tasks [7],[8]. |
| ECG-Image-Database (PM-ECG-ID) | 6,000 synthetic and real-world ECG images derived from 100 unique ECGs, augmented with rotation, blur, and grid variations. | Available on request [10]. | Used in high-precision AI digitisation studies [10]. |
| CPEIC Cardiac ECG Images | 12-lead ECG images from Pakistani hospitals, sampled at 500 Hz; annotated with clinical labels. | Publicly described in a 2022 MDPI paper [6]. | Digitised with the ecg_digitize tool [6]. |
| Moody Challenge 2024 training set | 21,799 ECG waveforms + corresponding images; hidden validation set includes scanned and photographed records from Emory Hospital. | Distributed to challenge participants [7],[8]. | Serves as a realistic benchmark for robustness to real-world artefacts [7],[8]. |
| Proprietary vendor datasets (e.g., PMcardio, iRhythm) | Hundreds of thousands of scanned ECGs collected in routine clinical practice. | Not publicly released. | Used internally for commercial product validation; regulatory submissions reference these datasets [14],[20]. |


When a dataset’s exact size or composition is not disclosed in the source, it is noted as “open-ended.”
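
Several of these datasets pair a digital waveform with an image rendered from it. The sketch below shows the basic idea in plain Python. It is a toy rasteriser, not ECG-Image-Kit's API. The function name, the parameters, and the 25 mm/s paper speed and 10 mm/mV gain defaults are assumptions chosen for illustration.

```python
from typing import List


def render_trace(signal_mv: List[float], fs_hz: float, px_per_mm: float,
                 height_px: int, baseline_row: int,
                 mm_per_s: float = 25.0, mm_per_mv: float = 10.0) -> List[List[int]]:
    """Draw one ink pixel (0) per column on a white (255) canvas."""
    duration_s = len(signal_mv) / fs_hz
    width_px = max(1, round(duration_s * mm_per_s * px_per_mm))
    image = [[255] * width_px for _ in range(height_px)]
    for col in range(width_px):
        t = col / (mm_per_s * px_per_mm)                  # time of this column, s
        idx = min(len(signal_mv) - 1, round(t * fs_hz))   # nearest sample
        row = round(baseline_row - signal_mv[idx] * mm_per_mv * px_per_mm)
        if 0 <= row < height_px:                          # clip off-canvas points
            image[row][col] = 0
    return image


# Hypothetical input: 0.1 s at 500 Hz with a 1 mV rectangular pulse.
signal = [1.0 if 20 <= i < 30 else 0.0 for i in range(50)]
img = render_trace(signal, fs_hz=500, px_per_mm=4, height_px=60, baseline_row=50)
print(len(img), len(img[0]))  # 60 10
```

Because only one pixel is drawn per column, steep segments come out disconnected. A realistic generator would draw connected strokes, add gridlines, and apply the rotation, blur and crease augmentations the table mentions.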

3. Performance Metrics & Validation

| Metric | What It Measures | Typical Reported Values |
| --- | --- | --- |
| Signal-to-Noise Ratio (SNR) | Ratio of signal power to residual error after digitisation; higher = cleaner reconstruction. | Faster R-CNN pipeline reported SNR ≈ 0.893 [1]; other studies cite SNR > 2.5 for high-quality scans [10]. |
| Pearson Correlation Coefficient (PCC) | Linear correlation between digitised and reference waveforms (per lead). | Reported PCC values range from 0.67 (challenging V2 lead) to >0.99 for clean images [10],[11]. |
| Root-Mean-Square Error (RMSE) | Average amplitude error in millivolts; directly interpretable clinically. | RMSE < 0.10 mV achieved in several high-precision methods [11]; some pipelines report lead-wise RMSE ≈ 0.08 mV [10]. |
| Lead-wise waveform correlation | Cross-correlation after allowing small horizontal/vertical shifts to account for grid mis-alignment. | The PhysioNet Challenge evaluation used this metric; top-ranked teams achieved average correlation > 0.90 [8]. |
| Downstream diagnostic performance | Impact of digitisation on AI-based arrhythmia classification (e.g., AUC, accuracy). | ECGtizer’s digitised signals yielded AUC improvements of 2-3 % in TdP-risk prediction compared with older tools [3]. |
| Fail-rate (%) | Percentage of images where the pipeline could not produce a usable waveform. | Reported fail rates vary from <1 % on curated datasets to ~5 % on highly noisy mobile captures [10]. |
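
The first three metrics in the table follow directly from their definitions. The sketch below computes them for one lead, assuming the reference and digitised signals are already aligned, equally long, and in millivolts. SNR is expressed in decibels here, which is an assumption, since the sources quoted above do not state the unit. The function names and sample values are illustrative.

```python
import math
from typing import Sequence


def rmse(ref: Sequence[float], est: Sequence[float]) -> float:
    """Root-mean-square error, in the same unit as the inputs."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))


def pearson(ref: Sequence[float], est: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either signal is constant."""
    n = len(ref)
    mean_r, mean_e = sum(ref) / n, sum(est) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(ref, est))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_e = sum((e - mean_e) ** 2 for e in est)
    if var_r == 0 or var_e == 0:
        return 0.0
    return cov / math.sqrt(var_r * var_e)


def snr_db(ref: Sequence[float], est: Sequence[float]) -> float:
    """10*log10(signal power / error power); inf for a perfect reconstruction."""
    signal_power = sum(r ** 2 for r in ref)
    noise_power = sum((r - e) ** 2 for r, e in zip(ref, est))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical reference and digitised samples (mV).
ref = [0.0, 0.5, 1.0, 0.5, 0.0]
est = [0.0, 0.4, 1.1, 0.5, 0.0]
print(round(rmse(ref, est), 4), round(pearson(ref, est), 4), round(snr_db(ref, est), 2))
```

For these samples the error energy is 0.02 mV² against a signal energy of 1.5 mV², giving an RMSE of about 0.063 mV and an SNR of about 18.75 dB.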

Validation approaches typically combine internal cross-validation on synthetic/augmented data, external testing on hidden challenge sets, and clinical visual inspection by cardiologists to ensure that subtle features (e.g., P-waves) are preserved [4],[10].
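
The shift-tolerant lead-wise correlation listed among the metrics can be sketched as a search over small horizontal lags. This is an illustration of the idea, not the official challenge scoring code. The function names and the `max_lag` parameter are assumptions. Vertical offsets need no separate search because Pearson correlation already removes each signal's mean.

```python
import math
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)


def best_shifted_correlation(ref: Sequence[float], est: Sequence[float],
                             max_lag: int) -> Tuple[float, int]:
    """Return (best correlation, lag); a positive lag means est is delayed."""
    best_r, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:   # est delayed by `lag` samples
            a, b = ref[:len(ref) - lag], est[lag:]
        else:          # est advanced by |lag| samples
            a, b = ref[-lag:], est[:len(est) + lag]
        n = min(len(a), len(b))
        if n < 2:
            continue
        r = pearson(a[:n], b[:n])
        if r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag


# Hypothetical case: the digitised lead is delayed by 2 samples and offset by 0.5 mV.
ref = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
est = [0.5, 0.5, 0.5, 0.5, 1.5, 3.5, 1.5, 0.5, 0.5, 0.5]
r, lag = best_shifted_correlation(ref, est, max_lag=3)
print(lag)  # 2
```

Without the lag search the same pair correlates negatively, which shows why a small misalignment on the grid can make a faithful reconstruction look poor.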

4. Commercial & Clinical Implementations

| Vendor / Product | Core AI Digitisation Capability | Integration Points |
| --- | --- | --- |
| PMcardio (Powerful Medical) | Proprietary CNN-based engine that automatically detects and extracts ECG waveforms from any image format (paper, screenshot, photo). | Integrated with major EHRs via HL7/DICOM, offers JSON/XML export; mobile app for point-of-care capture [14],[15],[16],[17]. |
| iRhythm Zio® | Uses FDA-cleared deep-learned arrhythmia detection; includes a pre-processing stage that digitises scanned ECGs for retrospective analysis. | Data flow through secure cloud; supports HL7 and FHIR for EHR interoperability [20]. |
| PM-ECG-ID / PMcardio automated digitisation | Real-time conversion of uploaded ECG images into digital signals, feeding directly into the AI interpretation pipeline. | Embedded in the PMcardio Enterprise dashboard; supports batch upload for bulk archival [14],[17]. |
| ECGtizer (open-source) | Fully automated tool offering three extraction algorithms; can be deployed on local servers or cloud instances. | Provides Python API; compatible with DICOM-ECG standards for downstream analysis [3]. |
| Custom solutions from research labs (e.g., ETH Zurich’s Faster R-CNN/U-Net, Demolder’s two-stage pipeline) | Typically released as GitHub repositories; used in academic hospitals for pilot studies. | Often require manual integration with PACS or local storage; may be wrapped in containerised services for easier deployment [1],[4]. |

These products are commonly used in tele-cardiology platforms to enable remote specialists to review legacy paper ECGs, and in hospital digitization projects that aim to migrate historic archives into searchable databases.

5. Regulatory & Compliance Landscape

| Region | Regulatory Pathway | Status of AI-ECG Digitisation Tools |
| --- | --- | --- |
| United States (FDA) | Software as a Medical Device (SaMD): 510(k) or De Novo pathway. | PMcardio has obtained CE marking and is pursuing FDA clearance; iRhythm’s AI modules are FDA-cleared under 510(k) [19],[20]. |
| European Union | Medical Device Regulation (MDR) 2017/745; CE marking required. | PMcardio holds a CE mark and complies with GDPR for data privacy [17],[21]. |
| Japan / UK | PMDA (Japan) and UKCA (UK) recognitions align with FDA/CE pathways. | iRhythm reports approvals in Japan and the UK [20]. |
| Standards | IEC 60601 (electrical safety), DICOM-ECG (image-signal interchange), HL7/FHIR (clinical data exchange). | Commercial solutions claim conformity to DICOM-ECG and secure HL7 interfaces [14],[17]. |

Regulators emphasize Good Machine Learning Practice (GMLP), post-market surveillance, and human-in-the-loop review for safety-critical stages such as waveform extraction, especially when the digitized signal feeds diagnostic AI [21].

6. Challenges & Limitations

| Category | Key Issues | Reported Mitigation Strategies |
| --- | --- | --- |
| Technical | • Variability in paper quality (creases, fading, low contrast). • Non-standard lead layouts and missing calibration grids. • Noise from gridlines overlapping the waveform. | • Data-augmentation pipelines that simulate 3D deformations and artefacts during training [4]. • Hybrid OCR + CNN pipelines to recognise lead labels and adjust scaling [1]. • Use of adaptive thresholding (Sauvola) before DL refinement [3]. |
| Operational | • Integration with existing PACS/EHR workflows; staff may need to capture high-quality images. | • Mobile apps with built-in guidance (e.g., alignment overlays) to improve capture quality [15]. • Batch-processing APIs that accept DICOM-ECG containers for seamless ingestion [14]. |
| Ethical / Bias | • Training data often sourced from a limited set of hospitals (predominantly Western populations), risking reduced performance on under-represented groups. | • Emerging federated learning frameworks that allow hospitals to collaboratively improve models without sharing raw images [27],[28]. • Explicit reporting of demographic performance sub-analyses in validation studies [19]. |
| Clinical Safety | • Small localisation errors can distort critical intervals (e.g., PR, QT). | • Post-processing checks that enforce physiologically plausible interval ranges; flagging of outliers for manual review [4]. |
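
The clinical-safety point is easy to quantify. The sketch below converts a horizontal pixel error into milliseconds and flags intervals that fall outside a plausibility window. The 25 mm/s paper speed is the common default and is assumed here. The bounds in `PLAUSIBLE_MS` are illustrative assumptions for this sketch, not clinical reference limits taken from the cited sources, and the function names are hypothetical.

```python
from typing import Dict, List

MM_PER_INCH = 25.4

# Illustrative plausibility windows in ms (assumed values, to be set by clinicians).
PLAUSIBLE_MS = {"PR": (80.0, 400.0), "QRS": (40.0, 250.0), "QT": (200.0, 700.0)}


def ms_per_pixel(dpi: float, mm_per_s: float = 25.0) -> float:
    """Time spanned by one pixel column at the given scan resolution."""
    return MM_PER_INCH / dpi / mm_per_s * 1000.0


def flag_intervals(intervals_ms: Dict[str, float]) -> List[str]:
    """Return the names of known intervals that fall outside their window."""
    flagged = []
    for name, value in intervals_ms.items():
        if name not in PLAUSIBLE_MS:
            continue  # no rule defined for this interval
        low, high = PLAUSIBLE_MS[name]
        if not low <= value <= high:
            flagged.append(name)
    return flagged


print(round(ms_per_pixel(300), 2))                           # 3.39
print(flag_intervals({"PR": 160, "QRS": 90, "QT": 900}))     # ['QT']
```

At 300 dpi a single column is about 3.4 ms wide, so a localisation error of a few pixels at each end of an interval already shifts a PR or QT measurement by roughly 10 to 20 ms.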

7. Future Directions & Research Gaps

  1. Multimodal Fusion - Combining image-derived waveforms with auxiliary data (e.g., patient demographics, prior digital ECGs) using transformer-based architectures to improve reconstruction fidelity and downstream diagnosis [27].

  2. Real-time Mobile Capture - Edge-optimized models (TinyML, quantised CNNs) that run directly on smartphones, enabling instant digitisation without cloud latency [30].

  3. Federated & Privacy-Preserving Learning - Protocols that let institutions co-train digitisation models on local paper ECG archives while complying with GDPR/HIPAA [27],[28].

  4. Standardised Benchmarks - A universally accepted test set is needed, one that includes diverse paper qualities, multi-lead formats, and clinically annotated ground truth; PTB-Image is a step forward, but broader community adoption is required.

  5. Explainability & Clinical Validation - Development of visual heat-maps that show which image regions contributed to the reconstructed signal, supporting clinician trust and regulatory acceptance [16].

  6. Regulatory Evolution - Updates to the FDA’s predetermined change control plan (PCCP) guidance and to EU MDR guidance are anticipated to accommodate continuously learning SaMD; these will affect how digitisation algorithms are updated post-deployment [21].
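
As an illustration of the federated learning direction in item 3, the sketch below shows federated averaging, one common aggregation scheme: each site trains locally and only its model weights, weighted by local sample count, are combined. The sources cited above do not say which protocol is used, so this choice, the function name and the toy numbers are assumptions.

```python
from typing import List


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Average per-client parameter vectors, weighted by local dataset size."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("one size is required per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical round: two hospitals with 100 and 300 local paper ECGs.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300])
print(global_weights)  # [2.5, 3.5]
```

Only the parameter vectors leave each site in this scheme. Whether that alone satisfies GDPR or HIPAA depends on further safeguards that the sketch does not cover.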

References

  1. https://cinc.org/archives/2024/pdf/CinC2024-199.pdf

  2. https://www.nature.com/articles/s41598-022-25284-1

  3. https://arxiv.org/html/2412.12139v1

  4. https://www.medrxiv.org/content/10.1101/2025.11.19.25340630v1.full-text

  5. https://pmc.ncbi.nlm.nih.gov/articles/PMC8248903/

  6. https://www.mdpi.com/1424-8220/24/8/2484

  7. https://moody-challenge.physionet.org/2024/papers/cinc_template.pdf

  8. https://www.cinc.org/archives/2024/pdf/CinC2024-011.pdf

  9. https://arxiv.org/abs/2502.14909v1/

  10. https://www.medrxiv.org/content/10.1101/2024.08.31.24312876v1.full.pdf

  11. https://www.sciencedirect.com/science/article/abs/pii/S0022073625000287

  12. https://pmc.ncbi.nlm.nih.gov/articles/PMC9572306/

  13. https://www.sciencedirect.com/science/article/pii/S2667099223000191

  14. https://www.powerfulmedical.com/

  15. https://www.powerfulmedical.com/pmcardio-individuals/

  16. https://www.linkedin.com/posts/powerful-medical_pmcardio-stemi-ecginterpretation-activity-7330520550823407616--drC

  17. https://www.powerfulmedical.com/pmcardio-organizations/

  18. https://www.rhythm360.io/blog/electronic-health-records-system

  19. https://www.tctmd.com/news/fda-clears-ai-ecg-screening-tools-cv-care-whats-next-grabs

  20. https://www.irhythmtech.com/us/en/solutions-services/fda-cleared-ai

  21. https://www.techindia.com/company/resources/ethical-considerations-and-regulatory-compliance-in-aibased-ecg-arrhythmia-detection

  22. https://www.nature.com/articles/s41746-020-00324-0

  23. https://www.cureus.com/articles/376607-investigating-the-efficacy-of-ai-powered-innovations-in-ecg-analysis-and-continuous-heart-monitoring-a-comprehensive-narrative-review

  24. https://www.sciencedirect.com/science/article/pii/S001048252500945X

  25. https://brieflands.com/journals/ijcp/articles/143437

  26. https://www.mdpi.com/2039-7283/15/9/169

  27. https://www.mdpi.com/1424-8220/25/17/5272

  28. https://www.preprints.org/manuscript/202508.1425

  29. https://cikm2025.org/program/proceedings

  30. https://www.sciencedirect.com/science/article/pii/S1383762125002590

  31. https://www.sciencedirect.com/science/article/pii/S2468451125000571
