Multivariate Contrastive Predictive Coding with Sliding Windows for Disease Prediction from Electronic Health Records

  • Hongxu Yuan
  • Xiaozhu Jing
  • Yuzheng Yan
  • Wuman Luo

Research output: Contribution to journal › Article › peer-review

Abstract

Learning effective patient representations from electronic health records (EHRs) is crucial for improving disease prediction models. However, existing supervised learning methods are hindered by high labeling costs. Moreover, capturing complex temporal and multi-indicator relationships, as well as localized shifts in temporal patterns in clinical settings, remains a significant challenge. To address these issues, adaptive multi-indicator contrastive predictive coding (AMCPC), a self-supervised pretraining framework tailored for EHR data, is proposed. AMCPC uses an adaptive optimal window-size selection algorithm to segment patient visit sequences into temporal sub-windows, enabling the model to focus on localized, context-specific health patterns. Furthermore, AMCPC extends contrastive predictive coding (CPC) to multiple indicators, employing a 2D convolutional neural network to capture global correlations among medical indicators within each sub-window. Extensive experiments on three real-world clinical datasets demonstrate that AMCPC outperforms both fully supervised and existing self-supervised methods, particularly when trained with limited labeled data. AMCPC thus provides a self-supervised pretraining framework for unlabeled EHR data that can be fine-tuned with minimal labeled data, significantly enhancing downstream predictive performance and reducing reliance on large-scale labeled datasets.
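
As a rough illustration of two ingredients named in the abstract, the sketch below segments a patient's visit sequence into temporal sub-windows and computes an InfoNCE-style contrastive loss of the kind used in CPC. It is not the authors' implementation: the window size is fixed by hand rather than chosen by the adaptive selection algorithm, the 2D convolutional encoder is omitted, and the names `sliding_windows` and `info_nce_loss` are hypothetical.

```python
import numpy as np


def sliding_windows(visits, window_size, stride=1):
    """Split a (num_visits, num_indicators) array into overlapping sub-windows.

    Assumption: window_size is supplied by the caller; the adaptive
    window-size selection described in the abstract is not modelled here.
    """
    visits = np.asarray(visits, dtype=float)
    num_visits = visits.shape[0]
    if window_size > num_visits:
        raise ValueError("window_size exceeds the number of visits")
    starts = range(0, num_visits - window_size + 1, stride)
    return np.stack([visits[s:s + window_size] for s in starts])


def info_nce_loss(predicted, actual):
    """InfoNCE loss with in-batch negatives.

    Row i of `predicted` should match row i of `actual`; every other
    row of `actual` acts as a negative sample for row i.
    """
    scores = predicted @ actual.T                        # (batch, batch) similarities
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))


# Toy data: 10 visits, 4 medical indicators per visit (random placeholders).
rng = np.random.default_rng(0)
visits = rng.normal(size=(10, 4))
windows = sliding_windows(visits, window_size=3, stride=1)
print(windows.shape)  # (8, 3, 4): 10 - 3 + 1 windows of 3 visits each
```

In a full pipeline, each sub-window would be passed through an encoder, and the loss would compare predicted future representations against the encoded future windows; here the loss function simply accepts two embedding matrices directly.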

Original language: English
Journal: Advanced Intelligent Systems
Publication status: Accepted/In press - 2025

Keywords

  • contrastive predictive coding
  • disease prediction
  • electronic health records
  • patient representation
