Type 2 Diabetes Mellitus Real-World Data Preprocessing and Modeling

Published: 25 February 2026 | Version 1 | DOI: 10.17632/j26gdxcw7f.1
Contributor:

Description

Machine learning (ML) applied to healthcare real-world data (RWD) may improve patient management. RWD, however, requires extensive preprocessing before it is ML-ready. The aim of this resource is to explore the impact of preprocessing on ML models applied to RWD from visits by patients with type 2 diabetes. A GitHub repository is available with the code for the experiments, showing all the steps we prepared, the decisions we made, and the alternative pipelines we created.

This work was done in partial fulfillment of my PhD dissertation, "Leveraging real-world data with machine learning to disentangle the complexity of multimorbid internal medicine patients".

This resource is linked in the paper: Montagna M, Rabadzhiev AS, Traverso A, Setola E, Draetta E, Dimonte A, Barbieri S, Fabiani B, Piemonti L, Esposito A, Tacchetti C and Rovere Querini P (2026) From raw data to actionable insights: preprocessing real-world data for machine learning in diabetes care. Front. Digit. Health 8:1685842. doi: 10.3389/fdgth.2026.1685842

Files not available for this dataset

This contains only metadata

Institutions

  • Ospedale San Raffaele
    Lombardia, Milano
  • Universita Vita-Salute San Raffaele Facolta di Medicina e Chirurgia
    Lombardia, Milano

Categories

Medicine, Data Science, Machine Learning, Healthcare Research

Licence