Framework for Federated Artificial Intelligence for the Optimization of Pancreatic Cancer Treatment
Introduction: Research has shown AI models to be highly effective in predicting medical phenotypes such as disease prognosis or treatment response, often outperforming standard inference [1], [2]. However, studies integrating medical registry and omics data are often limited by the sample size and systematic biases of single cohorts, or by non-independent and identically distributed (non-IID) and partially non-overlapping (PNO) data in multi-cohort studies. The resulting models may lack robustness, which hinders the transition towards clinical practice. This becomes particularly apparent when investigating complex oncological diseases such as pancreatic ductal adenocarcinoma (PDAC), which presents an extraordinarily aggressive, locally invasive tumour biology, a tendency towards distant metastases, stroma-dependent tumour growth [3], and an exceptionally high and heterogeneous resistance to conventional chemotherapy. This aggressive malignancy has a rising incidence and is predicted to become the second leading cause of cancer-related death in the industrialised world by 2030.
State of the art: While data-centralised integration of multiple cohorts and subsequent model training can help overcome the issues of small, biased data, such methods are often prohibited by patient privacy regulations. Federated Artificial Intelligence (FAI) approaches developed for such circumstances aggregate locally trained machine learning models without sharing the distributed data [4]. Although FAI is widely used in commercial applications, research has only recently started adapting it to biomedical applications [5].
Concept & implementation: Here we will present the FAIrPaCT consortium, consisting of the University Medical Center Göttingen, the University Hospital Giessen and Marburg and the Rechts der Isar Hospital, Technical University Munich. Our goal is to develop a software system called FAIrPaCT, supported by federated artificial intelligence, that will enable the analysis of clinical patient data and molecular cancer cell data from patients with pancreatic cancer across institutions. Our project combines three of the largest patient cohorts on pancreatic cancer in Germany (KFO5002, KFO325, SFB1321), which are unique in size and heterogeneity.
While all datasets adhere to good scientific practice concerning reproducibility and are well suited for local analysis, major efforts are required to map the challenging data, which suffer from heterogeneity, non-uniform nomenclature, non-IIDness and site-specific information, into a common information model. In particular, we will build a data management (DM) framework encompassing the Medical Informatics Initiative’s common data model in combination with PDAC-specific extension modules harmonising ontologies.
Moreover, we will develop FAI algorithms based on Federated Deep Neural Networks and Federated Random Forest, enable them to tackle challenges such as non-IIDness and PNO, and tailor these to potentially privacy-sensitive cancer-registry and biomedical patient data. Federated AI techniques aim to build a generalised global model by aggregating strictly locally trained models and therefore require a fundamentally different privacy-by-design architecture. We will also evaluate the hardware requirements of different FAI algorithms and the resulting feasibility of their application within the clinical infrastructure. We will benchmark the developed FAI algorithms against current state-of-the-art approaches. The most promising strategies that are non-IID- and PNO-ready and adhere to the defined hardware requirements will be integrated into the FAI framework.
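The aggregation step described above can be illustrated with a minimal sketch. It assumes the widely used sample-size-weighted averaging of model parameters (often called federated averaging); the abstract does not state which aggregation rule FAIrPaCT will use, and the function name, parameter vectors and cohort sizes below are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    site_weights: Sequence[Sequence[float]],
    site_sizes: Sequence[int],
) -> List[float]:
    """Sample-size-weighted average of per-site parameter vectors.

    Illustrative only: assumes every site trains the same model layout
    and reports its parameters plus its local sample count.
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need at least one site and one sample count per site")
    n_params = len(site_weights[0])
    if any(len(w) != n_params for w in site_weights):
        raise ValueError("all sites must share the same parameter layout")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each global parameter is the mean of the local values,
    # weighted by how many samples each site trained on.
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical parameters from three sites. Only these vectors and the
# sample counts leave a site; the patient records stay local.
local_models = [[0.0, 1.0], [1.0, 1.0], [4.0, 1.0]]
cohort_sizes = [100, 200, 100]
global_model = federated_average(local_models, cohort_sizes)
print(global_model)  # [1.5, 1.0]
```

Plain weighted averaging of this kind is typically treated as a baseline; it makes no adjustment for non-IID or PNO data, which is the gap the algorithms described above are meant to address.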
Finally, we will develop and integrate xAI and bioinformatics strategies that foster the identification of PDAC-specific omics and clinical markers, as well as molecular pathomechanisms essential to PDAC progression and treatment response, that remain hidden when local datasets are analysed separately.
In conclusion, FAIrPaCT aims to develop a tailored federated artificial intelligence framework that can help the research and clinical community move towards personalised treatment. The FAIrPaCT framework will be available as an open-access project.
Acknowledgements: We gratefully acknowledge the BMBF funding for this project (Förderkennzeichen BMBF 01KD2208A). Our ethics amendment, based on the previous projects, is in preparation (May 2023).
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.