We are honored to invite leading experts on Federated Learning from both academia and industry to share their cutting-edge research and viewpoints. The talks will introduce advances, challenges, and new technologies in federated learning.

Yong Chen

University of Pennsylvania



Title: PDA: Privacy-preserving Distributed Algorithms and statistical inference in the era of real-world data networks

Abstract: With the increasing availability of electronic health records (EHR) data, it is important to effectively integrate evidence from multiple data sources to enable reproducible scientific discovery. However, we still face practical challenges in data integration, such as protecting data privacy, handling the high dimensionality of features, and accounting for heterogeneity across different datasets. Aiming to facilitate efficient multi-institutional data analysis without sharing individual patient data (IPD), we developed a toolbox of Privacy-preserving Distributed Algorithms (PDA) that conduct distributed learning and inference for various models, including association analyses, causal inference, cluster analyses, counterfactual analyses, and beyond. Our algorithms do not require iterative communication across sites and are able to account for heterogeneity across different hospitals. The validity and efficiency of PDA are demonstrated with real-world use cases in Observational Health Data Sciences and Informatics (OHDSI), PCORnet networks including PEDSnet and OneFlorida, and the Penn Medicine Biobank (PMBB).
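
To make the one-shot communication pattern concrete, here is a minimal sketch of non-iterative distributed inference: each site shares only model-level summary statistics, and a coordinating site combines them by inverse-variance weighting. This is a generic illustration of the communication pattern, not the PDA algorithms themselves; the function names and the linear-model setting are our own assumptions.

```python
import numpy as np

def site_summary(X, y):
    """Each site fits a local linear model and shares only aggregate
    summary statistics (estimate and covariance), never patient data."""
    XtX = X.T @ X
    beta = np.linalg.solve(XtX, X.T @ y)            # local OLS estimate
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(XtX)               # covariance of beta
    return beta, cov

def one_shot_aggregate(summaries):
    """Coordinating site combines estimates by inverse-variance
    weighting -- a single round of communication, no iteration."""
    precisions = [np.linalg.inv(cov) for _, cov in summaries]
    total_precision = sum(precisions)
    weighted_sum = sum(P @ beta for (beta, _), P in zip(summaries, precisions))
    return np.linalg.solve(total_precision, weighted_sum)

# Three sites simulate local data and share only their summaries.
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0])
summaries = []
for n in (200, 500, 300):
    X = rng.normal(size=(n, 2))
    y = X @ true_beta + rng.normal(size=n)
    summaries.append(site_summary(X, y))
print(one_shot_aggregate(summaries))   # close to [1.0, -2.0]
```

Because each site communicates once and transmits only aggregate quantities, no individual patient data ever leaves a site.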

Bio: Yong Chen is a Professor of Biostatistics in the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania. His methodological research focuses on leveraging real-world data, such as electronic health records (EHR) and administrative claims data, to rapidly and reliably generate and synthesize clinical evidence. Dr. Chen specializes in developing cutting-edge machine learning and causal inference methods, along with generative AI techniques, to support personalized disease prevention and intervention strategies. Through innovative solutions such as target trial emulation, privacy-preserving federated learning, and synthetic data/digital twins, he aims to facilitate rapid and reliable evidence generation. His work enables data-driven decisions that improve outcomes and drive innovation in clinical research and healthcare delivery. Dr. Chen is the Founding Director of the Center for Health AI and Synthesis of Evidence (CHASE) at the University of Pennsylvania. He is an elected fellow of the American Statistical Association, the American College of Medical Informatics, the American Medical Informatics Association, the International Statistical Institute, and the Society for Research Synthesis Methodology. He founded the Penn Computing, Inference and Learning (PennCIL) lab at the University of Pennsylvania. During the pandemic, Dr. Chen has served as the Director of the Biostatistics Core for a national multi-center study on Post-Acute Sequelae of SARS-CoV-2 infection (PASC), involving more than 12 million pediatric patients across 40 health systems. Dr. Chen has published over 200 peer-reviewed papers in statistical inference, medical informatics, comparative effectiveness research, and biomedical sciences. Over the last 10 years, he has taught short courses at the FDA, JSM, ENAR, the ASA Biopharmaceutical Section, the Deming Conference on Applied Statistics, the New England Statistics Symposium, and the ICSA annual conference, as well as workshops at the University of Pennsylvania and the University of Oxford.

Sebastian Stich

CISPA & ELLIS



Title: Local Update Methods for Federated Optimization

Abstract: Federated learning has emerged as an important paradigm in modern large-scale machine learning. Unlike traditional centralized learning, where models are trained using large datasets stored on a central server, federated learning keeps the training data distributed across many clients, such as phones, network sensors, hospitals, or other local information sources. In this setting, communication-efficient optimization algorithms are crucial. We provide a brief introduction to local update methods developed for federated optimization and discuss their worst-case complexity. Surprisingly, these methods often perform much better in practice than predicted by theoretical analyses using classical assumptions. Recent results show that notions capturing the similarity among client objectives are essential for explaining this improved practical performance.
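
As background for the talk, here is a minimal sketch of the canonical local update method, FedAvg-style local SGD: each client takes several gradient steps on its own data, and a server periodically averages the client models. The toy quadratic objective, step counts, and learning rate are illustrative assumptions, not material from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, grad, data, steps, lr):
    """A client takes several SGD steps on its own data before
    communicating with the server."""
    for _ in range(steps):
        x = data[rng.integers(len(data))]   # sample one local example
        w = w - lr * grad(w, x)
    return w

# Toy objective: client i holds samples x with f_i(w) = 0.5*(w - x)^2,
# so the stochastic gradient is simply w - x (illustrative only).
grad = lambda w, x: w - x
clients = [rng.normal(loc=c, size=50) for c in (-1.0, 0.0, 2.0)]

w_global = 0.0
for _ in range(20):                          # communication rounds
    local_models = [local_sgd(w_global, grad, d, steps=10, lr=0.1)
                    for d in clients]
    w_global = float(np.mean(local_models))  # server averages the models
print(f"global model: {w_global:.3f}")       # near the mean of client optima
```

The ratio of local steps to communication rounds is the key knob: more local steps reduce communication, but when client objectives differ they can pull the average away from the global optimum, which is exactly where similarity notions among client objectives enter the analysis.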

Bio: Dr. Sebastian Stich is a faculty member at the CISPA Helmholtz Center for Information Security and a member of the European Lab for Learning and Intelligent Systems (ELLIS). His research interests span machine learning, optimization, and statistics, with a focus on efficient parallel algorithms for training ML models over decentralized datasets. He obtained his PhD from ETH Zurich and worked as a postdoctoral fellow at UCLouvain and EPFL. He is a co-organizer of the International Optimization for Machine Learning workshop at NeurIPS and the Federated Learning One World Seminar, and serves on the editorial boards of the Journal of Optimization Theory and Applications (JOTA) and the Transactions on Machine Learning Research (TMLR). He received a Meta Research Award in 2022 and a Google Research Scholar Award in 2023.

Jundong Li

University of Virginia



Title: Data-Efficient Federated Learning: Harnessing the Power of Small Data

Abstract: Federated Learning (FL) is transforming collaborative and distributed machine learning by enabling multiple clients to train models without the need to share their local data. However, conventional FL frameworks often assume that clients have sufficient data for local model updates, a condition that is rarely met in real-world applications. In many cases, clients have access only to limited or class-imbalanced data, leading to a significant drop in FL performance. This talk will address the challenges of Data-Efficient Federated Learning (DEFL) by exploring strategies to maximize the utility of small datasets within federated environments. We will introduce a novel decoupled meta-learning framework designed for federated settings, which achieves strong performance on new tasks despite limited labeled data. Additionally, we will explore Federated Graph Learning (FGL), a specialized FL approach for graph-structured data distributed across clients. A new framework will be presented that mitigates learning bias for nodes from minority classes, where training data is often scarce. Finally, we will highlight future directions in DEFL, paving the way for more effective and inclusive federated learning systems.
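
As one concrete illustration of the class-imbalance problem the abstract raises, a client can re-weight its local loss so that rare classes contribute more per example. The sketch below uses the "effective number of samples" re-weighting of Cui et al. (2019); it is a generic device offered for illustration, not the decoupled meta-learning or FGL frameworks presented in the talk.

```python
import numpy as np

def class_balanced_weights(labels, num_classes, beta=0.999):
    """'Effective number of samples' re-weighting (Cui et al., 2019):
    rare classes receive larger per-sample loss weights, one common
    way to cope with class-imbalanced local data."""
    counts = np.bincount(labels, minlength=num_classes)
    eff_num = (1.0 - beta ** counts) / (1.0 - beta)
    weights = 1.0 / np.maximum(eff_num, 1e-8)
    return weights / weights.sum() * num_classes   # normalize to mean 1

# A client with 95 majority-class and 5 minority-class examples:
labels = np.array([0] * 95 + [1] * 5)
print(class_balanced_weights(labels, num_classes=2))
# the minority class receives a much larger per-sample loss weight
```

In a federated round, each client would apply its own weights before computing its local update, so minority-class examples are not drowned out by majority-class gradients during aggregation.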

Bio: Dr. Jundong Li is an Assistant Professor at the University of Virginia with appointments in the Department of Electrical and Computer Engineering, the Department of Computer Science, and the School of Data Science. Prior to joining UVA, he received his Ph.D. degree in Computer Science from Arizona State University in 2019, his M.Sc. degree in Computer Science from the University of Alberta in 2014, and his B.Eng. degree in Software Engineering from Zhejiang University in 2012. His research interests are generally in data mining and machine learning, with a particular focus on graph machine learning, trustworthy/safe machine learning, causal inference, and, more recently, large language models. He has published over 150 papers in high-impact venues and won several prestigious awards, including the SIGKDD Best Research Paper Award (2022), the PAKDD Best Paper Award (2024), the NSF CAREER Award (2022), the PAKDD Early Career Research Award (2023), JP Morgan Chase Faculty Research Awards (2021 & 2022), and a Cisco Faculty Research Award (2021). He has served on the organizing committees of conferences such as KDD, WSDM, SDM, and IEEE BigData, and is currently on the editorial boards of ACM Transactions on Intelligent Systems and Technology (TIST) and ACM Transactions on Knowledge Discovery from Data (TKDD).

Graham Cormode

University of Warwick



Title: Federated Computation Beyond Learning

Abstract: The federated model of computation has attracted much interest due to the power of federated learning. But there is much to do outside of training: federated data preparation and cleaning, post-training federated calibration and monitoring, and federated analytics to track the behaviour of deployed models. In this talk, I will touch on the algorithms and systems needed by federated computation, informed by deployment experience.
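
As a taste of federated analytics, here is a minimal sketch of one standard primitive: estimating a histogram over clients' categorical values with randomised response, so the server learns population frequencies without learning any individual's value. This is a textbook local-differential-privacy mechanism offered for illustration, not a description of any specific deployed system from the talk; the function names and parameters are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def client_report(value, k, epsilon):
    """Client-side randomised response: keep the true category with
    probability p, otherwise send a uniformly random one. p is chosen
    so that a single report satisfies epsilon-local differential privacy."""
    p = (np.exp(epsilon) - 1) / (np.exp(epsilon) - 1 + k)
    return value if rng.random() < p else int(rng.integers(k))

def estimate_frequencies(reports, k, epsilon):
    """Server side: invert the known randomisation to recover unbiased
    estimates of the true category frequencies."""
    p = (np.exp(epsilon) - 1) / (np.exp(epsilon) - 1 + k)
    observed = np.bincount(reports, minlength=k) / len(reports)
    return (observed - (1 - p) / k) / p

# 10,000 clients, 4 categories with true frequencies 0.4/0.3/0.2/0.1
true = rng.choice(4, size=10_000, p=[0.4, 0.3, 0.2, 0.1])
reports = np.array([client_report(v, k=4, epsilon=2.0) for v in true])
print(estimate_frequencies(reports, k=4, epsilon=2.0))
```

The same communicate-once, aggregate-centrally pattern underlies many federated analytics tasks, and the systems questions (sampling clients, handling dropouts, budgeting privacy) are part of what distinguishes deployment from theory.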

Bio: Graham Cormode is a research scientist at Facebook and a professor in the Department of Computer Science at the University of Warwick in the UK. His research interests are in data privacy, data stream analysis, massive data sets, and general algorithmic problems. His work on the statistical analysis of data has been recognized with the 2017 Adams Prize in Mathematics and his election as a Fellow of the ACM.