News Release

PHILADELPHIA— What if your own medical record could figure out that you are at risk of developing a rare disease so you could receive a diagnosis months or even years earlier than you might otherwise, allowing doctors to get started on important treatment sooner? That’s what a team of researchers co-led by faculty at the Perelman School of Medicine at the University of Pennsylvania and the University of Florida College of Medicine will explore with the help of a $4.7 million grant from the National Institutes of Health (NIH).

For the next four years, researchers will work to develop a set of algorithms powered by machine learning, a form of artificial intelligence (AI), to identify which patients are at risk of five different types of vasculitis and two different types of spondyloarthritis. These predictions, derived from information already available in patients’ electronic health records, could greatly increase the chance of patients being diagnosed sooner.

The effort to develop this prediction method, called “PANDA: Predictive Analytics via Networked Distributed Algorithms for multi-system diseases,” will be led by principal investigators Yong Chen, PhD, a professor of Biostatistics, and Peter A. Merkel, MD, MPH, chief of Rheumatology and a professor of Medicine and Epidemiology, both at Penn, and Jiang Bian, PhD, chief data scientist of the University of Florida Health system and a professor in the Department of Health Outcomes & Biomedical Informatics at the University of Florida College of Medicine.

“This is an exciting step forward, building on our current PDA framework, from clinical evidence generation toward AI-informed interventions in clinical decision-making,” Chen said. “Despite the clear need to reduce the dangerous and costly delays in diagnosis, individual clinicians, especially in primary care, face important challenges.” 

Chen used one of the forms of vasculitis under study, granulomatosis with polyangiitis (GPA), as an example of the promise the PANDA system holds. GPA involves inflammation of many organs and can be very severe or even fatal. Mortality rates for patients with this condition remain high in the first year after diagnosis, and the correct diagnosis of this type of vasculitis, and all the other types, can be delayed by months or even years. 

“An earlier diagnosis of any of the types of vasculitis and spondyloarthritis we’re working on leads to a much better prognosis and better clinical outcomes,” Merkel said. “Even if we determine that a patient has just a 10 percent likelihood of developing one of these diseases, that is a much higher chance of a rare problem, and clinicians can keep that in mind and make better decisions for their patients.”

Clinicians and their patients face several challenges in diagnosis: rare diseases can camouflage themselves as more common ones, clinicians may lack access to data or to the other clinicians a patient works with, and many are simply unfamiliar with extremely uncommon conditions. An algorithm that automatically scans known information to identify the possibility of a disease like GPA could be lifesaving.

“The increasing availability of real-world data, such as electronic health records collected through routine care, provides a golden opportunity to generate real-world evidence to inform clinical decision-making,” Bian said. “Nevertheless, to leverage these large collections of real-world data, which are often distributed across multiple sites, novel distributed algorithms like PANDA are much needed.”

The researchers plan to pull data through Patient-Centered Clinical Research Networks (PCORnet), a national database that includes information from different health systems on more than 27 million patients. De-identified data from these patients, including lab test results, comorbid conditions, past treatments, and other commonly available information, will be used to create the algorithms. Once the algorithms are built, the researchers will test each one’s predictive power across 10-plus health systems. Following these tests, the methods the team develops will be shared and made available to apply to other diseases.

Because machine learning algorithms are, as the name implies, designed to “learn” and refine themselves as they’re used and fed more data, it’s possible that PANDA will continuously improve and become more helpful as time passes. “The proposed machine learning algorithms will adaptively update their key parameters as more data are made available,” said Chen. “We plan to evaluate these machine learning algorithms periodically to ensure they meet our pre-specified standards and can evolve positively over time.”

The grant funding the research is 1U01TR003709.


Penn Medicine is one of the world’s leading academic medical centers, dedicated to the related missions of medical education, biomedical research, excellence in patient care, and community service. The organization consists of the University of Pennsylvania Health System and Penn’s Raymond and Ruth Perelman School of Medicine, founded in 1765 as the nation’s first medical school.

The Perelman School of Medicine is consistently among the nation's top recipients of funding from the National Institutes of Health, with $550 million awarded in the 2022 fiscal year. Home to a proud history of “firsts” in medicine, Penn Medicine teams have pioneered discoveries and innovations that have shaped modern medicine, including recent breakthroughs such as CAR T cell therapy for cancer and the mRNA technology used in COVID-19 vaccines.

The University of Pennsylvania Health System’s patient care facilities stretch from the Susquehanna River in Pennsylvania to the New Jersey shore. These include the Hospital of the University of Pennsylvania, Penn Presbyterian Medical Center, Chester County Hospital, Lancaster General Health, Penn Medicine Princeton Health, and Pennsylvania Hospital—the nation’s first hospital, founded in 1751. Additional facilities and enterprises include Good Shepherd Penn Partners, Penn Medicine at Home, Lancaster Behavioral Health Hospital, and Princeton House Behavioral Health, among others.

Penn Medicine is an $11.1 billion enterprise powered by more than 49,000 talented faculty and staff.
