Researchers from Mass General Brigham determined that ChatGPT achieved an accuracy rate of almost 72 percent across all medical specialties and phases of clinical care, and 77 percent accuracy in making final diagnoses.
Researchers from Mass General Brigham have conducted a study showing that ChatGPT was approximately 72 percent accurate in overall clinical decision-making, from suggesting potential diagnoses to making final diagnoses and care management decisions. The large language model (LLM)-based AI chatbot performed consistently in both primary care and emergency settings across medical specialties. The findings were recently published in the Journal of Medical Internet Research.
“Our paper comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario, from differential diagnosis all the way through testing, diagnosis, and management,” said corresponding author Marc Succi, MD, associate chair of innovation and commercialization and strategic innovation leader at Mass General Brigham and executive director of the MESH Incubator.
“No real benchmarks exist, but we estimate this performance to be at the level of someone who has just graduated from medical school, such as an intern or resident. This tells us that LLMs, in general, have the potential to be an augmenting tool for the practice of medicine and support clinical decision-making with impressive accuracy.”
The study was done by pasting successive portions of 36 standardized, published clinical vignettes into ChatGPT. The tool first was asked to come up with a set of possible, or differential, diagnoses based on the patient’s initial information, which included age, gender, symptoms, and whether the case was an emergency. ChatGPT was then given additional pieces of information and asked to make management decisions as well as give a final diagnosis—simulating the entire process of seeing a real patient. The team compared ChatGPT’s accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured blinded process, awarding points for correct answers and using linear regressions to assess the relationship between ChatGPT’s performance and the vignette’s demographic information.
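The points-based scoring described above can be illustrated with a minimal sketch. It assumes each graded item is recorded as a category, the points earned, and the points available; the function name `accuracy_by_category` and the toy scores are hypothetical and are not taken from the study. The regression step is not shown.

```python
from collections import defaultdict


def accuracy_by_category(scored_items):
    """Return points earned divided by points possible for each category.

    scored_items is an iterable of (category, points_earned, points_possible).
    This record layout is an assumption made for illustration.
    """
    earned = defaultdict(float)
    possible = defaultdict(float)
    for category, got, available in scored_items:
        earned[category] += got
        possible[category] += available
    # Skip categories with no available points to avoid dividing by zero.
    return {c: earned[c] / possible[c] for c in possible if possible[c] > 0}


# Made-up scores for illustration only; these are not data from the study.
toy_scores = [
    ("differential diagnosis", 3, 5),
    ("differential diagnosis", 2, 5),
    ("final diagnosis", 1, 1),
    ("final diagnosis", 0, 1),
    ("management", 3, 4),
]

result = accuracy_by_category(toy_scores)
for category, accuracy in result.items():
    print(f"{category}: {accuracy:.0%}")
```

Pooling points within a category before dividing weights each question by how many points it carries; averaging per-question accuracies instead would be an equally plausible choice, and the article does not say which the authors used.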
The researchers found that overall, ChatGPT was about 72 percent accurate and that it was best in making a final diagnosis, where it was 77 percent accurate. It was lowest-performing in making differential diagnoses, where it was only 60 percent accurate. And it was only 68 percent accurate in clinical management decisions, such as figuring out what medications to treat the patient with after arriving at the correct diagnosis. Other notable findings from the study included that ChatGPT’s answers did not show gender bias and that its overall performance was steady across both primary and emergency care.
“ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine when a physician has to figure out what to do,” said Succi. “That is important because it tells us where physicians are truly experts and adding the most value—in the early stages of patient care with little presenting information, when a list of possible diagnoses is needed.”
The authors note that before tools like ChatGPT can be considered for integration into clinical care, more benchmark research and regulatory guidance are needed. Next, Succi’s team is looking at whether AI tools can improve patient care and outcomes in hospitals’ resource-constrained areas.
The emergence of artificial intelligence tools in health has been groundbreaking and has the potential to positively reshape the continuum of care. Mass General Brigham, as one of the nation’s top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support, and administrative processes.
“Mass General Brigham sees great promise for LLMs to help improve care delivery and clinician experience,” said co-author Adam Landman, MD, MS, MIS, MHS, chief information officer and senior vice president of digital at Mass General Brigham. “We are currently evaluating LLM solutions that assist with clinical documentation and draft responses to patient messages with a focus on understanding their accuracy, reliability, safety, and equity. Rigorous studies like this one are needed before we integrate LLM tools into clinical care.”
Reference: “Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study” by Arya Rao, Michael Pang, John Kim, Meghana Kamineni, Winston Lie, Anoop K Prasad, Adam Landman, Keith Dreyer and Marc D Succi, 22 August 2023, Journal of Medical Internet Research.
DOI: 10.2196/48659
The study was funded by the National Institute of General Medical Sciences.
