Algorithmic Approach Looks to Improve Identification of Misdiagnosis

CHICAGO (360Dx) – Researchers at Johns Hopkins University and New York's Albert Einstein College of Medicine have developed an automated method of identifying diagnostic errors based on changes in diagnosis over time, as well as patient outcomes.

"This approach has the potential to transform diagnostic quality and safety across a broad range of clinical problems and settings," the researchers wrote in an article published online last week in the journal BMJ Quality & Safety.

Called Symptom-Disease Pair Analysis of Diagnostic Error, or SPADE, the system combines a number of simple computer algorithms to sift through large clinical and administrative data sets, as well as reports of adverse events. SPADE is meant to follow recommendations from a seminal 2015 report by the National Academy of Medicine — then called the Institute of Medicine — which said that the majority of Americans will fall victim to at least one "meaningful" diagnostic error in their lifetimes, whether from a delayed or an inaccurate diagnosis.

That report, Improving Diagnosis in Health Care, called for extra efforts in identifying inaccurate or delayed diagnoses, as well as the processes that can lead to such mistakes. It also advocated better communication between clinicians and patients, though SPADE does not specifically address the latter, according to the BMJ paper.

"A key advantage of this approach is that using 'hard' clinical outcomes avoids much of the subjectivity inherent in other methods that rely on detailed, human medical record reviews to assess for errors," wrote coauthors David Newman-Toker, director of the Armstrong Institute Center for Diagnostic Excellence at Johns Hopkins Medicine, and Ava Liberman, a neurologist at Montefiore Medical Center in the Bronx, New York.

The traditional method of manual chart review is not only slow and labor intensive, it can also be inaccurate, according to Newman-Toker, who also directs the division of neuro-visual and vestibular disorders in the Hopkins neurology department. Often, a key piece of information is never recorded.

"You have to make a judgment about whether there was an error based on what's not in the chart. At the end of the day, that becomes a subjective discussion and an argument about [whether clinicians should write down everything if they think the information is irrelevant]," Newman-Toker said. "That's why chart reviews, in the end, are not reliable."

He said it is wrong to see chart review as the "gold standard" for identifying diagnostic error, since reviewers do not always agree on what is important. Even with the most rigorous research studies, concordance among raters of chart reviews about whether a diagnostic error occurred is “low to moderate, at best. … That's because it's a tough judgment call, and often the relevant information that you need to make that decision isn't actually in the chart," Newman-Toker said.

"One of the first things we figured out 15 years ago when I started this odyssey was [that] I did a chart review and what I realized was all the relevant information was essentially selectively missing from the chart," he added.

For example, a patient might present in an emergency room complaining of a severe headache that, to an uninformed or unprepared clinician, might appear to be a migraine. In reality, it could be far worse.

"The single most important piece of information in a headache patient in the emergency room is how quickly that headache reached its peak intensity," he said. "If it was less than 10 minutes, then the patient is at risk for a brain aneurysm and if it's more than 10 minutes, that risk is close to zero," explained Newman-Toker, who wanted the initial work on SPADE to focus on stroke.

But not every clinician in the ER knows that, and some ask the wrong questions, such as whether the headache is the worst the patient has ever had, instead of the time from onset to peak intensity. "They don't write down how long it took for the patient to get to the peak of the intensity of their headache and they just write down 'worst headache' or 'not worst headache,' and then they get the diagnosis wrong," Newman-Toker said.

SPADE takes a different approach, looking for potential misdiagnoses by matching symptoms to actual and recommended treatments. Specifically, it aims to identify statistical deviations in treatment from expected relationships between symptoms and diseases.
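The kind of look-back analysis this describes can be illustrated with a small sketch. The code below is purely hypothetical: the record layout, the "benign dizziness"/stroke symptom-disease pair, and the 30-day window are illustrative assumptions, not details taken from SPADE itself. It computes how often a visit labeled with a benign symptom diagnosis is followed, within a fixed window, by the paired serious disease diagnosis; comparing that observed rate to an expected base rate is what would surface a statistical deviation worth investigating.

```python
from datetime import date, timedelta

# Hypothetical visit records: (patient_id, visit_date, diagnosis).
visits = [
    ("p1", date(2018, 1, 3), "benign dizziness"),
    ("p1", date(2018, 1, 20), "stroke"),  # returned 17 days later
    ("p2", date(2018, 1, 5), "benign dizziness"),
    ("p3", date(2018, 2, 1), "benign dizziness"),
]

def lookback_rate(visits, symptom_dx, disease_dx, window_days=30):
    """Fraction of visits labeled with the symptom diagnosis that are
    followed by the paired disease diagnosis within the window."""
    window = timedelta(days=window_days)
    by_patient = {}
    for pid, day, dx in visits:
        by_patient.setdefault(pid, []).append((day, dx))
    flagged = total = 0
    for events in by_patient.values():
        events.sort()  # chronological order per patient
        for i, (day, dx) in enumerate(events):
            if dx != symptom_dx:
                continue
            total += 1
            # Any later visit with the disease diagnosis inside the window?
            if any(later_dx == disease_dx and later_day - day <= window
                   for later_day, later_dx in events[i + 1:]):
                flagged += 1
    return flagged / total if total else 0.0

observed = lookback_rate(visits, "benign dizziness", "stroke")
print(observed)  # 1 of 3 symptom-labeled visits is followed by a stroke diagnosis
```

In a real analysis the observed rate would be contrasted against an expected rate derived from the broader population, so that only statistically excessive return visits, rather than the unavoidable background rate, get flagged.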

Newman-Toker said that SPADE offers two distinct advantages over earlier ways of measuring and assessing diagnostic error.

"Our approach is valid because it relies on actual patient outcomes that happen to patients rather than a vague judgment call about whether somebody did something wrong in the diagnostic process. It's an outcome measure, not a process measure." Newman-Toker said.

"Second, it can be done in a way that doesn't require humans to go dig through every piece of chart material to figure out whether [to give] thumbs-up or thumbs-down on the error. I think it's going to be a way for us to get reliable, cross-institutional comparisons and will ultimately be able to track the harms from diagnostic error on a broad scale in a way that can be publicly reported," he continued.

This viewpoint recalls the work of Larry Weed, the medical informatics pioneer who died in 2017 at the age of 93. Weed, who had been advocating for clinical decision support since the 1960s, promoted the use of computers to "couple" patient-specific problems clinicians see with the ever-growing body of medical knowledge.

Among Weed's creations was the problem-oriented medical record, which helps structure clinician notes in a way that paints a reasonably accurate picture of each patient's situation. "Totally, as Larry Weed suggested, in medicine, diagnosis is ultimately problem-oriented," Newman-Toker said.

"If you take a problem-specific view of how you approach diagnosis, you can not only, as he suggested, get the diagnosis right more often, but also can actually assess whether you're getting the diagnosis right or not, which is what we've suggested."

Newman-Toker called this test of SPADE a proof of concept. "We believe that we've identified and described an approach that could fill an important gap in our current ability to improve diagnosis," he said.

Liberman, Newman-Toker, and other collaborators — particularly at Kaiser Permanente of the Mid-Atlantic States and KP Southern California — are now continuing this work in three ways.

"For the stroke problem, where we've really nailed down the outcomes issue pretty well, we're expanding to link process-of-care measures to the stroke outcomes measures," Newman-Toker said.

Once they are able to define who has or has not suffered a diagnostic error based on outcomes measures, the researchers are applying free-text searches and natural-language processing to help clinicians pin down relevant data in the medical record in hopes of identifying patterns that could lead to accurate predictions of who might be most at risk for harm from a misdiagnosis.

These mistakes could be addressed prospectively, Newman-Toker suggested, by changing practice patterns or by measuring process errors and intervening before any harm is done.

They also are expanding SPADE for other conditions. "We're building out measures for sepsis and heart attack and other conditions," Newman-Toker said, adding that work is being done to include other institutions.

Newman-Toker said that Johns Hopkins not only has partnered with Kaiser Permanente, it is working with the Maryland Patient Safety Center, a state-chartered, federally recognized patient safety organization, to expand some of the measures that SPADE might be able to identify "on the whole state level on our way toward making them national."

He said that he and a collaborator are preparing to submit for peer review research on integrating SPADE into an operational dashboard.

"It's not just in theory," he said. "We're actually able now to do this in practice."