NEW YORK — A team of European researchers has developed a new approach to diagnose patients with myelodysplastic syndromes, a group of malignant hematological disorders. The workflow relies on flow cytometry data coupled with a cell population detection method and a machine learning classifier to produce a diagnosis.
According to its inventors, the approach yields a result in less than three minutes. That would represent an improvement over conventional methods, which rely on bone marrow morphology, cytogenetics, and other information to reach a diagnosis.
In a study detailed in Cytometry Part A last month, the researchers validated the method in a cohort of 230 patients with 90 percent sensitivity and 93 percent specificity. The next step toward bringing the method into routine clinical use will be to apply it in other samples, according to Yvan Saeys, group leader at the VIB Center for Inflammation Research at Ghent University in Belgium.
The Belgian researchers worked with partners at Amsterdam University Medical Center (AUMC) in the Netherlands for the current study, and the two groups are now reaching out to potential collaborators in France and Germany to explore their method in other settings.
"There are many reasons that you can expect there will be variation in other patient cohorts because people are using slightly different instruments to measure bone marrow samples, so you will want to validate if the method applies to other ways of measuring the data," said Saeys, a coauthor on the Cytometry Part A study. There could also be other differences in cohorts, depending on geography or other factors.
"The next step is to gather more data from more patient cohorts from other hospitals," said Saeys. "If the method still works there, we can start working toward a general diagnostic system that can be implemented in different clinics."
Saeys has been working with machine learning tools for the past two decades, and his lab at the VIB Center for Inflammation Research includes experts in computational biology as well as those with medical backgrounds. Together, they develop computational models for different inflammatory diseases, including techniques to analyze flow cytometry data. Several years ago, they were approached by a group at AUMC that was interested in applying machine learning tools to MDS, a group of rare blood disorders that are difficult to diagnose.
"From a clinical perspective, it's a very heterogeneous disease, so it's not easy to say if that parameter is above a certain threshold, we can say for sure it is MDS," Saeys said. "And if you are below a certain threshold, you also cannot be sure the person does not have the disease."
Carolien Duetz, a Ph.D. candidate at AUMC and lead author on the new study, agreed there is a high clinical need to improve the diagnostic workup of MDS. Symptoms of MDS can occur in other diseases, so patients typically must meet several criteria before being formally diagnosed.
The lab of Arjan van de Loosdrecht, a professor of hematology at AUMC, has been developing flow cytometry-based tests for MDS, but these are still time-consuming, and can take up to an hour to produce a diagnosis for each patient. "The idea was to turn to machine learning tools to decrease the time for diagnosis," said Duetz. "There is also room for improvement with regards to accuracy and ease of use."
Working together, the groups developed a computational pipeline for diagnosing MDS. It combines three components: FlowAI, a tool for preprocessing flow cytometry data; FlowSOM, an algorithm used here to generate features; and a random forest classifier.
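The feature generation step can be pictured as turning each patient's cells into a fixed-length vector that a classifier can consume. The Python sketch below is a simplified illustration, not the team's code: it assigns each cell to the nearest of a few cluster centers, a stand-in for the cell populations that FlowSOM would detect, and reports the fraction of cells in each cluster. The marker values, the centers, and the function names are all hypothetical.

```python
import math

def nearest_cluster(cell, centers):
    """Return the index of the center closest to the cell (Euclidean distance)."""
    distances = [math.dist(cell, center) for center in centers]
    return distances.index(min(distances))

def cluster_fractions(cells, centers):
    """Turn one patient's cells into a vector of per-cluster cell fractions."""
    counts = [0] * len(centers)
    for cell in cells:
        counts[nearest_cluster(cell, centers)] += 1
    return [count / len(cells) for count in counts]

# Hypothetical two-marker measurements for four cells from one sample.
cells = [(0.1, 0.2), (0.0, 0.3), (0.9, 1.0), (1.1, 0.8)]
# Hypothetical cluster centers; a real workflow would learn these from data.
centers = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

features = cluster_fractions(cells, centers)
print(features)  # [0.5, 0.5, 0.0]
```

A vector like this, computed for every patient, is the kind of input a random forest classifier could be trained on.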
The investigators evaluated the workflow on six tubes of a flow cytometry panel in a validation cohort of MDS patients who had been diagnosed using conventional methods. They also identified the best-performing tube and assessed a single-tube version of the workflow in the same validation cohort, where it yielded 97 percent sensitivity and 95 percent specificity. In a second validation cohort, made up of the subset of MDS patients with excess blasts, both the six-tube and single-tube workflows achieved 100 percent sensitivity. The analysis took less than three minutes.
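For readers less familiar with the two metrics, sensitivity is the share of true cases the test flags, and specificity is the share of non-cases it correctly clears. The short Python sketch below shows the arithmetic; the counts are invented for illustration and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients with the disease that the test identifies."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of patients without the disease that the test clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts, not study data: 10 patients with the disease, 20 without.
sens = sensitivity(true_pos=9, false_neg=1)
spec = specificity(true_neg=18, false_pos=2)
print(sens, spec)  # 0.9 0.9
```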
"We actually found out there is a lot of hidden information in flow cytometry data that people typically don't look at when they are analyzing or making a diagnosis," said Saeys. "You have to realize, for all of these clinical workflows there is still a lot of manual analysis, people looking at pictures of cells, and people manually defining cell types that could be abnormal," he said. "What we have tried to do is make computational methods that look at all the information, not just a subset of the information."
The work was supported through MDS-RIGHT, a project backed by the European Union Horizon 2020 program, which ran from 2015 through October last year and had a budget of €6 million.
Ongoing validation work using other cohorts will be crucial to progressing the method to clinical utility, Saeys noted.
"Ideally we are looking for large medical centers that have a substantial number of patients," he said. For now, the team is working with European partners, but welcomes collaborators from other regions. Saeys added that since the team's software is freely available, other groups could attempt to replicate the findings in their own patient cohorts.
He stressed that it will take time before the approach makes it into routine clinical care. "This is still a research project," said Saeys. "Implementing it in the clinic requires further validation and legal aspects."
"This is still a one-center approach, and these methods are sensitive to technical variation in between centers and the machines used to measure cellular proteins," noted AUMC's Duetz. "We need to find other cohorts to validate our methods and also set up a multicenter study so we can use the model we trained on data in our center to classify patients from other centers," she said.
She added that to carry out a multicenter validation of the approach, investigators need to harmonize flow cytometry tools and sample preparation methods. This might involve standardizing parameters between flow cytometers or developing a method that does not rely on parameters requiring additional optimization to achieve greater accuracy.
Clinical implementation will also necessitate the development of a user-friendly interface, Duetz added. "We have to make it easy to use for clinicians and laboratory technicians, so you don't need to be a bioinformatician to use it," she said.
Duetz noted that while machine learning tools are being adopted to guide diagnosis in other diseases, such as cancer of unknown primary or chronic kidney disease, she was unaware of similar tools under development for MDS.
She added that the methods developed for diagnosing MDS had also enabled the detection of new cellular features, which could be linked to disease in the future. "Our approach is not just useful for improving accuracy and ease of use, but also for identifying novel cell characteristics," said Duetz. "That is something else we quite liked about this approach."