NEW YORK (360Dx) – Researchers from Indiana University-Purdue University Indianapolis and elsewhere have developed a machine-learning algorithm for predicting which patients diagnosed with acute myeloid leukemia, or AML, will go into remission following treatment for their disease, and which patients will relapse.
According to a paper published in IEEE Transactions on Biomedical Engineering, the so-called Anomalous Sample Phenotype Identification with Random Effects (ASPIRE) algorithm uses cell phenotype information from bone marrow samples from patients with various AML subtypes to identify changes in disease progression. Its developers claim that the method offers an automated approach for measuring and monitoring treatment response that is crucial for evaluating patient prognosis and assessing the effectiveness of treatment strategies.
Murat Dundar, senior author of the disease-progression study and associate professor of computer science at IUPUI, and Bartek Rajwa, first author of the AML study and research assistant professor of computational biology at Purdue University, initially developed the model in 2010 for identifying emerging pathogens in food samples. They used it in at least one study focused on analyzing emerging strains of Salmonella but saw potential for the underlying methodology in other areas, including cancer studies. In both cases, the underlying problem is detecting and classifying anomalous clusters, Rajwa explained in an interview. Essentially, it amounts to "discovering unknown unknowns" in datasets, he said.
In the context of AML, one of the goals for clinicians is identifying which patients will respond to treatments and which will relapse. However, making those determinations can be quite challenging for clinical pathologists, according to Rajwa. Flow cytometry, which is widely used to diagnose AML cases, provides detailed measurements of phenotypic characteristics of single cells contained in samples, and outputs "gigantic" clouds of data in multidimensional space that can be difficult to interpret manually.
Machine learning algorithms can help with parsing these data clouds, but it is difficult to specify exactly what data points cytometrists and pathologists use to make their determinations. "If it was a one-dimensional curve, you could pretty easily specify [a] cutoff point [that shows] something is abnormal," Rajwa said. That cutoff point is less obvious if you have a multidimensional point cloud.
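Rajwa's contrast between a one-dimensional cutoff and a multidimensional point cloud can be illustrated with a small sketch. It is not taken from the paper: the marker values are invented, the function names `cutoff_1d` and `mahalanobis_2d` are hypothetical, and the Mahalanobis distance is simply one common way to measure how far a point lies from a correlated cloud, not necessarily what ASPIRE does.

```python
import math
import statistics

def cutoff_1d(values, k=3.0):
    """Upper cutoff for a single marker: mean plus k standard deviations."""
    return statistics.mean(values) + k * statistics.stdev(values)

def mahalanobis_2d(point, cloud):
    """Distance of a 2-D point from a reference cloud, scaled by the cloud's covariance."""
    n = len(cloud)
    mx = sum(p[0] for p in cloud) / n
    my = sum(p[1] for p in cloud) / n
    # Sample variances and covariance of the two markers.
    sxx = sum((p[0] - mx) ** 2 for p in cloud) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in cloud) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cloud) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form using the inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Invented reference cells: two markers that rise and fall together.
cloud = [(1, 1), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8), (2, 1.8), (4, 3.9), (3, 3.2)]

odd_cell = (2, 4)          # each marker alone looks ordinary, the combination does not
typical_cell = (4.5, 4.5)  # lies along the main trend of the cloud

x_cut = cutoff_1d([p[0] for p in cloud])
y_cut = cutoff_1d([p[1] for p in cloud])
d_odd = mahalanobis_2d(odd_cell, cloud)
d_typical = mahalanobis_2d(typical_cell, cloud)

print("passes both 1-D cutoffs:", odd_cell[0] < x_cut and odd_cell[1] < y_cut)
print("distance from cloud (odd vs typical):", round(d_odd, 2), round(d_typical, 2))
```

In this toy data, the odd cell stays under both single-marker cutoffs yet sits far from the cloud once the correlation between markers is taken into account, which is the kind of pattern a per-marker threshold cannot express.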
To that end, rather than trying to build a system that mimics pathologists' analysis process, "we [trained] our system to recognize characteristic patterns describing the immune system or abnormality of the immune system that are predictive of higher or lower responses to therapy," he explained.
ASPIRE is also able to accumulate knowledge on a scale that a clinical cytometrist or pathologist cannot and to use that information to improve its predictions as it receives new cases. A pathologist could potentially remember data from a few hundred AML cases and apply that knowledge to new patients, but computers can remember thousands or even hundreds of thousands of cases, "so its accumulated experience is going to be better the more cases are analyzed," Dundar noted in an interview. The algorithm would help pathologists draw on data from thousands of patients when diagnosing new ones, rather than relying on the few cases that they can remember.
Furthermore, computers can visualize data in much higher dimensions than humans are able to and can potentially capture important properties and characteristics that are lost when data dimensionality is reduced. "Human beings are not equipped with imagination to visualize anything beyond three dimensions," Rajwa said.
According to the National Cancer Institute, there were an estimated 19,950 new AML cases in 2016 and an estimated 10,430 deaths from the disease for the year. AML represents 1.2 percent of all new cancer cases in the US. Only 26 percent of patients diagnosed with AML will survive five years after diagnosis.
Predicting which AML patients will go into remission and which will relapse is a puzzle that the researchers don't claim they've fully solved, "but our initial proof of concept clearly demonstrates that there is some potential in using this machine learning algorithm," Dundar said. "Most hospitals in the US don't have the resources to hire two pathologists or cytometrists, so they could, for example, hire one pathologist and then have this algorithm deployed on a computer. That is how we envision this algorithm [working] in the long term."
As explained in the paper, the researchers used flow cytometry data from bone marrow samples and medical history data from AML patients, as well as blood data from healthy individuals — a total of 200 diseased and non-diseased immunophenotypic panels — to train their algorithm. They then tested it with data from 36 additional AML cases that were collected at multiple time points.
Specifically, the researchers used non-parametric Bayesian algorithms to cluster patients in the training dataset into sub-populations that represent different cancer cell phenotypes. They then trained a classification algorithm to learn the patterns in the data and build a model that distinguishes AML cases that go into remission from those that do not. According to results reported in the IEEE Transactions on Biomedical Engineering paper, the model predicted remission with 100 percent accuracy (26 out of 26 cases) and relapse in 90 percent of the relevant cases (nine out of 10). Full details of the ASPIRE algorithm are available in a separate paper that was published in BMC Bioinformatics in 2014.
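The general idea of clustering without fixing the number of groups in advance can be sketched in a few lines. The example below is not ASPIRE: it uses DP-means, a simplified deterministic relative of Dirichlet-process mixture models, and it assumes the clustering operates on per-cell measurements pooled across samples. The data, the penalty `lam`, and the function names `dp_means` and `cluster_proportions` are all hypothetical.

```python
def dp_means(points, lam, max_iter=50):
    """DP-means: like k-means, except that a point whose squared distance to
    every existing centroid exceeds `lam` starts a new cluster."""
    # Start with a single cluster at the global mean.
    centroids = [tuple(sum(c) / len(points) for c in zip(*points))]
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            j = min(range(len(d2)), key=d2.__getitem__)
            if d2[j] > lam:
                centroids.append(tuple(p))  # open a new cluster at this point
                j = len(centroids) - 1
            if labels[i] != j:
                labels[i] = j
                changed = True
        # Move each non-empty cluster's centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
        if not changed:
            break
    # Drop clusters that ended up empty and renumber the rest.
    used = sorted(set(labels))
    remap = {old: new for new, old in enumerate(used)}
    return [remap[l] for l in labels], [centroids[k] for k in used]

def cluster_proportions(sample_labels, n_clusters):
    """Fraction of a sample's cells that fall in each cluster."""
    counts = [0] * n_clusters
    for l in sample_labels:
        counts[l] += 1
    return [c / len(sample_labels) for c in counts]

# Invented two-marker measurements for two samples of four cells each.
sample_a = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (10.0, 10.1)]
sample_b = [(10.2, 9.9), (9.9, 10.0), (10.1, 10.2), (0.1, 0.1)]

labels, centroids = dp_means(sample_a + sample_b, lam=25.0)
props_a = cluster_proportions(labels[:4], len(centroids))
props_b = cluster_proportions(labels[4:], len(centroids))

print("clusters found:", len(centroids))
print("sample A proportions:", props_a)
print("sample B proportions:", props_b)
```

Per-sample proportions like these are one plausible kind of input for the second stage the article describes, in which a classifier learns which patterns go with remission and which with relapse. The paper's actual features and classifier may differ.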
Besides AML, the ASPIRE algorithm can be used to analyze data from other hematological neoplasms. The logical next step would be to test the algorithm in a translational context with an eye towards moving it into clinical use, but the researchers have been unable to secure funding for that purpose.
"It's much, much easier to get funding to do basic research [but] it is very tough to get funding for this next step," Rajwa said. "It is still the major obstacle of taking new ideas to broader use."
Their work has so far been supported with grants from the US Department of Agriculture, the National Institutes of Health, and the National Science Foundation.
Rajwa, Dundar, and their colleagues are continuing to develop the method for use in basic research in areas other than AML. "This is not what we want to do but with the current funding situation … funding a translational science project is not looking very promising," Dundar said.