NEW YORK – Researchers in the UK have developed a new method to enhance multiplexing capabilities of conventional real-time PCR instrumentation using data analytics. The method, initially demonstrated using digital real-time PCR, could be used to increase the throughput of molecular diagnostic platforms and reduce cost without any changes to instrument hardware.
In two proof-of-concept studies published last month, a team at Imperial College London showed that using real-time PCR data in a novel way could ultimately support development of a nine-target multiplex test for genes conferring colistin resistance using the Fluidigm Biomark HD instrument and digital PCR arrays, a standard intercalating dye called EvaGreen, and a single fluorescence channel.
Jesus Rodriguez Manzano, an infectious disease specialist and lead author of the papers, said that a postdoc in the lab of Rustem Ismagilov at Caltech, followed by an initial position in the electrical engineering department at Imperial, exposed him to an engineering-oriented worldview. "We are looking into real-time amplification data from a new perspective," he said.
The current molecular diagnostics workflow takes too long and is too expensive, he said. However, "I realized we produce a lot of valuable data, and we are not using it properly."
In typical real-time PCR, the fluorescence at the end of each cycle is used to generate an amplification curve, which in turn can be used to quantify the amount of starting template.
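As a rough illustration of that conventional analysis, and not the Imperial team's code, the Python sketch below finds the cycle at which a curve crosses a fluorescence threshold and converts it to a starting quantity with a standard curve. The threshold, the intercept of 38, and the simulated sigmoid curve are assumptions chosen for illustration; a slope of about -3.32 corresponds to a reaction that doubles the product every cycle.

```python
import math


def cycle_threshold(fluorescence, threshold):
    """Return the fractional cycle at which the curve first crosses threshold.

    fluorescence[i] is the reading taken at the end of cycle i + 1.
    """
    for i in range(1, len(fluorescence)):
        low, high = fluorescence[i - 1], fluorescence[i]
        if low < threshold <= high:
            # Linear interpolation between cycle i and cycle i + 1.
            return i + (threshold - low) / (high - low)
    return None  # the curve never crossed the threshold


def quantity_from_ct(ct, slope=-3.32, intercept=38.0):
    """Invert a standard curve of the form Ct = slope * log10(quantity) + intercept.

    The default slope and intercept are hypothetical calibration values.
    """
    return 10 ** ((ct - intercept) / slope)


# Simulated 40-cycle amplification curve (assumed sigmoid, midpoint at cycle 20).
curve = [1.0 / (1.0 + math.exp(-0.5 * (c - 20))) for c in range(1, 41)]
ct = cycle_threshold(curve, threshold=0.2)
print(ct, quantity_from_ct(ct))
```

In practice the slope and intercept would come from a dilution series run on the same assay and instrument.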
Using a DNA intercalating dye, it is also possible to determine the temperature range over which the two strands of DNA melt apart. By measuring this in real time, a melt curve is generated that can be used to distinguish the presence of different targets.
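A common way to read a melt curve is to take the negative derivative of fluorescence with respect to temperature and report the temperature at its peak. The Python sketch below does this on a simulated curve; the sigmoid melt model, the 0.5-degree temperature step, and the melting temperature of 82.25 degrees Celsius are assumptions for illustration only.

```python
import math


def simulated_melt_curve(tm, temps, width=1.0):
    """Fluorescence falls in a sigmoid as the strands separate around tm (toy model)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


def melting_temperature(temps, fluorescence):
    """Return the midpoint of the interval where -dF/dT is largest."""
    best_temp, best_rate = None, float("-inf")
    for i in range(1, len(temps)):
        rate = -(fluorescence[i] - fluorescence[i - 1]) / (temps[i] - temps[i - 1])
        if rate > best_rate:
            best_rate = rate
            best_temp = (temps[i] + temps[i - 1]) / 2.0
    return best_temp


# Temperature ramp from 60 to 95 degrees Celsius in 0.5-degree steps (assumed).
temps = [60.0 + 0.5 * i for i in range(71)]
signal = simulated_melt_curve(82.25, temps)
tm_estimate = melting_temperature(temps, signal)
print(tm_estimate)
```

Two targets whose peaks fall at clearly different temperatures can be told apart this way even though they share one dye and one fluorescence channel.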
In the first study, published in Analytical Chemistry last month, the team explored ways to enhance multiplexing capabilities using only the amplification curve of the reaction. In the second study, also in Analytical Chemistry, the group incorporated the melting curve data to create a signature for each target.
"We have demonstrated that the information contained in the amplification and melting curves is non-mutual and it can be used in our favor to improve target classification," Rodriguez Manzano said.
The studies build on data-driven approaches in which he and his team explored single-channel multiplexing of carbapenem-resistance genes and used multidimensional standard curves for DNA quantification and outlier detection.
The methods rely on machine learning to train algorithms to distinguish the amplification and melt signatures for each target in a multiplex assay.
This training process requires inputting hundreds of reactions to teach the algorithm, but the team hit on a useful solution to this challenge using real-time digital PCR.
"Digital PCR instruments provide a huge amount of data that can be used to develop our machine learning algorithms," Rodriguez Manzano said. "They are an excellent tool for understanding the fundamental processes that occur during an amplification reaction."
The group used the Biomark HD platform from Fluidigm because it is currently the only commercially available instrument that can perform real-time dPCR and allow the user to access the amplification and melt curve data.
The system compartmentalizes the PCR reaction into droplets using a microfluidic chip, which is then imaged to obtain the fluorescence intensity of each droplet after each of 40 PCR cycles.
"We use these forty data points to generate a unique multidimensional signature, which is specific to the assay," Rodriguez Manzano said. "By doing so, we can now simultaneously quantify and identify the amplified product, only considering the sigmoidal shape of the amplification reaction."
The Fluidigm digital PCR chip creates 48 panels of 770 or so droplets, and users typically put in one sample per panel. For research purposes, the team needed amplification and melt curve data from positive droplets. With a high concentration of target, a single sample could yield approximately 300 positive droplets, for example. "If we wanted to do that in qPCR, we would need to run 300 wells," Rodriguez Manzano explained. These reactions then generated the data used to train the algorithms.
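To give a sense of how a 40-point curve can act as a signature, the Python sketch below treats each amplification curve as a 40-dimensional vector and assigns an unknown curve to the closest reference. This is a deliberately simple nearest-reference match, not the trained machine-learning models described in the papers, and the target names and curve parameters are hypothetical.

```python
import math

CYCLES = 40


def sigmoid_curve(plateau, slope, midpoint, cycles=CYCLES):
    """Simulate one fluorescence reading per cycle (toy sigmoid model)."""
    return [plateau / (1.0 + math.exp(-slope * (c - midpoint)))
            for c in range(1, cycles + 1)]


def distance(a, b):
    """Euclidean distance between two 40-point signatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(curve, references):
    """Return the name of the reference signature closest to the curve."""
    return min(references, key=lambda name: distance(curve, references[name]))


# Hypothetical reference signatures, e.g. averaged from many positive partitions.
references = {
    "target_A": sigmoid_curve(plateau=1.0, slope=0.5, midpoint=20.0),
    "target_B": sigmoid_curve(plateau=0.7, slope=0.9, midpoint=20.0),
}

# An unknown partition whose curve resembles target_A with small deviations.
unknown = sigmoid_curve(plateau=0.95, slope=0.55, midpoint=20.5)
print(classify(unknown, references))
```

A real classifier would have to cope with noise and with curves that shift as the starting concentration changes, which is one reason hundreds of training reactions per target are useful.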
In the recently published approach, the team is not disregarding the digital PCR results, however. "We merge Poisson statistics with kinetic and thermodynamic information," Rodriguez Manzano said, so "we can still quantify, as any digital platform will do, and we are adding on top the analysis that enables higher multiplexing capabilities."
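The Poisson step is standard in digital PCR: because a partition can receive more than one template molecule, the average number of copies per partition is estimated as -ln(1 - p), where p is the fraction of positive partitions. The Python sketch below applies that formula to the article's example figures of roughly 300 positives in a 770-partition panel; the function name is illustrative and not taken from the team's code.

```python
import math


def copies_per_partition(positive, total):
    """Poisson estimate of mean template copies per partition.

    Requires at least one negative partition, otherwise the estimate is unbounded.
    """
    if not 0 <= positive < total:
        raise ValueError("positive must satisfy 0 <= positive < total")
    return -math.log(1.0 - positive / total)


positive, total = 300, 770  # example figures mentioned in the article
lam = copies_per_partition(positive, total)
estimated_copies = lam * total
print(round(lam, 3), round(estimated_copies))
```

The estimated copy number comes out higher than the count of positive partitions because some partitions are expected to hold two or more molecules.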
The method seems to be quite robust as well. "We are obtaining really great results very fast, which is not common in science," he said.
The team developed and validated its single-channel nonaplex dPCR assay for nine mobilized colistin resistance, or mcr, genes quite quickly and without much fuss, he said. "We designed the assay, performed a few experiments to evaluate its performance, trained the algorithms, and it worked perfectly."
Now, at his lab in Imperial's department of medicine, Rodriguez Manzano is working to port the method to standard qPCR instruments, adapt it for use with TaqMan probes, and evaluate it in a clinical setting using an assay for genes conferring carbapenem resistance.
The group has also patented the method, and Rodriguez Manzano is interested in potentially commercializing it through a partnership with a qPCR instrument maker. "Our data-driven approach enables the enhancement of these platforms without the need of hardware modifications," he said.
The method could potentially be developed as a piece of software that a user can couple to a qPCR instrument, which will process the data and provide a target classification or a diagnostic report.
In addition, all of the code is available on GitHub for public use, and the links are provided within the manuscripts. "It requires some expertise to train the algorithms for new applications, but it is a really straightforward method once this has been done," he said.
Rodriguez Manzano is eager to bring this new approach to the clinic. "I don't want this methodology to stay in academia only, I would like to deploy it," he said.
A similar data-based approach to multiplexing is deployed by ChromaCode. The Carlsbad, California-based firm markets infectious disease panels — for tick-borne illness, multidrug-resistant infections, and COVID-19 — that can be run on standard qPCR instrumentation. The core technology, called high-definition PCR, employs novel chemistries and machine-learning algorithms, including detecting multiple targets in a single fluorescence channel by restricting the concentrations of probes, essentially expanding the multiplexing capabilities of instruments to up to 20 targets.
Aditya Rajagopal, cofounder and chief technology officer at ChromaCode, said the Imperial College methods are "a clever approach." He began developing what would become ChromaCode's core approach as a graduate student at Caltech.
An electrical engineer and physicist by training, he reasoned that PCR-based diagnostics are facing the same sorts of problems as the telecom industry, namely a reliance on hardware infrastructure built for the needs of a prior era. Data compression and mathematical approaches were an obvious way to improve upon existing technologies. "These types of data science or data compression approaches to life science measurements are absolutely necessary," Rajagopal said.
From a bird's eye view, companies like Illumina, building sequencers, and companies like IDT DNA, NEB, or Roche, inventing new enzymes and chemistries, are just part of the diagnostics puzzle, he said.
"The complete puzzle comes when you marry a fractal-like complexity to match the complexity of biology itself," Rajagopal said, adding that "you have to have a nexus of great tools, great chemistry, as well as great analytics" to achieve improvements in clinical utility.
The COVID-19 pandemic seems to be leading to increased knowledge and attention for diagnostic technologies in general, as well as a new era of awareness among the general public. "I've heard people talking about PCR in a grocery store," Rajagopal said.
Rodriguez Manzano also noted that although point-of-care testing has been somewhat neglected in the past, everyone can now see the potential benefits of home tests more clearly.
"This is golden time for diagnostics," he said.