
Counsyl, Natera at Odds Over Simulation Study of NIPT Performance at Low Fetal Fraction


NEW YORK (GenomeWeb) – Genetic testing firms Counsyl and Natera are sparring over a study published by Counsyl in March that used mathematical simulations to explore how two noninvasive prenatal testing methods that are used by the companies compare in cases with very low fetal DNA fractions.

Appearing in Prenatal Diagnosis, the study juxtaposed simulated results from the shotgun sequencing approach at the heart of Counsyl's commercial NIPT and the SNP-based method commercialized by Natera.

Natera does not return results from samples that do not meet a predetermined fetal fraction threshold of 2.8 percent. However, some labs that use whole-genome sequencing (WGS) methods report aneuploidy calls for all samples, regardless of fetal fraction. 

According to Counsyl, the purpose of its study was to use a simulation to compare the performance of the two methods in order to determine whether WGS may detect more aneuploidies in low fetal fraction samples than a SNP-based method.

The authors of the study argue that their results do indeed suggest their own WGS approach is sensitive enough to return results for samples that would be no-calls for Natera, with the implication that Natera's method leads to a higher incidence of invasive follow ups, and in turn, higher rates of complications and pregnancy loss.

Natera this week published a rebuttal in the same journal contesting the modeling methods used by Counsyl. The Counsyl study authors also added their own response to this letter.

At the heart of the debate is whether the Counsyl authors' calculations are valid, but also whether they support the conclusions drawn with regard to clinical practice.

Howard Cuckle, a professor of obstetrics at Columbia University and at Tel Aviv University, said that a simulation study of this kind is a widely used tool, and necessary to answer questions like this that can't be addressed using scarce clinical data. However, he said, this type of comparison can "at best" suggest that one method performs better than another at low fetal fraction "given a particular set of assumptions."

"The conclusion drawn by the authors that they have proven a superior performance is too strong. It requires caveat," he said.

Cuckle used to be a paid consultant for several NIPT companies, including Natera, Ariosa, and Vanadis. He is a consultant for PerkinElmer, which owns Vanadis, and a director of a company called "This is My," which provides NIPT testing in the UK.     

Despite Cuckle's view of the limited takeaway from this type of modeling study, Natera and Counsyl are still opposed over the merits of the analysis.

Natera CEO Matt Rabinowitz said that when he saw the initial study published this spring, he viewed it as a willful betrayal of science in favor of marketing, calling the approach taken in the study "gratuitously wrong."

According to Natera's response letter, authored by the company's senior director of statistics research and development Allison Ryan, Counsyl's analysis has two main flaws. First, it inaccurately models WGS data in a way that paints Counsyl in a better light. Secondly, it inaccurately recreates Natera's method in a way that paints it in a poorer light.

More specifically, Natera said the modeling done in the study does not inject enough variation into the WGS data, which, if included, would have reduced the simulated efficacy of the WGS method.

On the other hand, the modeling of Natera's SNP method disregards the influence of linkage between SNPs, thereby underestimating its performance, Ryan and her colleagues wrote.

WGS variance

According to Rabinowitz, the literature is replete with evidence that Counsyl's simulation of its own WGS approach is overly optimistic.

"Their paper makes a completely artificial determination of what the sensitivity and specificity of their method is, when … we know from the literature that shotgun sequencing suffers from false positive and negatives not just at low fetal fractions but at much higher fractions," he said.

Beyond what it argues is an oversimplification of WGS data to the point that the only source of variation is depth of sequencing, Natera also took strong issue with the fact that the Counsyl study did not directly compare its simulated WGS data to actual sequencing data in order to show that its modeling hews closely enough to real-world results.

According to Ryan, this is standard practice for simulation-based analyses, and in the absence of this type of validation, there is no justification for using the simulation to make claims about clinical performance.

In their own response to Natera's letter this week, Dale Muzzey and other coauthors of the initial study argued that Natera's criticism in this respect is invalid.

For example, they wrote, a figure in the supplemental material of their study "directly refutes" Natera's claim that they failed to incorporate sources of variance other than number of reads into their simulation.

"The figure demonstrates that the WGS method retains higher analytical sensitivity than the SNP method at variance levels that are actually far in excess of what was experimentally observed even in the infancy of WGS-based NIPS," they wrote.

Natera's Allison Ryan agreed that the figure in question does simulate WGS data with increased variance, but said it still doesn't address other sources of variance that may be important.

"This figure shows what would happen (in simulation) if rather than behaving according to a Poisson distribution, the WGS data behaved according to an over-dispersed negative binomial," Ryan said. "It shows that as variance increases beyond what is in the Poisson model, you see more variance in the reads per bin and corresponding lower sensitivity in the simulated trisomy detection."

However, she added, "they did not use these over-dispersed methods in their simulation [and] the analysis does not address [the] concern that the authors' WGS simulation does not generate realistic data. It just shows a method that could possibly be used to generate more realistic data. Their simulation uses the Poisson model which is the most optimistic one shown in the figure. The reader is asked to assume that WGS data fits this best-performing model, without being shown evidence."
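The gap Ryan describes between the Poisson model and an over-dispersed one can be illustrated with a rough calculation. The sketch below is not the code from either company's analysis. It uses a normal approximation, assumes bins are independent, and treats the euploid expectation as known exactly. Every number in it (bin count, reads per bin, dispersion, z-score cutoff) is a hypothetical placeholder. It shows only the direction of the effect: variance beyond the Poisson level lowers the approximate sensitivity at a given fetal fraction.

```python
from math import sqrt
from statistics import NormalDist


def per_bin_variance(mean_reads, dispersion=None):
    """Variance of the read count in one bin.

    dispersion=None models Poisson counts (variance equals the mean).
    Otherwise counts are negative binomial with size parameter
    `dispersion`, so variance = mean + mean**2 / dispersion.
    """
    if dispersion is None:
        return mean_reads
    return mean_reads + mean_reads ** 2 / dispersion


def approx_sensitivity(fetal_fraction, n_bins, mean_reads,
                       dispersion=None, z_cutoff=3.0):
    """Normal-approximation chance that a trisomic chromosome is flagged.

    Assumption: a fetal trisomy raises expected reads on the chromosome
    by a factor of (1 + fetal_fraction / 2), and the chromosome is
    called trisomic when its total count sits more than z_cutoff
    standard deviations above the euploid expectation.
    """
    excess_reads = n_bins * mean_reads * fetal_fraction / 2
    # Independent bins: total variance is the sum of per-bin variances.
    total_sd = sqrt(n_bins * per_bin_variance(mean_reads, dispersion))
    shift = excess_reads / total_sd
    return 1.0 - NormalDist().cdf(z_cutoff - shift)


if __name__ == "__main__":
    # Hypothetical settings, chosen only to make the contrast visible.
    for ff in (0.01, 0.02, 0.03):
        poisson = approx_sensitivity(ff, n_bins=500, mean_reads=1000)
        overdispersed = approx_sensitivity(ff, n_bins=500, mean_reads=1000,
                                           dispersion=500)
        print(f"fetal fraction {ff:.0%}: "
              f"Poisson {poisson:.3f}, over-dispersed {overdispersed:.3f}")
```

Which dispersion value, if any, matches real WGS data is exactly the point in dispute, and a calculation like this cannot settle it without a comparison to actual sequencing runs.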

According to Ryan, a responsible practice would be to follow up this type of analysis of possible variance with a comparison of the simulated data with real WGS data. "If the variance of the real data is most similar to the Poisson model, then we would know that it is realistic," she explained.

The Counsyl authors did not do this in their study but they did reference an article by researchers from Stephen Quake's lab at Stanford University, which shows that by following suitable corrections, bin counts can be produced with "near-ideal Poisson" expectations.

Muzzey argued that because this Quake paper used real-world data, by training their own simulations on the findings of this paper, he and his colleagues were effectively comparing to "real-world" data.

Moreover, he explained, the team did run their simulation with a range of different variance levels, up to a noise level that would be unreasonably high based on the findings of the Quake study and far beyond what has been experimentally observed even in early use of WGS.

According to these simulations, the WGS method still outperformed the SNP method at low fetal fractions even at the highest levels of noise.

SNP linkage

Besides the use of what they still contend is artificially clean data in modeling WGS, the Natera authors argued that the way Counsyl replicated Natera's own SNP-based NIPT method misrepresents its accuracy.

According to the company, Counsyl's simulated SNP method assumes, incorrectly, that each SNP is inherited independently from adjacent loci, ignoring linkage. In contrast, Natera’s actual NIPT incorporates inheritance across haplotype blocks, which it argues is critical to accurately detect trisomies.

"As it is missing this critical component, the SNP-based classifier implemented by the authors cannot be used to predict performance limitations of a fully-implemented method," Ryan and her coauthors wrote.

In their response to Natera's argument, Counsyl's Muzzey and colleagues countered that, rather than handicapping the SNP method, their omission of linkage actually portrays the SNP method in its best-case scenario, where no crossing over occurs at all. This would mean that if anything, their simulation would overestimate the performance of Natera's test.

"Our approach may seem counterintuitive, but it is based precisely on Natera's published disclosures of their implementation of the SNP method: according to Natera's SNP-method patent application, in the absence of crossovers — that is, with deterministic rather than probabilistic linkage information — the SNP method's equation for handling linked SNPs reduces to a sum of log-likelihoods over individual SNPs. Therefore, both to evaluate the upper bound of SNP-method performance and to simplify the model to be maximally transparent, our simulations assumed the absence of crossovers," the Counsyl authors explained in their response.

Muzzey said the fact that there was a close correspondence between the 2.8 percent no-call threshold that their simulation yielded for the SNP-method and the clinical threshold Natera actually uses — 3 percent — also supports their belief that they accurately represented the SNP approach, despite ignoring crossovers or linkage.
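The reduction the Counsyl authors describe, in which the evidence for a hypothesis becomes a sum of log-likelihoods over individual SNPs, can be sketched in a few lines. This is a generic illustration of that independence assumption, not Natera's classifier or Counsyl's simulation code. The binomial read model, the allele counts, and the expected allele fractions under each hypothesis are all hypothetical.

```python
from math import comb, log


def snp_log_likelihood(alt_reads, total_reads, expected_alt_fraction):
    """Binomial log-likelihood of the observed allele counts at one SNP."""
    return (
        log(comb(total_reads, alt_reads))
        + alt_reads * log(expected_alt_fraction)
        + (total_reads - alt_reads) * log(1.0 - expected_alt_fraction)
    )


def total_log_likelihood(snps, expected_alt_fraction):
    """Sum per-SNP log-likelihoods.

    Adding the terms is valid only if each SNP is treated as independent
    of its neighbors, which is the simplification under dispute.
    """
    return sum(
        snp_log_likelihood(alt, total, expected_alt_fraction)
        for alt, total in snps
    )


# Hypothetical (alt_reads, total_reads) pairs for three SNPs.
observed = [(2, 100), (3, 100), (1, 100)]

# Hypothetical expected alt-allele fractions under two competing hypotheses.
HYPOTHESES = {"hypothesis_a": 0.02, "hypothesis_b": 0.05}

scores = {
    name: total_log_likelihood(observed, fraction)
    for name, fraction in HYPOTHESES.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

A model that accounts for linkage would instead score runs of neighboring SNPs jointly, so the per-SNP terms could no longer simply be added.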

But Natera's Ryan said that this is fundamentally false. Rather than representing the upper bound of SNP-method performance, the absence of crossover actually significantly handicaps their approach. "Recombination is not a source of noise, it’s a source of information," she explained.

"They are calculating based on a model in which each SNP inherits its genotype from the parents independently of its neighbor, but in real life, alleles are present in haplotypes."

"We know a lot more about what a trisomy would look like because of these [linkage] constraints and if you throw away that information — even though it is consistent in the data system and the classifier — you are making your world of trisomies much more nebulous," she added.

Peter Benn, a University of Connecticut professor of genetics who is also a paid consultant to Natera, offered the analogy of a plane crash.

"Suppose you are in an airplane and you come down and you are unsure if you are in England or Ireland. The only way to figure it out is by what people look like, and you know that Irish people have more red hair, freckles, and fair skin."

In Counsyl's approach to SNP-based analysis, they looked at these three factors individually. But Natera's actual method looks at them combined.

"If you see a person with red hair, freckles, and fair skin, with that linked information, you don’t have to look at as many people before you make a decision about which country you are in," Benn said.

According to Rabinowitz, by ignoring this aspect of Natera's method, the Counsyl study becomes tautological.

Both methods are reduced to counting reads, but in one case, you are counting reads across the genome, and in the other, you are limited to SNPs, he said. "They are doing the same thing, but in the SNP method, they are using fewer positions, so of course it's less sensitive."


Muzzey asserted that his and his coauthors' aims in the study were mainly academic. However, a significant proportion of the study is dedicated to linking the simulation results to distinct clinical implications.

The main conclusion of the report is that Counsyl's WGS method maintains high enough sensitivity for test results to be returned in cases where Natera would have had a no-call. This would mean fewer unnecessary invasive procedures and potential complications for patients.

According to Cuckle, while simulations of the type conducted by Counsyl include quantitative predictions, it is generally understood that these values are not robust because of the assumptions made about the model parameters.

"The utility of the models lies in providing qualitative conclusions," he said. "In other words, while the results may show something about the relative performance of the two methods, the conclusions in no way should be taken as quantitative evidence of the performance of either method."

This is also an aspect of Natera's objection to the Counsyl study results. In particular, Ryan argued, Counsyl's calculation of an 80 percent sensitivity for its own method in low fetal fraction cases cannot be relied upon.

According to Ryan, this objection is bolstered by the fact that the accuracies claimed in the paper for the WGS and SNP methods are incongruent with the results of at least some real-world observational studies.

Counsyl has not published validation data on the clinical sensitivity of its own WGS-based test, so at this point, many of the claims and counterclaims of the two companies in regard to the accuracy of the modeling remain abstract.

However, Rabinowitz went as far as to argue that Counsyl is endangering patients by using the simulation to persuade physicians that its test is more sensitive than Natera's, and that the simulated sensitivity at fetal fractions below 3 percent reflects the real clinical sensitivity of WGS testing.

Muzzey and his coauthors meanwhile suggested that by routinely not calling low-fetal-fraction samples, Natera may have inflated its own publicized clinical performance.

In other words, sensitivity would be expected to drop if a certain proportion of no-calls are instead counted as false negatives.
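The arithmetic behind that point is simple. The sketch below uses made-up counts, not figures from either company or from the study, and assumes the no-calls in question come from affected pregnancies.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity among affected cases that received a result."""
    return true_positives / (true_positives + false_negatives)


def sensitivity_counting_no_calls(true_positives, false_negatives,
                                  affected_no_calls):
    """Sensitivity when affected no-call samples are counted as misses."""
    return true_positives / (
        true_positives + false_negatives + affected_no_calls
    )


# Hypothetical counts for illustration only.
reported = sensitivity(true_positives=99, false_negatives=1)
adjusted = sensitivity_counting_no_calls(
    true_positives=99, false_negatives=1, affected_no_calls=10
)
print(f"reported: {reported:.2f}, counting no-calls as misses: {adjusted:.2f}")
```

How far the figure actually falls depends on how many affected pregnancies end up among the no-calls, which these invented numbers do not capture.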

This would depend, though, on the ultimate accuracy of Counsyl's simulation of SNP-based NIPT performance in low fetal fraction samples, something Natera vehemently argues is incorrect.