NEW YORK – The Consortium for Analytic Standardization in Immunohistochemistry (CASI) has unveiled its plan, funded by a $2 million grant from the National Cancer Institute, to standardize immunohistochemistry (IHC) tests, particularly those for biomarkers tied to US Food and Drug Administration-approved precision oncology products.
The group, spearheaded by Boston Cell Standards CEO Steve Bogen and with participation from lab experts in the US and other countries, plans to kick off the program by establishing reference standards for IHC-based HER2 testing, including limits of detection for HER2-low expression, which can identify patients eligible for treatment with AstraZeneca and Daiichi Sankyo's Enhertu (trastuzumab deruxtecan). Enhertu was approved by the FDA in August for patients with unresectable or metastatic HER2-low breast cancer who have received prior chemotherapy. In the pivotal DESTINY-Breast04 trial that led to Enhertu's approval, patients were considered HER2-low if they had an IHC score of 1+, or if they had an IHC score of 2+ and were negative for HER2 gene amplification by in situ hybridization (ISH).
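The trial's HER2-low definition amounts to a simple decision rule, which can be sketched in Python as follows. This is an illustration only: the function name `classify_her2` and the category labels are hypothetical, and treating IHC 3+ and IHC 2+/ISH-amplified results as HER2-positive follows common clinical convention rather than anything stated in the trial description above.

```python
from typing import Optional


def classify_her2(ihc_score: str, ish_amplified: Optional[bool] = None) -> str:
    """Map an IHC score (and ISH result, where needed) to a HER2 category.

    Illustrative sketch of the HER2-low definition described in the text.
    """
    if ihc_score == "0":
        # IHC 0 patients were excluded from the trial.
        return "IHC 0"
    if ihc_score == "1+":
        return "HER2-low"
    if ihc_score == "2+":
        if ish_amplified is None:
            # An IHC 2+ result cannot be categorized without ISH.
            return "ISH required"
        # Assumption: ISH-amplified 2+ cases count as HER2-positive.
        return "HER2-positive" if ish_amplified else "HER2-low"
    if ihc_score == "3+":
        # Assumption: IHC 3+ counts as HER2-positive by convention.
        return "HER2-positive"
    raise ValueError(f"unrecognized IHC score: {ihc_score!r}")
```

Under this rule, the only boundary that depends purely on reading the stain is the one between IHC 0 and 1+, which is where a defined limit of detection would matter most.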
After HER2, the remaining tests CASI plans to standardize this year are PD-L1, p53, and BRAF, according to David Dabbs, consortium member and chief of pathology at PreludeDx. Dabbs said advancing standards for these IHC tests, and potentially dozens of others that pathologists and oncologists routinely rely on to identify treatments patients are likely to respond to, will be a "practice changer."
"The tests for biomarkers will be more likely to give the same results in [a lab in] Dubuque, Iowa, as they would in Pittsburgh, Pennsylvania, or San Francisco or Tallahassee," said Dabbs. Standards that improve accuracy and reproducibility of results, he added, may also improve costs and reduce spending on drugs that are unlikely to benefit patients.
While clinical laboratory tests such as blood glucose, cholesterol, or A1c have well-calibrated reference standards that make their results highly reliable between laboratories, IHC tests have never had analytical standards. Instead, IHC tests rely on a control sample, known to be positive for the analyte of interest, that is run in tandem whenever a patient sample is tested.
Interpretation of IHC tests depends on a visual comparison of spots on the slide and can be influenced by factors such as the strength of the stain, slight variations in incubation times, and even the temperature in the laboratory, according to Keith Miller, former director of the UK National External Quality Assessment Scheme for Immunohistochemistry and In Situ Hybridization, and a CASI steering committee member.
A lab in the UK, without adequate air conditioning during last summer's heat wave, may be prone to over-staining, Miller said, adding that in "laboratories that don't have proper heating systems, the opposite can happen." Miller describes IHC as an "archaic" practice, dating back to the 19th century, when the method of formalin fixation and paraffin wax embedding of tissue samples was invented.
With increasing demand for quantitative biomarker tests, which ask not only if the biomarker is present, but how much, these older methods are being pushed to the limits of their intended use. The fact that laboratories are using different standards, combined with the challenge of distinguishing different shades of brown-stained spots with the human eye, makes both the test results and the treatment recommendations doctors base on them unreliable.
"In the 1990s, we had a couple handfuls of antibodies that were being used for detecting cell lineage," said Dabbs. "But now we're being asked to look at biomarkers that are critically important and linked to treatments that cost enormous amounts of money. We want to be sure that our tests are going to be fit for purpose, accurate, and reproducible."
As an example of how current IHC tests can fall short in precision medicine applications, Dabbs noted that in the DESTINY-Breast04 trial, upon which the FDA based its approval decision for Enhertu in HER2-low breast cancers, investigators used Roche's Ventana HER2 IHC test to select patients for treatment who had IHC 1+ results or IHC 2+ results with negative ISH. They excluded patients with IHC 0 results. Dabbs pointed out, however, that the limit of detection between IHC 0 and 1+ is unknown, and pathologists cannot reliably distinguish between the two when looking at slides.
In fact, the Ventana test was not designed to gauge HER2-low expression but was repurposed to detect it in the pivotal trial, Dabbs noted. "While the trial results were very positive, indicating that the drug was better than the standard-of-care chemotherapy for that group of patients, it's quite possible that a substantial number of patients were not on the wagon for treatment because we literally don't know what was happening at the low end of that assay," he said.
CASI's recent publication in Archives of Pathology and Laboratory Medicine outlines its plan to address these uncertainties, develop reference standards, and then provide labs the tools for calibrating their tests to those standards. The group will use calibrators developed by Boston Cell Standards comprising purified analytes, such as HER2 or PD-L1, bound to glass microbeads, which are spotted onto microarray slides in a range of concentrations from low to high.
The analytes will be traceable to National Institute of Standards and Technology standard reference material 1934, meaning they will be produced according to well-defined NIST criteria. The group will stain the microbeads using the same method used for formalin-fixed paraffin-embedded tissue samples, then analyze digital images of the slides to correlate the intensity of the staining with analyte concentration and calculate a lower limit of detection for each analyte.
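To make the idea of a lower limit of detection concrete, here is a minimal Python sketch. It is not CASI's published method: it assumes one common convention, in which the limit is the lowest calibrator concentration whose mean stain intensity exceeds the mean of unstained (blank) beads by three standard deviations. The function name, the multiplier `k`, the arbitrary units, and all of the numbers are made up for illustration.

```python
from statistics import mean, stdev


def lower_limit_of_detection(blank_intensities, calibrators, k=3.0):
    """Return the lowest calibrator concentration detectable above background.

    blank_intensities: stain intensities measured on beads with no analyte.
    calibrators: dict mapping concentration -> list of measured intensities.
    k: number of blank standard deviations above the blank mean (assumed 3).
    Returns None if no calibrator level clears the threshold.
    """
    threshold = mean(blank_intensities) + k * stdev(blank_intensities)
    for concentration in sorted(calibrators):
        if mean(calibrators[concentration]) > threshold:
            return concentration
    return None


# Hypothetical image-analysis readings (arbitrary intensity units).
blanks = [10.0, 12.0, 11.0, 9.0, 13.0]
calibrator_readings = {
    1_000: [11.0, 12.0, 10.5],
    10_000: [14.0, 15.5, 13.0],
    100_000: [40.0, 42.0, 38.5],
    1_000_000: [120.0, 118.0, 125.0],
}

lod = lower_limit_of_detection(blanks, calibrator_readings)
```

With these invented readings, the two lowest calibrator levels are indistinguishable from background, so the sketch reports the third level as the limit of detection. Because the calibrators are traceable to a common reference material, that kind of figure could be compared from one lab to another.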
"What's the limit of detection for a 3+ or 2+ HER2 result? The way we plan on going about that is by doing an analysis of known positive and negative results on a tissue microarray," said Dabbs. "We'll have a slide, for example, that might have 80 breast cancer samples on it that are about a millimeter-and-a-half in diameter. And at the other end of the slide, there's going to be a series of calibrators at different concentrations."
The slides will then be sent to a participating laboratory, which will stain them and send them back. CASI will then determine sensitivity, specificity, and limits of detection of the laboratory's results. The organization will provide the calibrators free of charge to up to 100 IHC laboratories for each calibrated test and will help laboratories with poorly performing assays improve.
For the HER2 assay, CASI will use gene amplification as a comparator assay to validate its results. For p53 and BRAF, it will use mutation analysis by DNA sequencing. And for PD-L1, the comparator assay will be the FDA-cleared Agilent PharmDx PD-L1 22C3 assay.
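Sensitivity and specificity against a comparator assay follow the standard definitions, sketched below in Python. The function name and the sample results are hypothetical; the sketch assumes each tissue core has a positive or negative call from both the lab's IHC stain and the comparator assay.

```python
def sensitivity_specificity(lab_calls, comparator_calls):
    """Compute (sensitivity, specificity) of lab calls against a comparator.

    Both arguments are equal-length sequences of booleans, True for positive.
    A metric is None when the comparator has no cases of the relevant kind.
    """
    if len(lab_calls) != len(comparator_calls):
        raise ValueError("call lists must be the same length")

    pairs = list(zip(lab_calls, comparator_calls))
    true_pos = sum(1 for lab, ref in pairs if lab and ref)
    false_neg = sum(1 for lab, ref in pairs if not lab and ref)
    true_neg = sum(1 for lab, ref in pairs if not lab and not ref)
    false_pos = sum(1 for lab, ref in pairs if lab and not ref)

    positives = true_pos + false_neg
    negatives = true_neg + false_pos
    sensitivity = true_pos / positives if positives else None
    specificity = true_neg / negatives if negatives else None
    return sensitivity, specificity


# Hypothetical calls for six tissue cores (not real data).
lab = [True, True, False, False, True, False]
comparator = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(lab, comparator)
```

In this invented example the lab misses one comparator-positive core and overcalls one comparator-negative core, so both figures come out at two in three.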
While standardization of IHC tests appears long overdue, Miller said, "it couldn't have been done before now, because digital pathology systems need to have been adopted across the world first. Now is the right time."
Miller said that the number of biomarkers in histopathology that need this type of standardization is limited because many biomarker tests are based on genomic sequencing rather than IHC testing. In addition to HER2, PD-L1, p53, and BRAF, the consortium is considering standardizing IHC tests for estrogen receptor, ALK, and NTRK.
The priority, though, according to Miller, is to develop standards for HER2 quickly. In Dabbs' view, the development of reference standards for IHC-based biomarker tests will be a "new sunrise" for IHC. "For the first time we'll know where our guardrails are for the test and tie it to a NIST standard calibrator that can be reproduced across laboratories."