High coverage and noise characterization of CEBPA for sensitive and specific variant detection

By Laura Griffin, PhD on Wed Aug 1, 2018


Summary

  • Biallelic CEBPA mutations are important prognostic indicators in AML, as they are associated with favorable outcomes
  • Detecting CEBPA mutations by NGS is challenging due to the high GC content of the gene
  • VariantPlex® myeloid NGS assays are based on Anchored Multiplex PCR (AMP™), a target enrichment strategy that enables strand-specific amplification, providing bidirectional coverage of GC-rich regions in CEBPA
  • 95 MDAF is the minimum detectable allele frequency (AF) by which a variant can be distinguished from the underlying noise at a probability of 0.95
  • Thus, 95 MDAF reveals the sensitivity at each base position, providing more useful information than average panel-wide sensitivity information

CEBPA in Acute Myeloid Leukemia

CCAAT enhancer binding protein alpha (CEBPA) is a single-exon gene that encodes a leucine zipper transcription factor with an important role in myeloid differentiation, a process whose disruption is a necessary step in the development of acute myeloid leukemia (AML) (1). CEBPA mutations occur in 8-15% of AML cases and are one of the most common classes of mutations in cytogenetically normal AML (2-4). The majority of CEBPA mutations are biallelic and are associated with favorable outcomes, as no wildtype protein is expressed to allow myeloid differentiation (5,6). It is believed that these mutations cause myeloid cell differentiation arrest resulting in a tumor suppressor effect (3). As of 2016, the WHO requires characterization of CEBPA mutations in the classification of AML (7). Therefore, reliable methods to detect these mutations are crucial for tumor classification.

CEBPA mutation detection methods

CEBPA mutations are highly variable and can occur anywhere throughout the gene. Therefore, it is important to be able to detect mutations across the entire coding region. Sanger sequencing is the current gold standard technique, detecting CEBPA mutations down to 20% AF, but this approach lacks scalability. Next-generation sequencing (NGS) assays can detect multiple mutation types across multiple target regions. However, detecting mutations in CEBPA can be particularly challenging for these assays due to the high GC content (8).
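To illustrate what "high GC content" means in practice, the following minimal Python sketch computes the GC fraction in sliding windows along a sequence. The function names, window size, step, and the toy sequence are all assumptions made for illustration; the toy sequence is not CEBPA, and this is not part of any Archer workflow.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def gc_windows(seq, window=10, step=5):
    """Return (start, GC fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy 20-base sequence (made up): GC-rich at the start, AT-rich at the end.
toy = "GCGCGGCCGCATATATTAAT"
for start, frac in gc_windows(toy):
    print(start, frac)
```

Windows with a GC fraction close to 1.0 are the kind of regions where amplification and sequencing tend to be hardest, which is why coverage there deserves a separate look.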

Detection of CEBPA mutations by Anchored Multiplex PCR and NGS

Anchored Multiplex PCR (AMP™) is a target enrichment strategy for NGS that uses molecular barcode (MBC) adapters and single gene-specific primers (GSPs) for amplification, permitting open-ended capture of DNA fragments from a single end. Based on AMP technology, we developed Archer® VariantPlex® myeloid targeted NGS assays to detect variants in CEBPA, FLT3-ITDs and other important variants in myeloid cancers from clinical-type genomic DNA samples. Because the MBC adapters contain universal primer binding sites, amplification from GSPs is unrestricted by opposing primers and can amplify both large and small fragments without prior knowledge of downstream sequences. This approach enables flexible and strand-specific primer design to provide better coverage of challenging regions. As such, anchored reads originating from bidirectional, yet independent, GSPs contained in the VariantPlex myeloid panels provide excellent coverage across CEBPA, even in GC-rich regions (Figure 1).

Figure 1. Plot Depicting Unique Coverage of CEBPA.

As shown in Figure 1, the VariantPlex Myeloid panel achieves full coverage of CEBPA at depths that allow variant calling at relevant allele frequencies. Note that Figure 1 shows unique molecule coverage, not PCR-duplicated reads. Confidence in variant detection over a region is mainly dictated by the number of unique molecules interrogated. Because MBCs are ligated prior to PCR, the Archer Analysis bioinformatics software is able to deduplicate, error-correct, and analyze unique reads over every position.
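The difference between raw read depth and unique molecule coverage can be sketched in a few lines of Python. This is a simplified illustration only: it assumes each read has already been reduced to a hypothetical record with an MBC sequence, an alignment start, and a strand, and it treats reads that share all three as copies of one original molecule. It does not describe how Archer Analysis is actually implemented.

```python
from collections import Counter


def unique_molecule_count(reads):
    """Collapse PCR duplicates into molecule families.

    Assumed rule: reads sharing MBC, start position and strand
    derive from the same original DNA molecule.
    """
    families = Counter((r["mbc"], r["start"], r["strand"]) for r in reads)
    return len(families), families


# Made-up reads for illustration.
reads = [
    {"mbc": "ACGTACGT", "start": 1000, "strand": "+"},
    {"mbc": "ACGTACGT", "start": 1000, "strand": "+"},  # PCR duplicate
    {"mbc": "TTGACCAA", "start": 1000, "strand": "+"},  # same start, new MBC
    {"mbc": "ACGTACGT", "start": 1012, "strand": "-"},  # opposite strand
]

n_unique, families = unique_molecule_count(reads)
print(f"raw reads: {len(reads)}, unique molecules: {n_unique}")
```

Family sizes (the values in `families`) are also what makes error correction possible in principle: a base change seen in only one read of a larger family is more likely a PCR or sequencing error than a true variant.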

95 MDAF enables per-base variant detection sensitivity across CEBPA

Variant sensitivity is a function of both coverage depth and noise at that depth, which is accounted for using normalization datasets to assess noise at each variant position. Using this approach, background noise at each base position is assessed across a cohort of diverse wild-type samples. From this information, the sensitivity of identifying a true positive variant over a range of allele frequencies is calculated for each position. Therefore, one can determine the limit of detectability for a variant at a base position and for a particular base substitution. This metric is referred to as 95 MDAF, the minimum detectable allele frequency by which a variant can be distinguished from the underlying noise at a probability of 0.95.
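One way to make the 95 MDAF idea concrete is with a simple binomial model. The sketch below is an assumption-laden illustration, not the published Archer method: it assumes that alternate-allele counts from noise follow Binomial(depth, noise_rate), where noise_rate stands in for the per-position, per-substitution background estimated from wild-type samples, and that a variant is called when its count exceeds what noise alone would produce at a chosen false-positive level alpha. All function names and parameter values are hypothetical.

```python
import math


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def call_threshold(depth, noise_rate, alpha=0.01):
    """Smallest alt count that noise alone reaches with probability <= alpha."""
    k = 1
    while binom_sf(k, depth, noise_rate) > alpha:
        k += 1
    return k


def mdaf95(depth, noise_rate, alpha=0.01, power=0.95):
    """Smallest AF whose alt count reaches the threshold with prob >= power."""
    k = call_threshold(depth, noise_rate, alpha)
    lo, hi = 0.0, 1.0
    for _ in range(40):  # bisection: detection probability rises with AF
        mid = (lo + hi) / 2
        if binom_sf(k, depth, mid) >= power:
            hi = mid
        else:
            lo = mid
    return hi


# Made-up depths and noise rates, for illustration only.
for depth, noise in [(200, 0.001), (500, 0.001), (500, 0.01), (2000, 0.001)]:
    print(depth, noise, round(mdaf95(depth, noise), 4))
```

Under this model the two drivers named above are visible directly: more unique depth lowers the minimum detectable AF, and more background noise at a position raises it.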

As CEBPA mutations frequently occur in GC-rich regions, it is important to be able to assess the sensitivity and coverage at each position rather than averaging the entire coding region. In Figure 2 below, the 95 MDAF was calculated for the least sensitive of the three possible base substitutions at each base position (blue line). This plot reveals the considerable noise variation between base positions and highlights the need for a per-base sensitivity threshold. The plot also enables more confident variant calling in GC-rich regions by identifying where variant calling may be more sensitive than a generic, panel-wide limit of detection.

Figure 2. Sensitivity of variant detection across CEBPA (95 MDAF).
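The "least sensitive of the three possible base substitutions" summary used for the plot can be sketched as follows. The positions, substitution labels, 95 MDAF values, and the panel-wide limit of detection are all made up for illustration, and the function names are hypothetical.

```python
def worst_case_mdaf(per_substitution):
    """Map each position to the highest (least sensitive) 95 MDAF
    among its possible base substitutions."""
    return {pos: max(subs.values()) for pos, subs in per_substitution.items()}


def positions_beating_panel_lod(worst, panel_lod):
    """Positions where even the worst-case substitution is detectable
    below a generic panel-wide limit of detection."""
    return sorted(pos for pos, mdaf in worst.items() if mdaf < panel_lod)


# Made-up per-substitution 95 MDAF values at three positions.
mdaf_by_sub = {
    101: {"C>A": 0.010, "C>G": 0.008, "C>T": 0.031},
    102: {"G>A": 0.027, "G>C": 0.009, "G>T": 0.012},
    103: {"A>C": 0.006, "A>G": 0.011, "A>T": 0.007},
}
panel_lod = 0.02  # hypothetical panel-wide limit of detection

worst = worst_case_mdaf(mdaf_by_sub)
print(worst)
print(positions_beating_panel_lod(worst, panel_lod))
```

Taking the maximum is a conservative choice: a position is reported only as sensitive as its noisiest substitution, so any other substitution at that position is at least as detectable as the plotted value suggests.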

Summary

CEBPA mutations are important prognostic indicators in AML, as they are associated with favorable outcomes. However, detecting CEBPA mutations by NGS is challenging due to the high GC content of the gene. By combining MBCs and single gene-specific primers, AMP enables strand-specific amplification to provide bidirectional coverage of CEBPA, thus providing better coverage of GC-rich areas. MBCs enable post-sequencing deduplication and error correction, and normalization datasets are used to assess the noise at each base position. Together, this information is used to determine the 95 MDAF, which is the minimum detectable allele frequency by which a variant can be distinguished from the underlying noise at a probability of 0.95. The 95 MDAF essentially reveals the lower limit of detection at each base position in a sample, providing more useful information than panel-wide sensitivity information. This is particularly useful for genes like CEBPA that have difficult regions, where every variant call can be compared to the 95 MDAF and assigned a p-value for high-confidence variant detection and optimal sensitivity and specificity.

References

  1. D. G. Tenen, R. Hromas, J. D. Licht, D. E. Zhang, Transcription factors, normal myeloid development, and leukemia. Blood. 90, 489–519 (1997).
  2. R. F. Schlenk et al., Mutations and treatment outcome in cytogenetically normal acute myeloid leukemia. N. Engl. J. Med. 358, 1909–1918 (2008).
  3. T. Pabst et al., Dominant-negative mutations of CEBPA, encoding CCAAT/enhancer binding protein-alpha (C/EBPalpha), in acute myeloid leukemia. Nat. Genet. 27, 263–270 (2001).
  4. S. Fröhling et al., CEBPA mutations in younger adults with acute myeloid leukemia and normal cytogenetics: prognostic relevance and analysis of cooperating mutations. J. Clin. Oncol. 22, 624–633 (2004).
  5. C. L. Green et al., Prognostic significance of CEBPA mutations in a large cohort of younger adult patients with acute myeloid leukemia: impact of double CEBPA mutations and the interaction with FLT3 and NPM1 mutations. J. Clin. Oncol. 28, 2739–2747 (2010).
  6. A. Fasan et al., The role of different genetic subtypes of CEBPA mutated AML. Leukemia. 28, 794–803 (2014).
  7. D. A. Arber et al., The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 127, 2391–2405 (2016).
  8. C. W. S. Ng et al., CEBPA mutational analysis in acute myeloid leukaemia by a laboratory-developed next-generation sequencing assay. J. Clin. Pathol. 71, 522–531 (2018).


About Laura Griffin, PhD

Laura Griffin earned her PhD in Microbiology from the University of Colorado Denver, Anschutz Medical Campus. Her research focused on cancer virology, dissecting virus-host interactions during Human Papillomavirus infection. Laura is passionate about cancer research as well as effective education and communication of scientific ideas. Laura joined the ArcherDX team as Scientific Editor in January 2016.

All content © 2018 ArcherDX, Inc.

For Research Use Only. Not for use in diagnostic procedures.