Breast Cancer Study: Comprehensive Methodologies Unveiled
This document describes the methods used in a breast cancer study: the patient cohorts, genomic sequencing, data analysis, and in vivo models.
Study Cohorts and Sequencing Protocols
The primary study cohort comprised 6,927 tumor samples from 5,881 breast cancer patients. All patients underwent prospective clinical tumor and normal DNA sequencing between February 2014 and September 2021. The study was conducted with Memorial Sloan Kettering (MSK) Institutional Review Board (IRB) approval, and all participants provided informed consent.
Genomic sequencing was performed on DNA extracted from formalin-fixed, paraffin-embedded (FFPE) tumor tissue and peripheral blood mononuclear cells. Sequencing used several versions of the MSK-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT) panel, covering 341 to 506 genes. Tumors were sourced nearly equally from primary sites (50.1%) and metastatic sites (49.8%).
An external cohort supplemented the study with primary tumor samples from germline BRCA2 (gBRCA2) carriers. These samples underwent whole-exome sequencing (WES) and came from collections at the University of Pennsylvania and Mayo Clinic, also with the requisite IRB approval and informed consent.
Genomic and Germline Variant Analysis
Germline variants were called with a validated pipeline, restricted to variants with a population allele frequency below 2% in gnomAD exomes and genomes. Pathogenicity was assessed following ACMG guidelines, identifying putative loss-of-function (LoF) variants. Discordant interpretations were reconciled by manual review, and specific low-risk CHEK2 variants were deliberately excluded.
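The frequency filter described above can be sketched as follows. This is a minimal illustration, not the study's validated pipeline: the `GermlineVariant` record, its field names, and the decision to treat a variant that is absent from gnomAD as rare are all assumptions. Only the 2% threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

MAX_POPULATION_AF = 0.02  # the 2% threshold stated in the text


@dataclass
class GermlineVariant:
    """Hypothetical record; field names are illustrative only."""
    gene: str
    gnomad_exome_af: Optional[float]   # None = not observed in gnomAD (assumption)
    gnomad_genome_af: Optional[float]  # None = not observed in gnomAD (assumption)


def is_rare(variant: GermlineVariant, max_af: float = MAX_POPULATION_AF) -> bool:
    """Return True when every available gnomAD frequency is below max_af."""
    for af in (variant.gnomad_exome_af, variant.gnomad_genome_af):
        if af is not None and af >= max_af:
            return False
    return True
```

With made-up frequencies, `is_rare(GermlineVariant("GENE_A", 0.0001, None))` is True, while a variant at 3% in exomes is filtered out.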
Somatic zygosity for germline pathogenic variants was inferred from locus-specific copy number and allele-specific copy number (ASCN) estimates produced by FACETS. All ASCN solutions were manually reviewed. Cases were excluded if zygosity was indeterminate, which occurred when the variant was homozygous in the germline, read depth was low, copy number was unevaluable, no optimal solution was found, or the data suggested loss of the mutant allele.
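The exclusion criteria above amount to a checklist applied to each reviewed case. The sketch below encodes that checklist. The `ZygosityReview` record and its boolean fields are hypothetical stand-ins for the outcome of manual review. The text gives no numeric cutoffs (for example, for low read depth), so none are assumed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ZygosityReview:
    """Hypothetical summary of one manually reviewed case."""
    homozygous_in_germline: bool
    low_read_depth: bool
    copy_number_evaluable: bool
    optimal_solution_found: bool
    suggests_mutant_allele_loss: bool


def exclusion_reasons(review: ZygosityReview) -> List[str]:
    """List every criterion from the text that makes zygosity indeterminate."""
    reasons = []
    if review.homozygous_in_germline:
        reasons.append("homozygous in germline")
    if review.low_read_depth:
        reasons.append("low read depth")
    if not review.copy_number_evaluable:
        reasons.append("copy number unevaluable")
    if not review.optimal_solution_found:
        reasons.append("no optimal solution")
    if review.suggests_mutant_allele_loss:
        reasons.append("possible loss of mutant allele")
    return reasons


def is_indeterminate(review: ZygosityReview) -> bool:
    """A case is excluded when at least one criterion applies."""
    return bool(exclusion_reasons(review))
```

Returning the list of reasons rather than a single flag keeps an audit trail of why each case was dropped.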
Data Anonymization and Clinical Annotation Practices
All 5,881 patients had consented to an IRB-approved protocol for somatic and clinical data analysis. For germline variant analyses, data were anonymized using unique IDs in accordance with MSK IRB guidelines, which permits germline calling irrespective of a patient's specific consent status. Clinical and somatic data were binned to prevent re-identification. Analyses that did not require germline information used a separate study-specific identifier and fully anonymized clinical data.
Clinical data for the MSK cohort were obtained from the Breast Translational Program’s clinical database. Receptor status (HR+/HER2-, triple-negative, HER2+) was assigned per patient, with definitions adjusted according to whether the sequenced tumor was metastatic or primary. Cases involving external consultation sequencing, ductal carcinoma in situ (DCIS) without invasive cancer, or multiple distinct primary tumors were excluded. Progression events were defined as radiographic or clinical progression leading to a treatment change, or as progression documented in a clinician assessment.
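As an illustration of binning, the sketch below maps an exact integer value onto a fixed-width interval label so that the exact value is never stored alongside germline calls. The source does not say which variables were binned or how wide the bins were. The function name, the use of age as an example, and the 10-year width are assumptions.

```python
def bin_value(value: int, width: int) -> str:
    """Map a non-negative integer onto a closed interval label of the given width."""
    if width <= 0:
        raise ValueError("width must be positive")
    if value < 0:
        raise ValueError("value must be non-negative")
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"
```

For example, `bin_value(47, 10)` returns `"40-49"`.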
Enrichment and HRD Analysis Techniques
For enrichment analysis, functionally significant mutations, fusions, and copy number alterations were compiled as annotated in the OncoKB database. Putative RB1 homozygous deletions were manually reviewed to confirm biological plausibility. Associations between germline and somatic gene alterations were analyzed with Firth-penalized logistic regression, chosen because of data sparsity, with adjustment for receptor status and sample type.
ASCN definitions for RB1 loss of heterozygosity (LOH) were derived from manual FACETS review of samples taken before CDK4/6 inhibitor (CDK4/6i) treatment, with RB1 LOH defined as a lesser copy number (LCN) of 0. Homologous recombination deficiency (HRD) was inferred using IMPACT-HRD, which quantifies genomic scars (telomeric allelic imbalances, large-scale transitions, and losses of heterozygosity) based on ASCN alterations.
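To show why a Firth penalty helps with sparse data, here is a deliberately simplified sketch: a logistic model with an intercept and a single binary covariate, fitted by Fisher scoring on Firth's modified score. It omits the receptor status and sample type adjustments used in the study, the function name is hypothetical, and it is not the study's code. For this two-parameter case the estimate should coincide with adding 0.5 to each cell of the 2x2 table, which gives a convenient check.

```python
import math
from typing import Sequence, Tuple


def firth_logit_binary(xs: Sequence[int], ys: Sequence[int],
                       max_iter: int = 200, tol: float = 1e-10) -> Tuple[float, float]:
    """Firth-penalized fit of logit P(y=1) = b0 + b1 * x, with x and y in {0, 1}."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Fitted probabilities and Fisher information X^T W X for columns [1, x].
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        i00 = i01 = i11 = 0.0
        for x, p in zip(xs, ps):
            w = p * (1.0 - p)
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det  # inverse information

        # Firth's modified score: sum over i of x_i * (y_i - p_i + h_i * (0.5 - p_i)),
        # where h_i is the i-th diagonal of the weighted hat matrix.
        u0 = u1 = 0.0
        for x, y, p in zip(xs, ys, ps):
            h = p * (1.0 - p) * (v00 + 2.0 * v01 * x + v11 * x * x)
            r = y - p + h * (0.5 - p)
            u0 += r
            u1 += r * x

        # Fisher scoring step.
        d0 = v00 * u0 + v01 * u1
        d1 = v01 * u0 + v11 * u1
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1
```

With made-up counts in which all 4 carriers have the somatic event and 3 of 20 non-carriers do, ordinary maximum likelihood has no finite estimate for `b1` because one cell of the table is empty, whereas the penalized fit returns a finite log odds ratio.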
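The LOH definition above (LCN of 0) can be expressed as a small check over copy number segments. The `Segment` record and the rule for a locus that spans several segments (LOH only if every overlapping segment has LCN 0, and no call if any LCN is unevaluable) are assumptions for illustration. The locus coordinates are passed in by the caller rather than hard-coded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical allele-specific copy number segment."""
    chrom: str
    start: int
    end: int
    tcn: int             # total copy number
    lcn: Optional[int]   # lesser copy number; None when unevaluable


def has_loh(segments: List[Segment], chrom: str, start: int, end: int) -> Optional[bool]:
    """Return True/False for LOH at the locus, or None when it cannot be called."""
    overlapping = [
        s for s in segments
        if s.chrom == chrom and s.start <= end and s.end >= start
    ]
    if not overlapping or any(s.lcn is None for s in overlapping):
        return None
    return all(s.lcn == 0 for s in overlapping)
```

Returning `None` instead of `False` keeps unevaluable cases distinct from cases that retain both alleles.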
Clinical Outcome Analysis Framework
The study assessed the association between genomic alterations and progression-free survival (PFS) and overall survival (OS) in patients receiving CDK4/6i therapy. Cox proportional hazards models were used, adjusted for covariates such as the endocrine therapy (ET) partner and treatment line. For OS, a left-truncated model was used to account for immortal time bias.
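Left truncation changes who counts as at risk at each event time: a patient contributes only after entering observation, so time before entry, during which the patient could not have been recorded as dying, is not credited as survival. The sketch below shows only that risk-set rule, not a full Cox fit. The `Subject` record, the choice of therapy start as the time origin, and sequencing as the entry event are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subject:
    """Hypothetical record; times are measured from the start of therapy."""
    patient_id: str
    entry: float   # delayed entry time, e.g. date of sequencing (assumption)
    exit: float    # time of death or censoring
    died: bool


def risk_set(subjects: List[Subject], t: float) -> List[str]:
    """IDs of subjects under observation just before time t (entry < t <= exit)."""
    return [s.patient_id for s in subjects if s.entry < t <= s.exit]
```

With subjects A (entry 0, exit 10), B (entry 6, exit 20), and C (entry 12, exit 15), the risk set at time 10 is A and B. C is excluded because it had not yet entered, which is the correction a model without truncation would miss.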
A matched-pairs analysis was conducted, incorporating patients with paired pre- and post-CDK4/6i sequencing data from MSK-IMPACT, MSK-ACCESS, or Guardant360.
Acquired tumor suppressor loss (e.g., RB1, PTEN, LATS2, FAT1, TP53, ARID1A, LATS1, NF1) was assessed across HRD, BRCA2, and non-HRD groups. Cases with pre-existing biallelic LoF in immediate resistance genes were excluded from this specific analysis.
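In a matched pre/post pair, an acquired loss is a gene altered in the post-treatment sample but not in the pre-treatment one. A minimal sketch of that comparison follows. The function name and input shape are hypothetical, and the gene set simply copies the examples listed above, which the text presents as examples rather than a complete list.

```python
from typing import Iterable, List

# Example genes named in the text; not necessarily the full set used in the study.
TUMOR_SUPPRESSORS = frozenset(
    {"RB1", "PTEN", "LATS2", "FAT1", "TP53", "ARID1A", "LATS1", "NF1"}
)


def acquired_losses(pre_lof_genes: Iterable[str],
                    post_lof_genes: Iterable[str],
                    genes: Iterable[str] = TUMOR_SUPPRESSORS) -> List[str]:
    """Genes of interest with LoF in the post-treatment sample only, sorted by name."""
    acquired = set(post_lof_genes) - set(pre_lof_genes)
    return sorted(acquired & set(genes))
```

For a pair with TP53 loss before treatment and TP53, RB1, and ESR1 alterations after, the function returns `["RB1"]`: TP53 was pre-existing and ESR1 is not in the gene set.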
In Vivo Patient-Derived Xenograft (PDX) Models
PDX models were sourced from MSK, Institut Curie, and AstraZeneca. They were established by implanting patient tumor biopsies into immunocompromised mice. All animal studies adhered to institutional guidelines and ethical protocols.
Targeted sequencing of post-mortem and PDX samples involved DNA extraction, library preparation, and sequencing with the IMPACT assay. Whole-genome sequencing of post-CDK4/6i PDX samples involved DNA extraction, library preparation, and sequencing on Illumina platforms. Variant calling, copy number alteration detection, and structural variant identification used BWA-MEM, Picard, Mutect, VarScan, Strelka, Scalpel, Platypus, FACETS, CNVkit, Manta, SvABA, GRIDSS, and Paragraph.
Immunoblotting was used to assess protein expression (BRCA2, pRB, Rb) in frozen PDX tumors. In efficacy studies, PDX models were randomized into treatment arms (vehicle, fulvestrant, ribociclib, saruparib, camizestrant, palbociclib), and tumor size was measured over time. Linear mixed-effects models were used to compare growth curves across treatment groups.