GIRIRAJAN LAB
Publications
Selected publications

A general framework for identifying oligogenic combinations of rare variants in complex disorders
Pounraja VK, Girirajan S*.

Genetic studies of complex disorders such as autism and intellectual disability (ID) are often based on enrichment of individual rare variants or their aggregate burden in affected individuals compared to controls. However, these studies overlook the influence of combinations of rare variants that may not be deleterious on their own, owing to statistical challenges resulting from rarity and combinatorial explosion when enumerating variant combinations; this limits our ability to study the oligogenic basis of these disorders. Here, we present RareComb, a framework that combines the Apriori algorithm and statistical inference to identify specific combinations of mutated genes associated with complex phenotypes. RareComb overcomes computational barriers and exhaustively evaluates variant combinations to identify nonadditive relationships between simultaneously mutated genes. Using RareComb, we analyzed 6189 individuals with autism and identified 718 combinations significantly associated with ID, and carriers of these combinations showed lower IQ than expected in an independent cohort of 1878 individuals. These combinations were enriched for nervous system genes such as NIN and NGF, showed complex inheritance patterns, and were depleted in unaffected siblings. We found that an affected individual can carry many oligogenic combinations, each contributing to the same phenotype or distinct phenotypes at varying effect sizes. We also used this framework to identify combinations associated with multiple comorbid phenotypes, including mutations of COL28A1 and MFSD2B for ID and schizophrenia and ABCA4, DNAH10 and MC1R for ID and anxiety/depression. Our framework identifies a key component of missing heritability and provides a novel paradigm to untangle the genetic architecture of complex disorders.
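The general idea of pairing Apriori-style pruning with a statistical test can be illustrated with a minimal sketch. This is not the RareComb implementation: the function names, the toy gene labels, and the choice of a one-sided binomial test against an independence expectation are all assumptions made here for illustration.

```python
from itertools import combinations
from math import comb


def frequent_pairs(carriers, min_count):
    """Return gene pairs mutated together in at least min_count individuals.

    carriers maps a gene label to the set of individual IDs carrying a
    rare variant in that gene (hypothetical input layout).
    """
    # Apriori pruning: a pair can only reach min_count carriers if each
    # gene on its own already has at least min_count carriers.
    frequent = sorted(g for g, ids in carriers.items() if len(ids) >= min_count)
    pairs = {}
    for a, b in combinations(frequent, 2):
        both = carriers[a] & carriers[b]
        if len(both) >= min_count:
            pairs[(a, b)] = both
    return pairs


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def cooccurrence_pvalue(carriers, pair, n_individuals):
    """One-sided test: is the pair seen together more often than expected
    if the two genes were mutated independently? (assumed test choice)"""
    a, b = pair
    expected_p = (len(carriers[a]) / n_individuals) * (len(carriers[b]) / n_individuals)
    observed = len(carriers[a] & carriers[b])
    return binomial_tail(observed, n_individuals, expected_p)


# Toy data: 20 individuals, hypothetical gene labels.
toy = {
    "GENE_A": {1, 2, 3, 4},
    "GENE_B": {1, 2, 3, 9},
    "GENE_C": {5},
    "GENE_D": {4, 9, 10},
}
for pair in frequent_pairs(toy, min_count=2):
    print(pair, cooccurrence_pvalue(toy, pair, n_individuals=20))
```

The pruning step is what keeps enumeration tractable: genes that are too rare on their own are dropped before any combination is formed, so the number of candidate pairs grows with the number of sufficiently frequent genes rather than with all genes.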
Combinatorial patterns of gene expression changes contribute to variable expressivity of the developmental delay-associated 16p12.1 deletion
Jensen M, Tyryshkina A, Pizzo L, Smolen C, Das M, Huber E, Krishnan A, Girirajan S*.

Recent studies have suggested that individual variants do not sufficiently explain the variable expressivity of phenotypes observed in complex disorders. For example, the 16p12.1 deletion is associated with developmental delay and neuropsychiatric features in affected individuals, but is inherited in >90% of cases from a mildly-affected parent. While children with the deletion are more likely to carry additional "second-hit" variants than their parents, the mechanisms for how these variants contribute to phenotypic variability are unknown. We performed detailed clinical assessments, whole-genome sequencing, and RNA sequencing of lymphoblastoid cell lines for 32 individuals in five large families with multiple members carrying the 16p12.1 deletion. We found that the deletion dysregulates multiple autism and brain development genes such as FOXP1, ANK3, and MEF2. Carrier children also showed expression changes that were inherited as well as de novo compared with their parents, which matched with 39/47 observed developmental phenotypes. We identified significant enrichments for 13/25 classes of "second-hit" variants in genes with expression changes, where 7/25 variant classes were only enriched when inherited from the non-carrier parent, including missense SNVs and large deletions. In 11 instances, including for ZEB2 and SYNJ1, gene expression was synergistically altered by both the deletion and inherited "second-hits" in carrier children. Finally, brain-specific interaction network analysis showed strong connectivity between genes carrying "second-hits" and genes with transcriptome alterations, including differential expression, alternative splicing, and allele-specific expression. Our study shows that family-based assessments of transcriptome data are highly relevant towards understanding the genetic mechanisms associated with complex disorders.
Functional assessment of the "two-hit" model for neurodevelopmental defects in Drosophila and X. laevis
Pizzo L, Lasser M, Yusuff T, Jensen M, Ingraham P, Huber L, Singh MD, Monahan C, Iyer J, Desai I, Karthikeyan S, Gould DJ, Yennawar S, Weiner AT, Pounraja VK, Krishnan A, Rolls M, Lowery LA, Girirajan S*.

We previously identified a deletion on chromosome 16p12.1 that is mostly inherited and associated with multiple neurodevelopmental outcomes, where severely affected probands carried an excess of rare pathogenic variants compared to mildly affected carrier parents. We hypothesized that the 16p12.1 deletion sensitizes the genome for disease, while "second hits" in the genetic background modulate the phenotypic trajectory. To test this model, we examined how neurodevelopmental defects conferred by knockdown of individual 16p12.1 homologs are modulated by simultaneous knockdown of homologs of "second hit" genes in Drosophila melanogaster and Xenopus laevis. We observed that knockdown of 16p12.1 homologs affects multiple phenotypic domains, leading to delayed developmental timing, seizure susceptibility, brain alterations, abnormal dendrite and axonal morphology, and cellular proliferation defects. In contrast to genes within the 16p11.2 deletion, which has higher de novo occurrence, 16p12.1 homologs additively interacted and were less connected to each other in a human brain-specific interaction network, suggesting that interactions with second-hit genes confer higher impact towards neurodevelopmental phenotypes. Assessment of 358 pairwise interactions in Drosophila between 16p12.1 homologs and 76 homologs of patient-specific "second-hit" genes (such as ARID1B and CACNA1A), genes within neurodevelopmental pathways (such as PTEN and UBE3A), and transcriptomic targets (such as DSCAM and TRRAP) identified both additive (47%) and epistatic (53%) effects. In 11 out of 15 families, homologs of patient-specific "second-hits" showed distinct patterns of interactions, enhancing or suppressing the phenotypic effects of one or many 16p12.1 homologs. In fact, homologs of SETD5 synergistically interacted with homologs of MOSMO in both Drosophila and X. laevis, leading to modified cellular and brain phenotypes, as well as axon outgrowth defects that were not observed with knockdown of either individual homolog. Our results suggest that several 16p12.1 genes sensitize the genome towards neurodevelopmental defects, and complex interactions with "second-hit" genes determine the ultimate phenotypic manifestation.
Drosophila models of pathogenic copy-number variant genes show global and non-neuronal defects during development.
Yusuff T, Jensen M, Yennawar S, Pizzo L, Karthikeyan S, Gould DJ, Sarker A, Gedvilaite E, Matsui Y, Iyer J, Lai ZC, Girirajan S*.

While rare pathogenic copy-number variants (CNVs) are associated with both neuronal and non-neuronal phenotypes, functional studies evaluating these regions have focused on the molecular basis of neuronal defects. We report a systematic functional analysis of non-neuronal defects for homologs of 59 genes within ten pathogenic CNVs and 20 neurodevelopmental genes in Drosophila melanogaster. Using wing-specific knockdown of 136 RNA interference lines, we identified qualitative and quantitative phenotypes in 72/79 homologs, including 21 lines with severe wing defects and six lines with lethality. In fact, we found that 10/31 homologs of CNV genes also showed complete or partial lethality at larval or pupal stages with ubiquitous knockdown. Comparisons between eye and wing-specific knockdown of 37/45 homologs showed both neuronal and non-neuronal defects, but with no correlation in the severity of defects. We further observed disruptions in cell proliferation and apoptosis in larval wing discs for 23/27 homologs, and altered Wnt, Hedgehog and Notch signaling for 9/14 homologs, including AATF/Aatf, PPP4C/Pp4-19C, and KIF11/Klp61F. These findings were further supported by tissue-specific differences in expression patterns of human CNV genes, as well as connectivity of CNV genes to signaling pathway genes in brain, heart and kidney-specific networks. Our findings suggest that multiple genes within each CNV differentially affect both global and tissue-specific developmental processes within conserved pathways, and that their roles are not restricted to neuronal functions.
Gene discoveries in autism are biased towards comorbidity with intellectual disability.
Jensen M, Smolen C, Girirajan S.

Background: Autism typically presents with highly heterogeneous features, including frequent comorbidity with intellectual disability (ID). The overlap between these phenotypes has confounded the diagnosis and discovery of genetic factors associated with autism. We analysed pathogenic de novo genetic variants in individuals with autism who had either ID or normal cognitive function to determine whether genes associated with autism also contribute towards ID comorbidity. Methods: We analysed 2290 individuals from the Simons Simplex Collection for de novo likely gene-disruptive (LGD) variants and copy-number variants (CNVs), and determined their relevance towards IQ and Social Responsiveness Scale (SRS) measures. Results: Individuals who carried de novo variants in a set of 173 autism-associated genes showed an average 12.8-point decrease in IQ scores (p=5.49×10⁻⁶) and 2.8-point increase in SRS scores (p=0.013) compared with individuals without such variants. Furthermore, individuals with high-functioning autism (IQ >100) had lower frequencies of de novo LGD variants (42 of 397 vs 86 of 562, p=0.021) and CNVs (9 of 397 vs 24 of 562, p=0.065) compared with individuals who manifested both autism and ID (IQ <70). Pathogenic variants disrupting autism-associated genes conferred a 4.85-fold increased risk (p=0.011) for comorbid ID, while de novo variants observed in individuals with high-functioning autism disrupted genes with little functional relevance towards neurodevelopment. Conclusions: Pathogenic de novo variants disrupting autism-associated genes contribute towards autism and ID comorbidity, while other genetic factors are likely to be causal for high-functioning autism.
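The carrier frequencies behind the LGD and CNV comparisons can be re-derived from the counts quoted in the Results. The sketch below uses only those counts; the helper names are hypothetical, and the odds ratio is an illustrative summary computed here, not a statistic reported in the abstract (the abstract does not say which test produced its p-values, so none is recomputed).

```python
def carrier_frequency(carriers, total):
    """Fraction of individuals in a group carrying at least one variant."""
    return carriers / total


def odds_ratio(carriers_a, total_a, carriers_b, total_b):
    """Odds of being a carrier in group A relative to group B."""
    odds_a = carriers_a / (total_a - carriers_a)
    odds_b = carriers_b / (total_b - carriers_b)
    return odds_a / odds_b


# Counts quoted in the abstract: high-functioning autism (IQ > 100)
# versus autism with ID (IQ < 70).
comparisons = {
    "de novo LGD variants": (42, 397, 86, 562),
    "de novo CNVs": (9, 397, 24, 562),
}

for label, (hf_carriers, hf_total, id_carriers, id_total) in comparisons.items():
    hf_freq = carrier_frequency(hf_carriers, hf_total)
    id_freq = carrier_frequency(id_carriers, id_total)
    ratio = odds_ratio(hf_carriers, hf_total, id_carriers, id_total)
    print(f"{label}: {hf_freq:.1%} vs {id_freq:.1%}, odds ratio {ratio:.2f}")
```

An odds ratio below 1 here corresponds to the direction stated in the abstract: variants are less frequent in the high-functioning group than in the group with comorbid ID.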
NCBP2 modulates neurodevelopmental defects of the 3q29 deletion in Drosophila and Xenopus laevis models.
Singh MD, Jensen M, Lasser M, Huber E, Yusuff T, Pizzo L, Lifschutz B, Desai I, Kubina A, Yennawar S, Kim S, Iyer J, Rincon-Limas DE, Lowery LA, Girirajan S*.

The 1.6 Mbp deletion on chromosome 3q29 is associated with a range of neurodevelopmental disorders, including schizophrenia, autism, microcephaly, and intellectual disability. Despite its importance towards neurodevelopment, the role of individual genes, genetic interactions, and disrupted biological mechanisms underlying the deletion have not been thoroughly characterized. Here, we used quantitative methods to assay Drosophila melanogaster and Xenopus laevis models with tissue-specific individual and pairwise knockdown of 14 homologs of genes within the 3q29 region. We identified developmental, cellular, and neuronal phenotypes for multiple homologs of 3q29 genes, potentially due to altered apoptosis and cell cycle mechanisms during development. Using the fly eye, we screened for 314 pairwise knockdowns of homologs of 3q29 genes and identified 44 interactions between pairs of homologs and 34 interactions with other neurodevelopmental genes. Interestingly, NCBP2 homologs in Drosophila (Cbp20) and X. laevis (ncbp2) enhanced the phenotypes of homologs of the other 3q29 genes, leading to significant increases in apoptosis that disrupted cellular organization and brain morphology. These cellular and neuronal defects were rescued with overexpression of the apoptosis inhibitors Diap1 and xiap in both models, suggesting that apoptosis is one of several potential biological mechanisms disrupted by the deletion. NCBP2 was also highly connected to other 3q29 genes in a human brain-specific interaction network, providing support for the relevance of our results towards the human deletion. Overall, our study suggests that NCBP2-mediated genetic interactions within the 3q29 region disrupt apoptosis and cell cycle mechanisms during development.
A machine-learning approach for accurate detection of copy-number variants from exome sequencing.
Pounraja VK, Jayakar G, Jensen M, Kelkar N, Girirajan S*.

Copy-number variants (CNVs) are a major cause of several genetic disorders, making their detection an essential component of genetic analysis pipelines. Current methods for detecting CNVs from exome sequencing data are limited by high false positive rates and low concordance due to the inherent biases of individual algorithms. To overcome these issues, calls generated by two or more algorithms are often intersected using Venn-diagram approaches to identify "high-confidence" CNVs. However, this approach is inadequate, as it misses potentially true calls that do not have consensus from multiple callers. Here, we present CN-Learn, a machine-learning framework (https://github.com/girirajanlab/CN_Learn) that integrates calls from multiple CNV detection algorithms and learns to accurately identify true CNVs using caller-specific and genomic features from a small subset of validated CNVs. Using CNVs predicted by four exome-based CNV callers (CANOES, CODEX, XHMM and CLAMMS) from 503 samples, we demonstrate that CN-Learn identifies true CNVs at higher precision (~90%) and recall (~85%) rates while maintaining robust performance even when trained with minimal data (~30 samples). CN-Learn recovers twice as many CNVs compared to individual callers or Venn diagram-based approaches, with features such as exome capture probe count, caller concordance and GC content providing the most discriminatory power. In fact, about 58% of all true CNVs recovered by CN-Learn were either singletons or calls that lacked support from at least one caller. Our study underscores the limitations of current approaches for CNV identification and provides an effective method that yields high-quality CNVs for application in clinical diagnostics.
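The limitation of intersection-based ("Venn-diagram") consensus described above can be shown with a small sketch. It does not reproduce CN-Learn or its classifier; it only illustrates, on made-up call sets, why requiring agreement between callers trades recall for precision. The caller names come from the abstract, while the CNV identifiers, the validated set, and the function names are hypothetical.

```python
from collections import Counter


def consensus_calls(calls_by_caller, min_callers):
    """Keep CNVs reported by at least min_callers distinct callers."""
    support = Counter(cnv for calls in calls_by_caller.values() for cnv in set(calls))
    return {cnv for cnv, n in support.items() if n >= min_callers}


def precision_recall(predicted, truth):
    """Precision and recall of a predicted call set against validated CNVs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical call sets for one sample.
calls = {
    "CANOES": {"cnv1", "cnv2", "cnv5"},
    "CODEX": {"cnv1", "cnv3"},
    "XHMM": {"cnv1", "cnv2", "cnv6"},
    "CLAMMS": {"cnv4"},
}
validated = {"cnv1", "cnv2", "cnv3", "cnv4"}  # hypothetical validated CNVs

for k in (1, 2):
    kept = consensus_calls(calls, min_callers=k)
    precision, recall = precision_recall(kept, validated)
    print(f"min_callers={k}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case, requiring two callers removes both false positives but also discards the two true CNVs that only a single caller reported. A learned model that scores each call from caller-specific and genomic features, as the abstract describes, is one way to recover such singleton calls without accepting every call from every caller.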
An interaction-based model for neuropsychiatric features of copy-number variants.
Jensen M, Girirajan S*.

Variably expressive copy-number variants (CNVs) are characterized by extensive phenotypic heterogeneity of neuropsychiatric phenotypes. Approaches to identify single causative genes for these phenotypes within each CNV have not been successful. Here, we posit using multiple lines of evidence, including pathogenicity metrics, functional assays of model organisms, and gene expression data, that multiple genes within each CNV region are likely responsible for the observed phenotypes. We propose that candidate genes within each region likely interact with each other through shared pathways to modulate the individual gene phenotypes, emphasizing the genetic complexity of CNV-associated neuropsychiatric features.
Rare variants in the genetic background modulate cognitive and developmental phenotypes in individuals carrying disease-associated variants.
Pizzo L, Jensen M, Polyak A, Rosenfeld JA, Mannik K, Krishnan A, McCready E, Pichon O, Le Caignec C, Van Dijck A, Pope K, Voorhoeve E, Yoon J, Stankiewicz P, Cheung SW, Pazuchanics D, Huber E, Kumar V, Kember R, Mari F, Curró A, Castiglia L, Galesi O, Avola E, Mattina T, Fichera M, Mandarà L, Vincent M, Nizon M, Mercier S, Bénéteau C, Blesson S, Martin-Coignard D, Mosca-Boidron A, Caberg JH, Bucan M, Zeesman S, Nowaczyk MJM, Lefebvre M, Faivre L, Callier P, Skinner C, Keren B, Perrine C, Prontera P, Marle N, Renieri A, Reymond A, Kooy RF, Isidor B, Schwartz C, Romano C, Sistermans E, Amor DJ, Andrieux J, Girirajan S*.

PURPOSE: To assess the contribution of rare variants in the genetic background toward variability of neurodevelopmental phenotypes in individuals with rare copy-number variants (CNVs) and gene-disruptive variants. METHODS: We analyzed quantitative clinical information, exome sequencing, and microarray data from 757 probands and 233 parents and siblings who carry disease-associated variants. RESULTS: The number of rare likely deleterious variants in functionally intolerant genes ("other hits") correlated with expression of neurodevelopmental phenotypes in probands with 16p12.1 deletion (n=23, p=0.004) and in autism probands carrying gene-disruptive variants (n=184, p=0.03) compared with their carrier family members. Probands with 16p12.1 deletion and a strong family history presented more severe clinical features (p=0.04) and higher burden of other hits compared with those with mild/no family history (p=0.001). The number of other hits also correlated with severity of cognitive impairment in probands carrying pathogenic CNVs (n=53) or de novo pathogenic variants in disease genes (n=290), and negatively correlated with head size among 80 probands with 16p11.2 deletion. These co-occurring hits involved known disease-associated genes such as SETD5, AUTS2, and NRXN1, and were enriched for cellular and developmental processes. CONCLUSION: Accurate genetic diagnosis of complex disorders will require complete evaluation of the genetic background even after a candidate disease-associated variant is identified.
Pervasive genetic interactions modulate neurodevelopmental defects of the autism-associated 16p11.2 deletion in Drosophila melanogaster.
Iyer J, Singh MD, Jensen M, Patel P, Pizzo L, Huber E, Koerselman H, Weiner AT, Lepanto P, Vadodaria K, Kubina A, Wang Q, Talbert A, Yennawar S, Badano J, Manak JR, Rolls MM, Krishnan A, Girirajan S*.

As opposed to syndromic CNVs caused by single genes, extensive phenotypic heterogeneity in variably-expressive CNVs complicates disease gene discovery and functional evaluation. Here, we propose a complex interaction model for pathogenicity of the autism-associated 16p11.2 deletion, where CNV genes interact with each other in conserved pathways to modulate expression of the phenotype. Using multiple quantitative methods in Drosophila RNAi lines, we identify a range of neurodevelopmental phenotypes for knockdown of individual 16p11.2 homologs in different tissues. We test 565 pairwise knockdowns in the developing eye, and identify 24 interactions between pairs of 16p11.2 homologs and 46 interactions between 16p11.2 homologs and neurodevelopmental genes that suppress or enhance cell proliferation phenotypes compared to one-hit knockdowns. These interactions within cell proliferation pathways are also enriched in a human brain-specific network, providing translational relevance in humans. Our study indicates a role for pervasive genetic interactions within CNVs towards cellular and developmental phenotypes.
Novel metrics to measure coverage in whole exome sequencing datasets reveal local and global non-uniformity.
Wang Q, Shashikant CS, Jensen M, Altman NS, Girirajan S*.

Whole Exome Sequencing (WES) is a powerful clinical diagnostic tool for discovering the genetic basis of many diseases. A major shortcoming of WES is uneven coverage of sequence reads over the exome targets contributing to many low coverage regions, which hinders accurate variant calling. In this study, we devised two novel metrics, Cohort Coverage Sparseness (CCS) and Unevenness (UE) Scores for a detailed assessment of the distribution of coverage of sequence reads. Employing these metrics we revealed non-uniformity of coverage and low coverage regions in the WES data generated by three different platforms. This non-uniformity of coverage is both local (coverage of a given exon across different platforms) and global (coverage of all exons across the genome in the given platform). The low coverage regions encompassing functionally important genes were often associated with high GC content, repeat elements and segmental duplications. While a majority of the problems associated with WES are due to the limitations of the capture methods, further refinements in WES technologies have the potential to enhance its clinical applications.
Quantitative Assessment of Eye Phenotypes for Functional Genetic Studies Using Drosophila melanogaster.
Iyer J, Wang Q, Le T, Pizzo L, Gronke S, Ambegaokar S, Imai Y, Srivastava A, Llamusi Troisi B, Mardon G, Artero R, Jackson GR, Isaacs AM, Partridge L, Kumar JP, Girirajan S*.

About two-thirds of the vital genes in the Drosophila genome are involved in eye development, making the fly eye an excellent genetic system to study cellular function and development, neurodevelopment/degeneration, and complex diseases such as cancer and diabetes. We developed a novel computational method, implemented as Flynotyper software (https://flynotyper.sourceforge.net), to quantitatively assess the morphological defects in the Drosophila eye resulting from genetic alterations affecting basic cellular and developmental processes. Flynotyper utilizes a series of image processing operations to automatically detect the fly eye and the individual ommatidium, and calculates a phenotypic score as a measure of the disorderliness of ommatidial arrangement in the fly eye. As a proof of principle, we tested our method by analyzing the defects due to eye-specific knockdown of Drosophila orthologs of 12 neurodevelopmental genes to accurately document differential sensitivities of these genes to dosage alteration. We also evaluated eye images from six independent studies assessing the effect of overexpression of repeats, candidates from peptide library screens, and modifiers of neurotoxicity and developmental processes on eye morphology, and show strong concordance with the original assessment. We further demonstrate the utility of this method by analyzing 16 modifiers of sine oculis obtained from two genome-wide deficiency screens of Drosophila and accurately quantifying the effect of its enhancers and suppressors during eye development. Our method will complement existing assays for eye phenotypes and increase the accuracy of studies that use fly eyes for functional evaluation of genes and genetic interactions.
An assessment of sex bias in neurodevelopmental disorders.
Polyak A, Rosenfeld JA, Girirajan S*.

Neurodevelopmental disorders such as autism and intellectual disability have a sex bias skewed towards boys; however, systematic assessment of this bias is complicated by the presence of significant genetic and phenotypic heterogeneity of these disorders. To assess the extent and characteristics of sex bias, we analyzed the frequency of comorbid features, the magnitude of genetic load, and the existence of family history within 32,155 individuals ascertained clinically for autism or intellectual disability/developmental delay (ID/DD), including a subset of 8,373 individuals carrying rare copy-number variants (CNVs). We find that girls were more likely than boys to show comorbid features within both autism and ID/DD cohorts. The frequency of comorbid features in ID/DD was higher in boys (1q21.1 deletion, 15q11.2q13.1 duplication) or girls (15q13.3 deletion, 16p11.2 deletion) carrying specific CNVs associated with variable expressivity while such differences were the smallest for syndromic CNVs (Smith-Magenis syndrome, DiGeorge syndrome). The extent of the male sex bias also varied according to the specific comorbid feature, being most extreme for autism with psychiatric comorbidities and least extreme for autism comorbid with epilepsy. The sex ratio was also specific to certain CNVs, from an 8:1 male:female ratio observed among autistic individuals carrying the 22q11.2 duplication to 1.3:1 male:female ratio in those carrying the 16p11.2 deletion. Girls carried a higher burden of large CNVs compared to boys for autism or ID/DD, and this difference diminished when severe comorbidities were considered. Affected boys showed a higher frequency of neuropsychiatric family histories such as autism or specific learning disability, while affected girls showed a higher frequency of developmental family histories such as growth abnormalities. The sex bias within neurodevelopmental disorders is influenced by the presence of specific comorbidities, specific CNVs, mutational burden, and pre-existing family history of neurodevelopmental phenotypes.
Comorbidity of intellectual disability confounds ascertainment of autism: implications for genetic diagnosis.
Polyak A, Kubina RM, Girirajan S*.

While recent studies suggest a converging role for genetic factors towards risk for nosologically distinct disorders including autism, intellectual disability (ID), and epilepsy, current estimates of autism prevalence fail to take into account the impact of comorbidity of these disorders on autism diagnosis. We aimed to assess the effect of comorbidity on the diagnosis and prevalence of autism by analyzing 11 years (2000-2010) of special education enrollment data on approximately 6.2 million children per year. We found a 331% increase in the prevalence of autism from 2000 to 2010 within special education, potentially due to a diagnostic recategorization from frequently comorbid features such as ID. The decrease in ID prevalence equaled an average of 64.2% of the increase of autism prevalence for children aged 3-18 years. The proportion of ID cases potentially undergoing recategorization to autism was higher among older children (75%) than younger children (48%). Some US states showed significant negative correlations between the prevalence of autism compared to that of ID while others did not, suggesting state-specific health policy to be a major factor in categorizing autism. Further, a high frequency of autistic features was observed when individuals with classically defined genetic syndromes were evaluated for autism using standardized instruments. Our results suggest that current ascertainment practices are based on a single facet of autism-specific clinical features and do not consider associated comorbidities that may confound diagnosis. Longitudinal studies with detailed phenotyping and deep molecular genetic analyses are necessary to completely understand the cause of this complex disorder.
Global increases in both common and rare copy number load associated with autism.
Girirajan S*, Johnson RL, Tassone F, Balciuniene J, Katiyar N, Fox K, Baker C, Srikanth A, Yeoh KH, Khoo SJ, Nauth TB, Hansen R, Ritchie M, Hertz-Picciotto I, Eichler EE, Pessah IN, Selleck SB.

Children with autism have an elevated frequency of large, rare copy number variants (CNVs). However, the global load of deletions or duplications, per se, and their size, location and relationship to clinical manifestations of autism have not been documented. We examined CNV data from 516 individuals with autism or typical development from the population-based Childhood Autism Risks from Genetics and Environment (CHARGE) study. We interrogated 120 regions flanked by segmental duplications (genomic hotspots) for events >50 kbp and the entire genomic backbone for variants >300 kbp using a custom targeted DNA microarray. This analysis was complemented by a separate study of five highly dynamic hotspots associated with autism or developmental delay syndromes, using a finely tiled array platform (>1 kbp) in 142 children matched for gender and ethnicity. In both studies, a significant increase in the number of base pairs of duplication, but not deletion, was associated with autism. Significantly elevated levels of CNV load remained after the removal of rare and likely pathogenic events. Further, the entire CNV load detected with the finely tiled array was contributed by common variants. The impact of this variation was assessed by examining the correlation of clinical outcomes with CNV load. The level of personal and social skills, measured by Vineland Adaptive Behavior Scales, negatively correlated (Spearman's r = -0.13, P = 0.034) with the duplication CNV load for the affected children; the strongest association was found for communication (P = 0.048) and socialization (P = 0.022) scores. We propose that CNV load, predominantly increased genomic base pairs of duplication, predisposes to autism.