Identifying Recurring and Causative Transcriptomic Alterations in Paediatric Brain Tumours

Primary brain tumours are a leading cause of cancer-related mortality (31%) in children under 15 years of age. In particular, children with high-grade astrocytomas (HGA) have an expected survival of less than 10% at 3 years despite aggressive therapy. Through the ICHANGE consortium, our group uncovered recurrent somatic gain-of-function mutations in H3 histone variants in paediatric and young adult HGA. There is therefore great interest in uncovering novel, downstream cancer-driving mechanisms that are easier to target therapeutically. Transcriptome profiling of tumours with RNA sequencing (RNA-Seq) deserves particular attention, as it provides crucial information that cannot be extracted from genomic and epigenomic data alone. This includes gene expression/regulation, splice variants, expressed fusion transcripts and sequence variants, all of which have been shown to be involved in oncogenesis.

The overall goal of my project is to identify recurring and causal transcriptomic alterations in paediatric brain tumours by means of comparative analyses (e.g. case-control paradigm) and unsupervised analyses (e.g. classification) on a large number of samples.

Aim 1: Develop a highly scalable, comprehensive and automated high-throughput RNA-Seq pipeline. RNA-Seq data analyses rely heavily on the integration of different, possibly non-interoperable bioinformatics tools. Such a pipeline is therefore needed to extract gene expression/regulation, splice variant, expressed fusion transcript and sequence variant information from RNA-Seq data.
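
As a rough illustration of the kind of integration Aim 1 calls for, the sketch below wraps each analysis type behind a common interface so that every sample passes through the same ordered set of stages. It is only a minimal sketch under stated assumptions: the `Pipeline` class, the stage names and the placeholder stage bodies are all hypothetical, and no real bioinformatics tool is invoked.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stage interface: takes a sample ID, returns a result summary.
Stage = Callable[[str], dict]


@dataclass
class Pipeline:
    """Runs every registered stage on every sample, in registration order."""
    stages: Dict[str, Stage] = field(default_factory=dict)

    def register(self, name: str, stage: Stage) -> None:
        self.stages[name] = stage

    def run(self, sample_ids: List[str]) -> Dict[str, Dict[str, dict]]:
        results: Dict[str, Dict[str, dict]] = {}
        for sample_id in sample_ids:
            results[sample_id] = {
                name: stage(sample_id) for name, stage in self.stages.items()
            }
        return results


def make_placeholder_stage(analysis: str) -> Stage:
    """Stand-in for a wrapper around an external tool (assumption: a real
    wrapper would launch the tool and parse its output files)."""
    def stage(sample_id: str) -> dict:
        return {"sample": sample_id, "analysis": analysis, "status": "placeholder"}
    return stage


def build_pipeline() -> Pipeline:
    pipeline = Pipeline()
    # One stage per information type named in Aim 1 (stage names are illustrative).
    for analysis in ("expression", "splicing", "fusions", "variants"):
        pipeline.register(analysis, make_placeholder_stage(analysis))
    return pipeline


if __name__ == "__main__":
    for sample_id, stage_results in build_pipeline().run(["sample_A", "sample_B"]).items():
        print(sample_id, sorted(stage_results))
```

Keeping each tool behind the same call signature is one possible way to let non-interoperable tools be swapped or added without changing the per-sample loop. Scaling to many samples would additionally require a job scheduler, which is not shown here.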

Aim 2: Integrate multiple large datasets. International efforts have led to the creation of cancer genome projects (namely TCGA and ICGC) that hold a high volume of tumour sample data. The brain tumour and normal brain tissue samples of TCGA and ICGC will be processed and analysed alongside those of ICHANGE using the pipeline described in Aim 1.

Aim 3: Apply the approach to ICHANGE datasets. The methods described in Aim 1 will be applied to the joint ICHANGE/ICGC/TCGA paediatric tumour dataset, focusing on tumours for which (1) the drivers remain unknown, (2) the known drivers are unsuitable for drug therapy and (3) changes in the transcriptome are expected as a result of genetic or epigenetic changes (i.e. histone mutations). The goal is to identify recurring and causal transcriptomic alterations in paediatric brain tumours.