Deciphering gene regulatory mechanisms through the analysis of high-throughput expression data is a challenging computational problem. Previous computational studies have used large expression datasets in order to resolve fine patterns of coexpression, producing clusters or modules of potentially coregulated genes. These methods typically examine promoter sequence information, such as DNA motifs or transcription factor occupancy data, in a separate step after clustering. We needed an alternative and more integrative approach to study the oxygen regulatory network in Saccharomyces cerevisiae using a small dataset of perturbation experiments. Mechanisms of oxygen sensing and regulation underlie many physiological and pathological processes, and only a handful of oxygen regulators have been identified in previous studies. We used a new machine learning algorithm called MEDUSA to uncover detailed information about the oxygen regulatory network using genome-wide expression changes in response to perturbations in the levels of oxygen, heme, Hap1, and Co2+. MEDUSA integrates mRNA expression, promoter sequence, and ChIP-chip occupancy data to learn a model that accurately predicts the differential expression of target genes in held-out data. We used a novel margin-based score to extract significant condition-specific regulators and assemble a global map of the oxygen sensing and regulatory network. This network includes both known oxygen and heme regulators, such as Hap1, Mga2, Hap4, and Upc2, as well as many new candidate regulators. MEDUSA also identified many DNA motifs that are consistent with previous experimentally identified transcription factor binding sites. Because MEDUSA's regulatory program associates regulators to target genes through their promoter sequences, we directly tested the predicted regulators for OLE1, a gene specifically induced under hypoxia, by experimental analysis of the activity of its promoter. 
In each case, deletion of the candidate regulator resulted in the predicted effect on promoter activity, confirming that several novel regulators identified by MEDUSA are indeed involved in oxygen regulation. MEDUSA can reveal important information from a small dataset and generate testable hypotheses for further experimental analysis.
A predictive model of the oxygen and heme regulatory network in yeast.
No sample metadata fields
To evaluate gene expression in human peripheral blood derived monocytes over the course of an LPS stimulation time-series.
Statistical analysis of MPSS measurements: application to the study of LPS-activated macrophage gene expression.
No sample metadata fields
To determine the role of the transcription factor Arntl2 in regulating metastatic ability and to identify Arntl2-dependent transcriptional targets in metastatic lung adenocarcinoma, we sequenced the mRNA from 3 mouse metastasis cell lines. Each of these cell lines (482N1shLuciferase, 482N1shArntl2#1, and 482N1shArntl2#2) was derived from the same parental cell line, 482N1. 482N1 was derived from a lymph node metastasis of a Kras LSL G12D, p53 flox/flox 129S1/SvImJ mouse model of metastatic lung adenocarcinoma. A comparison of shLuciferase and shArntl2 cell lines reveals Arntl2-dependent changes in the metastatic transcriptome. Overall design: This study includes 6 samples: 2 biological replicates of 482N1shLuciferase, 2 biological replicates of 482N1shArntl2#1, and 2 biological replicates of 482N1shArntl2#2. Poly-A RNA was isolated and prepared for sequencing using the Illumina TruSeq RNA kit (v2) to generate 100 bp paired-end reads. Reads were aligned to mm10.
An Arntl2-Driven Secretome Enables Lung Adenocarcinoma Metastatic Self-Sufficiency.
Cell line, Subject
Induced pluripotent stem cells (iPSCs) are an essential tool for studying cellular differentiation and cell types that are otherwise difficult to access. Here we investigate the use of iPSCs and iPSC-derived cells to study the impact of genetic variation across different cell types and as models for the genetics of complex disease. We established a panel of iPSCs from 58 well-studied Yoruba lymphoblastoid cell lines (LCLs); 14 of these lines were further differentiated into cardiomyocytes. We characterized regulatory variation across individuals and cell types by measuring RNA, chromatin accessibility and DNA methylation. Regulatory variation between individuals is lower in iPSCs than in the differentiated cell types, consistent with the intuition that developmental processes are generally canalized. While most cell-type-specific regulatory effects lie in chromatin that is open only in the affected cell types, we find that 20% of cell-type-specific effects are in shared open chromatin. Finally, we developed deep neural network models to predict open chromatin regions in these cell types from DNA sequence alone and were able to use the sequences of segregating haplotypes to predict the effects of common SNPs on tissue-specific chromatin accessibility. Our results provide a framework for using iPSC technology to study regulatory variation in cell types that are otherwise inaccessible. Keywords: Expression profiling by high throughput sequencing. Overall design: Immortalized lymphoblastoid cell lines from 58 African individuals were reprogrammed into induced pluripotent stem cells.
Impact of regulatory variation across human iPSCs and differentiated cells.
Specimen part, Subject
To uncover the gene expression alterations that occur during lung cancer progression, we interrogated the gene expression state of neoplastic cells at different stages of malignant progression. We initiated tumors in KrasLSL-G12D/+;p53flox/flox;R26LSL-tdTomato (KPT) mice with a pool of barcoded lentiviral-Cre vectors and purified Tomato-positive cancer cells away from the diverse and variable stromal cell populations. Five to nine months after tumor initiation, cancer cells were isolated from individual primary tumors and metastases using fluorescence-activated cell sorting. Sequencing of the barcode region of the integrated lentiviral vectors established primary tumor-metastasis and metastasis-metastasis relationships. Tumor barcoding allowed us to unequivocally distinguish non-metastatic primary tumors (TnonMet) from those primary tumors that had seeded metastases (TMet). We profiled 10 TnonMet samples as well as TMet and metastasis (Met) samples representing 12 metastatic events. To examine additional earlier stages of lung cancer development, we also analyzed premalignant cells from hyperplasias that develop in KPT mice shortly after tumor initiation (KPT-Early; KPT-E), as well as tumors from KrasG12D;R26LSL-tdTomato (KT) mice, which rarely gain metastatic ability. Overall design: This study includes 52 samples: 3 KP late samples, 3 KPT early samples, 10 non-metastatic primary tumors, 9 metastatic primary tumors, and 27 metastases in different organs. Total RNA was isolated and prepared for sequencing using the Ovation® RNA-Seq system and Illumina TruSeq DNA kit (v2) to generate 100 bp paired-end reads. Reads were aligned to mm10.
Molecular definition of a metastatic lung cancer state reveals a targetable CD109-Janus kinase-Stat axis.
This dataset is part of the manuscript titled "The metabolic regulator ERRalpha, a downstream target of HER2/IGF1, as a therapeutic target in breast cancer" (in review). The expression data obtained in human mammary epithelial cells were used to generate a list of ERRalpha-regulated genes that was later refined in clinical breast cancer datasets to generate a clinically relevant signature of ERRalpha activity (referred to as Cluster 3 signature). Using this signature of the estrogen-related receptor alpha (ERRa) to profile more than eight hundred breast tumors, we found that patients with tumors exhibiting higher ERRa activity were predicted to have shorter disease-free survival. Further, the ability of an ERRa antagonist, XCT790, to inhibit breast cancer cell proliferation correlates with the cells' intrinsic ERRa activity. These findings highlight the potential of using the ERRa signature and antagonists in targeted therapy for breast cancer. Using a chemical genomic approach, we determined that activation of the HER2/IGF1 signaling pathways upregulates the expression of PGC-1b, an obligate cofactor for ERRa activity. Knockdown of PGC-1b in HER2-positive breast cancer cells impaired ERRa signaling and reduced cell proliferation, implicating a functional role of PGC-1b/ERRa in the pathogenesis of HER2-positive breast cancer.
The metabolic regulator ERRα, a downstream target of HER2/IGF-1R, as a therapeutic target in breast cancer.
Specimen part
Medulloblastoma is a malignant brain tumor that occurs predominantly in children. Current risk stratification based on clinical parameters is inadequate for accurate prognostication. To better understand medulloblastoma biology, miRNA profiling of medulloblastomas was carried out in parallel with expression profiling of protein-coding genes.
Distinctive microRNA signature of medulloblastomas associated with the WNT signaling pathway.
Profiling of the transcriptome of FITChigh/FSCdim and FITCdim/FSChigh sub-populations. Three biological replicates were profiled for each cell type.
An autofluorescence-based method for the isolation of highly purified ventricular cardiomyocytes.
Specimen part, Cell line, Subject
TGF-beta3 produced by developing Th17 cells induces highly pathogenic T cells that are functionally and molecularly distinct from TGF-beta1-induced Th17 cells. The microarray data represent a distinct molecular signature for pathogenic versus non-pathogenic Th17 cells.
Induction and molecular signature of pathogenic TH17 cells.
Sex, Specimen part
Methylation at 5-cytosine (5-mC) is a fundamental epigenetic DNA modification associated recently with cardiac disease. In contrast, the role of 5-hydroxymethylcytosine (5-hmC), the oxidation product of 5-mC, is unknown in the context of the heart. Here, we assess the hydroxymethylome in embryonic, neonatal, adult and hypertrophic mouse cardiomyocytes, showing that dynamic modulation of hydroxymethylated DNA is associated with specific transcriptional networks during heart development and failure. DNA hydroxymethylation marks gene bodies of highly expressed genes and distal regulatory regions with enhanced activity. Pathological hypertrophy is characterized by a partial shift towards a fetal-like distribution pattern. We further demonstrate a regulatory function of TET2 and provide evidence that the expression of key cardiac genes, such as Myh7, is modulated by TET2-mediated 5-hmC deposition on the gene body and at enhancers in cardiac cells. We thus provide the first genome-wide analysis of 5-hmC in the cardiomyocyte, and establish the role of this epigenetic modification in heart development and disease. Overall design: Profiling of the transcriptome of embryonic, neonatal, adult, and 1-week hypertrophic cardiomyocytes, as well as sh-control and sh-TET2 cardiomyocytes. Two biological replicates were profiled for each cell type.
DNA hydroxymethylation controls cardiomyocyte gene expression in development and hypertrophy.
Specimen part, Cell line, Subject