NextSeq 500 paired end sequencing; GSM3305231: SortSeq_Mixture 383 samples; Homo sapiens; RNA-Seq

SortSeq_Mixture 383 samples

Single-cell RNA sequencing (scRNA-seq) technology has undergone rapid development in recent years and brings new challenges in data processing and analysis. This has led to an explosion of tailored analysis methods for scRNA-seq to address various biological questions. However, the current lack of gold-standard benchmarking datasets makes it difficult for researchers to evaluate the performance of the many available methods in a systematic manner. Here, we designed and generated a cross-platform benchmark dataset that has in-built truth in various forms and varying levels of biological noise. We used this dataset to compare different protocols and data analysis methods. We found that data quality differs between protocols and that ERCC spike-ins behave independently of endogenous RNA. We found significant differences in the results from the methods compared, and we associated the results with data characteristics to identify methods that perform well in different situations. Our dataset and analysis provide a valuable resource for algorithm selection in different biological settings. Overall design: Our experiment utilized the three human lung adenocarcinoma cell lines H2228, H1975 and HCC827. The experiment included mixtures of RNA and single cells from these cell lines. For the single cell designs, the three cell lines were mixed equally and processed by 10X Chromium, Drop-seq and CEL-seq2, referred to as sc_10X, sc_Drop-seq and sc_CEL-seq2, respectively, in the analysis that follows. For the mixture designs, we used plate-based protocols to mix and dilute samples in two different ways. In the cell mixture experiment, 9-cell mixtures from the three cell lines were sorted in different combinations and data were generated by CEL-seq2. The material pooled from 384 wells was subsampled to either 1/9 or 1/3 to simulate cells of different sizes, with PCR product clean-up ratios ranging from 0.7 to 0.9; these datasets are referred to as cellmix1 to cellmix4.
For the cell mixture experiment, we also sorted wells with 10 times more cells (90 cells) to provide a pseudo-bulk reference for each mixture (referred to as cellmix5). Distinct RNA mixtures, which were diluted down to create single-cell equivalents (3.75, 7.5, 15 and 30 pg per well), were generated using CEL-seq2 and SORT-seq (referred to as RNAmix_CEL-seq2 and RNAmix_Sort-seq, respectively).
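
The quantities in the design above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original submission: the variable and function names are hypothetical, and reading the 1/9 and 1/3 subsampling fractions as yielding roughly 1-cell and 3-cell equivalents from a 9-cell well is an interpretation of "cells of different sizes", not a statement made in the record.

```python
from fractions import Fraction

# Values taken from the overall design text.
CELLS_PER_WELL = 9            # cells sorted per well in the cell mixture experiment
PSEUDO_BULK_CELLS = 90        # cells per well in the pseudo-bulk reference (cellmix5)
SUBSAMPLING_FRACTIONS = [Fraction(1, 9), Fraction(1, 3)]
RNA_SERIES_PG = [3.75, 7.5, 15.0, 30.0]  # RNA amount per well, in pg


def cell_equivalents(cells, fraction):
    """Amount of material left after subsampling, expressed in cell equivalents.

    Assumption: material scales linearly with the number of sorted cells.
    """
    return cells * fraction


def is_twofold_series(values):
    """Return True if every value is exactly twice the previous one."""
    return all(b == 2 * a for a, b in zip(values, values[1:]))


equivalents = [cell_equivalents(CELLS_PER_WELL, f) for f in SUBSAMPLING_FRACTIONS]
bulk_ratio = Fraction(PSEUDO_BULK_CELLS, CELLS_PER_WELL)

print("cell equivalents after subsampling:", [str(e) for e in equivalents])
print("pseudo-bulk wells contain", bulk_ratio, "times more cells")
print("RNA series is two-fold:", is_twofold_series(RNA_SERIES_PG))
```

Under these assumptions, the two subsampling fractions differ three-fold in retained material, the pseudo-bulk wells hold ten times the cells of a mixture well, and the RNA series doubles at each step from 3.75 pg to 30 pg.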

Designing a single cell RNA sequencing benchmark dataset to compare protocols and analysis methods (RNAmix_Sort-seq)

Submitted on 26-JUL-2018

For the three cell lines, cells were dissociated into single cell suspensions in FACS buffer and sorted for the 9-cell-mixture and single cell experiments (see below for sorting strategy). The remaining cells were centrifuged and frozen at -80 °C for later RNA extraction. The RNA was then extracted using a Qiagen RNA miniprep kit. The amount of RNA was quantified using both Invitrogen Qubit fluorometric quantitation and an Agilent 4200 bioanalyzer to get an accurate estimate. The extracted RNA was then diluted to 60 ng/µl and mixed in different proportions, according to the study design. The different mixtures were further diluted to create an RNA series ranging from 3.75 pg to 30 pg, which was dispensed into CEL-seq2 and SORT-seq primer plates using a Nanodrop II dispenser. Prepared RNA mixture plates were sealed and immediately frozen upside down at -80 °C. For CEL-seq2, single cells were flow sorted into chilled 384-well PCR plates containing 1.2 µl of primer/ERCC mix using a BD FACSAria III flow cytometer. Sorted plates were sealed and immediately frozen upside down at -80 °C. These plates, together with the mRNA mixture plates, were then taken from -80 °C and processed using an adapted CEL-seq2 protocol with the following variations. The second strand synthesis was performed using the NEBNext Second Strand Synthesis module in a final reaction volume of 8 µl, and NucleoMag NGS Clean-up and Size Select magnetic beads were used for all DNA purification and size selection steps. For the 9-cell-mixture plates, clean-up of the PCR product was performed twice (2X) with a bead/DNA ratio of 0.7-0.9. For the single cell and mRNA mixture plates, two different clean-up ratios for the PCR product were used (0.8 followed by 0.9). The choice of clean-up ratio was optimized using the QC results from the 9-cell-mixture data and the SORT-seq protocols.