Our goal is to quantify the diversity of cell types in the adult mouse brain using large-scale single-cell transcriptomics. Towards that goal, we have generated a dataset that includes close to 75,000 single cells from multiple cortical areas and the hippocampus. Samples were collected from fine dissections of brain regions from male and female mice. For most brain regions, we isolated labeled cells from pan-GABAergic, pan-glutamatergic, and pan-neuronal transgenic lines. For primary visual cortex (VISp) and anterolateral motor cortex (ALM), we sampled additional cells using driver lines that label more specific and rare types. To investigate the correspondence between transcriptomic types and neuronal projection properties, we collected cells labeled by retrograde injections for select combinations of target injection sites and dissection regions. Labeled single cells were collected by fluorescence-activated cell sorting (FACS). We also collected cells without fluorescent label to sample non-neuronal cell types. Isolated single cells were processed for RNA sequencing using SMART-Seq v4. This dataset reveals the molecular architecture of the neocortex and hippocampal formation, with a wide range of shared and unique cell types across areas. It provides the basis for comparative studies of cellular diversity in development, evolution, and disease.
Additional RNA-Seq data is available as part of the Brain Cell Data Center (BCDC) portal.
Tissue samples were obtained from adult (postnatal day P53-P59) mice, both male and female, carrying one or two recombinase transgenes (Cre, FlpO) and a recombinase-dependent reporter transgene. Detailed descriptions of recombinase and reporter lines can be found in Transgenic Characterization. In addition, retrogradely labeled cells were isolated from reporter mice infected with a Cre-dependent virus, or from wild-type mice infected with a reporter virus.
We injected AAV2-retro-EF1a-Cre (Tervo et al., 2016), RV∆GL-Cre (Chatterjee et al., 2018), or CAV-Cre (gift of Miguel Chillon Rodrigues, Universitat Autònoma de Barcelona) (Hnasko et al., 2006) into brains of heterozygous or homozygous Ai14 mice using established procedures (Tasic et al., 2016, 2018). For ALM experiments, we also injected AAV2-retro-CAG-GFP or AAV2-retro-CAG-tdTomato (Tervo et al., 2016) into wild-type mice. Mice were anesthetized with 5% isoflurane and then placed into a stereotaxic alignment instrument (Kopf, model 1900). Anesthesia was maintained for the duration of the surgery by administering isoflurane at 1-2% through a nose cone. The skin along the midline of the skull was opened using a scalpel, and a surgical drill was used to create a small hole in the skull. A pulled glass pipette prefilled with virus solution was lowered into the brain, and 165-500 nl of the virus solution was delivered to the targeted brain area using a pressure injection system (NanoJect II, Drummond Scientific Company, Catalog# 3-000-204). Stereotaxic coordinates were obtained from the Paxinos adult mouse brain atlas (Paxinos and Franklin, 2008). For two VISp experiments, we injected into SCs by inserting the pipette through the cerebellum at a 45° angle in the posterior-to-anterior direction. After the virus solution was delivered, the glass pipette was retracted and the incision in the scalp was closed using sutures. The animal was removed from the stereotaxic frame and allowed to recover from anesthesia. Mice were sacrificed 7−21 days after surgery for single-cell isolation. TdT+ or GFP+ single cells were isolated from cortical areas as described below.
Mice were anesthetized with 5% isoflurane and intracardially perfused with ice-cold, oxygenated artificial cerebrospinal fluid (ACSF). The brain was then rapidly dissected and mounted for coronal slice preparation on the chuck of a Compresstome VF-300 vibrating microtome (Precisionary Instruments). Using a custom photodocumentation system (Mako G125B PoE camera with custom integrated software), a blockface image of the coronal or semi-coronal brain surface was acquired before each section was sliced at 250 μm intervals. Each slice was then hemisected along the midline, and for cortical samples both hemispheres were typically transferred to ACSF.
Each slice-hemisphere was transferred into a Sylgard-coated dissection dish containing chilled, oxygenated ACSF. Brightfield and fluorescent images between 1X and 20X were obtained of the intact tissue with a Nikon Digital Sight DS-Fi1 or a Sentech STC-SC500POE camera mounted to a Nikon SMZ1500 dissecting microscope. To guide anatomical targeting for dissection, boundaries were identified by trained anatomists, comparing the blockface image and the slice image to a matched plane of the Allen Reference Atlas. In general, three to five slices were sufficient to capture the targeted region of interest, allowing for expression analysis along the anterior-posterior axis. The region of interest was then dissected and both brightfield and fluorescent images of the dissections were acquired for secondary verification. The dissected regions were transferred in ACSF to a microcentrifuge tube and stored on ice. This process was repeated for all slices containing the target region of interest, with each region of interest deposited into a new microcentrifuge tube.
After all regions of interest were dissected, the tissue pieces were digested in an ACSF solution containing 2 mg/ml of pronase (before 6/21/2018) or 30 U/ml of papain (after 6/21/2018). With pronase, the tissue was incubated at room temperature (approximately 22°C) for a duration in minutes equal to the age of the mouse in days plus 15 (e.g., a P53 specimen had a digestion time of 68 minutes). With papain, the tissue was incubated in a dry oven at 35°C (target solution temperature of 30°C) for 30 minutes. After digestion, the enzymatic solution was removed and a quenching buffer (1% FBS or 1% BSA) was added. The tissue was washed two more times with the quenching solution, with the third wash being 500 μl to set the final sample volume. The sample was then triturated using fire-polished glass pipettes of decreasing bore sizes (600, 350 and 150 μm). The cell suspension was incubated on ice in preparation for fluorescence-activated cell sorting (FACS).
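The pronase timing rule above is simple arithmetic, and a minimal sketch may make it easier to check against a given specimen. The function and constant names here are illustrative assumptions, not part of the original protocol.

```python
def pronase_digestion_minutes(age_days: int, offset_minutes: int = 15) -> int:
    """Pronase digestion time: mouse age in days plus a fixed 15-minute offset."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return age_days + offset_minutes


# Papain digestion used a fixed duration regardless of age.
PAPAIN_DIGESTION_MINUTES = 30

# P53-P59 is the age range stated for this dataset.
for age in range(53, 60):
    print(f"P{age}: {pronase_digestion_minutes(age)} min")
```

For a P53 specimen this gives 68 minutes, matching the example in the text.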
Note: Samples collected after 12/16/2016 had 0.0132 M trehalose added to all solutions used after the point of slicing to improve cell viability and yield.
Slice Preparation with Tissue Dissociation
Samples were prepared for FACS by addition of DAPI to the single cell suspension to a final concentration of 2 µg/ml (cells). The suspension was then filtered through a fine-mesh cell strainer (35 µm for cell samples collected before 2/22/2017, 70 µm for cell samples collected after 2/22/2017).
Single cells were sorted by excluding DAPI-positive events and debris, and gating to include red fluorescent events (tdTomato-positive cells) or green fluorescent events (GFP-positive cells). Single cells were collected into strip tubes containing 11.5 μl of collection buffer (0.83x SMART-Seq v4 lysis buffer (Takara #634894) with RNase inhibitor at 0.17 U/μl). After sorting, the samples were centrifuged and then stored at -80°C.
SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing (Takara #634894) was used per the manufacturer’s instructions for cDNA synthesis of single-cell RNA and subsequent amplification. Sequencing libraries were prepared using the NexteraXT DNA Library Preparation kit (Illumina FC-131-1096) with NexteraXT Index Kit V2 Set A, B, C, or D (FC-131-2001, 2002, 2003, or 2004) or custom 8-base or 10-base Unique Design index primers designed and manufactured by IDT (Integrated DNA Technologies). NexteraXT DNA Library prep was done at either 0.5x volume manually or 0.4x or 0.2x volume on the Mantis instrument (Formulatrix). Pooled sequencing libraries were sent to an outside vendor for sequencing on an Illumina HiSeq 2500 instrument. All of the library pools were run using Illumina High Output V4 chemistry. RNA sequencing services were provided by Covance Genomics Laboratory, Seattle subsidiary of LabCorp Group of Holdings, and The Broad Institute Genome Sequencing Platform.
Nextera XT at 0.2X on the Mantis
SMART-Seq v4 (1x) amplification
SMART-Seq v4 (0.5x) amplification
Illumina sequencing adapters were clipped from the raw reads using the fastqMCF program. After clipping, the paired-end reads were mapped using Spliced Transcripts Alignment to a Reference (STAR v2.5.3) (Dobin et al., 2013) with default settings. Reads were aligned to the mm10 mouse genome sequence (Genome Reference Consortium, 2011) using the RefSeq transcriptome annotation for GRCm38.p3 (current as of 01/15/2016), updated by removing duplicate Entrez gene entries from the GTF reference file. STAR builds and uses its own suffix array index, which considerably accelerates the alignment step while improving sensitivity and specificity through its identification of alternative splice junctions. Reads that did not map to the genome were then aligned to synthetic construct (i.e., ERCC) sequences and the E. coli genome (version ASM584v2). The output files included quantification of the uniquely mapped reads (raw exon and intron counts for the transcriptome-mapped reads). The vast majority of reads could be mapped uniquely to the reference, with a median of 85.5% of uniquely mapped reads per cell (range: 45.5-91.9%). The output files further contained the percentages of reads mapped to the transcriptome, to ERCC spike-in controls, and to E. coli. These metrics were used for quality control assessments.
Quantification of mapped reads was performed using summarizeOverlaps from the R package GenomicAlignments. Read alignments to the genome (exonic, intronic, and intergenic counts) were visualized as beeswarm plots using the R package beeswarm. Expression levels were calculated as counts per million (CPM) of exonic plus intronic reads. Gene detection was calculated as the number of genes expressed in each sample with CPM > 0.
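As an illustration of the CPM and gene-detection definitions above, the following sketch computes both for a single cell from per-gene exon and intron counts. The pipeline itself was run in R; this Python version, including its function names and the choice to normalize by the cell's total exonic plus intronic count, is an assumption made for illustration only.

```python
def counts_per_million(exon_counts, intron_counts):
    """CPM per gene for one cell, from summed exonic and intronic counts."""
    if len(exon_counts) != len(intron_counts):
        raise ValueError("exon and intron count vectors must have equal length")
    totals = [e + i for e, i in zip(exon_counts, intron_counts)]
    library_size = sum(totals)
    if library_size == 0:
        # A cell with no counted reads has no defined CPM; report zeros.
        return [0.0] * len(totals)
    return [t * 1_000_000 / library_size for t in totals]


def genes_detected(cpm_values, threshold=0.0):
    """Number of genes whose CPM exceeds the threshold (CPM > 0 in the text)."""
    return sum(1 for value in cpm_values if value > threshold)


# Toy cell with three genes (hypothetical counts).
cell_cpm = counts_per_million([10, 0, 5], [5, 0, 0])
print(cell_cpm, genes_detected(cell_cpm))
```

By construction, the CPM values of a cell with any counted reads sum to one million.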
Cells were included in downstream analysis if they passed all of the following QC thresholds:
Cells were grouped into transcriptomic cell types using the iterative clustering procedure described in Tasic et al. (2018). Briefly, intronic and exonic read counts were summed to compute CPM (counts per million) values. Predicted gene models (gene names that start with Gm), genes from the mitochondrial chromosome, ribosomal genes, sex-specific genes, and genes that were detected in fewer than four cells were removed from downstream analysis. All cells that passed quality control were clustered following the steps of high-variance gene selection, dimensionality reduction, dimension filtering, Jaccard–Louvain or hierarchical (Ward) clustering, and cluster merging. Differential gene expression (DGE) was computed for every pair of clusters, and pairs that did not meet the DGE criteria were merged. Differentially expressed genes were defined using two criteria: 1) significant differential expression (> 2-fold; Benjamini-Hochberg false discovery rate < 0.01) using the R package limma and 2) binary expression (CPM > 1 in more than half of the cells in one cluster and < 30% of this proportion in the other cluster). We define the deScore as the sum of the −log10(false discovery rate) of all differentially expressed genes (each gene contributes no more than 20), and pairs of clusters with deScore < 150 were merged.
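The merging rule can be sketched as follows. This is not the R package's implementation: the function names are hypothetical, requiring both criteria for a gene to count is an assumption, and "30% of this proportion" is read here as the lower fraction being below 0.3 times the higher fraction.

```python
import math


def passes_de_criteria(log2_fold_change, fdr, frac_a, frac_b,
                       min_abs_log2_fc=1.0, max_fdr=0.01,
                       min_frac_high=0.5, max_ratio_low=0.3):
    """Check both criteria for one gene in one cluster pair.

    frac_a and frac_b are the fractions of cells with CPM > 1 in each cluster.
    """
    # Criterion 1: more than 2-fold change with FDR below 0.01.
    significant = abs(log2_fold_change) > min_abs_log2_fc and fdr < max_fdr
    # Criterion 2: expressed in over half of one cluster, and in the other
    # cluster at under 30% of that proportion (an assumed reading).
    high, low = max(frac_a, frac_b), min(frac_a, frac_b)
    binary = high > min_frac_high and low < max_ratio_low * high
    return significant and binary


def de_score(fdr_values, cap=20.0):
    """Sum of -log10(FDR) over DE genes, each contribution capped at 20."""
    score = 0.0
    for fdr in fdr_values:
        # An FDR of zero would give an infinite term, so it takes the cap.
        score += cap if fdr <= 0 else min(-math.log10(fdr), cap)
    return score


def should_merge(score, threshold=150.0):
    """Cluster pairs scoring below the threshold are merged."""
    return score < threshold
```

Because each gene is capped at 20, a pair of clusters needs at least eight differentially expressed genes to reach a deScore of 150 and remain separate.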
This process was repeated within each resulting cluster until no more child clusters met DGE or cluster size criteria (minimum of 4 cells). The entire clustering procedure was repeated 100 times using 80% of all cells sampled at random, and the frequency with which cells co-cluster was used to generate a final set of clusters, again subject to differential gene expression and cluster size termination criteria.
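A minimal sketch of the resampling and co-clustering idea is given below. It assumes each of the 100 runs yields a mapping from sampled cell ID to cluster label, and that the co-clustering frequency of a pair is the fraction of runs containing both cells in which they share a label. Both the helper names and that normalization are assumptions, and the exhaustive pairwise loop is only practical for small toy inputs.

```python
import random
from collections import Counter
from itertools import combinations


def subsample_cells(cell_ids, fraction=0.8, rng=None):
    """Draw a random subset (80% by default) of cells without replacement."""
    rng = rng or random.Random()
    cells = list(cell_ids)
    return rng.sample(cells, round(fraction * len(cells)))


def co_clustering_frequency(runs):
    """Map each cell pair to the fraction of shared runs in which it co-clusters.

    runs: list of dicts, one per clustering run, mapping cell ID -> cluster label.
    Pairs never sampled together are absent from the result.
    """
    sampled_together = Counter()
    clustered_together = Counter()
    for labels in runs:
        for a, b in combinations(sorted(labels), 2):
            sampled_together[(a, b)] += 1
            if labels[a] == labels[b]:
                clustered_together[(a, b)] += 1
    return {pair: clustered_together[pair] / n
            for pair, n in sampled_together.items()}


# Three toy runs over cells "a", "b", "c" (hypothetical labels).
toy_runs = [{"a": 1, "b": 1, "c": 2}, {"a": 1, "b": 2}, {"a": 3, "b": 3, "c": 3}]
print(co_clustering_frequency(toy_runs))
```

The resulting pairwise frequencies can be treated as a similarity matrix and clustered again to obtain consensus clusters, which is the role the text assigns to the co-clustering frequency.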
The clustering pipeline is implemented in an R package that is publicly available on GitHub. The clustering method is provided by the run_consensus_clust function.
Hierarchical, iterative clustering for analysis of transcriptomics data in R
Data generation was supported by multiple awards, including award U01MH105982 from the National Institute of Mental Health and the Eunice Kennedy Shriver National Institute of Child Health & Human Development, Brain Initiative Cell Census Network (BICCN) award U19MH114830 from the National Institute of Neurological Disorders and Stroke and the National Institute of Mental Health, and by the Allen Institute for Brain Science.
Chatterjee, S., et al. (2018). "Nontoxic, double-deletion-mutant rabies viral vectors for retrograde targeting of projection neurons." Nat Neurosci 21(4): 638-646. doi: 10.1038/s41593-018-0091-7. Epub 2018 Mar 5.
Hnasko, T. S., et al. (2006). "Cre recombinase-mediated restoration of nigrostriatal dopamine in dopamine-deficient mice reverses hypophagia and bradykinesia." Proc Natl Acad Sci U S A 103(23): 8858-8863. doi: 10.1073/pnas.0603081103. Epub 2006 May 24.
Paxinos, G. and K. B. J. Franklin (2008). "Mouse brain in stereotaxic coordinates 3rd edition." (Academic Press, Cambridge, MA, 2008).
Tervo, D. G., et al. (2016). "A Designer AAV Variant Permits Efficient Retrograde Access to Projection Neurons." Neuron 92(2): 372-382. doi: 10.1016/j.neuron.2016.09.021. Epub 2016 Oct 6.