Data summary

The organism dataset is mmusculus_gene_ensembl (mouse), and the gene ID type is ensembl_gene_id.

Cell barcode statistics

Barcode match statistics as a pie chart:

Read alignment statistics

## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`

Summary and distributions of QC metrics

Datatable of all QC metrics:

## [1] "no ERCC spike-in. Skip `non_ERCC_percent`"

Summary of all QC metrics:

Number of reads mapped to exons before UMI deduplication versus number of genes detected:

Quality control

Detect outlier cells

A robustified Mahalanobis distance is calculated for each cell, and outliers are then detected based on that distance. However, due to the complex nature of single-cell transcriptomes and the protocols used, such a method can only assist the quality control process; visual inspection of the quality control metrics is still required. By default we use comp = 2, and the algorithm tries to separate the quality control metrics into two Gaussian clusters.
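
As a rough illustration of distance-based outlier flagging, the sketch below computes a plain (non-robust) Mahalanobis distance on simulated QC metrics and flags cells above a chi-squared cutoff. The simulated data, the metric names and the 0.999 quantile are assumptions made for this example. It does not reproduce the robustified distance or the Gaussian mixture step used for this report.

```r
set.seed(1)

# Hypothetical QC matrix: rows are cells, columns are QC metrics (simulated)
good <- cbind(
  log10_counts = rnorm(200, mean = 5, sd = 0.3),
  n_genes      = rnorm(200, mean = 4000, sd = 500)
)
# Five artificial low-quality cells with very low counts and few genes
bad <- cbind(rep(2, 5), rep(200, 5))
qc <- rbind(good, bad)

# Squared Mahalanobis distance of each cell from the overall centre.
# A robust version would swap colMeans()/cov() for robust estimates.
d2 <- mahalanobis(qc, center = colMeans(qc), cov = cov(qc))

# For roughly Gaussian metrics, d2 is approximately chi-squared distributed
# with one degree of freedom per metric
cutoff <- qchisq(0.999, df = ncol(qc))
is_outlier <- d2 > cutoff

table(is_outlier)
```

The resulting flags are only a starting point; as noted above, the QC plots still need to be inspected by eye.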

The number of outliers:

## the following QC metrics not found in colData from sce:
## non_ERCC_percent
## 
## FALSE  TRUE 
##   366    18

Pairwise plot for QC metrics, colored by outliers:

Plot high expression genes

Remove low-quality cells and plot the most highly expressed genes.

## [1] "Number of NA in new gene id: 3. Duplicated id: 8.5"
## [1] "First 5 duplicated:"
##         ensembl_gene_id external_gene_name
## 454  ENSMUSG00000111063            Zkscan7
## 1174 ENSMUSG00000063488            Zkscan7
## 2901 ENSMUSG00000103532             Gm4430
## 2814 ENSMUSG00000075470             Alg10b
## 3792 ENSMUSG00000058886              Deaf1
## 4832 ENSMUSG00000073079              Srp54
##                                                                                                  description
## 454                             zinc finger with KRAB and SCAN domains 7 [Source:MGI Symbol;Acc:MGI:3040678]
## 1174                            zinc finger with KRAB and SCAN domains 7 [Source:MGI Symbol;Acc:MGI:3040678]
## 2901                                                 predicted gene 4430 [Source:MGI Symbol;Acc:MGI:3782614]
## 2814 asparagine-linked glycosylation 10B (alpha-1,2-glucosyltransferase) [Source:MGI Symbol;Acc:MGI:2146159]
## 3792            Deformed epidermal autoregulatory factor 1 homolog  [Source:UniProtKB/Swiss-Prot;Acc:Q9Z1T5]
## 4832                    Signal recognition particle 54 kDa protein  [Source:UniProtKB/Swiss-Prot;Acc:P14576]
## Note that the names of some metrics have changed, see 'Renamed metrics' in ?calculateQCMetrics.
## Old names are currently maintained for back-compatibility, but may be removed in future releases.
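
The NA and duplicate counts reported above can be checked with base R. The sketch below uses three ID pairs from the table plus one made-up row with a missing symbol. The fallback rule (keep the Ensembl ID when the symbol is missing or already used) is an assumption for illustration, not necessarily what the pipeline does.

```r
# Small mapping table; the last row is hypothetical and has no gene symbol
mapping <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000111063", "ENSMUSG00000063488",
                      "ENSMUSG00000103532", "ENSMUSG_HYPOTHETICAL"),
  external_gene_name = c("Zkscan7", "Zkscan7", "Gm4430", NA),
  stringsAsFactors = FALSE
)

symbol <- mapping$external_gene_name

# Count missing symbols and symbols that occur more than once
n_na  <- sum(is.na(symbol))
n_dup <- sum(duplicated(symbol[!is.na(symbol)]))

# Assumed fallback: use the Ensembl ID where the symbol is NA or a repeat
new_id <- ifelse(is.na(symbol) | duplicated(symbol),
                 mapping$ensembl_gene_id, symbol)

n_na
n_dup
new_id
```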

Remove low abundance genes

Plot the log10 average count for each gene:

As a loose filter, we keep genes that are expressed in at least two cells and whose average count, among the cells that express them, is larger than 1. This is not

## [1] 510 366

We have 510 genes left after removing low abundance genes.
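
The low-abundance filter can be sketched in base R as follows. The toy count matrix and the variable names are assumptions for this example; the thresholds follow the description above (at least two expressing cells, and an average count above 1 among those cells).

```r
# Toy count matrix: 4 genes (rows) x 4 cells (columns)
counts <- matrix(c(0, 0, 5, 0,
                   2, 3, 0, 4,
                   1, 1, 1, 0,
                   0, 0, 0, 0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))

# Number of cells in which each gene has a non-zero count
n_expressing <- rowSums(counts > 0)

# Average count over expressing cells only (pmax avoids division by zero)
mean_when_expressed <- rowSums(counts) / pmax(n_expressing, 1)

keep <- n_expressing >= 2 & mean_when_expressed > 1
filtered <- counts[keep, , drop = FALSE]

dim(filtered)
```

The dimensions of the filtered matrix are genes by cells, as in the output above.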

Data normalization

Sample normalization

We perform normalization using scater and scran.

Summary of size factors (five-number summary plus the mean):

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.04606 0.35525 0.70192 1.00000 1.40136 4.90235
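
To show what size factors do, the sketch below derives simple library-size factors scaled to a mean of 1 and uses them to normalize a simulated count matrix. This is a simplified stand-in and an assumption of this example: the factors in this report come from scater and scran, not from raw library sizes.

```r
set.seed(42)

# Simulated counts: 50 genes (rows) x 20 cells (columns)
counts <- matrix(rpois(50 * 20, lambda = 5), nrow = 50)

# Library size per cell, rescaled so that the size factors average to 1
lib_size <- colSums(counts)
size_factors <- lib_size / mean(lib_size)

summary(size_factors)

# Divide each cell (column) by its size factor, then log-transform
norm_counts <- sweep(counts, 2, size_factors, "/")
log_norm <- log2(norm_counts + 1)
```

A mean of exactly 1, as in the summary above, is what such centering produces.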

PCA plot using gene expressions as input, colored by the number of genes.

## Warning in .disambiguate_args(...): non-plotting arguments like
## 'exprs_values' should go in 'run_args'
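
For readers who want to reproduce this kind of plot outside scater, a minimal base R sketch is given below. The simulated log-expression matrix and the definition of a detected gene are assumptions for this example.

```r
set.seed(7)

# Hypothetical log-expression matrix: 100 genes (rows) x 30 cells (columns)
log_expr <- matrix(rnorm(100 * 30, mean = 2), nrow = 100,
                   dimnames = list(paste0("gene", 1:100),
                                   paste0("cell", 1:30)))

# Genes "detected" per cell, defined here as log-expression above 2
n_genes <- colSums(log_expr > 2)

# prcomp() expects observations in rows, so the matrix is transposed
pc <- prcomp(t(log_expr), center = TRUE, scale. = FALSE)

# Proportion of variance explained by each principal component
var_explained <- pc$sdev^2 / sum(pc$sdev^2)

# Coordinates of each cell on the first two components, plus the colouring variable
pca_df <- data.frame(PC1 = pc$x[, 1], PC2 = pc$x[, 2], n_genes = n_genes)
head(pca_df)
```

The columns of `pca_df` can be passed to any plotting function, with `n_genes` mapped to colour.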

Normalize the data using size factors and get highly variable genes

The highly variable genes are chosen based on trendVar from scran, with FDR < 0.05 and biological variation larger than 0.5. If fewer than 100 genes pass, we select the top 100 genes by biological variation; if more than 500 pass, we keep only the top 500 genes by biological variation.
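
The selection rule can be written as a small helper. The function name `select_hvg`, the column names `bio` and `FDR`, and the simulated statistics are hypothetical. The sketch assumes the usual reading that a gene passes when its FDR is below 0.05, and it does not call scran itself.

```r
# stats: data frame with one row per gene and columns `bio` (biological
# variation) and `FDR`; gene names are taken from the row names
select_hvg <- function(stats, fdr_cut = 0.05, bio_cut = 0.5,
                       min_n = 100, max_n = 500) {
  # Rank all genes by biological variation, largest first
  ranked <- stats[order(stats$bio, decreasing = TRUE), , drop = FALSE]
  pass <- ranked[ranked$FDR < fdr_cut & ranked$bio > bio_cut, , drop = FALSE]
  if (nrow(pass) < min_n) {
    # Too few genes pass: fall back to the top min_n genes overall
    pass <- head(ranked, min_n)
  } else if (nrow(pass) > max_n) {
    # Too many genes pass: keep only the top max_n of them
    pass <- head(pass, max_n)
  }
  rownames(pass)
}

# Simulated per-gene statistics for 1000 genes
stats <- data.frame(bio = seq(3, -1, length.out = 1000),
                    FDR = rep(0.01, 1000),
                    row.names = paste0("gene", 1:1000))

hvg <- select_hvg(stats)
length(hvg)
```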

Heatmap of highly variable genes

Dimensionality reduction using highly variable genes

Dimensionality reduction by PCA

## Warning in .disambiguate_args(...): non-plotting arguments like
## 'exprs_values' should go in 'run_args'

Dimensionality reduction by t-SNE

## Warning in .disambiguate_args(...): non-plotting arguments like
## 'exprs_values' should go in 'run_args'

Session information

## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.3
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] Rtsne_0.13                  scran_1.8.0                
##  [3] scater_1.8.0                DT_0.4                     
##  [5] plotly_4.7.1                readr_1.1.1                
##  [7] scales_0.5.0                Rsubread_1.30.0            
##  [9] scPipe_1.3.2                SingleCellExperiment_1.2.0 
## [11] SummarizedExperiment_1.10.0 DelayedArray_0.6.0         
## [13] BiocParallel_1.14.0         matrixStats_0.53.1         
## [15] Biobase_2.40.0              GenomicRanges_1.32.0       
## [17] GenomeInfoDb_1.16.0         IRanges_2.14.1             
## [19] S4Vectors_0.18.1            BiocGenerics_0.26.0        
## [21] ggplot2_2.2.1               stringr_1.3.0              
## [23] magrittr_1.5               
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-6             bit64_0.9-7             
##  [3] RColorBrewer_1.1-2       progress_1.1.2          
##  [5] httr_1.3.1               rprojroot_1.3-2         
##  [7] dynamicTreeCut_1.63-1    tools_3.5.0             
##  [9] backports_1.1.2          R6_2.2.2                
## [11] vipor_0.4.5              DBI_1.0.0               
## [13] lazyeval_0.2.1           colorspace_1.3-2        
## [15] gridExtra_2.3            prettyunits_1.0.2       
## [17] GGally_1.3.2             curl_3.2                
## [19] bit_1.1-12               compiler_3.5.0          
## [21] labeling_0.3             DEoptimR_1.0-8          
## [23] robustbase_0.93-0        digest_0.6.15           
## [25] rmarkdown_1.9            XVector_0.20.0          
## [27] pkgconfig_2.0.1          htmltools_0.3.6         
## [29] limma_3.36.0             htmlwidgets_1.2         
## [31] rlang_0.2.0              RSQLite_2.1.0           
## [33] FNN_1.1                  shiny_1.0.5             
## [35] DelayedMatrixStats_1.2.0 bindr_0.1.1             
## [37] jsonlite_1.5             crosstalk_1.0.0         
## [39] mclust_5.4               dplyr_0.7.4             
## [41] RCurl_1.95-4.10          GenomeInfoDbData_1.1.0  
## [43] Matrix_1.2-14            Rhdf5lib_1.2.0          
## [45] ggbeeswarm_0.6.0         Rcpp_0.12.16            
## [47] munsell_0.4.3            viridis_0.5.1           
## [49] stringi_1.2.2            yaml_2.1.19             
## [51] edgeR_3.22.0             MASS_7.3-50             
## [53] zlibbioc_1.26.0          rhdf5_2.24.0            
## [55] org.Hs.eg.db_3.6.0       plyr_1.8.4              
## [57] grid_3.5.0               blob_1.1.1              
## [59] promises_1.0.1           shinydashboard_0.7.0    
## [61] lattice_0.20-35          hms_0.4.2               
## [63] locfit_1.5-9.1           knitr_1.20              
## [65] pillar_1.2.2             igraph_1.2.1            
## [67] rjson_0.2.15             reshape2_1.4.3          
## [69] biomaRt_2.36.0           XML_3.98-1.11           
## [71] glue_1.2.0               evaluate_0.10.1         
## [73] data.table_1.11.0        httpuv_1.4.1            
## [75] org.Mm.eg.db_3.6.0       gtable_0.2.0            
## [77] purrr_0.2.4              tidyr_0.8.0             
## [79] reshape_0.8.7            assertthat_0.2.0        
## [81] mime_0.5                 xtable_1.8-2            
## [83] later_0.7.2              viridisLite_0.3.0       
## [85] tibble_1.4.2             beeswarm_0.2.3          
## [87] AnnotationDbi_1.42.0     memoise_1.1.0           
## [89] tximport_1.8.0           bindrcpp_0.2.2          
## [91] statmod_1.4.30           Rhtslib_1.12.0