Cellranger: Difference between revisions
No edit summary |
No edit summary |
||
Line 17: | Line 17: | ||
This unpacks Cell Ranger, its dependencies, and the cellranger script into a new directory called cellranger-x.y.z. | This unpacks Cell Ranger, its dependencies, and the cellranger script into a new directory called cellranger-x.y.z. | ||
Download and unpack any of the reference data files in a convenient location: | #Download and unpack any of the reference data files in a convenient location: | ||
# [ download file from downloads page ] | # [ download file from downloads page ] | ||
# Example human reference transcriptome | # Example human reference transcriptome | ||
Line 23: | Line 23: | ||
This creates a new directory called refdata-gex-GRCh38-2020-A that contains a single reference (in this case, GRCh38). Each reference contains a set of pre-generated indices and other data required by Cell Ranger. | This creates a new directory called refdata-gex-GRCh38-2020-A that contains a single reference (in this case, GRCh38). Each reference contains a set of pre-generated indices and other data required by Cell Ranger. | ||
Prepend the Cell Ranger directory to your $PATH. This will allow you to invoke the cellranger command. | #Prepend the Cell Ranger directory to your $PATH. This will allow you to invoke the cellranger command. | ||
export PATH=/opt/cellranger-x.y.z:$PATH | export PATH=/opt/cellranger-x.y.z:$PATH | ||
You may wish to add this command to your .bashrc for convenience. | You may wish to add this command to your .bashrc for convenience. | ||
Run cellranger | #Run cellranger | ||
[name@server ~]$ cellranger | [name@server ~]$ cellranger | ||
Process 10x Genomics Gene Expression, Feature Barcode, and Immune Profiling data | Process 10x Genomics Gene Expression, Feature Barcode, and Immune Profiling data |
Revision as of 14:37, 19 August 2024
This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.
Cellranger Description
Cellranger is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more. Note: This page does not cover all features of Cellranger.
Please refer to the official documentation for the complete list of all subtools.
Download and installation
Cell Ranger is licensed so the users have to register and download the file from https://www.10xgenomics.com/support/software/cell-ranger/downloads/eula?closeUrl=%2Fsu
- Download and unpack the cellranger-x.y.z.tar.gz tar file.
# [ download file from downloads page ] tar -xzvf cellranger-x.y.z.tar.gz
If you downloaded Cell Ranger in the .xz compression format, be sure to use the correct file extension tar.xz and tar flags to unpack:
tar -xvf cellranger-x.y.z.tar.xz
This unpacks Cell Ranger, its dependencies, and the cellranger script into a new directory called cellranger-x.y.z.
- Download and unpack any of the reference data files in a convenient location:
# [ download file from downloads page ] # Example human reference transcriptome tar -xzvf refdata-gex-GRCh38-2020-A.tar.gz
This creates a new directory called refdata-gex-GRCh38-2020-A that contains a single reference (in this case, GRCh38). Each reference contains a set of pre-generated indices and other data required by Cell Ranger.
- Prepend the Cell Ranger directory to your $PATH. This will allow you to invoke the cellranger command.
export PATH=/opt/cellranger-x.y.z:$PATH
You may wish to add this command to your .bashrc for convenience.
- Run cellranger
[name@server ~]$ cellranger Process 10x Genomics Gene Expression, Feature Barcode, and Immune Profiling data USAGE: cellranger <SUBCOMMAND>
General usage
Demultiplexing
Cellranger mkfastq demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files. It is a wrapper around Illumina's bcl2fastq, with additional features that are specific to 10x Genomics libraries and a simplified sample sheet format.
A simple csv sample sheet is recommended for most sequencing experiments. The simple csv format has only three columns (Lane, Sample, Index)
Lane,Sample,Index 1,test_sample,SI-TT-D9
If you have multiple library types (e.g., Gene Expression, Feature Barcode, and Cell Multiplexing) that all have the same type of indexing (e.g., dual-indexing), the samples can be demultiplexed together and the CSV could be formatted as follows
Lane,Sample,Index 1,GEX_sample,SI-TT-D9 1,FB_sample,SI-NT-A1 1,CMO_sample,SI-NN-A1
You can run the mkfastq pipeline as follows
[name@server ~] cellranger mkfastq --id=$ID --run=/path/to/bcl --csv=test_sample.csv
Counting
Cellranger count takes FASTQ files from cellranger mkfastq and performs alignment, filtering, barcode counting, and UMI counting. It uses the Chromium cellular barcodes to generate feature-barcode matrices, determine clusters, and perform gene expression analysis. The count pipeline can take input from multiple sequencing runs on the same GEM well. Cellranger count also processes Feature Barcode data alongside Gene Expression reads.
[name@server ~] cellranger count --id=$ID \ --transcriptome=refdata-gex-GRCh38-2020-A \ --fastqs=$FASTQS \ --sample=mysample \ --create-bam=true \ --localcores=8 \ --localmem=64
Cell Ranger provides a set of analysis pipelines that process Chromium Single Cell Gene Expression data to align reads, generate Feature Barcode matrices, perform clustering and other secondary analysis, and more. The required input files for running Cell Ranger vary depending on the chosen pipeline. To select the appropriate pipeline for your needs, please refer to the Choosing a pipeline page. Choosing a pipeline
Aggregating
Cellranger aggr aggregates outputs from multiple runs of cellranger count, normalizing those runs to the same sequencing depth and then recomputing the feature-barcode matrices and analysis on the combined data. The aggr pipeline can be used to combine data from multiple samples into an experiment-wide feature-barcode matrix and analysis.
To aggregate the datasets, you need to create a CSV containing the following columns
sample_id,molecule_h5 Sample1,/opt/runs/outs/per_sample_outs/Sample1/count/sample_molecule_info.h5 Sample2,/opt/runs/outs/per_sample_outs/Sample2/count/sample_molecule_info.h5
You can run the aggr pipeline as follows
[name@server ~] cellranger aggr --id=$ID --csv=aggr.csv
Running Cellranger in the alliance clusters
#!/bin/bash #SBATCH --account=def-someprof #SBATCH -N 1 #SBATCH --ntasks-per-node=8 #SBATCH --mem=64g #SBATCH --time=24:00:00 FASTQS=$1 ID=$2 WORK_DIR=$3 cd $WORK_DIR cellranger count --id=$ID \ --fastqs=$FASTQS \ --transcriptome=refdata-gex-GRCh38-2020-A \ --create-bam=true \ --localcores=8 \ --localmem=64