Data Analysis

NGS-to-Spreadsheet Analysis Service

IIGB's bioinformatics facility offers several pipelined analysis services for NGS data. Basic Read Processing is free of charge when performed as part of the NGS sequencing service. The downstream NGS-to-Spreadsheet analysis carries an extra service charge (see below for details) and is currently available for standard RNA-Seq, small RNA-Seq, SNP-Seq and ChIP-Seq samples that can be mapped against well-annotated reference genomes. These services include the data processing steps and result reporting mechanisms listed below. Most of them use the systemPipeR NGS workflow and report generation environment.


  • Basic Read Processing (free of charge as part of sequencing service)
    • Base calling and download of NGS files (FASTQ format)
    • QC run report: provides a detailed summary of the number and quality of the sequences generated per sample in a sequencing run
    • De-multiplexing: upon customer request, multiplexed samples can be sorted into separate FASTQ files based on their barcodes/indices.

Please note, in certain cases an additional fee may apply:

      •  If additional barcodes are submitted for demultiplexing after the data is processed.
      •  If incorrect barcodes were submitted previously and new barcodes are submitted for demultiplexing.
    • NGS-to-Spreadsheet Analysis Pipelines
      • RNA-Seq & small RNA-Seq (up to 10 samples for $416*)
        • Read mapping with TopHat (other aligner upon request)
        • Generation of count table of reads overlapping with chosen annotation feature (e.g. genes, exons, intergenic regions, etc.)
        • Read counts converted into RPKM/FPKM values 
        • Statistical identification of differentially expressed genes (DEGs) and exons (DEXs) with edgeR, DESeq or DEXSeq (requires replicates!)
        • Output of splice junction/variant information
        • Download of spreadsheet containing expression values, statistical analysis and GO term enrichment results
        • Viewing of read mappings, analysis results and annotation data in IGV genome browser
        • Detailed analysis protocol and pipeline script to rerun analysis
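To illustrate the conversion of read counts into RPKM values listed above, here is a minimal sketch in Python. It is not the facility's pipeline code; the function name `rpkm`, its arguments and the example numbers are assumptions made for illustration. It uses the common definition RPKM = reads × 10^9 / (feature length in bp × total mapped reads), and, when no library size is given, falls back to the sum of the counts in the table as a stand-in for the total number of mapped reads.

```python
def rpkm(counts, lengths_bp, total_mapped_reads=None):
    """Convert raw read counts per feature into RPKM values.

    counts             -- dict: feature name -> number of reads overlapping it
    lengths_bp         -- dict: feature name -> feature length in base pairs
    total_mapped_reads -- library size; if omitted, the sum of all counts is
                          used as an approximation (an assumption of this sketch)
    """
    if total_mapped_reads is None:
        total_mapped_reads = sum(counts.values())
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")

    result = {}
    for feature, count in counts.items():
        length = lengths_bp[feature]
        if length <= 0:
            raise ValueError(f"length of {feature} must be positive")
        # Reads per kilobase of feature per million mapped reads.
        result[feature] = count * 1e9 / (length * total_mapped_reads)
    return result


# Hypothetical two-gene count table, for illustration only.
example_counts = {"geneA": 500, "geneB": 1500}
example_lengths = {"geneA": 1000, "geneB": 3000}
example_rpkm = rpkm(example_counts, example_lengths)
```

In this example both genes end up with the same RPKM, because geneB has three times the reads but is also three times as long. For paired-end data the same formula applied to fragments rather than reads gives FPKM.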


      • SNP-Seq** (up to 4 samples for $520*)
        • Read mapping with BWA (other aligner upon request)
        • Variant calling: SNPs and short indels 
        • Annotation of detected variants: synonymous/non-synonymous SNPs, genomic context information (e.g. variant mapping to genes, UTRs, intergenic regions, etc.)
        • Download of spreadsheet containing annotated variants and read/statistical support information
        • Viewing of read mappings, analysis results and annotation data in IGV genome browser
        • Detailed analysis protocol and pipeline script to rerun analysis
        • ** The pipeline handles short indel and EMS mutation projects


      • ChIP-Seq (up to 4 samples for $416*)
        • Read mapping with Bowtie (other aligner upon request)
        • Peak calling (e.g. BayesPeak or other peak callers) 
        • Annotation of peaks with genomic context information (e.g. peaks overlapping with promoters, intergenic regions, UTRs, etc.)
        • Differential peak analysis with edgeR or DESeq (requires control and test samples with replicates)
        • Download of spreadsheet containing annotated peaks, read coverage information and statistical analysis results
        • Viewing of read mappings, analysis results and annotation data in IGV genome browser
        • Detailed analysis protocol and pipeline script to rerun analysis


    *The prices listed apply to NGS data sets with the number of samples specified (e.g. 1-10 RNA-Seq samples corresponding to the same number of demultiplexed FASTQ files) that together contain no more than 500 million reads. Extra charges apply to more complex data sets or to incremental experimental strategies, such as adding more samples to an analysis set over time. All prices listed are for customers from UC institutions. The rates for customers from external educational institutions are about twice the listed amount, and the rates for customers from commercial organizations are three times the listed amount.
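As a rough worked example of the pricing rules above, the following Python sketch estimates the charge for one analysis set. The function and dictionary names are hypothetical. The factor of 2 for external educational institutions is only approximate, because the text says "about twice", and the extra charges for complex or incremental data sets are not modeled, so treat the result as an estimate rather than a quote.

```python
# Base prices (UC customers) and sample limits as listed on this page.
# Small RNA-Seq is listed together with RNA-Seq.
BASE_PRICE_USD = {"RNA-Seq": 416, "SNP-Seq": 520, "ChIP-Seq": 416}
MAX_SAMPLES = {"RNA-Seq": 10, "SNP-Seq": 4, "ChIP-Seq": 4}
MAX_COMBINED_READS = 500_000_000

# Multipliers by customer type; the external educational factor is approximate.
RATE_MULTIPLIER = {"uc": 1, "external_educational": 2, "commercial": 3}


def estimate_price(pipeline, customer_type, n_samples, combined_reads):
    """Return an estimated price in USD for one standard analysis set."""
    if not 1 <= n_samples <= MAX_SAMPLES[pipeline]:
        raise ValueError("sample count outside the listed price tier")
    if combined_reads > MAX_COMBINED_READS:
        raise ValueError("more than 500 million reads: extra charges apply")
    return BASE_PRICE_USD[pipeline] * RATE_MULTIPLIER[customer_type]


uc_rnaseq = estimate_price("RNA-Seq", "uc", 10, 400_000_000)
external_snpseq = estimate_price("SNP-Seq", "external_educational", 4, 100_000_000)
commercial_chipseq = estimate_price("ChIP-Seq", "commercial", 2, 50_000_000)
```

Requests that fall outside the listed tier raise an error instead of returning a number, since the page does not state how the extra charges are calculated.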

    Customers are expected to provide a detailed outline of their experimental design (e.g. treatments, replicates, comparisons) and the download information for the reference genome sequence and annotation data required for these analyses. For questions, please contact us.


    Downloading Sequencing Results in Bulk

    To download all of your sequencing results in bulk, you can use wget. Most Linux distributions come with wget preinstalled; Windows users need to install it separately.


    wget -nH --cut-dirs=1 -r -np -A 'fq.gz,fastq.gz,txt' --accept-regex '(flowcell|Undetermined)'$FC_ID/
    (where $FC_ID is your flowcell number) will download all of your FASTQ files into a folder named "$FC_ID".
    wget -nH --cut-dirs=1 -r -np -R 'index.html*'$FC_ID/fastq_report/
    (where $FC_ID is your flowcell number) will download your FastQC report into the "$FC_ID/fastq_report" folder.
    wget -nH --cut-dirs=1 -r -np$FC_ID/qc/

    (where $FC_ID is your flowcell number) will download your QC report into the "$FC_ID/qc" folder.
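If you prefer to script the three downloads, the sketch below assembles the same wget calls in Python. The server address is not shown in the commands above, so `base_url` is a placeholder you must supply yourself, and the function names are hypothetical. The wget options are copied from the commands above; note that `--cut-dirs=1` produces a folder named after the flowcell only if the flowcell directory sits exactly one level below the server root.

```python
import subprocess


def build_wget_commands(base_url, fc_id):
    """Return the three wget argument lists for one flowcell.

    base_url -- address of the directory holding the flowcell folders
                (placeholder: not given in the text above)
    fc_id    -- your flowcell number
    """
    target = f"{base_url.rstrip('/')}/{fc_id}/"
    common = ["wget", "-nH", "--cut-dirs=1", "-r", "-np"]
    return {
        # FASTQ files and accompanying text files
        "fastq": common + ["-A", "fq.gz,fastq.gz,txt",
                           "--accept-regex", "(flowcell|Undetermined)",
                           target],
        # FastQC report, skipping the auto-generated index pages
        "fastq_report": common + ["-R", "index.html*",
                                  target + "fastq_report/"],
        # QC run report
        "qc": common + [target + "qc/"],
    }


def download_all(base_url, fc_id):
    """Run the three downloads one after another; stops on the first failure."""
    for command in build_wget_commands(base_url, fc_id).values():
        subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the patterns. Calling `build_wget_commands` on its own lets you inspect the commands before anything is downloaded.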

    Custom Data Analysis Service

    Customized data analyses are available upon request at $52 per hour of labor. For questions, please contact us.


