Bioinformatics

 

GGBC provides a wide range of bioinformatics services. The analysis workflow has been divided into basic modules. This approach allows customers to purchase only the services they need. It also creates very clear deliverables for each step in the analysis.

Services
  • Team consultation: Prior to grants and experiments (2 hours free)
  • You-can-do-it: Recommendations on software, scripts, pipelines, and resources (BBB) (request funds for the bioinformatics helpline)
  • Tutorial: Module-based tutoring sessions
  • A la carte: Order specific analysis workflow or modules
Bioinformatics workflow
  • Experimental design
  • Data suitability assessment
  • Quality and variables assessment
  • Reference assessment and preparation
  • Analysis workflows
  • Training in how to perform the above steps (if desired)
Guidelines for modules of data assessment and analysis workflow

Analysis of Data Suitability for Goal Assessment

This is primarily for clients who have generated data without any input from GGBC.

  • Check the data suitability for the intended experimental objectives
  • Check the experimental design and level of sequence coverage
  • Suggest suitable algorithms and pipelines

Quality control and experimental variables assessment (trimming and cleaning)

  • De-multiplexing
  • Quality-based and adapter trimming
  • Removal of artifacts, homopolymers
  • Assess the difference in the quality and quantity between the libraries and replicates in the same experiments
  • Assess the difference between libraries sequenced on different lanes, flowcells or platforms.
  • Assess the difference between the different kinds of libraries
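
As an illustration of the quality-based trimming step above, here is a minimal Python sketch that removes low-quality bases from the 3' end of a read. It assumes Phred+33 quality encoding and a cutoff of Q20. The function name `trim_3prime` and both default values are hypothetical choices for this example, not the settings GGBC uses.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Trim bases from the 3' end while their Phred score is below min_q.

    seq and qual are the sequence and quality strings of one FASTQ record.
    offset=33 assumes Phred+33 encoding (an assumption for this sketch).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk backwards from the 3' end until a base passes the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    print(trim_3prime("ACGTAC", "IIII##"))
```

Production trimmers also handle adapters and sliding windows; this sketch covers only the simplest end-trimming case.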

Data normalization and fitting

The available differential expression (DE) algorithms (TSPM, GLM, edgeR, DESeq, baySeq, Cuffdiff, and rDiff) assume different distribution models (Poisson, quasi-Poisson, generalized linear, and negative binomial). It is important to test how well the data fit each model and to choose the best-fitting one. This analysis includes:

  • Assess data dispersion by comparing the five-number summary across all variables, i.e. individuals, conditions, and replicates.
  • Draw distribution plots and scatter plots to compare variables.
  • Assess the ratio of high- to low-abundance reads across all variables (this module requires a reasonable background in statistics).
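
The dispersion check above can be sketched in a few lines of Python. The example computes a five-number summary (minimum, first quartile, median, third quartile, maximum) for one set of counts, plus a variance-to-mean ratio. The ratio is an addition for illustration: a value near 1 is what a Poisson model expects, while values well above 1 suggest overdispersion, where a negative binomial model may fit better. The function names and the choice of the inclusive quartile method are assumptions of this sketch.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of counts."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def dispersion_index(values):
    """Variance-to-mean ratio (sample variance divided by the mean)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; dispersion index is undefined")
    return statistics.variance(values) / mean


if __name__ == "__main__":
    # Hypothetical read counts for one gene across eight replicates.
    counts = [2, 4, 4, 4, 5, 5, 7, 9]
    print(five_number_summary(counts))
    print(round(dispersion_index(counts), 3))
```

Comparing these summaries between replicates, conditions, and individuals gives a quick first look before any formal model fitting.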

Genome/transcriptome reference assessment and preparation

This is for clients who do not have reference sequence data and need us to obtain a reference from public or other data repositories and assess its suitability.

  • Locate and download the reference genome, transcriptome, annotation files, or any required reference data.
  • Test the file formats and convert them if needed.

Read mapping to a reference genome

This is only the mapping component and includes:

  • Testing different mapping algorithms and selection of the best one for the particular data set.
  • Optimization of the mapping parameters.
  • Generation of mapping files in a common format, e.g. SAM/BAM.
  • Generation of files for mapped and un-mapped reads.
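
To show what separating mapped from un-mapped reads involves, here is a minimal Python sketch that splits the records of a text SAM file using the FLAG field. In the SAM format, FLAG is the second tab-separated column and bit 0x4 marks an unmapped read. The function name `split_sam_records` is hypothetical, and real projects would normally use an established tool for this step rather than hand-written parsing.

```python
def split_sam_records(lines):
    """Split SAM text lines into (header, mapped, unmapped) lists.

    Header lines start with '@'. For alignment lines, FLAG bit 0x4
    set means the read is unmapped.
    """
    header, mapped, unmapped = [], [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            header.append(line)
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            unmapped.append(line)
        else:
            mapped.append(line)
    return header, mapped, unmapped


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    header, mapped, unmapped = split_sam_records(example)
    print(len(header), len(mapped), len(unmapped))
```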

Mapping to a reference for transcriptome assembly

Transcriptome reference assembly (mapping, clustering and exporting consensus transcript sequences):

  • Testing different mapping algorithms and selection of the best one for the particular data set.
  • Optimization of the mapping parameters.
  • Generation of mapping files in a common format.
  • Generation of files for mapped and un-mapped reads.
  • Generation of a sequence file containing the assembled sequences.

Mapping to a reference transcriptome(s) for transcriptional analysis

This is mapping to a reference for gene expression profiling and isoform detection:

  • Testing different mapping algorithms and selection of the best one for the particular data set.
  • Optimization of the mapping parameters.
  • Generation of mapping files in a common format.
  • Generation of files for mapped and un-mapped reads.
  • Generation of a counts file (RPM, RPKM, or FPKM) for each replicate, condition, sample, and experiment.
  • Generation of a new isoforms file (only if a well-annotated reference is available).
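
For readers unfamiliar with the normalized count units named above, the following Python sketch shows the standard definitions of RPM (reads per million mapped reads) and RPKM (reads per kilobase of transcript per million mapped reads). FPKM uses the same formula as RPKM with fragments counted instead of reads. The function names and the example numbers are hypothetical.

```python
def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return read_count * 1e6 / total_mapped_reads


def rpkm(read_count, total_mapped_reads, transcript_length_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (total_mapped_reads * transcript_length_bp)


if __name__ == "__main__":
    # Hypothetical gene: 50 reads, 2,000 bp long, 1,000,000 mapped reads.
    print(rpm(50, 1_000_000))         # 50 reads per million
    print(rpkm(50, 1_000_000, 2000))  # 50 RPM over 2 kb = 25 RPKM
```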

Mapping to a reference for exome-capture analysis

This is primarily a custom analysis for enrichment and capture experiments:

  • Mapping of the captured reads from hundreds of samples (individuals) to a common reference.
  • Assembly of the individual exomes from each sample.
  • Comparison of the exomes to each other and to the distant reference.
  • Performance of either a phylogenetic or an expression analysis.

De novo genome assembly (viral/bacterial)

De novo assembly of viral/bacterial genomes:

  • Generation of contigs.
  • Statistical assessment of the assembly.
  • Comparison to public database or other reference.
  • Assessment of the ortholog gene regions (benchmarks).
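
One statistic commonly reported when assessing an assembly is N50: the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. The source does not say which statistics GGBC reports, so the Python sketch below is only an example of the kind of metric involved. The function name `n50` is hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Accumulate from the longest contig down until half the total is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Total is 290, half is 145; 80 + 70 = 150 reaches it, so N50 is 70.
    print(n50([80, 70, 50, 40, 30, 20]))
```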

De novo genome assembly, assembly stats, and ortholog benchmarking (<50 Mb eukaryotic genomes)

De novo assembly of small eukaryotic genomes:

  • Assembly of contigs/scaffolds.
  • Statistical assessment of the assembly.
  • Comparison to public databases.
  • Assessment of the ortholog gene regions (benchmarks).

De novo genome assembly & assessment (large, complex genomes)

This is a very complex project that requires teamwork and extensive discussion with the client to define the specific deliverables, timeframe, and cost.

Automated genome annotation (Bacterial/viral genome)

BLAST against reference databases and parsing of the results, or the RAST pipeline.

Automated eukaryotic genome annotation (large complex genome, custom)

This is a very complex project that requires teamwork and extensive discussion with the client to define the specific deliverables, timeframe, and cost.

De novo transcriptome assembly

  • Assemble using different k-mers and select the best range of k-mers.
  • Re-assemble using an overlap-based method.
  • Generate contigs/scaffolds.
  • Statistical assessment of the assembly.
  • Comparison to public databases (mRNA and protein).
  • Assessment of the ortholog gene regions (benchmarks).

Transcriptome annotation

  • Reciprocal BLAST against nucleotide and protein databases.
  • Parsing the results.
  • Generation of a spreadsheet for the gene description.
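
One common reading of "reciprocal BLAST" is the reciprocal-best-hit approach: a transcript and a database sequence are paired only when each is the other's top-scoring hit. The Python sketch below illustrates that parsing step under this assumption. It takes hits as simple `(query, subject, bitscore)` tuples, which is a simplification of real BLAST tabular output, and the function names are hypothetical.

```python
def best_hits(rows):
    """Map each query to its highest-bitscore subject.

    rows is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows):
    """Return sorted (a, b) pairs where a and b are each other's best hit."""
    forward = best_hits(forward_rows)
    reverse = best_hits(reverse_rows)
    return sorted(
        (a, b) for a, b in forward.items() if reverse.get(b) == a
    )


if __name__ == "__main__":
    # Hypothetical hits: transcripts against proteins, then the reverse.
    forward = [("t1", "p1", 200.0), ("t1", "p2", 150.0), ("t2", "p2", 90.0)]
    reverse = [("p1", "t1", 198.0), ("p2", "t1", 140.0)]
    print(reciprocal_best_hits(forward, reverse))
```

Here `t2` is left unpaired because its best hit, `p2`, matches `t1` better in the reverse search.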

GO tag and InterProScan annotation

  • Assignment of gene ontology tags to transcript or protein sequences.
  • Assignment of InterProScan derived tags, e.g. Pfam and other HMM tags.
  • Parsing GO and IPS results into spreadsheet format for gene description.

Differential expression analysis

  • Assumes reads are already mapped and available in SAM or BAM format.

SNPs detection/calling/filtering

  • Assumes reads are already mapped and available in SAM or BAM format.

*Logical grouping refers to multiple samples generated in the same experiment under the same data generation conditions (e.g. each of 96 wells in the same plate) such that the same analysis would apply to each sample.

Prices
Service | UGA Fee | Non-UGA Fee | Commercial Fee
Analysis of data suitability for goal assessment | $150 | $173 | $300
Quality control and experimental variables assessment | $150 | $173 | $300
Data normalization and fitting | $300 | $345 | $600
Genome/transcriptome reference assessment and preparation | $150 | $173 | $300
Read mapping to a reference genome | $250 / $75 | $288 / $86 | $500 / $150
Mapping to a reference for transcriptome assembly | $500 / $75 | $575 / $86 | $1,000 / $150
Mapping to a reference transcriptome(s) for transcriptional analysis | $500 / $75 | $575 / $86 | $1,000 / $150
Mapping to a reference for exome-capture analysis | $1,500 | $1,725 | $3,000
De novo genome assembly (viral/bacterial genomes) | $750 / $175 | $863 / $201 | $1,500 / $350
De novo genome assembly (< 50 Mb eukaryotic genomes) | $1,500 | $1,725 | $3,000
De novo genome assembly & assessment (large, complex genomes) | Quoted by request | Quoted by request | Quoted by request
Automated genome annotation (viral/bacterial) | $75 | $86 | $150
Automated eukaryotic genome annotation (large, complex genomes) | Quoted by request | Quoted by request | Quoted by request
De novo transcriptome assembly | $1,500 | $1,725 | $3,000
Transcriptome annotation (basic) | $500 | $575 | $1,000
GO tag and InterProScan annotation | $300 / $150 | $345 / $173 | $600 / $300
Differential expression analysis | $500 | $575 | $1,000
SNPs detection/calling/filtering | $500 | $575 | $1,000
Bacterial draft genome: sequencing/assembly/annotation | $1,000 | $1,150 | $2,000
Bacterial genome submission to NCBI | $250 | $288 | $500
Hourly rate for custom jobs and/or personnel training | $75 | $86 | $150

Some services require an associated service, e.g. differential expression and SNP analyses require mapping to a reference genome or transcriptome.

For services with two prices listed, there is a base fee for the first sample, and subsequent samples are charged at a reduced rate up to 10 samples. Pricing resets to the base fee after each 10 samples. Quantity discounts are available for large projects.
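
The two-price rule can be worked through with a short Python sketch. It reflects one reading of the paragraph above: samples are billed in blocks of 10, the first sample in each block costs the base fee, and the remaining samples in that block cost the reduced fee. Quantity discounts for large projects are not modeled. The function name `project_fee` and the `block_size` parameter are hypothetical.

```python
import math


def project_fee(n_samples, base_fee, reduced_fee, block_size=10):
    """Estimate the fee for a two-price service.

    Assumes each block of `block_size` samples is charged one base fee
    plus the reduced fee for every other sample in the block.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    n_blocks = math.ceil(n_samples / block_size)
    return n_blocks * base_fee + (n_samples - n_blocks) * reduced_fee


if __name__ == "__main__":
    # Read mapping to a reference genome at the UGA rate ($250 / $75).
    print(project_fee(1, 250, 75))   # 250
    print(project_fee(10, 250, 75))  # 250 + 9 * 75 = 925
    print(project_fee(12, 250, 75))  # 2 * 250 + 10 * 75 = 1250
```

Actual invoices may differ, so treat the result as an estimate and confirm the total with GGBC.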

Many services are based on a logical-grouping standard. This refers to multiple samples generated in the same experiment under the same data generation conditions such that the same analysis would apply to each sample, e.g. each of 96 wells in the same plate or 20 samples from 2 lanes of Illumina sequencing.

Rates effective July 1, 2015.