Bioinformatics

GGBC provides a wide range of bioinformatics services. The analysis workflow has been divided into basic modules. This approach allows customers to purchase only the services they need. It also creates very clear deliverables for each step in the analysis.

Services
  • Team consultation: Prior to grants and experiments (2 hours free)
  • Tutorial: Module-based tutoring sessions
  • A la carte: Order specific analysis workflows or modules
Bioinformatics workflow
  • Experimental design
  • Data suitability assessment
  • Quality and variables assessment
  • Reference assessment and preparation
  • Analysis workflows
  • Training in how to perform the above steps (if desired)
Guidelines for modules of data assessment and analysis workflow

Analysis of Data Suitability for Goal Assessment

This is primarily for clients who have generated data without any input from GGBC.

  • Check the data suitability for the intended experimental objectives
  • Check the experimental design and level of sequence coverage
  • Suggest suitable algorithms and pipelines

Quality control and experimental variables assessment (trimming and cleaning)

  • De-multiplexing
  • Quality-based and adapter trimming
  • Removal of artifacts and homopolymers
  • Assess differences in quality and quantity between the libraries and replicates within the same experiment
  • Assess differences between libraries sequenced on different lanes, flowcells or platforms
  • Assess differences between the different kinds of libraries
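
As a rough illustration of the quality-based trimming step above, the following Python sketch removes low-quality bases from the 3' end of a read. It assumes Phred+33 encoded FASTQ quality strings and a cutoff of Q20; the function names, the cutoff, and the example read are hypothetical and do not describe the specific tools or settings used by GGBC.

```python
def phred_scores(quality, offset=33):
    """Convert a FASTQ quality string to Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(sequence, quality, min_quality=20):
    """Drop trailing bases whose Phred score is below min_quality."""
    scores = phred_scores(quality)
    end = len(scores)
    # Walk back from the 3' end until a base meets the cutoff.
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality[:end]


# Hypothetical read: five high-quality bases followed by three poor ones.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII###")
print(trimmed_seq, trimmed_qual)
```

Production trimming tools typically use more robust strategies, such as sliding-window averages, and also handle adapter removal, which this sketch leaves out.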

Data normalization and fitting

The available differential expression (DE) algorithms (TSPM, GLM, edgeR, DESeq, baySeq, Cuffdiff, and rDiff) assume different distribution models (Poisson, quasi-Poisson, generalized linear, and negative binomial). It is therefore important to test how well the data fit each model and to choose the best-fitting one. This analysis includes:

  • Assess data dispersion by comparing the five-number summary (minimum, lower quartile, median, upper quartile, maximum) between all variables, i.e. individuals, conditions, and replicates.
  • Draw distribution plots and scatter plots to compare variables.
  • Assess the ratio of high- to low-abundance reads across all variables (this module requires a reasonable background in statistics).
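
The dispersion check described above can be sketched in a few lines of Python. The example computes a five-number summary and a variance-to-mean ratio for each condition; under a Poisson model the variance is roughly equal to the mean, so a ratio well above 1 hints at overdispersion, where negative binomial models are commonly preferred. The counts, condition names, and function names are invented for illustration, and the quartile method is one of several in common use.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def dispersion_index(values):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    return statistics.variance(values) / statistics.mean(values)


# Hypothetical read counts for one gene across replicates of two conditions.
counts = {
    "control": [10, 12, 8, 11, 9],
    "treated": [5, 40, 12, 90, 23],
}
for condition, values in counts.items():
    summary = five_number_summary(values)
    print(condition, summary, round(dispersion_index(values), 2))
```

A real assessment would look at many genes at once and compare fitted models formally; this only shows the kind of summary statistics involved.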

Genome/transcriptome reference assessment and preparation

This is for clients who do not have reference sequence data and need us to obtain a reference from public or other data repositories and assess its suitability.

  • Locate and download the reference genome, transcriptome, annotation files, or any required reference data.
  • Check the file formats and convert them if needed.

Read mapping to a reference genome

This is only the mapping component and includes:

  • Testing different mapping algorithms and selection of the best one for the particular data set.
  • Optimization of the mapping parameters.
  • Generation of mapping files in a common format, e.g. SAM/BAM.
  • Generation of files for mapped and un-mapped reads.
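
To illustrate how mapped and un-mapped reads can be separated, the sketch below reads SAM-formatted text and checks the FLAG field (second column), where bit 0x4 marks an unmapped read according to the SAM specification. The records and the function name are made up for this example; in practice this step is usually done with dedicated tools working on BAM files rather than hand-written parsing.

```python
def split_by_mapping(sam_lines):
    """Return (mapped, unmapped) lists of read names from SAM text lines."""
    mapped, unmapped = [], []
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Bit 0x4 of the FLAG field means the read is unmapped.
        if flag & 0x4:
            unmapped.append(fields[0])
        else:
            mapped.append(fields[0])
    return mapped, unmapped


# Hypothetical SAM records (11 mandatory tab-separated fields each).
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTACGT\tIIIIIIII",
    "read3\t16\tchr1\t250\t60\t8M\t*\t0\t0\tGGGTACGT\tIIIIIIII",
]
mapped_reads, unmapped_reads = split_by_mapping(example)
print(mapped_reads, unmapped_reads)
```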

Mapping to a reference for transcriptome assembly

Transcriptome reference assembly (mapping, clustering and exporting consensus transcript sequences).

  • Testing of different mapping algorithms and selection of the best for the particular data set.
  • Optimization of the mapping parameters.
  • Generation of mapping files in a common format.
  • Generation of files for mapped and un-mapped reads.
  • Generation of a sequence file containing the assembled sequences.

Mapping to a reference transcriptome(s) for transcriptional analysis

This is mapping to a reference for gene expression profiling and isoform detection.

  • Testing of different mapping algorithms and selection of the best for the particular data set.
  • Optimization of the mapping parameters.
  • Generation of mapping files in a common format.
  • Generation of files for mapped and un-mapped reads.
  • Generation of count files (RPM, RPKM, or FPKM) for each replicate, condition, sample, and experiment.
  • Generation of a new isoforms file (only if a well-annotated reference is available).
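
For readers unfamiliar with the normalized count units listed above, here is a minimal sketch of the standard RPM and RPKM formulas. RPM scales a raw count by the total number of mapped reads in millions; RPKM additionally divides by the gene length in kilobases. The numbers are invented, the function names are hypothetical, and FPKM follows the same form with fragments in place of reads.

```python
def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return read_count * 1e6 / total_mapped_reads


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp gene in a 10-million-read library.
print(rpm(500, 10_000_000))         # reads per million
print(rpkm(500, 2000, 10_000_000))  # length-normalized value
```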

Mapping to a reference for exome-capture analysis

This is primarily a custom analysis for enrichment and capture experiments.

  • Mapping of the captured reads from hundreds of samples (individuals) to a common reference.
  • Assembly of the individual exomes from each sample.
  • Comparison of the exomes to each other and to the distant reference.
  • Performance of either a phylogenetic or an expression analysis.

De novo genome assembly (viral/bacterial)

De novo assembly of viral/bacterial genomes:

  • Generation of contigs.
  • Statistical assessment of the assembly.
  • Comparison to a public database or other reference.
  • Assessment of the ortholog gene regions (benchmarks).
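
One widely used statistic in assembly assessment is N50: the contig length at which contigs of that length or longer contain at least half of the total assembled bases. The sketch below shows one way to compute it; the contig lengths are invented and the function name is hypothetical, and it is offered only as an example of the kind of metric such an assessment may report.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    # Accumulate contigs from longest to shortest until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths in kilobases.
print(n50([80, 70, 50, 40, 30, 20]))
```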

De novo genome assembly, assembly stats, and ortholog benchmarking (<50 Mb eukaryotic genomes)

De novo assembly of small eukaryotic genomes:

  • Assembly of contigs/scaffolds.
  • Statistical assessment of the assembly.
  • Comparison to public databases.
  • Assessment of the ortholog gene regions (benchmarks).

De novo genome assembly & assessment (large, complex genomes)

This is a very complex project that requires teamwork and extensive discussion with the client to define the deliverables, timeframe, and cost.

Automated genome annotation (Bacterial/viral genome)

BLAST against reference databases and parse the results, or run the RAST pipeline.

Automated eukaryotic genome annotation (large complex genome, custom)

This is a very complex project that requires teamwork and extensive discussion with the client to define the deliverables, timeframe, and cost.

De novo transcriptome assembly

  • Assemble using different k-mers and select the best range of k-mers.
  • Re-assemble using an overlap-based method.
  • Generate contigs/scaffolds.
  • Statistical assessment of the assembly.
  • Comparison to public databases (mRNA and protein).
  • Assessment of the ortholog gene regions (benchmarks).
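
To give a sense of why the choice of k matters in the first step above, the sketch below counts the k-mers in a short sequence for several values of k. Smaller k values produce more repeated k-mers, while larger values produce fewer and more specific ones. The sequence and function name are hypothetical, and real assemblers build graphs from millions of reads rather than a single string.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical sequence containing a short repeat.
sequence = "ACGTACGTTTGACGTACG"
for k in (3, 5, 7):
    kmers = count_kmers(sequence, k)
    repeated = sum(1 for n in kmers.values() if n > 1)
    print(k, len(kmers), repeated)
```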

Transcriptome annotation

  • Reciprocal BLAST against nucleotide and protein databases.
  • Parsing the results.
  • Generation of a spreadsheet for the gene description.
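
The parsing step above can be illustrated with a small reciprocal-best-hit sketch. It assumes BLAST results in the 12-column tabular layout (the default columns of BLAST+ `-outfmt 6`, where column 1 is the query, column 2 the subject, and column 12 the bit score). The hit lines, sequence names, and function names are invented, and the actual filtering criteria used for annotation may differ.

```python
def best_hits(tabular_lines):
    """Map each query to its highest-bit-score subject."""
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_lines, reverse_lines):
    """Pairs where each sequence is the other's best hit in both searches."""
    forward = best_hits(forward_lines)
    reverse = best_hits(reverse_lines)
    return sorted((q, s) for q, s in forward.items() if reverse.get(s) == q)


# Hypothetical hits: transcripts against proteins, then proteins against transcripts.
forward = [
    "tx1\tprotA\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t200",
    "tx1\tprotB\t80.0\t280\t56\t0\t1\t280\t1\t280\t1e-90\t150",
    "tx2\tprotB\t90.0\t290\t29\t0\t1\t290\t1\t290\t1e-120\t180",
]
reverse = [
    "protA\ttx1\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t198",
    "protB\ttx3\t88.0\t285\t34\t0\t1\t285\t1\t285\t1e-110\t170",
    "protB\ttx2\t90.0\t290\t29\t0\t1\t290\t1\t290\t1e-80\t120",
]
pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

The resulting pairs could then be written out as rows of a gene description spreadsheet.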

GO tag and InterProScan annotation

  • Assignment of gene ontology tags to transcript or protein sequences.
  • Assignment of InterProScan-derived tags, e.g. Pfam and other HMM tags.
  • Parsing of GO and InterProScan results into spreadsheet format for gene description.

Differential expression analysis

  • Assumes reads are already mapped and available in “SAM” or “BAM” format.

SNPs detection/calling/filtering

  • Assumes reads are already mapped and available in “SAM” or “BAM” format.

*Logical grouping refers to multiple samples generated in the same experiment under the same data generation conditions (e.g. each of 96 wells in the same plate) such that the same analysis would apply to each sample.

Prices
Service | UGA Fee | Non-UGA Fee | Commercial Fee
De novo transcriptome assembly from Illumina short reads | $1,500 | $1,770 | $2,250
De novo transcriptome assembly from PacBio Iso-Seq data | $1,500 | $1,770 | $2,250
Assembly and annotation of mid-sized genomes from PacBio long reads | $3,500 | $4,130 | $5,250
Assembly and annotation of 1-10 bacterial genomes from PacBio long reads | $2,000 | $2,360 | $3,000
Bacterial draft genome: sequencing using Illumina short reads/assembly/annotation | $1,000 | $1,180 | $1,500
Transcriptome annotation (basic) | $500 | $590 | $750
Differential expression analysis (up to 24 samples) | $2,000 | $2,360 | $3,000
Small RNA analysis (up to 24 samples) | $2,000 | $2,360 | $3,000
GO tag and InterProScan annotation | $300 | $354 | $450
SNPs detection/calling/filtering | $500 | $590 | $750
GBS analysis using STACKS (up to 96 samples) | $1,500 | $1,770 | $2,250
Microbiome analysis (up to 12 samples) [$25 for each additional sample] | $350 | $413 | $525
Bacterial genome submission to NCBI | $250 | $295 | $375
Hourly rate for custom jobs and/or personnel training | $75 | $89 | $113

We have many more analyses available. Please contact us for more information.

Rates effective 1/1/2018.