Bioinformatics

 

Next-generation sequencing (NGS) and third-generation sequencing approaches are ubiquitous in biological, genetic and clinical experimentation. Researchers who need to manipulate or evaluate these data often find that they need specialized guidance or higher-level bioinformatics expertise to accomplish their goals.

The GGBC Bioinformatics team at UGA consists of experienced research faculty and graduate student interns who apply best-practice methodologies and use open-source and custom-built software to process, analyze and visualize a wide range of NGS datasets.

Consultation on experimental design and proposal development is free. The more common bioinformatics workflows, e.g. RNA-Seq, bacterial genome assembly, transcriptome assembly and variant analysis, are priced in an accessible, modular fashion (see Prices below). More complex analyses, e.g. eukaryotic genome assembly, comparative genomics and many customized workflows, are billed at hourly rates, since accurate estimates are often difficult until the work starts.

Offerings
  • Team consultation
  • Experimental design
  • Variety of computational and bioinformatics analyses
  • Training on specific analysis upon request
  • Customized analysis pipelines
Guidelines for Different Analysis Workflows

Microbiome 16S/18S/ITS Sequencing Data Analysis

  • Type of Data: Targeted sequencing of any of the 16S, 18S or ITS regions.
  • Analysis methods: 
    Quality-based sequence trimming and removal of adapters and specific primer sequences, removal of chimeric sequences, joining of forward and reverse reads, OTU identification, OTU classification, statistical analysis.
  • Deliverables: 
    Species Richness per sample, Species Relative Abundance among samples, Core Microbiomes, Alpha-diversity, Beta-diversity, Differential Analysis
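
As a rough illustration of the richness, relative abundance, alpha-diversity and beta-diversity deliverables, the sketch below computes them from a per-sample table of OTU counts. It assumes the Shannon index for alpha-diversity and Bray-Curtis dissimilarity for beta-diversity; these are common choices, not necessarily the metrics used in a given analysis, and the OTU names and counts are made up.

```python
import math

def relative_abundance(counts):
    """Convert raw OTU counts for one sample into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: n / total for otu, n in counts.items()}

def richness(counts):
    """Number of OTUs observed at least once in the sample."""
    return sum(1 for n in counts.values() if n > 0)

def shannon(counts):
    """Shannon alpha-diversity (natural log) of one sample."""
    props = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in props if p > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples (0 = identical, 1 = disjoint)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    otus = set(a) | set(b)
    return sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus) / total

# Hypothetical counts for two samples.
sample_a = {"OTU1": 50, "OTU2": 50}
sample_b = {"OTU1": 50, "OTU3": 50}
print(richness(sample_a), shannon(sample_a), bray_curtis(sample_a, sample_b))
```

In practice these values are usually computed on rarefied or otherwise normalized counts so that samples with different sequencing depths are comparable.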

RNA-Seq

  • Type of Data:  Illumina PE 75, stranded library.  Required: annotated (GFF3/GTF) genome or transcriptome.
  • Analysis methods: 
    Data Quality Assessment (Raw and Trimmed): FastQC
    Data Trimming: Trimmomatic
    Mapping: Bowtie2/TopHat/RSEM, depending on input dataset.
    Expression Analysis: Cuffdiff/DESeq2/edgeR, depending on dataset.
  • Deliverables: 
    Quality assessment for all samples (raw and trimmed), PCA and/or BCV analysis of samples, MA plots, list of differentially expressed genes in Excel format including fold change, P values, FDR, normalized counts, etc., as well as ancillary files, e.g. read mapping metrics and BAM files.
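
To make the FDR column in the differential expression list more concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment, assuming that is the correction applied (it is the default multiple-testing adjustment in DESeq2 and edgeR). The P values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    # Indices of the P values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values never
    # increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A gene is then typically reported as differentially expressed when its adjusted value falls below a chosen threshold such as 0.05.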

Small RNA (sRNA) Analysis

  • Type of Data: Illumina SE75 (CAP-miRSeq requirements: reference genome, miRBase species accession)
  • Analysis methods:
    Data Quality Assessment (Raw and Trimmed): FastQC
    Data Trimming: Cutadapt
    Expression Analysis: CAP-miRSeq Pipeline (Bowtie, Randfold, HTSeq, miRDeep2)
  • Deliverables:
    Quality assessment for all samples (raw and trimmed), trimmed read distributions, profile of all small RNAs present in each sample, prediction of novel miRNAs.
    General expression results: Excel files with mature, raw, normalized and novel miRNA counts.
    List of differentially expressed miRNAs in Excel format including fold change, P values, FDR, normalized counts.

SNP Analysis

  • Type of Data:  Illumina PE75.  Mapping reference required.
  • Analysis methods: 
    Data Quality Assessment (Raw and Trimmed): FastQC
    Data Trimming: Trimmomatic
    Mapping and duplicate marking: BWA, Picard
    Variant Calling and filtering: GATK
  • Deliverables:
    Quality assessment for all samples (raw and trimmed), trimmed read metrics, mapping and read duplication stats, SNP and Indel variant call files (VCF) for both raw and filtered datasets.
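
For a quick sanity check on the raw and filtered VCF deliverables, the sketch below counts SNPs and indels by comparing the REF and ALT columns of each record. It reads uncompressed VCF text lines and assumes that filtered records are marked with a FILTER value other than PASS; the example records are invented.

```python
def classify(ref, alt_field):
    """Label one VCF record as 'snp', 'indel' or 'other' from REF and ALT."""
    alts = alt_field.split(",")
    # Symbolic alleles, spanning deletions and missing ALT are not classified.
    if any(a.startswith("<") or a in {"*", "."} for a in alts):
        return "other"
    if len(ref) == 1 and all(len(a) == 1 for a in alts):
        return "snp"
    if any(len(a) != len(ref) for a in alts):
        return "indel"
    return "other"  # e.g. multi-nucleotide substitutions

def count_variants(vcf_lines, passed_only=False):
    """Count variant types; optionally keep only records with FILTER == PASS."""
    counts = {"snp": 0, "indel": 0, "other": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alt, flt = fields[3], fields[4], fields[6]
        if passed_only and flt != "PASS":
            continue
        counts[classify(ref, alt)] += 1
    return counts

# Hypothetical records: CHROM POS ID REF ALT QUAL FILTER INFO
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t40\tPASS\t.",
    "chr1\t300\t.\tC\tT\t10\tLowQual\t.",
]
print(count_variants(example), count_variants(example, passed_only=True))
```

Comparing the two counts gives a simple view of how many calls the filtering step removed.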

Bacterial Genome Assembly & Annotation (short read)

  • Type of Data:  Illumina PE150, PE300
  • Analysis methods: 
    Data Quality Assessment (Raw and Trimmed): FastQC
    Data Trimming: Trimmomatic
    Assembly: SPAdes
    Benchmarking: Quast, BUSCO, BlastN, Mauve
    Automated Annotation & prophage discovery: RASTtk, PHASTER
  • Deliverables:
    Assembly fasta file, Quality assessments for all samples (raw and trimmed), Quast summary metrics (plus/minus reference), BUSCO identification of core gene set, Mauve alignment to closest genome reference and ordering of contigs, BlastN (tabular output), RASTtk annotation (xls, gff, gbk, peptide.fa), PHASTER identification of prophage sequence(s).
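
One of the headline numbers in a Quast summary is N50. The sketch below shows how that value can be derived from an assembly fasta file, as a reading aid for the report rather than a replacement for Quast. It counts every contig, whereas Quast may apply a minimum contig length, so the two numbers can differ slightly. The contig names and sequences are invented.

```python
def fasta_lengths(lines):
    """Return the length of each sequence in FASTA-formatted text lines."""
    lengths, current = [], None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0  # start a new contig
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs")
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length

# Hypothetical two-contig assembly.
assembly = [">contig1", "ACGT", "AC", ">contig2", "GG"]
print(n50(fasta_lengths(assembly)))
```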

Bacterial Genome Assembly & Annotation (long read)

  • Type of Data:  PacBio
  • Analysis methods: 
    Error correction, assembly and contig polishing: Canu, BLASR, Arrow
    Benchmarking: Quast, BUSCO.
  • Deliverables:
    Assembly fasta file, Contig coverage plots, Quast summary metrics (plus/minus reference), BUSCO identification of core gene set, Mauve alignment to closest genome reference and ordering of contigs, BlastN (tabular output), RASTtk annotation (xls, gff, gbk), PHASTER identification of prophage sequence(s).

Eukaryotic Genome Assembly (de novo, short read)/Custom

  • Type of Data:  Illumina PE
  • Analysis methods: 
    Data Quality Assessment (Raw and Trimmed): FastQC
    Data Trimming: Trimmomatic
    Assembly: Velvet/SOAPdenovo2/ABySS (depending on dataset)
    Benchmarking: Quast, BUSCO, BlastN
  • Deliverables:
    Assembly fasta file, Quast summary metrics (plus/minus reference), BUSCO identification of core gene set, BlastN (tabular output).

Transcriptome Assembly (short read)

  • Type of Data: PE75, PE150
  • Analysis methods: 
    Data Quality Assessment (Raw and Trimmed): FastQC
    Data Trimming: Trimmomatic
    Assembly: Trinity
    Benchmarking: Quast, BUSCO
  • Deliverables:
    Assembly fasta file, Quast summary metrics (plus/minus reference), BUSCO identification of core gene set.
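
The FastQC, Trimmomatic and Trinity steps listed above could be chained roughly as follows. This sketch only builds the command lines and does not run them. The executable names (fastqc, trimmomatic, Trinity), the file names, the adapter file and every parameter value are illustrative assumptions; they are not the settings used by the GGBC team.

```python
def fastqc_cmd(fastqs, outdir):
    """FastQC report for one or more FASTQ files."""
    return ["fastqc", "--outdir", outdir, *fastqs]

def trimmomatic_pe_cmd(r1, r2, prefix, adapters="TruSeq3-PE.fa", threads=4):
    """Paired-end Trimmomatic run: adapter clipping plus quality trimming."""
    return [
        "trimmomatic", "PE", "-threads", str(threads),
        r1, r2,
        f"{prefix}_R1.paired.fq.gz", f"{prefix}_R1.unpaired.fq.gz",
        f"{prefix}_R2.paired.fq.gz", f"{prefix}_R2.unpaired.fq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10",  # illustrative thresholds
        "SLIDINGWINDOW:4:20",
        "MINLEN:36",
    ]

def trinity_cmd(left, right, outdir="trinity_out", cpu=4, max_memory="50G"):
    """De novo transcriptome assembly from trimmed paired reads."""
    return [
        "Trinity", "--seqType", "fq",
        "--left", left, "--right", right,
        "--CPU", str(cpu), "--max_memory", max_memory,
        "--output", outdir,
    ]

# Hypothetical sample; commands are printed for review, not executed.
raw = ("sample_R1.fq.gz", "sample_R2.fq.gz")
steps = [
    fastqc_cmd(raw, "qc_raw"),
    trimmomatic_pe_cmd(*raw, prefix="sample"),
    fastqc_cmd(["sample_R1.paired.fq.gz", "sample_R2.paired.fq.gz"], "qc_trimmed"),
    trinity_cmd("sample_R1.paired.fq.gz", "sample_R2.paired.fq.gz"),
]
for cmd in steps:
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run once the tools are installed and the parameters have been reviewed for the dataset at hand.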

Transcriptome Assembly (long read)

  • Type of Data: PacBio
  • Analysis methods: 
    Error Correction, assembly and contig polishing: IsoSeq3
    Transcript clustering: Minimap2, Cupcake (+ reference genome), Cogent (-/+ reference genome)
    Benchmarking: Quast, BUSCO
  • Deliverables:
    High-quality and low-quality transcript files (fasta and fastq), locus-collapsed and 5’-degradation-filtered assembly fasta files and GFF file, Quast summary metrics (plus/minus reference), BUSCO identification of core gene set.
Prices
Service | UGA Fee | Non-UGA Fee | Commercial Fee
De novo transcriptome assembly from Illumina short reads | $1,500 | $1,770 | $2,250
De novo transcriptome assembly from PacBio Iso-Seq data | $1,500 | $1,770 | $2,250
Assembly and annotation of mid-sized genomes from PacBio long reads | $3,500 | $4,130 | $5,250
Assembly and annotation of 1-10 bacterial genomes from PacBio long reads | $2,000 | $2,360 | $3,000
Bacterial draft genome: sequencing using Illumina short reads/assembly/annotation | $1,000 | $1,180 | $1,500
Transcriptome annotation (basic) | $500 | $590 | $750
Differential expression analysis (up to 24 samples) | $2,000 | $2,360 | $3,000
Small RNA analysis (up to 24 samples) | $2,000 | $2,360 | $3,000
GO tag and InterProScan annotation | $300 | $354 | $450
SNP detection/calling/filtering (up to 24 samples) | $1,500 | $1,770 | $2,250
GBS analysis using STACKS (up to 96 samples) | $1,500 | $1,770 | $2,250
Microbiome analysis (up to 12 samples) [$25 for each additional sample] | $350 | $413 | $525
Bacterial genome submission to NCBI | $250 | $295 | $375
Hourly rate for custom jobs and/or personnel training | $75 | $89 | $113
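
As a worked example of the two variable-cost rows, the sketch below estimates a microbiome analysis fee and an hourly custom job fee from the listed rates. The microbiome function covers the UGA fee only, because the listing does not say whether the $25 per additional sample also applies unchanged to the non-UGA and commercial fees. The function names and tier labels are made up for illustration.

```python
# Rates copied from the price list (effective 1/1/2018).
MICROBIOME_BASE_UGA = 350      # covers up to 12 samples
MICROBIOME_INCLUDED = 12
MICROBIOME_EXTRA_PER_SAMPLE = 25
HOURLY_RATE = {"uga": 75, "non_uga": 89, "commercial": 113}

def microbiome_fee_uga(n_samples):
    """UGA fee for a microbiome analysis with n_samples samples."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    extra = max(0, n_samples - MICROBIOME_INCLUDED)
    return MICROBIOME_BASE_UGA + extra * MICROBIOME_EXTRA_PER_SAMPLE

def hourly_fee(hours, tier="uga"):
    """Fee for a custom job or training session billed by the hour."""
    if hours < 0:
        raise ValueError("hours cannot be negative")
    return HOURLY_RATE[tier] * hours

print(microbiome_fee_uga(20))        # 12 included + 8 additional samples
print(hourly_fee(10, "commercial"))  # 10 hours at the commercial rate
```

These figures are estimates only; contact the team for a quote on a specific project.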

 

We have many more analyses available. Please contact us for more information.

Rates effective 1/1/2018.