Special Genomic Libraries
Hi-C Libraries & Sequencing
Summary: Hi-C is a library preparation technique designed to investigate the 3D organization of the genome. Briefly, samples are crosslinked with formaldehyde and then digested to generate ‘aggregates’ of covalently linked chromatin segments that were physically near each other in the nucleus. These chromatin aggregates are ligated under very dilute conditions to promote intra-aggregate ligation. Libraries are then constructed using conventional NGS library preparation techniques, and library molecules containing the desired proximity ligation events are selected using streptavidin beads. We prepare Hi-C libraries using a kit from Dovetail Genomics.
For questions about Hi-C library preparation and sample submission, please contact Magdy Alabady (malabady@uga.edu).
Sample Preparation
Sample Type | Recommendations | Input Amount Needed |
---|---|---|
Tissue | • Tissues with high cellularity and low fat content are preferred, e.g., brain, muscle, heart, or spleen. • Samples should be collected from a live or recently deceased specimen and snap-frozen in liquid nitrogen. • The protocol does not support fat, bone, or similar tissues. • Do not preserve samples using RNAlater, ethanol, or lyophilization. Samples should be stored at -80 °C and shipped on dry ice. | 20-40 mg per sample |
Cells | • Any cell culture is compatible with Hi-C library preparation. • Adherent cells should be dissociated with trypsin. | 0.5 × 10^6 cells |
Blood | • Blood samples should be collected from a live or recently deceased specimen. • An anticoagulant must be added. EDTA is the preferred anticoagulant. Heparin and citrate (ACD-A) are acceptable alternatives. • Flash-freeze samples in liquid nitrogen and store them at -80 °C. Samples should be shipped overnight on dry ice. | 300 µL to 1 mL of blood. Samples are normalized to 0.5 × 10^6 cells. |
Plants | • Leaves collected from plants at the one- or two-leaf seedling stage are preferred. • Young leaves from mature plants and plant tissue culture can also be used. • Snap-freeze samples in liquid nitrogen and store them at -80 °C. Samples should be shipped overnight on dry ice. | 250 mg of flash-frozen tissue per sample |
Multiplexing: The Dovetail Hi-C Library Preparation Kit supports the multiplexing of up to 8 samples for sequencing on one Illumina flow cell.
Sequencing: Sequencing of Hi-C libraries is done in two stages. In the first stage, the library is sequenced to generate ~2 million paired-end reads. These reads are used to QC the library using Dovetail’s HiRise software. This step ensures that the libraries have the desired proximity ligations between portions of the genome that are physically near each other in the cell, but distant in the genome assembly. If the library passes this QC step, a full-scale sequencing run is performed to reach the desired sequencing depth. PE75 reads are sufficient for Hi-C analysis, but longer read lengths can be used if desired.
The sequencing depth and the number of Hi-C libraries required for Hi-C analysis depend on genome complexity. The table below lists the recommendations from the kit manufacturer, Dovetail Genomics:
Genome Size (Gb) | Simple Genomes: No. of Hi-C Libraries | Simple Genomes: No. of Read Pairs to Sequence (Millions) | Complex Genomes: No. of Hi-C Libraries | Complex Genomes: No. of Read Pairs to Sequence (Millions) |
---|---|---|---|---|
1 | 1 | 100 | 1 | 100 |
2 | 1 | 200 | 2 | 250 |
3 | 1 | 300 | 3 | 450 |
4 | 2 | 400 | 4 | 500 |
5 | 2 | 500 | 5 | 850 |
Simple Genomes: Diploid or haploid, with repetitive content below 30% and heterozygosity below 0.005%. Humans, many mammals, and some fish and birds have simple genomes.
Complex Genomes: Show any of the following: polyploidy, repeat content above 30%, or heterozygosity above 0.005%. Many plants, salmonid fishes, and amphibians have complex genomes. If you are unsure which category your genome of interest falls into, Dovetail recommends following the guidelines for complex genomes.
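As a planning aid, the table and the two definitions above can be expressed as a small lookup. This is only a sketch: the function names are hypothetical, rounding the genome size up to the next listed row is an assumption made here (the table lists whole-Gb sizes only), and missing genome traits are treated as "complex", following Dovetail's advice for uncertain cases.

```python
import math

# Dovetail's recommendations as listed in the table above:
# genome size (Gb) -> (number of Hi-C libraries, read pairs in millions)
SIMPLE = {1: (1, 100), 2: (1, 200), 3: (1, 300), 4: (2, 400), 5: (2, 500)}
COMPLEX = {1: (1, 100), 2: (2, 250), 3: (3, 450), 4: (4, 500), 5: (5, 850)}


def is_complex(ploidy=None, repeat_pct=None, heterozygosity_pct=None):
    """Return True if the genome should follow the complex-genome guidelines."""
    # If any trait is unknown, fall back to the complex-genome guidelines.
    if ploidy is None or repeat_pct is None or heterozygosity_pct is None:
        return True
    # Any single criterion is enough to make a genome "complex".
    return ploidy > 2 or repeat_pct > 30 or heterozygosity_pct > 0.005


def recommend(genome_size_gb, ploidy=None, repeat_pct=None, heterozygosity_pct=None):
    """Return (number of Hi-C libraries, read pairs in millions)."""
    # Assumption: round up to the next genome size listed in the table.
    size_key = max(1, math.ceil(genome_size_gb))
    table = COMPLEX if is_complex(ploidy, repeat_pct, heterozygosity_pct) else SIMPLE
    if size_key not in table:
        raise ValueError("genome size is outside the 1-5 Gb range covered by the table")
    return table[size_key]


if __name__ == "__main__":
    libraries, read_pairs = recommend(2.1, ploidy=4, repeat_pct=60, heterozygosity_pct=0.01)
    print(f"{libraries} libraries, {read_pairs} million read pairs")
```

Genomes larger than 5 Gb raise an error rather than extrapolating, because the table gives no recommendation for them.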
For HiRise analysis, users should provide a draft genome assembly with an N50 greater than 1 Mb and an N90 greater than 20 kb.
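Before submitting, a draft assembly can be checked against these thresholds from its list of scaffold (or contig) lengths. The sketch below uses the standard definition of N50 and N90; the function names are hypothetical, and extracting the lengths from a FASTA file is left out.

```python
def nx(lengths, percent):
    """Return the Nx value: the length L such that sequences of length >= L
    together contain at least `percent` percent of the total assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point issues at the boundary.
        if running * 100 >= percent * total:
            return length
    raise ValueError("no sequence lengths given")


def meets_hirise_input(lengths, min_n50=1_000_000, min_n90=20_000):
    """Check the thresholds stated above: N50 > 1 Mb and N90 > 20 kb."""
    return nx(lengths, 50) > min_n50 and nx(lengths, 90) > min_n90


if __name__ == "__main__":
    # Made-up lengths, for illustration only.
    example = [5_000_000, 3_000_000, 1_500_000, 400_000, 100_000]
    print(nx(example, 50), nx(example, 90), meets_hirise_input(example))
```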
Example QC data from Hi-C libraries made at the GGBC:
Zea mays Ab10 strain
Basic Assembly Statistics | |
---|---|
Total Length | 2,106,338,117 bp |
Scaffold N50 | 223,902,240 bp |
Scaffold N90 | 159,769,782 bp |
Largest Scaffold | 307,041,717 bp |
Basic Library Statistics | |
Total Read Pairs Analyzed | 4,237,074 |
Library Read Length | 75 bp |
Profile of Read Insert Distribution | |
0 bp < Insert <= 1 kbp | 14.96% |
1 kbp < Insert <= 100 kbp | 1.48% |
100 kbp < Insert <= 1 Mbp | 0.62% |
1 Mbp < Insert <= 3 Mbp | 0.3% |
3 Mbp < Insert <= 5 Mbp | 0.16% |
5 Mbp < Insert | 1.63% |
Oryza sativa
Basic Assembly Statistics | |
---|---|
Total Length | 373,245,519 bp |
Scaffold N50 | 29,958,434 bp |
Scaffold N90 | 23,207,287 bp |
Largest Scaffold | 43,270,923 bp |
Basic Library Statistics | |
Total Read Pairs Analyzed | 2,914,137 |
Library Read Length | 75 bp |
Profile of Read Insert Distribution | |
0 bp < Insert <= 1 kbp | 34.2% |
1 kbp < Insert <= 100 kbp | 3.31% |
100 kbp < Insert <= 1 Mbp | 2.09% |
1 Mbp < Insert <= 3 Mbp | 1.37% |
3 Mbp < Insert <= 5 Mbp | 0.84% |
5 Mbp < Insert | 2.65% |
Homo sapiens
Basic Assembly Statistics | |
---|---|
Total Length | 3,257,330,713 bp |
Scaffold N50 | 145,138,636 bp |
Scaffold N90 | 58,617,616 bp |
Largest Scaffold | 248,956,422 bp |
Basic Library Statistics | |
Total Read Pairs Analyzed | 5,955,150 |
Library Read Length | 75 bp |
Profile of Read Insert Distribution | |
0 bp < Insert <= 1 kbp | 43.48% |
1 kbp < Insert <= 100 kbp | 5.21% |
100 kbp < Insert <= 1 Mbp | 0.05% |
1 Mbp < Insert <= 3 Mbp | 0.04% |
3 Mbp < Insert <= 5 Mbp | 0.03% |
5 Mbp < Insert | 1.12% |
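To compare the three example libraries at a glance, the insert profiles above can be tallied into the share of read pairs with inserts longer than 1 kbp. This is a simple sketch with hypothetical variable and function names; the percentages are copied from the tables above. The bins shown do not add up to 100%, this page does not say what the remainder represents, and no pass/fail threshold is implied.

```python
# Insert-size bins in table order; percentages copied from the QC tables above.
BINS = ["0-1 kbp", "1-100 kbp", "100 kbp-1 Mbp", "1-3 Mbp", "3-5 Mbp", ">5 Mbp"]
PROFILES = {
    "Zea mays Ab10": [14.96, 1.48, 0.62, 0.3, 0.16, 1.63],
    "Oryza sativa": [34.2, 3.31, 2.09, 1.37, 0.84, 2.65],
    "Homo sapiens": [43.48, 5.21, 0.05, 0.04, 0.03, 1.12],
}


def long_range_share(profile):
    """Sum the percentages of all bins with inserts longer than 1 kbp."""
    # The first bin (0-1 kbp) is skipped; everything after it is > 1 kbp.
    return round(sum(profile[1:]), 2)


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        print(f"{name}: {long_range_share(profile)}% of read pairs with inserts > 1 kbp")
```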
TnSeq Libraries & Sequencing
For questions about TnSeq library preparation and sample submission, please contact Magdy Alabady (malabady@uga.edu).