Event Date
This is a ‘LIVE COURSE’: the instructor will deliver lectures and coach attendees through the accompanying computer practicals via video link. A good internet connection is essential.
TIME ZONE – UK local time (GMT). All sessions will be recorded and made available, allowing attendees in other time zones to follow.
Genome assembly is the process of piecing together fragments of DNA to reconstruct the original genome. The genome provides crucial information for understanding genetic structure, function and variation.
In recent years, long-read sequencing technologies have revolutionized genome assembly. Long reads can span repetitive sequences and structural variants, which simplifies assembly, reduces gaps and fragmentation, helps resolve repeats, aids the detection of structural variation and improves haplotype phasing.
During this course we will look at data generated using PacBio and Oxford Nanopore, and discuss the pros and cons of both sequencing technologies and the effect they might have on genome assembly. We will look at the different tools available to generate assemblies, focussing on de novo genome assembly. Polishing using short or long reads and the introduction of Hi-C sequencing can increase the completeness of the genomes. At the different steps of the assembly process we will look at the contiguity, completeness and correctness of the generated genomes, thereby evaluating the status of the genome.
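As an illustration of what "contiguity" means in practice, the sketch below computes N50, a widely used contiguity statistic: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly size. This is a minimal sketch and not part of the course materials; the function name and the toy contig lengths are made up for the example.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    # Walk through contigs from longest to shortest
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that takes the running sum to half
        # of the assembly size defines N50
        if running >= half_total:
            return length


# Hypothetical contig lengths for two toy assemblies of equal total size
fragmented = [100] * 10
contiguous = [600, 200, 100, 100]

print("fragmented N50:", n50(fragmented))
print("contiguous N50:", n50(contiguous))
```

Both toy assemblies contain the same number of bases, but the second has a much higher N50 because most of its sequence sits in one long contig. N50 says nothing about completeness or correctness, which have to be assessed separately.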
Once a genome has been assembled the next step is annotation. Genome annotation involves identifying and mapping locations of genes and other functional elements within the sequenced genome. We will take a look at the differences between prokaryote and eukaryote genomes and the tools available for annotation. We will talk about steps to improve annotation once the automatic annotation has been made.
By the end of the course, participants should:
Academics and post-graduate students working on projects related to spatial data and applied researchers and analysts in public, private or third-sector organizations who need the reproducibility, speed and flexibility of a command-line language
Availability – 22
Duration – 3 days
Contact hours – Approx. 16 hours
ECTS – Equal to 2 ECTS credits
Language – English
Intermediate-level lectures interspersed with hands-on mini practicals. Access to a Linux VM and data sets for the practicals will be provided by the instructors. Time will be available during the course for participants to ask questions regarding their own projects.
Good familiarity with genomics studies.
Good familiarity with Linux will be helpful.
COMING SOON…
PLEASE READ – CANCELLATION POLICY
Cancellations/refunds are accepted as long as the course materials have not been accessed.
There is a 20% cancellation fee to cover administration and possible bank fees.
If you need to discuss cancelling, please contact oliverhooker@prstatistics.com.
Day 1 Classes from 10:00 – 15:30
Data QC and preprocessing and genome assembly
• Data QC and preprocessing using Nanopack
• Genome assembly using Redbean, Shasta, Canu
• PacBio assembly using hifiasm
• Genome evaluation
Day 2 Classes from 10:00 – 15:30
Genome polishing and introduction to Hi-C
• Polishing created genomes using Racon
• Assembly using Hi-C data
Day 3 Classes from 10:00 – 15:30
Genome annotation
• Genome annotation using Prokka
• Look at genome annotation using AUGUSTUS/BRAKER
Kathryn recently joined the Edinburgh Genomics team as the Genomics and Bioinformatics Training Coordinator. With a diverse background in bioinformatics and molecular biology, she specializes in phylogenetics and viral classification.