Sunday, May 14, 2017

SFAF & I'm Not Dead Yet Technologies

Jonathan Jacobs posted his annual reminder that the Sequencing, Finishing and Analysis in the Future (SFAF) meeting is this week.  Alas, in years past that meeting hasn't had many tweeters beyond Jonathan himself, but perhaps this year there will be more.  There's a glut of genomics conferences to track, compile tweets from and opine on -- besides London Calling, there's been SMRT Leiden and Biology of Genomes, all in the span of two weeks!  This post will be a bit short on actual writing; mostly it flags some talks at SFAF that grabbed my attention.  What I realized is that the SFAF talks illustrate how a number of technologies I consider effectively dead retain significant attention.


I'll admit that I tend to write technologies off after a certain point; once I've decided that a better tech exists I expect everyone else to see it the same way.  But that isn't reasonable.  Sometimes I'm just plain premature in my assessment, and other times there may be a variety of factors which preserve a tech.  For example, once a large pipeline is set up to handle a certain type of data, there is a significant switching cost (both actual and psychological) to changing to a different technology.

Fluorescent Sanger sequencing is not a technology I have fully written off, but one I see as being confined to a few niches.  One of those is sequencing in small batches that just aren't cost effective for any newer technologies due to library construction costs (artisanal sequencing?).  Have one small construct to validate?  Sanger is likely the route.

I'm not involved in clinical genomics, but clearly there is significant debate over the utility of using Sanger to validate variants found by Illumina or other massively parallel sequencing approaches.  The anti-Sanger argument is that sufficient depth of coverage ensures good calls.  For the pro-Sanger argument I've seen several justifications.  First, Sanger is largely orthogonal in nature and so can scotch false positives caused by various flavors of Illumina artifacts.  Second, by running a completely separate analysis on a different patient sample, the risk of sample swaps can be mitigated.  The genomics group at Baylor College of Medicine has a talk on large-scale primer design for variant validation.
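Since the Baylor abstract concerns primer design at scale, here is roughly what the core of that problem looks like in code.  This is a minimal sketch under my own assumptions (the flank size, the read-margin value, and the Wallace-rule Tm estimate are all illustrative choices), not the Baylor pipeline:

```python
# Sketch of picking a Sanger validation amplicon around a called variant.
# Parameters are illustrative assumptions, not any production pipeline's values.

def flanking_amplicon(variant_pos, flank=250, read_margin=50):
    """Return (start, end) of a PCR amplicon that keeps the variant
    well inside the readable portion of a Sanger trace."""
    # Sanger traces are unreliable in the first stretch after the primer,
    # so the variant must sit at least `read_margin` bp from either primer.
    assert flank > read_margin
    return variant_pos - flank, variant_pos + flank

def melting_temp(primer):
    """Wallace-rule Tm estimate, 2*(A+T) + 4*(G+C); adequate for short primers."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

print(flanking_amplicon(1_000_000))          # (999750, 1000250)
print(melting_temp("ACGTACGTACGTACGTACGT"))  # 10 AT + 10 GC -> 60
```

A real pipeline would of course also screen candidate primers against the reference for off-target hits and avoid placing them over common SNPs, which is where the "large scale" difficulty lives.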

Mate-pair technology really is behind my eight ball, as it never really solved any problems for me; we tried it a few times with very modest success.  Whatever my personal feelings, the value of mate pairs is, in my opinion, simply being erased by long read technologies.  It is certainly the case in the small genome world that, for similar input material and dollar cost, a long read library can be generated which will match or exceed the results from a conventional mate pair library.

For larger genomes or metagenomes, a mate pair library may still be able to gain greater physical coverage than long reads on PacBio or nanopore.  But in that space, linked reads from 10x Genomics or optical maps or HiC methods are far more likely in my view to deliver significant improvements in scaffolding than the typical mate pairs.  Even with long read technologies, if $1K can buy you 8X coverage or so of a human-class genome with 20kb+ reads, that's likely to have a significant impact on contig N50 of a short read assembly.  I guess that's another bias on my part: scaffolds are nice but contigs are more important.  You may feel differently.  In any case, an abstract on insect genomes that has an interesting bit about sequencing museum specimens inexpensively also talks quite a lot about mate pair construction.  Another abstract discusses new algorithms for scaffolding.
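To make the $1K-for-8X arithmetic above concrete, here is a back-of-envelope sketch of the implied numbers.  The 3.1 Gb genome size and the 20 kb read length are my own assumptions for illustration, not quoted figures from any vendor:

```python
# Back-of-envelope for "$1K buys ~8X of a human-class genome with 20kb+ reads".
# Genome size and read length are assumptions, not vendor specs.

GENOME_GB = 3.1          # human-class genome size, in Gb
COVERAGE = 8             # target fold coverage
BUDGET = 1000.0          # dollars

yield_gb = COVERAGE * GENOME_GB       # ~24.8 Gb of read data required
cost_per_gb = BUDGET / yield_gb       # ~$40/Gb implied long-read cost
reads = yield_gb * 1e9 / 20_000       # ~1.24M reads at 20 kb each

print(round(yield_gb, 1), round(cost_per_gb), round(reads))
```

In other words, the claim holds if long-read data can be had for roughly $40/Gb; at higher prices the coverage scales down proportionally.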

As noted above, BioNano optical maps are currently a good way to generate long-range information for de novo assemblies.  The conference features several talks using Nabsys 2.0's electronic mapping technology.  In the microbial genome space I am skeptical of the value of mapping technologies, as long read sequencing has become so routine and sequences will always outclass maps for biological value.  For the counterargument, Nabsys has a talk on using their technology for Bordetella genomes, and someone from the Centers for Disease Control will be speaking on their experience with optical mapping, with over 2000 maps generated.

Another talk illustrating a tug-of-war between technologies is one trying to optimize primer design for pathogen detection.  A very valuable goal, but given the plummeting cost of metagenomic sequencing methods, any such PCR-based method would need to be tested head-to-head against the sequencing methods on speed-to-result and sensitivity/specificity.  Perhaps the sequencing-only day isn't here yet, but it shouldn't be many years off.
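The head-to-head comparison I have in mind is simple to state: run the same truth-labeled sample panel through both the PCR assay and the metagenomic pipeline, then tabulate the calls.  A sketch, with made-up counts purely for illustration:

```python
# Tabulating sensitivity/specificity for a PCR assay vs. a metagenomic
# pipeline on the same truth-labeled panel. Counts are invented illustration.

def sens_spec(tp, fp, tn, fn):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

pcr = sens_spec(tp=45, fp=2, tn=48, fn=5)    # (0.9, 0.96)
meta = sens_spec(tp=48, fp=4, tn=46, fn=2)   # (0.96, 0.92)
print(pcr, meta)
```

Speed-to-result would be tracked alongside, since a PCR assay that answers in hours may still win over a sequencing run that answers in days, even at slightly worse sensitivity.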

There are a lot of other interesting-sounding talks at SFAF, from both a technology and a biology standpoint.  My rough notes on the SFAF abstract book are below, but they are just a sampling of the abstract titles.  I'll probably generate Storify pages of the tweets from the conference, though I won't fight anyone for that glory :-)

Notes

SFAF has sponsorship from many vendors, who get slots.  Not only the big players like Illumina, but also companies such as Nabsys.

Full abstract book is online, as well as a hyperlinked program.

Susan Dutcher talk on ciliopathies -- if I remember correctly, she once worked on Chlamydomonas, which I worked with as an undergraduate

The BD CLiC – a fully integrated miniaturized library prep system enabling PCR-free whole genome library prep --  I saw CLiC at Marco Island back when it was still a separate Irish company.  Interesting fluidics technology, not quite microfluidics (1ul droplets).  

Sequencing the largest existing collection of historic commercial solventogenic clostridia strains to dissect industrial acetone-butanol-ethanol (ABE) fermentations -- historical note, it was such strains that kept Britain in WWI by enabling the production of key explosives


De novo assembly of complete chloroplast genomes   -- short read assemblies.  Single nanopore read chloroplast genomes are certainly plausible, but that's part of an embryonic post.

MinION Nanopore Sequencer for Human Identification or Sample Source Attribution --  nanopore talk, from Rachel Spurbeck at Battelle.  Also Winston Timp is giving Assembly and Analysis of Concurrent XDR and HMV K. pneumo Substrains Using Nanopore Sequencing -- I think last year the conference had no nanopore talks.

Next Generation Sequencing: Signature Sequence Detection For In Silico Primer Design -- always a battle between just sequencing the heck out of things and trying to use clever primer design to enrich.  Here's the latest on clever primer design -- from Noblis.


SPAdes Family of Tools for Genome Assembly and Analysis: What's New? -- but the other SPAdes group talk has a better title -- SPAdes: is there anything new we could develop?  -- that talk covers dealing with hairy repeats particularly those in non-ribosomal peptide synthetases and polyketide synthases (very near and dear to my mission)

Zika virus, drug discovery, and student projects in bioinformatics -- really cool sounding curriculum from Sandra Porter

A Method to Streamline and Miniaturize Library Preparation for Next-Gen Sequencing Using the Labcyte Echo® 525 Liquid Handler -- one of several talks from Labcyte on the Echo -- I sooooo wish these machines were more affordable, as they are utterly cool.

I'm not dead yet technology talk #1: Primer Design Pipeline for Large Scale Sanger Validation

I'm not dead yet technology talk #2: Sequencing insect genomes on a budget -- abstract has a lot of talk of mate pairs -- but also ~$200/sample to prepare & sequence butterflies from museum collections

I'm not dead yet technology talk #3: A Global Optimization Approach for Scaffolding and Completing Genome Assemblies -- scaffolding using mate pairs

Bacterial genome reduction as a result of short read sequence assembly -- cautionary talk on how short read assemblies can lose important information


Assembly of heterozygous genomes -- yet another de Bruijn assembler (Bwise), claiming to better leverage paired-end reads and other long-range information, and to assemble human with only 50GB of RAM.
