Research

Ember supercomputing cluster

As a result of technological advances, sequencing across species and across individuals is proceeding at an extremely rapid pace, and the resulting explosion of genomic data is becoming difficult to manage. Many algorithmic approaches in bioinformatics rely on direct comparisons of nucleotide sequences and on optimization techniques that do not scale to data of this magnitude.

We believe that breakthroughs can be achieved by considering how features of big biological data, and especially genomic sequence data, can inspire new computational platforms. Such an approach will lead to major advances in scientific understanding through a combination of improved statistical models, algorithmic advances, efficient parallel computation, innovative hardware extensions, and genomic data management.

Learn more about three projects currently underway in this area:

MRI Grant (NSF)

Big Data to Knowledge KnowEnG Center (NIH)

CCBGM (NSF)