Computational Genomics Laboratory (CGL)

A Flow Procedure for Linearization of Genome Sequence Graphs

Oct 5, 2018 | Publications

By: David Haussler, Maciej Smuga-Otto, Jordan M. Eizenga, Benedict Paten, Adam M. Novak, Sergei Nikitin, Maria Zueva, and Dmitrii Miagkov

Abstract
Efforts to incorporate human genetic variation into the reference human genome have converged on the idea of a graph representation of genetic variation within a species, a genome sequence graph. A sequence graph represents a set of individual haploid reference genomes as paths in a single graph. When that set of reference genomes is sufficiently diverse, the sequence graph implicitly contains all frequent human genetic variations, including translocations, inversions, deletions, and insertions. In representing a set of genomes as a sequence graph, one encounters certain challenges. One of the most important is the problem of graph linearization, essential both for efficiency of storage and access, and for natural graph visualization and compatibility with other tools. The goal of graph linearization is to order nodes of the graph in such a way that operations such as access, traversal, and visualization are as efficient and effective as possible. A new algorithm for the linearization of sequence graphs, called the flow procedure (FP), is proposed in this article. Comparative experimental evaluation of the FP against other algorithms shows that it outperforms its rivals in the metrics most relevant to sequence graphs.
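To make the linearization goal concrete, here is a minimal Python sketch. It is not the flow procedure itself, which the abstract does not specify. It only builds a toy sequence graph from haploid paths and scores a candidate node ordering. The two scores used here, the count of backward edges and the average cut width, are assumptions chosen for illustration. The function names and the toy graph are hypothetical as well.

```python
def build_edges(paths):
    """Collect the directed edges implied by consecutive nodes on each path."""
    edges = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            edges.add((a, b))
    return edges


def backward_edges(order, edges):
    """Count edges whose head is placed at or before their tail in the ordering."""
    pos = {node: i for i, node in enumerate(order)}
    return sum(1 for a, b in edges if pos[a] >= pos[b])


def average_cut_width(order, edges):
    """Average number of edges crossing each gap between adjacent nodes."""
    if len(order) < 2:
        return 0.0
    pos = {node: i for i, node in enumerate(order)}
    cuts = [0] * (len(order) - 1)
    for a, b in edges:
        lo, hi = sorted((pos[a], pos[b]))
        for gap in range(lo, hi):
            cuts[gap] += 1
    return sum(cuts) / len(cuts)


# Toy graph (illustrative): a reference haplotype and one with node B deleted.
paths = [
    ["A", "B", "C", "D", "E"],
    ["A", "C", "D", "E"],
]
edges = build_edges(paths)

natural_order = ["A", "B", "C", "D", "E"]
shuffled_order = ["A", "D", "B", "E", "C"]

for name, order in [("natural", natural_order), ("shuffled", shuffled_order)]:
    print(name, backward_edges(order, edges), average_cut_width(order, edges))
```

On this toy input, the natural ordering has no backward edges and a lower average cut width than the shuffled one. This is the kind of difference a linearization algorithm tries to achieve on much larger graphs.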


Citation: David Haussler, Maciej Smuga-Otto, Jordan M. Eizenga, Benedict Paten, Adam M. Novak, Sergei Nikitin, Maria Zueva, and Dmitrii Miagkov. A Flow Procedure for Linearization of Genome Sequence Graphs. Journal of Computational Biology. Jul 2018. Ahead of print. http://doi.org/10.1089/cmb.2017.0248
