These are alignments for the trio of comparative genomics papers published in Nature in November 2020 by the Zoonomia Project:
- Zoonomia Consortium, Genereux, D.P., Serres, A. et al. A comparative genomics multitool for scientific discovery and conservation. Nature 587, 240–245 (2020). https://doi.org/10.1038/s41586-020-2876-6
- Feng, S., Stiller, J., Deng, Y. et al. Dense sampling of bird diversity increases power of comparative genomics. Nature 587, 252–257 (2020). https://doi.org/10.1038/s41586-020-2873-9
The resources for these papers are:
- 241-way mammalian alignment (Zoonomia), update (V2)
- Important note:
- A second domestic dog sample (GCA_004027395) is incorrectly named Canis_lupus in the data files.
- HAL alignment: (806 gigabytes)
- URL: https://cgl.gi.ucsc.edu/data/cactus/241-mammalian-2020v2.hal
- MD5: bdd6f70aebfd325f459485ffe43f50c8
- Newick tree (with PHAST estimated branch lengths from 242-way tree): https://cgl.gi.ucsc.edu/data/cactus/241-mammalian-2020v2.phast-242.nh
- UCSC MAF Alignment (1.0 terabytes), human reference
- URL: https://cgl.gi.ucsc.edu/data/cactus/241-mammalian-2020v2b.maf.gz
- MD5: 7dde48fe2c94df13930fd70e9417dbe9
- Note: this is an update of 241-mammalian-2020v2.maf.gz to include the MAF header
- UCSC Browser hub:
- Human PhyloP scores:
- Spreadsheet describing data in Zoonomia alignment:
- 363-way avian alignment
- HAL alignment: (389 gigabytes)
- URL: https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020.hal
- MD5: baefcc9713674e2b4cd8156b7186aec6
- Newick tree (with PHAST estimated branch lengths): https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020-phast.nh
- UCSC MAF Alignment (160 gigabytes), Gallus gallus reference
- URL: https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020-hub/Gallus_gallus/Gallus_gallus.maf.gz
- MD5: edde7b99cd6c1a5153a2d028a4215844
- UCSC Browser hub:
- MAF Alignment, re-exported with Cactus v2.8.4, with correct coordinates for non-reference genomes (222 gigabytes), Gallus gallus reference
- Log (including command line): https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020.fix.maf.gz.log
- URL: https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020.fix.maf.gz
- MD5: 7a6aa811a85154e1f6726362d5017e0d
- URL (duplications filtered with mafDuplicateFilter): https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020.fix.single-copy.maf.gz
- MD5 (duplications filtered with mafDuplicateFilter): b1618e3fc008d462faf9d09166381d5b
- 605-way combined mammalian and avian alignment (1.1 terabytes) [important note: this includes the deprecated 242-way mammalian alignment with a mislabeled primate assembly, see below]
- HAL alignment: (160 gigabytes)
- URL: https://cgl.gi.ucsc.edu/data/cactus/605-vertebrate-2020.hal
- MD5: 8dcf28e4d475db4af66caa026090e10e
- Newick tree (with PHAST estimated branch lengths): https://cgl.gi.ucsc.edu/data/cactus/605-vertebrate-2020-phast.nh
- UCSC Browser hub:
- Deprecated 242-way mammalian alignment
The alignment used in the paper analysis was the 242-way mammalian alignment. An error was subsequently discovered that affects alignment quality between primates and non-primates. It is corrected in the 241-way mammalian alignment listed above, which removes a duplicate genome (Carlito_syrichta) from the primates.
The alignment files are in HAL format, and the human PhyloP scores are in BigWig format. The hub links point to UCSC Genome Browser assembly hubs: follow a link to open the hub in the UCSC Browser, or see the section on using unlisted hubs in the browser documentation.
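Because the files listed above range from hundreds of gigabytes to a terabyte, it is worth checking each download against its listed MD5 before use. The following is a minimal sketch in Python, assuming the file has already been downloaded to local disk. The function names and the 1 MiB chunk size are illustrative choices, not part of the published resources.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Chunked reading keeps memory use constant, which matters for files
    that are hundreds of gigabytes in size.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 with a listed value, ignoring case and whitespace."""
    return md5_of_file(path) == listed_md5.strip().lower()
```

For example, if the avian HAL file were saved locally as `363-avian-2020.hal` (an assumed local path), `matches_listed_md5("363-avian-2020.hal", "baefcc9713674e2b4cd8156b7186aec6")` should return `True` for an intact download.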
For other Zoonomia alignments and data sets, please see https://zoonomiaproject.org/the-data/.
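The MAF files listed above are gzip-compressed text, so the genome names they contain (for instance, to see how often the mislabeled `Canis_lupus` sample appears) can be inspected without decompressing the whole file. The sketch below streams a MAF file and counts how many alignment blocks each genome occurs in. It assumes that each `s` line names its source as `genome.sequence` and that genome names contain no `.` character; the function name and the `max_blocks` parameter are hypothetical helpers for illustration.

```python
import gzip
from collections import Counter


def genome_block_counts(maf_path, max_blocks=None):
    """Count the number of alignment blocks in which each genome appears.

    Assumes MAF "s" lines of the form:
        s <genome>.<sequence> <start> <size> <strand> <srcSize> <text>
    and that the genome name is everything before the first "." (an assumption).
    """
    counts = Counter()
    seen_in_block = set()
    blocks = 0
    opener = gzip.open if maf_path.endswith(".gz") else open
    with opener(maf_path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue
            if fields[0] == "a":
                # A new block starts: flush the genomes seen in the previous one.
                counts.update(seen_in_block)
                seen_in_block = set()
                blocks += 1
                if max_blocks is not None and blocks > max_blocks:
                    break
            elif fields[0] == "s" and len(fields) >= 2:
                seen_in_block.add(fields[1].split(".", 1)[0])
    # Flush the final block (empty if the loop stopped at max_blocks).
    counts.update(seen_in_block)
    return counts
```

Passing a small `max_blocks` value gives a quick look at the naming scheme without reading the full multi-hundred-gigabyte file.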
Contacts
Benedict Paten, Mark Diekhans