User:Wayne Decatur/Sequence analysis tools

From Proteopedia



Have not Categorized Yet




  • Circos on Jupyter - Circos in a browser-based Jupyter environment, launched with one click via Binder so it is ready to use in the browser. That page also links to the main Circos resources. The launched notebooks illustrate ways to easily work with the output in Python.


Random sequence generators

Sequence shufflers

Extract physico-chemical data from Protein or DNA sequences

  • Seq2Feature webserver is a comprehensive web-based feature extraction tool that computes protein and DNA sequence-driven features. It can calculate 252 protein-based and 42 DNA-based descriptors. Major protein sequence-based descriptors include physico-chemical, energetic and conformational properties, mutation matrices and contact potentials. There is a corresponding article here.
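To give a sense of what a sequence-derived descriptor looks like, here is a minimal Python sketch that computes amino acid composition, one of the simplest protein descriptors. It is only an illustration under my own assumptions and is not how Seq2Feature computes its features; the function name `amino_acid_composition` and the example sequence are made up for this sketch.

```python
from collections import Counter

# The 20 standard amino acids as one-letter codes.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper for illustration; non-standard characters are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in STANDARD_AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in STANDARD_AMINO_ACIDS}


# Made-up ten-residue example sequence.
composition = amino_acid_composition("MKTAYIAKQR")
```

The result is a 20-element dictionary of fractions that sum to 1, which is the kind of fixed-length numeric vector that descriptor tools generally produce from a variable-length sequence.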


Pattern Matching

Infernal builds consensus RNA secondary structure profiles called covariance models (CMs), and uses them to search nucleic acid sequence databases for homologous RNAs, or to create new sequence- and structure-based multiple sequence alignments.

Some sequence analysis but mostly OTHER

  • BioCyc Database Collection - "BioCyc is a collection of 3530 Pathway/Genome Databases (PGDBs), with tools for understanding their data. Cellular Overview image generated by Pathway Tools. Explore Metabolic Maps for Thousands of Organisms. RouteSearch: Search for Paths through the Metabolic Network. Cross-Organism Search form generated by Pathway Tools. New: Search All of BioCyc for Genes, Proteins, Pathways. Search all of BioCyc or designated taxonomic groups for named genes, proteins, metabolites, pathways. Multiple Sequence Alignment results generated by Pathway Tools using MUSCLE. PatMatch query and results by Pathway Tools. SmartTable display generated by Pathway Tools. Metabolomics Data Analysis. Cellular Overview Omics Viewer image generated by Pathway Tools. Gene Expression Data Analysis. Multi-Genome Browser. Comparative Genome Analysis."

Good E. coli database

  • EcoProDB - The E. coli protein database (EcoProDB) integrates protein information identified on 2-D gels with other resources to provide a comparative platform for the expression levels of many heterogeneous proteins under different genetic and environmental conditions, using an interactive interface and search mechanism.


  • HOMER - "Software for motif discovery and next-gen sequencing analysis". Nice in that it actually explains some of the details and advantages of the browsers and file types.

Nucleic acid system building and DNA structure design

  • NUPACK - "NUPACK is a growing software suite for the analysis and design of nucleic acid structures, devices, and systems." It also seems to be able to do melting temperature and free energy calculations.

Fungal Genome Resources

1011 Saccharomyces cerevisiae genomes, associated with Genome evolution across 1,011 Saccharomyces cerevisiae isolates. Peter J, De Chiara M, Friedrich A, Yue JX, Pflieger D, Bergström A, Sigwalt A, Barre B, Freel K, Llored A, Cruaud C, Labadie K, Aury JM, Istace B, Lebrigand K, Barbry P, Engelen S, Lemainque A, Wincker P, Liti G, Schacherer J. Nature. 2018 Apr;556(7701):339-344. doi: 10.1038/s41586-018-0030-5. Epub 2018 Apr 11. PMID: 29643504.

332 budding yeasts, associated with Tempo and Mode of Genome Evolution in the Budding Yeast Subphylum. Shen XX, Opulente DA, Kominek J, Zhou X, Steenwyk JL, Buh KV, Haase MAB, Wisecaver JH, Wang M, Doering DT, Boudouris JT, Schneider RM, Langdon QK, Ohkuma M, Endoh R, Takashima M, Manabe RI, Čadež N, Libkind D, Rosa CA, DeVirgilio J, Hulfachor AB, Groenewald M, Kurtzman CP, Hittinger CT, Rokas A. Cell. 2018 Nov 29;175(6):1533-1545.e20. doi: 10.1016/j.cell.2018.10.023. Epub 2018 Nov 8. PMID: 30415838. (Figshare corresponding to the paper.) There is also a nice graphic of the situation related to the 1000 fungal genomes project, although I am not sure how current it is.

For genomic arrangement (synteny) comparisons/Fungal Genomics Resources

Synteny Viewer listed under every SGD gene on Sequence tab, near bottom of page

Yeast Gene Order Browser (YGOB)

RNA Structure Analysis

  • rna-tools - (previously known as 'rna-pdb-tools'): a toolbox to analyze sequences, structures and simulations of RNA. (It takes some navigating around to find what you want because a lot is there.)

Analyze DNA curvature

  • bendit-binder - use the software to predict DNA curvature from DNA sequences with the power of the Jupyter ecosystem served via

Sequence Logo Generation

Installable software for fine-tuning sequence alignments

Windows equivalent is here but I have NOT tried it.

Python-based utilities

  • seqmagick - An imagemagick-like frontend to Biopython SeqIO. For example, it can convert from fasta to phylip, remove gaps from a fasta-formatted sequence, and describe all FASTA files in the current directory. Requires Biopython.
  • See also the 'Binder'/notebook-related items earlier on this page, as I have usually worked out Python code to shuttle the output of other command-line based software into Python. See also the notebook-related items here, as I sometimes demonstrate script usage in launchable notebooks.
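As an illustration of one of the operations mentioned above, removing gaps from FASTA-formatted sequences, here is a minimal sketch in plain Python using only the standard library. It is not seqmagick's implementation and does not use its interface or Biopython; the helper names `read_fasta` and `remove_gaps` and the example data are assumptions made up for this sketch.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples.

    Hypothetical helper for illustration; sequence lines are concatenated.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def remove_gaps(records, gap_chars="-."):
    """Return records with gap characters stripped from each sequence."""
    return [
        (header, "".join(c for c in seq if c not in gap_chars))
        for header, seq in records
    ]


# Made-up aligned input with '-' and '.' used as gap characters.
aligned = ">seq1\nAC-GT\n--A\n>seq2\nA.CG-\n"
ungapped = remove_gaps(read_fasta(aligned))
```

For real work, a tested parser such as Biopython's SeqIO, or seqmagick itself, is the better choice, since it handles formats and edge cases this sketch ignores.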

My own sequence work-related code

Proteopedia Page Contributors and Editors (what is this?)

Wayne Decatur
