SIMAP--the database of all-against-all protein sequence similarities and annotations with new interfaces and increased coverage

Roland Arnold, Florian Goldenberg, Hans-Werner Mewes, Thomas Rattei

Research output: Contribution to journal › Article › peer-review



The Similarity Matrix of Proteins (SIMAP) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith-Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads.
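To make the idea of an all-against-all similarity matrix concrete, the following is a minimal illustrative sketch, not SIMAP's implementation. It assumes a simple match/mismatch scoring scheme with a linear gap penalty, whereas a production pipeline would typically use an amino-acid substitution matrix, affine gap penalties, and a heuristic prefilter such as FASTA before running Smith-Waterman. The function names `smith_waterman_score` and `all_against_all`, the scoring parameters, and the toy sequences are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Assumption: simple match/mismatch scoring with a linear gap penalty,
    chosen for brevity rather than biological realism.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def all_against_all(seqs: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Score every unordered pair of sequences once."""
    return {
        (id_a, id_b): smith_waterman_score(seq_a, seq_b)
        for (id_a, seq_a), (id_b, seq_b) in combinations(seqs.items(), 2)
    }


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = {"p1": "MKTAYIAK", "p2": "MKTAYLAK", "p3": "GGSWPLQE"}
    for pair, score in all_against_all(toy).items():
        print(pair, score)
```

The number of pairs grows quadratically with the number of sequences, which is why pre-calculating and storing the matrix, as the abstract describes, saves repeated computation.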

Original language: English
Pages (from-to): D279-D284
Number of pages: 6
Journal: Nucleic Acids Research
Issue number: D1
Early online date: 26 Oct 2013
Publication status: Published - Jan 2014


  • Databases, Protein
  • Internet
  • Molecular Sequence Annotation
  • Protein Structure, Tertiary
  • Sequence Alignment
  • Sequence Analysis, Protein
  • User-Computer Interface
  • Journal Article
  • Research Support, Non-U.S. Gov't

