SIMAP: the similarity matrix of proteins

Thomas Rattei, Roland Arnold, Patrick Tischler, Dominik Lindner, Volker Stümpflen, H Werner Mewes

Research output: Contribution to journal › Article › peer-review

38 Citations (Scopus)

Abstract

Similarity Matrix of Proteins (SIMAP) (http://mips.gsf.de/simap) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith-Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches.
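As an illustration of the Smith-Waterman algorithm named in the abstract, the following is a minimal sketch of local alignment scoring. It is not SIMAP's implementation: the function name `smith_waterman_score`, the flat match/mismatch scores and the linear gap penalty are all assumptions chosen for brevity, whereas a production protein search would typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions, not SIMAP's settings.
    """
    # Only the previous row of the dynamic-programming table is kept,
    # because each cell depends on its upper, left and upper-left neighbours.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 is what makes the alignment local:
            # a poorly scoring prefix is dropped rather than carried along.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two sequences sharing the segment "ACDE" inside unrelated flanks.
    print(smith_waterman_score("XXACDEYY", "ZZACDEWW"))
```

A pre-computed similarity matrix in the sense of the abstract would store scores like this one for sequence pairs ahead of time, so that a query becomes a lookup rather than a fresh alignment.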

Original language: English
Pages (from-to): D252-6
Journal: Nucleic Acids Research
Volume: 34
Issue number: Database issue
Publication status: Published - 1 Jan 2006

Keywords

  • Databases, Protein
  • Internet
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Software
  • User-Computer Interface
  • Journal Article
  • Research Support, Non-U.S. Gov't
