SIMAP--structuring the network of protein similarities

Thomas Rattei; Patrick Tischler; Roland Arnold; Franz Hamberger; Jörg Krebs; Jan Krumsiek; Benedikt Wachinger; Volker Stümpflen; Werner Mewes

doi:10.1093/nar/gkm963

Cancer and Genomic Sciences

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.

Original language	English
Pages (from-to)	D289-92
Journal	Nucleic Acids Research
Volume	36
Issue number	Database issue
DOIs	https://doi.org/10.1093/nar/gkm963
Publication status	Published - Jan 2008

Keywords

Databases, Protein
Internet
Protein Structure, Tertiary
Proteins
Sequence Alignment
Sequence Analysis, Protein
User-Computer Interface
Journal Article
Research Support, Non-U.S. Gov't

Cite this

@article{9733515632784ade9d799fde8aef8692,

title = "SIMAP--structuring the network of protein similarities",

abstract = "Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.",

keywords = "Databases, Protein, Internet, Protein Structure, Tertiary, Proteins, Sequence Alignment, Sequence Analysis, Protein, User-Computer Interface, Journal Article, Research Support, Non-U.S. Gov't",

author = "Thomas Rattei and Patrick Tischler and Roland Arnold and Franz Hamberger and J{\"o}rg Krebs and Jan Krumsiek and Benedikt Wachinger and Volker St{\"u}mpflen and Werner Mewes",

year = "2008",

month = jan,

doi = "10.1093/nar/gkm963",

language = "English",

volume = "36",

pages = "D289--92",

journal = "Nucleic Acids Research",

issn = "0305-1048",

publisher = "Oxford University Press",

number = "Database issue",

}

TY - JOUR

T1 - SIMAP--structuring the network of protein similarities

AU - Rattei, Thomas

AU - Tischler, Patrick

AU - Arnold, Roland

AU - Hamberger, Franz

AU - Krebs, Jörg

AU - Krumsiek, Jan

AU - Wachinger, Benedikt

AU - Stümpflen, Volker

AU - Mewes, Werner

PY - 2008/1

Y1 - 2008/1

N2 - Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.

AB - Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers approximately 17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.

KW - Databases, Protein

KW - Internet

KW - Protein Structure, Tertiary

KW - Proteins

KW - Sequence Alignment

KW - Sequence Analysis, Protein

KW - User-Computer Interface

KW - Journal Article

KW - Research Support, Non-U.S. Gov't

U2 - 10.1093/nar/gkm963

DO - 10.1093/nar/gkm963

M3 - Article

C2 - 18037617

SN - 0305-1048

VL - 36

SP - D289-92

JO - Nucleic Acids Research

JF - Nucleic Acids Research

IS - Database issue

ER -
