Relational Sequence Alignments and Logos

Andreas Karwath; Kristian Kersting

doi:10.1007/978-3-540-73847-3_29

Research output: Contribution to conference (unpublished) › Paper › peer-review

4 Citations (Scopus)

Abstract

The need to measure sequence similarity arises in many application domains and often coincides with sequence alignment: the more similar two sequences are, the better they can be aligned. Aligning sequences not only shows how similar sequences are, it also shows where there are differences and correspondences between the sequences. Traditionally, the alignment has been considered for sequences of flat symbols only. Many real-world sequences such as natural language sentences and protein secondary structures, however, exhibit rich internal structures. This is akin to the problem of dealing with structured examples studied in the field of inductive logic programming (ILP). In this paper, we introduce Real, which is a powerful yet simple approach to align sequences of structured symbols using well-established ILP distance measures within traditional alignment methods. Although straightforward, experiments on protein data and Medline abstracts show that this approach works well in practice, that the resulting alignments can indeed provide more information than flat ones, and that they are meaningful to experts when represented graphically.
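
The idea in the abstract, plugging a distance between structured symbols into a standard alignment procedure, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the term encoding as Python tuples, the Nienhuys-Cheng-style term distance, the Needleman-Wunsch-style global alignment with a fixed gap cost, and the example atoms. It should not be read as the Real implementation.

```python
from typing import List, Optional, Sequence, Tuple, Union

# Assumed encoding: a ground term is a constant (str) or a compound
# term written as a tuple (functor, arg1, ..., argn).
Term = Union[str, tuple]


def term_distance(s: Term, t: Term) -> float:
    """Nienhuys-Cheng-style distance between two ground terms, in [0, 1]."""
    if s == t:
        return 0.0
    # Differing constants, or a constant versus a compound term.
    if isinstance(s, str) or isinstance(t, str):
        return 1.0
    # Compound terms with different functor or arity.
    if s[0] != t[0] or len(s) != len(t):
        return 1.0
    n = len(s) - 1  # arity; n >= 1 here because the terms differ
    return sum(term_distance(a, b) for a, b in zip(s[1:], t[1:])) / (2 * n)


def align(xs: Sequence[Term], ys: Sequence[Term], gap: float = 1.0
          ) -> Tuple[float, List[Tuple[Optional[Term], Optional[Term]]]]:
    """Global alignment minimising summed term distances plus gap costs."""
    n, m = len(xs), len(ys)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + term_distance(xs[i - 1], ys[j - 1]),
                cost[i - 1][j] + gap,   # gap in ys
                cost[i][j - 1] + gap,   # gap in xs
            )
    # Trace back one optimal alignment; None marks a gap.
    pairs: List[Tuple[Optional[Term], Optional[Term]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j]
                == cost[i - 1][j - 1] + term_distance(xs[i - 1], ys[j - 1])):
            pairs.append((xs[i - 1], ys[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + gap:
            pairs.append((xs[i - 1], None))
            i -= 1
        else:
            pairs.append((None, ys[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


# Hypothetical secondary-structure atoms: he = helix, st = strand.
xs = [("he", "alpha", "long"), ("st", "plus", "short")]
ys = [("he", "alpha", "short"), ("st", "plus", "short"), ("he", "alpha", "long")]
total, pairs = align(xs, ys)
```

In this toy input the two helix atoms differ only in their length argument, so they are aligned at a cost of 0.25 instead of being treated as a complete mismatch, which is the kind of extra information a flat-symbol alignment cannot express.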

Original language	English
Pages	290-304
DOIs	https://doi.org/10.1007/978-3-540-73847-3_29
Publication status	Published - 2007

Keywords

bioinformatics, inductive logic programming, relational learning, scientific knowledge

Cite this

@conference{52d431806cb24fda920f54e77a877ce5,
title = "Relational Sequence Alignments and Logos",
abstract = "The need to measure sequence similarity arises in many application domains and often coincides with sequence alignment: the more similar two sequences are, the better they can be aligned. Aligning sequences not only shows how similar sequences are, it also shows where there are differences and correspondences between the sequences. Traditionally, the alignment has been considered for sequences of flat symbols only. Many real-world sequences such as natural language sentences and protein secondary structures, however, exhibit rich internal structures. This is akin to the problem of dealing with structured examples studied in the field of inductive logic programming (ILP). In this paper, we introduce Real, which is a powerful yet simple approach to align sequences of structured symbols using well-established ILP distance measures within traditional alignment methods. Although straightforward, experiments on protein data and Medline abstracts show that this approach works well in practice, that the resulting alignments can indeed provide more information than flat ones, and that they are meaningful to experts when represented graphically.",
keywords = "bioinformatics, inductive logic programming, relational learning, scientific knowledge",
author = "Andreas Karwath and Kristian Kersting",
year = "2007",
doi = "10.1007/978-3-540-73847-3_29",
language = "English",
pages = "290--304",
}

TY - CONF
T1 - Relational Sequence Alignments and Logos
AU - Karwath, Andreas
AU - Kersting, Kristian
PY - 2007
Y1 - 2007
AB - The need to measure sequence similarity arises in many application domains and often coincides with sequence alignment: the more similar two sequences are, the better they can be aligned. Aligning sequences not only shows how similar sequences are, it also shows where there are differences and correspondences between the sequences. Traditionally, the alignment has been considered for sequences of flat symbols only. Many real-world sequences such as natural language sentences and protein secondary structures, however, exhibit rich internal structures. This is akin to the problem of dealing with structured examples studied in the field of inductive logic programming (ILP). In this paper, we introduce Real, which is a powerful yet simple approach to align sequences of structured symbols using well-established ILP distance measures within traditional alignment methods. Although straightforward, experiments on protein data and Medline abstracts show that this approach works well in practice, that the resulting alignments can indeed provide more information than flat ones, and that they are meaningful to experts when represented graphically.
KW - bioinformatics, inductive logic programming, relational learning, scientific knowledge
U2 - 10.1007/978-3-540-73847-3_29
DO - 10.1007/978-3-540-73847-3_29
M3 - Paper
SP - 290
EP - 304
ER -