Wellington: a novel method for the accurate identification of digital genomic footprints from DNase-seq data

Jason Piper; Markus C Elze; Pierre Cauchy; Peter N Cockerill; Constanze Bonifer; Sascha Ott

doi:10.1093/nar/gkt850

Research output: Contribution to journal › Article › peer-review

113 Citations (Scopus)

Abstract

The expression of eukaryotic genes is regulated by cis-regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins. One of the great challenges in the gene regulation field is to characterise these elements. This involves the identification of transcription factor (TF) binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase and the subsequent analysis of regions protected from cleavage (DNase footprinting) has for many years been used to identify specific binding sites occupied by TFs at individual cis-elements with high resolution. This methodology has recently been adapted for high-throughput sequencing (DNase-seq). In this study, we describe an imbalance in the DNA strand-specific alignment information of DNase-seq data surrounding protein-DNA interactions that allows accurate prediction of occupied TF binding sites. Our study introduces a novel algorithm, Wellington, which considers the imbalance in this strand-specific information to efficiently identify DNA footprints. This algorithm significantly enhances specificity by reducing the proportion of false positives and requires significantly fewer predictions than previously reported methods to recapitulate an equal amount of ChIP-seq data. We also provide an open-source software package, pyDNase, which implements the Wellington algorithm to interface with DNase-seq data and expedite analyses.
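The abstract does not spell out how the strand imbalance is scored, so the following is only an illustrative sketch of the general idea, not the published Wellington algorithm and not the pyDNase API. It assumes that, around an occupied site, forward-strand cuts pile up in the upstream flank and reverse-strand cuts in the downstream flank, and it tests the candidate footprint for depletion on each strand with a one-sided binomial test. The function names, the `shoulder` parameter, and the way the two p-values are combined are all hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    total = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return min(1.0, total)  # guard against floating-point overshoot


def strand_imbalance_score(fwd, rev, start, end, shoulder):
    """Score a candidate footprint [start, end) from per-base cut counts.

    fwd / rev are lists of DNase cut counts on the forward / reverse strand.
    Assumption (illustrative only): if cuts were spread evenly over the
    footprint plus one flank, the footprint's share of cuts would follow a
    binomial distribution with p = footprint_len / (footprint_len + shoulder).
    A small score means fewer cuts inside the footprint than expected.
    """
    if start - shoulder < 0 or end + shoulder > len(fwd) or len(fwd) != len(rev):
        raise ValueError("footprint plus shoulders must lie inside the profile")

    fp_len = end - start
    p = fp_len / (fp_len + shoulder)

    # Forward strand: compare the footprint with the upstream flank.
    fp_fwd = sum(fwd[start:end])
    up_fwd = sum(fwd[start - shoulder:start])
    p_fwd = binom_cdf(fp_fwd, fp_fwd + up_fwd, p)

    # Reverse strand: compare the footprint with the downstream flank.
    fp_rev = sum(rev[start:end])
    down_rev = sum(rev[end:end + shoulder])
    p_rev = binom_cdf(fp_rev, fp_rev + down_rev, p)

    # Hypothetical combination: product of the two one-sided p-values.
    return p_fwd * p_rev


if __name__ == "__main__":
    # Toy profile: a protected 10 bp window with strand-specific flanks.
    fwd = [5] * 10 + [0] * 10 + [1] * 10
    rev = [1] * 10 + [0] * 10 + [5] * 10
    print(strand_imbalance_score(fwd, rev, start=10, end=20, shoulder=10))
```

Using each strand only against its own flank is what distinguishes this kind of score from one computed on pooled cut counts; a flat profile with no protected window yields a much larger (less significant) value.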

Original language	English
Pages (from-to)	e201
Journal	Nucleic Acids Research
Volume	41
Issue number	21
DOIs	https://doi.org/10.1093/nar/gkt850
Publication status	Published - 25 Sept 2013

Access to Document

10.1093/nar/gkt850

Licence: Creative Commons: Attribution (CC BY)

Cite this

@article{532c6cba1ceb4bfba52e5f7c47ac5949,

title = "Wellington: a novel method for the accurate identification of digital genomic footprints from DNase-seq data",

abstract = "The expression of eukaryotic genes is regulated by cis-regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins. One of the great challenges in the gene regulation field is to characterise these elements. This involves the identification of transcription factor (TF) binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase and the subsequent analysis of regions protected from cleavage (DNase footprinting) has for many years been used to identify specific binding sites occupied by TFs at individual cis-elements with high resolution. This methodology has recently been adapted for high-throughput sequencing (DNase-seq). In this study, we describe an imbalance in the DNA strand-specific alignment information of DNase-seq data surrounding protein-DNA interactions that allows accurate prediction of occupied TF binding sites. Our study introduces a novel algorithm, Wellington, which considers the imbalance in this strand-specific information to efficiently identify DNA footprints. This algorithm significantly enhances specificity by reducing the proportion of false positives and requires significantly fewer predictions than previously reported methods to recapitulate an equal amount of ChIP-seq data. We also provide an open-source software package, pyDNase, which implements the Wellington algorithm to interface with DNase-seq data and expedite analyses.",

author = "Piper, Jason and Elze, Markus C and Cauchy, Pierre and Cockerill, Peter N and Bonifer, Constanze and Ott, Sascha",

year = "2013",

month = sep,

day = "25",

doi = "10.1093/nar/gkt850",

language = "English",

volume = "41",

pages = "e201",

journal = "Nucleic Acids Research",

issn = "0305-1048",

publisher = "Oxford University Press",

number = "21",

}

TY - JOUR

T1 - Wellington: a novel method for the accurate identification of digital genomic footprints from DNase-seq data

AU - Piper, Jason

AU - Elze, Markus C

AU - Cauchy, Pierre

AU - Cockerill, Peter N

AU - Bonifer, Constanze

AU - Ott, Sascha

PY - 2013/9/25

Y1 - 2013/9/25

AB - The expression of eukaryotic genes is regulated by cis-regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins. One of the great challenges in the gene regulation field is to characterise these elements. This involves the identification of transcription factor (TF) binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase and the subsequent analysis of regions protected from cleavage (DNase footprinting) has for many years been used to identify specific binding sites occupied by TFs at individual cis-elements with high resolution. This methodology has recently been adapted for high-throughput sequencing (DNase-seq). In this study, we describe an imbalance in the DNA strand-specific alignment information of DNase-seq data surrounding protein-DNA interactions that allows accurate prediction of occupied TF binding sites. Our study introduces a novel algorithm, Wellington, which considers the imbalance in this strand-specific information to efficiently identify DNA footprints. This algorithm significantly enhances specificity by reducing the proportion of false positives and requires significantly fewer predictions than previously reported methods to recapitulate an equal amount of ChIP-seq data. We also provide an open-source software package, pyDNase, which implements the Wellington algorithm to interface with DNase-seq data and expedite analyses.

U2 - 10.1093/nar/gkt850

DO - 10.1093/nar/gkt850

M3 - Article

C2 - 24071585

SN - 0305-1048

VL - 41

SP - e201

JO - Nucleic Acids Research

JF - Nucleic Acids Research

IS - 21

ER -
