TY - JOUR
T1 - High quality genome annotation and expression visualisation of a mupirocin-producing bacterium
AU - Haines, Anthony S
AU - Kendrew, Steve G
AU - Crowhurst, Nicola
AU - Stephens, Elton R
AU - Connolly, Jack
AU - Hothersall, Joanne
AU - Miller, Claire E
AU - Collis, Andrew J
AU - Huckle, Benjamin D
AU - Thomas, Christopher M
PY - 2022/5/5
Y1 - 2022/5/5
N2 - Pseudomonas strain NCIMB10586, in the P. fluorescens subgroup, produces the polyketide antibiotic mupirocin and has potential as a host for industrial production of a range of valuable products. To underpin further studies on its genetics and physiology, we have used a combination of standard and atypical approaches to achieve a quality of genome sequence and annotation above current standards for automated pathways. Assembly of Illumina reads to a PacBio genome sequence created a retrospectively hybrid assembly, identifying and fixing 415 sequencing errors that would otherwise have affected almost 5% of annotated coding regions. Our annotation pipeline combined automation based on related well-annotated genomes and stringent, partially manual, tests for functional features. The strain was close to P. synxantha and P. libaniensis and was found to be highly similar to a strain being developed as a weed-pest control agent in Canada. Since mupirocin is a secondary metabolite whose production is switched on late in exponential phase, we carried out RNAseq analysis over an 18 h growth period and have developed a method to normalise RNAseq samples as a group, rather than pair-wise. To review such data we have developed an easily interpreted way to present the expression profiles across a region, or the whole genome, at a glance. At the 2 h granularity of our time-course, the mupirocin cluster increases in expression as an essentially uniform bloc, although the mupirocin resistance gene stands out as being expressed at all the time points.
AB - Pseudomonas strain NCIMB10586, in the P. fluorescens subgroup, produces the polyketide antibiotic mupirocin and has potential as a host for industrial production of a range of valuable products. To underpin further studies on its genetics and physiology, we have used a combination of standard and atypical approaches to achieve a quality of genome sequence and annotation above current standards for automated pathways. Assembly of Illumina reads to a PacBio genome sequence created a retrospectively hybrid assembly, identifying and fixing 415 sequencing errors that would otherwise have affected almost 5% of annotated coding regions. Our annotation pipeline combined automation based on related well-annotated genomes and stringent, partially manual, tests for functional features. The strain was close to P. synxantha and P. libaniensis and was found to be highly similar to a strain being developed as a weed-pest control agent in Canada. Since mupirocin is a secondary metabolite whose production is switched on late in exponential phase, we carried out RNAseq analysis over an 18 h growth period and have developed a method to normalise RNAseq samples as a group, rather than pair-wise. To review such data we have developed an easily interpreted way to present the expression profiles across a region, or the whole genome, at a glance. At the 2 h granularity of our time-course, the mupirocin cluster increases in expression as an essentially uniform bloc, although the mupirocin resistance gene stands out as being expressed at all the time points.
KW - Anti-Bacterial Agents/metabolism
KW - Molecular Sequence Annotation
KW - Mupirocin
KW - Pseudomonas fluorescens/genetics
KW - Retrospective Studies
KW - Sequence Analysis, DNA/methods
UR - http://www.scopus.com/inward/record.url?scp=85129776998&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0268072
DO - 10.1371/journal.pone.0268072
M3 - Article
C2 - 35511780
SN - 1932-6203
VL - 17
JO - PLoS ONE
JF - PLoS ONE
IS - 5
M1 - e0268072
ER -