Automating incidence and prevalence analysis in open cohorts

Neil Cockburn*, Ben Hammond, Illin Gani, Samuel Cusworth, Aditya Acharya, Krishna Gokhale, Rasiah Thayakaran, Francesca Crowe, Sonica Minhas, William Parry Smith, Beck Taylor, Krishnarajah Nirantharakumar, Joht Singh Chandan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: Data is increasingly used for improvement and research in public health, especially administrative data such as that collected in electronic health records. Patients enter and exit these typically open-cohort datasets non-uniformly; this can make simple questions about incidence and prevalence time-consuming to answer and can introduce unnecessary variation between analyses. We therefore developed methods to automate analysis of incidence and prevalence in open cohort datasets, to improve the transparency, productivity and reproducibility of analyses.

Implementation: We provide both a code-free set of rules for incidence and prevalence that can be applied to any open cohort, and a Python Command Line Interface implementation of these rules requiring Python 3.9 or later.

General features: The Command Line Interface is used to calculate incidence and point prevalence time series from open cohort data. The ruleset can be used in developing other implementations or can be rearranged to form other analytical questions such as period prevalence.
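
To illustrate the two quantities named above, here is a minimal Python sketch. It is not the authors' ruleset and not the Command Line Interface from the repository; the `Patient` record, its field names, the example cohort and both functions are hypothetical. It assumes each patient has a single observation window and at most one event date, and it uses the conventional definitions: point prevalence is existing cases divided by patients under observation on a date, and incidence rate is new cases divided by person-time at risk.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Patient:
    # Hypothetical record layout, assumed for illustration only.
    start: date            # date the patient enters the cohort
    end: date              # date the patient exits the cohort (inclusive)
    event: Optional[date]  # date of first diagnosis, or None if never diagnosed


def point_prevalence(patients: List[Patient], index_date: date) -> Optional[float]:
    """Share of patients under observation on index_date who already have the condition."""
    observed = [p for p in patients if p.start <= index_date <= p.end]
    if not observed:
        return None  # nobody is under observation on that date
    cases = sum(1 for p in observed if p.event is not None and p.event <= index_date)
    return cases / len(observed)


def incidence_rate(patients: List[Patient], period_start: date, period_end: date) -> Optional[float]:
    """New events per person-year at risk within [period_start, period_end]."""
    events = 0
    person_days = 0
    for p in patients:
        # Time at risk begins at the later of cohort entry and period start.
        begin = max(p.start, period_start)
        # It ends at the earlier of cohort exit and period end.
        stop = min(p.end, period_end)
        if p.event is not None:
            if p.event < begin:
                continue  # already a prevalent case, so never at risk in this period
            stop = min(stop, p.event)  # follow-up stops on the event date
        if stop < begin:
            continue  # no overlap between the patient's window and the period
        person_days += (stop - begin).days + 1  # both end points count as observed days
        if p.event is not None and p.event <= stop:
            events += 1
    if person_days == 0:
        return None
    return events / (person_days / 365.25)


# Made-up cohort: patients enter and exit at different times.
example_cohort = [
    Patient(date(2020, 1, 1), date(2020, 12, 31), date(2020, 7, 1)),  # incident case in 2020
    Patient(date(2020, 1, 1), date(2020, 12, 31), None),              # never diagnosed
    Patient(date(2020, 6, 1), date(2021, 6, 30), date(2019, 3, 1)),   # diagnosed before entry
    Patient(date(2021, 1, 1), date(2021, 12, 31), None),              # enters after 2020
]

prevalence_sep_2020 = point_prevalence(example_cohort, date(2020, 9, 1))
incidence_2020 = incidence_rate(example_cohort, date(2020, 1, 1), date(2020, 12, 31))
```

In this made-up cohort, three patients are under observation on 1 September 2020 and two of them already have the condition, so the point prevalence is 2/3. Only the first two patients contribute time at risk in 2020: the third was diagnosed before entry and the fourth enters later. Decisions such as whether an event on the day of cohort entry counts as incident are exactly the kind of rule that differs between analyses; the choice made here is an assumption.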

Availability: The command line interface is freely available from https://github.com/THINKINGGroup/analogy_publication.
Original language: English
Article number: 144
Journal: BMC Medical Research Methodology
Volume: 24
Issue number: 1
Publication status: Published - 4 Jul 2024
