Non-parametric data-driven background modelling using conditional probabilities

Andrew Chisholm, Thomas Neep, Konstantinos Nikolopoulos, Rhys Owen, Elliot Reynolds*, Júlia Cardoso Silva

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Background modelling is one of the main challenges in particle physics data analysis. Commonly employed strategies include the use of simulated events of the background processes, and the fitting of parametric background models to the observed data. However, reliable simulations are not always available or may be extremely costly to produce. As a result, in many cases uncertainties associated with the accuracy or sample size of the simulation are the limiting factor in the analysis sensitivity. At the same time, parametric models are limited by the a priori unknown functional form and parameter values of the background distribution. These issues become ever more pressing when large datasets become available, as is already the case at the CERN Large Hadron Collider, and when studying exclusive signatures involving hadronic backgrounds.

A widely applicable approach for non-parametric data-driven background modelling is proposed, which addresses these issues for a broad class of searches and measurements. It relies on a relaxed version of the event selection to estimate conditional probability density functions, and two different techniques are discussed for its realisation. The first relies on ancestral sampling and uses data from a relaxed event selection to estimate a graph of conditional probability density functions of the variables used in the analysis, while accounting for significant correlations. A background model is then generated from events drawn from this graph, to which the full event selection is applied. In the second, a novel generative adversarial network is trained to estimate the joint probability density function of the variables used in the analysis. The training is performed on a relaxed event selection, which excludes the signal region, and the network is conditioned on a blinding variable. Subsequently, the conditional probability density function is interpolated into the signal region to model the background. The application of each method to a benchmark analysis and to ensemble tests is presented in detail, and the performance is discussed.
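
The ancestral-sampling idea in the first technique can be illustrated with a minimal two-variable sketch. This is not the authors' implementation: the toy dataset, the choice of binned histograms as density estimators, the variable names (`x_relaxed`, `y_relaxed`), the function names, and the cut value are all assumptions made purely for illustration. The sketch estimates p(x) and p(y | x) from a "relaxed selection" sample, draws new events parent-first from that two-node graph, and then applies a hypothetical full selection to the generated events.

```python
import numpy as np

def fit_conditional_histograms(x, y, x_edges, y_edges):
    """Estimate p(x) and p(y | x) as normalised histograms (illustrative choice)."""
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    p_x = counts.sum(axis=1) / counts.sum()
    row_sums = counts.sum(axis=1, keepdims=True)
    # Empty x-bins get a uniform conditional so every row is a valid distribution.
    p_y_given_x = np.where(row_sums > 0,
                           counts / np.maximum(row_sums, 1.0),
                           1.0 / counts.shape[1])
    return p_x, p_y_given_x

def ancestral_sample(p_x, p_y_given_x, x_edges, y_edges, n, rng):
    """Draw the parent variable x first, then y conditional on the drawn x-bin."""
    ix = rng.choice(len(p_x), size=n, p=p_x)
    # Inverse-CDF sampling of the y-bin, using the conditional row of each x-bin.
    cdf = np.cumsum(p_y_given_x, axis=1)
    u = rng.random(n)
    iy = (u[:, None] > cdf[ix]).sum(axis=1)
    iy = np.minimum(iy, p_y_given_x.shape[1] - 1)  # guard against rounding
    # Place each event uniformly inside its selected bin.
    x = rng.uniform(x_edges[ix], x_edges[ix + 1])
    y = rng.uniform(y_edges[iy], y_edges[iy + 1])
    return x, y

# Toy "relaxed selection" data: two correlated variables (assumed, not from the paper).
rng = np.random.default_rng(0)
x_relaxed = 100.0 + rng.exponential(50.0, size=20000)
y_relaxed = 0.02 * x_relaxed + rng.normal(size=20000)

x_edges = np.linspace(100.0, 400.0, 31)
y_edges = np.linspace(-2.0, 12.0, 57)
p_x, p_y_given_x = fit_conditional_histograms(x_relaxed, y_relaxed, x_edges, y_edges)

# Generate many more events than were observed, then apply the full selection.
x_gen, y_gen = ancestral_sample(p_x, p_y_given_x, x_edges, y_edges, 100000, rng)
passes_full_selection = y_gen > 6.0  # hypothetical tight cut
background_model = x_gen[passes_full_selection]
```

Because the conditional density is estimated from the much larger relaxed sample, the generated sample can populate the tightly selected region with far more events than the data alone, while the x-y correlation is retained through the conditioning. A realistic analysis would involve more variables and a deliberate choice of which correlations to model.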
Original language: English
Article number: 1
Number of pages: 32
Journal: Journal of High Energy Physics
Volume: 2022
Issue number: 10
Publication status: Published - 3 Oct 2022

Keywords

  • Hadron-Hadron Scattering
  • Higgs Physics
  • Particle and Resonance Production
  • Rare Decay
