Optimising fairness through parametrised data sampling

Vladimiro González-Zelaya, Julián Salas, Dennis Prangle, Paolo Missier

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Improving machine learning models' fairness is an active research topic, with most approaches focusing on specific definitions of fairness. In contrast, we propose ParDS, a parametrised data sampling method by which we can optimise the fairness ratios observed on a test set, in a way that is agnostic to both the specific fairness definitions, and the chosen classification model. Given a training set with one binary protected attribute and a binary label, our approach involves correcting the positive rate for both the favoured and unfavoured groups through resampling of the training set. We present experimental evidence showing that the amount of resampling can be optimised to achieve target fairness ratios for a specific training set and fairness definition, while preserving most of the model's accuracy. We discuss conditions for the method to be viable, and then extend the method to include multiple protected attributes. In our experiments we use three different sampling strategies, and we report results for three commonly used definitions of fairness, and three public benchmark datasets: Adult Income, COMPAS and German Credit.
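
The resampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' ParDS implementation: the abstract does not name the three sampling strategies, so the code below assumes plain random oversampling, and the parameter `d`, the function names, and the toy data are all hypothetical. It moves the favoured group's positive rate down by `d` and the unfavoured group's positive rate up by `d` by duplicating existing rows.

```python
import math
import random


def positive_rate(rows, group):
    """Fraction of positive labels among rows whose protected attribute is `group`."""
    labels = [y for a, y in rows if a == group]
    return sum(labels) / len(labels)


def oversample_group(rows, group, target, rng):
    """Return duplicated rows that move `group` to positive rate `target`.

    Assumes 0 < target < 1 and that the group contains both labels.
    """
    pos = [r for r in rows if r[0] == group and r[1] == 1]
    neg = [r for r in rows if r[0] == group and r[1] == 0]
    eps = 1e-9  # guards math.ceil against floating-point noise
    if len(pos) / (len(pos) + len(neg)) < target:
        # Raise the positive rate: duplicate positives.
        needed = math.ceil(target * len(neg) / (1 - target) - eps) - len(pos)
        return [rng.choice(pos) for _ in range(max(needed, 0))]
    # Lower the positive rate: duplicate negatives.
    needed = math.ceil(len(pos) * (1 - target) / target - eps) - len(neg)
    return [rng.choice(neg) for _ in range(max(needed, 0))]


def resample(rows, favoured, unfavoured, d, seed=0):
    """Shift both groups' positive rates towards each other by `d` (hypothetical parameter)."""
    rng = random.Random(seed)
    target_f = positive_rate(rows, favoured) - d
    target_u = positive_rate(rows, unfavoured) + d
    extra = oversample_group(rows, favoured, target_f, rng)
    extra += oversample_group(rows, unfavoured, target_u, rng)
    return rows + extra


# Toy training set of (protected_attribute, label) pairs; group 1 is favoured.
rows = [(1, 1)] * 6 + [(1, 0)] * 4 + [(0, 1)] * 2 + [(0, 0)] * 8
balanced = resample(rows, favoured=1, unfavoured=0, d=0.2)

print(positive_rate(rows, 1), positive_rate(rows, 0))
print(positive_rate(balanced, 1), positive_rate(balanced, 0))
```

In a setting like the one the abstract describes, a value such as `d` would be tuned by training a classifier on the resampled set and measuring the chosen fairness ratio on a test set; that search loop is omitted here.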

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2021
Subtitle of host publication: 24th International Conference on Extending Database Technology, Proceedings
Editors: Yannis Velegrakis, Demetris Zeinalipour, Panos K. Chrysanthis, Francesco Guerra
Publisher: OpenProceedings.org
Pages: 445-450
Number of pages: 6
ISBN (Electronic): 9783893180844
Publication status: Published - Mar 2021
Event: Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021 - Virtual, Nicosia, Cyprus
Duration: 23 Mar 2021 to 26 Mar 2021

Publication series

Name: Advances in Database Technology - EDBT
Volume: 2021-March
ISSN (Electronic): 2367-2005

Conference

Conference: Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021
Country/Territory: Cyprus
City: Virtual, Nicosia
Period: 23/03/21 to 26/03/21

Bibliographical note

Funding Information:
This research was partly supported by the Spanish Government under project RTI2018-095094-B-C21 "CONSENT".

Publisher Copyright:
© 2021 Copyright held by the owner/author(s).

ASJC Scopus subject areas

  • Information Systems
  • Software
  • Computer Science Applications
