
Assessing Machine Learning via Dataset Mutation

Master's thesis defence of Germain Herbay

Category: master's thesis
Date: 30/08/2022 12:30 - 30/08/2022 14:00
Location: Salle académique
Speaker(s): Germain Herbay
Organiser(s): Isabelle Daelman

Fairness is becoming a major concern in software engineering. As machine learning (ML) systems are increasingly used in critical domains (e.g., recruitment and lending), it is crucial to ensure that the decisions computed by such systems do not exhibit unfair behaviour towards certain social groups (e.g., those defined by gender, race, or age). Alongside robustness and safety, fairness is therefore an important property that well-designed software should have. Previous work has sought to ensure this property by exposing, diagnosing, and mitigating bias in ML systems. Although bias in data is a well-studied topic, software engineering has not yet fully explored its impact. To this end, we propose an approach that relies on mutation testing to inject perturbations into the training data and analyses the impact of these perturbations on conventional fairness metrics. To evaluate our approach, we design data mutation techniques and use three popular datasets (i.e., Adult, COMPAS, and Bank). The first evaluation reveals that fairness measures differ greatly depending on the nature of the datasets and the perturbations used, suggesting that ML algorithms are very sensitive to perturbations injected into the datasets. The second evaluation, which aims to better understand the impact of data distributions on fairness, yields less conclusive results. In summary, our results suggest that mutation analysis is a potentially useful approach for a deeper understanding of fairness in ML systems, but it requires further exploration.
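The idea described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not the thesis's implementation: the mutation operator (flipping a fraction of labels inside one protected group), the toy group-majority learner, and the synthetic data are all assumptions made for the example; only the overall workflow (mutate training data, retrain, compare a conventional fairness metric such as statistical parity difference) comes from the abstract.

```python
import random

def statistical_parity_difference(preds, groups):
    """P(pred = 1 | group = 0) - P(pred = 1 | group = 1)."""
    rate = {}
    for g in (0, 1):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        rate[g] = sum(preds[i] for i in idx) / len(idx)
    return rate[0] - rate[1]

def flip_labels(labels, groups, target_group, rate, rng):
    """Mutation operator (illustrative): flip a fraction `rate` of the
    training labels belonging to one protected group."""
    mutated = list(labels)
    idx = [i for i, g in enumerate(groups) if g == target_group]
    for i in rng.sample(idx, round(rate * len(idx))):
        mutated[i] = 1 - mutated[i]
    return mutated

def train_group_majority(labels, groups):
    """Toy learner: predict the majority training label of each group."""
    model = {}
    for g in (0, 1):
        ys = [labels[i] for i, grp in enumerate(groups) if grp == g]
        model[g] = 1 if 2 * sum(ys) >= len(ys) else 0
    return model

# Synthetic training data: group 0 always receives label 1, while
# group 1 receives label 1 only 40% of the time (deterministic pattern).
groups = [i % 2 for i in range(200)]
labels = [1 if g == 0 else (1 if (i // 2) % 5 < 2 else 0)
          for i, g in enumerate(groups)]

# Fairness of the model trained on the original data.
baseline_model = train_group_majority(labels, groups)
baseline_spd = statistical_parity_difference(
    [baseline_model[g] for g in groups], groups)

# Fairness of the model trained on the mutated data.
rng = random.Random(0)
mutated_labels = flip_labels(labels, groups, target_group=0, rate=0.6, rng=rng)
mutated_model = train_group_majority(mutated_labels, groups)
mutated_spd = statistical_parity_difference(
    [mutated_model[g] for g in groups], groups)

print(baseline_spd, mutated_spd)  # → 1.0 0.0
```

In the thesis the operators are applied to the Adult, COMPAS, and Bank datasets with real learners; the toy majority-rule model here only shows the mechanics of how an injected perturbation can shift a fairness measure.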


KEYWORDS: Fairness, Machine Learning, Mutation Analysis


Contact: Isabelle Daelman - isabelle.daelman@unamur.be