A Python package for fairness metrics in language models


Various metrics have been proposed to measure how social biases affect the predictions of large language models. However, these metrics rely on different interpretations of 'fairness' and on different world-views, so they measure different aspects of bias. We wrote a survey paper on fairness evaluation methods for language models and implemented the metrics in a single Python package.

More Information:
  • it can be found here