Project

Gender Shades

Results

High-Level Benchmark Evaluation Results

  • Government Face Recognition Benchmark: 79.6% lighter-skinned
  • Research Gender Classification Benchmark: 86.2% lighter-skinned

High-Level Gender Classification Results

  • All classifiers perform better on male faces than female faces (a difference in error rate of 8.1 to 20.6 percentage points)
  • All classifiers perform better on lighter faces than darker faces (a difference in error rate of 11.8 to 19.2 percentage points)
  • All classifiers perform worst on darker female faces (20.8%-34.7% error rate)
  • Microsoft and IBM classifiers perform best on lighter male faces (error rates of 0.0% and 0.3%, respectively)
  • Face++ classifiers perform best on darker male faces (0.7% error rate)
  • The maximum difference in error rate between the best and worst classified groups is 34.4 percentage points
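
The per-group error rates and the best-to-worst gap summarized above can be computed along the following lines. This is a minimal sketch, not the project's evaluation code: the function names, the record layout (group, true label, predicted label), and the toy records are all assumptions made for illustration, and the toy numbers are not results from the study. As a sanity check on the arithmetic, the reported gap of 34.4 equals 34.7% minus 0.3%, assuming both rates belong to the same classifier.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: error rate in percent}.

    `records` is an iterable of (group, true_label, predicted_label)
    tuples. This layout is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if predicted != truth:
            errors[group] += 1
    return {g: 100.0 * errors[g] / totals[g] for g in totals}


def max_error_gap(rates):
    """Gap in percentage points between the worst and best classified groups."""
    return max(rates.values()) - min(rates.values())


# Toy, made-up records for illustration only (not data from the study).
toy_records = [
    ("darker female", "F", "M"),
    ("darker female", "F", "F"),
    ("darker female", "F", "M"),
    ("darker female", "F", "F"),
    ("lighter male", "M", "M"),
    ("lighter male", "M", "M"),
    ("lighter male", "M", "M"),
    ("lighter male", "M", "M"),
]

toy_rates = error_rates_by_group(toy_records)
toy_gap = max_error_gap(toy_rates)
print(toy_rates)
print(toy_gap)
```

Reporting the gap as a difference of two error rates, rather than a ratio, keeps it in percentage points, which matches how the differences in the list above are expressed.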