Dataset Maps - MLQA (English)
Relevant Statistics
Percentage in-country: 53.63%
Missing countries: 80 of 243 (32.92%)
Total Variation Distance between observed and population-proportional distribution: 913.093
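As a rough illustration of how the statistics above could be computed, here is a minimal Python sketch. It assumes per-country entity counts and population figures are available as dictionaries. The function names and the toy data are hypothetical and not taken from MLQA. The sketch uses the standard normalized definition of total variation distance (half the L1 distance between two probability distributions), which lies in [0, 1], so it will not reproduce the scale of the figure reported above.

```python
def total_variation_distance(observed_counts, population):
    """Standard TVD: half the L1 distance between two normalized distributions."""
    countries = set(observed_counts) | set(population)
    obs_total = sum(observed_counts.values())
    pop_total = sum(population.values())
    return 0.5 * sum(
        abs(observed_counts.get(c, 0) / obs_total - population.get(c, 0) / pop_total)
        for c in countries
    )


def missing_share(observed_counts, all_countries):
    """Fraction of countries that have no entity in the dataset."""
    all_countries = list(all_countries)
    missing = [c for c in all_countries if observed_counts.get(c, 0) == 0]
    return len(missing) / len(all_countries)


# Hypothetical toy data (assumption for illustration only).
observed = {"A": 6, "B": 4}                # entity counts per country
population = {"A": 25, "B": 25, "C": 50}   # population per country

tvd = total_variation_distance(observed, population)
share = missing_share(observed, population.keys())
print(f"TVD: {tvd:.3f}, missing: {share:.2%}")
```

The same ratio applied to the counts reported above, 80 / 243, gives the stated 32.92%.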
We also trained a linear model to find socioeconomic correlates of the dataset's geographic distribution:
Variance explained by population: 0.317
Variance explained by GDP: 0.561
Variance explained by geographic distance: 0.040
Variance explained by all 3 factors: 0.548
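The "variance explained" figures can be read as coefficients of determination (R^2) from linear fits. The sketch below is one plausible way to obtain such numbers, assuming ordinary least squares with an intercept and in-sample evaluation. The helper names and toy data are hypothetical. In-sample R^2 cannot decrease when predictors are added, so the figures above may come from a cross-validated or adjusted measure, which this sketch does not attempt to reproduce.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(features, target):
    """Fit OLS with an intercept via the normal equations; return in-sample R^2."""
    rows = [[1.0] + list(f) for f in features]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(p)]
    beta = solve(xtx, xty)
    preds = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(target, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-country toy data (assumption for illustration only).
pop = [1, 2, 3, 4, 5, 6]
gdp = [2, 1, 4, 3, 6, 8]
dist = [5, 3, 6, 2, 1, 4]
count = [3, 2, 9, 5, 12, 15]   # entities observed per country

r2_pop = r_squared([[v] for v in pop], count)
r2_gdp = r_squared([[v] for v in gdp], count)
r2_dist = r_squared([[v] for v in dist], count)
r2_all = r_squared(list(zip(pop, gdp, dist)), count)
print(r2_pop, r2_gdp, r2_dist, r2_all)
```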