Dataset Maps - Natural Questions (English)
Relevant Statistics
Percentage in-country: 80.07%
Missing countries: 49 of 243 (20.16%)
Total Variation Distance between observed and population-proportional distribution: 11907.219
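As a rough illustration of this statistic, the sketch below computes the textbook total variation distance between a dataset's per-country shares and the shares expected if coverage were proportional to population. The function name, the country codes, and the toy numbers are all hypothetical. Note that this normalized form always lies between 0 and 1, so the figure reported above was presumably computed on a different scale (for example on unnormalized quantities); the exact procedure is not stated here.

```python
def total_variation_distance(observed_counts, populations):
    """Textbook TVD between observed shares and population shares.

    Both arguments map a country code to a non-negative number.
    Each mapping is normalized to sum to 1 before comparison.
    """
    countries = set(observed_counts) | set(populations)
    total_obs = sum(observed_counts.values())
    total_pop = sum(populations.values())
    return 0.5 * sum(
        abs(observed_counts.get(c, 0) / total_obs
            - populations.get(c, 0) / total_pop)
        for c in countries
    )


# Hypothetical toy data: three made-up countries, not real statistics.
observed = {"A": 80, "B": 15, "C": 5}      # entities found in the dataset
population = {"A": 50, "B": 30, "C": 20}   # population, arbitrary units

tvd = total_variation_distance(observed, population)
print(f"TVD = {tvd:.3f}")
```

A value of 0 would mean the dataset mirrors the population distribution exactly, while values near 1 mean almost no overlap.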
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.395
Variance explained by GDP: 0.535
Variance explained by geographic distance: 0.030
Variance explained by all 3 factors: 0.550
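One common way to obtain "variance explained" figures like these is the R² of an ordinary least squares fit, computed once per predictor and once with all predictors together. The sketch below assumes that setup; it is not confirmed to be the procedure used for the numbers above. The helper names and the per-country toy values are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(features, target):
    """R^2 of a least squares fit with an intercept.

    features: one row of predictor values per country.
    target:   one observed value per country.
    """
    rows = [[1.0] + list(f) for f in features]  # prepend intercept term
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    preds = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values for six made-up countries (not real data).
population = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gdp = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
distance = [3.0, 1.0, 2.0, 6.0, 4.0, 5.0]
counts = [2.1, 1.4, 4.2, 3.9, 6.3, 5.8]  # dataset entities per country

r2_pop = r_squared([[p] for p in population], counts)
r2_gdp = r_squared([[g] for g in gdp], counts)
r2_dist = r_squared([[d] for d in distance], counts)
r2_all = r_squared([list(row) for row in zip(population, gdp, distance)], counts)
print(r2_pop, r2_gdp, r2_dist, r2_all)
```

Under this setup the combined R² can never be lower than any single-predictor R², but it is usually smaller than their sum when predictors are correlated. That would be consistent with the combined figure above (0.550) sitting only slightly higher than GDP alone (0.535).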