Dataset Maps - X-FACTR (12 languages)
Greek
Relevant Statistics
Percentage in-country: 2.94%
Total Variation Distance between observed and population-proportional distribution: 1.623
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.282
Variance explained by GDP: 0.466
Variance explained by geographic distance: 0.145
Variance explained by all 3 factors: 0.562
Yoruba
Relevant Statistics
Percentage in-country: 1.15%
Total Variation Distance between observed and population-proportional distribution: 0.000
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.296
Variance explained by GDP: 0.495
Variance explained by geographic distance: 0.058
Variance explained by all 3 factors: 0.543
French
Relevant Statistics
Percentage in-country: 16.34%
Total Variation Distance between observed and population-proportional distribution: 4078.705
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.314
Variance explained by GDP: 0.480
Variance explained by geographic distance: 0.129
Variance explained by all 3 factors: 0.560
Bengali
Relevant Statistics
Percentage in-country: 11.46%
Total Variation Distance between observed and population-proportional distribution: 387.175
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.414
Variance explained by GDP: 0.512
Variance explained by geographic distance: 0.069
Variance explained by all 3 factors: 0.554
Hebrew
Relevant Statistics
Percentage in-country: 2.12%
Total Variation Distance between observed and population-proportional distribution: 20.073
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.292
Variance explained by GDP: 0.489
Variance explained by geographic distance: 0.142
Variance explained by all 3 factors: 0.583
Hungarian
Relevant Statistics
Percentage in-country: 1.97%
Total Variation Distance between observed and population-proportional distribution: 222.439
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.295
Variance explained by GDP: 0.494
Variance explained by geographic distance: 0.173
Variance explained by all 3 factors: 0.606
Korean
Relevant Statistics
Percentage in-country: 0.84%
Total Variation Distance between observed and population-proportional distribution: 25.135
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.338
Variance explained by GDP: 0.511
Variance explained by geographic distance: 0.032
Variance explained by all 3 factors: 0.492
Marathi
Relevant Statistics
Percentage in-country: 11.15%
Total Variation Distance between observed and population-proportional distribution: 0.000
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.391
Variance explained by GDP: 0.536
Variance explained by geographic distance: 0.074
Variance explained by all 3 factors: 0.568
Russian
Relevant Statistics
Percentage in-country: 4.34%
Total Variation Distance between observed and population-proportional distribution: 208.369
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.299
Variance explained by GDP: 0.470
Variance explained by geographic distance: 0.192
Variance explained by all 3 factors: 0.587
Spanish
Relevant Statistics
Percentage in-country: 40.71%
Total Variation Distance between observed and population-proportional distribution: 10002.477
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.311
Variance explained by GDP: 0.478
Variance explained by geographic distance: 0.099
Variance explained by all 3 factors: 0.539
Turkish
Relevant Statistics
Percentage in-country: 7.55%
Total Variation Distance between observed and population-proportional distribution: 780.841
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.311
Variance explained by GDP: 0.484
Variance explained by geographic distance: 0.171
Variance explained by all 3 factors: 0.599
Vietnamese
Relevant Statistics
Percentage in-country: 17.45%
Total Variation Distance between observed and population-proportional distribution: 1855.066
We also trained a linear model to find socioeconomic correlates of the dataset:
Variance explained by population: 0.356
Variance explained by GDP: 0.516
Variance explained by geographic distance: 0.022
Variance explained by all 3 factors: 0.504
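
How the statistics can be computed (illustrative sketch)
The per-language figures above combine a distance between two distributions with the variance explained by a linear model. The Python sketch below shows one plausible way to compute quantities of this kind. It is not the code behind this page. The function names, the NumPy-based least-squares fit, and the toy numbers are all assumptions made for illustration. It uses the textbook total variation distance on normalized distributions, which is bounded by 1; several values reported above exceed 1, so the page may be working with unnormalized counts, and this sketch should not be expected to reproduce them.

```python
import numpy as np


def total_variation_distance(observed, reference):
    """Textbook TVD between two discrete distributions over the same countries.

    Both inputs are normalized to sum to 1, so the result lies in [0, 1].
    """
    p = np.asarray(observed, dtype=float)
    q = np.asarray(reference, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()


def variance_explained(features, target):
    """R^2 of an ordinary least-squares fit (with intercept) of target on features."""
    X = np.asarray(features, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # a single predictor becomes one column
    y = np.asarray(target, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_res = (residuals ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical toy data for three countries (not taken from X-FACTR).
    entity_counts = [50, 30, 20]   # observed entities per country
    population = [10, 10, 80]      # population per country
    # Approximately 0.6 for these toy numbers.
    print(total_variation_distance(entity_counts, population))
    # Single-factor fit: how much of the counts a linear model in population explains.
    print(variance_explained(population, entity_counts))
```

Passing a two-dimensional array with one column each for population, GDP, and geographic distance to variance_explained would give the analogue of the "all 3 factors" figure, again under the assumptions stated above.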