PLOS ONE

RESEARCH ARTICLE

How rankings disguise gender inequality: A comparative analysis of cross-country gender equality rankings based on adjusted wage gaps

Karolina Goraus Tanska1, Joanna Tyrowicz2,3,4*, Lucas Augusto van der Velde2,5*


1 Faculty of Economic Sciences, University of Warsaw, Warsaw, Poland, 2 FAME|GRAPE, Warsaw, Poland, 3 Faculty of Management, University of Warsaw, Warsaw, Poland, 4 IZA, Institute of Labor Economics, Bonn, Germany, 5 Institute of Statistics and Demographics, Warsaw School of Economics, Warsaw, Poland

* lvelde@sgh.waw.pl (LAV); jtyrowicz@.pl (JT)

OPEN ACCESS

Citation: Tanska KG, Tyrowicz J, van der Velde LA (2020) How rankings disguise gender inequality: A comparative analysis of cross-country gender equality rankings based on adjusted wage gaps. PLoS ONE 15(11): e0241107. 10.1371/journal.pone.0241107

Editor: Marina Della Giusta, University of Reading, UNITED KINGDOM

Received: March 23, 2020

Accepted: October 9, 2020

Published: November 4, 2020

Copyright: © 2020 Tanska et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All outcome data and full set of replication codes are within the paper and its Supporting Information files, as well as on the authors' website. The original data used to obtain the inputs for our analysis are available from the Eurostat website (european-union-statistics-on-income-and-livingconditions) for all researchers, subject to fulfilling the confidentiality requirements.

Abstract

Methods for estimating the scope of unjustified inequality differ in their sensitivity to address institutional and structural deficiencies. In the case of gender wage gaps, adjusting adequately for individual characteristics requires prior assessment of several important deficiencies, primarily whether a given labor market is characterized by gendered selection into employment, gendered segmentation and whether these mechanisms differ along the distribution of wages. Given that countries are characterized by differentiated prevalence of these deficiencies, ranking countries on gender wage gaps is a challenging task. Whether a country is perceived as more equal than others depends on the interaction between the method of adjusting gender wage gap for individual characteristics and the prevalence of these deficiencies. We make the case that this interaction is empirically relevant by comparing the country rankings for the adjusted gender wage gap among 23 EU countries. In this relatively homogeneous group of countries, the interaction between method and underlying deficiencies leads to substantial variation in the extent of unjustified inequality. A country may change its place in the ranking by as much as ten positions, both towards greater equality and towards greater inequality. We also show that, if explored properly, this variability can yield valuable policy insights: changes in the ranking positions across methods inform on the policy priority of the labor market deficiencies across countries in relative terms.

Funding: KGT was awarded grant UMO-2015/16/T/HS4/00305 by the Polish National Science Centre (Narodowe Centrum Nauki - NCN). JT and LAV were awarded grant UMO-2017/27/L/HS4/03219 by the NCN. The research was partly conducted during a study visit sponsored by the Polish National Academy for Academic Exchange (#PPN/BEK/2018/1/00469). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

In this paper, we study cross-country rankings of gender inequality. For many aspects of gender inequality, policy debates focus on cross-country rankings. The rankings are obtained by specialized institutions, which first estimate levels of inequality and subsequently rank countries on one such measure or construct a composite index of such measures. Utilized for both policy-evaluation and policy-making purposes (for Sweden and Switzerland see [1, 2]; Bloomberg systematically ranks the US states for gender equality [3]), the rankings are used to shape the public debate, set policy objectives and deploy public funding. Indeed, the debate on gender equality is largely influenced by cross-country rankings; see, for example, [4–6]. In particular, after the World Economic Forum (WEF) published The Global Gender Gap Report in 2018, the US was shamed for ranking 49th in the world on talk-shows and in print [7], Japan was shamed in the media for ranking the worst among G7 countries [8] and the Philippines was praised for ranking as the most equal among South-East Asian countries [9]. In a similar spirit, Forbes followed up with coverage of top-ranked countries, naming a few policies that were deemed relevant for achieving high levels of gender equality (even though the listed policies and the structure of the WEF gender equality index were not actually related [10]). In the European Union, the publication of rankings in gender wage equality based on harmonized linked employer-employee data every four years attracts coverage from the European Commission, national governments, and media alike.

However, in order to rank the countries, one first has to obtain the measures of inequality, thus inevitably facing the choice of a proper measure of inequality. In academia, there appears to be a broad consensus that comparisons of economic outcomes across genders should be adjusted for differences in the underlying characteristics (a process referred to as decomposition), even though most of the publicly debated rankings are based on raw differentials. Two reasons explain the prevalence of raw differentials in the public debate. First, in order to adjust the gender wage gaps for differences in the underlying characteristics, one requires access to individual-level data, while most of the global and regional rankings are based on readily available aggregates. Second, while academics consistently emphasize the paramount relevance of adjusting the measurement of gaps in outcomes for differences in individual characteristics, there is no consensus on the choice of the specific decomposition method. Recent decades have seen a growing proliferation of methods, data sources, and model specifications, reaching effectively hundreds of potential combinations between data sources, set of control variables, and decomposition method. As formally discussed by [11], depending on the underlying process of wage formation, labor market segmentation, and the causes behind the wage gaps, different estimators perform with varying reliability. For example, [12] demonstrate a remarkable dispersion of gender wage gap estimates for one data source for one country in one year, obtaining roughly 2500 estimates of the adjusted gender wage gap. The multiplicity of gender wage gap estimates stems from systematically manipulating control variables and methods, and the estimates differ by more than 100% of the lowest obtained estimate.

The dispersion between the obtained estimates stems from the fact that each method differs in its ability to reflect various labor market phenomena, or deficiencies. For example, some methods operate by design at the mean of the income distribution (such as parametric regression-based methods), and thus, they do not adequately reflect the scope of unjust inequality if sticky floors or glass ceilings are important in a given labor market. The problem is all the more acute for international comparisons because labor market deficiencies can differ across countries, making specific methods suitable for some countries, but not for the others. For example, [13] demonstrated the substantial differentiation worldwide of the estimated inequality measures adjusted for differences in individual characteristics: their estimates for 63 countries range from 8% to as much as 48% of female income (in one given year). However, while using one method for all the countries makes the estimated wage gaps comparable, it makes the estimates possibly ill-suited for some countries, undermining the validity of ranking them according to this measure of inequality. One can extend this argument to any other decomposition method.


In this paper, we illustrate that country rankings of gender wage inequality differ substantially depending on the methodological choice. We show that a country can change its ranking by roughly ten places towards greater equality or greater inequality, depending on whether the underlying decomposition method accounts for selection into full-time employment or not. These results quantitatively corroborate the concerns about the reliability of cross-country rankings.

Further, we illustrate that the changes in the gender wage inequality rankings are systematically related to labor market features. We also show that changes in rankings across specifications correlate well with the measures of working time flexibility and work-life balance. These findings illustrate that the cross-country rankings may vary systematically with the interaction between the method of estimation and the institutional features of the labor markets.

In order to deliver these results, we utilize individual-level data across 23 EU countries. We purposefully selected data sources harmonized in terms of sample design, questionnaire, and implementation, in order to limit the role of data idiosyncrasies in cross-country rankings. We apply the decomposition techniques introduced in prior literature (see [11, 12]) to these data sets and obtain estimates of adjusted gender wage gaps, to which we refer as measures of unjustified inequality. We subsequently rank countries on those measures and study the links between rankings, methodological choices, and labor market deficiencies.
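The ranking step described above can be illustrated with a minimal sketch. The country labels and gap values below are invented for illustration and are not estimates from this study; the function names `rank_countries` and `rank_shifts` are likewise hypothetical. The sketch ranks countries from the smallest to the largest estimated gap under two methods and reports how many positions each country moves.

```python
def rank_countries(gaps):
    """Rank countries from most equal (position 1) to least equal.

    `gaps` maps a country label to its estimated gap; ties keep the
    insertion order of the dictionary because `sorted` is stable.
    """
    ordered = sorted(gaps, key=gaps.get)
    return {country: pos for pos, country in enumerate(ordered, start=1)}


def rank_shifts(gaps_a, gaps_b):
    """Positions moved when switching from method A to method B.

    A positive value means the country looks less equal under method B.
    """
    rank_a = rank_countries(gaps_a)
    rank_b = rank_countries(gaps_b)
    return {country: rank_b[country] - rank_a[country] for country in rank_a}


# Hypothetical adjusted gaps (as a share of wages) for four made-up countries.
method_a = {"A": 0.08, "B": 0.12, "C": 0.15, "D": 0.20}
method_b = {"A": 0.18, "B": 0.10, "C": 0.16, "D": 0.14}

shifts = rank_shifts(method_a, method_b)
print(shifts)
```

In this toy input, country A is the most equal under the first method and the least equal under the second, which is the kind of reversal the paper documents at a larger scale.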

The paper is structured as follows. We first discuss the state of the art on (gender) wage gap measurement and demonstrate the relevance of methodological choices for cross-country comparisons in section 2. Based on this overview, we describe our data and methods in section 3. The results are reported in section 4. The concluding remarks and policy implications summarize our study.

2. Measuring the wage gaps in the international context

Consider a population consisting of Robinson Crusoe and Friday: the two individuals are highly differentiated in skills (individual characteristics) and caloric intake (outcome). The differences in caloric intake partly stem from relevant and observable skills (e.g., survival skills), irrelevant and observable skills (e.g., literacy and knowledge of contemporaneous literature), and policies, which in this simple example are represented by social structure imposed by Crusoe on Friday. Raw differences in caloric intake effectively underestimate the scope of inequality of outcomes in that population. While the measure of caloric intake is objective, it is not (fully) informative of the extent of unjustified inequality in outcomes. To grasp the unjustified inequality, one has to account for differences in outcome-relevant characteristics.

By way of motivation, Fig 1 reports the size of the raw gender wage gap (as published regularly by Eurostat and used by both media and policymakers to identify the scope of gender-related inequality across the EU member states) and the estimates of the adjusted gender wage gap (AGWG) using the most common decomposition method. If the raw wage gap were indicative of the actual scope of unjust wage inequality, the countries would be located along the diagonal in Fig 1, which is clearly not the case. The estimated adjusted wage gaps differ by as much as 20 percentage points from the raw gender wage gap.

2.1. Decomposition methods to uncover unjustified inequality

To account for objective drivers of wage dispersion (as opposed to merely raw inequality), academic research relies on methods that adjust differences in outcomes for differences in outcome-relevant characteristics. Following [14] and [15], parametric techniques decompose observed differences in outcomes (raw wage gaps) into two components: differences in the underlying characteristics and differences in how these characteristics matter in defining outcomes. In the case of gender wage gaps, this is obtained by estimating, at least, two separate wage regressions. The size of the pay difference can be decomposed into $W_m - W_f = \beta^*(X_m - X_f) + X_m(\beta_m - \beta^*) + X_f(\beta^* - \beta_f)$, where $W_m - W_f$ stands for the unadjusted (or raw) gender wage gap (GWG), $(X_m - X_f)$ represents differences in characteristics between men and women, and $(\beta_m - \beta^*)$ and $(\beta^* - \beta_f)$ stand for the differences in coefficients related to male and female (dis)advantages, respectively. The literature refers to this last term as an adjusted gender wage gap (AGWG).

Fig 1. Raw vs. adjusted gender wage gaps. Data details are described in section 3. The raw wage gap is computed from male and female wages ($w_m$, $w_f$). The adjusted gender wage gap is computed from the Oaxaca-Blinder decomposition with the following set of controls: age, education, residence, and marital status. The full set of estimates is discussed in section 4. Estimates are obtained separately for each country. The dotted line represents a 45-degree line.
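The decomposition of the raw gap into a characteristics term and two coefficient terms can be checked numerically. The following is a minimal sketch under strong simplifying assumptions: a single covariate (years of schooling) plus an intercept, invented toy data, and the male coefficients as the default reference structure. The helper names (`ols_line`, `oaxaca_blinder`) and the data are hypothetical and do not reproduce the specifications estimated in the paper.

```python
from statistics import fmean


def ols_line(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return (my - slope * mx, slope)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))


def oaxaca_blinder(x_m, w_m, x_f, w_f, beta_star=None):
    """Split the raw mean gap into the three terms of the decomposition.

    beta_star is the reference (counterfactual) coefficient vector;
    by default the male coefficients are used.
    """
    beta_m = ols_line(x_m, w_m)
    beta_f = ols_line(x_f, w_f)
    if beta_star is None:
        beta_star = beta_m
    # Mean characteristics, with a leading 1 for the intercept.
    X_m = (1.0, fmean(x_m))
    X_f = (1.0, fmean(x_f))
    return {
        "raw": fmean(w_m) - fmean(w_f),
        "characteristics": dot(beta_star, sub(X_m, X_f)),
        "male_advantage": dot(X_m, sub(beta_m, beta_star)),
        "female_disadvantage": dot(X_f, sub(beta_star, beta_f)),
    }


# Invented toy data: years of schooling and (log) hourly wages.
school_m = [10, 12, 14, 16]
logw_m = [2.0, 2.3, 2.5, 2.9]
school_f = [12, 14, 16, 18]
logw_f = [2.0, 2.1, 2.4, 2.5]

result = oaxaca_blinder(school_m, logw_m, school_f, logw_f)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Because a regression with an intercept passes through the sample means, the three components add up to the raw gap exactly, whatever reference coefficients are chosen. With the male coefficients as reference, the male-advantage term is zero by construction.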

For a few decades, the literature on gaps in wages, education, etc. has been dominated by a handful of techniques, namely the [14] and [15] decomposition, with the extensions for functional form proposed by, e.g., [16–20]. These estimates were referred to as the adjusted wage gap, i.e., the wage gap that remains after adjusting for differences in characteristics important for productivity (such as age, education, industry, occupation, firm characteristics, etc.). The parametric Oaxaca-Blinder decomposition assumes that in the absence of discrimination, the disadvantaged group would record earnings according to the advantaged group (counterfactual wage structure, see [14, 15]). In the case of gender, this is equivalent to setting $\beta^* = \beta_m$. However, alternative assumptions are possible, affecting the interpretation of the adjusted gap. If one believes that the advantaged group receives a premium, one can set $\beta^* = \beta_f$ to be the fair wage structure. [16] suggested using a simple average of the coefficients in both groups, so that $\beta^* = 0.5\beta_m + 0.5\beta_f$. [20] recommended a weighted average, $\beta^* = \%men \cdot \beta_m + \%women \cdot \beta_f$. Studying the algebraic properties of this estimator, [21] demonstrated that the weights should be the opposite, but this differentiation is less relevant if shares of men and women in the labor market are fairly comparable. Finally, for interpretative ease, [17] proposed to use the coefficients from a pooled regression without a gender dummy and [22] from a pooled regression with a gender dummy.
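To make the role of the counterfactual wage structure concrete, the sketch below applies four of the reference rules listed above (male coefficients, female coefficients, simple average, and share-weighted average) to one set of invented coefficients and mean characteristics. All numbers and function names are hypothetical; the pooled-regression variants are left out because they require re-estimating the wage equation on micro data.

```python
def reference_coefficients(beta_m, beta_f, rule, share_men=0.5):
    """Return the reference vector beta* under a given rule."""
    if rule == "male":
        return tuple(beta_m)
    if rule == "female":
        return tuple(beta_f)
    if rule == "average":
        return tuple(0.5 * bm + 0.5 * bf for bm, bf in zip(beta_m, beta_f))
    if rule == "weighted":
        return tuple(
            share_men * bm + (1.0 - share_men) * bf
            for bm, bf in zip(beta_m, beta_f)
        )
    raise ValueError(f"unknown rule: {rule}")


def split_gap(X_m, X_f, beta_m, beta_f, beta_star):
    """Return (explained, adjusted) parts of the gap for a given beta*."""
    explained = sum(b * (xm - xf) for b, xm, xf in zip(beta_star, X_m, X_f))
    adjusted = sum(
        xm * (bm - bs) + xf * (bs - bf)
        for xm, xf, bm, bf, bs in zip(X_m, X_f, beta_m, beta_f, beta_star)
    )
    return explained, adjusted


# Invented inputs: (intercept, return to schooling) and mean characteristics.
beta_m, beta_f = (1.0, 0.10), (0.9, 0.08)
X_m, X_f = (1.0, 12.0), (1.0, 13.0)

splits = {}
for rule in ("male", "female", "average", "weighted"):
    beta_star = reference_coefficients(beta_m, beta_f, rule, share_men=0.55)
    splits[rule] = split_gap(X_m, X_f, beta_m, beta_f, beta_star)
    explained, adjusted = splits[rule]
    print(f"{rule:>8}: explained={explained:+.4f} adjusted={adjusted:+.4f}")
```

The sum of the two parts is the same under every rule, since it equals the raw gap. Only the split between the explained and the adjusted part moves, which is why the choice of $\beta^*$ changes the interpretation of the adjusted gap rather than the raw gap itself.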

Regardless of the assumptions behind the counterfactual wage structure, these methods share a common feature: they derive the scope of unjust (unexplained by individual characteristics) wage inequality from parameters estimated at the mean, and hence they provide little information of inequality at different points of the outcome distribution.

This approach has weaknesses that are already acknowledged in the literature. First, if the analyzed subpopulations of men and women differ by characteristics (e.g., women are better educated; or jobs are segregated across genders), the parametric decomposition at the mean may be meaningless: not a single man or woman could be "similar" to the sample mean. Second, the parametric regression-based approaches cannot account for sticky floors or glass ceilings (e.g., due to differentiated access to top paying positions). Third, if the patterns of selection into, e.g., employment differ across genders, the parametric decompositions assign to wage mechanisms what is effectively unrelated to wages per se but is related to employment. All three of these problems may generate a significant bias in the results for wage gaps, and analogous examples can be established for other outcome measures.

A wide array of new methods addresses one or more of these shortcomings. For example, the parametric decomposition methods have been extended to allow for selection into employment à la [23], with specificity of the selection patterns translating directly to the measures of unjust inequality. Substantial effort went into providing decomposition methods for non-continuous outcomes (e.g., self-employment, access to public services, health status) [24, 25]. There are also many semi-parametric or non-parametric methods, whose major advantage is that they make it possible to go beyond the mean and analyze continuous outcomes along their distributions [26–28]. Again, accounting for selection into employment is a challenge, addressed partially; see [29, 30]. The urge to compare only the comparable implies that a decision needs to be made about what "comparable" actually means. [31] proposes an exact matching method to identify "comparable" individuals and then isolate the "incomparable" individuals in both groups to infer the possible selectivity in the wage process. An alternative approach consists of reweighting the distribution of one group to replicate the distribution of the other group in terms of individual characteristics [32].
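The reweighting idea can be sketched for the simplest possible case, in which characteristics are summarized by a small number of discrete cells, so that each woman's weight is the ratio of the male to the female share of her cell. This is a simplified illustration with invented data and a hypothetical function name; it is not claimed to reproduce the estimator of [32].

```python
from collections import Counter
from statistics import fmean


def reweighted_gap(men, women):
    """Reweight women so that their cell distribution mimics that of men.

    `men` and `women` are lists of (cell, wage) pairs. Women in cells
    with no men receive weight zero; cells with men but no women cannot
    be reproduced, and the weights are renormalized over what remains.
    """
    count_m = Counter(cell for cell, _ in men)
    count_f = Counter(cell for cell, _ in women)
    n_m, n_f = len(men), len(women)
    # Weight = P(cell | men) / P(cell | women).
    weights = [
        (count_m[cell] / n_m) / (count_f[cell] / n_f) for cell, _ in women
    ]
    counterfactual = sum(
        wt * wage for wt, (_, wage) in zip(weights, women)
    ) / sum(weights)
    mean_m = fmean(wage for _, wage in men)
    mean_f = fmean(wage for _, wage in women)
    return {
        "raw": mean_m - mean_f,
        "composition": counterfactual - mean_f,  # due to characteristics
        "adjusted": mean_m - counterfactual,     # remaining gap
    }


# Invented data: one characteristic with two cells, wages in arbitrary units.
men = [("low", 10), ("high", 20), ("high", 22), ("high", 24)]
women = [("low", 9), ("low", 11), ("low", 10), ("high", 18)]

gap = reweighted_gap(men, women)
print(gap)
```

In this toy example most of the raw gap reflects the fact that men are concentrated in the better-paid cell. The remainder is the gap that persists after women are given the male distribution of characteristics.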

The literature provides a wide selection of methods to obtain measures of AGWG which also operate along the wage distribution rather than simply at the mean [26–28, 32]. While estimating the AGWG with this technique, any parametric decomposition may be applied for obtaining the parameters of the wage equation, adjusted for the distributional properties of wages. The methods that rely on the functional form of the estimated wage regression may lead to issues if the model is misspecified and model parameters are biased.

The literature also provides semi-parametric and non-parametric alternatives to estimate AGWG. In [32], the counterfactual conditional distribution is obtained via a reweighting procedure through which the attributes observed among women (men) are given weights such that the resulting distribution of characteristics resembles that of men (women). Hence, this technique utilizes information about the similarity of both male and female populations in terms of underlying characteristics relevant to wages. The non-parametric decomposition proposed by [31] applies exact matching to construct a counterfactual population of women. The advantages of this decomposition method comprise (a) comparing the comparable, because a common support restriction is applied prior to matching, and (b) the validity of the estimates not resting on functional form assumptions. Moreover, differences between workers inside and outside of the common support are informative of the consequences of segmentation.
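The common support restriction can be illustrated with a short sketch of exact matching on discrete characteristics. The data, the cell definitions (education and occupation), and the function name are invented; the sketch reports only the share of each group inside the common support and the unexplained gap among matched workers, and it does not attempt the full decomposition of [31].

```python
from collections import defaultdict
from statistics import fmean


def exact_matching(men, women):
    """Compare men and women only within cells where both are observed.

    `men` and `women` are lists of (cell, wage) pairs, where a cell is
    any hashable combination of characteristics.
    """
    wages_m, wages_f = defaultdict(list), defaultdict(list)
    for cell, wage in men:
        wages_m[cell].append(wage)
    for cell, wage in women:
        wages_f[cell].append(wage)

    support = set(wages_m) & set(wages_f)  # cells with both genders
    if not support:
        raise ValueError("no common support between the two groups")

    n_men_in = sum(len(wages_m[c]) for c in support)
    n_women_in = sum(len(wages_f[c]) for c in support)
    mean_men_in = sum(sum(wages_m[c]) for c in support) / n_men_in
    # Counterfactual: women's mean wage per cell, weighted by the number
    # of men in that cell (women resampled to match men's characteristics).
    counterfactual = sum(
        len(wages_m[c]) * fmean(wages_f[c]) for c in support
    ) / n_men_in
    return {
        "share_men_matched": n_men_in / len(men),
        "share_women_matched": n_women_in / len(women),
        "unexplained_in_support": mean_men_in - counterfactual,
    }


# Invented data: cells are (education, occupation) pairs.
men = [
    (("tertiary", "manager"), 30),
    (("tertiary", "clerk"), 20),
    (("secondary", "clerk"), 16),
    (("secondary", "clerk"), 18),
]
women = [
    (("tertiary", "clerk"), 18),
    (("secondary", "clerk"), 15),
    (("secondary", "clerk"), 15),
    (("primary", "clerk"), 12),
]

matched = exact_matching(men, women)
print(matched)
```

Here the male manager and the woman with primary education have no counterpart of the other gender, so they fall outside the common support. How their wages compare with those of matched workers is what makes the out-of-support groups informative about segmentation.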
