Does universal long-term care insurance boost female labor force participation? Macro-level evidence

Introduction

The introduction or expansion of a social insurance program is one of the most essential and controversial public policy issues in both developed and developing countries. Public health insurance is a central topic of study and debate. The 2008 Medicaid expansion in Oregon and the Affordable Care Act in the United States, for instance, are leading examples of policy changes that have led to numerous academic studies and public discussions (Obama, 2016; Sommers et al., 2017).

While public long-term care (LTC) insurance is much less studied than public health insurance, public LTC systems and publicly financed formal care are important both for those who need care themselves and for their family members, particularly female informal caregivers. (In this paper, LTC indicates personal care for those who need assistance with their daily activities, formal caregivers means publicly funded caregivers, and informal caregivers are all other types of caregivers, such as family and relatives.)

The effects of informal caregiving on the female labor supply have been intensively studied in health and labor economics. Lilly et al. (2007) and Bauer and Sousa-Poza (2015) reviewed studies on the impact of informal caregiving on caregivers’ labor supply and related outcomes. The estimated effects of informal caregiving on the (female) labor supply in these studies are different and heterogeneous, but both Lilly et al. (2007) and Bauer and Sousa-Poza (2015) conclude that in general, the estimated negative impacts tend to be small or modest. Another important topic in the empirical literature on LTC in economics is the relationship between informal and formal LTC (Van Houtven and Norton, 2004; Charles and Sevak, 2005; Hanaoka and Norton, 2008; Bolin et al., 2008; Bonsang, 2009; Barczyk and Kredler, 2017, 2018).

On the other hand, only a few papers examine the effects of public LTC insurance (hereafter LTCI) on the female labor supply (Shimizutani et al., 2008; Tamiya et al., 2011; Sugawara and Nakamura, 2014; Fukahori et al., 2015; Geyer and Korfhage, 2015, 2018; Fu et al., 2017). We can point out at least two reasons why the effects of public LTCI have not been studied much. First, there are only a handful of developed countries (Luxembourg, the Netherlands, Germany, Japan, and South Korea) that have introduced independent public LTCI programs. Public LTC services in many other countries are mainly financed by general tax revenues and/or health-related public insurance programs and are provided as a kind of social or health service. Thus, it is often difficult to find exogenous sources of variation in LTC services that enable researchers to identify the fiscal, economic, and social consequences of public LTC programs. Second, even if we find a distinct introduction or expansion of a public LTCI program, it is difficult to estimate its causal impact due to the universality of current LTCI schemes in several countries. In short, because there are no solid “control” groups within the same country due to the universality of LTCI programs, we cannot compare the socioeconomic outcomes of those who are covered by LTCI with their estimated counterfactual outcomes. This universal feature of existing public LTCI is a major obstacle to the plausible identification of the impact of an LTCI introduction.

To overcome the difficulty of finding a reliable control group within the same country, we estimate the nationwide aggregate impact of a large-scale LTCI introduction in Japan on public finance and female labor force participation, utilizing within-country variations in the country-level panel data. Our empirical strategy relies on the synthetic control (SC) method developed by Abadie and Gardeazabal (2003) and Abadie et al. (2010) for plausible statistical causal inference in a case study. By “case study” we mean that the number of the “treated” cases or units is only one, which in this paper is Japan.

Our findings suggest that LTCI introduction substantially increased the in-kind benefits in Japan but did not crowd out public health expenditure. We also do not find any positive LTCI impact on the labor force participation for middle-aged women. These findings imply that LTCI introduction in Japan was not a sufficient booster capable of altering Japanese female-dependent informal caregiving and low female labor market participation, which are often identified as characteristics of Japanese familialism (OECD, 2012, 2017).

Our contributions are twofold. First, this is to our knowledge the first study that investigates the nationwide general-equilibrium impacts of a large-scale LTCI introduction. Most previous studies of LTCI effects on labor supply, which we will discuss in Section 2.3, use individual-level data to identify partial-equilibrium effects, explicitly or implicitly investigating changes in the labor supply of informal caregivers before and after LTCI introduction. These microlevel partial effects are informative and policy-relevant, but they do not provide information about how a nationwide universal LTCI introduction has (or has not) changed the country in question's aggregate fiscal and labor-market conditions (see, among others, Heckman et al., 1998; Blundell et al., 2004; and Finkelstein, 2007, for the distinction between a partial-equilibrium effect and a general-equilibrium effect in social program evaluations). Our result of no LTCI effect on female labor force participation differs from some microlevel empirical evidence, which suggests that we need to reconsider several possible pathways from LTCI to female labor supply.

Second, it is interesting to shed light on the nationwide impact of a universal LTCI program on female labor force participation, because a large-scale LTCI program could alter the balance between “home production” and publicly subsidized LTC services. For example, while some recent influential historical or cross-country studies on the determinants of female labor force participation do not focus on the roles of informal and formal LTC (Goldin, 2006, 2014; Olivetti and Petrongolo, 2016, 2017), some cross-country or within-country studies find a negative relationship between the level of “family ties” or “home production” and (female) labor force participation (Alesina and Giuliano, 2010; Ngai and Pissarides, 2011). In addition, whereas many microlevel studies of the effects of informal caregiving on female labor supply did not find strong negative effects, Crespo and Mira (2014) found a clear North–South gradient (in Europe) in the positive effect of parental ill health on the probability of informal caregiving by daughters and also observed weaker evidence of a North–South gradient in the negative effect of informal caregiving on the female labor force participation. Although the Japanese case was not studied in the work of Crespo and Mira (2014), the literature of comparative welfare states often categorizes Japan among the “familialistic” welfare states; other examples include southern and continental European countries, where female family members play a primary role in the provision of child and elderly care (Esping-Andersen, 1997, 1999). It was thus expected that Japan would be on the “south” side and the expansion of formal LTC services by LTCI would reduce the burden of female caregivers and boost female labor force participation. The fact that we did not find such an effect at an aggregate level suggests that we need to reexamine the determinants of the labor supply of middle-aged women.

The rest of the paper is organized as follows. In Section 2, we discuss the institutional backgrounds of LTCI introduction in the international and Japanese contexts. Section 3 explains our empirical strategy with an SC method. In Section 4, we describe our data sources and data arrangements and then show descriptive statistics. Section 5 provides the results of SC estimation and conventional placebo tests. Section 6 presents the results of additional SC estimations. Section 7 discusses our results and concludes our paper.

Background
LTCI in the international context

In 2014, the aging rate (population ages 65 and above, % of total) among OECD nations ranged from 6.9% (Mexico) to 27.0% (Japan), and the averages of the aging rates among the OECD and EU members were 16.8% and 19.8%, respectively. All of these numbers are unprecedentedly high (World Development Indicators, 2017). Faced with aging societies, OECD nations have introduced and developed LTC systems that are based on their own institutional and historical backgrounds (Colombo et al., 2011; Swartz, 2013; Costa-Font et al., 2015).

Table 1 summarizes the characteristics of LTC systems for the elderly among OECD countries in terms of coverage, benefits, and sources of funding based on Colombo et al. (2011). We do not consider LTC systems that target only non-elderly people with a disability. As can be seen in this table, LTC systems are quite diverse among OECD nations, but there are some clusters. First, Nordic countries, which are often considered as the leading welfare states, finance LTC costs through tax revenues. In addition, these countries provide LTC services to people with a disability without specific age-related criteria. The United Kingdom, Spain, and the Czech Republic are also categorized in this cluster. Second, many continental European countries such as France, Italy, and Austria adopt more mixed financing systems, but also provide LTC services without strict age-related criteria. Third, public LTCI has been adopted by only a few continental European and Asian countries such as Germany, Luxembourg, the Netherlands, Japan, and South Korea, where public health insurance systems had already been adopted before the introduction of LTCI.

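Table 1. LTC systems in OECD countries

Coverage and benefits are shown in the four right-hand columns, by sources of funds.

| Sources of funds | People with a disability: in kind | People with a disability: cash and in kind | Aged people with a disability / people with an age-related disability: in kind | Aged people with a disability / people with an age-related disability: cash and in kind |
|---|---|---|---|---|
| Tax revenues | Canada | Czech Republic, Denmark, Finland, Ireland, Norway, Spain, Sweden, UK | Greece | Slovak Republic |
| LTC insurance (premiums and taxes) | | Germany, Luxembourg, Netherlands | Japan | Korea |
| Mixed | Hungary, Portugal | Austria, Belgium, France, Italy, Poland, Slovenia, Switzerland | Australia | Mexico, US |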

Source: the author's tabulation based on Colombo et al. (2011), Table 7.1.

Japan has had LTCI since 2000. One important feature of the Japanese LTCI is that its introduction caused sharp rather than incremental increases in LTC financing and spending. This provides us with a good opportunity to identify the impact of a large-scale LTCI introduction.

LTCI in Japan

Before LTCI was introduced in 2000, public LTC services in Japan were mainly means-tested programs for the low-income elderly. Under the means-tested programs, the elderly people in need of LTC but ineligible for public LTC benefits were often admitted to hospitals and stayed there for a long time even after necessary medical treatment had concluded (Campbell and Ikegami, 2000, 2003). This is called “social hospitalization” of the elderly, which was (and still is) considered a notorious social phenomenon in Japan's aging society. This problem was exacerbated by the introduction of a new healthcare scheme for the elderly in 1983, which had a relatively generous payment system for elderly hospital admission. In order to minimize such “social hospitalization” and to cope with both increasing medical costs for the elderly and the expanding need for LTC services caused by a rapidly aging population, in the 1990s the Japanese government implemented several reforms that were financed by national and local taxes. Due to several limitations of the tax-financing LTC system, the Long Term Care Insurance Law was enacted in 1998 and enforced in April 2000.

In what follows, we explain the institutional setting of LTCI in Japan. First, when it comes to financing, LTCI in Japan is managed as a uniform and independent social insurance system that is, however, financed by both insurance premiums and taxes. This mixed financing system is not peculiar to LTCI; the Japanese public health insurance system is also financed by both insurance premiums and taxes. Insurers of LTCI are in principle local municipalities and the sources of their revenues are insurance premiums and local and national taxes along with fiscal adjustment systems.

Second, regarding eligibility, Japan's LTCI is a universal program that does not require means-testing for eligibility for LTC services. That is, all people aged 65 and above are covered by Japan's LTCI, but people aged 40–64 years are only eligible for LTCI benefits if they have age-related diseases. LTC benefits for younger people with a disability are mostly provided by local governments and financed by tax revenues.

Third, the Japanese LTCI is a centralized system. That is, the institutional settings of financing, care needs assessments, and eligibility criteria and the types and schemes of service provision are mostly determined by the central government. At the same time, the insurers are local municipalities and have responsibility for the implementation of care needs assessments and the planning of local LTC service provision. In most areas, LTC services are publicly funded by LTCI but often privately provided by firms and nonprofit organizations.

Fourth and finally, Japan's LTCI provides only in-kind benefits. The Japanese government has not included cash benefits for informal caregivers in LTCI, probably because of concern that cash benefits could strengthen female gender roles in informal caregiving and could prevent women from joining or staying in the labor market (Campbell, 2002). Hieda (2012) also indicates that the Ministry of Health and Welfare excluded the option of cash benefits in the early stages of the policymaking process due to fiscal reasons.

For the historical, institutional, and political backgrounds of LTCI introduction in Japan, see also Campbell and Ikegami (2000, 2003), Campbell (2002), Campbell et al. (2010), and Rhee et al. (2015). In addition, Tables A1 and A2 in Appendix A1, which are based on Exhibit 3 and Exhibit 2 in Campbell et al. (2010), respectively, compare LTCIs in Japan and Germany based on the institutional settings in 2008.

Micro versus macro impact

In this paper, we focus on LTCI's macrolevel impact on public expenditure and female labor participation. The advantage of using country-level data is that we can examine the nation-level general-equilibrium impact of LTCI, which is rarely investigated in the literature.

Previous individual-level studies of LTCI effects on female labor supply (mostly in Germany and Japan) present mixed results (Geyer and Korfhage, 2015, 2018; Shimizutani et al., 2008; Tamiya et al., 2011; Sugawara and Nakamura, 2014; Fukahori et al., 2015; Yamada and Shimizutani, 2015; Kondo, 2017; Fu et al., 2017). As a whole, however, these studies imply that LTCI with in-kind benefits may have some positive effect on female labor supply, whereas LTCI with cash benefits seems to have a negative effect, although evidence is still insufficient to draw a strong conclusion. If these implications based on individual-level studies can be straightforwardly applied to a macrolevel analysis, we expect Japan, where only in-kind benefits are available, to have experienced a positive LTCI impact on female labor supply.

The findings of the above microlevel studies are important, but there are some limitations. Several previous studies argue that they utilize a difference-in-differences (DID) method as their identification strategies (Geyer and Korfhage, 2018; Shimizutani et al., 2008; Tamiya et al., 2011; Fukahori et al., 2015; Fu et al., 2017). Treatment and control groups in these studies, however, are not defined based on an exogenous group-level exposure to LTCI introduction as a standard DID framework implies. This is because in Germany and Japan, LTCI programs were uniformly introduced nationwide and their coverage is universal (for all generations in Germany and for the elderly in Japan). Hence it is impossible to compare informal caregivers who are affected by LTCI introduction with informal caregivers who are not.

Most of the previous studies therefore compare changes in the labor supply before and after LTCI introduction between “informal caregivers” and “others”, without directly comparing LTCI-affected with non-LTCI-affected caregivers. This empirical strategy may be plausible in some circumstances, but cannot take into account the fact that the introduction of a universal LTCI scheme should also affect the decision-making behind “being a caregiver or not”. This may result in possible endogeneity bias, because “informal caregivers” consist of different subgroups before and after LTCI introduction.

In addition, large-scale LTCI introduction can also affect female labor force participation through the creation of employment opportunities for middle-aged women. For example, some empirical welfare-state studies such as Mandel and Semyonov (2006) emphasize the role of the welfare state as a provider of employment opportunities for women. This employment effect of LTCI introduction on the female labor market may lead to the violation of SUTVA (Stable Unit Treatment Value Assumption) in microlevel studies.

One alternative way to identify the causal effect of an LTCI introduction that takes into account these problems is to exploit some regional variation in the intensity of the LTCI introduction. For example, Løken et al. (2016) utilize the differential increase in the availability of federal funds in municipalities caused by a national LTC reform for the elderly in Norway, and Hollingsworth et al. (2017) utilize an LTC policy reform in Scotland using England and Wales as the control regions.

It is, however, difficult to find such an exogenous variation in the introduction of a universal and uniform LTCI, which may explain why previous studies in Japan and Germany utilize the different identification strategies described above.

We therefore shift our focus from a microlevel or municipality-level variation to a country-level variation to examine the aggregate impact of LTCI introduction. One advantage of cross-country analysis is that we can directly investigate the nation-level aggregate impact of LTCI introduction by comparing Japan with other countries that have not experienced LTCI introduction.

While it may be difficult to construct a valid SC unit using country-level data because of the large heterogeneity among countries, a growing number of studies investigate the aggregate impact of nation-level policy reforms on relevant outcomes using country-level panel data and the SC method (Ryan et al., 2016; Restrepo and Rieger, 2016; Rieger et al., 2017; Arnold and Stadelmann-Steen, 2017; Podestà, 2017; Barlow, 2018; Olper et al., 2018; Tanndal and Waldenström, 2018; Andersson, 2019; Rubolino and Waldenström, 2020; Geloso and Pavlik, 2020; Absher et al., 2020).

To address the inherent vulnerability of constructing an SC unit using country-level data, we provide sensitivity and placebo analyses based on methods proposed in Abadie et al. (2010) and Abadie et al. (2015) and demeaned-outcome analysis based on Ferman and Pinto (2019). In addition, we implement more extensive placebo analyses by combining permuted treatment assignment and leave-one-out estimation (Appendix A5). Finally, we also provide additional robustness checks by limiting donor pool countries and conducting in-time placebo tests (Appendix A6). All of these additional analyses are meant to cope with some drawbacks in exploiting cross-country variation for causal inference and also address some concerns about the uniqueness or incomparability of Japanese demographic conditions.

Empirical Strategy
A case study using the SC method

Because our study focuses on the specific nationwide event of LTCI introduction in Japan using country panel data, we have only one “treated” unit in our sample for analysis. The SC method proposed by Abadie and Gardeazabal (2003) and Abadie et al. (2010) is a suitable method to investigate the impact of such a single but noticeable event. We will now briefly explain how the SC method achieves the identification of aggregated LTCI effects.

First, let us define the aggregate effect of the LTCI introduction on some outcome variable $Y_{it}$ as $\alpha_{it}$, where $i$ and $t$ indicate a country and a year, respectively. This implies we assume that the effect of the LTCI introduction varies across countries and years. Next, we consider the situation in which an LTCI program is introduced in country $i = J$ (i.e. Japan) in year $T_0$ and assume that the LTCI introduction is fully implemented and irreversible. In this case, we can define the treatment effect $\alpha_{Jt}$ as follows:

$$\alpha_{Jt} = Y_{Jt}(1) - Y_{Jt}(0), \quad \text{for } t > T_0,$$

where $Y_{Jt}(D_J)$ is a potential outcome with the intervention status $D_J$, where $D_J = 1$ indicates the LTCI introduction and $D_J = 0$ represents no LTCI introduction. Thus $Y_{Jt}(1)$ is identical to an observed outcome $Y_{Jt}^{obs}$ and $Y_{Jt}(0)$ is a “counterfactual” outcome that would be realized if country $J$ did not introduce LTCI in $t \ge T_0 + 1$. In order to estimate $\alpha_{Jt}$, we need to estimate $Y_{Jt}(0)$ in $t \ge T_0 + 1$.

Abadie and Gardeazabal (2003) and Abadie et al. (2010) proposed a novel method to estimate $Y_{Jt}(0)$ by utilizing the weighted average of outcome variables of control units $i$ ($i = 1, 2, \ldots, N$), that is, $\sum_{k \ne J} w_k^* Y_{kt}^{obs}$. An optimal time-invariant weight $w_k^*$ for each control unit $k$ is determined so that the vector of optimal weights $W^* = (w_1^*, w_2^*, \ldots, w_k^*)'$ minimizes the difference between the pre-intervention outcomes and characteristics (called predictors) of the treated unit and the weighted average of predictors of the control units, given that $0 \le w_k^* \le 1$ and $\sum_{k \ne J} w_k^* = 1$. A single fictional control unit constructed by the optimal weights $W^*$ is called a synthetic control.

Thus, the SC has pre-intervention outcomes and characteristics that are as similar as possible to those of the treated unit in terms of observed predictors, but it does not receive a treatment in the post-intervention period. Therefore the outcome of the SC in the post-intervention period is meant to represent the counterfactual status of the treated unit, $Y_{Jt}(0)$.
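As an illustration of the weight selection described above, the following minimal Python sketch searches for non-negative weights that sum to one by projected gradient descent. It is a simplification and not the estimation used in this paper: all predictors are assumed to be equally important (no predictor-importance matrix is estimated), and the function names and toy numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_k >= 0, sum_k w_k = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def sc_weights(treated, donors, n_iter=20000):
    """Minimise the squared distance between the treated unit's predictors
    and the weighted average of donor predictors (equal predictor weights).

    treated: list of P predictor values for the treated unit.
    donors:  list of K lists, one per donor, each with P predictor values.
    """
    n_donors, n_pred = len(donors), len(treated)
    w = [1.0 / n_donors] * n_donors
    # Conservative step size: 2 * (sum of squared donor entries) bounds the
    # largest curvature of the quadratic objective.
    step = 1.0 / (2.0 * sum(x * x for d in donors for x in d))
    for _ in range(n_iter):
        fitted = [sum(w[k] * donors[k][p] for k in range(n_donors))
                  for p in range(n_pred)]
        resid = [treated[p] - fitted[p] for p in range(n_pred)]
        grad = [-2.0 * sum(resid[p] * donors[k][p] for p in range(n_pred))
                for k in range(n_donors)]
        w = project_to_simplex([w[k] - step * grad[k] for k in range(n_donors)])
    return w


# Hypothetical toy data: the treated unit is an equal mix of donors 1 and 2.
treated_predictors = [2.0, 1.0, 1.0]
donor_predictors = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0], [10.0, 10.0, 10.0]]
weights = sc_weights(treated_predictors, donor_predictors)
print([round(w, 3) for w in weights])
```

In this toy case the weights should approach 0.5, 0.5, and 0, because the third donor is far from the treated unit in every predictor.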

Given that the SC can provide unbiased estimates of the counterfactual status of the treated unit $Y_{Jt}(0)$, $\alpha_{Jt}$ is estimated as follows:

$$\hat{\alpha}_{Jt} = Y_{Jt}^{obs} - \sum_{k \ne J} w_k^* Y_{kt}^{obs}.$$
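Once the weights are fixed, the estimator is a simple period-by-period gap between the treated unit and the weighted donors. The short Python sketch below computes it for hypothetical numbers; the function name, the weights, and the outcome series are illustrative assumptions, not values from this paper.

```python
def sc_gaps(treated_outcomes, donor_outcomes, weights):
    """Return the gap Y_Jt(observed) - sum_k w_k * Y_kt(observed) for each t.

    treated_outcomes: list of T outcomes for the treated unit.
    donor_outcomes:   list of K lists (one per donor), each with T outcomes.
    weights:          list of K non-negative weights that sum to one.
    """
    gaps = []
    for t, y_treated in enumerate(treated_outcomes):
        synthetic = sum(w * series[t] for w, series in zip(weights, donor_outcomes))
        gaps.append(y_treated - synthetic)
    return gaps


# Hypothetical example: two pre-intervention and two post-intervention periods.
treated_series = [1.0, 1.2, 2.0, 2.4]
donor_series = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.4, 1.6, 1.8]]
example_gaps = sc_gaps(treated_series, donor_series, [0.5, 0.5])
print([round(g, 2) for g in example_gaps])
```

Gaps close to zero before the intervention indicate a good pre-intervention fit; the later gaps are the estimated effects.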

Building on some parametric assumptions but allowing for time-varying unobserved confounders, Abadie et al. (2010) proved that the above SC estimator is unbiased if the treated unit and the SC are well-matched in observed predictors and outcome variables in long pre-intervention periods.

In a subsequent study, Abadie et al. (2015) recommended that the SC method should be applied in cases where a sizable number of pre-intervention periods are available, in order to construct a credible SC. We examine the effects of Japanese LTCI introduction since 2000 on the fiscal outcomes and female labor force participation. Our pre-intervention periods are in most cases about 20 years (1980–1999).

Informal test of the null hypothesis

One weakness of the SC method is that it does not provide a formal statistical test for the null hypothesis. As a complement to formal statistical hypothesis testing, Abadie et al. (2010) provide an alternative, informal placebo test akin to a permutation or randomization test: a researcher calculates and collects “placebo” SC estimates by assigning the “label” of the intervention status to each control unit in turn and then compares the true SC estimate with these placebo values. Most of the previous studies using the SC method show the results of this kind of placebo test, and we also present the results of this conventional test in Section 5.5. In Appendix A5, we further explore the placebo analysis in the SC method and provide extended placebo trials that are still informal but more rigorous and, we hope, more informative.
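The comparison of the true estimate with the placebo estimates can be sketched as follows. The sketch assumes that the SC gaps have already been computed for the treated unit and for every control unit treated as a placebo, and it summarizes each unit by the ratio of post- to pre-intervention root mean squared prediction error (RMSPE), one summary discussed in Abadie et al. (2010). Whether this exact statistic matches the tests reported later in the paper is an assumption; all names and numbers are hypothetical.

```python
import math


def rmspe(values):
    """Root mean squared prediction error of a list of gaps."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def placebo_rank(gaps_by_unit, treated, n_pre):
    """Compare the treated unit's post/pre RMSPE ratio with placebo ratios.

    gaps_by_unit: dict mapping unit name -> list of SC gaps over all periods.
    treated:      name of the truly treated unit.
    n_pre:        number of pre-intervention periods at the start of each list.
    Returns the ratios and the share of units whose ratio is at least as
    large as the treated unit's (the treated unit itself is included).
    Note: a pre-intervention RMSPE of exactly zero would divide by zero.
    """
    ratios = {unit: rmspe(gaps[n_pre:]) / rmspe(gaps[:n_pre])
              for unit, gaps in gaps_by_unit.items()}
    treated_ratio = ratios[treated]
    n_extreme = sum(1 for r in ratios.values() if r >= treated_ratio)
    return ratios, n_extreme / len(ratios)


# Hypothetical gaps: two pre-intervention and two post-intervention periods.
toy_gaps = {
    "Treated": [0.1, -0.1, 1.0, 1.0],
    "A": [0.2, -0.2, 0.2, -0.2],
    "B": [0.1, 0.1, 0.3, 0.3],
    "C": [0.5, -0.5, 0.5, 0.5],
}
toy_ratios, toy_share = placebo_rank(toy_gaps, "Treated", n_pre=2)
print(toy_share)
```

A small share means that few placebo units show a post-intervention divergence as large as that of the treated unit, relative to their pre-intervention fit.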

Selection of donor pool countries

One important issue in SC analysis is how to select the candidates for control countries, which are called “donor pool” countries. Due to data availability, we first limit our sample, including Japan, to 24 nations that had joined the OECD before 1980. This sample mostly consists of developed countries in Western Europe, North America (the United States and Canada) and Oceania (Australia and New Zealand), as well as Japan. This sample restriction is justifiable from an econometric perspective, because it is preferable to have relatively homogeneous control units in a donor pool that are reasonably comparable to the treated unit in terms of socioeconomic characteristics (Abadie et al., 2010, 2015).

In addition, we exclude Germany, the Netherlands, and Luxembourg from the donor pool because these countries adopted LTCI during the sample period. This means that we do not allow these countries to be included in the synthetic Japan.

Finally, Iceland, Greece and Turkey, which are original OECD members, are also excluded from the donor pool because of lack of data for Iceland, the unusual budgetary situation in the late 2000s for Greece, and significant socioeconomic differences with Japan for Turkey. Note that relatively new OECD members such as Eastern European countries and South Korea are also not included in the donor pool because, along with lack of data, they were developing countries with significantly different political regimes in the 1980s and 1990s.

As a result of these sample selection procedures, 17 OECD countries are selected as primary donor pool countries. In some SC estimations, a few more countries are dropped from the donor pool due to lack of data. In robustness checks, we further limit our donor pool countries based on several additional criteria.

Data

For our empirical analysis, we construct annual panel data for 18 OECD countries from 1980 to 2013 by combining various data sources (OECD, 2016, 2019a, b). Table A3 in Appendix A2 presents a complete list of the definitions and sources of our dataset.

To begin with, our main fiscal outcome to be investigated is the variable of in-kind benefits for the elderly, because Japanese LTCI provides only in-kind benefits and covers only the elderly. In order to investigate the crowding-out effects of LTCI on other related public expenditures, we also collect data for public health expenditures that in principle do not include LTC expenditure. For the units of the fiscal variables, we use expenditure as a percentage of GDP and expenditure per head. For these fiscal variables, we construct the panel data up to 2013.

We then use four variables describing female labor supply: female labor force participation (LFP) rates for middle-age cohorts 40–44, 45–49, 50–54, and 55–59. For these LFP variables, we also construct the panel data up to 2013. Unfortunately, we cannot analyze the counterpart male LFP rates with the SC method, because Japanese male LFP rates are among the highest in the OECD countries and a valid “synthetic Japan” cannot be constructed based on other OECD countries. Note that it is reasonable to use the data up to 2013 in order to avoid the possible confounding effects of the changing fiscal and macroeconomic environment caused by the introduction of so-called “Abenomics,” an aggressive macroeconomic policy under the Abe administration, in 2013 and the increase in the consumption tax rate from 5% to 8% in 2014.

Our main predictors are pre-intervention outcomes and demographic variables. All pre-intervention outcomes are used as separate predictors, based on the theoretical and empirical findings of Ferman et al. (2020). Demographic variables consist of the population under 15 as a percentage of the total population (child population), the growth rate of the child population, the population aged 65 and over as a percentage of the total population (elderly population), and the growth rate of the elderly population. These data are obtained from OECD Employment and Labour Force Statistics. Other demographic variables are employment in agriculture (% of civilian employment), employment in industry, and employment in services. These data come from the “Comparative Welfare State Dataset” (Brady et al., 2014).

We also include additional predictors that are meant to capture the impact of economic development on the outcomes of interest: per capita GDP and GDP growth. We use expenditure-side real GDP, which is taken from the “Penn World Table 8.1” (Feenstra et al., 2015). Per capita GDP is calculated as the expenditure-side real GDP divided by population.

Table 2 presents descriptive statistics of our panel data. The original data consist of unbalanced panel data for 17 OECD donor pool countries and Japan between 1980 and 2013, although the data availability differs by year and country. We show descriptive statistics for Japan and the donor pool countries, respectively. The period for the outcome variables is between 1980 and 2013, but the period for the predictors other than pre-intervention outcomes is between 1980 and 1999, because we use only pre-intervention statistics for the predictor variables. In order to implement SC estimation with the annual data, we impute missing values by linear interpolation, but we do not extrapolate any values. Thus, we sometimes drop years or countries due to data limitations depending on the outcome variable.
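The imputation rule described above (linear interpolation of interior gaps, no extrapolation) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' own procedure; the function name and the sample series are hypothetical.

```python
def interpolate_interior(series):
    """Linearly interpolate missing values (None) that lie between two
    observed values; leading and trailing gaps are left missing, so
    nothing is extrapolated."""
    filled = list(series)
    observed = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(observed, observed[1:]):
        span = right - left
        for i in range(left + 1, right):
            share = (i - left) / span
            filled[i] = series[left] + share * (series[right] - series[left])
    return filled


# Hypothetical annual series with gaps at both ends and in the middle.
raw_series = [None, 10.0, None, None, None, 18.0, None]
imputed = interpolate_interior(raw_series)
print(imputed)  # [None, 10.0, 12.0, 14.0, 16.0, 18.0, None]
```

Years that remain missing after this step (here the first and last) are the ones that would have to be dropped for the affected country or outcome.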

Table 2. Descriptive statistics

| Variable | Japan: Obs. | Japan: Mean | Japan: Std. Dev. | Japan: Min. | Japan: Max. | Donor pool: Obs. | Donor pool: Mean | Donor pool: Std. Dev. | Donor pool: Min. | Donor pool: Max. |
|---|---|---|---|---|---|---|---|---|---|---|
| Outcomes (1980–2013) | | | | | | | | | | |
| Public expenditure on benefits in kind for the elderly (% of GDP) | 34 | 0.61 | 0.59 | 0.10 | 1.74 | 544 | 0.57 | 0.72 | 0.00 | 2.86 |
| Public expenditure on benefits in kind for the elderly (per elderly person) | 34 | 1135.71 | 955.26 | 245.76 | 2749.34 | 544 | 1491.81 | 1970.72 | 0.00 | 8005.41 |
| Total public expenditure on health care (% of GDP) | 34 | 5.41 | 0.99 | 4.37 | 7.68 | 578 | 5.43 | 1.30 | 2.12 | 8.72 |
| Total public expenditure on health care (per capita) | 34 | 1839.76 | 578.51 | 971.47 | 3031.47 | 578 | 2017.54 | 765.16 | 402.75 | 4349.87 |
| Female labor force participation rate (age 40–44) | 34 | 69.63 | 2.00 | 64.11 | 73.11 | 511 | 75.57 | 12.84 | 24.19 | 93.54 |
| Female labor force participation rate (age 45–49) | 34 | 71.53 | 3.24 | 64.44 | 76.14 | 511 | 73.22 | 14.77 | 24.29 | 92.73 |
| Female labor force participation rate (age 50–54) | 34 | 66.81 | 4.40 | 58.76 | 75.13 | 511 | 66.31 | 16.64 | 23.79 | 88.54 |
| Female labor force participation rate (age 55–59) | 34 | 56.87 | 4.75 | 49.86 | 66.50 | 520 | 52.21 | 17.90 | 14.41 | 83.41 |
| Predictors other than pre-intervention outcomes (1980–1999) | | | | | | | | | | |
| Per capita real GDP (million US$, PPP, 2005) | 20 | 22.80 | 5.45 | 16.18 | 30.01 | 340 | 22.29 | 5.94 | 8.24 | 39.61 |
| Child population (%) | 20 | 18.92 | 3.03 | 14.79 | 23.51 | 340 | 20.29 | 2.94 | 14.29 | 30.44 |
| Elderly population (%) | 20 | 12.29 | 2.44 | 9.10 | 16.72 | 340 | 13.75 | 2.08 | 9.41 | 17.91 |
| Employment in agriculture (%) | 20 | 7.41 | 1.74 | 5.18 | 10.42 | 340 | 7.52 | 4.84 | 1.56 | 27.26 |
| Employment in industry (%) | 20 | 34.08 | 0.97 | 31.66 | 35.33 | 340 | 29.44 | 4.28 | 21.27 | 40.26 |
| Employment in services (%) | 20 | 58.51 | 2.58 | 54.24 | 63.15 | 340 | 63.03 | 7.76 | 36.12 | 74.47 |
| Annual growth rate of per capita real GDP (%) | 20 | 2.83 | 3.72 | −2.53 | 9.97 | 340 | 2.67 | 3.27 | −9.72 | 11.94 |
| Annual growth rate of population (%) | 20 | 0.44 | 0.20 | 0.16 | 0.80 | 340 | 0.53 | 0.48 | −0.73 | 3.93 |
| Annual growth rate of child population (%) | 20 | −2.35 | 0.78 | −3.64 | −0.41 | 340 | −1.08 | 1.13 | −4.15 | 1.40 |
| Annual growth rate of elderly population (%) | 20 | 3.22 | 0.67 | 1.79 | 4.03 | 340 | 0.85 | 1.00 | −2.49 | 3.02 |

Notes: Original data is unbalanced panel data for 18 OECD countries between 1980 and 2013. In order to implement SC analysis with annual data, we impute some missing values by linear interpolation, but we do not extrapolate any values. The 18 OECD countries are Australia, Austria, Belgium, Canada, Denmark, Finland, France, Ireland, Italy, Japan, New Zealand, Norway, Portugal, Spain, Sweden, Switzerland, the United Kingdom, and the United States. The period for the outcome variables is between 1980 and 2013, but the period for the predictors other than pre-intervention outcomes is between 1980 and 1999, because we use only pre-intervention values for the predictor variables. See Section 3.3 for a detailed explanation of the donor pool selection and Appendix A2 for the variable definitions and sources.

Results
Impacts on in-kind benefits for the elderly

Figure 1 provides the results of SC estimation for in-kind benefits for the elderly. The thick solid lines show realized in-kind benefits for the elderly, as a percentage of GDP in panel A and per elderly person in panel B. The other three lines show the corresponding values for the three SCs.

Figure 1

SC estimation for in-kind benefits for the elderly (% of GDP).

Notes: SC 1 is constructed from the original donor pool, SC 2 is constructed from a donor pool that excludes the country that receives the highest weight in the first SC estimation, and SC 3 is constructed from a donor pool that also excludes the country that receives the highest weight in the second SC estimation. Canada is excluded from the original donor pool due to a lack of data. For SC estimation, we use the synth command in Stata with the nested and allopt options. See Tables A4 and A5 in Appendix A3 for detailed estimation results.

SC 1 in the graph is constructed from the original donor pool and therefore its values are regarded as baseline counterfactual outcomes in the post-intervention period; that is, they represent the levels of in-kind benefits for the elderly if LTCI had not been introduced in Japan. As discussed in Section 3, SC estimates are the gaps between the outcomes of a treated unit and an SC. If an SC is validly constructed based on pre-intervention outcomes and predictors, SC estimates are expected to be around zero in the pre-intervention period and can be interpreted as causal effects in the post-intervention period.
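As an illustration of how an SC estimate is read, the following minimal sketch computes the year-by-year gap between a treated unit and a weighted average of donor units. The country labels, weights, and outcome values are hypothetical; the weights are taken as given here rather than estimated.

```python
from typing import Dict, List


def synthetic_outcome(weights: Dict[str, float],
                      donors: Dict[str, List[float]]) -> List[float]:
    """Weighted average of donor outcomes, computed year by year."""
    n_years = len(next(iter(donors.values())))
    return [sum(w * donors[name][t] for name, w in weights.items())
            for t in range(n_years)]


def sc_gaps(treated: List[float], synthetic: List[float]) -> List[float]:
    """SC estimate: treated outcome minus synthetic outcome in each year."""
    return [y - s for y, s in zip(treated, synthetic)]


# Hypothetical numbers (two pre-intervention and two post-intervention years).
donors = {"Country A": [1.0, 1.1, 1.2, 1.3], "Country B": [2.0, 2.0, 2.1, 2.2]}
weights = {"Country A": 0.75, "Country B": 0.25}  # non-negative, sum to one
treated = [1.25, 1.33, 1.9, 2.4]

gaps = sc_gaps(treated, synthetic_outcome(weights, donors))
print([round(g, 3) for g in gaps])
```

In this toy case the gaps are close to zero in the first two years and clearly positive afterwards, which is the pattern that would be read as a post-intervention effect if the SC is valid.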

SC 2 is constructed using SC estimation in which the country that receives the highest weight in the first SC estimation is excluded from the donor pool. SC 3 is constructed from a donor pool that additionally excludes the country that receives the highest weight in the second SC estimation. These robustness checks using SCs 2 and 3 are particularly important in our cross-country comparison, where there is a risk that some specific countries receive high weights and idiosyncratic shocks in these countries undermine the validity of the SC estimation. See also Abadie et al. (2015) for further discussion of this type of sensitivity check. In Appendix A3, we provide the weights and the predetermined covariate values used for constructing SCs 1–3.
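The sequential exclusion behind SCs 1–3 can be sketched as follows. This is a simplified stand-in and not the Stata synth routine: it matches only on pre-intervention outcomes (no other predictors, no nested optimization) and fits non-negative weights that sum to one by projected gradient descent. All function names, donor labels, and numbers are hypothetical.

```python
from typing import Dict, List


def project_to_simplex(v: List[float]) -> List[float]:
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running += u_k
        candidate = (running - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def fit_weights(treated: List[float], donors: Dict[str, List[float]],
                n_iter: int = 2000) -> Dict[str, float]:
    """Minimize the squared pre-period gap over simplex weights."""
    names = list(donors)
    # Since weights sum to one, the gap equals sum_j w_j * (treated - donor_j).
    resid = [[y - x for y, x in zip(treated, donors[n])] for n in names]
    scale = 2.0 * sum(r * r for col in resid for r in col)  # step-size bound
    w = [1.0 / len(names)] * len(names)
    if scale == 0.0:
        return dict(zip(names, w))
    for _ in range(n_iter):
        gap = [sum(w[j] * resid[j][t] for j in range(len(names)))
               for t in range(len(treated))]
        grad = [2.0 * sum(r * g for r, g in zip(resid[j], gap))
                for j in range(len(names))]
        w = project_to_simplex([wj - gj / scale for wj, gj in zip(w, grad)])
    return dict(zip(names, w))


def leave_top_donor_out(treated: List[float], donors: Dict[str, List[float]],
                        rounds: int = 3) -> List[Dict[str, float]]:
    """Refit after dropping the highest-weight donor of the previous fit."""
    pool, fitted = dict(donors), []
    for _ in range(rounds):
        if not pool:
            break
        weights = fit_weights(treated, pool)
        fitted.append(weights)
        del pool[max(weights, key=weights.get)]  # next SC excludes this donor
    return fitted


# Hypothetical pre-intervention outcomes for one treated unit and four donors.
donors = {"A": [62.0, 58.0, 62.0, 58.0], "B": [62.0, 62.0, 58.0, 58.0],
          "C": [62.0, 58.0, 58.0, 62.0], "D": [58.0, 62.0, 58.0, 62.0]}
treated = [62.0, 59.6, 60.4, 58.0]
for label, weights in zip(("SC 1", "SC 2", "SC 3"),
                          leave_top_donor_out(treated, donors)):
    print(label, {k: round(v, 2) for k, v in weights.items()})
```

If the post-intervention gaps look similar across the three fits, the result does not hinge on any single heavily weighted donor, which is the point of the check.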

The results of SC estimation in Figure 1 indicate sharp increases in in-kind benefits for the elderly in Japan just after the introduction of LTCI. The gaps between the actual benefit level and those of the SCs persist and increase during the sample period and the size of the gaps reaches around one percentage point of GDP in panel A and about US$1,000 (in terms of PPP) in panel B in 2010, 10 years after LTCI introduction.

We argue that an expenditure increase of one percentage point of GDP within 10 years is not negligible and that its aggregate-level impact on social and economic outcomes such as female labor force participation is worth investigating.

Crowding out health expenditures?

We then examine whether LTCI introduction crowds out closely related public expenditure, that is, other public health expenditures. Figure 2 provides the results of SC estimation for public health expenditure. When we compare the actual outcomes with those of SCs 1–3 in panels A and B, the gaps between the outcomes in Japan and synthetic Japan are negative in the early 2000s, indicating that LTCI introduction might have led to the suppression of public health expenditure in this period. This suppression, however, appears to be small in terms of effect size, and we will further examine the significance of the observed suppression using placebo tests in Section 5.5. Overall, there is no clear evidence that LTCI introduction has caused a large shift in public expenditure from healthcare to LTC or a persistent suppression of public health expenditure.

Figure 2

SC estimation of public health expenditure (% of GDP).

Notes: SC 1 is constructed from the original donor pool, SC 2 is constructed from a donor pool that excludes the country that receives the highest weight in the first SC estimation, and SC 3 is constructed from a donor pool that also excludes the country that receives the highest weight in the second SC estimation. Norway is excluded from the original donor pool due to a lack of data. For SC estimation, we use the synth command in Stata with the nested and allopt options. See Tables A6 and A7 in Appendix A3 for detailed estimation results.

Impacts on female labor force participation

Moving on from fiscal outcomes, Figure 3 provides our SC estimation results for female labor force participation (LFP) rates by age cohort. Despite the large fiscal expansion for LTCI, there is no sign of positive LTCI effects on the LFP rates in any of the cohorts. In fact, the female LFP rates for ages 50–54 and 55–59 appear to have even been suppressed after LTCI introduction, compared with those of all of the SCs.

Figure 3

SC estimation of female LFP rates by age cohort.

Notes: See the note in Figure 1 for a detailed explanation of the graph. Due to data availability, the first year of our sample is 1986. Austria, Ireland, and Switzerland are excluded from the original donor pool due to a lack of data. Except for the 55–59 age cohort (lower right graph), Finland is also excluded due to a lack of data. For SC estimation, we use the synth command in Stata with the nested and allopt options. See Tables A8–A11 in Appendix A3 for detailed estimation results.

The overall tendency of Japan's stagnated LFP rates in the post-intervention period suggests that there may exist a Japan-specific trend in the female LFP rates that is not taken into account by the SCs. This implies that SC estimates (i.e. outcome gaps between Japan and an SC) may not properly capture the causal effects of the LTCI introduction.

In order to eliminate this possible Japan-specific trend in the female LFP rates, we subtract the female LFP rate of ages 45–49 from the female LFP rates of ages 50–54 and ages 55–59 and then use these differenced variables as outcomes. The idea behind this procedure is that women aged 45–49, whose LFP rate is the highest among the four age cohorts in the post-reform period, are likely to be less affected by LTCI introduction, because their parents and parents-in-law tend to still have no need of LTC, whereas women aged 50–54 and 55–59 are likely to be more affected by LTCI introduction because of a higher need for LTC for their parents or parents-in-law.

Thus, subtracting the female LFP rate of age cohort 45–49 from that of an older cohort may effectively eliminate the Japan-specific trend of female LFP, leaving a change in the older cohort LFP rate caused by LTCI introduction. This estimation strategy is akin to the triple-difference or difference-in-difference-in-difference (DDD) strategy, although we use the SC method after differencing the outcomes of “more affected” and “less affected” cohorts in both treated and control (or donor pool) countries.
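The differencing step can be sketched in a few lines: for every country (treated and donor alike), the LFP rate of the less affected cohort is subtracted from that of a more affected cohort, and the resulting series is then used as the SC outcome. The data layout, cohort labels, and numbers below are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: country -> age cohort -> annual female LFP rates (%).
LfpPanel = Dict[str, Dict[str, List[float]]]


def cohort_difference(lfp: LfpPanel, older: str,
                      reference: str = "45-49") -> Dict[str, List[float]]:
    """Older-cohort LFP rate minus reference-cohort LFP rate, by country."""
    return {
        country: [o - r for o, r in zip(cohorts[older], cohorts[reference])]
        for country, cohorts in lfp.items()
    }


# Toy numbers only; they are not taken from the paper's data.
lfp = {
    "Japan": {"45-49": [71.0, 72.0, 73.0], "55-59": [57.0, 57.5, 57.5]},
    "Donor X": {"45-49": [75.0, 76.0, 77.0], "55-59": [55.0, 56.5, 58.0]},
}
print(cohort_difference(lfp, older="55-59"))
```

Any shock that moves both cohorts of a country by the same amount drops out of the differenced series, which is what makes the approach resemble a triple-difference design.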

The estimation results based on this strategy are shown in Figure 4. The left graph shows the trend of LFP-rate differences between the age cohorts 50–54 and 45–49 in Japan (bold solid lines) and its SCs. The right graph presents the counterpart trends of LFP-rate differences between the age cohorts 55–59 and 45–49. The results also do not indicate any positive impact of LTCI introduction on the female LFP rates for these two age cohorts. In fact, the right-hand graph again shows that the female LFP rates for the age cohort 55–59 seem to be suppressed after 2000 even after eliminating the trend of female LFP for the age cohort 45–49.

Figure 4

SC estimation for the difference in female LFP rates by age cohort.

Notes: See the note in Figure 1 for a detailed explanation of the graph. Due to data availability, the first year of our sample is 1986. Austria, Finland, Ireland, New Zealand, and Switzerland are excluded from the original donor pool due to lack of data. For SC estimation, we use the synth command in Stata with the nested and allopt options, but in cases in which there is an optimization error (due to a poor pre-intervention fit), we implement synth without nested and allopt. See Tables A12 and A13 in Appendix A3 for detailed estimation results.

Demeaned SC

To complement the above analyses, we also conduct additional SC estimation. That is, to alleviate imperfect pre-intervention fit, we implement SC estimation using demeaned outcomes in which pre-intervention outcome means are subtracted from outcome values (i.e. $\tilde{Y}_{it} = Y_{it} - (1/T_0)\sum_{t=1}^{T_0} Y_{it}$). Ferman and Pinto (2019) show that the demeaned SC estimator has better properties than the original SC estimator under some conditions, in particular when the pre-intervention fit of the conventional SC estimation is imperfect. Note that we use only pre-intervention demeaned outcomes as predictors.
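The demeaning transformation itself is simple and can be sketched as follows, assuming each unit's outcome is stored as a list ordered by year with the first T0 entries belonging to the pre-intervention period. The function name and the toy series are hypothetical.

```python
from typing import List


def demean_by_pre_period(series: List[float], n_pre: int) -> List[float]:
    """Subtract the unit's pre-intervention mean from every observation."""
    pre_mean = sum(series[:n_pre]) / n_pre
    return [y - pre_mean for y in series]


# Toy series: three pre-intervention years followed by two post-intervention years.
outcome = [50.0, 52.0, 54.0, 53.0, 51.0]
print(demean_by_pre_period(outcome, n_pre=3))
```

After the transformation every unit has a pre-intervention mean of zero, so the SC only needs to reproduce the shape of the treated unit's trajectory rather than its level.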

Figure 5 shows that the results of demeaned SC estimation are in line with those of the original estimation, while pre-intervention fits are improved for some LFP outcomes. Overall, we again find no sign of positive LTCI effects, and female LFP rates for ages 55–59 (and possibly ages 50–54) appear to have been suppressed after LTCI introduction.

Figure 5

SC estimation of demeaned outcomes.

Notes: See the note and the legend in Figure 1 for detailed explanations of the graph. Due to data availability, the first year of our sample for the outcomes of female LFP rates is 1986. Due to lack of data, Canada is excluded from the donor pool for the outcome of in-kind benefits for the elderly, Norway for the outcome of public health expenditure, and Austria, Ireland, and Switzerland for the outcome of female LFP rates. Note that we use only pre-intervention demeaned outcomes as predictors. For SC estimation, we use the synth command in Stata with the nested and allopt options.

Placebo results

Figure 6 shows estimation results for placebo trials on all of the outcomes except for the results of demeaned SC estimation. On the one hand, the first and second graphs in this figure indicate that the SC estimates for in-kind benefits for the elderly are higher than most of the placebo estimates from just after 2000, supporting the conclusion that LTCI introduction had a fiscal impact (an increase of around one percentage point of GDP by 2010). On the other hand, we do not find any clear effect on public health expenditure, although the negative SC estimates are relatively large around 2005, in particular for per capita public health expenditure.

Figure 6

Placebo results.

Notes: Thick lines are Japan's SC estimates and the other lines are placebo SC estimates. We calculate placebo SC estimates by assigning the “label” of the intervention status to each control unit, using all the other control units as a donor pool. Note that the composition of donor pools (control units) is different depending on the outcome variables due to data constraints. For baseline SC estimates (bold black line), we use the synth command in Stata with the nested and allopt options. For placebo SC estimates (colored line), we implement synth without nested and allopt, because nested and allopt options sometimes result in optimization errors in some placebo trials.

Negative SC estimates for female LFP rates are sometimes clearly larger in size than most placebo estimates. In particular, the female LFP rate for ages 55–59 decreases after 2000 and the magnitude is larger than any placebo estimates. This tendency is mitigated if we subtract the female LFP for ages 45–49 from female LFP for ages 55–59 (the last two graphs). The last graph, nonetheless, indicates that the female LFP rate for ages 55–59 stagnated after 2000 and the magnitude is larger than most of the placebo estimates, although pre-intervention fits are poor for many placebo trials.
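One way to summarize such a visual comparison numerically is sketched below. The summary statistic (mean absolute post-intervention gap) and the permutation-style share are assumptions chosen for illustration; they are not necessarily the test statistics used in the Appendices. The gap series are taken as already estimated, and all names and numbers are hypothetical.

```python
from typing import Dict, List


def mean_abs(values: List[float]) -> float:
    """Average absolute value of a series."""
    return sum(abs(v) for v in values) / len(values)


def placebo_share(treated_gap: List[float],
                  placebo_gaps: Dict[str, List[float]],
                  n_pre: int) -> float:
    """Share of units (treated included) whose post-period gap is at least
    as large in magnitude as the treated unit's gap."""
    treated_stat = mean_abs(treated_gap[n_pre:])
    placebo_stats = [mean_abs(g[n_pre:]) for g in placebo_gaps.values()]
    at_least = sum(1 for s in placebo_stats if s >= treated_stat)
    return (at_least + 1) / (len(placebo_stats) + 1)


# Toy gap series: two pre-intervention and two post-intervention years.
treated_gap = [0.0, 0.1, -2.0, -3.0]
placebo_gaps = {
    "P1": [0.1, 0.0, 0.5, -0.5],
    "P2": [0.0, 0.0, 1.0, 1.0],
    "P3": [0.2, -0.1, -3.0, -4.0],
}
print(placebo_share(treated_gap, placebo_gaps, n_pre=2))
```

A small share means that few placebo units show gaps as large as the treated unit's. A statistic of this kind ignores pre-intervention fit, so placebo trials with poor fit can inflate it.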

In the Appendices, we also provide the results of placebo tests for demeaned SC estimation (Appendix A4) and extended placebo tests based on permuted treatment assignment, resampled donor pools, and several test statistics (Appendix A5) to verify the above findings.

Further Analyses

This section discusses further SC analyses that address several concerns about our main SC estimation. We implement three different additional SC analyses. We briefly explain the backgrounds and results of these further examinations as follows.

First, we exclude from the donor pool the no-LTCI countries that experienced relatively high growth in in-kind benefits for the elderly in the post-intervention period. In the main SC estimation, we assume that Japan would have realized a similar level of in-kind benefits for the elderly to those of some no-LTCI countries if Japan had not introduced LTCI. However, if some control countries substantially increase their in-kind benefits for the elderly after 2000 without introducing LTCI, the interpretation of our SC estimation may be complicated, particularly if what we want to know is not the impact of LTCI itself but the impact of in-kind benefits for the elderly in general. We thus exclude several countries with the highest changes in in-kind benefits for the elderly in the 2000s (i.e., Austria, Finland, France, Spain, and the United Kingdom) based on Table A14 in Appendix A6. The SC estimation results do not change much from the baseline results, although the in-kind benefits for the elderly of SCs in the post-intervention period tend to be smaller than those in the baseline analysis (Figure A11 in Appendix A6).

Second, we remove five countries in which in-kind benefits for the elderly are relatively unreliable or unstable in the pre-intervention period. Specifically, we remove Italy, Belgium, and Portugal because their values of in-kind benefits for the elderly are zero in the period 1980–1989, possibly due to classification in OECD statistics, and we also drop Australia and Sweden because their in-kind benefits for the elderly fluctuate before the intervention years, probably due to their LTC reforms during this period (Cullen, 2003; Trydegård and Thorslund, 2010). Although it is not clear how these poor data properties in the pre-intervention period affect the post-intervention SC estimates, we implement SC estimation after excluding these five countries from the donor pool. The estimation results are again similar to our main findings (Figure A12 in Appendix A6).

Third, we implement in-time placebo SC estimation with a backdated intervention year. In the main analysis, the growth rate of the share of elderly in the pre-intervention period in Japan was larger than those of all the SCs (Appendix A3). Although it is not possible to solve this insufficient balancing on this variable using the standard SC method with positive weights, we can examine whether this insufficient balancing leads to serious estimation bias using in-time placebo tests proposed by Abadie et al. (2015) and Abadie (2019). To do this, we backdated the intervention period from 2000 to 1993 and implemented the in-time placebo test on a hold-out validation period to examine whether Japan's rapid population aging is a serious confounding factor in our analysis. The estimation results are shown in Figure A13 in Appendix A6. We neither observe an increase in in-kind benefits for the elderly nor stagnation of female LFP rates for ages 50–54 and 55–59, implying that the pre-LTCI rapid population aging in Japan was not a direct driving force of these outcomes. These results suggest that the sharp increase in the in-kind benefits for the elderly after 2000 in Japan and no counterpart increase in female LFP rates can be interpreted as evidence of no significant positive LTCI effect on female LFP.
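The backdating logic can be sketched as follows: the pre-intervention sample is split at the placebo year, the SC is fitted on the earlier part only, and the gap is then inspected in the hold-out years before the actual intervention. The sketch assumes that the hold-out window runs from the placebo year up to the year before the actual intervention; the function names and all values are hypothetical, and the synthetic series is taken as given.

```python
from typing import List, Tuple


def split_periods(years: List[int], values: List[float],
                  placebo_year: int, actual_year: int
                  ) -> Tuple[List[float], List[float]]:
    """Split a series into a training part and a hold-out validation part."""
    training = [v for y, v in zip(years, values) if y < placebo_year]
    holdout = [v for y, v in zip(years, values)
               if placebo_year <= y < actual_year]
    return training, holdout


def mean_gap(treated: List[float], synthetic: List[float]) -> float:
    """Average treated-minus-synthetic gap over a period."""
    return sum(t - s for t, s in zip(treated, synthetic)) / len(treated)


# Toy series for 1986-1999 (all pre-intervention years in the actual design).
years = list(range(1986, 2000))
treated = [60.0 + 0.5 * i for i in range(len(years))]
synthetic = [60.1 + 0.5 * i for i in range(len(years))]

_, treated_holdout = split_periods(years, treated, 1993, 2000)
_, synthetic_holdout = split_periods(years, synthetic, 1993, 2000)
print(round(mean_gap(treated_holdout, synthetic_holdout), 3))
```

A hold-out gap close to zero, as in this toy example, is what one would expect when no intervention took place in the backdated year and the SC tracks the treated unit well.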

Discussion and Conclusion

The nationwide LTCI introduction in Japan is one of the major social welfare reforms carried out in the 1990s and 2000s in aging OECD countries. In this paper, we investigate the impact of this LTCI introduction on fiscal outcomes and female labor force participation, exploiting the quasi-experimental features of LTCI introduction and using an SC method.

Our estimation results imply that LTCI introduction had a significant positive impact on the target expenditure item in Japan (i.e. in-kind benefits for the elderly), but we did not find robust effects on public health expenditure or female labor force participation rates. These results suggest that the LTCI program in Japan has not played a sufficient role to alter the family-dependent character of LTC provision and low female labor force participation in this country.

This macro-level finding may not be consistent with several recent micro-level studies that found some positive labor-supply effects of LTCI's in-kind benefits. Given that we estimate aggregate LTCI effects, whereas the previous studies estimate individual-level LTCI effects, we can provide several possible explanations that are consistent with both findings.

First of all, it is possible that we failed to detect some positive LTCI effects on female LFP rates because the power of our SC estimation may not be high enough. Even if this is the case, however, our analysis and several robustness checks still imply that an aggregate positive LTCI effect on female labor force participation, if it exists, is small enough to remain undetected by our analysis.

Second, LTCI benefits may have enabled more frail elderly people to live at home with their family. If this is the case, it is possible that some family caregivers worked more because of more in-kind benefits from LTCI (i.e. a positive effect), but some people worked less because they chose to be family caregivers for elderly people who would have been in hospitals or nursing homes if LTCI had not been introduced (i.e. a negative effect). Most individual-level studies focus on the first effect, but our aggregate-level study is meant to capture both effects.

This cancelling-out negative effect is at least somewhat plausible, given the fact that Japanese LTCI has mostly led to increases in residential care rather than institutional care. The ratio of the elderly who received residential LTCI services increased from 4.4% in 2000 to 12.4% in 2015. On the other hand, the total capacity of institutional care for the elderly (both public and private) only increased from 3.7% in 2000 to 5.5% in 2015. In addition, the number of long-term elderly inpatients (including social hospitalization) significantly decreased after the introduction of LTCI in 2000: the ratio of the elderly who were hospitalized for more than one month decreased from 2.5% in 1999 to 1.7% in 2014. These statistics imply that more elderly people who need health and social care now stay at home for a longer period using formal LTC services. This is exactly what the Japanese government intended to achieve through LTCI (Campbell and Ikegami, 2000, 2003), but this may increase the burden on some informal caregivers who would not have become caregivers if the elderly they take care of had instead been admitted to hospitals for a long period or stayed in nursing homes.

Overall, our study revealed that the Japanese LTCI introduction clearly boosted LTC spending but failed to boost labor force participation for middle-aged women. We discussed some possible mechanisms behind these results, but the mystery of no aggregate positive LTCI effect remains. Further studies are required to address this question and reconsider the potential roles of LTCI in female labor force participation.