
Received: 17 May 2018 Revised: 15 May 2019 Accepted: 28 May 2019 Published on: 5 August 2019

DOI: 10.1002/smj.3063

RESEARCH ARTICLE

Under pressure: Reputation, ratings, and inaccurate self-reporting in the nursing home industry

Amandine Ody-Brasier1 | Amanda Sharkey2

1 Yale University, School of Management, New Haven, Connecticut
2 University of Chicago, Booth School of Business, Chicago, Illinois

Correspondence
Amandine Ody-Brasier, Yale University, School of Management, 165 Whitney Avenue, New Haven, CT 06517.
Email: [email protected]

Abstract

Research Summary: This paper examines firms' strategic responses to reputational pressures in a critical healthcare domain—the U.S. nursing home industry. We investigate whether organizations improved in terms of care quality following an exogenous change in the required number of nursing hours associated with star-based ratings to which nursing homes are subject. We show that although firms at risk of losing a star tended to self-report higher staffing levels after the policy change, these reported increases were not associated with improvements in an important patient outcome—bedsores. These findings are consistent with false reporting of staffing data, or insufficient or ineffective hiring practices. Although we cannot definitively establish the existence of false reporting, supplementary analyses offer little support for the latter two possibilities.

Managerial Summary: Third-party ratings systems often stipulate that firms meet certain standards to attain a favorable evaluation. Firms must make strategic decisions about whether and how to comply with these. This paper examines firms' strategic responses to changes in the required number of nursing hours associated with star-based ratings in the nursing home industry. Our results indicate that firms at risk of losing a star responded by reporting staffing increases. However, we find no concurrent improvement in a patient outcome that previous research suggests should change as a result of increased staffing. We investigate whether the lack of improvement may be due to insufficient or ineffective hiring, and find scant evidence of either one. Although we lack direct evidence of false reporting, our findings suggest this as a strong possibility.

KEYWORDS
healthcare, nursing homes, rating systems, reputation, response to regulation

Strat Mgmt J. 2019;40:1517–1544. wileyonlinelibrary.com/journal/smj © 2019 John Wiley & Sons, Ltd.

1 | INTRODUCTION

Information asymmetries are endemic to markets for many types of products and services (Akerlof,
1970). In recent decades, a multitude of ratings, rankings, certification/labeling, and disclosure sys-
tems have arisen to mitigate such disparities between consumers and producers. These forms of
third-party evaluation involve the disclosure of novel information that sheds light on a firm's standing
relative to others along valued dimensions, such as quality or environmental performance, thereby
contributing to organizational reputation (Espeland & Sauder, 2016; Lange, Lee, & Dai, 2011; Rao,
1994; Rindova, Williamson, Petkova, & Sever, 2005).
Research across a variety of empirical settings demonstrates that information provided by third-
party evaluators can, through the mechanism of reputation, shape consumer and investor choices
(e.g., Bowman & Bastedo, 2009; Gordon, Knock, & Neely, 2009; Lyon & Shimshack, 2015; Mere-
dith, 2004; Pope, 2009; Sauder & Lancaster, 2006). Moreover, scholars have argued that ratings,
rankings, and other third-party evaluations—because of their wide reach and perceived expert
status—can drive more sudden and dramatic shifts in reputation than may have occurred in the past
(Espeland & Sauder, 2016; Rindova et al., 2005). Thus, ratings and other forms of public evaluation
represent an ever more consequential part of the environment in which firms make strategic
decisions.
In turn, policymakers and government officials have increasingly adopted ratings and other forms
of public evaluation, a strategy that Schneiberg and Bartley (2008) characterize as “regulation by
information.” When used as a policy tool, these forms of public evaluation are premised on the idea
that reputational incentives—in the form of consumer and investor responses to newly public
information—can incentivize organizations to maintain or improve their performance on such other-
wise difficult-to-observe dimensions as quality, safety, and environmental compliance.
However, there are a range of strategic actions that firms might adopt in the face of such regula-
tion by information. On the one hand, some organizations have responded to ratings and rankings by
initiating changes that have led to substantial improvements (see, e.g., Jin & Leslie, 2003; Bennear &
Olmstead, 2008; Chatterji & Toffel, 2010; Sharkey & Bromley, 2015). Chatterji and Toffel (2010)
show, for example, that firms tend to reduce their emissions of toxic pollutants once their output
becomes subject to public ratings.
On the other hand, some organizations have responded by engaging in various forms of “gaming
the system,” which Espeland and Sauder (2007, p. 29) define broadly as “manipulating the rules and
numbers in ways that are unconnected to, or even undermine, the motivation behind them.” For
instance, medical facilities have chosen to admit relatively healthier patients, which improves the
mortality ratings that form a key basis of their ratings but does nothing to improve the treatment of
sick individuals (Dranove, Kessler, McClellan, & Satterthwaite, 2003). Additional types of gaming
have been documented in the hotel industry (Mayzlin, Dover, & Chevalier, 2014), the restaurant
industry (Luca & Zervas, 2016), and in the market for higher education (Espeland & Sauder, 2007).
Finally, there are cases where public evaluations have been inconsequential because they fail to
induce a meaningful response from the organization(s) being evaluated or because they do not yield
their intended results despite firms' good-faith efforts. These outcomes can result when ratings exert
little reputational pressure on organizations—as when consumers pay no attention to the ratings or
have few alternatives and so are unable to discipline providers by voting with their feet.
Given the heterogeneity in organizational responses to ratings and other forms of public evalua-
tion, the effectiveness of these systems warrants careful evaluation; one cannot merely assume that
they generate the intended outcomes. We examine firms' strategic responses to public evaluation in
the empirical context of health care, a prototypical example of a market plagued by information
asymmetry and one that is of great substantive interest to many individuals. We focus in particular
on the U.S. nursing home industry, which currently includes about 16,000 nursing homes providing
care to some 1.6 million residents—many of whom are elderly, frail, and/or disabled (Centers for
Medicare and Medicaid Services [CMS], 2015). As the average life span has increased and more
people require long-term care, the goal of ensuring high-quality care for nursing home residents has
attracted significant public attention (Institute of Medicine, 1986; U.S. General Accounting Office
[US GAO], 1998; U.S. Senate, 1974). As summarized by Osterman (2018, pp. 20–21): “In 2014,
nursing home inspections revealed that 20 percent of facilities were deficient in ways that put resi-
dents' health in jeopardy […] Nursing home scandals are a regular feature in the press […] But nurs-
ing homes will continue to be an important component of long-term services and supports.”
To mitigate consumers' challenges in identifying providers of high-quality care and “harness mar-
ket forces to encourage poorly performing homes to improve quality or face the loss of revenue”
(US GAO, 2002, p. 3), the CMS in 2009 created the Nursing Home Compare (NHC) five-star rating sys-
tem, a publicly searchable website that presents information about the quality of care provided by
nursing homes. The site includes a rating of each home on a scale of one to five stars; this overall rat-
ing is based on sub-ratings that CMS awards in three areas: on-site inspections, quality measures
(as compared with fixed benchmarks), and self-reported data on nursing staff levels.
The Nursing Home Compare ratings have attracted considerable attention from consumers, which
is evidenced by the NHC site receiving about 1.4 million visits annually. Werner, Konetzka, and
Polsky (2016) show that the implementation of the five-star ratings resulted in an average loss of 8%
of market share for one-star facilities and an increase of 6% of market share for five-star facilities.
Furthermore, qualitative evidence from media reports suggests that the ratings are a key driver of
referrals from doctors and nurses and also affect whether insurers deem a home to be “preferred”
(Thomas, 2014). These ratings, then, have shaped consumer demand via multiple routes.
Given the influence of ratings on consumer demand, it seems natural that this form of public eval-
uation would prompt organizations to strive for a favorable rating. That said, organizations can take
a variety of strategic actions to ensure a positive evaluation. Although a favorable rating may reflect
substantial efforts to care for residents, a nursing home could seek to boost its rating by “gaming”
the ratings system. We attempt to distinguish between these possibilities by examining homes'
responses to the NHC ratings.
Our analyses focus on the NHC-imposed staffing requirements that homes must meet to achieve a
particular star level. A nursing home's staffing performance is a substantial component (one third) of
its overall evaluation; this is because the care and attention from qualified medical professionals play
a critical role in residents' overall quality of care (Lin, 2014). Osterman (2018, p. 10) puts it this
way: “One message of the emerging thinking on managing chronic conditions is that improving the
quality of life for the elderly and disabled does not require high-tech medicine but rather quality care
and attention.” We analyze how facilities responded to an exogenous shock that occurred when the
federal government raised the staffing cut-offs (i.e., the per-day and per-resident number of hours
worked by three categories of nursing staff) associated with various star levels.
Given the influential nature of the NHC ratings on consumer choice, we expect that nursing
homes will be responsive to any change in staffing requirements and that homes will be incentivized
to report the staffing levels needed to attain a favorable star rating (i.e., four or five stars). However,
we are less certain that nursing homes will actually increase the amount of staff devoted to resident
care, rather than gaming the system.
The possibility of gaming looms large in this setting because of the way in which staffing data are
collected. Staffing levels are self-reported and subject to “verification” by on-site inspectors only
about once each year. In addition, the reporting requirement stipulates that homes disclose the num-
ber of nursing hours only for the most recent 2-week period.
Hence, there are several ways to game the system in hopes of receiving a good rating. Two possi-
bilities are salient in this context. First, a home could simply report false staffing data. Second, it
could increase staff levels on a temporary basis; then its reported staffing levels are accurate with
regard to the 2-week period in question but are misleading because they do not correspond to its
staffing levels during the rest of the year.
Reports from prominent media outlets and think tanks point to incidents that create suspicions of
false reporting. For example, a 2018 New York Times article highlights the staffing discrepancies
documented by an interviewee whose wife resides in a home: “While he fed his wife, he noted two
aides for the 40 residents on the floor—half what Medicare says is average at Beechtree [nursing
home].”1 Another New York Times report noted that, “of more than 50 homes on a federal watch list
for quality, nearly two-thirds hold four- or five-star ratings for their staff levels” (Thomas, 2014). A
recent Government Accountability Office report (2015, p. 26) warns that

[a]lthough CMS data show that the average total nurse hours per resident/day increased
from 2009 through 2014, CMS does not have assurances that these data are accurate.
CMS uses data on nurse staffing hours that are self-reported by the nursing homes, but
the agency does not regularly audit these data to ensure their accuracy. CMS has con-
ducted little auditing of staffing data outside of when state survey agency surveyors are
on-site for inspections, and as a result may be less likely to identify intentional or unintentional
inaccuracies in the self-reported data. Many of the regional office and state
survey agency officials we spoke with expressed concern over the self-reported nature
of these data, noting that it may be easy to misrepresent nurse staff hours. For instance,
one state survey agency stated that nursing home residents would sometimes tell sur-
veyors that the high numbers of staff on site during the survey were not normally pre-
sent and other regional office and state survey agency officials noted that some homes
will “staff up” when expecting a standard survey in order to make their staffing levels
look better.

News reports have cited specific examples of the “inspections” and “health outcomes” components
of a nursing home's rating being low (e.g., two out of five) despite it receiving four stars on the
“staffing levels” component, an inconsistency which is difficult to reconcile (Rau, 2018). A report
from the Center for Public Integrity (2014) identifies discrepancies between the staffing levels
reported to Nursing Home Compare and the salary expenses reflected in homes' reports to Medicare,
raising the possibility of false or inaccurate reporting to NHC. These reports call into question the
accuracy of a ratings system on which millions of families rely when making choices about health
care for their loved ones.

1. https://www.nytimes.com/2018/07/07/health/nursing-homes-staffing-medicare.html

The consequential nature of NHC ratings, when combined with the strong possibility of false
reporting, indicate that a systematic examination of nursing home responses to ratings-driven pres-
sure is merited. However, it is challenging to conduct such an analysis because homes that game the
system will naturally do so in a surreptitious fashion. Although we cannot definitively establish the
incidence of false reporting, we follow other researchers (e.g., Mayzlin et al., 2014) who have
employed creative empirical strategies to identify patterns of behavior that are highly consistent with
false reporting.
More specifically, we employ a novel approach to assess how nursing homes responded to an
increase in ratings-driven pressure: leveraging an exogenous shock that occurred when the staffing
requirements associated with various star levels were increased. We start with difference-in-
differences (DiD) analyses in which we compare changes, over time, in the reported staffing levels
of (a) homes that were at risk of losing a star if they did not increase their staffing levels to meet the
new requirements and (b) homes whose current staffing levels were already high enough that their
star rating was not put in jeopardy by the new regime. We find strong evidence that at-risk homes
strategically responded to the staffing requirement shock by increasing the reported number of nurs-
ing staff hours.
We next assess whether the increase in staffing requirements was effective in the sense of generat-
ing improvements in resident outcomes (i.e., in line with the logic behind that increase). For that pur-
pose, we examine whether there was a concurrent decrease in the rate of bedsores—a basic health
outcome that requires little skill to prevent but has been shown by extensive medical research
(e.g., Horn, Buerhaus, Bergstrom, & Smout, 2005; Lin, 2014) to be strongly related to the amount of
nursing care a resident receives. We find no decrease in this measure among residents of at-risk facil-
ities after the policy change, a result that is consistent with nursing homes (a) hiring insufficient or
ineffective staff or (b) temporary or falsely reporting staffing increases intended only to bolster the
home's rating. We conduct a series of robustness checks and additional analyses to rule out alterna-
tive explanations for our results; examples include regression to the mean, “floor” effects, and a lon-
ger time frame before health benefits materialize.
To aid in distinguishing between gaming and ineffective hiring, we examine our fine-grained data
to see whether homes replaced more expensive (and skilled) labor with cheaper labor. We find that
increases in reported staffing were driven by the most expensive and skilled form of labor, namely
registered nurses, which makes it unlikely that the lack of improvement in health outcomes is due to
inadequate skill. To determine whether insufficient increases in staffing can fully account for the lack
of reduction in bedsore prevalence, we run two-stage least squares regression models in which we
test whether the increase in nursing hours among at-risk homes driven by the new staffing require-
ment is associated with a reduction in bedsores. We find no significant association between staffing
and bedsore prevalence in the second stage of this model, making it difficult to believe that insuffi-
cient staffing increases alone account for our findings.2 We also analyze a separate dataset on com-
pensation expenses, in which homes report (to the CMS) information on paid salaries and nursing
hours. We find that at-risk homes did not report higher paid salaries or nursing hours after the policy
change, compared to homes not at risk, which is difficult to reconcile with the reported increases in
staffing.

2. We also analyzed a subsample of at-risk homes that reported the largest increases in staffing (i.e., more than 18 min per resident per day at the top quartile), and we find no greater reduction in bedsores among those homes relative to other homes (please see Appendix S6).

At a minimum, our results indicate that changes in the staffing requirements, as incorporated in
the Nursing Home Compare rating system, did not result in improvements in basic health outcomes
for residents. Moreover, although we cannot definitively prove that false reporting occurred, our sup-
plementary analyses cast doubt on several other possible explanations.

2 | DATA AND METHODS

2.1 | Data
We examine these questions by analyzing data reported to the NHC quality rating system, which
include detailed information on resident health outcomes and reported staffing levels at nursing
homes in the United States. All nursing homes that receive federal Medicaid or Medicare funds must
report to the system, which means that our data cover the vast majority of nursing homes in the coun-
try. Our dataset includes quarterly information on a panel of more than 17,000 nursing homes
between 2009 and 2015. For our analyses, we examine all homes and quarters both before and after
the policy change (of April 1, 2012), which is the basis of our identification strategy.
As described previously, NHC ratings reflect three components: inspection data, quality mea-
sures, and self-reported staffing information. Scores on each of these components are weighted and
adjusted, based on the mix of residents, to determine the overall star level that a nursing home
receives. Figure 1 shows how a rating appears on the NHC website.
We now discuss each component in detail. The inspection data consist of reports on the number,
scope, and severity of any deficiency identified during an inspection. On-site inspections are con-
ducted by state-trained health inspectors, and it is rare for more than 15 months to elapse between a
home's inspections. During these visits, inspectors evaluate whether the home complies with federal
requirements on “medication management, proper skin care, assessment of resident needs, nursing
home administration, environment, kitchen/food services, and resident rights and quality of life”
(CMS, 2009, pp. 3–4). Inspections are unannounced, and state inspection teams spend several days
in the nursing home to assess its level of compliance with federal requirements. The outcome-based
quality measures assess the quality of care provided by a nursing home. These measures address a
broad range of functioning and health status; they were selected by CMS based on their validity and
reliability and include the percentage of residents with bedsores, a criterion on which our analyses
focus. As part of the Minimum Data Set (MDS)—a federally mandated process of clinical assess-
ment for all residents—facility nurses collect and enter, on a quarterly basis, resident-level assess-
ment data. Self-reported staffing levels capture the number of nursing hours per resident per day and
are broken down into three types: nursing provided by Registered Nurses (RNs), Licensed Practical Nurses (LPNs), and Certified Nursing Assistants (CNAs). That one of the three components in
each facility's overall rating is devoted to staffing levels reflects the importance of staffing for the
quality of resident care, as extensively documented in prior research (see e.g., Lin, 2014). Nursing
homes must report the number of hours worked over the 14 days immediately prior to the inspection.
These data are provided to field inspectors (on Form CMS 671) by home administrators and consti-
tute the numerator for computation of staffing levels; the denominator is the facility's number of resi-
dents, which is reported quarterly by the home (on Form CMS 672). Thus the CMS calculates
quarterly per-resident staffing numbers based on staffing data (reported during “annual” inspections)
and on resident headcounts (reported quarterly).
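
For concreteness, this computation can be sketched as follows; it is a minimal illustration using variable names of our own choosing, not the actual Form CMS 671/672 field names.

```python
# Minimal sketch of the CMS staffing measure described above.
# Variable names are illustrative, not the actual CMS form fields.

def hours_per_resident_day(total_nursing_hours_14d: float,
                           resident_count: int) -> float:
    """Average daily nursing hours per resident.

    total_nursing_hours_14d: RN + LPN + CNA hours worked over the 14 days
        preceding the inspection (numerator, from Form CMS 671).
    resident_count: residents reported for the quarter
        (denominator, from Form CMS 672).
    """
    return (total_nursing_hours_14d / 14.0) / resident_count

# Example: 5,600 nursing hours over 14 days in a 100-resident home
# works out to 4.0 hr/resident/day (the 2012 national reported average).
print(hours_per_resident_day(5600, 100))  # 4.0
```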
FIGURE 1 Screenshot of a rating on the Nursing Home Compare website

Our analyses focus on self-reported information about staffing levels. Because staffing data are
provided on-site to field inspectors, there are limits to how much a home can inflate its numbers.
Even so, certain factors lend themselves to false reporting. First, although inspections are
unannounced, they tend to occur in geographically based clusters and so directors can often antici-
pate when they will occur (Thomas, 2014). This heightens the possibility that administrators could
provide an inaccurate picture of their staffing levels by temporarily adding staff during the inspection
time frame. Second, the NHC publicizes what staffing levels are required to achieve each star level
(see CMS, 2009, p. 10); hence misreporting could result as nursing homes look to achieve, by any
means necessary, a particular star level (Ederer, Holden, & Meyer, 2018). During the period of our
analysis, there was neither a protocol for identifying false reporting—with the exception of inspectors
noting “live” staffing violations during their visit—nor any penalty for misreporting. In this setting,
it is entirely possible that some nursing homes report artificially inflated staffing levels.

2.2 | Analytical approach


It is difficult to assess the accuracy of self-reported data. The most straightforward approach would
be to compare self-reported staffing levels with an objective or verified source of the same informa-
tion. However, there is no widely available measure of such levels other than that based on reports to
the NHC. This is not surprising when one considers that, if a true measure of staffing levels were
widely available, then the NHC's use of self-reported data would be unnecessary.
To address this challenge, we follow in the tradition of prior research that has used indirect
methods to identify possible gaming or rule breaking (Della Vigna & La Ferrara, 2010; Duggan &
Levitt, 2002; Jacob & Levitt, 2003; Luca & Zervas, 2016; Mayzlin et al., 2014). We reason that if
self-reported staffing levels were accurate, then one would observe that changes in reported staffing
levels are followed by changes in the resident health outcomes that are closely related to staffing
levels. There are two reasons why we focus on pressure sores (i.e., bedsores or pressure ulcers), an
injury to the skin and underlying tissue that results from prolonged pressure or friction on the skin.
The first reason is that bedsores are an outcome typically associated with poor or nonexistent nursing
care (Casey, 2013; Lyder & Ayello, 2008).3 For example, using an instrumental variables approach,
Lin (2014, p. 19) finds that a one SD increase in the number of hours of registered nurse staffing per
resident per day (i.e., an increase of approximately 18.96 min in their data) is associated with a 17%
reduction in the fraction of patients with bedsores.4 Bedsores thus represent an outcome that is under
the direct control of nursing staff. A large body of work characterizes bedsores as one of the most
nurse-sensitive quality indicators (Horn et al., 2005; Bostick, Rantz, Flesner, & Riggs, 2006; see also
Grabowski & Hirth, 2003; Grabowski & Castle, 2004). For instance, Hillmer et al. (2005, p. 158)
argue that “pressure ulcers are an excellent marker of quality of care because very few residents
receiving proper care should develop this condition.” Similarly, CMS justifies the choice of
quality measures such as the proportion of residents with bedsores by emphasizing “their validity
and reliability, [and] the extent to which the measure is under the facility's control” (CMS, 2012,
p. 10; emphasis added). It follows that an increase in self-reported nursing hours should, over time,
translate into a lower proportion of residents with bedsores—provided the self-reported data are accu-
rate. (We later present analyses based on data from our empirical setting to provide further evidence
of the link between staffing and bedsores, when staffing levels are objectively captured by
inspectors.)
The second (and related) reason for our focus on bedsores is that their prevention requires
effort more so than skills from the nursing staff. Few conditions are “as manageable as a bedsore”
(Osterman, 2018: 27). Indeed, bedsores are easily preventable; they can be avoided and alleviated
by frequent repositioning of the immobile resident (Kane, Ouslander, & Abrass, 1989). Hence
even lower-skilled nursing staff, such as CNAs, can prevent bedsores (Osterman, 2018). There-
fore, irrespective of whether homes replace expensive labor with cheaper labor (e.g., decrease the
number of RN and LPN hours while increasing CNA hours) or hire less skilled practitioners
within the RN and LPN categories, the prevalence of bedsores should decline with greater nursing
staff time.
We acknowledge that the data on pressure ulcers are also self-reported. Yet on-site inspectors are
specifically mandated to assess “proper skincare” during their visit, and bedsores are easy to observe
(CMS, 2009). Furthermore, there are no public, bedsore-related cut-offs that nursing homes can
target to attain a certain star level. Because this criterion is therefore difficult to “game,” we have
confidence in the accuracy of these data.5

3. Florence Nightingale notably wrote: “If he has a bedsore, it's generally not the fault of the disease, but of the nursing” (1859, p. 8).

4. Lin's (2014) analysis relies on comprehensive data from nursing homes in eight states whose legislatures mandated changes to the minimum staffing requirements at nursing homes in 2000 or 2001. She includes in her study all Medicare-certified nursing homes (which amounts to coverage of 96% of all nursing homes; Lin, 2014, p. 16) in these eight states, but her final sample “excludes skilled nursing facilities as they typically provide short-term post-acute care and thus require a much higher level of staffing” (Lin, 2014, p. 16). Importantly, these data were collected prior to the implementation of the Nursing Home Compare five-star rating system.

As already mentioned, we leverage the NHC system's changes (on April 1, 2012) in the number
of nursing hours required to obtain specific star ratings. Those changes, which were the first update
to these cut-off points since the system's introduction in 2009, increased the number of hours
required to obtain particular star ratings (Supporting Information, Appendix S1 gives details on the
old and new cut-off points). We view this update as an exogenous shift in pressure that nursing
homes faced to report higher staffing levels.
In our analyses, we are interested in the response of nursing homes that were at risk of losing a
star because of the new cut-offs. We say that a facility is “at risk of losing a star” if reporting the
same level of RN and/or total (i.e., RN + LPN + CNA) staffing as it reported in the quarter preced-
ing the NHC policy change would result in losing a star under the new system—that is, because both
RN hours and total nursing hours count separately toward a home's star rating (see Appendix S1). By
this definition, about half of the nursing homes in our sample were at risk.
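
In pseudocode terms, the at-risk flag might be constructed as below; this is a sketch only, and the numeric thresholds are placeholders standing in for the actual post-change cut-offs listed in Appendix S1.

```python
# Illustrative sketch of the "at risk of losing a star" flag.
# The thresholds below are PLACEHOLDERS; the real post-change cut-offs
# appear in Appendix S1, which we do not reproduce here.
NEW_CUTOFFS = {
    # star level: (min RN hours, min total RN + LPN + CNA hours),
    # both per resident per day
    5: (0.75, 4.40),
    4: (0.55, 4.00),
}

def at_risk_of_losing_star(star: int, rn_hours: float,
                           total_hours: float) -> bool:
    """True if reporting the same RN and/or total staffing as in the
    quarter before the policy change would fall below the new cut-offs
    for the home's current star level; RN hours and total hours count
    separately toward the staffing rating."""
    min_rn, min_total = NEW_CUTOFFS[star]
    return rn_hours < min_rn or total_hours < min_total
```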

2.3 | Empirical analyses


We use fixed-effects, difference-in-differences analyses to assess the tendency of at-risk facilities to
increase their reported nursing hours. A DiD analysis (Ashenfelter & Card, 1985) compares the
change over time in the behaviors of actors exposed to a treatment (here, the risk of losing a star
because of the exogenous shift in staffing cut-offs) to the behavioral changes among actors not so
exposed, thereby “netting out” any changes that might have occurred in the absence of the treatment.
We also undertake a DiD analysis of changes in the proportion of residents with bedsores before and
after implementation of the new staffing cut-offs. Each facility was observed in each quarter during
which it operated over the period 2009–2015. Our regressions incorporate home fixed effects as well
as dummies for each quarter; SEs are clustered by nursing home.
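
Written out, the specification we estimate takes the standard two-way fixed-effects DiD form (notation ours):

$$y_{it} = \alpha_i + \gamma_t + \beta\,(\mathrm{AtRisk}_i \times \mathrm{Post}_t) + \mathbf{X}_{it}'\boldsymbol{\delta} + \varepsilon_{it},$$

where $y_{it}$ is the outcome for home $i$ in quarter $t$ (reported nursing hours, or the proportion of residents with bedsores), $\alpha_i$ are home fixed effects, $\gamma_t$ are quarter dummies, $\mathbf{X}_{it}$ collects the covariates described below, and $\beta$ is the DiD estimate of interest.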

2.4 | Variables
Our first dependent variable is the Total reported nursing hours per resident per day. This measure is
the sum of the average daily number of RN, LPN, and CNA hours reported for the 14 days prior to
an inspection divided by the number of residents reported in each quarter (N = 411,082 across all
quarters in the study period).6 Our second dependent variable is the Proportion of residents with bed-
sores in a given home. This measure captures the percentage of residents with pressure ulcers. These
data are reported quarterly and correspond to the aggregate of the latest information available on each
resident's assessment form (MDS 3.0). Assessments are due no less frequently than every 92 days
for each resident, although some data are missing for this variable (N = 279,388). These missing data
are due mainly to: (a) CMS failing to report them in the last quarter of 2010 and the first quarter of
2011⁷; and (b) some facilities failing to provide data—either because they did not report the appro-
priate information or because there were too few residents for CMS to compute a proportion.

5. Any manipulation of the bedsores data would create a “conservative” bias, since nursing homes would most likely underreport the proportion of residents with bedsores. That, in turn, would make it more difficult for us to observe “no improvement” in resident health (i.e., no reduction in bedsores) after a reported increase in staffing.

6. Given that the dependent variable in our analysis is a ratio, it is theoretically possible that nursing homes could increase the number of nursing hours per resident per day either by increasing reported staffing levels or decreasing reported resident totals. We view the latter as very unlikely for several reasons. First, homes are reimbursed by Medicare in part based on the number of residents. Thus, underreporting residents could be very costly. Second, given that resident totals are collected quarterly and staffing data are collected annually, homes that underreported resident totals would need to keep doing so in order to maintain their desired staffing rating over time within the same year. Finally, our data suggest underreporting of residents is unlikely: occupancy rates did not decrease post- vs. pre-treatment among at-risk homes. If anything, occupancy rates decreased slightly among homes that were not at risk of losing a star.

7. We ran subsample analyses on the data collected after the first quarter of 2011 (see Appendix S2); our results do not change.

Following the classic DiD approach, we construct three independent variables: a time-invariant
indicator set equal to 1 if a nursing home is At risk of losing a star (and set to 0 otherwise); a dummy
for each quarter after the policy change in staffing cut-points (Post change), and the interaction
between these two terms. Because we include home and time fixed effects, we focus on interpreting
the At risk × Post change interaction term.
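
The construction and estimation can be sketched in a few lines; this is a minimal illustration assuming a pandas DataFrame `df` with one row per home-quarter and column names of our own choosing.

```python
# Sketch of the DiD variable construction and fixed-effects estimation.
# Assumes columns home_id, quarter, nursing_hours, at_risk, post_change,
# plus the controls named below (all names are ours, not CMS variables).
import pandas as pd
from linearmodels.panel import PanelOLS

df["at_risk_post"] = df["at_risk"] * df["post_change"]
panel = df.set_index(["home_id", "quarter"])

# Home (entity) and quarter (time) fixed effects absorb the main effects
# of at_risk and post_change, so only the interaction is estimated.
res = PanelOLS.from_formula(
    "nursing_hours ~ at_risk_post + chain + for_profit + pct_medicaid"
    " + n_beds + home_density + EntityEffects + TimeEffects",
    data=panel,
).fit(cov_type="clustered", cluster_entity=True)  # SEs clustered by home

print(res.params["at_risk_post"])  # DiD estimate (cf. Table 3, model 2)
```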
Our analyses accommodate important differences between nursing homes by including the fol-
lowing covariates (see e.g., Autor, 2003). Chain is set to 1 if the home is part of a chain or to 0 if it
is independent; this control is important because homes that become part of a chain might adopt pro-
cesses that prevent or minimize errors in reporting (Mayzlin et al., 2014; Pierce & Toffel, 2013).
For-profit status is captured by an indicator set to 1 if the home is a for-profit entity or to 0 otherwise.
Changes in for-profit status might affect a home's likelihood of misreporting its staffing levels,
although the direction of any such effect is unclear. Proportion of residents on Medicaid reflects the
socioeconomic status of residents. We control for Size by including the number of beds that each
home operates. To account for possible changes in the competitive environment, we divide the num-
ber of homes within the focal home's county by the size of that county's population aged over
65 (Home density) for each year; county-level population data are from the 2010 U.S. Census.
Table 1 reports descriptive statistics and correlations for our sample. Missing data for some quarters and homes results in our N being smaller in the models that include all controls.

TABLE 1 Descriptive statistics and correlations

                                                 Mean     SD      1      2      3      4      5      6      7      8
1. Total reported nursing hours (/resident/day)  4.04    1.04
2. % residents with bedsores                     7.31    5.34  -0.06
3. At risk of losing a star                      0.51    0.50  -0.04   0.00
4. Post change                                   0.47    0.50   0.07  -0.26  -0.01
5. Chain                                         0.55    0.50  -0.15  -0.02   0.03   0.00
6. For profit                                    0.69    0.46  -0.22   0.10   0.02   0.00   0.21
7. Percent of residents on Medicaid             59.90   23.22  -0.20   0.09  -0.01  -0.02   0.01   0.21
8. Number of beds                              106.87   62.12  -0.06   0.10   0.02  -0.01  -0.10  -0.02   0.12
9. Home density (county-level)                   0.08    0.24  -0.02  -0.05   0.00   0.00   0.00  -0.06  -0.04  -0.09

3 | VALIDITY CHECKS

Before presenting our main analyses, we assess the validity of a key assumption for our empirical
strategy. Namely, we rely on an established body of research (e.g., Horn et al., 2005; Lin, 2014)
showing that staffing hours are negatively correlated with the prevalence of bedsores. Despite that
consistent finding in other work on this topic, the correlation matrix (Table 1) shows little cross-
sectional relationship between these two variables for our dataset.8 One interpretation of this result is
that there is, in fact, no relationship between staffing and bedsores in our sample. It is not clear why
that would be the case, given the large body of research just alluded to and the size of our sample. A
more plausible interpretation is that reported staffing hours do not—for whatever reason—reflect the
amount of nursing care actually provided. In that event, we should not expect to see a correlation
between reported staffing hours and bedsore levels.

8. We also run regressions that include home fixed effects and model the proportion of residents with bedsores as a function of the (lagged) reported staffing while incorporating various control variables (including analyses on the pre-treatment subsample). We find no significant relationship even when these controls are included (see Appendix S3).

To assess the validity of our assumption that increases in staffing should be associated with a
reduction in bedsores when staffing is captured accurately, we exploit a specificity of our empirical
setting. During their visits, inspectors who observe that a home does not “have enough nurses to care
for every resident in a way that maximizes the residents' well-being” and/or does not “use a regis-
tered nurse at least eight hours a day, seven days a week” must formally report these violations. They
must also report homes that do not “give residents proper treatment to prevent new bed (pressure)
sores or heal existing bedsores.” If we are correct in assuming that higher levels of staffing should be
associated with a reduction in bedsores, then nursing homes found guilty of these staffing violations
by inspectors should also be more likely to be found guilty of not properly treating or preventing
bedsores. We formally investigate whether that is the case using a separate data set consisting of all
reports written for all inspections conducted by CMS between 2009 and 2014 (N = 109,036).9

9. In other words, these analyses rely on a distinct and smaller sample than the full quarterly data set used for our main analyses in Table 3.

First, we note a positive and significant bivariate correlation in these data between staffing and
bedsore violations (corr. = 0.1039, p < .000). Second, results of a home–fixed-effects linear proba-
bility model confirm this relationship: model 1 in Table 2 shows that, in a facility where inspectors
flag a staffing violation during their visit, the probability of a concurrent bedsore violation signifi-
cantly increases when all other independent variables are held constant (β = 0.201, p < .000). We
obtain similar results if we use a conditional logit specification (second model in Table 2) despite los-
ing observations (β = 1.305, p < .000). Third, the staffing violations flagged by inspectors are signif-
icant predictors of the proportion of residents with bedsores reported by nursing homes in that
quarter (β = 0.425, p < 0.001; third model in Table 2). Overall, these results confirm prior findings
on the negative relationship between staffing levels and the prevalence of bedsores—that is, when
both factors are accurately measured.

TABLE 2 FE regressions predicting bedsore violations (given by inspectors) and bedsore prevalence (reported by homes)

                                            LPM bedsore violation    Logit bedsore violation   OLS % residents with bedsores
Staffing violation (1/0, given by inspectors)  0.201 (0.011) [0.000]    1.305 (0.060) [0.000]      0.425 (0.129) [0.001]
Chain (1/0)                                    0.002 (0.005) [0.601]    0.025 (0.045) [0.582]     -0.013 (0.093) [0.886]
For profit (1/0)                              -0.000 (0.009) [0.977]   -0.012 (0.094) [0.897]     -0.127 (0.173) [0.462]
% residents on Medicaid                       -0.000 (0.000) [0.722]   -0.000 (0.001) [0.791]     -0.012 (0.003) [0.000]
Number of beds                                -0.000 (0.000) [0.600]   -0.001 (0.002) [0.646]      0.003 (0.003) [0.434]
Home density (county-level)                    0.017 (0.032) [0.595]    0.303 (0.519) [0.560]     -2.556 (1.004) [0.011]
Constant                                       0.156 (0.022) [0.000]                              12.232 (0.502) [0.000]
Home FE                                        Yes                      Yes                        Yes
Quarter FE                                     Yes                      Yes                        Yes
R2                                             0.01                                                0.23
Prob > Chi2                                                             1261.26 (31) [0.000]
N                                              98,997                   52,012                     68,023

SE (clustered by home in all models) in parentheses; p-values in brackets.

4 | MAIN RESULTS

We start by graphing the data's “pre-trends.” Figure 2 plots the average reported staffing hours for
treatment and control groups over the observation period. For both groups, we show the local poly-
nomial smooth plot (with 95% confidence intervals) during the pre- and posttreatment periods. The
average control home (i.e., not at risk of losing a star) reported higher staffing levels—as expected,
since the cut-off changes would have put at risk only those facilities (at a given star rating) with
lower staffing levels. This is not a concern because our statistical analysis focuses on changes from
these potentially different starting points. The figure shows that the treatment and control groups
exhibit similar trends in their reported staffing before the policy change. We can also see a small but
significant increase in reported staffing for the treated (i.e., at-risk) group, but not for the control
group, after the policy change.

FIGURE 2 Total reported nursing hours

Table 3 presents the DiD results for reported staffing levels. In line with Figure 2, a “naïve” speci-
fication without controls suggests that, between the pre- and posttreatment period, treated homes

increased their reported staffing levels to a greater extent than did control homes: model 1 (i.e., the
table's first data column) shows that a home at risk of losing a star (treated) increased reported
staffing by more than 1.38 min (=0.023*60) per resident per day in comparison with the other (con-
trol) homes. Incorporating all our control variables, in model 2, reveals that a home at risk of losing a
star reported an increase of about 1.21 min per resident per day. For comparison, the 2012 national
reported average was 4 hr/resident/day for all staffing (i.e., RNs + LPNs + CNAs). So even though
the effect is therefore small in magnitude, recall that (a) homes at risk of losing a star need only
ensure they will meet the new cut-off in order to maintain their rating and (b) doing so could, in
many cases, require only a few more minutes of reported staffing. Also, note that this effect repre-
sents the average increase in reported staffing by all at-risk homes and across all quarters in the post-
treatment period.

TABLE 3 DiD predicting reported nursing hours per resident per day and proportion of residents with bedsores

                              (1) FE OLS reported     (2) FE OLS reported     (3) FE OLS % residents  (4) FE OLS % residents
                              nursing hours           nursing hours           with bedsores           with bedsores
At risk X post change          0.023 (0.009) [0.009]   0.020 (0.009) [0.023]  -0.069 (0.059) [0.240]  -0.095 (0.062) [0.124]
Chain (1/0)                                           -0.021 (0.008) [0.014]                          -0.002 (0.060) [0.977]
For profit (1/0)                                      -0.077 (0.015) [0.000]                          -0.045 (0.103) [0.661]
% residents on Medicaid                               -0.002 (0.000) [0.000]                          -0.011 (0.002) [0.000]
Number of beds                                        -0.002 (0.000) [0.000]                           0.001 (0.002) [0.626]
Home density (county-level)                           -0.310 (0.160) [0.052]                          -1.908 (0.484) [0.000]
Constant                       4.280 (0.055) [0.000]                          11.320 (0.052) [0.000]  11.971 (0.314) [0.000]
Home FE                        Yes                     Yes                     Yes                     Yes
Quarter FE                     Yes                     Yes                     Yes                     Yes
R2                             0.02                    0.02                    0.21                    0.21
N                              411,082                 371,811                 279,388                 254,696

SE clustered by home in parentheses; p-values in brackets.
Of critical importance is that Table 3 shows no concurrent improvement in resident outcomes as
measured by a reduced prevalence of bedsores (models 3 and 4). The At risk × Post change interac-
tion is not statistically significant. These findings run counter to expectations if the rise in reported
staffing hours reflected increased nursing care. This combination of results suggests that the average
response to the pressure to report higher levels of nursing staff might have been to inflate the
reported numbers, not to make substantive changes. We later present analyses aiming at investigating
this possibility, as well as alternative explanations.

5 | MAGNITUDE OF THE EFFECT

At this point, we would like to understand the relatively small economic magnitude of the effect we
uncover: an increase of 1.21 min/resident/day in the homes at risk of losing a star. We first want to
investigate why, given prior work showing that firms tend to respond to incentives put in place by
ratings systems, at-risk firms only increased their staffing levels by a relatively small amount com-
pared to homes that did not face the prospect of losing a star under the new requirements. We then
turn to examine whether this relatively small increase in nursing hours may account for the lack of
improvement in bedsores. We address these questions in turn.
First, it is important to keep in mind that the estimated 1.21 min/resident/day increase represents
an average across all at-risk homes. One possible explanation for the small size of this average effect
is that not all homes at risk of losing a star increase their reported staffing. In fact, if there is some
misreporting, then it is most likely to be by homes that are closest to the new cut-off and hence need
only report a small staffing increase—especially since such claims will seem relatively credible. We
offer two illustrative examples. First, a five-star home that reported 4.40 hr (including 0.712 RN
hours) before the NHC changes would need to report a staffing increase of only 0.018 hr, or just
1.08 additional min/resident/day, in order to maintain its rating after those changes. In contrast, a
five-star home that reported 4.08 hr (including 0.550 RN hours) before the cut-off change would
need to report an additional 0.338 hr (including at least 0.16 RN hours) to maintain its rating after
the change; that increase amounts to 20.28 additional min/resident/day (including at least 9.6 RN
min). Since staffing data are reported to inspectors on site, homes are constrained to report numbers
that do not deviate wildly from what those inspectors might observe; therefore, we think the first
(resp. second) example home is more (resp. less) likely to misreport its staffing levels.
To investigate this possibility, we computed the “distance” (in terms of number of staffing hours)
between (a) each home's reported staffing levels in the previous quarter and (b) the closest new
(updated) staffing cut-off. We then use this measure to introduce an additional differencing factor
into the estimator. This approach allows us to examine how differences in the reported staffing levels
of at-risk homes located closer to versus farther from the (updated) cut-offs changed from before to
after implementation of the new policy—and then to compare that change with how differences
between the closest versus farthest homes changed among homes that were not at risk of losing a star
during the same time frame. The results, reported in Table 4, confirm our intuition: there is a negative
and significant “triple interaction” suggesting that, the farther a treated home is from meeting the
new cut-off, the smaller the extent of the reported staffing change (β = −0.645, p < .000).
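
Written out, this triple-difference (DiDiD) specification takes the following form (notation ours, with $D_{i,t-1}$ the lagged distance to the nearest new cut-off):

$$y_{it} = \alpha_i + \gamma_t + \beta_1(\mathrm{AtRisk}_i \times \mathrm{Post}_t) + \beta_2 D_{i,t-1} + \beta_3(\mathrm{AtRisk}_i \times D_{i,t-1}) + \beta_4(\mathrm{Post}_t \times D_{i,t-1}) + \beta_5(\mathrm{AtRisk}_i \times \mathrm{Post}_t \times D_{i,t-1}) + \mathbf{X}_{it}'\boldsymbol{\delta} + \varepsilon_{it},$$

where $\beta_5$ is the triple interaction reported in Table 4 (−0.645 in the specification without controls).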
TABLE 4 DiDiD predicting reported nursing hours

                                                     FE OLS                   FE OLS
At risk X post change                                 0.038 (0.017) [0.025]    0.048 (0.018) [0.006]
Distance to cut-off (q-1)                            -1.443 (0.096) [0.000]   -1.548 (0.094) [0.000]
At risk X distance to cut-off (q-1)                   0.624 (0.155) [0.000]    0.741 (0.158) [0.000]
Post change X distance to cut-off (q-1)               0.321 (0.077) [0.000]    0.366 (0.075) [0.000]
At risk X post change X distance to cut-off (q-1)    -0.645 (0.161) [0.000]   -0.674 (0.165) [0.000]
Chain (1/0)                                                                   -0.017 (0.013) [0.187]
For profit (1/0)                                                              -0.086 (0.028) [0.002]
% residents on Medicaid                                                       -0.001 (0.001) [0.031]
Number of beds                                                                -0.001 (0.001) [0.105]
Home density (county-level)                                                   -0.445 (0.099) [0.000]
Constant                                              3.687 (0.023) [0.000]    3.975 (0.086) [0.000]
Home FE                                               Yes                      Yes
Quarter FE                                            Yes                      Yes
R2                                                    0.02                     0.03
N                                                     411,082                  371,811

SE clustered by home in parentheses; p-values in brackets.

Note that these results also help us alleviate concerns about regression to the mean. If, before the
policy change, the treated homes had for some reason experienced an inferior draw (i.e., a decline in
staffing levels) relative to the control homes, then they might naturally revert to the underlying aver-
age after the policy change. In that case, our empirical approach—which relies on differencing
changes in staffing levels between the two groups—could be compromised. However, the results
reported in Table 4 suggest that it is not the homes that required the largest staffing increases (owing, perhaps, to an inferior draw) that report posttreatment staffing increases. This result is reassuring in that it mitigates the concern that our basic DiD design does not allow us to disentangle the treatment effect from mean reversion (cf. Chay, McEwan, & Urquiola, 2005).
A second reason for the relatively small size of the main effect is that it reflects the change over
all periods after the policy change. We expect that the pressure organizations face to report an
increase in staffing will cause a discrete “jolt” only shortly after the policy change. Our “leads and
lags” model (to be described later), allows us to examine the policy change's effect on at-risk homes
at specific moments of time during the posttreatment period. Those analyses show larger effects in
two periods shortly after the policy change. In sum: our results indicate that, following the change in
cut-offs, nursing homes that experienced exogenous pressure to increase their staffing levels simply
to maintain their current rating (i.e., a home at risk of losing a star) increased their reported staffing
hours to a greater extent than control homes did.
We now turn to the second question posed above: whether the relatively small average increase in
staffing levels can account for the fact that we find no significant improvement in the rate of bedsores
among at-risk homes relative to those not at risk, following the policy change. In other words, it
could be that the relatively small average staffing increase we observe among the homes at risk of
losing a star is real yet not sufficient to reduce bedsore prevalence. Recall for comparison that, as we
noted earlier, the research drawing on the sample that we view as most similar to our own (i.e., Lin,
2014) used an instrumental variables approach and found that a one-SD increase in nursing hours per
resident-day (i.e., 18.96 min per resident-day in her sample) was associated with a decrease of 17%
in the proportion of patients with bedsores.
To investigate more directly whether the reported increases in nursing hours driven by the change
in staffing requirements was associated with a reduction in bedsore prevalence, we ran two-stage least
squares (2SLS) regression models.10 In the first-stage model, we regressed nursing hours per resident-
day on the At risk × Post change interaction as well as other control variables. The second-stage model regresses the proportion of patients with bedsores on the predicted values of nursing hours from
the first-stage regression, as well as other controls, including nursing home and year fixed effects.
Because we have reason to suspect that the exclusion restriction for instrumental variables regression
is likely to be violated in this case, we do not claim that this model has a strictly causal interpretation.
Rather, we view this as a way to examine whether the amount of nursing hours explained by being at
risk of losing a star due to the policy change is associated with bedsore prevalence.
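
A minimal sketch of this two-stage setup, using the IV2SLS estimator from the linearmodels package; column names are ours, and we assume the home fixed effects have been absorbed beforehand (e.g., by within-demeaning each variable by home), so only quarter dummies appear explicitly.

```python
# Sketch of the 2SLS described above: nursing hours are instrumented by
# the at_risk x post_change interaction. Names are ours; `df` is the
# quarterly panel with home fixed effects already absorbed.
from linearmodels.iv import IV2SLS

res = IV2SLS.from_formula(
    "pct_bedsores ~ 1 + chain + for_profit + pct_medicaid + n_beds"
    " + home_density + C(quarter) + [nursing_hours ~ at_risk_post]",
    data=df,
).fit(cov_type="clustered", clusters=df["home_id"])

print(res.params["nursing_hours"])  # second-stage coefficient (Table 5)
print(res.first_stage)              # first-stage diagnostics (F-statistic)
```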
Results of this analysis are presented in Table 5, which shows that the proportion of nursing hours
explained by being at risk is not associated with a reduction in bedsores. As model 1 shows, the coef-
ficient for nursing hours is positive—that is, in the opposite direction to what one would expect—and
it is not statistically significant under this specification (p = .248). These results thus cast doubt on
the notion that the increase in reported staffing levels that was driven by the new nursing hour
requirements was associated with an improvement in the quality of care that patients received, at least
along this most basic of dimensions. The results are also difficult to explain in light of prior research relating staffing to bedsores (e.g., Lin, 2014), and they are consistent with misreported nursing hours.

TABLE 5 FE 2SLS regressions for the proportion of residents with bedsores

                               Second-stage model
Total nursing hours            2.730 (2.363) [0.248]
Chain (1/0)                    0.066 (0.084) [0.427]
For profit (1/0)               0.187 (0.214) [0.383]
% residents on Medicaid       -0.006 (0.004) [0.084]
Number of beds                 0.004 (0.003) [0.225]
Home density (county-level)   -0.533 (1.303) [0.682]
Quarter FE                     Yes
Home FE                        Yes
R2                             0.13
N                              251,435

First-stage F-statistic is 23.314. SE clustered by home in parentheses; p-values in brackets.

10. We also split at-risk nursing homes into subsamples based on the size of their reported staffing increase following the policy change, focusing on those homes that increased their staffing the most (top quartile). We then ran DiD models on these at-risk homes. Model 1 in Appendix S6 gives the results of these analyses: even in these homes that reported the largest staffing increases, we observe no significant relationship between reported staffing changes and bedsore prevalence in the subsequent quarter.

6 | ROBUSTNESS CHECKS

6.1 | Underestimated SEs


As a robustness check, we rerun our analyses with home-level, block bootstrapped SEs and using
1,000 replications. Block bootstrapping—whereby the data are randomly sampled across blocks—is
frequently used in DiD analyses that involve repeated observations (Gubler, Larkin, & Pierce, 2016).
Our results are unchanged under this approach (see Appendix S4 for details).
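
A minimal sketch of the procedure, where `estimate_did` stands in for the Table 3 regression and returns the At risk × Post change coefficient (names are ours):

```python
# Home-level block bootstrap: resample entire homes (all of their
# quarters) with replacement, re-estimate the DiD on each pseudo-sample,
# and take the standard deviation of the coefficients as the SE.
import numpy as np
import pandas as pd

def block_bootstrap_se(df, estimate_did, n_reps=1000, seed=0):
    rng = np.random.default_rng(seed)
    homes = df["home_id"].unique()
    blocks = {h: g for h, g in df.groupby("home_id")}
    coefs = []
    for _ in range(n_reps):
        draw = rng.choice(homes, size=len(homes), replace=True)
        boot = pd.concat([blocks[h] for h in draw], ignore_index=True)
        coefs.append(estimate_did(boot))
    return np.std(coefs, ddof=1)
```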

6.2 | False positives


Because false positives are frequent in DiD analyses (Bertrand, Duflo, & Mullainathan, 2004), we
implement placebo tests to demonstrate that the estimated treatment effects on reported staffing levels
are not artifacts of the data structure. This procedure consists of randomly assigning placebo treat-
ment dates in our panel data; we then repeat our primary analyses with 1,000 replications. Figure 3
plots a random sample of 100 of these regression coefficients along with their 95% confidence inter-
vals. The treatment effect estimated from our actual data is shown in black and is greater than nearly
all of the placebo trials.

FIGURE 3 Placebo tests (random sample of 100 regression coefficients)
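
The placebo procedure can be sketched as follows, with `estimate_did` again standing in for the main DiD regression (names are ours):

```python
# Placebo test: draw a fake policy-change quarter at random, rebuild the
# post dummy and the interaction, and re-run the DiD. Repeating this
# traces out the distribution of "effects" under no true treatment.
import numpy as np

def placebo_coefficients(df, estimate_did, n_reps=1000, seed=0):
    rng = np.random.default_rng(seed)
    quarters = np.sort(df["quarter"].unique())
    coefs = []
    for _ in range(n_reps):
        fake_change = rng.choice(quarters[1:-1])  # interior placebo date
        placebo = df.copy()
        placebo["post_change"] = (placebo["quarter"] >= fake_change).astype(int)
        placebo["at_risk_post"] = placebo["at_risk"] * placebo["post_change"]
        coefs.append(estimate_did(placebo))
    return coefs
```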

6.3 | Endogenous treatment date


Although our evidence suggests that the treatment date is not correlated with pre-treatment staffing
levels, we cannot entirely dismiss endogeneity concerns. Hence, we implement a “lags and leads”
model (Angrist & Pischke, 2008) in which we estimate a separate treatment coefficient for each
period before and after the policy change. The results, which are plotted in Figure 4, are consistent
with the policy change causing increases in reported staffing. We observe significant increases in
reported staffing both two and three periods after the policy change (β = 0.019, p < .040 and
β = 0.037, p < .007, respectively). We believe that the delay in reaction by nursing homes reflects
that, although the new staffing cut-offs were implemented by CMS effective in April 2012, they were
not officially published until the July 2012 Technical User's Guide. Thus, it appears that most of the
at-risk homes reacted immediately after the new cut-offs were published, not after they were offi-
cially implemented. The increases correspond to (respectively) about 1.16 and 2.21 min per resident
per day, or about 102 and 194 min each day in a nursing home with the average number of residents.
It is reassuring to see no statistically significant difference between treated and control homes prior
to the policy change.

FIGURE 4 Treatment effect estimates for reported nursing staff
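
One conventional way to write such a leads-and-lags (event-study) model is the following (notation ours; the paper does not spell out its exact parameterization):

$$y_{it} = \alpha_i + \gamma_t + \sum_{k \neq -1} \beta_k\,\bigl(\mathrm{AtRisk}_i \times \mathbf{1}[t - t^{*} = k]\bigr) + \mathbf{X}_{it}'\boldsymbol{\delta} + \varepsilon_{it},$$

where $t^{*}$ is the policy-change quarter, the quarter just before the change ($k = -1$) is the omitted baseline, and each $\beta_k$ is the period-specific estimate plotted in Figure 4; statistically insignificant pre-change estimates support the parallel-trends assumption.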

7 | ADDITIONAL ANALYSES

The foregoing analyses indicate that nursing homes reported higher staffing numbers in response to
increased pressure to do so. These effects are robust to our tests of whether they were attributable
instead to underestimated SEs, false positives, or endogeneity in the treatment date. However, we
find no evidence that these reported increases in staffing resulted in any significant improvement in
resident outcomes (i.e., fewer bedsores).
This pattern of results is consistent with two possibilities. First, perhaps homes increased staffing
yet could not, for several reasons, convert the staffing increases into reductions in bedsores.11 Sec-
ond, we would not observe any effect of staffing increases on bedsores if those increases were fabri-
cated. We now conduct additional analyses aimed at distinguishing between these possibilities.

11 We also considered the possibility that a home with relatively low levels of bedsores in the period prior to the policy change might be unable to improve, despite hiring nursing staff, because of floor effects. We therefore ran subsample analyses on homes that reported a high proportion of residents with bedsores, that is, homes that were, immediately prior to the policy change, in the top quartile for bedsore prevalence. For those nursing homes, we first ran DiD models examining changes in staffing levels after the policy change among homes at risk of losing a star, finding that at-risk homes did report significantly higher staffing levels than control homes after the policy change. We then examined the relationship between staffing and bedsores after the policy change. Even for this subsample, in which the room for improvement should be greatest, there is no evidence of a statistically significant effect of reported staffing changes on the prevalence of bedsores in the subsequent quarter.

7.1 | Replacing expensive labor with cheaper labor


Previous research and our own analyses provide strong evidence of a direct relationship between
staffing and the quality of health care—especially for uncomplicated conditions such as bedsores.
However, it could be that facilities hired nursing staff who lacked the skills to reduce bedsores. One possible mechanism is that nursing homes at risk of losing a star replaced expensive nursing staff (RNs and perhaps LPNs) with less qualified personnel (CNAs). A nursing home could thereby increase its total staffing hours without incurring higher costs (because CNAs earn much less than do
RNs or LPNs).12 Although even lower-skilled nursing staff should be able to prevent bedsores by
simply repositioning the resident (Osterman, 2018), the scenario described here might explain why
nursing homes do not improve health outcomes despite staffing increases.
In additional analyses (see Table 6), we examined whether the reported increase in nursing staff was driven by less- or more-qualified staff, that is, reported CNA hours versus reported LPN or RN hours. Results show that at-risk nursing homes significantly increased their reported RN hours (β = 0.014, p = .001), the most expensive type of labor, and did not significantly increase their reported CNA or LPN hours (β = 0.003, p = .658 and β = 0.004, p = .303, respectively). The first result corresponds to a reported increase in RN staffing of more than 0.84 min per resident per day. For comparison, the national reported average in 2012 was 44 min per resident per day for RN staffing. In short, at-risk homes reported an increase in qualified, not unqualified, nursing staff. Note that reporting an increase in RN staffing (vs. CNA or LPN staffing) makes strategic sense because this type of labor counts "twice" in the calculation of a home's staffing rating: it counts toward both the RN and the total staffing requirements (see Appendix S1). Hence these findings cast considerable doubt on the notion that bedsores went undertreated because nursing homes substituted low-skilled for high-skilled staff, or because the staff they hired were unskilled.
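A minimal sketch of the by-type estimation in Table 6 appears below (hypothetical names again; the hours-to-minutes conversion is the one used in the text):

```python
import pandas as pd
from linearmodels.panel import PanelOLS

df = pd.read_csv("nhc_panel.csv").set_index(["home_id", "quarter"])  # hypothetical

for outcome in ["cna_hours", "lpn_hours", "rn_hours"]:  # hours per resident per day
    res = PanelOLS.from_formula(
        f"{outcome} ~ 1 + at_risk_x_post + chain + for_profit + pct_medicaid"
        " + n_beds + home_density + EntityEffects + TimeEffects",
        data=df,
    ).fit(cov_type="clustered", cluster_entity=True)
    beta = res.params["at_risk_x_post"]
    print(f"{outcome}: beta = {beta:.3f} ({beta * 60:.2f} min/resident/day)")
```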
Another way to distinguish between false reporting and an inability to affect bedsores despite true
staffing increases is to investigate labor cost data. Our reasoning is that, if staffing hours per resident
per day actually increased, then there should be a concomitant increase in the salaries paid to nursing

TABLE 6  DiD predicting reported nursing hours by nursing type (CNAs, LPNs, or RNs)

                              FE OLS CNAs               FE OLS LPNs               FE OLS RNs
At risk X post change         0.003 (0.006) [0.658]     0.004 (0.004) [0.303]     0.014 (0.004) [0.001]
Chain (1/0)                   −0.014 (0.006) [0.019]    −0.007 (0.004) [0.059]    −0.000 (0.004) [0.950]
For profit (1/0)              −0.048 (0.010) [0.000]    0.004 (0.006) [0.434]     −0.033 (0.007) [0.000]
% residents on Medicaid       −0.000 (0.000) [0.017]    −0.001 (0.000) [0.000]    −0.001 (0.000) [0.000]
Number of beds                −0.001 (0.000) [0.009]    −0.000 (0.000) [0.235]    −0.001 (0.000) [0.000]
Home density (county-level)   −0.208 (0.054) [0.000]    −0.031 (0.029) [0.283]    −0.072 (0.094) [0.444]
Constant                      2.555 (0.034) [0.000]     0.879 (0.019) [0.000]     0.846 (0.033) [0.000]
Home FE                       Yes                       Yes                       Yes
Quarter FE                    Yes                       Yes                       Yes
R²                            0.00                      0.00                      0.07
N                             371,811                   371,811                   371,811

Note: SEs clustered by home in parentheses; p-values in brackets.

12 National estimates of average hourly pay rates are $28.47 for RNs, $19.06 for LPNs, and $11.68 for CNAs.

staff—especially since the reported increase is for the most expensive type of nursing (i.e., RNs).
Medicare-certified nursing homes are required to submit an annual cost report that contains such pro-
vider information as cost and charges by cost center (in total and for Medicare), Medicare settlement
data, and financial statement data; this information includes the direct salaries paid to full-time and
part-time nursing staff as well as the paid hours corresponding to those salaries. It is reasonable to
assume that increases in staffing should be reflected in these data. We therefore examined the cost
reports to see whether increased staffing hours are accompanied by increases in the direct salaries
paid to nursing staff. These cost report data have some missing observations and hence, especially
for regressions that include the full list of controls, a smaller N.
We start by taking as our dependent variable the annual direct salaries reported for full- and part-
time nursing staff. We replicate the previous analyses conducted with the bedsore data and find no
significant increase in the direct annual salaries paid by at-risk homes after the change in staffing cut-
off points (see models 1 and 2 in Table 7).
We then check whether any changes can be observed in the annual number of paid hours related
to full- and part-time nursing salaries. The preceding results, which showed no change in salary
expenses despite nursing homes reporting more staff hours, could arise if homes forced salaried
employees to work longer hours with no additional pay; hence analyzing the paid hours reported to
Medicare allows us to examine more directly how staffing hours changed as the cut-offs shifted. The
results (reported in models 3 and 4 of Table 7) again show no significant interaction. In other words:
whereas our earlier analyses suggest that nursing homes at risk of losing a star reported higher levels
of RN staffing as inputs to the NHC rating system, our analyses of data reported to Medicare show

TABLE 7  DiD predicting annual paid salaries and hours for nursing

                              (1) Paid salaries,                 (2) Paid salaries,                 (3) Paid hours,            (4) Paid hours,
                              nursing                            nursing                            nursing                    nursing
At risk X post change         −1,590.210 (1,864.839) [0.394]     −1,521.525 (1,912.779) [0.426]     107.263 (86.631) [0.216]   119.143 (92.897) [0.200]
Chain (1/0)                                                      3,394.585 (2,250.501) [0.131]                                 40.918 (142.910) [0.775]
For profit (1/0)                                                 −3,152.918 (6,867.194) [0.646]                                −246.100 (184.455) [0.182]
% residents on Medicaid                                          −126.458 (57.240) [0.027]                                     0.026 (5.238) [0.996]
Number of beds                                                   445.738 (136.770) [0.001]                                     10.906 (4.215) [0.010]
Home density (county-level)                                      −23,166.602 (12,408.749) [0.062]                              −1,476.888 (412.604) [0.000]
Constant                      243,856.212 (898.181) [0.000]      200,685.526 (16,751.427) [0.000]   8,000.967 (42.590) [0.000] 6,968.367 (614.459) [0.000]
Home FE                       Yes                                Yes                                Yes                        Yes
Quarter FE                    Yes                                Yes                                Yes                        Yes
R²                            0.01                               0.01                               0.00                       0.00
N                             190,848                            174,552                            201,686                    184,707

Note: All models are FE OLS. SEs clustered by home in parentheses; p-values in brackets.



no increase in either paid salaries or paid hours for full-time or part-time nursing staff. At best, these
findings suggest that nursing staff were not financially compensated for all their hours; at worst, they
cast more doubt on nursing home self-reports of increased staffing.
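A minimal sketch of these cost-report checks follows, assuming a hypothetical annual panel keyed by home and report year (the paper's own specification includes quarter fixed effects; we use year effects here as a stand-in):

```python
import pandas as pd
from linearmodels.panel import PanelOLS

cost = pd.read_csv("medicare_cost_reports.csv")  # hypothetical annual panel
panel = cost.set_index(["home_id", "year"])      # year stands in for the report period

for outcome in ["paid_salaries_nursing", "paid_hours_nursing"]:
    res = PanelOLS.from_formula(
        f"{outcome} ~ 1 + at_risk_x_post + chain + for_profit + pct_medicaid"
        " + n_beds + home_density + EntityEffects + TimeEffects",
        data=panel,
    ).fit(cov_type="clustered", cluster_entity=True)
    print(outcome, res.params["at_risk_x_post"], res.pvalues["at_risk_x_post"])
```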

7.2 | Longer time horizon required to reduce bedsore prevalence


One might wonder whether training newly hired staff to reposition residents takes time and whether, as a result, the corresponding improvements in residents' health fail to manifest immediately. If nursing homes hired new staff following the policy change but did not train them to reposition residents until later, then any reduction in bedsores attributable to the additional staffing may become evident only in the longer term. Note that this conjecture is difficult to reconcile with the fact that, as we previously
showed, overall staffing increases were driven by a rise in hours for the most skilled type of nurse—
RNs. Nonetheless, to examine this possibility, we run a lags and leads analysis with Proportion of
residents with bedsores as the dependent variable (see Appendix S5 for additional details). We find
no significant decrease in the prevalence of bedsores during the first, second, third, fourth, fifth, or
sixth period after the policy change. This result renders it unlikely that resident health outcomes
improve, even in the longer term, after reported staffing increases.
In summary, the results of DiD and 2SLS models suggest that the increase in reported nursing
hours driven by the change in staffing requirements was not associated with any statistically signifi-
cant change in bedsore prevalence. Moreover, the results of our analyses are inconsistent with several
possible explanations for why facilities might fail to reduce bedsores despite a true increase in their
staffing levels. These accounts include: (a) the replacement of expensive labor with cheaper labor;
and (b) a longer time horizon required to improve health outcomes. Finally, our analyses of salary
and hours data reported by homes to a separate Medicare database do not provide evidence of the
increased salary expenses and hours that one would expect based on the staffing levels reported to
the NHC ratings system. While we lack direct evidence of false reporting, this body of evidence leaves it open as a strong possibility.

8 | POSSIBLE SOURCES OF VARIANCE IN MISREPORTING

Having established an average tendency toward what we believe is the misreporting of staffing data,
we conduct post hoc analyses to assess possible sources of variance in this behavior. “Regulation by
information” theories are predicated on the idea that firms respond to ratings and other sorts of forced
information disclosure owing to the possibility of reputational loss, which in turn may adversely
affect consumer purchasing decisions. Building on this line of argument, we examine two factors that
may lead an organization to be more or less sensitive to changes in its reputation and hence more or
less likely to engage in behaviors such as misreporting; these factors are local competition and for-
profit status. We use triple-difference estimators to conduct the analyses.

8.1 | Local competition


One factor that may lead firms to be especially sensitive to the possibility of reputational loss is the strength of local competition. Where competition is greater, consumers have more choices, and when consumers have viable alternatives, any gain or loss in reputation matters more to a firm (Elfenbein, Fisman, & McManus, 2015). Hence greater local competition might intensify the
pressure that firms face to maintain their reputations, and research has shown that organizations may

devote greater effort to protecting their reputations under competitive pressure. For example, there is
some evidence that greater competition from geographically proximate rivals increases the likelihood
of posting positive but fake reviews on rating sites such as TripAdvisor and Expedia (Mayzlin et al.,
2014) and Yelp (Luca & Zervas, 2016).
We start by measuring competition indirectly via the Occupancy rate, a home-level measure cal-
culated as the total number of residents divided by the total number of beds (the sample average is
0.83). The intuition behind this analysis is that homes that operate close to capacity—and hence that
have few beds available for new residents—might be less incentivized to misreport staffing data. The
triple-difference estimators are presented in Table 8, for which N is smaller because of missing data.
The model 1 results suggest that occupancy rates do significantly moderate the reaction of treated homes: those under less competitive pressure (i.e., with fewer beds to fill) report staffing increases to a lesser extent (β = −0.002, p = .010). However, the occupancy rate does not moderate outcomes related to the prevalence of bedsores (see model 2); this finding is consistent with reported nursing hours failing to represent actual staffing accurately.
We then use our Home density measure to investigate the moderating effect of local competition
more directly. Recall that this variable represents the number of homes within the focal home's
county and is weighted by that county's population aged over 65 in 2010. Model 3 in Table 8 sug-
gests, in line with model 1, that competition moderates the reaction of homes at risk of losing a star:
such homes report greater increases in staffing when pressured by having more competitors within
their county (β = 0.101, p < .001). Yet when we take the proportion of residents with bedsores as a
dependent variable, home density again does not significantly moderate the relationship of interest
(see model 4). These results are consistent with previous research suggesting that higher levels of
competition tend to increase the extent of gaming behaviors.
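A minimal sketch of the triple-difference specification, here with home density as the moderator, appears below (hypothetical names; the occupancy-rate and profit-status models swap in the corresponding moderator):

```python
import pandas as pd
from linearmodels.panel import PanelOLS

df = pd.read_csv("nhc_panel.csv").set_index(["home_id", "quarter"])  # hypothetical

for outcome in ["reported_hours", "pct_bedsores"]:
    res = PanelOLS.from_formula(
        f"{outcome} ~ 1 + at_risk:home_density + post_change:home_density"
        " + at_risk:post_change:home_density + home_density + chain + for_profit"
        " + pct_medicaid + n_beds + EntityEffects + TimeEffects",
        data=df,
        drop_absorbed=True,  # at_risk and post_change main effects are absorbed by the FEs
    ).fit(cov_type="clustered", cluster_entity=True)
    print(outcome)
    print(res.params.filter(like=":"))  # the two- and three-way interaction terms
```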

8.2 | Profit status


Non-profit nursing homes constitute about 30% of the homes in our sample. In the health care mar-
ket, important implications follow from whether a facility is a non-profit organization or instead a
for-profit entity that operates with the goal of generating financial returns for its owners. Non-profit
nursing homes are required to reinvest any profits in the home and so do not distribute those profits
elsewhere (Weisbrod, 1998); as a result, administrators of non-profit homes may be more oriented
toward providing quality care and less concerned about ratings, whose main effect is ultimately on
the home's “bottom line.” That said, non-profit status does not completely eliminate managerial con-
cerns about how ratings may also affect the nursing home's ability to attract customers. Note also
that, apart from how consumer reactions to ratings may affect financial outcomes, the managers of
non-profit homes are likely to care deeply about their organization's reputation. In fact, because pro-
vision of high-quality care is more central to the mission of non-profit homes, managers may well be
no less concerned (than their for-profit counterparts) about ratings because such evaluations are a tan-
gible verification of success at achieving their mission.
Using a triple-difference (DiDiD) estimator, we examine whether changes in the "profit status" of a home affect the extent to which it might misreport staffing data. The results presented in model 5 of Table 8 indicate that for-profit nursing homes that become non-profit homes react more strongly to the cut-off policy change if they are at risk of losing a star, but the results are not statistically significant (β = −0.027, p = .178 for for-profit homes). To examine whether a change in profit status affects outcomes related to the prevalence of bedsores, we run a corresponding DiDiD analysis (see model 6 in Table 8). The results are marginally significant (p = .057) and sufficiently complex to warrant
TABLE 8  DiDiD predicting reported nursing hours and % residents with bedsores

Panel A: Occupancy rate          (1) Reported nursing        (2) % Bedsore
At risk X post change            0.001 (0.001) [0.135]       −0.003 (0.006) [0.556]
Post X occ. rate                 0.000 (0.001) [0.762]       0.014 (0.004) [0.000]
At risk X post X occ. rate       −0.002 (0.001) [0.010]      −0.001 (0.005) [0.808]
Occupancy rate                   −0.009 (0.001) [0.000]      −0.026 (0.004) [0.000]
Chain (1/0)                      −0.021 (0.008) [0.012]      −0.006 (0.060) [0.922]
For profit (1/0)                 −0.074 (0.015) [0.000]      −0.052 (0.103) [0.613]
% res. on Medicaid               −0.002 (0.000) [0.000]      −0.011 (0.002) [0.000]
Number of beds                   −0.004 (0.001) [0.000]      −0.004 (0.002) [0.055]
Home density                     −0.352 (0.156) [0.024]      −1.938 (0.472) [0.000]
Constant                         5.205 (0.085) [0.000]       15.170 (0.473) [0.000]
R²                               0.03                        0.21
N                                371,588                     254,531

Panel B: Home density            (3) Reported nursing        (4) % Bedsore
At risk X home density           0.580 (0.106) [0.000]       2.639 (2.190) [0.228]
Post X home density              −0.072 (0.023) [0.001]      0.772 (0.209) [0.000]
At risk X post X density         0.101 (0.028) [0.000]       −0.022 (0.276) [0.935]
Chain (1/0)                      −0.021 (0.008) [0.013]      −0.003 (0.060) [0.966]
For profit (1/0)                 −0.077 (0.015) [0.000]      −0.041 (0.103) [0.686]
% res. on Medicaid               −0.002 (0.000) [0.000]      −0.011 (0.002) [0.000]
Number of beds                   −0.002 (0.000) [0.000]      0.001 (0.002) [0.608]
Home density                     −0.635 (0.057) [0.000]      −1.380 (0.444) [0.002]
Constant                         4.284 (0.054) [0.000]       11.840 (0.321) [0.000]
R²                               0.02                        0.21
N                                371,811                     254,696

Panel C: Profit status           (5) Reported nursing        (6) % Bedsore
At risk X for profit             0.028 (0.033) [0.390]       −0.080 (0.221) [0.716]
Post X for profit                −0.008 (0.016) [0.640]      −0.689 (0.094) [0.000]
At risk X post X for profit      −0.027 (0.020) [0.178]      0.241 (0.127) [0.057]
Chain (1/0)                      −0.021 (0.008) [0.013]      −0.012 (0.060) [0.840]
For profit (1/0)                 −0.081 (0.026) [0.002]      0.293 (0.165) [0.076]
% res. on Medicaid               −0.002 (0.000) [0.000]      −0.010 (0.002) [0.000]
Number of beds                   −0.002 (0.000) [0.000]      0.002 (0.002) [0.464]
Home density                     −0.304 (0.162) [0.062]      −1.795 (0.459) [0.000]
Constant                         4.269 (0.056) [0.000]       11.661 (0.314) [0.000]
R²                               0.02                        0.21
N                                371,811                     254,696

Note: All models include home and quarter fixed effects. SEs clustered by home in parentheses; p-values in brackets.

further interpretation. Plotting the results of our triple interaction model suggests that, in both the
treated and control groups, becoming a non-profit is associated with lower reported levels of bedsores
in the pre-treatment period; however, for-profit and non-profit homes exhibit a different time trend. On
average, for-profit homes reduced their proportion of residents with bedsores between the pre- and
post-treatment periods—although at-risk for-profit homes did so to a lesser extent. In contrast, non-
profit homes reported (on average) an increase in the prevalence of bedsores among their residents
between the two periods, although at-risk non-profit homes reported a slightly smaller increase on
average. These results raise interesting questions for future research on whether the nursing staff at
non-profit facilities actually provide better care, as is widely supposed, than do their for-profit peers.
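A minimal sketch of the kind of plot described above (hypothetical names; it uses simple cell means rather than model-based predictions, which conveys the same group-by-period contrasts):

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("nhc_panel.csv")  # hypothetical panel
cells = (df.groupby(["for_profit", "at_risk", "post_change"])["pct_bedsores"]
           .mean()
           .unstack("post_change"))  # rows: (for_profit, at_risk); cols: 0 = pre, 1 = post

fig, ax = plt.subplots()
for (fp, risk), row in cells.iterrows():
    label = f"{'for-profit' if fp else 'non-profit'}, {'at risk' if risk else 'control'}"
    ax.plot([0, 1], row[[0, 1]].to_numpy(), marker="o", label=label)
ax.set_xticks([0, 1])
ax.set_xticklabels(["pre", "post"])
ax.set_ylabel("% residents with bedsores")
ax.legend()
plt.show()
```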

9 | CONCLUSION

Over the last few decades, the environment in which organizations operate has been profoundly trans-
formed by the rise of measurement, assessment, and evaluation devices such as ratings and rankings.
This transformation is both a manifestation of and a response to calls for organizational efficiency and
effectiveness as well as for increased accountability and transparency (Espeland & Sauder, 2016;
Fung, Graham, & Weil, 2007; Power, 1997; Strathern, 2000). However, the research examining orga-
nizations' strategic responses to ratings and other forms of third-party evaluation offers mixed evidence
for the effectiveness of “regulation by information”. Given the prevalence of these new forms and the
heterogeneous responses to them, scholars have argued that better “understanding the disciplinary
effects of measures like rankings is crucial” (Sauder & Espeland, 2009, p. 80).
Our work responds to this call by examining how nursing homes responded to changes in staffing
requirements associated with the NHC five-star rating system. This system was implemented so that
consumers could better distinguish between higher- and lower-quality nursing homes, and theories of
regulation by information argue that empowering consumers in this manner should lead organiza-
tions to improve. Yet our analyses raise serious questions about the value of the information con-
veyed by these ratings and suggest that consumers should be cautious in their reliance upon these
data as a sole or primary driver of decisions regarding the care of their loved ones.
In particular, we find that homes at risk of losing a star under the new staffing requirements
responded to this pressure by reporting increases in nursing hours. These increases were small on average, which, in and of itself, raises questions about whether the rating system is an effective driver of organizational change. We also showed that these reported increases in RN hours did not translate into improved resident outcomes: contrary to predictions in the medical literature, there was no associated decline in the proportion of residents with bedsores.
This pattern of evidence is consistent with two sets of possible explanations: insufficient or inef-
fective hiring, whereby homes increased their staffing hours but for some reason did not realize a
corresponding improvement in resident outcomes; or gaming, whereby homes responded to the new
requirements by reporting inaccurate data. We present several supplementary analyses aimed at dis-
tinguishing between those two sets of possibilities. Of these analyses, two are particularly important.
First, we show that homes did not respond to the requirement to increase staffing hours primarily by hiring less costly, lower-skilled nursing staff. Rather, at-risk homes reported increases in the
most skilled type of employee—registered nurses—most likely because RN hours are counted both
separately and as contributing to total nursing hours. This finding suggests that a lack of skill
(i.e., ineffective hiring) does not explain the observed lack of improvement in bedsores. A second
key piece of evidence is our examination of the salary costs and hours reported to Medicare. The data
show no rise in salary expenses or in paid hours following implementation of the change—a finding

that is difficult to reconcile with nursing home reports of an increase in the number of RN hours
worked. As in other settings where researchers have used indirect approaches to shine a spotlight on
possible gaming (e.g., Mayzlin et al., 2014), we acknowledge that we cannot definitively establish
that nursing homes reported false data. However, having sought out and thoroughly investigated
other possible explanations for our findings, we are left to conclude that false or misleading reporting
is a strong possibility.
Finally, in subsequent explorations of the main finding, we analyzed conditions under which
gaming might be more prevalent. Echoing the work of others in the health care domain (e.g., Snyder,
2010), we find that gaming is more pervasive when facilities face more competition. We also find,
similar to Burbano and Ostler (2018), some evidence that gaming might be more common among
non-profit organizations. Identifying variations in responses to ratings is important because doing so
helps identify the circumstances under which regulators and consumers should be most wary of rely-
ing too much on the information that organizations provide.
By highlighting the strong likelihood of gaming behaviors in a critical domain (viz., health care),
our work underscores the need for caution in the design of rating systems and for continuous moni-
toring to evaluate whether ratings actually achieve their goals. Indeed, CMS has since April 2018
been moving away from reliance on the self-reported data analyzed here and now requires homes to
submit staffing information electronically based on payroll and other auditable data.13 This transition
was prompted by suspicions of inaccuracies in self-reported data (U.S. Government Accountability
Office, 2015), and our analyses indicate that such concerns were likely justified.
Of course, our findings bear implications also for how stakeholders should evaluate and incorporate
ratings into their decision-making processes. When consumers and other audiences rely on ratings, rank-
ings, and other forms of regulation by information, they implicitly place their trust in the accuracy of the
data that underlie those summary evaluations (e.g., star ratings). Prior work has questioned whether that
trust is warranted by showing how evaluations can be tainted by biases or conflicts of interest on the part
of ratings organizations (Fleischer, 2009). Our paper has also shown that such trust may be unwarranted
when the data on which ratings are based are unverified and hence subject to manipulation. Our results
suggest that stakeholders should pay closer attention to these details of rating systems and take the rat-
ings “with a grain of salt.” Unfortunately, this open-eyed approach may negate the very benefits that rat-
ings were designed to achieve—namely, making quality assessments less time-intensive.
Finally, our results should be of interest to scholars of status and reputation. It is worth
considering how processes involving reputation may differ depending on whether reputation is gen-
erated via relatively formal means (e.g., ratings) or by relatively informal ones (e.g., word of mouth).
Espeland and Sauder (2016) note that ratings typically attempt to distill hard-to-measure constructs,
such as quality of care, into a single number—a “quantitative” assessment that is imbued with an
aura of objectivity. We therefore speculate that ratings-based reputational measures may influence
stakeholders, such as consumers and investors, to a greater extent than do less formal measures. This
paper suggests that, because stakeholders may give more credence to formalized measures than to
other means of assessing reputation, organizations may be more responsive to the former and hence
more likely to include gaming as part of their response. One implication is that reputational measures
may be less reflective of “true” quality under formalized rating systems than under informal systems.
Future work would do well to test these arguments.
We must acknowledge that the allure of ratings and other third-party evaluations is due, in part, to their distillation of a variety of generally complex data down to a simple metric that is easy to comprehend. The downside, however, is that consumers may have little sense of what is going on "under the hood." As Espeland and Sauder (2007, p. 4) remark: "Awash in numbers, we now take social statistics for granted so much that we forget how hard it is to make them." We hope that our study not only serves to remind readers about the limits of regulation by information but also offers useful insight to those in the business of designing such systems.

13 https://www.cms.gov/Medicare/Provider-Enrollment-and-Certification/CertificationandComplianc/downloads/usersguide.pdf

ACKNOWLEDGEMENTS
The authors appreciate the extremely helpful feedback of editor Lamar Pierce and two anonymous reviewers. The authors are also grateful for comments from Marissa King, Elizabeth Pontikes, Olav Sorenson, and Chris Yenkey, as well as from seminar participants at the 2016 Junior Organizational Theory Conference, the 2016 Economic Sociology Conference, the 2017 Duke Strategy Conference, and HEC Paris. All errors remain our own.

REFERENCES

Akerlof, G. (1970). The market for 'lemons': Quality uncertainty and the market mechanism. Quarterly Journal of Economics, 84(3), 488–500.
Angrist, J., & Pischke, J.-S. (2008). Mostly harmless econometrics: An empiricist's companion. Princeton, NJ: Princeton University Press.
Ashenfelter, O., & Card, D. (1985). Using the longitudinal structure of earnings to estimate the effect of training pro-
grams. The Review of Economics and Statistics, 67(4), 648–660.
Autor, D. H. (2003). Outsourcing at will: The contribution of unjust dismissal doctrine to the growth of employment
outsourcing. Journal of Labor Economics, 21, 1–42.
Bennear, L. S., & Olmstead, S. M. (2008). The impacts of the ‘right to know’: Information disclosure and the violation
of drinking water standards. Journal of Environmental Economics and Management, 56(2), 117–130.
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How much should we trust differences-in-differences estimates?
Quarterly Journal of Economics, 119(1), 249–275.
Bostick, J. E., Rantz, M. J., Flesner, M. K., & Riggs, C. J. (2006). Systematic review of studies of staffing and quality
in nursing homes. Journal of the American Medical Directors Association, 7(6), 366–376.
Bowman, N. A., & Bastedo, M. N. (2009). Getting on the front page: Organizational reputation, status signals, and the
impact of the ‘U.S. News & World Report’ on student decisions. Research in Higher Education, 50, 415–436.
Burbano, V., & Ostler, J. (2018). Misconduct by nonprofit, for-profit and public organizations: Everybody cheats, but
with different objectives and outcomes (Unpublished Working Paper).
Casey, G. (2013). Pressure ulcers reflect quality of nursing care. Nursing New Zealand, 19(10), 20–24.
Center for Public Integrity. (2014). Analysis shows widespread discrepancies in staffing levels reported by nursing
homes. Retrieved from http://www.publicintegrity.org/2014/11/12/16246/analysis-shows-widespread-discrepanci
es-staffing-levels-reported-nursing-homes.
Centers for Medicare and Medicaid Services. (2009). Design for nursing home compare five-star quality rating system: Technical users' guide.
Centers for Medicare and Medicaid Services. (2012). Design for nursing home compare five-star quality rating system: Technical users' guide.
Centers for Medicare and Medicaid Services. (2015). Design for nursing home compare five-star quality rating system: Technical users' guide.
Chatterji, A. K., & Toffel, M. W. (2010). How firms respond to being rated. Strategic Management Journal, 31(9),
917–945.
Chay, K. Y., McEwan, P. J., & Urquiola, M. (2005). The central role of noise in evaluating interventions that use test
scores to rank schools. American Economic Review, 95(4), 1237–1258.
DellaVigna, S., & La Ferrara, E. (2010). Detecting illegal arms trade. American Economic Journal: Economic Policy, 2(4), 26–57.
Dranove, D., Kessler, D., McClellan, M., & Satterthwaite, M. (2003). Is more information better? The effects of 'report cards' on health care providers. Journal of Political Economy, 111(3), 555–588.

Duggan, M., & Levitt, S. D. (2002). Winning isn't everything: Corruption in sumo wrestling. American Economic
Review, 92(5), 1594–1605.
Ederer, F., Holden, R., & Meyer, M. (2018). Gaming and strategic opacity in incentive provision. RAND Journal of
Economics, 49(4), 819–854.
Elfenbein, D. W., Fisman, R., & McManus, B. (2015). Market structure, reputation, and the value of quality certifica-
tion. American Economic Journal: Microeconomics, 7(4), 83–108.
Espeland, W. N., & Sauder, M. (2007). Rankings and reactivity: How public measures recreate social worlds. Ameri-
can Journal of Sociology, 113(1), 1–40.
Espeland, W. N., & Sauder, M. (2016). Engines of anxiety: Academic rankings, reputation, and accountability.
New York, NY: Russell Sage Foundation.
Fleischer, A. (2009). Ambiguity and the equity of rating systems: United States brokerage firms, 1995-2000. Adminis-
trative Science Quarterly, 54(4), 555–574.
Fung, A., Graham, M., & Weil, D. (2007). Full disclosure: The perils and promise of transparency. Cambridge,
England: Cambridge University Press.
Gordon, T. P., Knock, C. L., & Neely, D. N. (2009). The role of ratings agencies in the market for charitable contribu-
tions: An empirical test. Journal of Accounting and Public Policy, 28, 469–484.
Grabowski, D. C., & Castle, N. G. (2004). Nursing homes with persistent high and low quality. Medical Care
Research and Review, 61(1), 89–115.
Grabowski, D. C., & Hirth, R. A. (2003). Competitive spillovers across non-profit and for-profit nursing homes. Jour-
nal of Health Economics, 22(1), 1–22.
Gubler, T., Larkin, I., & Pierce, L. (2016). Motivational spillovers from awards: Crowding out in a multitasking envi-
ronment. Organization Science, 27(2), 286–303.
Hillmer, M. P., Wodchis, W. P., Gill, S. S., Anderson, G. M., & Rochon, P. A. (2005). Nursing home profit status and
quality of care: Is there any evidence of an association? Medical Care Research and Review, 62(2), 139–166.
Horn, S. D., Buerhaus, P., Bergstrom, N., & Smout, R. J. (2005). RN staffing time and outcomes of long-stay nursing
home residents: Pressure ulcers and other adverse outcomes are less likely as RNs spend more time on direct
patient care. American Journal of Nursing, 105(11), 58–70.
Institute of Medicine. (1986). Improving the quality of care in nursing homes. Committee on nursing home regulation.
Washington, DC: National Academy Press.
Jacob, B. A., & Levitt, S. D. (2003). Rotten apples: An investigation of the prevalence and predictors of teacher
cheating. Quarterly Journal of Economics, 118(3), 843–877.
Jin, G. Z., & Leslie, P. (2003). The effect of information on product quality: Evidence from restaurant hygiene grade
cards. Quarterly Journal of Economics, 118, 409–451.
Kane, R. L., Ouslander, J. G., & Abrass, I. B. (1989). Essentials of clinical geriatrics (2nd ed.). New York, NY:
McGraw-Hill.
Lange, D., Lee, P. M., & Dai, Y. (2011). Organizational reputation: A review. Journal of Management, 37(1),
153–184.
Lin, H. (2014). Revisiting the relationship between nurse staffing and quality of care in nursing homes: An instrumen-
tal variables approach. Journal of Health Economics, 37, 13–24.
Luca, M., & Zervas, G. (2016). Fake it till you make it: Reputation, competition, and yelp review fraud. Management
Science, 62(12), 3412–3427.
Lyder, C. H., & Ayello, E. A. (2008). Pressure ulcers: A patient safety issue. Retrieved from http://www.ahrq.gov/qual/nurseshdbk/docs/LyderC_PUPSI.pdf
Lyon, T. P., & Shimshack, J. P. (2015). Environmental disclosure: Evidence from Newsweek's green companies rank-
ings. Business & Society, 54(5), 632–675.
Mayzlin, D., Dover, Y., & Chevalier, J. A. (2014). Promotional reviews: An empirical investigation of online review
manipulation. American Economic Review, 104(8), 1–40.
Meredith, M. (2004). Why do universities compete in the ratings game? An empirical analysis of the effects of the ‘U.
S. News & World Report’ college rankings. Research in Higher Education, 45(5), 443–461.
Nightingale, F. (1859). Notes on nursing: What it is, and what it is not. New York, NY: D. Appleton and Company.
Osterman, P. (2018). Who will care for us? Long-term care and the long-term workforce. New York, NY: The Russell
Sage Foundation.

Pierce, L., & Toffel, M. W. (2013). The role of organizational scope and governance in strengthening private monitor-
ing. Organization Science, 24(5), 1558–1584.
Pope, D. G. (2009). Reacting to rankings: Evidence from ‘America's best hospitals’. Journal of Health Economics, 28
(6), 1154–1165.
Power, M. (1997). The audit society: Rituals of verification. Oxford, England: Oxford University Press.
Rao, H. (1994). The social construction of reputation: Certification contests, legitimation, and the survival of organiza-
tions in the American automobile industry: 1895–1912. Strategic Management Journal, 15, 29–44.
Rau, J. (2018, July 7). ‘It's Almost Like a Ghost Town.’ Most Nursing Homes Overstated Staffing for Years. The
New York Times. Retrieved from https://www.nytimes.com/2018/07/07/health/nursing-homes-staffing-medicare.html.
Rindova, V. P., Williamson, I. O., Petkova, A. P., & Sever, J. M. (2005). Being good or being known: An empirical
examination of the dimensions, antecedents, and consequences of organizational reputation. Academy of Manage-
ment Journal, 48(6), 1033–1049.
Sauder, M., & Espeland, W. N. (2009). The discipline of rankings: Tight coupling and organizational change. Ameri-
can Sociological Review, 74(1), 63–82.
Sauder, M., & Lancaster, R. (2006). Do rankings matter? The effects of ‘U.S. News & World Report’ rankings on the
admissions process of law schools. Law and Society Review, 40(1), 105–134.
Schneiberg, M., & Bartley, T. (2008). Organizations, regulation, and economic behavior: Regulatory dynamics and
forms from the nineteenth to twenty-first century. Annual Review of Law and Social Science, 4, 31–61.
Sharkey, A. J., & Bromley, P. (2015). Can ratings have indirect effects? Evidence from the organizational response
to Peers' environmental ratings. American Sociological Review, 80(1), 63–91.
Snyder, J. (2010). Gaming the liver transplant market. Journal of Law, Economics, and Organization, 26(3), 546–568.
Strathern, M. (2000). Afterword: Accountability…and ethnography. In M. Strathern (Ed.), Audit cultures: Anthropological studies in accountability, ethics and the academy (pp. 279–304). Abingdon, England: Routledge.
Thomas, K. (2014, August 24). Medicare Star Ratings Allow Nursing Homes to Game the System. The New York
Times. Retrieved from https://www.nytimes.com/2014/08/25/business/medicare-star-ratings-allow-nursing-homes-
to-game-the-system.html?_r=0.
U.S. General Accounting Office. (1998). California nursing homes: Care problems persist despite federal and state
oversight. Washington, DC: U.S. General Accounting Office.
U.S. General Accounting Office. (2002). Nursing homes: Public reporting of quality indicators has merit, but national
implementation is premature. Washington, DC: U.S. General Accounting Office.
U.S. Government Accountability Office. (2015). Nursing home quality: CMS should continue to improve data and
oversight. Washington, DC: U.S. Government Accountability Office.
U.S. Senate, Subcommittee on Long-term Care of the Senate Special Committee on Aging. (1974). Nursing home care
in the United States: Failure in public policy. Washington, DC: United States Government Printing Office.
Weisbrod, B. A. (1998). The nonprofit mission and its financing: Growing links between nonprofits and the rest of the
economy. In B. A. Weisbrod (Ed.), To profit or not to profit: The commercial transformation of the nonprofit sec-
tor (pp. 1–22). Cambridge: Cambridge University Press.
Werner, R. M., Konetzka, R. T., & Polsky, D. (2016). Changes in consumer demand following public reporting of summary quality ratings: An evaluation in nursing homes. Health Services Research, 51(S2), 1291–1309.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the
end of this article.

How to cite this article: Ody-Brasier A, Sharkey A. Under pressure: Reputation, ratings, and
inaccurate self-reporting in the nursing home industry. Strat Mgmt J. 2019;40:1517–1544.
https://doi.org/10.1002/smj.3063
