ARTICLE INFO
JEL classification: C57, C61, C72, L11, L13
Keywords: Two-sided markets, Advertising, Television, Business stealing
ABSTRACT
We estimate the welfare consequences of local news broadcasting decisions in advertiser-funded
television, a central question in media regulation. We model programming decisions as the
outcome of a discrete game played by rival stations competing for advertising revenue (which
depends on viewing) by choosing lineups of local news and entertainment broadcasts. Using
program-level data on television viewing and advertising prices, we find modest under-provision
of local news relative to the level that maximizes television viewing. Counterfactual simulations
indicate an average deficit of 7.4 broadcasts per market, or 12.8% of local station broadcasts
during the evening news hours. Most of this shortfall is in the 7:30 timeslot leading into prime
time. We distinguish two sources of inefficiency: losses due to advertiser valuation of local news
and entertainment viewers, and losses from classic business stealing. Losses from competition
represent about one third of the estimated shortfall, suggesting gains to cooperation among
competing stations.
1. Introduction
The relationship between market competition and television programming has interested economists since the very start of the
TV era when Steiner (1952), and later Beebe (1977), outlined how competitive broadcasters might duplicate mass content and offer
too little niche programming. More recently, theoretical work on two-sided markets such as Anderson and Gabszewicz (2006) shows
how advertiser funding can further distort programming away from viewer preferences. The concern that competitive markets might
under-provide local news has motivated strong and long-standing regulation of local television markets in the United States through
ownership caps and cross-ownership restrictions. Indeed, regulatory requirements have remained largely unchanged despite dramatic
changes in news delivery brought about by cable television, the internet and mobile platforms.
Despite policy and research interest in local news, little empirical research speaks to welfare outcomes associated with local tele-
vision broadcasts. We offer an empirical analysis of oligopolistic competition among broadcasters in a two-sided market framework
that makes welfare trade-offs in program choice concrete. Our analysis can show not only whether local news is under-provided in
television markets, but also why.
Our general approach is to model station programming decisions as the outcome of a discrete game of complete information
played by rival stations. Stations compete for advertising revenue, which depends on viewing, by choosing a lineup of local news and
entertainment broadcasts spanning the six half-hour timeslots between 5:00 and 8:00 p.m. We focus on this time period because it
* Corresponding author.
E-mail addresses: [email protected] (M.J. Baker), [email protected] (L.M. George).
https://doi.org/10.1016/j.ijindorg.2024.103068
Received 19 May 2022; Received in revised form 26 March 2024; Accepted 1 April 2024
Available online 5 April 2024
0167-7187/© 2024 Elsevier B.V. All rights reserved.
M.J. Baker and L.M. George International Journal of Industrial Organization 94 (2024) 103068
brackets the traditional evening “news hours” when local stations (rather than national networks) select programming. We focus on
lineups to account for realistic features of television markets such as complementarity among program formats and the importance
of lags and leads in viewing. We focus on revenue maximization due to the availability of high-quality revenue data but limited cost
information, though we account for potential cost heterogeneity with station effects.
We estimate viewing and advertising functions that allow interdependence in stations’ payoffs, then use a game-theoretic
econometric model to account for this interdependence and for possible equilibrium multiplicity. We estimate the model using
a monthly average of station-level viewing and advertising revenue in each timeslot in each market, identifying parameters from
variation within stations over time, and across stations in each market in each timeslot. We account for time preferences with con-
trols for local news-national news sequencing and viewing lags. We account for potential unobserved time-preferences for local news
programming by allowing our viewing parameter to depend on market size, which is correlated with preference heterogeneity in
media settings.
To conduct our welfare analysis, we simulate the model to determine the three counterfactual program lineups that maximize
total program views, aggregate advertiser surplus, and joint station revenue, respectively. We then compare the three maximizing
lineups to each other and to observed schedules.
Our viewing and advertising models extend the approach of Berry and Waldfogel (1999) to incorporate differentiated program-
ming.1 To study viewing, we formulate a nested multinomial logit model of television program demand. The multinomial logit
formulation adapts well to situations where the number of choices varies across markets, a key feature of television markets, and
allows for straightforward calculation of viewer utility (McFadden, 1980). In “free-to-air” media such as television and radio, it is
not possible to directly measure the value of programming, so our viewer optimum is measured relative to the outside option of not
watching television. Our advertising model is based on a log-linear demand for viewer impressions by advertisers which changes
functional form with program type. This specification allows advertiser value for additional viewers to vary across formats, which is
relevant in television advertising.2
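To make the viewing model concrete, the following is a minimal sketch of nested multinomial logit choice shares with an outside option whose utility is normalized to zero. The nest assignment, the mean utilities and the nesting parameter `sigma` are hypothetical placeholders, not the article's specification or estimates; the log of the denominator is the usual expected-utility (inclusive value) measure relative to not watching television.

```python
import math

def nested_logit_shares(delta, nests, sigma):
    """Nested logit shares with an outside option (utility normalized to 0).

    delta : dict mapping program -> mean utility (hypothetical values)
    nests : dict mapping program -> nest label (hypothetical grouping)
    sigma : nesting parameter in [0, 1); sigma = 0 gives the plain logit
    """
    scale = 1.0 - sigma
    # Sum of exp(delta / (1 - sigma)) within each nest.
    nest_sums = {}
    for prog, d in delta.items():
        nest_sums[nests[prog]] = nest_sums.get(nests[prog], 0.0) + math.exp(d / scale)
    # Denominator: outside option (exp(0) = 1) plus each nest's inclusive term.
    denom = 1.0 + sum(total ** scale for total in nest_sums.values())
    shares = {}
    for prog, d in delta.items():
        total = nest_sums[nests[prog]]
        within_nest = math.exp(d / scale) / total   # share of program within its nest
        nest_share = (total ** scale) / denom       # share of the nest overall
        shares[prog] = within_nest * nest_share
    outside_share = 1.0 / denom
    expected_utility = math.log(denom)              # relative to the outside option
    return shares, outside_share, expected_utility

# Hypothetical example: two local news programs, one entertainment program.
delta = {"news_a": -2.0, "news_b": -2.5, "ent_c": -1.5}
nests = {"news_a": "news", "news_b": "news", "ent_c": "entertainment"}
shares, outside, value = nested_logit_shares(delta, nests, sigma=0.4)
```

Because every program's share is computed from the same denominator, adding or removing a program changes the shares of all others, which is what lets the number of choices vary across markets.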
With these viewing and advertising functions, local stations choose evening lineups to maximize revenue in oligopolistic com-
petition. We require that observed lineups both maximize station revenue and reflect a best response to the lineups of competing
stations. That is, we posit that the observed programming lineups, and the resulting viewership and advertising revenues, constitute
a pure-strategy Nash equilibrium of a complete information static game.
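The equilibrium requirement can be illustrated with a brute-force check in a toy version of the game. The sketch below enumerates pure-strategy Nash equilibria when each station picks a single program type, "L" (local news) or "E" (entertainment), for one timeslot; the revenue numbers are invented for illustration and the function name is hypothetical. The article's actual game is over six-slot lineups and is far larger.

```python
from itertools import product

def pure_nash_equilibria(payoffs, actions=("L", "E")):
    """Return all action profiles from which no station gains by deviating alone.

    payoffs maps a tuple of actions (one per station) to a tuple of revenues.
    """
    n_stations = len(next(iter(payoffs)))
    equilibria = []
    for profile in product(actions, repeat=n_stations):
        stable = True
        for i in range(n_stations):
            for alternative in actions:
                if alternative == profile[i]:
                    continue
                deviation = profile[:i] + (alternative,) + profile[i + 1:]
                # A strictly profitable unilateral deviation rules the profile out.
                if payoffs[deviation][i] > payoffs[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria

# Hypothetical two-station revenues (station 1, station 2).
payoffs = {
    ("L", "L"): (2, 2),
    ("L", "E"): (3, 6),
    ("E", "L"): (6, 3),
    ("E", "E"): (4, 4),
}
equilibria = pure_nash_equilibria(payoffs)
best_joint = max(sum(revenues) for revenues in payoffs.values())
```

With these made-up numbers the only equilibrium is for both stations to air entertainment, with joint revenue of 8, even though joint revenue would be 9 if one station switched to local news. That is the flavor of the business stealing distortion discussed later.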
The Nash equilibrium conditions of the station game require calculation of a large number of counterfactual viewing and ad-
vertising prices in a vast strategy space. Given the computational cost of estimation in this setting, we adopt a simulation-based
approach to estimation based on Ackerberg (2009) with elements from Bajari et al. (2010a) and Chernozhukov and Hong (2003).
As in Bajari et al. (2010a) and similar to ideas in Ellickson and Misra (2011a), we first estimate a preliminary model. We use the
preliminary model to simulate counterfactual viewing and advertising revenue consistent with observed outcomes in each market,
assuming observed lineups represent a pure-strategy Nash equilibrium. We add a second layer of simulation to check for additional
equilibria at a given draw for error terms. We weight simulations in inverse proportion to the number of pure-strategy equilibria in
the likelihood, then re-estimate the model to produce the fully corrected parameter estimates.
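The weighting step can be sketched in a few lines. This is only a toy illustration of inverse-multiplicity weighting, assuming each simulation draw comes with a likelihood contribution and a count of pure-strategy equilibria found at that draw; the function name and inputs are hypothetical, and the authors' estimator (detailed in their appendix) may differ.

```python
def weighted_simulated_likelihood(contributions, n_equilibria):
    """Average simulated likelihood contributions, down-weighting draws
    at which several pure-strategy equilibria exist.

    contributions : likelihood contribution of each simulation draw
    n_equilibria  : number of pure-strategy equilibria found at each draw (>= 1)
    """
    if len(contributions) != len(n_equilibria):
        raise ValueError("one equilibrium count is needed per draw")
    weights = [1.0 / k for k in n_equilibria]       # inverse-multiplicity weights
    weighted = [w * c for w, c in zip(weights, contributions)]
    return sum(weighted) / len(contributions)

# Two hypothetical draws: the second admits two equilibria, so it counts half.
value = weighted_simulated_likelihood([0.2, 0.4], [1, 2])
```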
To conduct our welfare analysis, we simulate the model to predict the optimal lineups from the perspective of viewers, adver-
tisers and stations. We find that observed program lineups provide less local news than lineups that maximize television viewing.
Simulations show the average shortfall to be 7.4 half-hour local news broadcasts per market each day during the evening news hours
of 5:00–8:00 p.m., or about 0.77 half-hour broadcasts per local station. Stated another way, 12.8% of local station broadcasts are
misallocated to entertainment programming during the evening news hours. Most of the estimated shortfall occurs in the 7:00-8:00
p.m. hour leading up to prime-time. Shifting programming to the configuration that maximizes viewing would increase total viewing
by 8% per market on average (25 program views per thousand households). Cumulative local news viewing would approximately
double.
With our simulations we distinguish two sources of welfare loss. The first arises from differences between the value of local
news and entertainment audiences to advertisers, which we characterize as a two-sided market loss. We find an average difference
of 4 broadcasts per market between the number of local news broadcasts that maximizes viewing and the number that maximizes
advertising revenue. In other words, total viewing would increase with more local news broadcasts, but stations do not provide them
because advertisers will pay more for entertainment broadcasts, especially in later timeslots. The difference between the viewer and
advertiser optima is greatest in percentage terms in smaller markets.
The second source of welfare loss arises from classic business stealing in station competition. Our simulations indicate that the
observed number of local news broadcasts is less than the number that would maximize joint station revenue and advertiser surplus
in 92 markets in our sample. That is, joint revenue for both advertisers and stations would rise if at least one station would switch
to local news from entertainment programming, but no single station has an incentive to differentiate. The average shortfall from
the station optimum is 2.8 local news broadcasts per market, about one third of the total shortfall of 7.4 from the viewer maximum.
Our results indicate that fully coordinated programming decisions would increase advertising prices by $0.30 (2.8%) per advertising
second, or $1,071 per station per day. Advertiser surplus would also increase by $0.27 (3.5%) per second at the collusive outcome,
1 Our approach also shares some features of Berry et al. (2015), who investigate differentiation in radio markets. However, our focus is on strategic interaction
among established competitors rather than entry, reflecting institutional differences between radio and television markets. We also examine efficiency aspects of
strategic broadcast decisions.
2 See George and Hogendorn (2012) for a theoretical treatment of advertising context and Furnham et al. (2002) and Goldfarb and Tucker (2011) for evidence
from experiments.
or $960 per station per day on average. Total viewing at the station optimum would increase by 7.5 program views per thousand,
or 2.5%. The amount of misallocation and revenue losses from business stealing are higher in larger markets, consistent with game-
theoretic results that coordination problems increase with more players.
From a policy standpoint, our results suggest that long-standing attention to provision of local news has been warranted. However,
policy attention with respect to mergers and cross-ownership has focused on small markets with few competitors (Makuch and
Levy, 2021). Our result that under-provision of local news is greater in larger, more diverse markets highlights the importance of
understanding sources of inefficiency. In particular, both the economics literature and the FCC have emphasized that because news
and other media products are produced with fixed costs, larger markets can sustain more outlets than smaller ones. However, our
results indicate that business stealing incentives reduce the equilibrium amount of local news more in larger markets with more
participants. Viewing losses overall and for local news are also higher in larger markets. As a result, heightened competition will
not necessarily bring forth more supply. Although a direct analysis of cross-ownership rules and other anti-trust provisions in news
markets is beyond the scope of this paper, our results suggest that coordination can improve outcomes.
Our research contributes empirical evidence to an expansive theoretical literature on inefficiencies in differentiated product
markets, which can arise in both the number and mix of products. Most familiar is the potential for excess entry, which can occur if
products offered by entrants are close substitutes for existing varieties so that new entrants divert consumers from existing options.3
Also well known are inefficiencies in the product mix, which can arise when firms face incentives to cluster in regions of product space
with high demand, or to excessively differentiate in order to sustain high prices.4 Taken together, this literature demonstrates that
under a range of consumer preference distributions and timing assumptions, product location choices can fall well short of first-best.
Location models without prices, such as the median voter result of Downs (1957), suggest even more pronounced distortions.
An important class of differentiated product markets, namely advertiser-funded media, is also two-sided. Product positioning in
two-sided markets has not been extensively studied theoretically, but conceptually any divergence between the marginal value of
differentiated products to consumers versus advertisers introduces the potential for distortions in the product mix. Heterogeneous
valuations for media types among advertisers can arise from the demographic mix on the consumer side, but also through better
alignment with advertised products or affective context more amenable to persuasion. The source of inefficiency matters in the
television market, as distortions driven by competition are likely more readily tackled with market structure regulation whereas
distortions arising from two-sided market tradeoffs suggest a need for subsidies or other price remedies.
Our study is also informed by a well-developed empirical literature on variety and consumption in media markets. Much of this
literature explicitly or implicitly considers the relationship between product variety and market size. Overall, the positive relationship
between market size, available variety and consumption has been demonstrated in radio (Waldfogel, 2003), newspapers (George and
Waldfogel, 2003) and entertainment television (Waldfogel et al., 2004). This literature suggests that the welfare implications of
variety are particularly important for minority taste groups, as larger minority populations are generally found to increase per capita
consumption among these groups. We might expect similar effects to operate in local television news markets, and one contribution
of this article is to document the effect of market size and the distribution of tastes on the supply of local news programs and local
news viewing.
Our article also contributes to the debate on localism, which is the subject of an interesting literature on the competition between
national and local media products. George and Waldfogel (2006) document that the national expansion of the New York Times
attracted highly educated readers away from local media, triggering re-positioning in local newspapers. George (2008) documents
the effect of the spread of the internet on the composition of the local newspaper audience. Anderson et al. (1997) offer a theoretical
framework for thinking about competition and welfare when national and local media compete. This literature is driven by the
intuition that firms producing national products can spread fixed quality investments over a larger market than local producers. As
most of the expansion in both news and entertainment programming associated with improvements in television technology has been
national, this mechanism might be expected to operate in television markets. One key aspect of our estimation procedure and model
is the interaction between local and national television news.
Our work complements recent efforts to empirically estimate welfare outcomes in two-sided media markets with differentiated
products. Most closely related to our work is the study of excess entry in radio broadcasting by Berry and Waldfogel (1999), which was
extended by Berry et al. (2015) to consider excess entry when products are differentiated. Also closely related are recent studies
of mergers in two-sided media markets. Fan (2013) considers the welfare effect of mergers on newspaper characteristics, including
the local news share. Filistrucchi et al. (2012) estimate merger effects on prices and welfare in the Dutch newspaper market, and
Jeziorski (2014) considers the effect of mergers on advertising prices and quantities in radio markets.
From a methodological perspective, our empirical approach builds on research that incorporates ideas from game theory into
model estimation.5 The literature on estimating discrete games centers on using information revealed in market outcomes to estimate
profit functions in the absence of information about profits. Our data, however, has detailed post-outcome information in the form of
both viewership and advertising expenditure, which constitute the bulk of station revenue. In this regard, our methods relate more
closely to the work of Zhu et al. (2009), Ellickson and Misra (2008), and Ellickson and Misra (2011a).
3 Models of entry and competition in this spirit begin with Chamberlin (1933) and are extended in important contributions by Spence (1976a), Spence (1976b),
Dixit and Stiglitz (1977) and Sutton (1991).
4 A large literature starting with Hotelling (1929) and developed by d’Aspremont et al. (1979) documents inefficiencies in the location choices of firms.
5 Early literature includes Reiss and Spiller (1989), Bresnahan and Reiss (1991), and Bjorn and Vuong (1984). For recent reviews, see Berry and Reiss (2007),
Ackerberg (2009), Bajari et al. (2010a), Bajari et al. (2010b), Ellickson and Misra (2011b), or Aradillas-López (2020).
The article proceeds as follows. Section 2 provides relevant background on local television news markets. Section 3 describes
programming, viewing and advertising data during the evening news hours. Section 4 provides a preliminary outline of potential
welfare losses in program choice. Section 5 outlines our econometric model and discusses identification. Section 6 describes our
estimation approach. Section 7 evaluates our simulation results and tackles the welfare analysis, and Section 8 concludes the paper.
We provide detail on our estimation methods in an appendix to the main text.
Since the start of the broadcast era, television stations have been licensed by the FCC to broadcast programming over specific
portions of the frequency spectrum. Hence, there has never been truly free entry in local television. Because broadcast signals can
interfere with each other, the number of stations in any particular region is limited by the technology available to utilize spectrum.
To accommodate these technological capacity constraints, the Federal Communications Commission in the 1950s allocated three
stations in the largest markets and fewer in smaller cities, setting the stage for the three-network regime that dominated television
through the 1980s. The limited number of stations in small cities gave rise early on to concerns about monopoly provision and
under-supply of local programming, especially local news. But at the same time, the number of broadcast stations licensed in very
large markets was not much larger than the number in small markets, leading to a wide disparity in the number of stations per capita
across the US. The potential for under-provision was thus a subject of concern even in large markets.
Entry barriers for local stations need not translate into restricted entry for local news, or even under-provision of local news,
as stations have many scheduling options to satisfy demand. In practice, the limited number of station licenses in each market
did likely limit local news programming. Local station license-holders negotiate contracts with national networks to carry network
programs. Station scarcity meant substantial rents had to be paid by networks to local license holders for airing national programs.
The opportunity cost of forgoing national entertainment programming in favor of additional local news broadcasts has thus always
been very high. These opportunity costs were highest in the largest, most constrained markets with the greatest number of viewers
per station. As a result, the amount of local news programming during the broadcast era did not vary substantially across markets.
The spread of cable television dramatically lowered entry barriers for national programming. By both offering an alternative
outlet for network entertainment and diverting viewers from local stations, the spread of cable reduced the networks’ willingness to
pay for placement on local stations. This effectively lowered the opportunity cost of airing local news, and is likely the reason that
more local news programming is broadcast today than in the broadcast and early cable eras. Cable expansion has led to entry of
some stations carrying local and regional news, and we will consider the effect of these stations in our analysis. But limits to “must
carry” rules combined with cable system maps that do not fully coincide with broadcast geography have limited these local stations
to only the very largest markets.
In sum, economic theories of differentiated product markets justify policy attention to potential inefficiencies in the supply of
local news, but offer little practical guidance on where to look for distortions with a reduced-form approach. Our structural model
can both measure the extent of under- or over-provision and pinpoint its causes. This approach can also uncover the demographic or
demand characteristics associated with inefficiencies. With this background, we turn to our data and model.
We estimate our model with viewing, advertising and station programming data in the 101 largest Designated Market Areas
(DMAs) in the US over four weeks in February 2010. We work with a single, averaged observation for each station in each of the six
half-hour timeslots in the 5:00–8:00 p.m. time period. We focus on this time period because it covers the traditional early evening
“news hours” when local stations have discretion over programming. Earlier in the day, a large fraction of the population does not
watch television, while during prime time hours national network affiliates dictate programming for affiliated local stations. We
work with a monthly average because we observe little schedule variation at the category level over days of the week, and we want
to remove the high-frequency variation in viewing associated with work or weather that does not factor into strategic scheduling.
Raw viewing data come from Nielsen. The data report the number and share of households viewing television during each fifteen-
minute timeslot each day on all local stations and the 100 largest cable stations in each DMA. We average viewing shares to the
half-hour level to match with program timeslots. We record viewing data for both broadcast and cable stations. Raw advertising
data come from Kantar Media and are at the advertisement level. Kantar surveys stations, station representatives and media agency
buyers in the largest 101 markets each quarter and algorithmically allocates expenditure to individual advertisements. We average
advertisement-level prices per second across advertisers in each half hour block to correspond to program timeslots, then average
again over weeks to match our viewing data. To convert advertising prices into advertising revenue we assume 10 minutes (600
seconds) of advertising time per half-hour timeslot, or 3,600 seconds over the six evening timeslots.6 We do not observe cable advertising prices.
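The price-to-revenue conversion is simple arithmetic and can be sketched as follows. Taking the 10-minute assumption at face value, each half-hour timeslot carries 600 advertising seconds, so the six evening timeslots carry 3,600. The function and constant names are illustrative, and the example price is the rounded $0.30 per-second figure quoted in the introduction.

```python
SECONDS_PER_SLOT = 10 * 60      # assumed 10 minutes of advertising per half-hour slot
SLOTS_PER_EVENING = 6           # half-hour timeslots between 5:00 and 8:00 p.m.

def evening_revenue(price_per_second, slots=SLOTS_PER_EVENING):
    """Advertising revenue implied by a per-second price over a number of slots."""
    return price_per_second * SECONDS_PER_SLOT * slots

# A $0.30 per-second price change over the full evening.
daily_change = evening_revenue(0.30)
```

Under these assumptions a $0.30 change in the per-second price corresponds to roughly $1,080 per station per day, close to the $1,071 reported in the introduction; the gap is presumably due to rounding of the per-second figure.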
Program schedules come from Nielsen and Kantar Media. We classify programming as local news, national news, general en-
tertainment or cable entertainment using Kantar categories supplemented by online search. This categorization allows us to focus
on the discrete choices faced by viewers, advertisers and local stations. Viewers in our framework choose one of the four program
categories available each timeslot or an outside option of not watching television. Local stations choose lineups of local news or
6 We observe little variation in the number of advertising seconds in our data, and much of the variation we do observe is spuriously related to missing
advertisements.
Table 1
Station and broadcast counts per market, by market category.
                                         N     Mean    SD      5th Pct.   95th Pct.
Small Markets (Under 0.5 Million HH)
  Local Stations                         40    7.4     1.4     5.5        10.5
  Natl. Cable Stations                   40    64.7    9.8     53.0       84.0
  Local News Broadcasts                  40    7.2     1.9     5.0        10.0
Medium Markets (0.5-1.5 Million HH)
  Local Stations                         43    9.9     2.1     7.0        14.0
  Natl. Cable Stations                   43    93.0    14.4    61.0       110.0
  Local News Broadcasts                  43    11.2    3.0     7.0        17.0
Large Markets (Over 1.5 Million HH)
  Local Stations                         18    15.6    3.7     10.0       22.0
  Natl. Cable Stations                   18    94.2    7.5     81.0       107.0
  Local News Broadcasts                  18    14.6    3.9     9.0        22.0
All Markets
  Local Stations                         101   9.9     3.6     6.0        18.0
  Natl. Cable Stations                   101   82.0    18.2    56.0       107.0
  Local News Broadcasts                  101   10.2    3.9     6.0        17.0
Table 2
Total news and television viewing per market, by market category.
                                         N     Mean    SD      5th Pct.   95th Pct.
Small Markets (Under 0.5 Million HH)
  Local News Broadcasts                  40    7.2     1.9     5.0        10.0
  All TV Viewing                         40    422.5   82.1    322.9      567.2
  Local News Viewing                     40    73.9    18.1    47.5       106.9
Medium Markets (0.5-1.5 Million HH)
  Local News Broadcasts                  43    11.2    3.0     7.0        17.0
  All TV Viewing                         43    269.9   84.6    145.7      419.6
  Local News Viewing                     43    41.3    14.9    20.1       66.0
Large Markets (Over 1.5 Million HH)
  Local News Broadcasts                  18    14.6    3.9     9.0        22.0
  All TV Viewing                         18    84.8    29.2    24.3       125.6
  Local News Viewing                     18    11.6    4.9     2.9        20.1
entertainment, taking national network news schedules as given. We treat entertainment programming on cable television as distinct
from entertainment on local stations, reflecting that the subscription business model for cable tends to produce programming for
niche audiences.7 For expositional efficiency, we write “cable stations” rather than “national cable stations”, but the small number
of local cable stations in our data make the same choices as local broadcast stations.
To construct our working data, we merge viewing shares, advertising prices per second and program categories by station,
market and timeslot. This produces a single sequence of programming for each station covering the six timeslots from 5:00–8:00
p.m. Tables 1–2 offer a general characterization of local news offerings and viewing in our data. Station and broadcast counts per
market for small, medium and large markets are summarized in Table 1. Small DMAs have an average of 7.4 local stations with 7.2
local news broadcasts per evening. The largest markets average 15.6 local stations with 14.6 local news broadcasts per evening, and
mid-sized markets lie in between. The number of major cable stations also increases with market size, ranging from an average of
64.7 in smaller markets to 94.2 in large markets. The average number of local stations across all markets is 9.9.
Table 2 shows an analogous breakdown with viewing data. The share of households viewing television (expressed as viewers
per thousand households) is first summed over timeslots and stations in each market, then averaged over each market category.
Both overall and local news viewing decline steeply with increasing market size. Local news viewing averages 7% (74 per thousand
households) in small markets but only 1% in the largest markets. Total viewing falls from 42% in small markets to 8% in the largest
markets. The totals in this table reflect not only the share of households watching television in any single timeslot, but also the
number of timeslots each viewing household watches through the evening. Along with information in Table 1, the data suggest that both total
television viewing and the share of viewing devoted to local news are lower in larger markets.
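The claim about the local news share of viewing can be checked directly from the Table 2 means. The short sketch below divides mean local news viewing by mean total viewing in each market category; note that this is a ratio of category means, which need not equal the average of market-level ratios.

```python
# Mean viewing per thousand households, copied from Table 2.
table2_means = {
    "small":  {"all_tv": 422.5, "local_news": 73.9},
    "medium": {"all_tv": 269.9, "local_news": 41.3},
    "large":  {"all_tv": 84.8,  "local_news": 11.6},
}

def local_news_share(row):
    """Local news viewing as a fraction of all television viewing."""
    return row["local_news"] / row["all_tv"]

shares = {size: local_news_share(row) for size, row in table2_means.items()}
for size, share in shares.items():
    print(f"{size}: {share:.1%}")
```

The ratios come out at roughly 17%, 15% and 14% for small, medium and large markets, so the local news share falls with market size even as total viewing falls.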
We characterize programming and viewing by timeslot in Table 3. The table reports the average number of broadcasts and
viewers per thousand households for each of the four program types: local news, national news, local station entertainment and
cable entertainment. Local news broadcasts are heavily clustered on the early timeslots, falling off considerably at 6:30 p.m. At the
same time, total viewing increases markedly over the evening. Local news viewing declines with the fall in local news broadcasts.
7 The subscription model for cable television centers on channels targeted at niche audiences that can be offered in bundles by cable operators. The literature
treats such content quite differently than advertiser-funded mass entertainment offered by broadcast stations (see, for example, Crawford and Cullen (2007)). We
thus designate a separate category for cable entertainment that viewers might watch but local broadcast stations do not offer. Our estimated parameters will,
however, measure the actual substitutability of cable programming and broadcast entertainment from the viewer perspective. More practically, cable entertainment
programming is determined nationally so must be taken as given by local stations.
Table 3
Average broadcast count and viewing by timeslot and program type.
              Local News   National News   Entertainment   Cable    Total
Broadcasts
  5:00        3.4          1.3             6.3             80.9     91.9
  5:30        1.9          3.2             5.9             80.9     91.9
  6:00        3.5          2.0             5.3             81.1     91.9
  6:30        0.8          3.8             6.3             81.1     91.9
  7:00        0.5          1.2             8.7             81.5     91.9
  7:30        0.2          1.0             9.3             81.5     91.9
  Total 1     10.2         12.5            41.8            487.0    551.5
Viewing
  5:00        16.8         0.4             4.8             20.1     42.0
  5:30        8.2          9.9             5.1             21.3     44.5
  6:00        21.1         0.8             5.2             22.2     49.2
  6:30        1.6          11.3            13.7            23.7     50.4
  7:00        1.2          0.7             25.9            27.2     55.0
  7:30        0.1          0.3             27.4            28.5     56.2
  Total 1     48.9         23.3            82.1            143.0    297.3
Broadcast and viewing information is summarized visually in Fig. 1 and Fig. 2. The shaded bars in Fig. 1 highlight that local news
is disproportionately broadcast in the 5:00 and 6:00 timeslots, and is often followed by national news. Viewing, shown in Fig. 2,
follows a similar pattern. As prime time approaches, broadcasts and viewing shift from news to entertainment.
Table 4 tabulates the most common programming sequences offered by local stations. The table shows that entertainment
programming (represented by “E”) in all six timeslots is the most common station lineup, accounting for 34% of the lineups tabulated. This strategy is typical
for small local stations. The next most common configurations are local news early in the evening followed by national news then
general entertainment. We represent the lineups visually in a sequence plot in Fig. 3. The plot groups the most common sequences
together in bars, showing the progression of programming over the evening news period. As sequence plots represent a convenient
way of visualizing lineups, we return to this approach when discussing optimal scheduling.
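The lineup tabulation in Table 4 can be reproduced from the counts alone. The sketch below treats each lineup as a six-character string (one letter per timeslot, with “L” for local news, “N” for national news and “E” for entertainment), computes column and cumulative percentages, and counts local news slots per lineup; the function name is illustrative.

```python
# Lineup counts copied from Table 4.
lineup_counts = {
    "EEEEEE": 213, "LNLEEE": 106, "LLLNEE": 96, "EELNEE": 35,
    "LNLLEE": 21, "LNEEEE": 21, "LLLNLE": 17, "LLEEEE": 16,
    "LLLEEE": 15, "EEELEE": 10, "Other": 73,
}

def tabulate(counts):
    """Return (lineup, count, column %, cumulative %, local news slots) rows."""
    total = sum(counts.values())
    rows, cumulative = [], 0
    for lineup, n in counts.items():
        cumulative += n
        # The residual "Other" row is not a lineup string, so skip the slot count.
        news_slots = lineup.count("L") if lineup != "Other" else None
        rows.append((lineup, n,
                     round(100 * n / total, 1),
                     round(100 * cumulative / total, 1),
                     news_slots))
    return rows

rows = tabulate(lineup_counts)
for lineup, n, pct, cum, news_slots in rows:
    print(f"{lineup:<8}{n:>5}{pct:>7.1f}{cum:>7.1f}")
```

Representing lineups as strings keeps the six-slot sequence intact, which matters because the analysis turns on the order of programs rather than only on how many local news broadcasts a station airs.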
We summarize advertising prices in Table 5. The table reports average prices per second for local news, national news and general
entertainment programming each hour. (We do not observe advertising prices for cable stations.) The last column shows average
prices for all program types. Prices in this table are averaged across all stations in a market, then averaged across markets. Overall,
prices per second rise through the evening for most categories, partially reflecting higher total viewership at later timeslots. Local
news prices are highest in the 6:00-7:00 timeslot. National news prices are about 25% higher than local news prices on average,
though because national networks dictate scheduling the price difference is not central to decision making. On average, local news
prices are about 10% higher than entertainment prices, $9.28 versus $8.48.
Before turning to the specifics of our model, we first develop two examples in a simplified setting that illustrate the welfare
tradeoffs we consider with our simulations. Table 6 presents two scenarios where two stations (𝑆1 , 𝑆2 ) must choose whether to
broadcast local news or entertainment programming in a single timeslot. The top portion of each panel shows the household viewing
Table 4
Common programming sequences.
Program lineup    No.    Col %    Cum %
EEEEEE            213     34.2     34.2
LNLEEE            106     17.0     51.2
LLLNEE             96     15.4     66.6
EELNEE             35      5.6     72.2
LNLLEE             21      3.4     75.6
LNEEEE             21      3.4     79.0
LLLNLE             17      2.7     81.7
LLEEEE             16      2.6     84.3
LLLEEE             15      2.4     86.7
EEELEE             10      1.6     88.3
Other              73     11.7    100.0
Total             623    100.0
Fig. 3. Sequence plot of observed broadcast decisions by program type and timeslot.
share associated with each combination of choices. The middle section shows advertiser surplus for each combination, where surplus
is measured in dollars per second. The bottom section shows station revenue for each combination in terms of advertising price per
second.
In the left-hand panel, the programming combination that maximizes total television viewing (our viewer optimum) is for one
station to broadcast local news and the other station to broadcast entertainment programming. Total viewing is 0.078, which is
Table 5
Average advertisement prices per second by timeslot and program type.
Local News National News Entertainment Average
Broadcasts
5:00 $8.73 $10.82 $3.57 $5.80
6:00 $11.05 $13.66 $6.19 $8.44
7:00 $9.35 $16.53 $15.32 $15.26
Total Average $9.28 $11.99 $8.48 $10.06
Table 6
Examples of welfare loss from station broadcast choice.
Household Viewing

Two-Sided Market Distortion
                      S2: Local News    S2: Entertainment
S1: Local News        (0.024, 0.024)    (0.033, 0.045)
S1: Entertainment     (0.045, 0.033)    (0.035, 0.035)

Business Stealing Distortion
                      S2: Local News    S2: Entertainment
S1: Local News        (0.019, 0.019)    (0.028, 0.028)
S1: Entertainment     (0.028, 0.028)    (0.024, 0.024)
greater than it would be if both stations broadcast local news (0.048) or if both stations broadcast entertainment (0.070). One
interpretation of this outcome is that viewers have a taste for variety: there is little substitution between programming types and
viewers watch their preferred program or nothing at all.
The center and lower panels show advertiser surplus and station revenue associated with each combination of programming
choices. Since advertising revenue governs station decisions, the pure-strategy Nash equilibrium occurs with both stations opting for
entertainment. Thus, in equilibrium the two stations provide less variety in programming than viewers would prefer. Advertisers
also prefer undifferentiated entertainment programming, as aggregate advertiser surplus is 46 when both stations broadcast enter-
tainment, as opposed to 44 when two different formats are offered. The outcome with both stations broadcasting entertainment also
maximizes joint station profits. Thus, the only dimension to welfare loss in this case is that viewers get less variety than they would
at their optimal lineup choice. Advertisers and stations cannot improve on their outcome unilaterally or through collusion. In our
welfare analysis, we characterize this as a two-sided market loss.
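The logic of the left-hand example can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in our simulations. The viewing shares are the Household Viewing entries of Table 6. The station revenue figures are hypothetical placeholders, chosen only so that entertainment is each station's best response and joint revenue is highest when both stations broadcast entertainment. The function names are likewise illustrative.

```python
from itertools import product

FORMATS = ("L", "E")  # L = local news, E = entertainment

# Household viewing shares (S1, S2), left-hand panel of Table 6.
viewing = {
    ("L", "L"): (0.024, 0.024),
    ("L", "E"): (0.033, 0.045),
    ("E", "L"): (0.045, 0.033),
    ("E", "E"): (0.035, 0.035),
}

# HYPOTHETICAL station revenues (S1, S2); these are not values from the paper.
revenue = {
    ("L", "L"): (10.0, 10.0),
    ("L", "E"): (12.0, 22.0),
    ("E", "L"): (22.0, 12.0),
    ("E", "E"): (18.0, 18.0),
}


def viewer_optimum(shares):
    """Return the programming profiles that maximize total viewing."""
    best = max(sum(v) for v in shares.values())
    return [p for p, v in shares.items() if abs(sum(v) - best) < 1e-12]


def pure_nash(payoff):
    """Return profiles from which neither station gains by deviating alone."""
    equilibria = []
    for b1, b2 in product(FORMATS, repeat=2):
        s1_ok = all(payoff[(b1, b2)][0] >= payoff[(d, b2)][0] for d in FORMATS)
        s2_ok = all(payoff[(b1, b2)][1] >= payoff[(b1, d)][1] for d in FORMATS)
        if s1_ok and s2_ok:
            equilibria.append((b1, b2))
    return equilibria


print("viewer optimum:", viewer_optimum(viewing))
print("pure-strategy Nash equilibria:", pure_nash(revenue))
```

With these placeholder revenues the viewer optimum is a differentiated profile while the only pure-strategy equilibrium has both stations broadcasting entertainment, which is the gap between viewer-preferred and equilibrium programming described in the text.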
The example on the right panel, also developed from our model, produces a different outcome. Viewers again have a preference
for variety, with total viewing highest (0.056) with differentiated broadcasts. However, in this case both advertisers and stations would
see higher total surplus or revenue with differentiation. Competition leads to a Nash equilibrium where both stations broadcast
entertainment programming. This outcome involves welfare losses to all market participants, which we characterize in our welfare
analysis as losses from business stealing.
In the first example, stations’ decisions are not the ones preferred by viewers, but they are optimal for advertisers and the stations
themselves. In the second case, stations would be jointly better off differentiating programming, but asymmetric payoffs mean that
no station individually has an incentive to provide local news. From a policy standpoint, subsidies or other supply-side interventions
might induce differentiation in the first case, but may not improve outcomes in the second.
Although in our full model stations choose lineups over the six evening timeslots, the simple examples highlight tradeoffs that
emerge from our simulations. Both examples were constructed using formulas that follow from our model. For the left panel, parameters
are chosen so that: 1) viewer utility from watching local news is lower than for watching entertainment; 2) local news broadcasts are
more substitutable for one another than entertainment broadcasts; and 3) advertisers have equal willingness to pay for local news
viewers and entertainment viewers. The parameter choices in the second example instead assume that viewers have equal utility for
watching local news and entertainment broadcasts, but that advertisers have lower willingness to pay for local news viewers. Our
results suggest that both of these configurations of viewer, advertiser, and station interests reflect actual outcomes.
5. Model
5.1. Overview
Our model includes three types of agents: viewers, advertisers, and local television stations.
We model viewing with a nested multinomial logit (NMNL), which allows for a simple, closed-form expression for viewer utility
and is suitable in environments, like television, where the number of competitors and choices varies across markets. Each timeslot,
a representative viewer first chooses among four program formats: local news, national news, entertainment, or cable programming,
plus an outside option of not watching television. After selecting a program format, the viewer selects a particular broadcast. We
specify a log-linear model of advertising revenue, extending Berry and Waldfogel (1999) to allow advertiser elasticity of demand
for viewers to differ across program formats. We view this contextual valuation as an important and realistic feature of television
advertising, but our approach is also consistent with a reduced-form approach to modeling demographic differences in the audience
for different program categories.8 Overall, our viewer and advertiser models are similar to Berry and Waldfogel (1999) and Berry et
al. (2015) in radio markets, though we focus on strategic interaction among established competitors rather than entry.
Local stations compete for advertising revenue, which depends on viewing. Stations choose program lineups across the evening,
which differs from optimizing in each half-hour timeslot. While it would be simpler to estimate a model in which station
programming decisions are independent across each half-hour block, our approach allows us to incorporate the complementarity
across program types and dynamic aspects of viewing known to be important in television.9 Of course, our lineup-choice setup also
allows for independent programming decisions.
We require that station lineup choices: (1) be consistent with revenue maximization; and (2) be consistent with a pure-strategy
Nash equilibrium. In other words, observed lineup choices are revenue-maximizing best responses to lineup choices of competing
stations. We note that our set-up rules out mixed strategies, which we discuss further in estimation.10
We focus on revenue maximization because we have high-quality data on revenue and no information on costs. Programming
costs are fixed costs in the sense that they do not depend on viewing, and heterogeneity across stations is subsumed into station
effects. Our framework effectively assumes that programming is horizontally rather than vertically differentiated within stations,
with competing stations facing common costs through a shared local talent pool. We consider the implications of our assumption in
section 7 after discussing results.11
In the subsections below, we describe the viewer model, advertiser model and station game, then discuss identification of param-
eters with our data.
5.2. Viewers
A representative viewer chooses among four program categories: local news (𝑙 ), national news (𝑛), general entertainment (𝑒),
cable entertainment (𝑐 ); and an outside option of not watching television. The nesting structure dictates that viewers first choose a
program type and then select a particular broadcast of that type. We denote the viewer choice set as 𝐵 = {𝑙, 𝑛, 𝑒, 𝑐}. We let 𝑆𝑏 , 𝑏 ∈ 𝐵
denote the set of stations broadcasting a program of type 𝑏 ∈ 𝐵 , and let 𝑏𝑖 ∈ 𝐵 represent the type of broadcast of station 𝑖. Following
the utility maximization framework in Berry and Waldfogel (1999) and Berry et al. (2015), a viewer 𝑝 of a program of type 𝑏 offered
by station 𝑗 in market 𝑚 at time 𝑡 earns utility:
8
See George and Hogendorn (2012) for a theoretical treatment of advertising context and Furnham et al. (2002) and Goldfarb and Tucker (2011) for evidence
from experiments.
9
The importance of lags and lead-in effects in television viewing is well documented. See, for example, Esteves-Sorenson and Perretti (2012) and Wilbur (2008).
Textbook treatments of setting broadcast lineups also explain in detail how to factor in various features of programming dynamics, including strategies
such as ‘tentpoling’ and ‘hammocking’ (Gross et al., 2005, Chapter 9). Moreover, these treatments cover dynamic aspects of lineup choice with respect to
the lineups offered by other stations. From a research perspective, lead-in effects and other complementarities in viewing also figure into more complex models of
viewership such as in Crawford et al. (2018).
10
One way to view multiplicity in our set-up is that instead of mixing over strategies, in the event of multiple equilibria players move sequentially with equal
probability. We find this simplification reasonable in our setting because we observe very stable schedules over long time horizons.
11
There is empirical support for this approach: a study of differentiation in local television for the FCC found minimal evidence of either horizontal or vertical
differentiation in local news (George and Oberholzer-Gee, 2010).
12
The superscript 𝑣 is used to distinguish terms in the viewership equation from similar terms in the advertising revenue equation.
where mean utility depends on a broadcast-type dummy, $\alpha_b^v$, and explanatory variables $X_{jmt}^v$ through coefficients $\beta^v$. The term $\omega_j^v$
is a station-level error common to all broadcasts on the station, $\omega_{mt}^v$ is a market-time effect common to all stations in a market at a
given timeslot, and $\varepsilon_{jmt}^v$ is a station-market-time error.
We include station-level effects to account for unobserved popularity of individual stations. We include market-timeslot effects to
account for unobserved preferences for overall television viewing in a market through the evening. We model station and market-
time effects as random effects to reduce the complexity of estimation, but after estimation is complete we recover a predicted value
for these effects.
Our data is at the broadcast level, so we cannot directly control for time-varying preferences for particular program formats
with explanatory variables. We do, however, control for program sequencing with a dummy variable indicating whether or not the
broadcast in the current timeslot is national news and in the preceding timeslot was local news. We also include lagged viewing,
and lagged viewing interacted with the sequencing indicator. These terms can capture diminishing marginal product of viewing time
through the evening as well as spillover effects known to be important in television (Esteves-Sorenson and Perretti, 2012).13 We also
allow the contribution of local news broadcasts to mean utility, $\alpha_l^v$, to depend on the number of viewing households in each market.
Specifically, we specify $\alpha_l^v = \alpha_{l0}^v + \alpha_{l1}^v(H)$, where $H$ is the number of viewing households. Added flexibility in the viewing parameter
can capture heterogeneity in demand that may arise in larger, more diverse markets.
The nested form (2) implies that shares can be written as a product of within-group and group shares; 𝑠𝑖 = 𝑠𝑖|𝑏 𝑠𝑏 , where 𝑠𝑖|𝑏 is
station 𝑖’s share of stations with broadcast type 𝑏, and 𝑠𝑏 is the broadcast group share. Berry (1994) shows how this results in a linear
expression for the relative share (omitting subscripts for ease of exposition):
$$ y_i = \ln s_i - \ln s_0 = \delta_{b_i} + \mu_b \ln s_{i|b} $$
which in our case can be written:
$$ \ln s_{jmt} - \ln s_{0mt} = \alpha_b^v + X_{jmt}^v \beta^v + \mu_b \ln s_{j|b} + \omega_j^v + \omega_{mt}^v + \varepsilon_{jmt}^v $$
In addition to allowing us to simulate counterfactual viewing, the share form NMNL allows for straightforward calculation of the
total expected utility of the representative consumer from viewing. McFadden (1980) showed that the NMNL derives from a utility
function with the form (omitting the 𝑚, 𝑡 subscripts for simplicity)14 :
$$ U = \ln\left[ \sum_{k \in B} D_k^{1-\mu_k} \right], \qquad D_k = \sum_{n \in S_k} e^{\frac{\delta_{nk}}{1-\mu_k}} \tag{5} $$
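For our purposes, the important thing about expression (5) is that it links observed or counterfactual viewing shares to utility. As
$s_{i|b} = s_i / s_b$, mean utilities can be expressed as:
$$ \delta_{b_i} = \ln\left[ \left( \frac{s_i}{s_0} \right)^{1-\mu_b} \left( \frac{s_b}{s_0} \right)^{\mu_b} \right] \tag{6} $$
Inserting (6) into the term inside the log operator in (5) results in:
$$ D_b = \sum_{k \in S_b} \frac{s_k}{s_0} \left( \frac{s_b}{s_0} \right)^{\frac{\mu_b}{1-\mu_b}} \tag{7} $$
As $s_b = \sum_{k \in S_b} s_k$, $D_b$ in (7) simplifies to:
$$ D_b = \left( \frac{s_b}{s_0} \right)^{\frac{1}{1-\mu_b}} \tag{8} $$
Substituting this last result into the utility function (5), we have:
$$ U = \ln\left[ \sum_{k \in B} \frac{s_k}{s_0} \right] = -\ln s_0 \tag{9} $$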
13
Our approach can also be seen as a way of characterizing some of the features of the detailed viewership model of Crawford et al. (2018), who have access to
individual-level viewing data and are therefore able to build cumulative viewing time into the viewership model more directly.
14
Equation (5) provides an alternative derivation of shares based on application of Roy’s identity - differentiation of (5) with respect to 𝑢𝑖 (𝑏𝑖 ) yields 𝑠𝑖 .
The last part of (9) follows from the fact that the group shares, together with the share $s_0$ of the outside option, must sum to one.
Hence, viewer utility rises as the fraction of non-viewers falls. We use this result in our welfare analysis to link changes in viewing under
counterfactual programming to viewer welfare.
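The chain from (5) to (9) can be checked numerically. The sketch below is illustrative only: the mean utilities and nesting parameters are hypothetical values, and it assumes that the outside option enters the sum in (5) with mean utility normalized to zero, which is the normalization under which the group shares and the outside share sum to one.

```python
import math

# HYPOTHETICAL mean utilities by format and nesting parameters (not estimates).
delta = {"l": [-3.0, -3.2], "n": [-2.8], "e": [-2.5, -2.9, -3.1], "c": [-2.0]}
mu = {"l": 0.3, "n": 0.2, "e": 0.1, "c": 0.0}


def nested_logit(delta, mu):
    """Nested logit shares; the outside option has utility zero (assumption)."""
    D = {k: sum(math.exp(d / (1.0 - mu[k])) for d in ds)
         for k, ds in delta.items()}
    denom = 1.0 + sum(D[k] ** (1.0 - mu[k]) for k in D)  # 1.0 = outside option
    s0 = 1.0 / denom
    group = {k: D[k] ** (1.0 - mu[k]) / denom for k in D}
    shares = {k: [group[k] * math.exp(d / (1.0 - mu[k])) / D[k]
                  for d in delta[k]] for k in delta}
    utility = math.log(denom)  # expression (5) with the outside option included
    return shares, group, s0, utility


def delta_from_shares(s_i, s_b, s0, mu_b):
    """Mean utility recovered from shares, as in expression (6)."""
    return math.log((s_i / s0) ** (1.0 - mu_b) * (s_b / s0) ** mu_b)


shares, group, s0, utility = nested_logit(delta, mu)
print("outside share:", round(s0, 4))
print("utility:", round(utility, 4), " -ln(s0):", round(-math.log(s0), 4))
```

Because utility depends on shares only through the outside share, any counterfactual lineup can be ranked for viewers by the total viewing it generates.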
Equation (4) gives an expression for the mean utility a viewer derives from watching a particular local station at a given time:
𝛿𝑖 = 𝛿(𝑏𝑖 , 𝑧𝑖 , 𝜖𝑖 , 𝜃𝑣 ) (10)
In equation (10), 𝑧𝑖 collects all additional control variables, including lags of viewership, to make notation as compact as possible.
The audience share obtained by station 𝑖 then depends on viewership characteristics 𝜃𝑣 , the 𝛿 value for a given station 𝑖, and the
value of 𝛿 for its competitors:
5.3. Advertisers
We generally follow Berry and Waldfogel (1999) in modeling advertising. Our goal is to model advertising in a way that allows
calculation of consumer surplus to advertisers where valuations depend on both viewing and program type. As with our viewer
model, our approach seeks to balance a realistic framework with analytic tractability to produce sensible counterfactual results.
Suppose advertisers have diminishing willingness to pay for additional viewers, and that price-per-viewer takes a power
(constant-elasticity) form:
$$ ppv_j = K_j v_j^{\eta - 1} \tag{12} $$
where $\eta \in (0, 1)$. Multiplying $ppv_j$ in (12) by the number of viewers $v_j$ gives an expression for how advertising revenues depend on
total viewership:
$$ r_j = ppv_j \cdot v_j = K_j v_j^{\eta} $$
That is, price-per-viewer multiplied by the number of viewers gives total advertising revenues per unit time, and $\eta$ can be interpreted
as the elasticity of revenue with respect to viewership. Taking logs gives the linear relationship:
$$ \ln r_j = \ln K_j + \eta \ln v_j \tag{13} $$
Equation (13) is a log-linear equation describing how total advertising revenues depend upon viewing. As was the case with the
viewing model, the linear estimating equation can be readily matched to data through inclusion of fixed or random effects.
The willingness to pay function (12) describes advertiser surplus in relation to viewing. Integrating (12) with respect to total
viewers gives $\frac{K}{\eta} v^{\eta}$, which is the gross utility advertisers derive from contacting $v$ viewers. Subtracting advertiser payments to
stations $r$ gives advertiser surplus as:
$$ AS_j = \frac{K_j}{\eta} v_j^{\eta} - K_j v_j^{\eta} = \frac{1-\eta}{\eta} K_j v_j^{\eta} \tag{14} $$
Thus, advertiser surplus amounts to weighting observed revenues in a way that depends upon the elasticity of willingness to pay with
respect to viewers. By allowing the parameters of (13) to vary with broadcast type, we allow advertisers to value the same viewer
watching different programming differently.
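Expressions (12) through (14) translate directly into code. The following is a minimal sketch under assumed inputs: the scale `K`, the audience size `v`, and the two elasticities are hypothetical numbers chosen for illustration, not estimates from our data.

```python
import math


def price_per_viewer(K, v, eta):
    """Willingness to pay per viewer, expression (12)."""
    return K * v ** (eta - 1.0)


def revenue(K, v, eta):
    """Advertising revenue per unit time: price per viewer times viewers."""
    return price_per_viewer(K, v, eta) * v


def advertiser_surplus(K, v, eta):
    """Gross advertiser utility (K / eta) * v**eta minus payments, as in (14)."""
    gross = K / eta * v ** eta
    return gross - revenue(K, v, eta)


# HYPOTHETICAL inputs: same audience, different elasticity by program format.
K, v = 0.002, 50_000.0
for fmt, eta in {"local news": 0.9, "entertainment": 0.8}.items():
    r = revenue(K, v, eta)
    print(fmt, "revenue:", round(r, 4),
          "surplus:", round(advertiser_surplus(K, v, eta), 4),
          "surplus/revenue:", round((1.0 - eta) / eta, 4))
```

The ratio of surplus to revenue is $(1-\eta)/\eta$, so a format with a lower elasticity generates more advertiser surplus per dollar of station revenue.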
In estimation, we specify the constant ln 𝐾𝑗 in (13) as a function of observable covariates, market-time random effects, and
station random effects. We also allow the elasticity of revenue with respect to total viewership to depend upon programming type.
The resulting revenue equation is:
where $\omega_j^p$ is a station-specific effect, $\omega_{mt}^p$ is a market-timeslot specific effect, and $\varepsilon_{jmt}^p$ is an idiosyncratic error term. As with the
viewing model, our ultimate aim is to recover station and market-time level effects, which we compute after estimating the model.
To simplify exposition, we also use the following shorthand notation for (15):
5.4. Stations
Local television stations choose broadcast lineups to maximize advertising revenue, which depends on viewing.
Of the four programming categories (𝐵 = {𝑙, 𝑒, 𝑛, 𝑐}), local stations can only choose between local news (𝑙 ) and general enter-
tainment (𝑒). Local stations treat national news as a fixed component of the lineup and treat cable schedules as exogenous. Hence,
when a station is free to choose programming, in each time slot it selects from the binary choice set 𝐵 ′ = {𝑙, 𝑒}. Although the binary
nature of programming choice is a simplification, it admits a large number of potential broadcast menus over the six early-evening
timeslots. A station that does not broadcast national news makes a binary 𝑙 or 𝑒 programming decision in six timeslots, which creates
$2^6 = 64$ possible programming sequences over the course of the evening.
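The size of a station's choice set can be verified by enumeration. The sketch below is illustrative: the function name and the `national_slots` argument are our own conventions, with timeslots indexed 0 to 5 and any slot carrying national news held fixed at "N".

```python
from itertools import product

SLOTS = 6  # half-hour timeslots in the evening news period


def feasible_lineups(national_slots=()):
    """Enumerate lineups; slots listed in national_slots are fixed to 'N'."""
    free = [t for t in range(SLOTS) if t not in national_slots]
    lineups = []
    for choice in product("LE", repeat=len(free)):
        lineup = ["N" if t in national_slots else None for t in range(SLOTS)]
        for t, fmt in zip(free, choice):
            lineup[t] = fmt
        lineups.append("".join(lineup))
    return lineups


print(len(feasible_lineups()))       # station with no national news
print(len(feasible_lineups((3,))))   # national news fixed in the fourth slot
```

A station with national news fixed in one timeslot chooses among $2^5 = 32$ lineups, which include observed sequences such as LLLNEE from Table 4.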
Viewership 𝑣𝑖 is simply the viewing population 𝑀 multiplied by the viewing share 𝑠. Thus, using equations (11) and (16), we
can write revenue as:
Our maintained assumption is that the observed broadcasting lineup in any market is a pure-strategy Nash equilibrium. This
requires that the broadcast menu chosen by each station is revenue-maximizing given the broadcast lineups chosen by other stations.
Using (18), this means that:
∀𝑏′ ∈ 𝑖 , ∀𝑖 ∈ 𝐿 (19)
It bears emphasizing that the error terms used for the likelihood describing stations’ broadcasting choices rely on the vectors
𝜖𝑣′ , 𝜖𝑝′ , which are the unobserved components of price and viewership from the broadcasting options that stations did not choose.
5.5. Identification
We estimate our model with viewing, advertising and station programming data in the 101 largest DMAs in February 2010. We
work with a single, averaged observation for each station in each of the six half-hour timeslots in the 5:00–8:00 p.m. time period.
We work with an averaged observation since we observe little schedule variation at the format level and want to remove viewing
variation associated with idiosyncratic events realized after scheduling and advertising decisions are made.
Our viewer model includes station and market-time random effects. Station effects capture unobserved preferences for particular
stations correlated with the error term 𝜔𝜈𝑗 that do not vary by format. An example might be station affiliation with a particular
national network. Market-timeslot effects capture unobserved, market-specific preferences for the timing of overall television viewing
in the error term 𝜔𝜈𝑚𝑡 . An example here might be commute times that affect viewing over the evening. With station and market-time
effects, viewing parameters are identified from variation within stations across timeslots, and across stations in each market within
timeslots.
Because our underlying data are at the broadcast level, we cannot fully control for format-specific preferences that might vary
over time within stations. Our viewing parameters are thus subject to bias from some unobserved preference patterns within markets
over the evening. An example might be a preference for (or against) later local news on Spanish-language stations due to longer
commute times by Hispanic viewers.
Within our model, we address the potential for time-varying preferences for local news in several ways. We include a sequencing
term for local news followed by national news to control for one form of timing preferences within stations. We include lagged
viewing, alone and interacted with sequencing, to control for diminishing or increasing marginal returns to viewing within stations.
To address the potential for unobserved heterogeneity in demand for local news versus entertainment over time, we allow the
contribution of local news broadcasts to mean utility to depend on market size. Intuitively, viewers in larger, more diverse markets
might consider local news broadcasts less substitutable for other formats than viewers in smaller markets. This might lead, for
example, to higher station loyalty in larger markets. Overall station loyalty would be captured in our station effect. However, format-
specific loyalty might alter the benefit of later local news broadcasts, or additional local news broadcasts, vis-à-vis entertainment
options. Since viewers can watch only one program at a time, format-specific station preferences can thus translate into format-
specific time preferences. Allowing the substitutability of local news for other programming to vary with market size helps account
for unobserved demand for variety.15
As a final point on unobserved timing preferences, neither market size nor any other market-level observable in our
data can explain variation in the timing of local news broadcasts across markets. For example, a regression of the share of “late” local
news broadcasts in the latter three timeslots (13% of broadcasts) on viewing households, Black household share, Hispanic household
share, median income, median age and the number of local stations explains only 9% of the variation (unadjusted 𝑅2 ), with none of
the observables statistically significant even at the 10% level.
On the advertiser side, our advertiser model specifies a diminishing willingness to pay for additional viewers that varies by
program format. As with the viewer model, we include station and market-time random effects. Station effects control for advertiser-
specific preferences for viewers of particular stations. Market-time effects capture unobserved differences in average willingness to
pay for audiences in different timeslots. Because we allow advertisers to value additional viewers differently in different programming
contexts, the potential for bias from unobservables at the format-market-timeslot level is less of a concern than with the viewer model.
Institutional features of the market also reduce the potential for bias from time-varying unobservables. In particular, local television
advertising on local stations is typically sold by “daypart”, one of which is the evening news period that is the subject of our analysis.
Although advertisers can in some instances select particular broadcasts, media buying is heavily focused on stations and dayparts,
with limited advertiser choice of narrower time blocks.
To summarize, while none of our modeling approaches can fully eliminate the potential for bias from unobserved viewer or
advertiser preferences, taken together with elements of the raw data we are confident that our approach is robust to the most likely
and most important preference patterns.
6. Estimation
We first describe the likelihood function built from the viewing, advertising and station components of the model. We then
describe our approach to estimating the model and report estimated parameters.
The contribution of each market 𝑚 to the likelihood function has four components: the viewership model (𝑣𝑚), the advertising
model (𝑝𝑚), the station revenue maximization condition (𝑟𝑚), and a correction term arising from the assumption that station choices
reflect the outcome of a game (𝑐𝑚). The last term is needed because condition (19) does not fully determine a unique market outcome
(see Ciliberto and Tamer (2009), Bajari et al. (2010b), or Ellickson and Misra (2011a) for discussion of the details of this “coherence”
problem).
We write the likelihood contribution of a market as a function of its four components,
$$ \epsilon^v = y - \mu_b \ln s_{\cdot|b} - \alpha_b - X^v \beta^v \tag{21} $$
which expresses the joint likelihood of $y$ in terms of $\epsilon^v$, where $\epsilon^v$ collects error terms, with components
$\epsilon_{jmt}^v = \omega_j^v + \omega_{mt}^v + \varepsilon_{jmt}^v$.
Accordingly, we have:
$$ L_{vm} = f_v(\epsilon^v, X_v, \theta_v) \left| \frac{d\epsilon^v}{dy} \right| \tag{22} $$
where $\left| \frac{d\epsilon^v}{dy} \right|$ is the Jacobian determinant of $\epsilon^v$ with respect to $y$. In the Appendix, it is shown that this determinant is:
$$ \left| \frac{d\epsilon^v}{dy} \right| = \prod_{b \in B} (1 - \mu_b)^{N_b - 1} $$
That is, to compute the Jacobian determinant, it is necessary to count the number of broadcasts of each type, $N_b$, subtract one from
this count, and raise the term $(1 - \mu_b)$ to the power $N_b - 1$. We assume that the terms are jointly normally distributed according to
15
Our welfare results are robust to an alternative specification that does not allow the viewing parameter to depend on market size. Allowing the viewing parameter
to depend on the number of stations in the market instrumented with market size and other market demographics also produces qualitatively similar results. In
general, allowing flexibility in the substitutability parameter leads to a closer fit of our model to observed outcomes, which reduces differences across counterfactual
scenarios.
the covariance matrix Ω𝑣 , where Ω𝑣 is built to take into account common market-time and station-level random effects. We employ
random effects because in subsequent components of the likelihood, we will have terms for which differencing out fixed effects
would be impossible. Random effects therefore allow us to estimate the model with a (much) smaller number of parameters while
also introducing market-time effects.
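The Jacobian term entering (22) is simple to compute and to check numerically. In the sketch below, the within-nest matrix of derivatives, $(1-\mu_b)\,1[i=j] + \mu_b\, s_{j|b}$, is derived here from the definition $\ln s_{i|b} = y_i - \ln \sum_{k \in S_b} e^{y_k}$ rather than taken from the Appendix, and the shares and nesting parameters are hypothetical.

```python
import math


def log_jacobian(counts, mu):
    """Log of prod_b (1 - mu_b)^(N_b - 1), given broadcast counts N_b by type."""
    return sum((n - 1) * math.log(1.0 - mu[b]) for b, n in counts.items() if n > 0)


def nest_jacobian(within_shares, mu_b):
    """Matrix d eps_i / d y_j for one nest (assumed form, see lead-in)."""
    n = len(within_shares)
    return [[(1.0 - mu_b) * (i == j) + mu_b * within_shares[j] for j in range(n)]
            for i in range(n)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-15:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


# HYPOTHETICAL nest: three local news broadcasts with within-nest shares.
shares, mu_l = [0.5, 0.3, 0.2], 0.4
print(determinant(nest_jacobian(shares, mu_l)), (1.0 - mu_l) ** (len(shares) - 1))
```

Because the full Jacobian is block diagonal across program types, its determinant is the product of the nest-level determinants, which is what `log_jacobian` accumulates in logs.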
$$ \epsilon^p = \ln r - \alpha_b^p - \eta_b \ln v \tag{23} $$
where in similar fashion to the viewership likelihood, $\epsilon^p$ collects error terms: $\epsilon_{jmt}^p = \omega_j^p + \omega_{mt}^p + \varepsilon_{jmt}^p$.
As the transformation from dependent variable to error term for (23) is unitary, we have simply
$$ L_{pm} = f_p(\epsilon^p, X_p, \theta_p) \tag{24} $$
We again assume that $\epsilon^p$ is normally distributed with covariance matrix $\Omega_p$, where $\Omega_p$ considers market-time and station-level
random effects for the same reasons we rely on random effects in the viewership likelihood.
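One way to build a covariance matrix with this structure for a single market is sketched below. It assumes, for illustration, that the station effect, the market-time effect and the idiosyncratic error are mutually independent with constant variances; the variance values and the function name are hypothetical.

```python
def error_components_cov(obs, var_station, var_slot, var_idio):
    """Covariance of omega_j + omega_mt + eps_jmt across observations.

    obs is a list of (station, timeslot) pairs for one market. Two
    observations covary through a shared station, a shared timeslot, or both.
    Independence of the three components is an assumption of this sketch.
    """
    n = len(obs)
    omega = [[0.0] * n for _ in range(n)]
    for a, (station_a, slot_a) in enumerate(obs):
        for b, (station_b, slot_b) in enumerate(obs):
            omega[a][b] = (var_station * (station_a == station_b)
                           + var_slot * (slot_a == slot_b)
                           + var_idio * (a == b))
    return omega


# HYPOTHETICAL market: two stations observed in two timeslots.
obs = [("A", 1), ("A", 2), ("B", 1), ("B", 2)]
omega = error_components_cov(obs, var_station=0.5, var_slot=0.3, var_idio=0.2)
for row in omega:
    print(row)
```

The off-diagonal entries show why random effects are convenient here: a single variance per component governs all cross-observation dependence, so the number of parameters does not grow with the number of stations or timeslots.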
where $\Gamma_i$ is the integration region coincident with (25). But in practice this integral is difficult to calculate, so we instead simulate it
by drawing $\epsilon_i^{\prime v}$ and $\epsilon_i^{\prime p}$. We detail the simulation procedure in Appendix B; essentially, the key aspect of generating counterfactual
draws is that they occur subject to the restriction that the resulting advertising revenues be less than observed revenues, holding
constant the behavior of other stations.
Even when well-approximated, equation (26) may overstate the likelihood because more than one equilibrium may be consistent
with a parameter set and a given draw of error terms. This is an example of the well-known “coherence” problem that arises in
estimating discrete games. We address this by adjusting the likelihood based on the number of pure-strategy equilibria.
Let 𝐁 = [𝐵1 × 𝐵2 × … 𝐵𝑛 ] denote the strategy space, and let 𝚪 = [Γ1 × Γ2 … Γ𝑛 ] denote the overall regions of integration described
by (25). The equilibrium set is:
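$$ \mathcal{E}_b(X, \theta, \epsilon, \epsilon') = \left[\, b \in \mathbf{B},\ \epsilon' \in \boldsymbol{\Gamma} \text{ and } b \text{ is a Nash Eq.} \mid X, \theta, \epsilon, \epsilon' \,\right] \tag{27} $$
If $n(\mathcal{E}_b)$ denotes the number of elements in a set, we add to the likelihood a term:
$$ L_{cm} = P(b \mid b \in \mathcal{E}_b) = \frac{1}{n(\mathcal{E}_b)} \tag{28} $$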
Our approach effectively assumes that if there are multiple pure-strategy equilibria for a given draw of the error terms, each is
equally likely to be played. While less nuanced than recent approaches to multiplicity that incorporate imperfect information, our
weighting technique is transparent and allows for straightforward counterfactual simulation. Simple weighting also has a tradition
in the literature, for example Bjorn and Vuong (1984) and Kooreman (1994).16
By construction, the equilibrium set always has at least one member (the observed market outcome) and therefore, for any set of simulations, there
is always at least one pure-strategy equilibrium. Checking for the existence of additional equilibrium broadcast profiles is not trivial.
We use a simulation-based approach to finding additional equilibria. Some exploration of model results revealed that multiplicity
was not extensive, in that most of the time the simulated error terms led to the observed market outcome as the unique outcome.
When multiplicity did occur, the alternative equilibria were typically “close by,” involving broadcast changes among a small number
of stations.
Based on this observation, we chose an approach that starts with perturbations of the observed outcomes of the model. For a
random set of stations we introduce a random number of changes to their strategies. We then iterate best responses until reaching
an equilibrium, noting if the outcome was different from the observed outcome. We count the number of distinct equilibria obtained
in this way and compute (28) accordingly. See Appendix C for additional details.
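The perturb-and-iterate search can be sketched as follows. This is a simplified illustration rather than the procedure in Appendix C: stations choose a single format instead of a six-slot lineup, the payoff function is hypothetical, and the cap on best-response rounds is our own safeguard against cycling. The count it returns is a lower bound, since only equilibria reached from the sampled starting points are found.

```python
import random

FORMATS = ("L", "E")


def revenue(i, profile, taste):
    """HYPOTHETICAL payoff: a format's audience is split equally among the
    stations airing it, plus a station-format shock taste[i][format]."""
    fmt = profile[i]
    sharing = sum(1 for x in profile if x == fmt)
    return {"L": 1.0, "E": 1.5}[fmt] / sharing + taste[i][fmt]


def iterate_best_responses(profile, taste, max_rounds=50):
    """Let stations switch to strictly better formats until no one moves."""
    profile = tuple(profile)
    for _ in range(max_rounds):
        changed = False
        for i in range(len(profile)):
            def trial(fmt):
                return profile[:i] + (fmt,) + profile[i + 1:]
            best = max(FORMATS, key=lambda fmt: revenue(i, trial(fmt), taste))
            if revenue(i, trial(best), taste) > revenue(i, profile, taste) + 1e-12:
                profile, changed = trial(best), True
        if not changed:
            return profile
    return None  # no convergence within max_rounds


def count_equilibria(observed, taste, n_starts=200, seed=0):
    """Perturb the observed outcome, iterate best responses, count end points."""
    rng = random.Random(seed)
    n = len(observed)
    found = {tuple(observed)}  # the observed outcome is an equilibrium by construction
    for _ in range(n_starts):
        flip = set(rng.sample(range(n), rng.randint(1, n)))
        start = tuple(("E" if b == "L" else "L") if i in flip else b
                      for i, b in enumerate(observed))
        end = iterate_best_responses(start, taste)
        if end is not None:
            found.add(end)
    return len(found), found


no_shocks = [{"L": 0.0, "E": 0.0}, {"L": 0.0, "E": 0.0}]
n_eq, equilibria = count_equilibria(("L", "E"), no_shocks)
print(n_eq, sorted(equilibria), "weight:", 1.0 / n_eq)
```

In this two-station example either station may be the one broadcasting local news, so the search finds two equilibria and the likelihood weight in (28) is one half.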
16
Bajari et al. (2010a) also suggest weighting equilibria by their characteristics (i.e., whether they are Pareto-optimal), or by their relative payoffs. In fact, de Paula
(2013) points out that weighting equilibria in inverse proportion to the equilibrium count is a special case of the Bajari et al. (2010a) approach.
With the four expressions (22), (24), (26), and (28), we completely specify $L_m$. We write the full model likelihood as $L = \prod_{m=1}^{M} L_m$
and turn to discussion of estimation. Additional details are provided in the appendix.
In the previous section, we described a method for simulating the likelihood function. To use this likelihood in parameter
estimation, we follow Bajari et al. (2010b) and Chernozhukov and Hong (2003) and use a simulation-based estimation method
that does not rely on maximization of the likelihood function but instead relies on Markov Chain Monte Carlo sampling from the
parameter distribution implied by the likelihood.17
We follow a two-step estimation procedure. In the first step, we use the naïve model to get starting values and importance
sampling weights. To obtain starting values, we fit the viewing and advertising models in simple linear form, that is, without assuming
that observed outcomes are profit-maximizing and without correcting for the possibility of multiple equilibria. Based on the initial
parameter estimates, we simulate 10 counterfactual errors and associated viewing and revenue outcomes, and compute importance
weights for each draw. In the second step, we apply MCMC-based estimation, restricting all of the multinomial substitutability
parameters to the (0, 1) range.18
We do not rely on a classical estimation method for practical reasons. As our model has several layers and components that
are simulation-based, not to mention large amounts of data, maximization of a complicated objective function is difficult and time-
prohibitive using standard methods. On the other hand, as we have relied on likelihood functions for all components of the model,
we have a well-defined probability distribution for all parameters from which to draw. Further, in contrast to maximization-based
methods of estimation, the difficulty of estimation poses no problems for an MCMC-based approach. Similarly, it might be the case
that our likelihood, a complex object with some simulated parts, has flat portions and is otherwise ill-behaved. The MCMC estimation
method sidesteps these problems as well, and carries the additional benefit of generating standard errors for parameters as part of the
estimation process, rather than requiring computation of a matrix of second derivatives. An overview of the estimation procedure,
method of drawing error terms, theory of MCMC-based estimation, and development of importance sampling weights is provided in
the appendix.
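The sampling idea can be illustrated with a minimal random-walk Metropolis sketch. This is not the estimation code used for the paper: the toy Gaussian log-likelihood, the function names, the step size, and the data are all assumptions made for illustration. The only features carried over from the text are that draws are taken from the distribution implied by the likelihood rather than found by maximization, and that a parameter can be restricted to the (0, 1) range.

```python
import math
import random

def toy_log_likelihood(theta, data):
    """Hypothetical stand-in for log L: Gaussian with unknown mean, unit variance."""
    return -0.5 * sum((x - theta) ** 2 for x in data)

def metropolis(log_like, data, start, n_draws=5000, step=0.1,
               lower=0.0, upper=1.0, seed=0):
    """Random-walk Metropolis with a flat prior on (lower, upper)."""
    rng = random.Random(seed)
    theta, current = start, log_like(start, data)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)
        # Proposals outside the admissible range are rejected outright.
        if lower < proposal < upper:
            candidate = log_like(proposal, data)
            # Accept with probability min(1, L(proposal) / L(theta)).
            if rng.random() < math.exp(min(0.0, candidate - current)):
                theta, current = proposal, candidate
        draws.append(theta)
    return draws

data = [0.1, 0.5, 0.3, 0.2, 0.4] * 40   # toy data with sample mean 0.3
draws = metropolis(toy_log_likelihood, data, start=0.5)
kept = draws[1000:]                      # discard burn-in draws
post_mean = sum(kept) / len(kept)
post_sd = (sum((d - post_mean) ** 2 for d in kept) / (len(kept) - 1)) ** 0.5
```

The mean and standard deviation of the retained draws play the role of the point estimate and standard error, which is why no matrix of second derivatives is needed.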
Parameter estimates are reported in Table 7. The two columns on the left show estimates and standard errors using the viewing and
advertising likelihoods alone. The right two columns show estimates from the full model that controls for selection and multiplicity.
The top portion of the table reports parameters from the viewing model. Estimated values of the NMNL substitutability parameters
from the full model indicate that national and local news programs are more differentiated from each other than entertainment
programs. However, all of the estimated substitution parameters are small, indicating that the nesting parameters are not that
important in describing viewer choice. In essence, nesting adds little over a standard multinomial logit model in explaining viewing.
In practical terms, the estimates suggest that viewers tend to look at the spectrum of broadcasts and make a choice rather than first
choosing a category.19 The program category measures increase in magnitude between the initial and full specification (the omitted
category is cable entertainment), indicating that our maintained assumption that observed lineups maximize revenue is important in
estimation.
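To see why a nesting parameter near zero adds little over a standard multinomial logit, consider the following sketch. It assumes a Berry (1994)-style nested logit with a single nesting parameter `sigma` shared by all nests and an outside option normalized to zero; the paper's own share expression and nest-specific parameters may differ, and the mean utilities below are made up.

```python
import math

def nested_logit_shares(delta, nests, sigma):
    """Shares under a one-level nested logit (assumed parameterization)."""
    scaled = [math.exp(d / (1.0 - sigma)) for d in delta]
    nest_sums = {}
    for value, g in zip(scaled, nests):
        nest_sums[g] = nest_sums.get(g, 0.0) + value
    # The outside option contributes 1 to the denominator.
    denom = 1.0 + sum(total ** (1.0 - sigma) for total in nest_sums.values())
    return [
        (value / nest_sums[g]) * (nest_sums[g] ** (1.0 - sigma) / denom)
        for value, g in zip(scaled, nests)
    ]

def plain_logit_shares(delta):
    """Standard multinomial logit shares with the same outside option."""
    denom = 1.0 + sum(math.exp(d) for d in delta)
    return [math.exp(d) / denom for d in delta]

delta = [0.5, 0.2, -0.1, 0.0]            # hypothetical mean utilities
nests = ["local news", "local news", "entertainment", "entertainment"]

exact = nested_logit_shares(delta, nests, sigma=0.0)
near_zero = nested_logit_shares(delta, nests, sigma=0.0003)
logit = plain_logit_shares(delta)
max_gap = max(abs(a - b) for a, b in zip(near_zero, logit))
```

At `sigma = 0` the two share vectors coincide, and at a value of the order of the full-model estimates in Table 7 the largest difference (`max_gap`) is negligible.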
The local news indicator interacted with viewing households is negative, indicating that local news broadcasts are less differen-
tiated from the viewer perspective in larger markets. This result to some degree counters the intuition that viewers in larger, more
diverse markets might prefer more tailored news programs. However, the estimate is consistent with summary statistics in Table 2
indicating a lower overall viewing share for local news in large markets.20
The coefficient for the sequencing dummy is large and positive, suggesting that time preferences play a role in viewer demand. In
the full model, lagged viewing is less important than sequencing alone or sequencing interacted with lagged viewing. Together, these
results indicate that complementarity across broadcast types is more nuanced and more important in broadcast competition than
simple lead-in measures. The estimated standard deviations of the random effects differ by specification. In particular, the complete
model attributes more of the variation in viewership to station and, especially, market-time effects than the naïve model.
Estimates for advertising prices and revenues are shown in the middle “Advertising” portion of the table. Advertising prices for
local news broadcasts are slightly higher on average than general entertainment prices. (The omitted category here is national news,
as we do not observe local advertising prices for cable TV.) Advertiser elasticity of demand for national news viewers is lower than
for local news viewers, which in turn is (slightly) less than elasticity of demand for entertainment viewers. The difference means that
an increase in viewing raises the advertising price per second more for entertainment broadcasts than for local news broadcasts. This
17
For further discussion and examples, see Baker (2014).
18
We also considered alternative specifications with additional dynamic explanatory variables (i.e., more lagged and lagged-interacted viewing terms) and also
with variables that increased the flexibility of the NMNL specification as suggested by Ackerberg and Rysman (2005). Model adjustments did not substantially change
results, confirming our intuition that station and market-time effects in the viewing and advertising equations adequately control for these factors.
19
This result contrasts with the estimates in Berry and Waldfogel (1999) for radio, where broadcast substitutability parameters approach unity revealing that
categories are highly differentiated from the listener perspective.
20
The result is also consistent with results in Waldfogel et al. (2004), which finds that larger markets offer substantially greater variety in entertainment programming
targeting minority viewers, attracting viewers away from other programming and the outside option.
Table 7
Estimation results: starting values and full model.
Eq-by-Eq ML Full Model
b se b se
Viewing
𝜇: local news 0.010 0.015 0.0003 0.0004
𝜇: general entertainment 0.0274 0.0230 0.0002 0.0003
𝜇: national news 0.010 0.009 0.001 0.002
𝜇: cable entertainment 0.00002 0.00001 0.00001 0.00002
Local news 1.550 0.026 3.256 0.048
Local news × households -0.014 0.009 -0.119 0.003
General entertainment 0.891 0.067 1.009 0.019
National news 0.132 0.053 0.178 0.029
Sequence local-national news (t-1) 1.698 0.055 2.134 0.069
Viewing share(t-1) 0.290 0.106 0.135 0.002
Viewing share(t-1), sequence local-national (t-1) 0.129 0.006 0.146 0.012
Market Size (viewing households) -0.229 0.055 -0.494 0.003
Constant -3.220 0.090 -4.344 0.033
Viewership: log(sd) station RE 0.704 0.148 0.898 0.006
Viewership: log(sd) market-time RE 1.212 0.377 2.727 0.058
Viewership: log(sd) model 0.372 0.057 0.631 0.001
Advertising
Local news viewer elasticity 0.608 0.002 0.598 0.002
Entertainment viewer elasticity 0.709 0.001 0.686 0.002
National news viewer elasticity 0.283 0.002 0.262 0.002
Local news -2.973 0.018 -3.162 0.016
General entertainment -3.578 0.017 -3.633 0.014
Market Size (viewing households) 0.868 0.001 0.863 0.001
Constant -12.469 0.027 -12.609 0.009
Revenue: log(sd) station RE 0.367 0.011 0.365 0.009
Revenue: log(sd) market-time RE 0.313 0.009 0.388 0.009
Revenue: log(sd) model 0.294 0.002 0.289 0.002
result suggests why stations tend to schedule entertainment broadcasts later in the evening when a greater share of the population
chooses television over outside options.
7. Welfare analysis
The goal of our welfare analysis is to compare counterfactual equilibrium program configurations that maximize total viewing,
advertiser surplus and joint station revenue to each other and to observed lineups. These distinct equilibrium outcomes allow us to
evaluate inefficiencies in program allocation overall, and to examine the mechanisms underlying misallocation, which is relevant for
policy.
To build our counterfactuals, we first estimate the random effects for each station and market-time block. With these terms, we
simulate the model 10 times. For each simulation, we calculate market equilibria, then redraw so as not to overweight simulations
with multiple equilibria. Because of the large strategy space, it is impractical to check all strategies for maximized viewing, advertiser
surplus and joint station revenue. We instead use a simulation-based approach to maximization in which we randomly perturb a
station’s broadcast profile, check whether the proposed change in strategy increases the quantity of interest, and repeat the
process until further guesses fail to find improvement over many attempts. We then average results for the ten simulations for our
welfare analysis.21
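The perturb-and-check search can be sketched as follows. This is a simplified illustration under stated assumptions, not the routine used in the paper: the objective `toy_total_viewing` is a made-up concave function, each perturbation flips a single station-timeslot between local news ("N") and entertainment ("E"), and the stopping rule `max_failures` is a hypothetical tuning choice.

```python
import random

def toy_total_viewing(lineups):
    """Hypothetical objective: viewing is concave in the number of stations
    airing each format, so duplicating a format has diminishing returns."""
    total = 0.0
    for t in range(len(lineups[0])):
        news = sum(1 for station in lineups if station[t] == "N")
        ent = len(lineups) - news
        total += news ** 0.5 + 1.5 * ent ** 0.5
    return total

def perturb_and_climb(lineups, objective, max_failures=500, seed=0):
    """Flip one random station-timeslot at a time; keep the flip only if
    the objective rises; stop after many consecutive failed attempts."""
    rng = random.Random(seed)
    best = [list(station) for station in lineups]
    best_value = objective(best)
    failures = 0
    while failures < max_failures:
        s = rng.randrange(len(best))
        t = rng.randrange(len(best[0]))
        old = best[s][t]
        best[s][t] = "E" if old == "N" else "N"
        value = objective(best)
        if value > best_value:
            best_value, failures = value, 0
        else:
            best[s][t] = old          # undo the unsuccessful perturbation
            failures += 1
    return best, best_value

start = [["E"] * 6 for _ in range(4)]     # 4 stations, 6 timeslots, all entertainment
start_value = toy_total_viewing(start)
final, final_value = perturb_and_climb(start, toy_total_viewing)
news_per_slot = [sum(1 for st in final if st[t] == "N") for t in range(6)]
```

Because only the objective function changes, the same search can be pointed at total viewing, advertiser surplus, or joint station revenue.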
We make three key comparisons. First, we assess how far observed program schedules lie from the configuration that maximizes
total television viewing. This initial comparison offers an overall sense of market performance and speaks to the central question
of our paper, whether stations provide too much, or too little, local news. Second, we compare observed schedules to the program
configuration that maximizes station revenue. This comparison characterizes the extent to which business stealing reduces welfare
in relation to a fully collusive outcome. Third, we compare the configuration that maximizes viewing to the configuration that max-
21
Both Seim and Waldfogel (2013) and Fan and Yang (2020) also have a “large strategy space” problem. Seim and Waldfogel (2013) solve their monopoly problem
by using integer programming and optimization. They solve their Nash equilibrium problem by using sub-optimal heuristics. Fan and Yang (2020) solve their problem
in a way similar to ours, using deviations in strategies and iterating best responses until a Nash equilibrium is reached. While there is no perfect way to deal with a
large-strategy space, our approach can usefully be applied with different types of objective functions.
Table 8
Observed vs. optimal broadcasts and viewing.
                            Observed   Viewer Max   Advertiser Max   Station Max
Local News Broadcasts           10.2         17.6             13.6          13.0
National News Broadcasts        12.5         12.5             12.5          12.5
Entertainment Broadcasts        41.8         34.3             38.4          39.0
Cable Broadcasts               487.0        487.0            487.0         487.0
imizes advertiser surplus. This third comparison measures the scope for divergence between the interests of viewers and advertisers
in the context of a two-sided market. With these comparisons, we proceed to evaluate the characteristics of markets most likely to
experience welfare loss from sub-optimal program choices.
We summarize our welfare calculations in Table 8. The top portion of the table shows the number of local news and entertainment
broadcasts that maximize viewing, advertiser surplus and joint station revenue along with observed outcomes. (We show national
news and cable broadcasts for completeness, but these are taken as given by stations so do not vary across profiles.) Broadcast totals
are summed over all stations in a market and all six evening timeslots, then averaged across markets. The middle portion of the table
reports total viewers per thousand households associated with each program configuration, also summed across stations and timeslots
and averaged across markets. Because the same household may watch multiple broadcasts, viewing figures are best interpreted as
program views per thousand households.
The bottom two rows of the table report advertiser surplus and station revenue at the observed and maximizing program allocations.
Advertiser surplus and station revenue are averaged rather than summed across markets and shown in dollars per second of
advertising. With an average of ten local stations per market, multiplying station averages by ten closely approximates market values
per second.
Overall, results in the table reveal under-provision of local news during the evening news hours relative to the viewer maxi-
mum. We observe 10.2 local news broadcasts per day on average. Total television viewing would be maximized at 17.6 local news
broadcasts, a shortfall of 12.8% of local station broadcasts.
The middle portion of the table shows total viewing associated with the three maximizing broadcast allocations. At the viewer
maximum, local news viewing would increase from 48.9 to 102 program views per thousand households, about double observed
levels. Entertainment viewing would fall from 82.1 to 55.1 program views per thousand households. Local news viewing would
increase by more than entertainment and national news viewing would fall, for a total average viewing increase of 8%.
The bottom section of the table reports results for advertiser surplus and station revenue. A shift to the schedule that maximizes
viewing would reduce advertiser surplus by $0.21 per advertising second, or 2.5%. Joint station revenue would fall by $0.66 per
advertising second, about 5.2% on average. With 3,600 advertising seconds per evening, a move from the observed program lineup to
the viewer maximum would decrease advertiser surplus by a total of $757 per station each day. A move from the observed program
lineup to the viewer maximum would decrease joint station revenue by $2,390 per station each day. With an average of about 10
stations per market, the combined cost to advertisers and stations of moving to the viewer maximum would be about $36,087 per
market on average, or $0.19 per television viewer.
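The per-day figures come from scaling the per-second changes by the 3,600 advertising seconds in an evening. The check below uses only the rounded per-second values quoted in the text, so it gives totals close to, but not identical to, the reported $757 and $2,390, which presumably rest on unrounded inputs. Variable names are illustrative.

```python
AD_SECONDS_PER_EVENING = 3600

# Rounded per-second losses from moving to the viewer maximum (from the text).
advertiser_loss_per_second = 0.21
station_loss_per_second = 0.66

advertiser_loss_per_day = advertiser_loss_per_second * AD_SECONDS_PER_EVENING
station_loss_per_day = station_loss_per_second * AD_SECONDS_PER_EVENING
combined_loss_per_station_day = advertiser_loss_per_day + station_loss_per_day

print(round(advertiser_loss_per_day), round(station_loss_per_day),
      round(combined_loss_per_station_day))
```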
It is worth emphasizing at this point that we cannot evaluate whether the benefits of shifting to the equilibrium that maximizes
viewing outweigh the estimated cost in station revenue and advertiser surplus, as we cannot quantify the value of television broad-
casts to the representative viewer relative to the outside option of not watching television. This limitation has been recognized in the
literature on free-to-air television and radio markets since the foundational work of Steiner (1952) and Beebe (1977). So while we
refer to the “viewer, advertiser and station optima” for expositional consistency, from the viewer standpoint we are characterizing
the allocation that maximizes program views over the six timeslots.
We can, however, make welfare comparisons more concrete by comparing observed outcomes to those that maximize joint station
profits. A comparison of the first and last data columns of Table 8 shows that the 10.2 observed local news broadcasts are lower
even than the 13 broadcasts that would maximize joint station profits. Out of 60 evening timeslots on average, this is an overall
misallocation of 4.5%. Total viewing at the station optimum is only about 2.5% lower than observed viewing, but local
news viewing would increase from 48.9 to 63.9 program views at the station optimum, or 33%. A shift to the station optimum
would increase joint station revenue by $0.30 per ad-second, or 2.8%. Total revenue would increase by $1,071 per station per day,
or $12,870 per market per day on average. Advertiser surplus would also increase at the station optimum by $0.26 per ad-second,
which translates to an average of $960 per station per day, or $11,708 per market.
Table 9
Market Example in Syracuse, NY, 7:00-7:30 pm.
                    Broadcast                          Ad Price                   Viewing
Station  Network    Observed       Station Optimum     Observed  Station Optimum  Observed  Station Optimum
WSTQ     CW         Entertainment  Local News          $1.34     $1.18            0.788     0.885
WSTM     NBC        Entertainment  Entertainment       $8.71     $10.29           5.478     6.572
WSYR     ABC        Entertainment  Entertainment       $7.72     $7.04            6.797     6.788
WSYT     FOX        Entertainment  Entertainment       $4.40     $5.62            7.483     7.48
WTVH     CBS        Entertainment  Entertainment       $4.05     $4.24            7.338     7.329
WNYS     MNT        Entertainment  Entertainment       $2.26     $2.53            2.191     2.2
Evidence that observed revenue falls below the level that maximizes joint station profits suggests that business stealing is an
important source of welfare loss in advertiser-funded television. We highlight the mechanism with a market example in Table 9,
which shows the broadcast schedule in Syracuse, New York, during the 7:00-7:30 timeslot. The broadcast columns indicate that
all six local stations in the market show entertainment programs during this period. The station optimum includes one local news
broadcast. In this example, if the CW affiliate WSTQ were to switch from entertainment to local news, our simulations show that
revenue at WSTQ would fall by $0.16 per ad-second. However, the combined revenue of competitors broadcasting entertainment would
rise by $2.58 per ad-second, a net market gain of $2.42 that increases total revenue in the market by 8.5%. Total viewing would increase by 3.9% and local news
viewing by 12.3%. Consistent with classic examples from the theoretical literature, in this market no single station has an incentive
to alter programming, but joint station profits would increase with differentiation.
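The Syracuse figures can be checked directly from the ad prices and viewing shares in Table 9. The short calculation below treats the ad price per second as the revenue measure, as elsewhere in the paper; the variable names are illustrative. It shows that the $2.42 figure is the net change in market revenue, made up of a combined gain of $2.58 at the five rival stations less the $0.16 loss at WSTQ.

```python
# (station, observed price, station-optimum price, observed viewing, optimum viewing)
table9 = [
    ("WSTQ", 1.34, 1.18, 0.788, 0.885),
    ("WSTM", 8.71, 10.29, 5.478, 6.572),
    ("WSYR", 7.72, 7.04, 6.797, 6.788),
    ("WSYT", 4.40, 5.62, 7.483, 7.48),
    ("WTVH", 4.05, 4.24, 7.338, 7.329),
    ("WNYS", 2.26, 2.53, 2.191, 2.2),
]

observed_revenue = sum(row[1] for row in table9)
optimum_revenue = sum(row[2] for row in table9)

wstq_change = table9[0][2] - table9[0][1]                  # the switching station
rivals_change = sum(row[2] - row[1] for row in table9[1:])
net_change = optimum_revenue - observed_revenue
pct_revenue = 100 * net_change / observed_revenue

observed_viewing = sum(row[3] for row in table9)
optimum_viewing = sum(row[4] for row in table9)
pct_viewing = 100 * (optimum_viewing - observed_viewing) / observed_viewing
pct_wstq_viewing = 100 * (table9[0][4] - table9[0][3]) / table9[0][3]
```

The switching station bears a private loss while its rivals collect the gain, which is why the switch does not occur without coordination.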
We can evaluate losses from a different perspective by comparing the viewer and advertiser equilibria. Table 8 indicates that the
program allocation that maximizes viewing differs from the allocation that maximizes advertiser surplus by 4 broadcasts on average.
This difference measures the scope for two-sided market distortions that arise when stations choose programming to satisfy the tastes
of advertisers over viewers. The difference is only slightly larger than the difference between the observed outcome and the station
optimum, 2.8 broadcasts per market on average. These magnitudes suggest that both business stealing and two-sided market forces
play a meaningful role in biasing programming away from the viewer ideal. Coordinated scheduling would remove about one third
of the deficit induced by the divergence between advertiser and viewer preferences. It is also worth noting that advertisers would
collectively gain even more than stations from overcoming business stealing and shifting to station optimum. Finally, the table shows
that the advertiser optimum of 13.6 local news broadcasts offers advertiser surplus very close to the station maximizing allocation.
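The decomposition can be made explicit using the local news rows of Table 8. The sketch below splits the total gap between the viewer maximum and observed lineups into three pieces; the labels are illustrative, and the small middle piece (advertiser optimum less station optimum) is simply the remainder implied by the table.

```python
# Average local news broadcasts per market (Table 8).
observed, station_max, advertiser_max, viewer_max = 10.2, 13.0, 13.6, 17.6

business_stealing_gap = round(station_max - observed, 1)        # observed -> station optimum
station_to_advertiser = round(advertiser_max - station_max, 1)  # station -> advertiser optimum
two_sided_gap = round(viewer_max - advertiser_max, 1)           # advertiser -> viewer optimum
total_gap = round(viewer_max - observed, 1)

share_business_stealing = business_stealing_gap / total_gap

print(business_stealing_gap, station_to_advertiser, two_sided_gap, total_gap)
print(round(100 * share_business_stealing, 1))
```

Business stealing accounts for 2.8 of the 7.4 missing broadcasts, a little over one third of the total shortfall.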
Table 8 reports market totals, but timing plays an important role in broadcast outcomes. Table 10 shows observed and maximizing
lineups by timeslot, averaged across markets. The differences between the observed and optimal lineups for viewers, advertisers and
stations are small early in the evening, increasing later, with the largest gap in the 7:30-8:00 timeslot. The difference in viewing
between observed and optimal programming also diverges as the evening progresses. The time pattern in mis-allocation is similar
for each of the three equilibria, with the largest gap for viewers and smallest gap for stations.
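The concentration of the shortfall late in the evening can be read off the local news rows of Table 10. The calculation below, with illustrative variable names, computes the gap between the viewer maximum and observed broadcasts in each timeslot and the share of the total gap that falls at 7:30.

```python
timeslots = ["5:00", "5:30", "6:00", "6:30", "7:00", "7:30"]
observed = [3.4, 1.9, 3.5, 0.8, 0.5, 0.2]       # Table 10, local news broadcasts
viewer_max = [3.7, 2.2, 4.1, 1.4, 1.7, 4.6]

gaps = [round(v - o, 1) for v, o in zip(viewer_max, observed)]
total_gap = round(sum(gaps), 1)
largest_slot = timeslots[gaps.index(max(gaps))]
share_at_730 = gaps[-1] / total_gap

for slot, gap in zip(timeslots, gaps):
    print(slot, gap)
```

The timeslot gaps sum to the 7.4-broadcast deficit reported for the market as a whole, with roughly 59% of it in the 7:30 timeslot.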
The time patterns in broadcast outcomes are illustrated in Fig. 4, which shows a sequence plot of observed broadcast patterns over
the evening in relation to the viewer, advertiser and station optima. The value of more differentiated programming across the evening
can be seen in the multicolored bars of the optimal allocation relative to the solid areas on the observed outcome. The sequence plots
also highlight that our simulations do not just identify local news deficits in lineups. All three maximizing allocations have more
entertainment broadcasts early in the evening than the observed allocation, though in the case of the viewer optimum the difference
is small. Overall, across six timeslots in 101 markets, we observe 351 timeslots with less local news than would maximize joint station
revenue and 95 timeslots with more local news than would maximize joint station revenue, with underprovision concentrated in later
timeslots and overprovision in earlier timeslots.
Table 11 summarizes the cost in station revenue associated with shifting to the allocation that maximizes viewing at the timeslot
level. Switching program configurations to the viewer maximum would increase average prices by a small amount, 0.33 - 1.19%
in the first four evening timeslots. A switch would reduce average prices later in the evening, by less than 1% at 7:00 but by 3.8%
in the 7:30 timeslot on average. Similarly, reallocation of programming to the viewer maximum would increase total viewing by
just a small amount early in the evening, with larger increases in the late timeslots. Again the largest increase would be in the 7:30
timeslot, raising local news viewing by 7.58%.
We can also use results of the simulation to better understand the conditions that contribute to inefficiency in programming. To
do this, we link the patterns in Table 8 to market characteristics using ordinary least squares regressions. While we cannot rule out
bias from unobserved market characteristics in these cross-sectional regressions, they nonetheless offer useful insights on the nature
of losses and areas of policy interest.
Results are shown in Tables 12 and 13. Table 12 examines misallocation in station programming. The dependent variable in the
first column is the difference between the number of local news broadcasts at the configuration that maximizes joint station revenue
and the observed number of local news broadcasts. This is our measure of misallocation from business stealing. The dependent
variable in the second column is the difference in local news broadcasts between the viewer optimum and advertiser optimum. This
is our measure of misallocation induced by differences in viewer and advertiser values for program formats. The dependent variable
in column (3) is the difference between the number of local news broadcasts that maximizes viewing and the number observed,
which is a measure of total deviation from viewer preferences. The average value of each difference is positive, indicating local news
shortfalls on average.
Table 10
Observed vs. optimal broadcasts and viewing by timeslot.
Observed Viewer Max Advertiser Max Station Max
Local News Broadcasts
5:00 3.4 3.7 3.5 3.5
5:30 1.9 2.2 2.0 2.0
6:00 3.5 4.1 3.8 3.7
6:30 0.8 1.4 0.9 0.9
7:00 0.5 1.7 0.9 0.9
7:30 0.2 4.6 2.4 2.0
Entertainment Viewing
5:00 4.8 4.5 4.8 5.0
5:30 5.1 4.7 5.3 5.3
6:00 5.2 4.5 5.2 5.5
6:30 13.7 11.3 13.3 13.3
7:00 25.9 20.2 24.6 24.5
7:30 27.4 9.9 18.8 21.5
Cable Viewing
5:00 20.1 20.1 20.1 20.1
5:30 21.3 21.3 21.3 21.3
6:00 22.2 22.2 22.2 22.2
6:30 23.7 23.7 23.7 23.7
7:00 27.2 27.2 27.2 27.2
7:30 28.5 28.5 28.5 28.5
Table 11
Observed vs. optimal advertising prices per second by timeslot.
Timeslot Observed Price Viewer Max Price %Difference (Price) Observed Viewing Viewer Max Viewing %Difference (Viewing)
5:00 $5.53 $5.74 1.19% 42.03 42.13 0.06%
5:30 $6.07 $6.20 0.60% 44.47 44.62 0.09%
6:00 $7.87 $7.97 0.48% 49.24 49.41 0.10%
6:30 $9.00 $9.36 0.33% 50.37 51.14 0.31%
7:00 $14.98 $14.73 -0.34% 55.00 57.33 0.94%
7:30 $15.54 $13.34 -3.83% 56.24 77.68 7.58%
Table 12
Welfare loss and market characteristics (local news broadcasts).
Business Stealing Gap Two-Sided Market Gap Total Viewer Gap
(Station Max - Obs.) (Viewer Max - Ad. Max) (Viewer Max - Obs.)
(1) (2) (3)
Market Households (1,000) 0.532** 0.246 0.823**
(0.166) (0.163) (0.256)
Black HH Share 0.188 1.158 1.081
(0.779) (0.767) (1.201)
Hispanic HH Share 2.389** 2.559** 5.360**
(0.725) (0.714) (1.119)
Median Income (1,000) 0.028 0.013 0.046
(0.020) (0.019) (0.030)
Market Median Age -0.008 -0.009 -0.016
(0.062) (0.061) (0.095)
Constant 0.698 2.850 3.819
(2.698) (2.656) (4.162)
Table 13
Welfare loss and market characteristics ($ per second).
Business Stealing Gap Two-Sided Market Gap Total Viewer Gap
(Station Max - Obs.) (Viewer Max - Ad. Max) (Viewer Max - Obs.)
(1) (2) (3)
Market Households (1,000) 0.211** -0.395** -0.279**
(0.013) (0.082) (0.082)
Black HH Share 0.098 -0.142 0.064
(0.059) (0.384) (0.385)
Hispanic HH Share 0.254** 0.103 0.305
(0.055) (0.358) (0.359)
Median Income (1,000) -0.003+ 0.004 0.002
(0.001) (0.010) (0.010)
Market Median Age 0.011* 0.070* 0.091**
(0.005) (0.030) (0.031)
Constant -0.242 -3.223* -3.949**
(0.205) (1.331) (1.334)
Results in column (1) indicate that the misallocation due to business stealing increases with market size. This relationship is
consistent with game-theoretical results that coordination is more difficult with more players and higher stakes. The result is echoed
in column (3), which indicates that the overall difference in viewing between the program allocation that maximizes viewing and
observed programming is greatest in the largest markets. The divergence between the viewer and advertiser allocation in column (2)
is also positively related to market size, but the magnitude of the effect is smaller and not statistically different from zero. Taken
together, these results suggest that underprovision from the viewer perspective is driven more strongly by business stealing in larger
markets.
The coefficient for the Hispanic share of the population is positive in all three columns of Table 12. The result suggests that
local news may be under-provided for preference minorities, echoing results in Oberholzer-Gee and Waldfogel (2009) that Hispanic
population is associated with more Spanish-language local news. No other market demographics show a statistically significant link
to allocation inefficiencies in the table.
Table 13 links broadcast misallocation to monetary losses, measured in average advertising prices per second (station revenue).
Paralleling the columns in Table 12, the dependent variable in column (1) is the difference between the price per second at the
program configuration that maximizes joint station revenue less the observed price. The dependent variable in column (2) is the
difference between the price at the program configuration that maximizes total viewing less prices at the allocation that maximizes
advertiser surplus. The dependent variable in column (3) is the difference between the price at the viewer optimum and the observed
price, a measure of the cost to maximize television views. The average difference in column (1) is positive, indicating that observed
prices fall short of the maximizing price. The average difference in column (2) is negative, indicating the expected difference
between the allocation that maximizes advertiser and viewer surplus. In this case the mean difference in column (3) is also negative,
indicating that two-sided market loss is larger than the business stealing loss on average.
The top row of column (1) indicates that the cost of business stealing to stations is highest in larger and more diverse markets.
These are also the markets where the allocation distortion is greatest. But the difference between station revenue at the viewer and
advertiser optimum in column (2) is negatively correlated with market size. This is the same pattern as in column (3), the difference
between station revenue at the viewer maximum and observed levels. We interpret this result as evidence that advertisers wield
more influence over stations in smaller markets where total television viewing per household is substantially higher. Higher median
age is associated with a larger difference between viewer and advertiser preferences (also higher business stealing effects), possibly
driven by stronger preferences for local news among older audiences.
More broadly, the simulations show a shortfall of one or more local news broadcasts compared to the viewer optimum in 44%
of markets between 5:00-7:00 and all markets between 7:00-8:00. We find a shortfall of one or more local news broadcasts relative
to programming that maximizes joint station revenue in 11% of markets between 5:00-7:00 and 81% between 7:00 and 8:00. The
shortfall in local news broadcasts tends to be greater in larger, more diverse markets. Advertising prices per second, our measure of
station revenue, are in all markets below what could be achieved under a fully collusive outcome. Observed prices are even lower
than those estimated for the viewer optimum in 5 markets. Overall, misallocation from business stealing is about one third of the
difference due to two-sided market incentives.
From a policy perspective, our results indicate that regulatory attention to localism generally and local news in particular remains
important. However, our analyses also indicate that under-provision is greater in larger, more diverse markets, which runs counter
to policy emphasis. Although advertisers prefer more entertainment programming than is optimal from the viewer perspective, classic
business stealing is also important, and substantial gains might be achieved from coordination among stations.
Our results thus suggest that policies such as cross-ownership rules, designed to increase competition, may be counterproductive
in incentivizing the production of local news. Although the ability of multi-product firms to internalize business stealing when entry
is limited has been recognized since the early days of television (Steiner, 1952; Beebe, 1977), the vast expansion of entertainment
alternatives through the growth of cable television reduced attention to this topic. More recent policy work focuses on incentives for
diversity in news production. But our results suggest that gains from coordination can have the greatest benefit in diverse markets.
Our analysis is limited by the fact that we observe revenue but not programming costs. The costs of programming do not rise
with the number of viewers, so are subsumed into station and format effects. If local stations in a market face a common pool of
technical and acting talent, these controls are adequate to support our findings. But we cannot rule out format-specific costs that
shift payoffs of broadcasting particular formats. In this case we might overstate or understate the gap between maximal joint station
revenue and observed outcomes. However, we also find that many of the gains we predict arise from re-allocating broadcasts across
timeslots rather than net shifts in format. Re-allocation of programming within a station over time requires no incremental cost.
It is also worth considering how our results generalize to broader settings beyond our February 2010 data. Our time period sits
before the widespread shift in attention to mobile devices and social media, and prior to substantial advances in digital advertising.
But as remarked at the outset, local television news markets have remained highly stable over time, with only modest changes in the
number of local television news broadcasts, viewing shares, and advertising revenue. According to annual Pew reports, local news
viewing has declined from approximately 4 million in 2013 to 3 million in 2022. Advertising revenue has declined from 18.6 to 17.8
billion between 2010 and 2022 (not adjusted for inflation). The average number of local TV news broadcast hours has increased
from 5.3 to 6.2 between 2010 and 2022, with broadcast increases concentrated in early morning hours (Center, 2023). We do not expect that our
results extend to daytime markets, when total television viewing is much lower.
Our data also cover February, which is a winter month with high television viewing compared to warmer spring and summer
months. Overall advertising spot prices are higher in these months, but otherwise advertising does not display strong seasonal patterns.
The exception is the weeks leading up to November elections, especially presidential elections, when campaign spending
can affect prices in local markets.22
22
This effect is strong enough that it has been used as a source of identifying variation in economic and marketing research, though recent work by Moshary et al.
(2021) documents only modest crowding out of commercial for political advertising.
8. Conclusions
We estimate the welfare consequences of local news broadcasting decisions in advertiser-funded television, finding a shortfall in
local news provision relative to the viewer optimum. The shortfall is greatest late in the evening news hours from 7:00–8:00 p.m.,
and is greater in large markets.
Higher advertiser value for entertainment programming during these timeslots explains some of the deficit. Advertisers prefer
fewer local news broadcasts in most markets and timeslots than the number that would maximize total television viewing. However,
we also find that the number of local news broadcasts is in many cases less than the number that would maximize joint station
revenue, suggesting that classic business-stealing also reduces welfare in local television markets.
Our result that viewer and advertiser preferences over program lineups can diverge suggests that long-standing policy attention
to localism in media markets has been warranted. But our finding that business stealing incentives play a role in program choice
highlights the need to understand different sources of inefficiency, as they point to different policy approaches. In particular, policies
that seek to strengthen competition in local news markets such as ownership caps and cross-ownership rules do not necessarily
incentivize greater supply of local news, and may exacerbate under-provision from the viewer perspective. Although we do
not explicitly consider consolidation, our results indicate that an extension of our approach to these areas might prove worthwhile.
Matthew J. Baker: Writing – review & editing, Writing – original draft, Methodology, Investigation, Formal analysis, Data
curation, Conceptualization. Lisa M. George: Writing – review & editing, Writing – original draft, Methodology, Investigation,
Formal analysis, Data curation, Conceptualization.
Data availability
Acknowledgements
Support for this project was provided by PSC-CUNY Award 64711-00-42, jointly funded by The Professional Staff Congress
and The City University of New York. We thank Tobias Klein, Lapo Filistrucchi, and seminar participants at the Media Economics
Workshop, Cornell University, and FCC for helpful comments.
Appendix
We present technical details in five sections. Appendix A shows the Jacobian transformation of the nested multinomial logit
(NMNL) and likelihood calculations. Appendix B describes the process for drawing error terms to simulate revenues. Appendix C
describes the procedure for identifying multiple equilibria. Appendix D discusses importance sampling, and Appendix E describes
our application of the Markov Chain Monte Carlo (MCMC) simulation technique.
The nested multinomial logit model of viewership share results in a share expression that can be written as:
In the under-braces, we have indicated how the right-hand side of (A.1) can be broken into a substitution component, an observable
component, and an unobserved component whose error structure includes market-time, station, and idiosyncratic
components.
This is a useful way of writing the viewership equation because we can now write the viewership model in the form (see equation
(6)):
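$$
y_{jmt} = \mu_b \ln\left[\frac{e^{\frac{\alpha_b^v + X_{jmt}^v\beta_v + \epsilon_{jmt}^v}{1-\mu_b}}}{\sum_{b_k \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \epsilon_{kmt}^v}{1-\mu_b}}}\right] + \alpha_b^v + X_{jmt}^v\beta_v + \epsilon_{jmt}^v \tag{A.2}
$$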
To compute the density, note that the density of $y$, $f(y)$, can be written in terms of the density of $\epsilon$, $f(\epsilon)$, using the Jacobian of
$y$ with respect to $\epsilon$:
$$
f(y) = f(\epsilon)\left|\frac{d\epsilon}{dy}\right| = f(\epsilon)\,\frac{1}{\left|\frac{dy}{d\epsilon}\right|}
$$
This Jacobian can be calculated using expression (A.2). Expanding and simplifying, we have:
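$$
y_{jmt} = \frac{1}{1-\mu_b}\left(\alpha_b^v + X_{jmt}^v\beta_v + \epsilon_{jmt}^v\right) - \mu_b \ln\left(\sum_{b_{kmt} \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \epsilon_{kmt}^v}{1-\mu_b}}\right) \tag{A.3}
$$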
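Differentiating (A.3) with respect to $\epsilon^v_{kmt}$, and using the fact that $j$'s share within the group is:
$$
s_{jmt|bmt} = \frac{e^{\frac{\alpha_b^v + X_{jmt}^v\beta_v + \epsilon_{jmt}^v}{1-\mu_b}}}{\sum_{b_{kmt} \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \epsilon_{kmt}^v}{1-\mu_b}}}
$$
we obtain:
$$
\frac{\partial y_{jmt}}{\partial \epsilon_{kmt}} =
\begin{cases}
\dfrac{1}{1-\mu_b} - \dfrac{\mu_b}{1-\mu_b}\, s_{k|bmt} & j = k \\[2ex]
-\dfrac{\mu_b}{1-\mu_b}\, s_{k|bmt} & j \neq k;\ j, k \in bmt \\[2ex]
0 & \text{otherwise}
\end{cases}
\tag{A.4}
$$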
For a given market and timeslot, we then have a block matrix in which stations are grouped according to broadcast type:
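$$
\mathbf{J} =
\begin{bmatrix}
\mathbf{J_l} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{J_e} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{J_n} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{J_c}
\end{bmatrix}
$$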
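Each block is a square matrix of dimension $N_b$. From (A.4), it has the shape
$$
\mathbf{J_b} = \frac{1}{1-\mu_b}\left(\mathbf{I} + \mathbf{u_b}\mathbf{v_b}'\right),
$$
where
$$
\mathbf{u_b} = \begin{bmatrix} -\mu_b \\ -\mu_b \\ \vdots \\ -\mu_b \end{bmatrix}, \qquad
\mathbf{v_b} = \begin{bmatrix} s_{1|bmt} \\ s_{2|bmt} \\ \vdots \\ s_{N_b|bmt} \end{bmatrix}
$$
It follows that the determinant of $\mathbf{J_b}$ is:
$$
|\mathbf{J_b}| = \left(\frac{1}{1-\mu_b}\right)^{N_b}\left(1 + \mathbf{u_b}'\mathbf{v_b}\right)
$$
where
$$
1 + \mathbf{u_b}'\mathbf{v_b} = 1 - \mu_b \sum_{i=1}^{N_b} s_{i|bmt} = (1 - \mu_b)
$$
because the sum of shares within a group must add to one. It follows that:
$$
|\mathbf{J_b}| = \left(\frac{1}{1-\mu_b}\right)^{N_b - 1}
$$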
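The determinant result can be checked numerically. The following is a minimal sketch, not the authors' code: it builds one nest's Jacobian block entry by entry from (A.4) and compares its determinant with the closed form. The share vector and the value of $\mu_b$ are hypothetical, and `det` and `nest_jacobian` are illustrative helper names.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-15:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def nest_jacobian(shares, mu):
    """Block J_b from (A.4): entry (j, k) = (1[j == k] - mu * s_k) / (1 - mu)."""
    n = len(shares)
    return [[((1.0 if j == k else 0.0) - mu * shares[k]) / (1.0 - mu)
             for k in range(n)] for j in range(n)]


shares = [0.5, 0.3, 0.2]  # hypothetical within-nest shares, summing to one
mu = 0.4                  # hypothetical nesting parameter
numeric = det(nest_jacobian(shares, mu))
closed_form = (1.0 / (1.0 - mu)) ** (len(shares) - 1)
print(numeric, closed_form)
```

The two numbers agree only when the within-nest shares sum to one, which is the step the derivation relies on.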
Therefore, the (log of the) reciprocal of this determinant must be added to the likelihood. Baltagi shows how the density of errors
with a two-way random effect assuming normally distributed error terms can be computed (Baltagi, 2008, p. 42). Adding on the
Jacobian term, we have the log-likelihood as:
$$
\ln L_m^v = \text{constant} - \frac{1}{2}\ln|\mathbf{\Omega_v}| - \frac{1}{2}\epsilon_j'\mathbf{\Omega_v}^{-1}\epsilon_j + \sum_{t=1}^{6}\sum_{bmt \in B}(N_{bmt} - 1)\ln(1 - \mu_b) \tag{A.6}
$$
where
$$
\mathbf{\Omega_v} = \sigma^2_{mtv}(\mathbf{I_{mtv}} \otimes \mathbf{J_{sv}}) + \sigma^2_{sv}(\mathbf{J_{mtv}} \otimes \mathbf{I_{sv}}) + \sigma^2_{smtv}(\mathbf{I_{mtv}} \otimes \mathbf{I_{sv}}) \tag{A.7}
$$
and $\mathbf{I_{mtv}}$ and $\mathbf{I_{sv}}$ are identity matrices of dimension $mtv$ and $sv$, where $mtv$ is the total number of viewership
shares across the market, and $sv$ is the number of observed shares per station across the market (i.e., $sv = 6$). Similarly, $\mathbf{J_{mtv}}$ and $\mathbf{J_{sv}}$
are matrices of ones of dimensions $mtv$ and $sv$. Thus, the Jacobian transform that must be used is akin to multiplying the substitution
term by the number of observations in a nest minus one. This puts downward pressure on the likelihood because, absent other concerns,
the likelihood would be maximized by setting $\mu_b = 0$.
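To make the Kronecker structure of (A.7) concrete, the sketch below assembles a small covariance matrix with the same three-term layout. It is an illustration under stated assumptions rather than the estimation code: the dimensions and variances are hypothetical, observations are assumed to be stacked with the first Kronecker index varying slowest, and `identity`, `ones`, `kron`, and `omega` are illustrative names.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def ones(n):
    return [[1.0] * n for _ in range(n)]


def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def omega(n_first, n_second, var_first, var_second, var_idio):
    """Three-term Kronecker sum laid out as in (A.7)."""
    terms = [
        (var_first, kron(identity(n_first), ones(n_second))),    # I (x) J
        (var_second, kron(ones(n_first), identity(n_second))),   # J (x) I
        (var_idio, kron(identity(n_first), identity(n_second))), # I (x) I
    ]
    size = n_first * n_second
    return [[sum(w * m[r][c] for w, m in terms) for c in range(size)]
            for r in range(size)]


# Hypothetical sizes and variances, chosen only to keep the matrix small.
cov = omega(n_first=2, n_second=3, var_first=0.5, var_second=0.25, var_idio=1.0)
for row in cov:
    print(row)
```

Observations that share the first index are correlated through the first term, observations that share the second index are correlated through the second term, and the diagonal carries the sum of all three variances.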
The revenue equation is:
As before, this two-way random components structure has a likelihood mirroring that described in (A.6) and (A.7), which can be
written as:
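$$
\ln L_m^p = \text{constant} - \frac{1}{2}\ln|\mathbf{\Omega_p}| - \frac{1}{2}\epsilon^{p\prime}\mathbf{\Omega_p}^{-1}\epsilon^{p} \tag{A.9}
$$
where
$$
\mathbf{\Omega_p} = \sigma^2_{mtp}(\mathbf{I_{mtp}} \otimes \mathbf{J_{sp}}) + \sigma^2_{sp}(\mathbf{J_{mtp}} \otimes \mathbf{I_{sp}}) + \sigma^2_{smtp}(\mathbf{I_{mtp}} \otimes \mathbf{I_{sp}}) \tag{A.10}
$$
and $\mathbf{I_{mtp}}$ and $\mathbf{I_{sp}}$ are identity matrices of dimension $mtp$ and $sp$ (and $\mathbf{J_{mtp}}$ and $\mathbf{J_{sp}}$ are the corresponding matrices of ones), where $mtp$ is the number of revenue observations
across all timeslots in a market, and $sp$ is the number of station-level observations (i.e., $sp = 6$).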
Recall from section 6.1.3 that station revenue maximization requires that we calculate the likelihood that observed revenues are
greater than possible alternative revenues, expressed in equation (25) as:
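$$
L_m^r = \operatorname{Prob}\left( R_i(\delta_i(b_i), \delta_{-i}(b_{-i}), \epsilon^{v}, \epsilon^{p}, \theta) \geq R_i(\delta_i(b_i'), \delta_{-i}(b_{-i}), \epsilon'^{v}, \epsilon'^{p}, \theta);\ b' \in i \right)
$$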
In principle, we could calculate 𝐿𝑟𝑚 by the integral:
where Γ𝑖 is the integration region coincident with (25). This integral is difficult to calculate, so we instead simulate it by drawing
𝜖𝑖′ 𝑣 and 𝜖𝑖′ 𝑝 .
To do this, we first combine expressions (16) and (22) to get an expression for total revenues:
$$
R_j = \sum_{t=1}^{6}\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right)
$$
Using the representation of viewing in the text and above, we express total viewing as market population 𝑀𝑚 multiplied by the
viewing share:
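$$
v_{jmtb} = M_m \,
\frac{e^{\frac{\alpha_b^v + X_{jmt}^v\beta_v + \omega_j^v + \omega_{mt}^v + \epsilon_{jmt}}{1-\mu_b}}}
{\sum_{k \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \omega_k^v + \omega_{mt}^v + \epsilon_{kmt}}{1-\mu_b}}}
\cdot
\frac{\left(\sum_{k \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \omega_k^v + \omega_{mt}^v + \epsilon_{kmt}}{1-\mu_b}}\right)^{1-\mu_b}}
{1 + \sum_{b'}\left(\sum_{k \in b'} e^{\frac{\alpha_{b'}^v + X_{kmt}^v\beta_v + \omega_k^v + \omega_{mt}^v + \epsilon_{kmt}}{1-\mu_{b'}}}\right)^{1-\mu_{b'}}}
\tag{B.2}
$$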
Beginning with draws for market-time and station effects, we then compute viewership errors 𝑒̂𝑣𝑗𝑚𝑡 . To get shares due to one firm
deviating while all others hold decisions constant, we draw a new set of viewership errors 𝑒̃𝑣𝑗𝑚𝑡 and use these to predict counterfactual
shares for the whole market for each broadcast menu that a firm might follow. This produces a set of hypothetical viewership patterns
for the deviating firm (in fact, for all firms) that can be used in the revenue equation as 𝑣𝑗𝑚𝑡𝑏 .
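The share calculation behind (B.2) can be sketched as follows. This is a hedged illustration, not the authors' implementation: it uses the standard nested logit form with an outside option, the station names, nests, utilities, and nesting parameters are hypothetical, and `nested_logit_shares` is an illustrative name. Multiplying each share by market population would give viewing levels.

```python
import math


def nested_logit_shares(delta, nests, mu):
    """Nested logit shares with an outside option normalized to utility zero.

    delta: station -> utility index (mean utility plus drawn error terms)
    nests: nest label -> list of stations broadcasting that type
    mu:    nest label -> nesting parameter in [0, 1)
    """
    # Inclusive sum for each nest: sum_k exp(delta_k / (1 - mu_b)).
    inclusive = {b: sum(math.exp(delta[k] / (1.0 - mu[b])) for k in members)
                 for b, members in nests.items()}
    denom = 1.0 + sum(inclusive[b] ** (1.0 - mu[b]) for b in nests)
    shares = {}
    for b, members in nests.items():
        nest_share = inclusive[b] ** (1.0 - mu[b]) / denom
        for j in members:
            within = math.exp(delta[j] / (1.0 - mu[b])) / inclusive[b]
            shares[j] = within * nest_share
    return shares


# Hypothetical market-timeslot: two news stations and one entertainment station.
delta = {"A": -1.0, "B": -1.5, "C": -0.5}
nests = {"news": ["A", "B"], "entertainment": ["C"]}
mu = {"news": 0.4, "entertainment": 0.2}

observed = nested_logit_shares(delta, nests, mu)

# Counterfactual: station B deviates to entertainment, others hold fixed.
deviation = nested_logit_shares(
    delta, {"news": ["A"], "entertainment": ["B", "C"]}, mu)
print(observed)
print(deviation)
```

Recomputing shares for the whole market after one station changes its broadcast type is what makes each counterfactual lineup costly to evaluate, since every deviation shifts the inclusive sums of two nests.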
Simulating all potential viewership involves computing viewership patterns for all possible combinations of observed and hypothetical
actions and error terms. We tackle this recursively, drawing error terms by working backwards. This is necessary because of
the presence of lagged terms in viewership.
To illustrate, at the very end of prime time it must be the case that (using the notation (6) to denote the hypothetical scenario
that only programming in the final, sixth timeslot is altered):
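$$
\begin{aligned}
R_j &= \sum_{t=1}^{5}\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right) \\
&\quad + \alpha_b^p + \eta_b \ln v_{jm6b} + \omega_j^p + \omega_{m6}^p + \omega_{jm6} \\
&\geq \\
R_j^{(6)} &= \sum_{t=1}^{5}\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right) \\
&\quad + \alpha_b^p + \eta_b \ln v_{jm6b} + \omega_j^p + \omega_{m6}^p + \tilde{\omega}_{jm6}
\end{aligned}
\tag{B.3}
$$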
Which reduces to:
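$$
\begin{aligned}
R_j &= \sum_{t=1}^{6}\exp\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right) \\
&\geq \\
R_j^{(56)} &= \sum_{t=1}^{4}\exp\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right) \\
&\quad + \max\Big[\exp\left(\alpha_b^p + \eta_b \ln v_{jm5b} + \omega_j^p + \omega_{m5}^p + \tilde{\omega}_{jm5}\right)
+ \exp\left(\alpha_b^p + \eta_b \ln v_{jm6b} + \omega_j^p + \omega_{m6}^p + \tilde{\omega}_{jm6}\right), \\
&\qquad\qquad \exp\left(\alpha_b^p + \eta_b \ln v_{jm5b} + \omega_j^p + \omega_{m5}^p + \tilde{\omega}_{jm5}\right)
+ \exp\left(\alpha_b^p + \eta_b \ln v_{jm6b} + \omega_j^p + \omega_{m6}^p + \tilde{\omega}_{jm6}\right)\Big]
\end{aligned}
\tag{B.6}
$$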
Which becomes:
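$$
\begin{aligned}
R_j &= \sum_{t=1}^{6}\exp\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right) \\
&\geq \\
R_j^{(56)} &= \sum_{t=1}^{4}\exp\left(\alpha_b^p + \eta_b \ln v_{jmtb} + \omega_j^p + \omega_{mt}^p + \omega_{jmt}\right) \\
&\quad + \max\Big[\exp\left(\alpha_b^p + \eta_b \ln v_{jm5b} + \omega_j^p + \omega_{m5}^p + \tilde{\omega}_{jm5}\right)
+ \exp\left(\alpha_b^p + \eta_b \ln v_{jm6b} + \omega_j^p + \omega_{m6}^p + \tilde{\omega}_{jm6}\right), \\
&\qquad\qquad \exp\left(\alpha_b^p + \eta_b \ln v_{jm5b} + \omega_j^p + \omega_{m5}^p + \tilde{\omega}_{jm5}\right)
+ \exp\left(\alpha_b^p + \eta_b \ln v_{jm6b} + \omega_j^p + \omega_{m6}^p + \tilde{\omega}_{jm6}\right)\Big]
\end{aligned}
\tag{B.7}
$$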
That is, we require the error term to be such that revenue from changing the broadcast in the 5th timeslot is less than what is observed, whether the
continuation strategy is to follow the observed broadcast in period 6 or to change broadcasts in period 6 along with the
change in period 5. From this logic, we find an upper bound $T_5$ for the error term $\tilde{\omega}_{jm5}$ and draw accordingly.
We proceed in this fashion until we arrive at the first broadcast period, at which point we have a bound 𝑇1 . We can then calculate
the density as
$$
f = \frac{\phi\left(\dfrac{\tilde{\omega}'^{v}_{jm1}}{\sigma^{p}_{jmt}}\right)}{\Phi\left(\dfrac{T_1}{\sigma^{p}_{jmt}}\right)}
$$
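Drawing an error term subject to an upper bound, and evaluating the ratio of normal density to normal CDF shown above, can be sketched with the inverse-CDF method. This is an assumption-laden illustration rather than the authors' routine: the bound, the scale, and the function names `draw_upper_truncated` and `density_ratio` are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def draw_upper_truncated(bound, sigma, rng):
    """Draw omega from N(0, sigma^2) conditional on omega <= bound."""
    p_max = STD_NORMAL.cdf(bound / sigma)
    u = rng.uniform(0.0, p_max)
    # Guard the endpoints, since inv_cdf is undefined at exactly 0 and 1.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return sigma * STD_NORMAL.inv_cdf(u)


def density_ratio(omega, bound, sigma):
    """phi(omega / sigma) / Phi(bound / sigma), mirroring the expression above."""
    return STD_NORMAL.pdf(omega / sigma) / STD_NORMAL.cdf(bound / sigma)


rng = random.Random(0)
bound, sigma = 0.5, 2.0  # hypothetical bound T and scale
draws = [draw_upper_truncated(bound, sigma, rng) for _ in range(1000)]
print(max(draws), density_ratio(draws[0], bound, sigma))
```

In the recursive scheme described in the text, the bound for each timeslot would itself depend on the draws already made for later timeslots. Here a single fixed bound stands in for that step.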
The large strategy space associated with station choice of program lineups makes it impractical to check all strategy profiles
for multiple equilibria. To reiterate the illustration in the text, with 4 stations making a choice of local news or entertainment in
each of 6 time periods, each station has 2^6 = 64 possible broadcast lineups, giving 64^4 = 16,777,216 potential strategy profiles across the
viewership market.
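The size of the strategy space in this illustration can be verified by direct enumeration. The sketch below assumes exactly two broadcast types per timeslot, six timeslots, and four stations, as in the example. `station_lineups` is an illustrative helper name.

```python
from itertools import product


def station_lineups(n_slots, choices=("news", "entertainment")):
    """All lineups for one station: one broadcast type per timeslot."""
    return list(product(choices, repeat=n_slots))


lineups = station_lineups(6)
n_stations = 4
n_profiles = len(lineups) ** n_stations  # each station picks a lineup independently
print(len(lineups), n_profiles)
```

Because the count grows exponentially in both the number of timeslots and the number of stations, exhaustive checking quickly becomes infeasible in larger markets.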
Because it is impractical to check all strategies, we instead follow an approximate procedure that resembles an adaptive Markov
chain Monte Carlo (MCMC) technique. For each station, we first pare down the strategy space to include only those strategies likely
to be viable alternatives. We can do this because in simulating prices and viewing shares some strategies generate significantly lower
revenue than what is observed. For example, broadcasting local news in all timeslots, or broadcasting local news at 7:30, generates
low revenues. We select the ten best strategies as potential alternatives. In the 4 station example above, this reduces the available
strategy space to 10^4 = 10,000. Although this is an improvement, it is still hard to check all possibilities, especially in large markets.
We instead select a set of stations randomly, perturb their strategies using choices from the likely set of strategies, then iterate to a
pure-strategy Nash equilibrium. Whenever this computed pure-strategy Nash equilibrium is different from the observed equilibrium,
we make note of it and add it to the count of equilibria. We then add a term to the likelihood that adjusts for multiplicity, which is
the inverse of the number of equilibria, $1/n(b)$.
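The perturb-and-iterate search can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' procedure: the payoff function is a hypothetical two-station game in which stations earn more when their broadcasts differ, the candidate sets are given rather than pared down from simulated revenues, and `best_response_iterate` and `count_equilibria` are illustrative names.

```python
import random


def best_response_iterate(profile, candidates, payoff, max_rounds=100):
    """Cycle through stations, letting each switch to a strictly better strategy."""
    profile = list(profile)
    for _ in range(max_rounds):
        changed = False
        for i, options in enumerate(candidates):
            def value(s, i=i):
                trial = profile[:i] + [s] + profile[i + 1:]
                return payoff(i, tuple(trial))
            best = max(options, key=value)
            if value(best) > value(profile[i]):
                profile[i] = best
                changed = True
        if not changed:
            return tuple(profile)  # no profitable unilateral deviation remains
    return None  # did not settle within max_rounds


def count_equilibria(observed, candidates, payoff, n_trials=200, seed=0):
    """Perturb random subsets of stations and record the equilibria reached."""
    rng = random.Random(seed)
    found = {tuple(observed)}  # the observed profile is taken to be an equilibrium
    for _ in range(n_trials):
        start = list(observed)
        n_perturbed = rng.randint(1, len(start))
        for i in rng.sample(range(len(start)), n_perturbed):
            start[i] = rng.choice(candidates[i])
        eq = best_response_iterate(start, candidates, payoff)
        if eq is not None:
            found.add(eq)
    return found


def toy_payoff(i, profile):
    """Hypothetical payoff: a station earns 1 if its rival airs a different type."""
    return 1.0 if profile[0] != profile[1] else 0.0


candidates = [["news", "ent"], ["news", "ent"]]
equilibria = count_equilibria(("news", "ent"), candidates, toy_payoff)
multiplicity_weight = 1.0 / len(equilibria)
print(sorted(equilibria), multiplicity_weight)
```

The search is approximate: it can only find equilibria that best-response dynamics reach from the sampled starting points, so the resulting count is a lower bound on the true number of pure-strategy equilibria within the candidate sets.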
This section explains how we apply a simulation-based estimation strategy to our problem in the style of Ackerberg-Keane-
Wolpin. A simulation approach is needed because it is computationally difficult and time consuming to simulate Nash equilibria of
the broadcast menu choice game, the error terms operate over a difficult-to-define region of integration, and it is costly to compute
counterfactual shares.
Following Ackerberg (2009), suppose the goal is to calculate a likelihood function that includes data 𝑥 and parameters 𝜃 and
which requires some integration of an unobserved variable 𝜖 over a region Γ. The variable 𝜖 could be, and often is, multidimensional.
Thus:
The function 𝐹 (𝑥, 𝜃, 𝜖) can be costly and/or time-consuming to calculate, but Ackerberg (2009) shows how the problem may be, in
certain circumstances, recast. First, if it is possible to form an index 𝑢(𝑥, 𝜃, 𝜖), 𝐿 can be rewritten:
The function 𝑢 is often a linear index function of the form 𝑢 = 𝑥𝛽 + 𝜖 , where 𝛽 ∈ 𝜃 . Then, introducing a change of variables:
$$
L = \int_{u \in \Gamma'} F(u)\, h(u, \theta, x)\, du
$$
If there is a way to simulate values of 𝑢 that does not depend upon 𝜃 , then an importance sampler can be introduced as follows:
$$
L = \int_{u \in \Gamma'} F(u)\, h(u, \theta, x)\, \frac{g(u, x)}{g(u, x)}\, du
$$
Values of 𝑢 can be drawn from 𝑔(𝑢), which in practice often derives from some approximate, yet reliable and easy-to-estimate model,
so 𝑔(𝑢, 𝑥) is in fact 𝑔(𝑢, 𝑥, 𝜃0 ). Then, an approximation of 𝐿 is obtained using 𝑆 simulated values of 𝑢:
$$
L \approx \frac{1}{S}\sum_{s=1}^{S}\frac{F(u_s)}{g(u_s, x)}\, h(u_s, \theta, x)
$$
The convenience of this last expression lies in the fact that the problem now relies on calculating ℎ(𝑢𝑠 , 𝜃, 𝑥) instead of repeated
calculation of 𝐹 (𝑥, 𝜃, 𝜖). Essentially, simulated observations are re-weighted by the estimation procedure.
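The re-weighting idea can be illustrated with a deliberately simple example. The sketch assumes a linear index $u = x\beta + \epsilon$ with standard normal $\epsilon$, so that $h$ is a normal density centered at $x\beta$ and $g$ is the same density evaluated at an initial guess $\beta_0$. The costly function $F$ is replaced by a cheap indicator, and all names and values are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_once(x, beta0, n_draws, expensive_f, seed=0):
    """Draw u from the proposal g and evaluate F a single time per draw."""
    rng = random.Random(seed)
    proposal = NormalDist(mu=x * beta0, sigma=1.0)
    draws = [rng.gauss(x * beta0, 1.0) for _ in range(n_draws)]
    f_vals = [expensive_f(u) for u in draws]     # costly step, done only once
    g_vals = [proposal.pdf(u) for u in draws]
    return draws, f_vals, g_vals


def likelihood(beta, x, draws, f_vals, g_vals):
    """Approximate L(beta) by re-weighting the stored draws with h / g."""
    target = NormalDist(mu=x * beta, sigma=1.0)  # h(u, theta, x)
    total = sum(f * target.pdf(u) / g for u, f, g in zip(draws, f_vals, g_vals))
    return total / len(draws)


def indicator(u):
    """Stand-in for F: the event that the index is positive."""
    return 1.0 if u > 0.0 else 0.0


x = 1.0
draws, f_vals, g_vals = simulate_once(x, beta0=0.3, n_draws=20000,
                                      expensive_f=indicator)
estimate = likelihood(0.5, x, draws, f_vals, g_vals)
exact = NormalDist().cdf(0.5)  # Prob(x * beta + eps > 0) when beta = 0.5
print(estimate, exact)
```

Evaluating the likelihood at a new parameter value only requires recomputing the weights, so the stored values of $F$ are reused across every parameter value the estimation routine visits.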
In our problem, we seek to integrate over the unobserved features of the problem, which include unobserved viewership error
terms and unobserved advertising price error terms. Calculating counterfactual viewership and prices from unobserved choices is
time consuming, as is checking the coherence of the results (i.e., checking for multiple pure-strategy Nash equilibria). Accordingly,
we apply the Ackerberg (2009) procedure by first simulating viewership for a set of draws. We then simulate prices so that observed
prices constitute the revenue-maximizing prices. Given these conditions, we check for existence of alternative pure-strategy Nash
equilibria.
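Specifically, we first note that viewership can be written as:
$$
v_{jmtb} = M_m \,
\frac{e^{\frac{\alpha_b^v + X_{jmt}^v\beta_v + \omega_j^v + \omega_{mt}^v + \epsilon_{jmt}}{1-\mu_b}}}
{\sum_{k \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \omega_k^v + \omega_{mt}^v + \epsilon_{kmt}}{1-\mu_b}}}
\cdot
\frac{\left(\sum_{k \in b} e^{\frac{\alpha_b^v + X_{kmt}^v\beta_v + \omega_k^v + \omega_{mt}^v + \epsilon_{kmt}}{1-\mu_b}}\right)^{1-\mu_b}}
{1 + \sum_{b'}\left(\sum_{k \in b'} e^{\frac{\alpha_{b'}^v + X_{kmt}^v\beta_v + \omega_k^v + \omega_{mt}^v + \epsilon_{kmt}}{1-\mu_{b'}}}\right)^{1-\mu_{b'}}}
\tag{D.1}
$$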
The nuance here is that we have a set of bounds, as detailed previously, that the advertising price revenues have to satisfy. These
give bounds $T_{js1}, T_{js2}, \ldots, T_{js6}$, so our values of $u^p$ belong to a truncated normal density. Hence, we have:
The advantages of MCMC estimation in circumstances in which the likelihood is difficult to maximize numerically arise because
MCMC estimation relies on drawing from the distribution of parameters implied by the data. While this technique is commonly
associated with Bayesian estimation methods, Chernozhukov and Hong (2003) show how the approach can be applied to classical
estimation. Practical aspects of implementation and examples are detailed in Baker (2014).
The likelihood to be maximized in this paper, with components (22), (24), (25), and (28), results in an expression for the log-likelihood
contribution of a market to the overall likelihood as:
Typically, 𝜃 would be chosen to maximize 𝐿(𝐷|𝜃). MCMC estimation instead starts with Bayes’ law:
$$
P(D|\theta) = \frac{P(\theta|D)\,P(D)}{P(\theta)}
$$
If there are no prior opinions about parameters, the procedure works from the relationship between the distribution of parameters
and the data implied by the likelihood:
1. Given 𝜃𝑡−1 , draw a proposed value 𝜃𝑡′ from a proposal distribution 𝜙(𝜃𝑡 ).
2. Compute $a = \min\left[\dfrac{P(\theta_t'|D)}{P(\theta_{t-1}|D)}, 1\right]$. Set $\theta_t = \theta_t'$ with probability $a$ and set $\theta_t = \theta_{t-1}$ with probability $1 - a$.
The algorithm takes on its MCMC character because $\phi(\theta_t) = \phi(\theta_t|\theta_{t-1})$; each draw depends upon the preceding draw.
There are some practical issues that one must address in implementation. A proposal distribution is needed to generate values
for 𝜃𝑡′ ; and the length of the burn-in period must be determined (that is, how many draws have to be taken to ensure the algorithm
has settled down to cover the distribution well). Methods for dealing with these matters are discussed in Baker (2014), while general
concepts are discussed in Chernozhukov and Hong (2003). The advantages of the estimation method are clear, however, in this
algorithm. If the likelihood can be computed, then one can draw parameters from the implied distribution of parameters and obtain
implied standard errors as part of estimation. Further, the MCMC approach sidesteps the issue caused by the relationship between
likelihood and parameter distribution being proportional and not equal by using the likelihood in ratio form.
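The two-step algorithm can be sketched as a random-walk Metropolis sampler. This is a minimal illustration and not the adaptive routine of Baker (2014): the proposal is a fixed-scale normal centered at the previous draw, the likelihood is a hypothetical normal-mean problem, and the step size, chain length, and burn-in length are arbitrary choices.

```python
import math
import random


def metropolis_hastings(log_likelihood, theta0, n_draws, step, seed=0):
    """Random-walk sampler that uses the likelihood only in ratio form."""
    rng = random.Random(seed)
    theta = theta0
    current = log_likelihood(theta)
    chain = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)  # draw depends on the previous value
        candidate = log_likelihood(proposal)
        log_ratio = candidate - current          # log of the acceptance ratio
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate  # accept with probability a
        chain.append(theta)                       # otherwise keep the old value
    return chain


# Hypothetical data: 50 draws from a normal with unknown mean and unit variance.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(50)]


def log_lik(theta):
    return -0.5 * sum((x - theta) ** 2 for x in data)


chain = metropolis_hastings(log_lik, theta0=0.0, n_draws=5000, step=0.3)
kept = chain[1000:]  # discard an assumed burn-in of 1000 draws
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((t - post_mean) ** 2 for t in kept) / (len(kept) - 1))
print(post_mean, post_sd)
```

Working with the difference of log-likelihoods is equivalent to the ratio in step 2 and avoids numerical underflow. The spread of the retained draws gives the implied standard error directly.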
References
Ackerberg, D., 2009. A new use of importance sampling to reduce computational burden in simulation estimation. Quant. Mark. Econ. 7, 343–376. https://link.
springer.com/article/10.1007/s11129-009-9074-z.
Ackerberg, D., Rysman, M., 2005. Unobserved product differentiation in discrete-choice models: estimating price elasticities and welfare effects. Rand J. Econ. 36,
771–788. https://www.jstor.org/stable/4135256.
Anderson, S.P., Gabszewicz, J.J., 2006. The media and advertising: a tale of two-sided markets. In: Ginsburgh, V., Throsby, D. (Eds.), Handbook of the Economics of
Art and Culture, vol. 1. Elsevier, Amsterdam, pp. 567–614. Chapter 18.
Anderson, S.P., Goeree, J.K., Ramer, R., 1997. Location, location, location. J. Econ. Theory 77, 102–127. https://doi.org/10.1006/jeth.1997.2323.
Aradillas-López, A., 2020. The econometrics of static games. Annu. Rev. Econ. 12, 135–165. https://doi.org/10.1146/annurev-economics-081919-113720.
Bajari, P., Hong, H., Krainer, J., Nekipelov, D., 2010a. Estimating static models of strategic interactions. J. Bus. Econ. Stat. 28. https://www.jstor.org/stable/20750856.
Bajari, P., Hong, H., Ryan, S.P., 2010b. Identification and estimation of a discrete game of complete information. Econometrica 78, 1529–1568. https://www.jstor.
org/stable/40928965.
Baker, M.J., 2014. Adaptive Markov chain Monte Carlo sampling and estimation in Mata. Stata J. 14, 623–661. http://www.stata-journal.com/article.html?article=st0354.
Baltagi, B.H., 2008. Econometric Analysis of Panel Data. Wiley, Chichester, UK.
Beebe, J.H., 1977. Institutional structure and program choices in television markets. Q. J. Econ. 91, 15–37.
Berry, S., 1994. Estimating discrete-choice models of product differentiation. Rand J. Econ. 25, 242–262. https://www.jstor.org/stable/2555829.
Berry, S., Eizenberg, A., Waldfogel, J., 2015. Optimal product variety in radio markets. Rand J. Econ. 47, 463–497. https://onlinelibrary.wiley.com/doi/abs/10.1111/
1756-2171.12134.
Berry, S., Reiss, P., 2007. Empirical models of entry and market structure. In: Armstrong, M., Porter, R. (Eds.), Handbook of Industrial Organization, vol. 3. Elsevier,
Amsterdam, pp. 1845–1886. Chapter 29.
Berry, S., Waldfogel, J., 1999. Free entry and social inefficiency in radio broadcasting. Rand J. Econ. 30, 397–420. https://www.jstor.org/stable/2556055.
Bjorn, P.A., Vuong, Q.H., 1984. Simultaneous Equations Models for Dummy Endogenous Variables: A Game Theoretic Formulation with an Application to Labor Force
Participation. Working Papers 537. California Institute of Technology, Division of the Humanities and Social Sciences. https://ideas.repec.org/p/clt/sswopa/537.
html.
Bresnahan, T., Reiss, P.C., 1991. Entry and competition in concentrated markets. J. Polit. Econ. 99, 977–1009. http://EconPapers.repec.org/RePEc:ucp:jpolec:v:99:y:
1991:i:5:p:977-1009.
Cardell, N.S., 1997. Variance components structures for the extreme-value and logistic distributions with application to models of heterogeneity. Econom. Theory 13,
185–213. https://www.jstor.org/stable/3532724.
Pew Research Center, 2023. TV News Fact Sheet. Technical Report. Pew Research Center.
Chamberlin, E., 1933. The Theory of Monopolistic Competition. Harvard Economic Studies, vol. XXXVIII. Harvard University Press. https://books.google.com/books?
id=UOdKdA6VvboC.
Chernozhukov, V., Hong, H., 2003. An MCMC approach to classical estimation. J. Econom. 115, 293–346. https://doi.org/10.1016/S0304-4076(03)00100-3.
Chib, S., Greenberg, E., 1995. Understanding the Metropolis-Hastings algorithm. Am. Stat. 49, 327–335. https://doi.org/10.2307/2684568.
Ciliberto, F., Tamer, E., 2009. Market structure and multiple equilibria in airline markets. Econometrica 77, 1791–1828. https://doi.org/10.3982/ECTA5368.
Crawford, G.S., Cullen, J., 2007. Bundling, product choice, and efficiency: should cable television networks be offered à la carte? Inf. Econ. Policy 19, 379–404.
https://doi.org/10.1016/j.infoecopol.2007.06.005.
Crawford, G.S., Lee, R.S., Whinston, M.D., Yurukoglu, A., 2018. The welfare effects of vertical integration in multichannel television markets. Econometrica 86,
891–954. https://doi.org/10.3982/ECTA14031. https://onlinelibrary.wiley.com/doi/abs/10.3982/ECTA14031.
d’Aspremont, C., Gabszewicz, J.J., Thisse, J.F., 1979. On Hotelling’s ‘stability in competition’. Econometrica 47, 1145–1150. https://ideas.repec.org/a/ecm/emetrp/
v47y1979i5p1145-50.html.
Dixit, A.K., Stiglitz, J.E., 1977. Monopolistic competition and optimum product diversity. Am. Econ. Rev. 67, 297–308. https://ideas.repec.org/a/aea/aecrev/
v67y1977i3p297-308.html.
Downs, A., 1957. An Economic Theory of Democracy. Harper and Row, New York. https://books.google.com/books?id=kLEGAAAAMAAJ.
Ellickson, P.B., Misra, S., 2008. Supermarket pricing strategies. Mark. Sci. 27, 811–828. https://doi.org/10.1287/mksc.1080.0398.
Ellickson, P.B., Misra, S., 2011a. Enriching interactions: incorporating outcome data into static discrete games. Quant. Mark. Econ. 10, 1–26. https://link.springer.
com/article/10.1007/s11129-011-9112-5.
Ellickson, P.B., Misra, S., 2011b. Estimating discrete games. Mark. Sci. 30, 997–1010. https://www.jstor.org/stable/41408414.
Esteves-Sorenson, C., Perretti, F., 2012. Micro-costs: inertia in television viewing. Econ. J. 122, 867–902. http://EconPapers.repec.org/RePEc:ecj:econjl:v:122:y:2012:
i:563:p:867-902.
Fan, Y., 2013. Ownership consolidation and product characteristics: a study of the US daily newspaper market. Am. Econ. Rev. 103, 1598–1628. https://www.aeaweb.
org/articles?id=10.1257/aer.103.5.1598.
Fan, Y., Yang, C., 2020. Competition, product proliferation, and welfare: a study of the US smartphone market. Am. Econ. J. Microecon. 12, 99–134. https://
doi.org/10.1257/mic.20180182. https://www.aeaweb.org/articles?id=10.1257/mic.20180182.
Filistrucchi, L., Klein, T.J., Michielsen, T.O., 2012. Assessing unilateral merger effects in a two-sided market: an application to the Dutch daily newspaper market. J.
Compet. Law Econ. http://jcle.oxfordjournals.org/content/early/2012/05/24/joclec.nhs012.abstract.
Furnham, A., Gunter, B., Richardson, F., 2002. Effects of product–program congruity and viewer involvement on memory for televised advertisements. J. Appl. Soc.
Psychol. 32, 124–141. https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1559-1816.2002.tb01423.x.
George, L., Hogendorn, C., 2012. Aggregators, search and the economics of new media institutions. Inf. Econ. Policy 24, 40–51. https://doi.org/10.1016/j.infoecopol.
2012.01.005.
George, L.M., 2008. The Internet and the market for daily newspapers. B.E. J. Econ. Anal. Policy 8, 1–33. https://doi.org/10.2202/1935-1682.1944.
George, L.M., Oberholzer-Gee, F., 2010. Diversity in Local Television News. Technical Report. Federal Communications Commission. https://www.fcc.gov/general/
2010-media-ownership-studies.
George, L.M., Waldfogel, J., 2003. Who affects whom in daily newspaper markets? J. Polit. Econ. 111, 765–784. https://www.journals.uchicago.edu/doi/10.1086/
375380.
George, L.M., Waldfogel, J., 2006. The New York Times and the market for local newspapers. Am. Econ. Rev. 96, 435–447. https://www.aeaweb.org/articles?id=10.1257/000282806776157551.
Goldfarb, A., Tucker, C.E., 2011. Privacy regulation and online advertising. Manag. Sci. 57, 57–71. https://doi.org/10.1287/mnsc.1100.1246.
Gross, L., Gross, B., Perebinossoff, P., 2005. Programming for TV, Radio & the Internet: Strategy, Development & Evaluation. Routledge, New York.
Hotelling, H., 1929. Stability in competition. Econ. J. 39, 41–57. http://www.jstor.org/stable/2224214.
Jeziorski, P., 2014. Effects of mergers in two-sided markets: the US radio industry. Am. Econ. J. Microecon. 6, 35–73. https://www.aeaweb.org/articles?id=10.1257/
mic.6.4.35.
Kooreman, P., 1994. Estimation of some models of discrete games. J. Appl. Econom. 9, 255–268. https://www.jstor.org/stable/2285272.
Makuch, K., Levy, J., 2021. Market Size and Local Television News. OEA Working Paper 52. Technical Report. Federal Communications Commission.
McFadden, D., 1980. Econometric models of probabilistic choice. In: McFadden, D., Manski, C. (Eds.), Structural Analysis of Discrete Data. MIT Press, Cambridge,
Mass., pp. 198–272.
Moshary, S., Shapiro, B.T., Song, J., 2021. How and when to use the political cycle to identify advertising effects. Mark. Sci. 40, 283–304. https://doi.org/10.1287/
mksc.2020.1258.
Oberholzer-Gee, F., Waldfogel, J., 2009. Media markets and localism: does local news en español boost Hispanic voter turnout? Am. Econ. Rev. 99, 2120–2128.
http://www.aeaweb.org/articles.php?doi=10.1257/aer.99.5.2120.
de Paula, Á., 2013. Econometric analysis of games with multiple equilibria. Annu. Rev. Econ. 5, 107–131. https://doi.org/10.1146/annurev-economics-081612-
185944.
Reiss, P.C., Spiller, P.T., 1989. Competition and entry in small airline markets. J. Law Econ., S179–S202. https://www.jstor.org/stable/725595.
Seim, K., Waldfogel, J., 2013. Public monopoly and economic efficiency: evidence from the Pennsylvania liquor control board’s entry decisions. Am. Econ. Rev. 103,
831–862. https://www.jstor.org/stable/23469684.
Spence, A., 1976a. Product differentiation and welfare. Am. Econ. Rev. 66, 407–414. http://EconPapers.repec.org/RePEc:aea:aecrev:v:66:y:1976:i:2:p:407-14.
Spence, A., 1976b. Product selection, fixed costs, and monopolistic competition. Rev. Econ. Stud. 43, 217–235. http://EconPapers.repec.org/RePEc:bla:restud:v:43:y:
1976:i:2:p:217-35.
Steiner, P.O., 1952. Program patterns and preferences, and the workability of competition in radio broadcasting. Q. J. Econ. 66, 194–223. https://doi.org/10.2307/
1882942. http://qje.oxfordjournals.org/content/66/2/194.abstract.
Sutton, J., 1991. Sunk Cost and Market Structure. MIT Press, Cambridge, MA.
Waldfogel, J., 2003. Preference externalities: an empirical study of who benefits whom in differentiated-product markets. Rand J. Econ. 34, 557–568. https://
ideas.repec.org/a/rje/randje/v34y2003i3p557-68.html.
Waldfogel, J., Holmes, T.J., Noll, R.G., 2004. Who benefits whom in local television markets? [with comments]. Brookings-Wharton Pap. Urban Aff. , 257–305.
https://muse.jhu.edu/article/170977.
Wilbur, K.C., 2008. A two-sided, empirical model of television advertising and viewing markets. Mark. Sci. 27, 356–378. https://www.jstor.org/stable/40057141.
Zhu, T., Singh, V., Manuszak, M.D., 2009. Market structure and competition in the retail discount industry. J. Mark. Res. 46, 453–466. https://www.jstor.org/stable/
20618908.