The Cross-Section of Non-Professional Analyst Skill

Michael Farrell, Russell Jame, and Tian Qiu*
August 2020
Abstract
We examine the cross-section of skill among non-professional analysts (NPAs) on Seeking Alpha,
a prominent crowdsourced investment research platform. We estimate that 60% of NPAs are
skilled, and we document substantial dispersion in skill. Even after accounting for bid-ask spreads
and allowing for a three-day investment delay, following NPAs in the top quintile of past skill
earns annualized abnormal returns of 10%. In contrast, an unconditional strategy that follows all
NPAs earns insignificant returns. An examination of retail and institutional order imbalances
following NPA recommendations suggests that neither group recognizes the sizeable differences
in ability across NPAs.
JEL: G14
Keywords: Social Media, Investment Research, Performance Persistence, Trading Strategy
* Farrell is from the Darden School of Business, University of Virginia, [email protected]. Jame and Qiu
are from the Gatton College of Business and Economics, University of Kentucky, [email protected],
[email protected].
Electronic copy available at: https://ssrn.com/abstract=3682490

1. Introduction
Over the last decade there has been a proliferation of non-professional analysts sharing
investment research on social media. According to a recent survey, nearly one in three affluent
investors in the United States now rely on social media to inform their investment decisions.1 One
particularly prominent source of non-professional investment research is Seeking Alpha, which
disseminates more than 7,000 research reports per month and attracts roughly 15 million monthly
visitors.2 Investors’ willingness to embrace non-professional research on Seeking Alpha appears to
be well-justified. For example, Chen, De, Hu, and Hwang (2014) find that the tone of Seeking
Alpha research forecasts stock returns, and Farrell, Green, Jame, and Markov (2020) find that
Seeking Alpha research facilitates more informative trading among retail investors.
Despite the increasing importance of Seeking Alpha, there has been virtually no research
on the cross-section of skill across the thousands of Seeking Alpha contributors. For example,
what fraction of contributors possess true stock picking skill (i.e., α > 0)? How large is the
dispersion of skill among contributors? Can investors use past performance to identify skilled
contributors? If so, how much value does this create for investors? Are there other contributor
characteristics, apart from past skill, that are associated with superior performance?
This paper answers these questions by conducting the first comprehensive analysis of the
cross-section of skill among non-professional analysts writing research reports on the Seeking
Alpha platform (hereafter: NPAs or contributors). Similar to Crane and Crotty (2020), we measure
analyst skill as the hypothetical abnormal returns that an investor would earn by following the
recommendation of an NPA for a fixed holding period. We classify reports into buy or sell
1 http://www.experiencetheblog.com/2013/04/four-recent-studies-on-rapid-adoption.html.
2 https://static.seekingalpha.com/uploads/pdf_income/sa_media_kit_04_2019_generic.pdf

recommendations using either 1) the author’s disclosed position (Campbell, DeAngelis, and Moon,
2019) or 2) the sentiment of the report (Chen et al., 2014), and we measure abnormal returns
(six-factor alphas) over holding periods of either five or 63 trading days. To the extent that markets
efficiently incorporate the content of Seeking Alpha reports, focusing on a relatively short horizon
(e.g., the five-day window) offers a more powerful test of skill. However, much of the five-day
return may be difficult for the typical investor to realize given natural delays in processing
investment research and the substantial transaction costs associated with a high-turnover strategy.
Further, a 63-day window allows for the possibility that the market may overreact or underreact to
NPA research.
As a starting point, we model NPA performance as a mixture of multiple skill distributions.
This approach uses information from the cross-section of NPA skill to reduce noise and control
for false discoveries (see, e.g., Barras, Scaillet, and Wermers, 2010; Chen, Cliff, and Zhao, 2017;
and Crane and Crotty, 2020). We find that a substantial fraction of NPAs are skilled. For example,
using a 63-day (5-day) horizon, we find that roughly 60% (71%) of NPAs have positive abnormal
returns. We also document sizeable dispersion in skill among NPAs (σ). Specifically, we estimate
a σ of 2.48% over the 63-day holding period, or 10.15% annualized. As a reference, this estimate
is roughly eight times greater than the typical estimate of σ for mutual funds.3
The significant heterogeneity in NPA skill suggests that investors can improve their
performance by limiting their attention to NPAs that have historically issued more informative
research. Indeed, we show that conditioning on past alpha, defined as the average six-factor alpha
across a contributor’s past 10 investment recommendations, yields economically large benefits to
3 For example, using bootstrap simulations, Fama and French (2010) estimate a true annualized σ of roughly 1.25%.
Similarly, the noise-reduced alpha model of Harvey and Liu (2018) yields an estimate of 1.19%.

investors. For example, a strategy that only follows reports written by contributors in the top
quintile of past alpha earns abnormal returns of 1.78% per month, which is more than double the
0.79% monthly abnormal return associated with the unconditional strategy of following all
contributors.
The significant outperformance of contributors in the top quintile of past alpha is robust to
different risk-adjustments, different measures of past performance, and different holding periods.
In addition, the findings continue to hold (with similar magnitudes) after excluding SA research
reports that coincide with other major information events including earnings announcements, sell-
side research reports, and media coverage. This finding suggests the skill of the top performing
NPAs extends beyond merely “piggybacking” off of major information releases.
Having established significant and persistent differences in NPA skill, we next explore
whether investors can profit from following the top NPAs after incorporating bid-ask spreads and
allowing for reasonable delays in information processing. We develop an implementable trading
strategy where investors buy (sell) at the ask (bid) price and subsequently sell (buy) at the bid (ask)
price at the end of the holding period. In addition, we relax the assumption that investors trade on
NPA research instantaneously by imposing investment delays of 24 to 72 hours following the
release of the report. We find that even after incorporating bid-ask spreads and a 72-hour
investment delay, a trading strategy following NPAs in the top quintile of past alpha yields a
statistically significant monthly return of 0.84% or roughly 10.00% annualized. In contrast, the
analogous strategy that follows all NPAs earns a statistically insignificant -0.09% per month.
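
As a minimal sketch of the spread adjustment described above, the round-trip return of a single position can be computed from quoted bid and ask prices. The function name and the quotes below are hypothetical and not taken from the paper; in the delayed strategies, the entry quotes would be those prevailing 24 to 72 hours after the report is released.

```python
def round_trip_return(direction, entry_bid, entry_ask, exit_bid, exit_ask):
    """Return on one position after paying the bid-ask spread on both legs.

    direction: +1 for a buy recommendation, -1 for a sell recommendation.
    """
    if direction == 1:
        # Buy at the ask, later sell at the bid.
        return exit_bid / entry_ask - 1.0
    if direction == -1:
        # Sell short at the bid, later buy back at the ask.
        return 1.0 - exit_ask / entry_bid
    raise ValueError("direction must be +1 or -1")


# Hypothetical quotes: the midpoint rises from 100.00 to 102.00
# and the quoted spread is 0.10 at both dates.
long_net = round_trip_return(+1, 99.95, 100.05, 101.95, 102.05)
short_net = round_trip_return(-1, 99.95, 100.05, 101.95, 102.05)
long_mid = 102.00 / 100.00 - 1.0  # midpoint-to-midpoint return, no costs
```

In this illustration the long position earns less than the midpoint return and the short position loses more, which is the sense in which spreads erode the profits of a high-turnover strategy.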
We also explore whether other contributor characteristics, apart from past alpha, are
associated with more informative research. We find that NPAs who are more active in commenting
on other research reports issue more impactful research. Reports are also more informative when

they are authored by more specialized contributors, as measured by an NPA’s tendency to either
write about similar topics across all reports (Across-Report Focus) or discuss a small number of
topics deeply within a report (Within-Report Focus). However, the economic importance of these
contributor characteristics is more modest than that of past alpha. For example, a one-standard-deviation
increase in Across-Report Focus (Within-Report Focus) is associated with 0.23% (0.22%) higher
returns over the subsequent 63 days, whereas the corresponding increase for past alpha is 1.21%.
Finally, to more directly examine how investors respond to differences in skill across
contributors, we examine retail and institutional order imbalances following NPA
recommendations. While both retail and institutional order imbalances exhibit a significant
correlation with NPA recommendations, we find little evidence that this correlation varies
systematically with measures of NPA quality, including past alpha. This finding is perhaps
surprising given retail investors’ tendency to chase mutual fund managers with superior past
performance (e.g., Ippolito, 1992; and Sirri and Tufano, 1998). One potentially important
difference between NPAs and mutual funds is that past performance for NPAs is not disclosed,
making it a far less salient attribute. The fact that neither retail nor institutional investors recognize
the sizeable differences in skill across NPAs helps explain why following the recommendations of
the top NPAs remains a profitable investment strategy even after incorporating trading delays of
up to three days.
Our analysis relates to the literature that explores whether investors can profit from sell-side
(i.e., professional) analysts. Barber, Lehavy, McNichols, and Trueman (2001) find that the
investment recommendations of the average sell-side analyst are valuable, and Mikhail, Walther, and
Willis (2004) document significant persistence in sell-side analyst skill; however, both conclude
that incorporating transaction costs and small investment delays eliminates all potential trading

gains. The stronger trading profits associated with following the best NPAs are at least partially
attributable to the market being far less efficient in recognizing differences in skill among NPAs.
Our findings also differ dramatically from the literature on mutual fund performance,
which concludes that at best, a very small fraction of managers are skilled enough to outperform
the market after expenses (see, e.g., Fama and French, 2010; and Harvey and Liu, 2018).4 The
stark contrast between mutual funds and NPAs is likely attributable to the fact that NPAs are free of
several of the constraints that erode mutual fund performance, including fund expenses, decreasing
returns to scale (Berk and Green, 2004; Pastor, Stambaugh, and Taylor, 2015), and
liquidity-motivated trading (Edelen, 1999; Alexander, Cici, and Gibson, 2007). This contrast also raises the
interesting possibility that NPAs will begin to offer more mutual-fund-like services using
an organizational structure that is free from many of the limitations that diminish mutual fund
performance. For example, Seeking Alpha recently launched SA Marketplace which allows
individuals access to suggested portfolios, exclusive investment research, and private chat-room
communities for a fixed monthly fee.
Finally, our paper relates to the literature that explores the consequences of social media
for financial markets. This literature has focused primarily on whether the investment
recommendations across different social media sites contain value (see, e.g., Chen et al., 2014;
Jame, Johnston, Markov, and Wolfe, 2016; Avery, Chevalier, and Zeckhauser, 2016; and Bartov,
Faurel, and Mohanram, 2018). We extend this literature by providing a more complete picture of
the distribution of contributor skill. We believe our findings will be of natural interest to retail
investors, who tend to have limited access to sell-side research, and thus rely more heavily on non-
4 There is, however, evidence that investors can outperform by investing in the best hedge funds (see, e.g., Kosowski,
Naik, and Teo, 2007; Jagannathan, Malakhov, and Novikov, 2010; and Chen, Cliff, and Zhao, 2017).

professional investment research (Farrell et al., 2020). We also expect that our findings will
become increasingly relevant for institutional investors. In particular, the recent passage of MiFID
II in Europe now requires institutional investors to pay directly for sell-side research. Initial
evidence suggests that the price of sell-side research can be quite high, making Seeking Alpha a
relatively more valuable source of investment research, particularly among smaller institutions
with more limited research budgets.5 Lastly, our results should be of interest to regulators who
have repeatedly expressed concerns about investors relying on social media for investment advice
(e.g., SEC 2011, 2015, 2017). Our findings suggest that requiring more disclosure of contributors’
past investment recommendations or even distributing educational materials that encourage
investors to track historical performance could be beneficial to investors.
2. The Seeking Alpha Sample
2.1. Data and variable construction
We collect research reports of non-professional analysts (NPAs) from Seeking Alpha (SA),
a prominent social media website that crowdsources investment research. As of 2018, more than
7,000 NPAs publish 10,000 investing ideas every month for more than 15 million unique visitors.6
NPA reports are intended to provide thorough investment analysis and research to support their
opinions, and all reports undergo an editorial review.
We obtain all reports that were published between 2005 and 2017 on the SA website. For
each report, we collect the date and time of the report publication, the complete text of the report,
the ownership disclosures of the contributor (i.e., information on whether the author has any
5 The consequences of MiFID II have been discussed extensively in the media, including on Seeking Alpha (see,
e.g., https://seekingalpha.com/report/4102922-im-going-consume-sell-side-research-foreseeable-future).
6 https://seekingalpha.com/page/about_us

position in the stocks she is discussing), and the ticker (or tickers) associated with each report.
Following Chen et al. (2014), we limit the sample to reports associated with a single ticker. We
further limit the sample to common stocks (CRSP share codes 10 and 11) with available data in
both NYSE TAQ and the CRSP-Compustat merged database. Our resulting sample includes
192,398 reports by 9,130 unique contributors for 5,080 firms.
We classify each report as positive, negative, or neutral using a two-step procedure.
First, following Campbell, DeAngelis, and Moon (2019), we classify all reports in which an NPA
discloses a long (short) position in the stock as positive (negative). For all remaining reports, we
follow Chen et al. (2014) and compute the tone of the report as the percentage of negative words
in the report (Percent Negative), where the negative word list is taken from Loughran and
McDonald (2011). We assign reports in the bottom (top) tercile of Percent Negative relative to the
distribution of report tone on the previous day as positive (negative). Overall, we classify roughly
45% of reports as positive, 30% of reports as negative, and the remaining 25% of reports as
neutral.7 Neutral reports are excluded from the sample.
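
The two-step procedure can be sketched as follows. This is an illustration rather than the authors' code: the function names, the tokenization rule, the handling of ties at the tercile breakpoints, and the tiny word list are all assumptions, and the actual negative word list is that of Loughran and McDonald (2011).

```python
import re


def percent_negative(text, negative_words):
    """Share of words in `text` that appear in `negative_words`."""
    # Assumed tokenization: lowercase runs of letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok in negative_words) / len(tokens)


def classify_report(disclosed_position, pct_negative, low_cut, high_cut):
    """Return 'positive', 'negative', or 'neutral' for one report.

    disclosed_position: 'long', 'short', or None if nothing is disclosed.
    low_cut, high_cut: tercile breakpoints of Percent Negative taken from
        the reference distribution (supplied by the caller here).
    """
    # Step 1: a disclosed position determines the label.
    if disclosed_position == "long":
        return "positive"
    if disclosed_position == "short":
        return "negative"
    # Step 2: otherwise fall back on the tone of the report.
    if pct_negative <= low_cut:
        return "positive"
    if pct_negative >= high_cut:
        return "negative"
    return "neutral"


# Hypothetical one-word list and breakpoints, for illustration only.
tone = percent_negative("Revenue growth offset a loss", {"loss"})
label = classify_report(None, tone, low_cut=0.01, high_cut=0.03)
```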
Following Crane and Crotty (2020), we define the estimated abnormal return ($\hat{\alpha}$) associated
with each report k issued by NPA i at time t as:
$$\hat{\alpha}_{ikt} = D_{ik}\left[\prod_{t=0}^{x}(1 + r_{kt}) - \prod_{t=0}^{x}(1 + r_{bt})\right], \qquad (1)$$
where $D_{ik}$ is equal to 1 for positive reports and -1 for negative reports, $r_{kt}$ is the return on the stock
discussed in report k on day t, and $r_{bt}$ is the return on a benchmark. We measure returns starting from
day 0. For reports issued outside of trading hours, the day 0 return is the CRSP return for the trading
day following the recommendation. For reports issued during trading hours, we obtain the prevailing
7 The tilt towards positive reports is attributable to the fact that contributors are far more likely to disclose long
positions relative to short positions.

midpoint quote from TAQ for the stock at the time the report is published on Seeking Alpha, and we
calculate the day 0 return from the quoted midpoint to the closing price.
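
A minimal sketch of Equation (1), assuming the daily stock and benchmark returns for days 0 through x have already been assembled (including the day 0 return built from the quoted midpoint for intraday reports). The function name and the sample returns are illustrative only.

```python
def report_alpha(direction, stock_returns, benchmark_returns):
    """Signed buy-and-hold abnormal return for one report.

    direction: +1 for a positive report, -1 for a negative report.
    stock_returns, benchmark_returns: daily returns for days 0..x.
    """
    if len(stock_returns) != len(benchmark_returns):
        raise ValueError("return series must cover the same days")
    stock_growth = 1.0
    bench_growth = 1.0
    for r_stock, r_bench in zip(stock_returns, benchmark_returns):
        stock_growth *= 1.0 + r_stock
        bench_growth *= 1.0 + r_bench
    # Difference of compounded returns, signed by the recommendation.
    return direction * (stock_growth - bench_growth)


# Hypothetical two-day example.
alpha_buy = report_alpha(+1, [0.01, 0.02], [0.00, 0.01])
alpha_sell = report_alpha(-1, [0.01, 0.02], [0.00, 0.01])
```

Because the sign enters multiplicatively, a sell recommendation on a stock that subsequently outperforms its benchmark receives a negative abnormal return.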
We measure the cumulative returns from day 0 through day x, where we set x equal to either five
or 63 trading days. The five-day horizon allows us to benchmark our findings for NPAs to Crane and
Crotty (2020) who consider a five-day horizon in their analysis of sell-side analysts. Further, if markets
are able to efficiently incorporate the information content of NPA research, focusing on the relatively
short-horizon offers a more powerful test of contributor skill that is less susceptible to research design
choices (e.g., risk-model choices). However, the evidence in Chen et al. (2014) suggests that the market
is not efficient in incorporating the investment value of SA research. In addition, much of the five-day
returns may be difficult for the typical investor to realize given natural delays in processing investment
research and the substantial transaction costs associated with high-turnover trading strategies. For this
reason, we also consider the longer 63-day horizons, and in some tests, we also explore the impact of
processing delays and transaction costs.
We compute the benchmark returns ($r_{bt}$) for full trading days based on the Fama-French (2015)
five-factor model augmented to include the Carhart (1997) momentum factor. Specifically, we define
the benchmark return as: $r_{ft} + \hat{\beta}_{k,1}(MKTRF) + \hat{\beta}_{k,2}(SMB) + \hat{\beta}_{k,3}(HML) + \hat{\beta}_{k,4}(RMW) + \hat{\beta}_{k,5}(CMA) + \hat{\beta}_{k,6}(MOM)$,
where the beta of stock k with respect to each factor is estimated from the six-factor model
with daily returns over the [-272, -21] trading day window relative to the report release.8 In robustness
tests, we also consider alphas for alternative factor models as well as market-adjusted returns (see Table
5). Since the intraday returns are not readily available for each factor, the intraday benchmark (i.e., the
8 The returns for the factor portfolios are taken from Ken French’s website
(https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html).

return from the report release until closing on the same trading day) is the return on the S&P 500
exchange-traded fund (SPY) over the same period.
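
The beta estimation and the full-day benchmark return can be sketched with ordinary least squares. The sketch assumes that the regression uses excess stock returns and includes an intercept, neither of which is stated explicitly above, and it uses simulated factor data in place of the Ken French factor files. All function names are illustrative.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def estimate_betas(excess_returns, factor_rows):
    """OLS of daily excess returns on an intercept and the factors."""
    x = [[1.0] + list(row) for row in factor_rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, excess_returns)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    return coef[0], coef[1:]


def benchmark_return(rf, betas, factors):
    """Risk-free rate plus beta-weighted factor returns for one day."""
    return rf + sum(b * f for b, f in zip(betas, factors))


# Simulated check: 252 days (the length of the [-272, -21] window),
# six factors, and noise-free returns built from known betas.
rng = random.Random(0)
true_betas = [1.1, 0.3, -0.2, 0.1, 0.05, -0.15]
factor_rows = [[rng.gauss(0.0, 0.01) for _ in range(6)] for _ in range(252)]
excess = [sum(b * f for b, f in zip(true_betas, row)) for row in factor_rows]
intercept, est_betas = estimate_betas(excess, factor_rows)
```

With noise-free simulated returns, the regression recovers the betas that generated them, which serves as a basic check on the estimation step.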
We measure contributor skill by computing the average alpha for each NPA i as:
$$\bar{\hat{\alpha}}_i = \frac{1}{n}\sum_{k=1}^{n} \hat{\alpha}_{ikt}, \qquad (2)$$
where n equals the number of positive and negative research reports published by NPA i. To be
included in the sample, we require that the NPA have at least 10 reports. The final sample includes
1,879 NPAs who have authored 123,120 positive or negative research reports. Conditioning on the
10-report cutoff, the median (average) NPA has authored 25 (66) reports.
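
Equation (2) and the 10-report filter can be sketched as below. The standard error is computed as the sample standard deviation divided by the square root of n, which is an assumption: the text does not specify the estimator, and overlapping holding periods may call for an adjustment. The function name is illustrative.

```python
import math
import statistics

MIN_REPORTS = 10  # sample inclusion threshold stated in the text


def average_alpha(report_alphas):
    """Mean alpha, its standard error, and t-statistic for one NPA.

    Returns None when the NPA has fewer than MIN_REPORTS reports.
    """
    n = len(report_alphas)
    if n < MIN_REPORTS:
        return None
    mean = statistics.fmean(report_alphas)
    # Assumed estimator: i.i.d. standard error of the mean.
    std_err = statistics.stdev(report_alphas) / math.sqrt(n)
    t_stat = mean / std_err if std_err > 0 else float("nan")
    return mean, std_err, t_stat


# Hypothetical alphas for a contributor with exactly ten reports.
sample = [0.01 * k for k in range(1, 11)]
mean_alpha, se_alpha, t_alpha = average_alpha(sample)
```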
2.2 Descriptive Statistics
Table 1 reports descriptive statistics for 𝛼̅̂ and t(𝛼̅̂ ), defined as 𝛼̅̂ scaled by its standard
error. We find that the average 𝛼̅̂ across all the NPAs in the sample is 0.38% over a 5-day horizon
and 0.68% over the 63-day horizon. Both estimates are consistent with the average contributor
having economically meaningful skill in their investment recommendations. One caveat is that the
analysis limits the sample to contributors with at least 10 research reports. To the extent that
contributors with stronger initial performance are more likely to remain on Seeking Alpha (i.e.,
survivorship bias), our estimates could be biased upwards. In unreported tests, we also examine
the average 𝛼̅̂ across all NPAs in the sample with fewer than 10 research reports. We find the mean
for the five-day (63-day) holding period is 0.28% (0.56%). The estimates are lower than the
corresponding estimates for NPAs with at least 10 research reports and suggest that
survivorship bias likely results in moderately inflated estimates of average skill. However,
survivorship bias will not impact tests focusing on out-of-sample performance (e.g., Tables 4-7),
which is the primary focus of the paper.

The summary statistics also indicate substantial dispersion in 𝛼̅̂ across contributors. For the
63-day horizon, the standard deviation is 6.62% with an interquartile range of -2.21% to 3.06%.
The estimates are also measured with considerable noise. The cross-sectional average standard
error over the 5-day (63-day) horizon is 1.44% (3.82%) and the average t-statistics are less than
0.30. Similarly, only 10% (13%) of NPAs have an 𝛼̅̂ that is significantly greater than 0 over the 5-
day (63-day) window, roughly double what one would expect by chance.
3. Mixture Models and the Cross-Section of NPA Skill
3.1 The Mixture Model
The summary statistics from the previous section suggest that there is substantial dispersion
in NPA skill. This dispersion could be attributable to two factors: 1) differences in true skill and
2) estimation error (i.e., luck). Following Crane and Crotty (2020), we attempt to disentangle these
two components by modeling the performance of NPAs as a mixture of multiple normal
distributions. We assume there is an unknown number, J, of groups of NPAs with different skill
levels. We ultimately find that a mixture of J = 2 skill groups results in the best fit for the 63-day
horizon, so our subsequent discussion assumes two groups (J = 2).9 For each group j (j = 0 and 1), the skill of
NPAs is assumed to follow a normal distribution centered at $\mu_j$. The model assumes that each
individual NPA belongs to a specific group and that his (or her) true skill, $\alpha_i$, is a function of both
the group mean, $\mu_j$, and an individual component $\omega_i$. Hence, the true alpha for NPA i who belongs
to group j is: $\alpha_i = \mu_j + \omega_i$, where $\omega_i$ is normally distributed with mean zero and variance $\sigma_j^2$.
9 Specifically, we estimate the maximum likelihood model, Equation (4), for J = 1, 2, 3, and 4 and select the model
that minimizes the Bayesian Information Criterion (BIC).

Estimated abnormal performance ($\hat{\alpha}_i$) is measured with estimation error $e_i$, which is assumed to
be independent of $\omega_i$ and is normally distributed with a mean of zero and a variance of $s_i^2$. Thus,
the estimated abnormal performance is $\hat{\alpha}_i = \mu_j + \omega_i + e_i$. Under these assumptions, the density
function for the estimated average abnormal return for NPA i is:
$$f(\hat{\alpha}_i) = \pi_0 \times \phi\left(\frac{\hat{\alpha}_i - \mu_0}{\sqrt{\sigma_0^2 + s_i^2}}\right) + \pi_1 \times \phi\left(\frac{\hat{\alpha}_i - \mu_1}{\sqrt{\sigma_1^2 + s_i^2}}\right), \qquad (3)$$
where $\phi(\cdot)$ represents the standard normal density function. The likelihood function $L$ for a
sample of estimated average abnormal returns of N NPAs is:
$$L(\hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_N \mid s_1, s_2, \ldots, s_N, \theta) = \prod_{i=1}^{N} f(\hat{\alpha}_i), \qquad (4)$$
where $\theta$ is the set of parameters to be estimated: $\pi_0$, $\pi_1$, $\mu_0$, $\mu_1$, $\sigma_0$, and $\sigma_1$. We estimate the
parameters ($\theta$) via maximum likelihood subject to the restrictions that $0 \le \pi_0 \le 1$, $\pi_1 = 1 - \pi_0$,
and $\sigma_j^2 \ge 0$ for j = 0, 1.
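
The log of the likelihood in Equation (4) and the BIC used for model selection can be sketched as follows. Two assumptions are made here and are not taken from the text: each component is evaluated as a normal density with standard deviation equal to the square root of the sum of the skill variance and the squared standard error (which includes the usual scale factor), and a J-component model is counted as having 3J - 1 free parameters. In practice the log-likelihood would be passed to a numerical optimizer, which is not shown. The input data are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(alpha_hats, std_errors, weights, means, sigmas):
    """Log-likelihood of estimated alphas under a normal mixture of skill."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    total = 0.0
    for a_hat, s_i in zip(alpha_hats, std_errors):
        density = sum(
            w * normal_pdf(a_hat, mu, math.sqrt(sig * sig + s_i * s_i))
            for w, mu, sig in zip(weights, means, sigmas)
        )
        total += math.log(density)
    return total


def n_params(j):
    """Assumed count: J means, J sigmas, and J - 1 free weights."""
    return 3 * j - 1


def bic(log_likelihood, k, n_obs):
    """Bayesian Information Criterion; smaller values are preferred."""
    return k * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical estimated alphas and standard errors, in percent.
alpha_hats = [0.5, -1.0, 2.0, 0.1]
std_errors = [1.0, 2.0, 1.5, 0.8]
ll_two = mixture_log_likelihood(
    alpha_hats, std_errors, [0.94, 0.06], [0.37, 1.48], [1.44, 8.21]
)
bic_two = bic(ll_two, n_params(2), len(alpha_hats))
```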
3.2 The Cross-Section of NPA Skill
Panel A of Table 2 reports the maximum likelihood estimates from Equation (4) for both
the five-day and 63-day holding periods.10 The estimates for the 63-day holding period suggest that
roughly 94% of NPAs belong to the lower skill group. However, this group is still characterized
by positive performance. The average alpha for this group is 0.37% and the dispersion in true alpha
for the group (i.e., 𝜎0 ) is 1.44%. The remaining 6% of NPAs belong to a higher skill group with
10 We estimate the mixture model with 1-, 2-, 3-, and 4-component distributions. We find that the 2-component
distribution results in the lowest BIC for both holding periods. We chose to report the 1-component distribution for
the five-day holding period because the BIC improvement from moving from 1 to 2 components is negligible (less
than 0.1%) and the percentage of NPAs belonging to the 2nd component is very small (less than 0.5%).

an average alpha of 1.48%. This group also exhibits substantially more dispersion in (true) alpha,
with a standard deviation of 8.21%.
The mixture model also allows us to estimate the distribution of alpha across the full
sample, which we report in Panel B.11 The average alpha is 0.43% and the standard deviation is
2.48%. The standard deviation of estimated alpha reported in Table 1 is 6.62%. This suggests that
estimation error accounts for roughly 63% of the dispersion in estimated alphas, while true
dispersion in skill accounts for the remaining 37% of the dispersion. Thus, eliminating estimation
error results in considerably more precise estimates. For example, we now estimate that 60% of
NPAs have positive skill. This compares favorably to Table 1, which reports that only 56% of
NPAs have positive alphas, and only 13% of NPAs have alphas that are significantly positive at a
5% level.
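
The full-sample figures can be approximately recovered from the rounded 63-day component estimates reported above (weights of 94% and 6%, means of 0.37% and 1.48%, and standard deviations of 1.44% and 8.21%). The sketch below uses the standard moment formulas for a normal mixture and inverts the mixture CDF by bisection to obtain quantiles; the function names are illustrative, and small differences from the reported values are expected because the inputs are rounded.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mixture_cdf(q, weights, means, sigmas):
    """Probability that true alpha is below q under the mixture."""
    return sum(
        w * normal_cdf((q - m) / s) for w, m, s in zip(weights, means, sigmas)
    )


def mixture_summary(weights, means, sigmas):
    """Mean, standard deviation, and share of positive alphas."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, sigmas))
    sd = math.sqrt(second - mean * mean)
    share_positive = 1.0 - mixture_cdf(0.0, weights, means, sigmas)
    return mean, sd, share_positive


def mixture_quantile(p, weights, means, sigmas, lo=-100.0, hi=100.0):
    """Quantile found by bisection on the mixture CDF."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, means, sigmas) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Rounded 63-day estimates from the text, in percent.
weights, means, sigmas = [0.94, 0.06], [0.37, 1.48], [1.44, 8.21]
mean, sd, share_positive = mixture_summary(weights, means, sigmas)
median = mixture_quantile(0.5, weights, means, sigmas)
# Ratio of true to estimated dispersion (standard deviations).
true_share = 2.48 / 6.62
```

The share of NPAs with positive skill comes out near 60%, and the 37% figure corresponds to the ratio of the two standard deviations (2.48% and 6.62%) rather than the ratio of variances.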
Although estimation error accounts for a sizeable fraction of the variability in the observed
performance in NPAs, the true dispersion in skill of 2.48% over a 63-day holding period (9.92%
annualized) remains economically sizeable. As a benchmark, Harvey and Liu (2018) and Fama
and French (2010) estimate a true annualized standard deviation for mutual funds of 1.19% and
1.25% respectively. Thus, dispersion in true ability among NPAs is roughly eight times larger than
dispersion in mutual fund performance. The larger dispersion is perhaps not surprising since
mutual funds tend to hold much more diversified portfolios. Further, capacity constraints and fund
expenses likely shrink the performance of skilled mutual funds towards zero (Berk and Green,
2004).
11 To calculate the quantiles of the mixture distribution, we solve for the return value q that satisfies
$X = \pi_0 \Phi\left(\frac{q - \mu_0}{\sigma_0}\right) + \pi_1 \Phi\left(\frac{q - \mu_1}{\sigma_1}\right)$ for percentile X, where $\Phi(\cdot)$ represents the standard normal cumulative distribution function.

The above findings have two potentially important implications. First, the average investor
may be better off following the investment recommendations of NPAs rather than delegating their
money to mutual fund managers. Second, given the sizeable dispersion in NPA ability, investors
can likely earn superior returns by limiting their attention to the subset of the NPAs with a track
record of excellent past performance. In the next section, we quantify the potential profits to
investors from following NPA recommendations, both for the entire set of NPAs and the subset of
NPAs with the best past performance.
4. The Returns to Following NPA Research
4.1 Portfolio Construction
To quantify the trading profits that accrue to investors from following NPA
recommendations, we construct transaction-based calendar-time portfolios (see, e.g., Seasholes and
Zhu, 2010, and Jame, 2018). We begin by describing the unconditional strategy that follows all
NPA recommendations. For this strategy, each time an NPA issues a positive report, we place $1
of the stock in the long portfolio. Similarly, each time an NPA issues a negative report, we place
$1 in the short portfolio. We hold the position for 63 trading days, which mimics the 63-day
holding period studied in the previous analysis. Each additional positive (negative) report on the
stock results in an additional $1 long (short) investment at the time of the report. If existing reports
offer conflicting recommendations, we unwind existing positions rather than include the same
stock in both the long and short portfolio.12
12 For example, if there is a positive report for a stock on trading day 5 and a negative report for the same stock on
trading day 10, the trading strategy would: 1) initiate a long position starting on trading day 5, 2) close the long position
on trading day 10, 3) initiate a short position on trading day 68 (63 trading days after the initial long position), and
4) close the short position on trading day 73.
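
The position rule can be sketched for a single stock as below. The sketch assumes that a report issued on day d supports a position on days d through d + 62 and that any overlap of positive and negative reports removes the stock from both portfolios, whatever the number of reports on each side. These boundary conventions are inferred from the footnote's example rather than stated explicitly, and the function name is illustrative.

```python
HOLDING_DAYS = 63


def net_position(reports, day):
    """Dollar position in one stock on a given trading day.

    reports: list of (issue_day, sign) with sign +1 (positive report)
        or -1 (negative report); each report represents $1.
    Returns a positive number for a long position, negative for short.
    """
    long_dollars = sum(
        1 for d, s in reports if s == 1 and d <= day < d + HOLDING_DAYS
    )
    short_dollars = sum(
        1 for d, s in reports if s == -1 and d <= day < d + HOLDING_DAYS
    )
    if long_dollars and short_dollars:
        # Conflicting active recommendations: hold no position.
        return 0
    return long_dollars - short_dollars


# The footnote's example: positive report on day 5, negative on day 10.
example = [(5, +1), (10, -1)]
path = {day: net_position(example, day) for day in (5, 9, 10, 67, 68, 72, 73)}
```

Under these conventions the stock is held long on days 5 through 9, is out of both portfolios on days 10 through 67, is held short on days 68 through 72, and is closed out on day 73.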

Each day, we calculate the dollar-weighted average abnormal return of stocks in the long
and short portfolios. The return calculation is identical to the previous analysis. Specifically, the
returns for days [1,63] are taken from CRSP. For day [0] returns, if the report was issued outside
of trading hours, the day [0] return is the CRSP return for the trading day following the
recommendation; if the report is issued during trading hours, we obtain the prevailing midpoint
quote from TAQ for stock i at the time the report is published on Seeking Alpha, and we calculate
the day 0 return from the quoted midpoint to the closing price. We note that while our methodology
is appropriate for quantifying contributor skill, it likely overstates the potential trading profits since
it ignores trading costs and assumes investors can instantaneously process NPA research. We view
these assumptions as providing a useful upper bound, and we consider more realistic assumptions
in subsequent analysis.
This approach results in a single time-series of daily returns for both the long and short
portfolio, starting in January 2007 and ending in March of 2018.13 We compute the six-factor alpha
for the long (or short) portfolio by regressing the daily return on the long (short) portfolio in excess
of the risk-free rate on the Fama-French (2015) five factors and the Carhart (1997) momentum
factor.
The estimation of the abnormal returns to the conditional strategies is similar, except we
first limit the analysis to the subset of NPAs that meet a performance requirement (e.g., $\hat{\alpha}_i^{10}$ in
the top quintile of the distribution). For each SA report, we measure the past performance of the
NPA who authored the report as the average abnormal return across her most recent 10 reports
($\hat{\alpha}_i^{10}$). In calculating $\hat{\alpha}_i^{10}$, we exclude reports that were issued within the past 63 trading days,
13 Our sample of Seeking Alpha research reports ends in December 2017. However, stocks remain in the buy and
sell portfolios for 63 trading days after the research report is published.

since the 63-day return would not be known to investors at the time of the report release. We then
group all NPAs into quintiles based on their most recent $\hat{\alpha}_i^{10}$ relative to the distribution of $\hat{\alpha}^{10}$ as
of the end of the previous month. The distribution includes all contributors who have issued at
least one research report in the past year and at least five reports since the start of the sample.
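
The past-performance measure and the quintile assignment can be sketched as follows. The sketch treats a report as usable once at least 63 trading days have passed since it was issued and assigns quintiles by rank within a supplied reference distribution; both conventions, the function names, and the sample data are assumptions made for illustration.

```python
LOOKBACK_REPORTS = 10
HOLDING_DAYS = 63


def past_alpha(history, current_day):
    """Average alpha over the most recent 10 reports with known returns.

    history: list of (issue_day, alpha_63) for one contributor.
    Returns None if fewer than 10 reports have a completed 63-day window.
    """
    eligible = sorted(
        (d, a) for d, a in history if d + HOLDING_DAYS <= current_day
    )
    if len(eligible) < LOOKBACK_REPORTS:
        return None
    recent = eligible[-LOOKBACK_REPORTS:]
    return sum(a for _, a in recent) / LOOKBACK_REPORTS


def quintile(value, reference):
    """Quintile of `value` within `reference` (1 = bottom, 5 = top)."""
    below = sum(1 for r in reference if r < value)
    return min(5, 1 + (5 * below) // len(reference))


# Hypothetical contributor with ten reports issued on days 0, 10, ..., 90.
history = [(d, 0.01) for d in range(0, 100, 10)]
ready = past_alpha(history, 153)      # all ten windows have closed
not_ready = past_alpha(history, 152)  # the day-90 report is still open
# Hypothetical reference distribution as of the previous month-end.
reference = [float(v) for v in range(1, 11)]
```

The exclusion of recent reports ensures that the sorting variable uses only returns that an investor could have observed when the new report is released.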
4.2 Portfolio Characteristics and Returns
Panel A of Table 3 provides summary statistics on the portfolio size and the factor loadings
of the long, short, and long-short portfolio for the unconditional strategy. On an average day, the
long portfolio is based on 1,556 reports from the trailing 63 trading days, resulting in long
positions in 451 different stocks, while the short portfolio is based on 786 reports resulting in short
positions in 270 different stocks. The long-short portfolio tends to have a tilt towards larger stocks,
growth stocks, and momentum stocks. Panels B and C report analogous results for the strategy
that only follows contributors in the top or bottom quintile of past alpha. We find that relative to
contributors in the bottom quintile of past alpha, contributors in the top quintile of past alpha tilt
their recommendations towards momentum stocks and stocks with strong past profitability.
Panel A of Table 4 reports the six-factor alpha of the long and short portfolios for the
strategy that follows all NPAs. For ease of interpretation, we convert the daily alpha to a monthly
alpha by multiplying by 21. We find that the alpha of the long portfolio is 0.40% per month which
is statistically significant at a 1% level. The alpha of the short portfolio is -0.39% per month.
Although the estimate is similar in magnitude (in absolute value terms), it is not reliably different
from zero. The long-short portfolio earns an economically sizeable 0.79% per month. The
significant return on the long-short portfolio is broadly consistent with the evidence in Table 2
which suggests that the majority of NPAs are skilled.

Panel B of Table 4 reports the results for trading strategies that condition on past
performance. We find that conditioning on past $\hat{\alpha}_i^{10}$ can generate significantly larger portfolio
returns. For example, a strategy that only follows NPAs in the top quintile of $\hat{\alpha}_i^{10}$ generates a long-short
return of 1.78% per month or more than 21% annualized. The long-short spread
monotonically declines and becomes significantly negative for NPAs in the bottom quintile of
$\hat{\alpha}_i^{10}$. Panel C also confirms that following NPAs in the top quintile of $\hat{\alpha}_i^{10}$ outperforms the
unconditional strategy by a statistically significant 0.99% per month.
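
The reported figures are linked by simple arithmetic. The worked lines below restate the conversion from daily to monthly alphas and the long-short differences, and they assume that the annualized figure is obtained by multiplying the monthly alpha by 12 without compounding, which is consistent with the reported "more than 21%" but is not stated explicitly.

```latex
\begin{align*}
\alpha^{\text{monthly}} &= 21 \times \alpha^{\text{daily}} \\
\alpha^{\text{all}}_{\text{long-short}} &= 0.40\% - (-0.39\%) = 0.79\% \\
\alpha^{\text{top}}_{\text{long-short}} - \alpha^{\text{all}}_{\text{long-short}} &= 1.78\% - 0.79\% = 0.99\% \\
12 \times 1.78\% &= 21.36\%
\end{align*}
```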
4.3 Portfolio Returns – Robustness
In Table 5, we examine whether the findings from Table 4 are robust to key research design
choices. For reference, we report the key estimates from the baseline setting (as reported in Table
4) in Row 1. Rows 2 through 4 confirm that our results are similar for different definitions of past
skill including: measuring skill using only the past five reports (Row 2), measuring skill using all
reports issued over the prior 12 months (Row 3), or measuring skill using t(𝛼̂𝑖10 ), defined as 𝛼̂𝑖10
scaled by its standard error (Row 4). Rows 5 through 8 show that the results are robust to a variety
of different risk adjustments including: market-adjusted returns, alphas from the Fama-French
(1993) three-factor model, the Fama-French (1993) three-factor model augmented to include the
Carhart (1997) momentum factor, and the Fama-French (2015) five-factor model. Rows 9 and 10
document that the findings are qualitatively similar if we replace the one-quarter holding period
with holding periods of one month or six months, respectively. Row 11 indicates that the results
are similar if we exclude microcap stocks, defined as stocks below the NYSE 20th percentile. This
finding suggests that the results are not being driven by the smallest stocks, which tend to be highly
illiquid.
One concern is that NPA research may simply “piggyback” on other major news events,
making it difficult to separate the impact of the event itself from the NPA’s analysis of the event.
We address this concern using two approaches. In our first approach, we exclude SA research
reports issued after trading hours (57% of the sample). Focusing on intraday reports, coupled with
our calculation of intraday returns, helps isolate the price discovery associated with the SA
research report, rather than any previous news. Admittedly, it is still possible that some of the
return following the SA report is attributable to a delayed reaction to some underlying news event.
While this distinction has implications for understanding how NPAs add value, this subtlety is
likely irrelevant to investors, since the trading profits that accrue to following an NPA’s
recommendation would remain the same. Our second approach to addressing this concern is to
exclude the roughly 60% of SA research reports that are issued on the same day or the day after an
earnings announcement, sell-side research report, or media article. The results of these two
approaches are reported in Rows 12 and 13 of Table 5. Although both approaches eliminate the
majority of SA research reports, we continue to find statistically significant abnormal returns to
following NPAs in the top quintile of past skill. Further, the point estimates are roughly 80%-90%
of the baseline estimates. This finding suggests that much of the skill of the best NPAs extends beyond
simply processing recent information events.
We also examine whether the outperformance of the top NPAs is concentrated in certain
time periods. Figure 1 plots the trading profits associated with following contributors in the top
quintile of 𝛼̂𝑖10 for each year in the sample period (2007-2017). We find that the trading profits
are positive in 10 of the 11 years and are statistically significant (at a 5% level) in eight of the 11
years. Further, the average magnitude is similar in the first five years of the sample (1.89%) and
the last five years of the sample (1.77%), suggesting that the returns to the trading strategy have
remained large despite the increasing popularity of the Seeking Alpha platform.
4.4 Incorporating Bid-Ask Spreads and Investment Delays
The results from the previous sections indicate that the recommendations of NPAs in the
top quintile of 𝛼̂𝑖10 generate economically large returns over the subsequent 63 trading days.
However, as noted earlier, our estimated returns overstate the potential trading profits that would
accrue to investors since they do not account for the bid-ask spreads and investment delays. In this
section, we repeat the analysis after incorporating more realistic assumptions.
Panel A of Table 6 reports the results for the top quintile of 𝛼̂𝑖10 . Row 1 reports the baseline
results from Table 4 as a reference. Row 2 incorporates bid-ask spreads but continues to assume
no investment delays. Specifically, following positive reports, investors now purchase the stocks
at the ask price at the time of the report publication and sell the stock at the bid price at the end of
the 63-day holding period. Similarly, following negative reports, investors sell the stocks at the
bid price at the time of the report publication and repurchase the stock at the ask price at the end
of the 63-day holding period. We find that incorporating bid-ask spreads reduces the long-short
spread from 1.78% to 1.19%, a roughly 30% decline.14 However, the 1.19% estimate remains
highly significant.
14 Our estimates incorporate 4 transactions (initiating and closing positions on both the long and short side), implying
a half-spread of roughly 0.15% ((1.78% – 1.19%)/4). This estimate is very similar to the 0.16% effective half-spread
reported in Boehmer et al. (2020).
This finding suggests that investors who trade immediately following the recommendations
of NPAs in the top quintile of past alpha can earn significant abnormal returns even after
accounting for bid-ask spreads. One point of caution, however, is that our analysis excludes other
transaction costs, including trading commissions and price impact. While the effects of price
impact are likely to be modest for smaller retail investors, they can be substantial for larger
institutional investors (e.g., Korajczyk and Sadka, 2004; and Frazzini, Israel, and Moskowitz,
2018), which may help explain why the potential trading gains are not completely eliminated by
the trading of large sophisticated investors.
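The bid-ask adjustment in Row 2 and the half-spread arithmetic in footnote 14 can be sketched in Python as follows. This is a minimal illustration, not the code used in the paper: the quotes are hypothetical, the function names are invented for the example, and the spread is assumed to be symmetric around the midpoint.

```python
def long_round_trip_return(ask_at_entry: float, bid_at_exit: float) -> float:
    """Return from buying at the ask and later selling at the bid."""
    return bid_at_exit / ask_at_entry - 1.0


def short_round_trip_return(bid_at_entry: float, ask_at_exit: float) -> float:
    """Return from selling short at the bid and later repurchasing at the ask."""
    return (bid_at_entry - ask_at_exit) / bid_at_entry


def implied_half_spread(gross_pct: float, net_pct: float, n_transactions: int = 4) -> float:
    """Half-spread implied by the drop in a long-short return across n transactions."""
    return (gross_pct - net_pct) / n_transactions


# Hypothetical quotes: midpoint moves from 100 to 102 with a 0.15% half-spread.
mid_entry, mid_exit, half_spread = 100.0, 102.0, 0.0015
ask_entry = mid_entry * (1 + half_spread)
bid_exit = mid_exit * (1 - half_spread)

gross = mid_exit / mid_entry - 1.0                  # return measured at midpoints
net = long_round_trip_return(ask_entry, bid_exit)   # return after crossing the spread twice
print(f"gross {gross:.4%}, net {net:.4%}")

# Footnote 14 arithmetic using the reported long-short spreads (in percent).
print(f"implied half-spread: {implied_half_spread(1.78, 1.19):.4f}%")
```

Each leg of the long-short strategy crosses the spread once at entry and once at exit, which is why the decline in the long-short return is divided by four.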
Rows 3 and 4 continue to incorporate the bid-ask spread, but now also examine the impact
of investment delays of 24 or 72 hours. For example, if a positive research report was published at
10:30 am on Monday, the 24 (72) hour delay would assume investors purchased the stock at the
ask price as of Tuesday (Thursday) at 10:30 am. Farrell et al. (2020) find that many retail investors
respond to SA research within 30 minutes of the release of the SA report, suggesting that these
investment delays are likely to be far longer than necessary for many attentive investors.
Nevertheless, they provide a useful benchmark for assessing the sensitivity of trading profits to
investment delays.
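The delay convention in the Monday example above can be made concrete with a small Python sketch. The date is hypothetical, and the sketch simply adds clock hours; how delays that land on weekends, holidays, or outside trading hours are handled is not specified in the text and is left out here.

```python
from datetime import datetime, timedelta


def delayed_entry(published: datetime, delay_hours: int) -> datetime:
    """Timestamp at which an investor is assumed to trade after a fixed delay."""
    return published + timedelta(hours=delay_hours)


# Hypothetical report published on a Monday at 10:30 am.
published = datetime(2017, 6, 5, 10, 30)
entry_24h = delayed_entry(published, 24)
entry_72h = delayed_entry(published, 72)

for label, ts in [("published", published), ("24-hour delay", entry_24h), ("72-hour delay", entry_72h)]:
    print(f"{label}: {ts.isoformat()} (weekday index {ts.weekday()}, Monday = 0)")
```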
We find that incorporating a 24-hour delay reduces the trading profits by roughly 20%
(from 1.19% to 0.98%). The 72-hour delay results in the trading profits falling further to 0.84%,
an additional 15% decline relative to the 24-hour delay, or a roughly 50% decline relative to the
baseline results. Nevertheless, the 0.84% long-short spread reported in Row 4 is still statistically
significant. It is also economically sizeable as it translates to an annualized excess return of greater
than 10%. In addition, we note that the long-only portfolio also generates a highly significant
monthly abnormal return of 0.55%, indicating that the value of the trading strategy is not
contingent on investors’ ability to short sell.
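The percentage declines and the annualized figure quoted above follow from simple arithmetic, sketched below in Python. The annualization multiplies the monthly return by 12 without compounding; this is an assumption made for the illustration, chosen because it matches the "more than 21% annualized" figure quoted for the 1.78% baseline.

```python
def pct_decline(before: float, after: float) -> float:
    """Percentage decline from `before` to `after`."""
    return 100.0 * (before - after) / before


def simple_annualized(monthly_pct: float) -> float:
    """Annualize a monthly return by multiplying by 12 (no compounding; an assumption)."""
    return 12.0 * monthly_pct


# Monthly long-short spreads (in percent) reported for the top quintile.
baseline, with_spread, delay_24h, delay_72h = 1.78, 1.19, 0.98, 0.84

steps = [
    ("spread only -> 24-hour delay", with_spread, delay_24h),
    ("24-hour -> 72-hour delay", delay_24h, delay_72h),
    ("baseline -> 72-hour delay", baseline, delay_72h),
]
for label, before, after in steps:
    print(f"{label}: {pct_decline(before, after):.1f}% decline")

print(f"annualized 72-hour spread: {simple_annualized(delay_72h):.2f}%")
```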
Panel B of Table 6 reports analogous results for the unconditional trading strategy. We find
that incorporating bid-ask spreads and investment delays eliminates all of the abnormal returns.
For example, Row 4 indicates that after incorporating bid-ask spreads and a 72-hour investment
delay, the long-short spread for the unconditional strategy falls to -0.09%. In fact,
incorporating bid-ask spreads alone eliminates the trading profits associated with following all NPAs,
indicating that even the most attentive investors would be unable to profit from the unconditional
strategy.
The evidence from Panel B of Table 6 is consistent with the findings of Barber et al. (2001)
who find that incorporating transaction costs and modest investment delays eliminates the profits
associated with following the recommendations of sell-side analysts. However, our evidence that
investors can profit from conditional strategies contrasts with the findings on sell-side analysts. In
particular, Mikhail, Walther, and Willis (2004) consider trading strategies that condition on the
past performance of sell-side analyst recommendations. Like us, they find persistent differences
in stock-picking ability among analysts. However, they find that after incorporating transaction
costs and a three-day investment delay, the returns to both conditional and unconditional trading
strategies are insignificant. This is primarily attributable to the fact that the majority of returns that
accrue to sell-side analyst recommendations are incorporated within three trading days of the
report release.15 In contrast, we find that much of the returns following NPA recommendations,
particularly the most skilled NPAs, are only incorporated into prices with a significant delay.
15 A secondary consideration is that the bid-ask spreads in their sample, which pre-dated decimalization, were also
considerably larger.
5. Additional Analysis
5.1 Additional Contributor Characteristics and the Informativeness of NPA Research
In this section, we explore whether contributor characteristics apart from past performance
are associated with more informative research. While past performance is a natural predictor of
future performance, the evidence from the mixture models in Table 2 suggests that a large fraction
of past performance is attributable to luck rather than skill, which points to the possibility that
other contributor characteristics may be valuable in explaining the informativeness of research
reports. We emphasize that the objective of this section is not to develop implementable trading
strategies, but rather to simply explore whether any other NPA attributes are associated with more
impactful research.
We begin by considering several variables related to prior academic accomplishments and
industry experience, as (self) reported in the NPA’s biography on the SA platform. Prior work
finds that fund managers with a PhD, an MBA, or an undergraduate degree from a top university
are more skilled (see, e.g., Chaudhuri, Ivkovic, Pollet, and Trzcinka, 2020; Chevalier and Ellison,
1999; Li, Zhang, and Zhao, 2011).16 Accordingly, we include indicators equal to one if the
contributor’s bio mentions having a PhD (PhD), an MBA (MBA), or any degree from a school in
the top 50, as measured by the school’s 75th percentile SAT score based on the 2015 vintage of
stateuniversity.com (Top School). There is also evidence that hedge funds and private equity
funds tend to be relatively skilled (see, e.g., Kaplan and Schoar, 2005 and Kosowski, Naik, and
Teo, 2007), which motivates the inclusion of indicators equal to one if the contributor’s bio
mentions having prior work experience with a hedge fund (HF) or private equity fund (PE).
16 In contemporaneous work, Farrell et al. (2020) find that retail trading tends to be more informative following SA
research reports written by contributors with better academic qualifications.
We also explore whether NPAs who engage with others on the SA platform issue more
informative research. NPAs who read and comment on other SA reports are likely to be more
committed to the SA research community. Further, given the evidence that SA comments tend to
be more reliable predictors of future performance than SA research reports (Chen et al., 2014), it
is plausible that NPAs who comment on other reports are likely to be more informed than other
NPAs. We include two comment variables: Self-Comments, the natural log of 1 plus the number
of comments in the past 12 months written by an NPA on any of her own reports (e.g., responding
to other comments about her report), and Other-Comments, the natural log of 1 plus the number
of comments in the past 12 months written by an NPA on any SA report authored by a different NPA.
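As an illustration, the two comment variables can be computed as in the following Python sketch. The input layout (a list of commenter and report-author pairs already restricted to the past 12 months) and the function name are assumptions made for the example, not the paper's data structure.

```python
import math


def comment_variables(npa: str, comments: list[tuple[str, str]]) -> tuple[float, float]:
    """Return (Self-Comments, Other-Comments) for one NPA.

    `comments` holds (commenter, report_author) pairs from the past 12 months.
    Both variables are the natural log of 1 plus the relevant comment count.
    """
    own = sum(1 for commenter, author in comments if commenter == npa and author == npa)
    others = sum(1 for commenter, author in comments if commenter == npa and author != npa)
    return math.log1p(own), math.log1p(others)


# Hypothetical comment log.
comment_log = [
    ("alice", "alice"),  # alice replies on her own report
    ("alice", "bob"),    # alice comments on bob's report
    ("alice", "bob"),
    ("bob", "alice"),    # bob's comment does not count toward alice's totals
]
self_comments, other_comments = comment_variables("alice", comment_log)
print(self_comments, other_comments)
```

The log(1 + count) transform keeps the variable defined for NPAs with no comments and dampens the influence of very active commenters.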
Our final set of variables focuses on the extent to which NPAs specialize in certain topics.
To identify topic specialization, we first encode each report as a set of weights among twenty
topics, which we estimate using an unsupervised machine learning technique, Latent Dirichlet
Allocation or LDA (Blei, Ng, and Jordan, 2003). Figure 2 reports the top 10 characteristic words
for each of the 20 topics. We find that many of the topics are industry related. For example, Topic 4
includes words like drug, patient, trial, treatment, FDA, etc. and naturally corresponds to the
pharmaceutical industry. However, we also see that there are non-industry topics. For example,
Topic 19 includes words like: deal, management, acquisition, shareholder, value, board, etc. and
thus captures topics related to mergers and acquisitions.
We compute two variables that are related to specialization: Within-Report and Across-
Report Focus. Within-Report Focus is calculated as the standard deviation of the topics’ weights
within a report, averaged across the contributor’s prior reports. To calculate Across-Report Focus,
we compute the standard deviation of each topic weight across the contributor’s past reports, and
then calculate the average standard deviation across each of the twenty topics. Thus, Within-
Report Focus measures a contributor’s tendency to write reports that are dedicated to a specific
topic (even if the topics vary over time), while Across-Report Focus measures a contributor’s
tendency to write about similar topics across different reports (even if each report covers multiple
topics). To facilitate interpretation, we multiply both variables by negative 1, so that higher values
correspond to greater levels of focus.
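The two focus measures can be sketched in Python directly from the description above. The sketch takes the LDA topic weights as given (each report is a list of weights), uses three topics instead of twenty to keep the example small, and uses the population standard deviation; the choice between population and sample standard deviation is an assumption, as are the function names.

```python
from statistics import mean, pstdev


def within_report_focus(reports: list[list[float]]) -> float:
    """Std. dev. of topic weights within each report, averaged across reports, times -1."""
    return -mean([pstdev(weights) for weights in reports])


def across_report_focus(reports: list[list[float]]) -> float:
    """Std. dev. of each topic's weight across reports, averaged across topics, times -1."""
    n_topics = len(reports[0])
    per_topic = [pstdev([report[t] for report in reports]) for t in range(n_topics)]
    return -mean(per_topic)


# Hypothetical contributors with two reports each and three topics.
same_topic = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]   # always writes about topic 0
switching = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # one topic per report, but it changes

print(across_report_focus(same_topic), across_report_focus(switching))
print(within_report_focus(same_topic), within_report_focus(switching))
```

In this toy input the two contributors have identical within-report values, since every report loads on a single topic, and differ only in the across-report measure.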
We next estimate the following regression:
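\[ \hat{\alpha}_{ikt} = \alpha + \beta_1 \hat{\alpha}_{i}^{10} + \beta_2 \mathit{NPA\_Char}_{it} + \beta_3 \mathit{Firm\_Char}_{kt} + \mathit{Time}_{t} + \varepsilon_{ikt} \tag{5} \]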
The dependent variable, 𝛼̂𝑖𝑘𝑡 , is the 63-day six-factor alpha associated with each report k issued
by NPA i at time t, as defined in equation (1). The main variables of interest are past alpha (𝛼̂𝑖10 )
and NPA_Char which includes PhD, MBA, Top School, HF, PE, Self-Comments, Other-
Comments, and Within-Report and Across-Report Focus. We also explore whether the
informativeness of the recommendation varies with firm characteristics. We include the following
firm characteristics taken from Fama and French (2008): market capitalization (Size), book-to-
market (BM), the level of net stock issues (NS), an indicator equal to one if the firm had no new
stock issues (Zero NS), the cumulative stock return over the prior two to twelve months (Mom),
growth in assets (dA/A), profitability (Y/B), an indicator equal to one if profitability is negative
(Neg Y), and the change in operating working capital scaled by book equity (Ac/B). We further
split Ac/B into Pos Ac/B and Neg Ac/B based on whether the values of Ac/B are greater than or less
than zero. More detailed variable definitions are provided in the Appendix. We standardize all
continuous variables to have mean 0 and standard deviation 1, and we cluster standard errors by
NPA and month.
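The standardization step means each coefficient on a continuous variable is read as the effect of a one-standard-deviation change. A minimal Python sketch of the transformation follows; it uses the population standard deviation, which is an assumption, since the text does not say which variant is used.

```python
from statistics import mean, pstdev


def standardize(values: list[float]) -> list[float]:
    """Rescale a continuous variable to mean 0 and standard deviation 1."""
    center = mean(values)
    scale = pstdev(values)
    if scale == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - center) / scale for v in values]


# Hypothetical raw values of a continuous regressor.
raw = [1.0, 2.0, 3.0, 4.0]
z = standardize(raw)
print(z)
```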
Table 7 reports the results. Specification 1 only includes past alpha (𝛼̂𝑖10 ). The results
confirm the findings from the previous analysis that NPA skill is highly persistent. The point
estimate indicates that a one standard deviation increase in contributor skill is associated with a
1.33% increase in the 63-day return associated with the NPA recommendation.
Specification 2 replaces past alpha with NPA_Char. We find that none of the measures of academic
achievement (PhD, MBA, or Top School) or industry experience (HF and PE) are associated with
more informative recommendations. We do find that more engaged NPAs, as measured by
Other-Comments, issue more informative recommendations. In particular, a one-standard deviation
increase in Other-Comments is associated with 0.55% greater alpha over the subsequent quarter.
This contrasts with authors who comment heavily on their own articles: a one standard deviation
increase in Self-Comments is associated with a decline in alpha of 0.41%. One potential
explanation for this is that self-comments are often responses to critical questions from skeptical
readers, perhaps as a consequence of unclear or low-quality research reports. We also find that
both focus-related variables are significantly related to returns. A one-standard deviation increase
in Across-Report Focus is associated with a 0.22% increase in 63-day ahead returns, while the
corresponding estimate for Within-Report Focus is 0.24%. We note that while both focus variables
are statistically significant at a 5% level, their magnitudes are less than one-fifth of the estimated
impact of past alpha.
Specification 3 includes both past skill and NPA_Char. The inclusion of past skill does not
significantly reduce the estimates on Other-Comments, Across-Report Focus, and Within-Report
Focus, and all three estimates remain statistically significant at (at least) a 5% level. Specification
4 adds Firm_Char. We find very little evidence that the informativeness of NPA
recommendations varies systematically with firm characteristics. Further, the inclusion of the firm
characteristics does not substantially alter the estimates on 𝛼̂𝑖10 or NPA_Char.
Finally, Specifications 5 and 6 repeat Specification 4 after decomposing the 63-day return
[0,63] into a short-term market reaction [0,1] and a subsequent drift [2,63]. We find that across all
five significant predictors of the 63-day returns (𝛼̂𝑖10 , Self-Comments, Other-Comments,
Across-Report Focus, and Within-Report Focus), the overwhelming majority of the 63-day return is not
impounded into prices in the short term. In fact, only one of the five predictors is statistically
significant over the [0,1] holding period (𝛼̂𝑖10 ), and even for this variable the immediate market
reaction only accounts for roughly 5% of the total 63-day return.
5.2 Do Investors Recognize Differences in NPA Skill?
The findings from Tables 6 and 7 suggest that much of the difference in skill across NPAs
is not immediately impounded into prices. One plausible explanation for this finding is that
investors simply do not recognize differences in NPA skill. To explore this possibility more
carefully, we examine retail and institutional order imbalances following the release of SA
research reports. Specifically, we estimate the following regression:
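\[ \mathit{OIB}_{it} = \beta_1 \mathit{Long}_{ikt} + \beta_2 \mathit{Long}_{ikt} \times \mathit{NPA\_Char}_{it} + \beta_3 \mathit{NPA\_Char}_{it} + \beta_4 \mathit{Char}_{it} + \mathit{Day}_{t} + \varepsilon_{i,t} \tag{6} \]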
The dependent variable, OIB, is either retail investor order imbalance (Retail OIB) or institutional
investor order imbalance (Inst OIB). Retail OIB is defined as the difference between daily retail
buy volume and retail sell volume, scaled by total daily retail trading volume, where retail buy and
sell volume are calculated using the methodology of Boehmer et al. (2020). Similarly, Inst OIB is
the difference between daily institutional buy volume and institutional sell volume, scaled by total
daily institutional trading volume, where institutional buy (sell) volume is defined as aggregate
buy (sell) volume less retail buy (sell) volume, and aggregate trading is signed using the Lee and
Ready (1991) algorithm. We measure both retail and institutional trading over the [0,1] interval
following the release of the report.
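The two order imbalance measures can be illustrated with the Python sketch below. It takes signed buy and sell volumes as given (the retail identification of Boehmer et al. (2020) and the Lee and Ready (1991) signing are not implemented), and it assumes that total trading volume for each group equals that group's buy volume plus sell volume. The share volumes are hypothetical.

```python
from typing import Optional


def order_imbalance(buy_volume: float, sell_volume: float) -> Optional[float]:
    """(buy - sell) / (buy + sell); returns None when there is no signed volume."""
    total = buy_volume + sell_volume
    if total == 0:
        return None
    return (buy_volume - sell_volume) / total


def institutional_volumes(agg_buy: float, agg_sell: float,
                          retail_buy: float, retail_sell: float) -> tuple[float, float]:
    """Institutional buy (sell) volume as aggregate buy (sell) less retail buy (sell)."""
    return agg_buy - retail_buy, agg_sell - retail_sell


# Hypothetical daily share volumes for one stock.
retail_buy, retail_sell = 600.0, 400.0
agg_buy, agg_sell = 5000.0, 5200.0

retail_oib = order_imbalance(retail_buy, retail_sell)
inst_buy, inst_sell = institutional_volumes(agg_buy, agg_sell, retail_buy, retail_sell)
inst_oib = order_imbalance(inst_buy, inst_sell)
print(retail_oib, inst_oib)
```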
Long is an indicator equal to 1 for positive reports, and 0 for negative reports, with reports
being classified as positive or negative using the two-step procedure described in Section 2.1.
NPA_Char includes four NPA characteristics that are significantly associated with report
informativeness: 𝛼̂𝑖10 , Other-Comments, Across-Report Focus, and Within-Report Focus.17 Thus,
𝛽1 (𝐿𝑜𝑛𝑔) measures whether order imbalances are correlated with the direction of the report
recommendation, and 𝛽2 (𝐿𝑜𝑛𝑔 × 𝑁𝑃𝐴_𝐶ℎ𝑎𝑟) measures whether this correlation is stronger for
contributors with attributes associated with more informative recommendations. Char is a vector
of firm characteristics (taken from Boehmer et al. 2020) and includes retail or institutional order
imbalances over the prior week (Retail OIB(w-1) or Inst OIB(w-1)), past one-week returns (Ret(w-1)),
past one-month returns (Ret(m-1)), past two- to seven-month returns (Ret(m-7,m-2)), market capitalization
(Size), share turnover (Turnover), volatility of daily returns (Volatility), and book-to-market (BM).
With the exception of the order imbalance and return variables, all control variables are measured
at the end of the previous year and are in natural logs. The regression also includes date fixed
effects. We standardize all continuous variables to have mean 0 and standard deviation 1, and we
compute standard errors clustered by both firm and day.
Specification 1 of Table 8 reports the results for retail order imbalances prior to the
inclusion of NPA_Char. Consistent with Boehmer et al. (2020), we find that retail order imbalances
are highly persistent and strongly negatively related to past one-week returns. More importantly,
consistent with Farrell et al. (2020), we find that retail order imbalances are strongly related to
NPA recommendations (i.e., 𝛽1 > 0). The point estimate indicates that retail order imbalances are
1.23 percentage points higher following positive reports relative to negative reports.
17 We exclude Self-Comments since the sign is not in the predicted direction. In untabulated tests, we find no
evidence that order imbalances are correlated with either Self-Comments or Long × Self-Comments.
Specification 2 augments Specification 1 by including NPA_Char. We find no evidence
that retail investors respond more strongly to reports authored by more skilled contributors. None
of the estimates for the four NPA characteristics is significantly different from zero. The lack of a differential
response, particularly with respect to 𝛼̂𝑖10 , is perhaps surprising given the abundance of evidence
that retail investors chase fund managers with recent past performance (e.g., Sirri and Tufano,
1998). One critical difference, however, is that mutual fund returns are featured prominently on
websites, brokerage accounts, fund prospectuses, and in the media. In contrast, the past
performance of NPAs is generally not disclosed, which makes the information far less salient. In
this sense, our findings are broadly consistent with a vast literature that suggests that the salience
of information, rather than the information content itself, is often a primary driver of mutual fund
flows (see, e.g., Kaniel and Parham, 2017; Hartzmark and Sussman, 2019; and Clifford, Fulkerson,
Jame, and Jordan, 2020).
Specifications 3 and 4 of Table 8 report the results for institutional order imbalances.
Institutional order imbalances are related to SA report recommendations (i.e., 𝛽1 > 0), although the
magnitude is much smaller than the estimates for retail investors. Institutional order imbalances
are also more strongly correlated with report recommendations when the report is authored by an
NPA who comments on other contributors’ research. However, the estimates for the other three
NPA attributes are economically small and statistically insignificant. Institutional investors’ failure
to respond more strongly to NPAs with stronger past performance is perhaps puzzling since this
strategy earns abnormal returns even after incorporating bid-ask spreads. However, as discussed
previously, incorporating other transaction costs, most notably price impact, may reduce the
appeal of this trading strategy for large institutional investors. Alternatively, it is possible that the
institutional order imbalance measure is dominated by uninformed trading (e.g., index fund
trading, liquidity-motivated trading, and trading of unskilled asset managers). Regardless, the
evidence from Table 8 indicates that neither retail nor institutional investors react more strongly
to research reports authored by contributors with stronger past performance. The lack of a strong
response to the recommendations of the most skilled NPAs is consistent with the previous evidence
that the investment value of their recommendations is only incorporated into market prices after a
significant delay.
6. Conclusion
Investors are increasingly relying on the research of non-professional analysts (NPAs). In
this paper, we offer a first look at the cross-section of skill among NPAs contributing investment
research on the Seeking Alpha platform. We estimate that a substantial fraction (60%) of NPAs
are skilled. More importantly, we document substantial dispersion in skill. In particular, after
accounting for variability due to estimation error, we find that the dispersion in true ability among
NPAs is roughly eight times as large as the cross-sectional dispersion in performance across mutual
fund managers.
The market does not fully recognize skill differences across NPAs. A simple transaction-
based calendar time strategy that only follows the recommendations of NPAs in the top quintile of
past performance generates annualized abnormal returns in excess of 10% even after incorporating
bid-ask spreads and allowing for a three-day investment delay. In contrast, an analogous strategy
that follows the recommendations of all NPAs does not generate significant outperformance.
Despite the sizeable trading gains associated with following only the most skilled contributors, an
analysis of retail and institutional order imbalances around NPA recommendations suggests that
neither group recognizes differences in contributor skill.
Our findings are consistent with much of the recent literature that suggests social media
can have positive effects on financial markets and improve investment decision making. At the
same time, the markets’ failure to incorporate sizeable differences in ability across NPAs suggests
that many of the benefits are not being fully realized. Our findings raise the question of whether
social media sites, or possibly even regulators, should provide more readily available information
about NPA attributes, including past performance, to help investors better recognize differences
in ability.
Appendix A: Variable Definitions
A.1 Skill and Portfolio Variables (Tables 1 - 6)
• $\bar{\hat{\alpha}}_i$ – the average buy-and-hold estimated abnormal return for NPA i. We compute the
buy-and-hold return for all positive and negative Seeking Alpha research over the
subsequent five [0,5] or 63 [0,63] trading days, and we multiply returns following negative
research by negative 1. Abnormal returns are based on a six-factor alpha, aggregated at the
NPA level by averaging across all the NPA’s reports.
o Negative (Positive) Report – We follow a two-step procedure: first we classify all
reports in which an investor discloses a long (short) position in the stock as positive
(negative) (Campbell et al. 2019). For all remaining reports, we follow Chen et
al. (2019) and compute the tone of the report as the percentage of negative words
in the report (Percent Negative), where the negative word list is taken from
Loughran and McDonald (2011). We assign reports in the bottom (top) tercile of
Percent Negative relative to the distribution of report tone on the previous day as
positive (negative).
o The six-factor alpha for each recommendation k is computed as:
\[ R_{kt} - \left( rf_t + \hat{\beta}_{k,1}(MKTRF) + \hat{\beta}_{k,2}(SMB) + \hat{\beta}_{k,3}(HML) + \hat{\beta}_{k,4}(RMW) + \hat{\beta}_{k,5}(CMA) + \hat{\beta}_{k,6}(MOM) \right) \]
where the beta of stock k with respect to each factor is estimated
from the six-factor model with daily returns over the [-272, -21] trading day
window relative to the report release.
• t($\bar{\hat{\alpha}}_i$) – the t-statistic of the estimated average alpha, based on standard errors double clustered
by firm and date.
• $\pi_{type}$ – the estimate of the fraction of high- or low-type NPAs within the mixture model.
• μ – the average location of each NPA type within the mixture model.
• σ – the dispersion in true ability of each type within the mixture model.
• σij – the average dispersion in estimated ability of each type within the mixture model.
• Past Alpha – the average six-factor alpha from a contributor’s past 10 reports, excluding
any reports issued within the past 63 trading days.
• Quantile (Mixture Model Distribution) – The quantiles of the distribution are calculated
numerically, by estimating the return value q that solves
\[ X = \pi_0 \Phi\left(\frac{q-\mu_0}{\sigma_0}\right) + \pi_1 \Phi\left(\frac{q-\mu_1}{\sigma_1}\right) \]
for percentile X, where $\Phi$ denotes the standard normal cumulative distribution function.
• Total Reports – the number of reports resulting in a long (or short) position over the 63-
day holding period averaged across every trading day in the sample.
• Unique Stocks – the number of different stocks in the long (or short) portfolio, averaged
across every trading day in the sample.
• Microcap stocks – stocks below the NYSE 20th percentile for market capitalization.
• Overnight reports – reports which were issued outside of trading hours (9:30 am – 4 pm).
• Confounding Event – an indicator equal to one if the Seeking Alpha report is on the day of
or the day after a media article, brokerage research report, or earnings announcement.
o Media Article – Dow Jones News Service articles from RavenPack, with relevance
and novelty scores of 100 (Source: RavenPack).
o Sell-side Report – a sell-side analyst investment recommendation or earnings
forecast. (Source: IBES).
o Earnings Announcements (Source: IBES).
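As an illustration of the Quantile (Mixture Model Distribution) definition above, the following Python sketch solves for q by bisection. It reads the function applied to the standardized value as the normal cumulative distribution function, which is what a quantile calculation requires. The root-finding method, the search bounds, and the parameter values are assumptions for the example; they are not the paper's estimates or its numerical routine.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mixture_cdf(q, pi0, mu0, sigma0, pi1, mu1, sigma1) -> float:
    """CDF of a two-component normal mixture evaluated at q."""
    return pi0 * normal_cdf((q - mu0) / sigma0) + pi1 * normal_cdf((q - mu1) / sigma1)


def mixture_quantile(x, pi0, mu0, sigma0, pi1, mu1, sigma1, tol=1e-10) -> float:
    """Find q such that mixture_cdf(q) = x by bisection, for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    # Bracket the root well outside both components (10 standard deviations).
    width = 10.0 * max(sigma0, sigma1)
    lo, hi = min(mu0, mu1) - width, max(mu0, mu1) + width
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, pi0, mu0, sigma0, pi1, mu1, sigma1) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical symmetric mixture: the median sits midway between the two means.
median = mixture_quantile(0.5, 0.5, -1.0, 1.0, 0.5, 1.0, 1.0)
print(median)
```

Bisection works here because the mixture CDF is monotonically increasing in q, so any bracketing interval contains exactly one solution.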
A.2 Cross-Sectional Variables (Tables 7 and 8)
• 𝛼̂𝑖𝑘𝑡 – the buy-and-hold six-factor alpha following an NPA report, where alphas
following negative reports are multiplied by negative 1.
• Retail OIB – retail buy volume less retail sell volume, scaled by total retail trading volume.
Retail trades are identified and signed according to the algorithm of Boehmer, Jones,
Zhang, and Zhang (2020) (Source: TAQ).
• Institutional OIB – the non-retail share volume bought less the non-retail share volume
sold, scaled by the non-retail volume traded. Non-retail trading is signed using the Lee and
Ready (1991) algorithm. Non-retail buy (sell) volume is aggregate TAQ buy (sell) volume
less retail buy (sell) volume. When Daily Trade and Quote (DTAQ) data is available (2015-2017),
we use the Lee and Ready (1991) algorithm as classified by WRDS. For the Monthly Trade
and Quote (MTAQ) data sample (2007-2014), we use the Interpolated Lee and Ready Algorithm
of Holden and Jacobsen (2014) (Source: TAQ).
• Past Alpha – the average six-factor alpha from a contributor’s past 10 reports, excluding
any reports issued within the past 63 trading days.
• Across-report focus – the standard deviation of weights within each LDA topic, averaged
over topics. Measured over all existing reports written by a given NPA at time t.
• Within-report focus – the standard deviation of LDA topic weights within a report,
averaged over all existing reports written by a given NPA at time t.
• Self-Comments – the natural log of 1 plus the NPA’s total number of comments on her own
reports over the past year.
• Other-Comments – the natural log of 1 plus the NPA’s total number of comments on other
reports in the past year.
• PhD – An indicator equal to one if the contributor’s self-reported bio on Seeking Alpha
mentions a PhD.
• MBA – An indicator equal to one if the contributor’s self-reported bio on Seeking Alpha
mentions an MBA. (Source: Seeking Alpha).
• Top School – An indicator equal to one if the contributor’s self-reported bio on Seeking
Alpha mentions a degree from a school in the top 50 of SAT scores based on the 75th
percentile, as reported by the 2015 vintage of stateuniversity.com
(https://www.stateuniversity.com/rank/sat_75pctl_rank.html).
• HF – An indicator equal to one if the contributor’s self-reported bio on Seeking Alpha
mentions “hedge fund”.
• PE – An indicator equal to one if the contributor’s self-reported bio on Seeking Alpha
mentions “private equity”.
• Market Capitalization (Size) – the market capitalization, computed as share price
multiplied by total shares outstanding at the end of year t-1 (Source: CRSP).
• Book-to-Market (BM) – the book-to-market ratio, computed as the book value of equity
during the calendar year scaled by the market capitalization at the end of the calendar year.
Negative values are deleted, and positive values are winsorized at the 1st and 99th percentile.
(Source: CRSP and Compustat).
• NS – net stock issues: the natural log of the split-adjusted shares outstanding at the end of
year t-1 divided by the split-adjusted shares outstanding at the end of year t-2. (Source:
Compustat).
• Zero NS – an indicator equal to one if the firm had no new stock issues. (Source:
Compustat).
• dA/A – growth in assets: the natural log of assets per split-adjusted share
at the fiscal year end in t−1 divided by assets per split-adjusted share at the fiscal year end in
t−2. (Source: Compustat).
• Y/B – profitability: equity income (income before extraordinary items, minus dividends on preferred, plus
income statement deferred taxes) divided by book equity at the end of year t−1. (Source:
Compustat).
• Neg Y – an indicator equal to one if profitability is negative. (Source: Compustat).
• Pos Y/B – profitability when profitability is positive and zero otherwise. (Source:
Compustat).
• Ac/B – the change in operating working capital scaled by book equity at the end of year
t−1. (Source: Compustat).
• Pos Ac/B – change in operating working capital scaled by book equity when the change in
operating working capital is positive and zero otherwise. (Source: Compustat).
• Neg Ac/B – an indicator equal to one when the change in operating working capital is negative. (Source:
Compustat).
• Long – an indicator equal to 1 for positive Seeking Alpha reports, and 0 for negative
reports.
• Turnover – the average daily turnover (i.e., share volume scaled by shares outstanding)
during the calendar year t-1. (Source: CRSP).
• Volatility – the standard deviation of daily returns during the calendar year. (Source: CRSP).
• Return (m-2, m-12) – the buy-and-hold gross return over the prior two to 12 months.
(Source: CRSP).
o Ret (w-1) – the buy-and-hold gross return over the prior one week.
o Ret (m-1) – the buy-and-hold gross return over the prior one month.
o Ret (m-2, m-7) – the buy-and-hold gross return over the prior two to seven months.
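
The Past Alpha definition above can be illustrated with a minimal sketch. It assumes each report is stored as a hypothetical (trading day index, alpha) pair, that a report issued exactly 63 trading days ago is still excluded, and that the measure is undefined until 10 eligible reports exist. The function name `past_alpha` and the data layout are illustrative assumptions, not taken from the paper.

```python
from typing import List, Optional, Tuple


def past_alpha(reports: List[Tuple[int, float]], today: int,
               n_reports: int = 10, skip_days: int = 63) -> Optional[float]:
    """Average alpha over a contributor's most recent eligible reports.

    `reports` holds (trading_day_index, alpha) pairs for one contributor.
    Reports issued within the past `skip_days` trading days are dropped;
    the treatment of the exact boundary day is an assumption.
    """
    eligible = [(day, alpha) for day, alpha in reports
                if today - day > skip_days]
    if len(eligible) < n_reports:
        return None  # assumption: undefined without enough history
    eligible.sort(key=lambda pair: pair[0])   # oldest first
    recent = eligible[-n_reports:]            # the last n eligible reports
    return sum(alpha for _, alpha in recent) / n_reports


# Hypothetical contributor: 15 reports on days 0..14 with alpha = 0.01 * day.
example_reports = [(day, 0.01 * day) for day in range(15)]
example_value = past_alpha(example_reports, today=100)
```

Passing a `today` value that leaves fewer than 10 eligible reports returns `None`, which a portfolio sort would treat as "not yet ranked".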

References:
Alexander, G., Cici, G., and Gibson, S., 2007. Does motivation matter when assessing trade
performance? An analysis of mutual funds. Review of Financial Studies 20, 125-150.
Avery, C., Chevalier, J., and Zeckhauser, R., 2016. The “CAPS” prediction system and stock
market returns. Review of Finance 20, 1363-1381.
Barber, B., Lehavy, R., McNichols, M., and Trueman, B., 2001. Can investors profit from the
prophets? Security analyst recommendations and stock returns. Journal of Finance 56, 531-
563.
Barras, L., Scaillet, O., and Wermers, R., 2010. False discoveries in mutual fund performance:
Measuring luck in estimated alphas. Journal of Finance 65, 179-216.
Bartov, E., Faurel, L., and Mohanram, P., 2018. Can Twitter help predict firm-level earnings and
stock returns? The Accounting Review 93, 25-57.
Berk, J. and Green, R., 2004. Mutual fund flows and performance in rational markets. Journal of
Political Economy 112, 1269-1295.
Blei, D.M., Ng, A.Y. and Jordan, M.I., 2003. Latent Dirichlet Allocation. Journal of Machine
Learning Research 3, 993-1022.
Boehmer, E., Jones, C.M., Zhang, X. and Zhang, X., 2020. Tracking retail investor
activity, Working paper.
Campbell, J., DeAngelis, M., and Moon, J., 2019. Skin in the game: Personal stock holdings and
investors’ response to stock analysis on social media. Review of Accounting Studies 24, 731-
779.
Carhart, M.M., 1997. On persistence in mutual fund performance. Journal of Finance 52, 57-82.
Chaudhuri, R., Ivković, Z., Pollet, J. and Trzcinka, C., 2020. A tangled tale of training and talent:
PhDs in institutional asset management. Management Science, forthcoming.
Chen, H., De, P., Hu, J., and Hwang, B.H., 2014. Wisdom of the crowds: The value of stock
opinions transmitted through social media. Review of Financial Studies 27, 1367-1403.
Chen, Y., Cliff, M., and Zhao, H., 2017. Hedge funds: The good, the bad, and the lucky. Journal
of Financial and Quantitative Analysis 53, 33-64.
Chevalier, J. and Ellison, G., 1999. Are some mutual fund managers better than others? Cross-
sectional patterns in behavior and performance. Journal of Finance 54, 875-899.
Clifford, C.P., Fulkerson, J.A., Jame, R. and Jordan, B.D., 2020. Salience and mutual fund investor
demand for idiosyncratic volatility. Management Science, forthcoming.
Crane, A., and Crotty, K., 2020. How skilled are security analysts? Journal of Finance 75, 1629-
1675.
Edelen, R., 1999. Investor flows and the assessed performance of open-end mutual funds. Journal
of Financial Economics 53, 439-466.
Fama, E., and French, K., 1993. Common risk factors in the returns on stocks and bonds. Journal
of Financial Economics 33, 3-56.
Fama, E., and French, K., 2008. Dissecting anomalies. Journal of Finance 63, 1653-1678.
Fama, E., and French, K., 2010. Luck versus skill in the cross-section of mutual fund returns.
Journal of Finance 65, 1915-1947.
Fama, E., and French, K., 2015. A five-factor asset pricing model. Journal of Financial
Economics 116, 1-22.
Farrell, M., Green, T.C., Jame, R., and Markov, S., 2020. The democratization of investment research
and the informativeness of retail trading. Working paper.
Frazzini, A., Israel, R. and Moskowitz, T.J., 2018. Trading costs. Working paper.
Hartzmark, S.M. and Sussman, A.B., 2019. Do investors value sustainability? A natural
experiment examining ranking and fund flows. Journal of Finance 74, 2789-2837.
Harvey, C., and Liu, Y., 2018. Detecting repeatable performance. Review of Financial Studies 31,
2499-2552.
Ippolito, R.A., 1992. Consumer reaction to measures of poor quality: Evidence from the mutual
fund industry. The Journal of Law and Economics 35, 45-70.
Jagannathan, R., Malakhov, A., and Novikov, D., 2010. Do hot hands exist among hedge fund
managers? An empirical evaluation. Journal of Finance 65, 217-255.
Jame, R., 2018. Liquidity provision and the cross-section of hedge fund returns. Management
Science 64, 3288-3312.
Jame, R., Johnston, R., Markov, S., and Wolfe, M., 2016. The value of crowdsourced earnings
forecasts. Journal of Accounting Research 54, 1077-1110.
Kaniel, R. and Parham, R., 2017. WSJ Category Kings–The impact of media attention on
consumer and mutual fund investment decisions. Journal of Financial Economics 123, 337-
356.
Kaplan, S.N. and Schoar, A., 2005. Private equity performance: Returns, persistence, and capital
flows. Journal of Finance 60, 1791-1823.
Korajczyk, R. and Sadka, R., 2004. Are momentum profits robust to trading costs? Journal of
Finance 59, 1039-1082.
Kosowski, R., Naik, N., and Teo, M., 2007. Do hedge funds deliver alpha? A Bayesian and
bootstrap analysis. Journal of Financial Economics 84, 229-264.
Lee, C.M. and Ready, M.J., 1991. Inferring trade direction from intraday data. Journal of
Finance 46, 733-746.
Li, H., Zhang, X. and Zhao, R., 2011. Investing in talents: Manager characteristics and hedge fund
performances. Journal of Financial and Quantitative Analysis 46, 59-82.
Loughran, T. and McDonald, B., 2011. When is a liability not a liability? Textual analysis,
dictionaries, and 10‐Ks. Journal of Finance 66, 35-65.
Mikhail, M., Walther, B., and Willis, R., 2004. Do security analysts exhibit persistent differences
in stock picking ability? Journal of Financial Economics 74, 67-91.
Pastor, L., Stambaugh, R., and Taylor, L., 2015. Scale and skill in active management. Journal of
Financial Economics 116, 23-45.
Seasholes, M., and Zhu, N., 2010. Individual investors and local bias. Journal of Finance 65, 1987-
2010.
SEC, 2011. Internet fraud.
SEC, 2015. Updated investor alert: Social media and investing – Stock rumors.
SEC, 2017. Investor alert: Beware of stock recommendations on investment research websites.
Sirri, E.R. and Tufano, P., 1998. Costly search and mutual fund flows. Journal of Finance 53,
1589-1622.

Table 1: Summary Statistics of Average Abnormal Returns
This table reports cross-sectional summary statistics of the average buy-and-hold estimated abnormal return (alpha) for
NPA i, $\bar{\hat{\alpha}}_i$, following NPA investment research issued on Seeking Alpha. We compute the buy-and-hold
return for all positive or negative research over the subsequent five [0,5] or 63 [0,63] trading days, and we multiply
returns following negative research by negative 1. A research report is defined as positive (negative) if the NPA
discloses a long (short) position at the end of the report. For reports missing a long/short disclosure, reports are
assigned as positive (negative) if the fraction of negative words in the report is in the bottom (top) third of the
distribution of report tone on the previous day. Reports not assigned as positive or negative are
excluded from the analysis. Abnormal returns are based on a six-factor alpha as defined in Section 2.1. Alphas are
aggregated at the NPA level by averaging across all the NPA’s reports. We also report the distribution of $t(\bar{\hat{\alpha}}_i)$,
based on standard errors double clustered by firm and date, as well as the fraction of NPAs with positive performance,
significantly positive performance, and significantly negative performance. The sample includes 1,879 NPAs issuing
at least 10 research reports.
                                     alpha                  t(alpha)
                                [0,5]     [0,63]        [0,5]     [0,63]
N                                1879       1879         1879       1879
Mean                            0.38%      0.68%         0.26       0.18
Std Dev                         2.13%      6.62%         1.15       1.41
Skewness                         1.50       1.35         0.06       0.19
Kurtosis                        18.75      16.13         3.35       5.52
P5                             -2.25%     -8.02%        -1.61      -2.05
P10                            -1.33%     -5.61%        -1.18      -1.49
P25                            -0.42%     -2.21%        -0.50      -0.74
P50                             0.19%      0.52%         0.26       0.20
P75                             0.96%      3.06%         1.05       1.04
P90                             2.19%      6.94%         1.68       1.89
P95                             3.58%     10.38%         2.13       2.35

                                [0,5]     [0,63]
Fraction Positive               59.1%      55.5%
Significantly Positive (5%)     10.2%      12.8%
Significantly Negative (5%)      4.7%       8.4%
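
The report-level quantity summarized in Table 1 can be sketched as follows: compound the stock return over the holding window, subtract a compounded benchmark return, and flip the sign for negative reports. This is a minimal illustration that treats the benchmark as a given daily return series standing in for the six-factor benchmark of Section 2.1; the function names and the example numbers are hypothetical.

```python
from math import prod
from typing import Sequence


def signed_abnormal_return(stock_returns: Sequence[float],
                           benchmark_returns: Sequence[float],
                           is_positive_report: bool) -> float:
    """Buy-and-hold stock return minus buy-and-hold benchmark return.

    Returns following negative reports are multiplied by -1, so a
    positive value always means the recommendation was informative.
    """
    if len(stock_returns) != len(benchmark_returns):
        raise ValueError("return series must cover the same days")
    stock_bh = prod(1.0 + r for r in stock_returns)
    bench_bh = prod(1.0 + r for r in benchmark_returns)
    direction = 1.0 if is_positive_report else -1.0
    return direction * (stock_bh - bench_bh)


def npa_average_alpha(report_alphas: Sequence[float]) -> float:
    """NPA-level alpha: the simple average across the NPA's reports."""
    return sum(report_alphas) / len(report_alphas)


# Hypothetical two-day window: stock earns 1% then 2%, benchmark 0% then 1%.
long_alpha = signed_abnormal_return([0.01, 0.02], [0.00, 0.01], True)
short_alpha = signed_abnormal_return([0.01, 0.02], [0.00, 0.01], False)
```

In this toy example the same price path counts in favor of a long recommendation and against a short one.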

Table 2: Distribution of NPA Skill - Mixture Models
Panel A reports estimates of the fraction (π) of high- and low-type NPAs, the average location of each NPA type
(μ), the dispersion in true ability of each type (σ), and the average dispersion in estimated ability of each type (σij).
The results are reported separately for holding periods of five days and 63 days. The alpha of each NPA is calculated
as in Table 1. Panel B reports statistics on the estimated cross-sectional distribution of NPA alpha. The quantiles
of the distribution are calculated numerically, by estimating the return value $q$ that solves
$$X = \pi_0 \Phi\left(\frac{q-\mu_0}{\sigma_0}\right) + \pi_1 \Phi\left(\frac{q-\mu_1}{\sigma_1}\right)$$
for percentile $X$, where $\Phi$ denotes the standard normal cumulative distribution function. Positive reports the
fraction of NPAs with alpha greater than 0. For both panels, standard errors are bootstrapped by sampling with
replacement 100 times from the underlying data and estimating the various statistics for each bootstrap sample. The
standard deviation of the bootstrapped distribution of each statistic is reported in parentheses.
Panel A: Estimated Model Parameters
            Five-Day Returns        63-Day Returns
            Component 0             Component 0    Component 1
π           -                       93.87%         6.13%
            -                       (29.33%)       (29.33%)
μ           0.15%                   0.37%          1.48%
            (0.02%)                 (0.19%)        (1.00%)
σ           0.27%                   1.44%          8.21%
            (0.03%)                 (2.45%)        (3.01%)
σij         1.38%                   4.32%          9.16%
            (0.03%)                 (2.45%)        (3.01%)

Panel B: Mixture Return Distribution
            Five-Day Returns        63-Day Returns
            Estimate    SE          Estimate    SE
Mean         0.15%     (0.02%)       0.43%     (0.13%)
SD           0.27%     (0.03%)       2.48%     (0.19%)
P5          -0.29%     (0.06%)      -2.29%     (0.48%)
P10         -0.19%     (0.04%)      -1.62%     (0.28%)
P25         -0.03%     (0.03%)      -0.65%     (0.27%)
P50          0.15%     (0.02%)       0.38%     (0.15%)
P75          0.33%     (0.03%)       1.41%     (0.29%)
P90          0.49%     (0.05%)       2.42%     (0.27%)
P95          0.59%     (0.06%)       3.17%     (0.44%)
Positive     71%       (4.28%)       60%       (5.87%)
N            1879                    1879
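
The link between Panel A and Panel B can be checked with a short sketch. It takes the 63-day point estimates from Panel A as given, evaluates a two-component normal mixture using each component's cumulative distribution function, and solves for quantiles by bisection. The function names and the bisection bounds are assumptions made for illustration; the paper does not describe its numerical routine.

```python
import math
from typing import Sequence, Tuple


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(q: float, weights: Sequence[float],
                means: Sequence[float], sds: Sequence[float]) -> float:
    """Weighted sum of the component CDFs evaluated at q."""
    return sum(w * normal_cdf(q, m, s)
               for w, m, s in zip(weights, means, sds))


def mixture_quantile(p: float, weights, means, sds,
                     lo: float = -100.0, hi: float = 100.0,
                     tol: float = 1e-10) -> float:
    """Bisection on q; valid because the mixture CDF is increasing in q."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, means, sds) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mixture_mean_sd(weights, means, sds) -> Tuple[float, float]:
    """Mean and standard deviation of the mixture distribution."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (s * s + m * m)
                        for w, m, s in zip(weights, means, sds))
    return mean, math.sqrt(second_moment - mean * mean)


# 63-day point estimates from Panel A, expressed in percent.
weights, means, sds = [0.9387, 0.0613], [0.37, 1.48], [1.44, 8.21]
mix_mean, mix_sd = mixture_mean_sd(weights, means, sds)
median = mixture_quantile(0.50, weights, means, sds)
p95 = mixture_quantile(0.95, weights, means, sds)
frac_positive = 1.0 - mixture_cdf(0.0, weights, means, sds)
```

Because the inputs are rounded to two decimals, the outputs should match Panel B only approximately.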

Table 3: Characteristics of Portfolios Formed Based on NPA Recommendations
This table reports descriptive statistics for calendar-time transaction-based portfolios. Following a positive (negative) report, $1 worth of stock is placed in the
long (short) portfolio and held in the portfolio for 63 trading days, resulting in a time-series of portfolio holdings spanning from January 2007 through March
2018. Total Reports indicates the number of reports resulting in a long (or short) position over the 63-day holding period, averaged across every trading day in the
sample. Unique Stocks is the number of different stocks in the long (or short) portfolio, averaged across every trading day in the sample. Factor loadings
are the slope coefficients from a single time-series regression of portfolio excess returns (Rp – Rf) on the market excess return (MKTRF), and zero-investment
portfolios with respect to size (SMB), book-to-market (HML), momentum (UMD), profitability (RMW), and investment (CMA). Panel A reports the results for a
strategy that follows all NPAs after they have issued 10 recommendations. Panel B (C) reports analogous results for a strategy that only follows NPAs in the top
(bottom) quintile of past performance. Past performance is the average alpha from a contributor’s past 10 reports, excluding any reports issued within the past
63 trading days. Quintile assignments are based on the distribution of alpha at the end of the previous month and include all contributors who have issued at
least one research report in the past year and at least five research reports since the start of the sample.
                                          Factor Loading from Six-Factor Model
                Total     Unique
                Reports   Stocks    MKTRF     SMB     HML     UMD     RMW     CMA      R2
Panel A: All NPAs
Long             1556      451       1.01    0.24   -0.14   -0.02   -0.18   -0.16    96.0%
Short             786      270       1.06    0.42    0.22   -0.44   -0.28   -0.30    88.0%
Long - Short                        -0.06   -0.18   -0.36    0.42    0.11    0.14    50.6%
Panel B: Top Quintile of NPAs (Past Alpha)
Long              235      129       1.02    0.29   -0.19   -0.02   -0.24   -0.10    90.1%
Short             168       97       1.07    0.34    0.26   -0.50   -0.38   -0.45    77.1%
Long - Short                        -0.05   -0.05   -0.44    0.48    0.14    0.35    34.3%
Panel C: Bottom Quintile of NPAs (Past Alpha)
Long              277      153       1.01    0.36   -0.22   -0.13   -0.40   -0.07    90.8%
Short             162       93       1.03    0.38    0.19   -0.34   -0.31   -0.35    87.1%
Long - Short                        -0.02   -0.02   -0.41    0.21   -0.09    0.28    26.7%

Table 4: Monthly Alphas Earned by Portfolios Formed Based on NPA Recommendations
This table reports the average six-factor alphas from calendar-time transaction-based portfolios formed based on
NPA recommendations. Following a positive (negative) report, $1 worth of stock is placed in the long (short)
portfolio and held in the portfolio for 63 trading days. We compute the daily return on the portfolio, resulting in a
time-series of portfolio returns spanning from January 2007 through March 2018. The reported alphas are the
intercepts from time-series regressions of the portfolio excess return (Rp – Rf) on the market excess return
(MKTRF), and zero-investment portfolios with respect to size (SMB), book-to-market (HML), momentum (UMD),
profitability (RMW), and investment (CMA). For ease of interpretation, we convert the daily alpha to a monthly
alpha by multiplying the estimate by 21. Panel A reports the results for a strategy that follows all NPAs after they
have issued 10 recommendations. Panel B reports analogous results for a strategy that follows NPAs conditional on
past performance. Past performance is the average alpha from a contributor’s past 10 reports, excluding any reports
issued within the past 63 trading days. Quintile assignments are based on the distribution of alpha at the end of the
previous month and include all contributors who have issued at least one research report in the past year and at least
five research reports since the start of the sample. Panel C reports the difference between the top and bottom quintile
(Top – Bottom) and the difference between the top quintile and all NPAs (Top – All). The t-statistics, computed from
the time-series standard errors, are reported in parentheses.
Panel A: Unconditional Strategy
                                        Long       Short     Long - Short
All NPAs                                0.40%     -0.39%        0.79%
                                       (3.68)    (-1.45)       (2.90)
Panel B: Conditional Strategy based on Past Performance
Top Quintile (high past alpha)          1.02%     -0.76%        1.78%
                                       (5.66)    (-1.84)       (4.10)
Q2                                      0.65%     -0.63%        1.27%
                                       (4.89)    (-2.58)       (4.81)
Q3                                      0.40%     -0.19%        0.59%
                                       (3.01)    (-0.82)       (2.37)
Q4                                      0.20%      0.08%        0.12%
                                       (1.42)     (0.32)       (0.46)
Bottom Quintile (low past alpha)       -0.42%      0.23%       -0.65%
                                      (-2.31)     (0.89)      (-2.19)
Panel C: Strategy Comparison
Top Quintile - Bottom Quintile          1.38%     -1.05%        2.43%
                                       (6.58)    (-3.05)       (5.68)
Top Quintile - All NPAs                 0.56%     -0.43%        0.99%
                                       (4.20)    (-2.00)       (3.85)
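
The alpha calculation described in the Table 4 caption (the intercept of a time-series regression on six factors, scaled from daily to monthly by multiplying by 21) can be sketched with ordinary least squares. The example below uses simulated, noise-free data purely to show the mechanics; the factor values, loadings, and function names are hypothetical, and the small Gauss-Jordan solver stands in for whatever statistical software the authors used.

```python
import random
from typing import List, Sequence


def ols_coefficients(y: Sequence[float],
                     X: Sequence[Sequence[float]]) -> List[float]:
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]   # partial pivoting
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def monthly_alpha(excess_returns: Sequence[float],
                  factors: Sequence[Sequence[float]],
                  days_per_month: int = 21) -> float:
    """Regression intercept (daily alpha) multiplied by 21."""
    X = [[1.0] + list(f) for f in factors]    # prepend the intercept column
    return ols_coefficients(excess_returns, X)[0] * days_per_month


# Simulated data: 500 days, six hypothetical factor series, no noise.
rng = random.Random(0)
factors = [[rng.gauss(0.0, 0.01) for _ in range(6)] for _ in range(500)]
loadings = [1.0, 0.2, -0.1, 0.0, -0.2, -0.1]   # illustrative, not estimates
daily_alpha = 0.0004
excess = [daily_alpha + sum(b * f for b, f in zip(loadings, day))
          for day in factors]
estimated_monthly_alpha = monthly_alpha(excess, factors)
```

With real portfolio returns the residuals are not zero, and inference would rely on the time-series standard errors mentioned in the caption.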

Table 5: Monthly Alphas Earned by Portfolios Formed Based on NPA Recommendations – Robustness
This table examines the sensitivity of the estimates in Table 4 (tabulated for convenience in Row 1) to alternative
research design choices. We report the average returns on the Long - Short portfolio for NPAs in the top quintile of
past performance (Top Quintile), and the difference in the return of the Long - Short portfolio of contributors in
the top quintile of past performance relative to contributors in the bottom quintile (Top – Bottom) or relative to all
NPAs (Top – All). In Specifications 2 through 4, past performance is measured using the prior five reports, all
reports over the prior year, or the t-statistic of alpha (i.e., the alpha scaled by its standard error). In Specifications
5 through 8, we replace six-factor alphas with either market-adjusted returns, the Fama-French (1993) three-factor
alpha, the Carhart (1997) four-factor alpha, or the Fama-French (2015) five-factor alpha. In Specifications 9 and
10, we replace the one-quarter holding period [0,63] with holding periods of one month [0,21] or six months [0,126].
Specification 11 excludes microcap stocks, defined as stocks below the NYSE 20th size percentile; Specification
12 excludes reports issued outside of trading hours (9:30 am – 4 pm); and Specification 13 excludes reports that are
issued on the same day or the day after an earnings announcement, sell-side research report, or media article.
                                                 Return on Long-Short Portfolio
Specification                                Top Quintile   Top - Bottom   Top - All
1. Baseline                                     1.78%          2.43%         0.99%
                                               (4.10)         (5.68)        (3.85)
Alternative Measures of Past Performance
2. Alpha over past 5 reports                    1.69%          2.35%         0.90%
                                               (4.25)         (5.93)        (3.71)
3. Alpha using all reports in prior year        1.65%          1.76%         0.86%
                                               (3.20)         (3.21)        (2.41)
4. t-statistics of alpha                        1.63%          2.29%         0.85%
                                               (4.30)         (5.77)        (3.55)
Alternative Measures of Abnormal Returns
5. Market-adjusted returns                      2.13%          2.62%         1.06%
                                               (4.26)         (5.97)        (4.06)
6. Three-factor alpha                           1.91%          2.55%         1.03%
                                               (4.13)         (5.84)        (3.97)
7. Four-factor alpha                            1.85%          2.51%         1.02%
                                               (4.28)         (5.88)        (3.95)
8. Five-factor alpha                            1.82%          2.45%         1.00%
                                               (3.96)         (5.63)        (3.86)
Alternative Holding Periods
9. One-month [0,21]                             2.27%          1.58%         0.78%
                                               (4.05)         (2.56)        (2.07)
10. Six-months [0,126]                          1.42%          2.05%         0.79%
                                               (3.67)         (5.18)        (3.56)
Alternative Sample Filters
11. Exclude microcap stocks                     1.62%          2.52%         0.90%
                                               (3.52)         (5.75)        (3.39)
12. Exclude overnight articles                  1.46%          2.04%         0.53%
                                               (3.62)         (4.28)        (1.69)
13. Exclude confounding events                  1.58%          1.99%         0.83%
                                               (2.49)         (3.02)        (1.85)

Table 6: Monthly Alphas Earned by Portfolios Formed based on NPA Recommendations - Accounting for
Bid-Ask Spreads and Investment Delays
This table examines the sensitivity of the abnormal returns estimated in Table 4 (tabulated for convenience in Row
1) to bid-ask spreads and investment delays. Row 2 incorporates bid-ask spreads by assuming that all initial
purchases (sales) are executed at the ask (bid) price at the time of the transaction, and that all long (short) positions
are unwound at the end of the holding period by selling (buying) at the closing bid (ask) price. In Rows 3 (4), we
incorporate bid-ask spreads as well as a 24 (72) hour delay in reacting to investment research. Panel A reports the
returns to following the recommendations of NPAs in the top quintile of past performance (as defined in Table 4),
and Panel B reports the returns to following all NPAs. The t-statistics, computed from the time-series standard errors,
are reported in parentheses.
                                          Long       Short     Long - Short
Panel A: Top Quintile
1. No Bid-Ask Spreads & No Delay          1.02%     -0.76%        1.78%
                                         (5.66)    (-1.84)       (4.10)
2. Bid-Ask Spreads & No Delay             0.74%     -0.45%        1.19%
                                         (4.08)    (-1.10)       (2.75)
3. Bid-Ask Spreads & 24-Hour Delay        0.65%     -0.32%        0.98%
                                         (3.64)    (-0.81)       (2.30)
4. Bid-Ask Spreads & 72-Hour Delay        0.55%     -0.29%        0.84%
                                         (3.05)    (-0.77)       (2.07)
Panel B: All NPAs
1. No Transaction Costs & No Delay        0.40%     -0.39%        0.79%
                                         (3.68)    (-1.45)       (2.90)
2. Bid-Ask Spreads & No Delay             0.15%     -0.07%        0.22%
                                         (1.33)    (-0.27)       (0.80)
3. Bid-Ask Spreads & 24-Hour Delay        0.06%      0.06%        0.00%
                                         (0.55)     (0.23)       (0.00)
4. Bid-Ask Spreads & 72-Hour Delay        0.00%      0.09%       -0.09%
                                         (0.02)     (0.38)      (-0.36)
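
The spread adjustment in Row 2 can be illustrated for a single position. The sketch assumes one entry and one exit with no dividends or interim rebalancing, which simplifies the paper's calendar-time portfolios; the function names and the quotes in the example are hypothetical.

```python
def spread_adjusted_return(entry_bid: float, entry_ask: float,
                           exit_bid: float, exit_ask: float,
                           is_long: bool) -> float:
    """Round-trip return when trades cross the bid-ask spread.

    Long positions buy at the ask and later sell at the bid.
    Short positions sell at the bid and later buy back at the ask.
    """
    if is_long:
        return exit_bid / entry_ask - 1.0
    return 1.0 - exit_ask / entry_bid


def midquote_return(entry_bid: float, entry_ask: float,
                    exit_bid: float, exit_ask: float,
                    is_long: bool) -> float:
    """Same position valued at quote midpoints (no spread cost)."""
    entry_mid = 0.5 * (entry_bid + entry_ask)
    exit_mid = 0.5 * (exit_bid + exit_ask)
    raw = exit_mid / entry_mid - 1.0
    return raw if is_long else -raw


# Hypothetical quotes: 9.95 / 10.05 at entry, 10.95 / 11.05 at exit.
long_net = spread_adjusted_return(9.95, 10.05, 10.95, 11.05, True)
long_mid = midquote_return(9.95, 10.05, 10.95, 11.05, True)
short_net = spread_adjusted_return(9.95, 10.05, 10.95, 11.05, False)
short_mid = midquote_return(9.95, 10.05, 10.95, 11.05, False)
```

In both directions the spread-adjusted return is lower than the midquote return, which is the direction of the decline from Row 1 to Row 2.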

Table 7: Determinants of NPA Recommendation Informativeness
This table reports estimates from the following regression:
$$\hat{\alpha}_{ikt} = \alpha + \beta_1 \mathit{Past\_Alpha} + \beta_2 \mathit{NPA\_Char}_{it} + \beta_3 \mathit{Firm\_Char}_{kt} + \mathit{Time}_t + \varepsilon_{ikt}$$
The dependent variable, $\hat{\alpha}_{ikt}$, is the buy-and-hold six-factor alpha, where alphas following negative
recommendations are multiplied by negative 1. In Specifications 1-4, returns are measured over a 63-day holding
period [0,63]. Specifications 5 and 6 decompose the 63-day return into a [0,1] return and a [2,63] return. Past Alpha
is the average alpha from a contributor’s past 10 articles, excluding any articles issued within the past 63 trading
days. NPA_Char and Firm_Char include contributor and firm characteristics. Detailed definitions for these
characteristics are provided in the Appendix. All continuous variables are standardized to have mean zero and unit
variance. The t-statistics (in parentheses) are computed from standard errors double-clustered by author and month.
                         [0,63]    [0,63]    [0,63]    [0,63]     [0,1]    [2,63]
                          [1]       [2]       [3]       [4]       [5]       [6]
Past Alpha                1.33                1.32      1.21      0.06      1.21
                         (6.82)              (6.78)    (6.34)    (4.52)    (6.34)
Across-Article Focus                0.21      0.22      0.22     -0.02      0.22
                                   (2.23)    (2.41)    (2.51)   (-1.67)    (2.51)
Within-Article Focus                0.23      0.21      0.23      0.02      0.23
                                   (2.23)    (2.13)    (2.65)    (1.32)    (2.65)
Self-Comments                      -0.39     -0.35     -0.41     -0.02     -0.41
                                  (-2.08)   (-2.13)   (-2.32)   (-0.61)   (-2.32)
Other-Comments                      0.55      0.49      0.49     -0.01      0.49
                                   (3.50)    (3.40)    (3.23)   (-0.50)    (3.23)
PhD                                 0.37      0.47      0.58     -0.21      0.58
                                   (0.36)    (0.54)    (0.65)   (-1.71)    (0.65)
MBA                                -0.22     -0.13     -0.14      0.07     -0.14
                                  (-0.37)   (-0.25)   (-0.33)    (1.13)   (-0.33)
Top School                         -0.07     -0.04     -0.49     -0.16     -0.49
                                  (-0.19)   (-0.12)   (-1.34)   (-3.63)   (-1.34)
HF                                  0.45      0.46      0.07      0.02      0.07
                                   (0.90)    (1.05)    (0.16)    (0.28)    (0.16)
PE                                  0.63      0.53      1.49      0.00      1.49
                                   (0.41)    (0.37)    (1.16)    (0.04)    (1.16)
Log (Size)                                              0.01     -0.06      0.01
                                                       (0.24)   (-7.38)    (0.24)
Log (BM)                                                0.40      0.07      0.40
                                                       (1.26)    (1.41)    (1.26)
Ret (m-2, m-12)                                        -0.00      0.00     -0.00
                                                      (-0.69)    (0.31)   (-0.69)
Zero NS                                                -0.27      0.15     -0.27
                                                      (-0.40)    (1.50)   (-0.40)
NS                                                     -0.00      0.00     -0.00
                                                      (-0.21)    (0.20)   (-0.21)
Neg Ac/B                                               -0.00     -0.00     -0.00
                                                      (-0.47)   (-0.25)   (-0.47)
Pos Ac/B                                               -0.01     -0.00     -0.01
                                                      (-0.76)   (-0.40)   (-0.76)
dA/A                                                    0.00      0.00      0.00
                                                       (0.11)    (0.43)    (0.11)
Neg Y                                                  -0.48      0.06     -0.48
                                                      (-0.95)    (1.16)   (-0.95)
Pos Y/B                                                -0.01      0.00     -0.01
                                                      (-1.33)    (1.39)   (-1.33)
Month Fixed Effects       Yes       Yes       Yes       Yes       Yes       Yes
Observations            114,072   117,064   114,071   100,453   100,453   100,453
R-squared                0.93%     0.62%     0.96%     0.91%     0.63%     0.91%
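
Table 7 notes that all continuous variables are standardized to have mean zero and unit variance. A minimal sketch of that transformation follows. It assumes the population standard deviation is used, since the table does not say whether the population or sample formula applies, and the function name `standardize` is hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Rescale a variable to mean zero and unit (population) variance."""
    center = fmean(values)
    scale = pstdev(values)
    if scale == 0.0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - center) / scale for v in values]


# Hypothetical raw values of one continuous regressor.
z_scores = standardize([1.0, 2.0, 3.0, 4.0])
```

After this step, each slope in the table measures the change in the dependent variable associated with a one-standard-deviation change in the regressor.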

Table 8: Retail and Institutional Order Imbalances around NPA Recommendations
This table reports estimates from the following regression:
$$\mathit{OIB}_{it} = \beta_1 \mathit{Long}_{ikt} + \beta_2 \mathit{Long}_{ikt} \times \mathit{NPA\_Char}_{it} + \beta_3 \mathit{NPA\_Char}_{it} + \beta_4 \mathit{Char}_{it} + \mathit{Day}_t + \varepsilon_{i,t}$$
OIB is either retail investor order imbalance (Retail OIB) or institutional investor order imbalance (Inst OIB),
measured over the [0,1] interval around the release of the recommendation. Retail buy and sell volume are
calculated as in Boehmer et al. (2020), and institutional buy and sell volume are determined using the Lee and
Ready (1991) algorithm. Long is an indicator equal to 1 for positive reports and 0 for negative reports, where
positive and negative are defined as in Table 1. NPA_Char includes Past Alpha, Other-Comments, Across-Report
Focus, and Within-Report Focus, as defined in the Appendix. Char is a vector of firm characteristics. Detailed
definitions are provided in the Appendix. All continuous variables are standardized to have mean zero and unit
variance. The t-statistics (in parentheses) are computed from standard errors double-clustered by firm and date.
                                    Retail OIB              Inst. OIB
                                  [1]        [2]         [3]        [4]
Long                             1.23%      1.23%       0.27%      0.28%
                                (8.25)     (8.30)      (3.15)     (3.09)
Long * Past Alpha                           0.00%                  0.04%
                                          (-0.01)                 (0.47)
Long * Across-Report Focus                 -0.03%                 -0.10%
                                          (-0.26)                (-1.27)
Long * Within-Report Focus                  0.18%                 -0.03%
                                           (1.45)                (-0.29)
Long * Other Comment                       -0.02%                  0.22%
                                          (-0.19)                 (2.97)
Past Alpha                                 -0.01%                  0.00%
                                          (-0.11)                 (0.02)
Author Focus                                0.13%                  0.01%
                                           (1.42)                 (0.22)
Report Focus                               -0.18%                  0.02%
                                          (-1.91)                 (0.25)
Other Comment                              -0.03%                 -0.06%
                                          (-0.19)                (-0.84)
Retail OIB (w-1)                 3.00%      3.00%
                               (15.54)    (15.54)
Inst OIB (w-1)                                          1.62%      1.58%
                                                      (19.14)    (18.84)
Ret (w-1)                       -0.46%     -0.46%      -0.09%     -0.10%
                               (-7.25)    (-7.26)     (-2.08)    (-2.33)
Ret (m-1)                       -0.28%     -0.28%       0.08%      0.10%
                               (-4.23)    (-4.22)      (1.82)     (2.18)
Ret (m-2, m-7)                   0.06%      0.06%       0.06%      0.05%
                                (0.80)     (0.76)      (1.17)     (0.94)
Log (Turn)                       0.26%      0.27%       0.38%      0.38%
                                (2.24)     (2.32)      (4.38)     (4.16)
Log (Vol)                        0.26%      0.26%      -0.28%     -0.26%
                                (2.65)     (2.63)     (-4.28)    (-3.82)
Log (Size)                      -0.11%     -0.11%      -0.01%     -0.01%
                               (-0.75)    (-0.71)     (-0.15)    (-0.06)
Log (BM)                         0.04%      0.04%       0.10%      0.10%
                                (0.23)     (0.25)      (1.56)     (1.46)
Date Fixed Effects               Yes        Yes         Yes        Yes
Observations                   106,263    106,263     106,263    106,263
R-squared                        7.93%      7.94%       9.99%      9.99%

Figure 1: Performance of Top NPAs by Year
This figure plots the average monthly alpha on the Long – Short portfolio for contributors in the top quintile of past
alpha (as defined in Table 4) for each year of the sample period. The error bars report 95% confidence intervals
computed based on standard errors from annual time-series regressions.
[Figure 1 plot: Monthly Alpha (vertical axis, -6.00% to 8.00%) by year (horizontal axis, 2007 through 2017).]

Figure 2: Top 10 words for each LDA topic
We apply Latent Dirichlet Allocation (Blei, Ng, and Jordan, 2003) to the sample of Seeking Alpha reports to
encode each report as a set of weights among twenty topics. This figure reports the top 10 characteristic words for each
of the 20 topics.
Topic 1      Topic 2      Topic 3      Topic 4      Topic 5      Topic 6      Topic 7
service      quarter      dividend     drug         revenue      tesla        bank
network      revenue      debt         patient      margin       u            loan
ford         million      billion      trial        guidance     china        rate
subscriber   earnings     income       phase        sale         car          america
att          billion      asset        study        quarter      industry     financial
customer     sale         yield        treatment    eps          sale         book
content      result       portfolio    product      think        product      credit
verizon      estimate     rate         fda          likely       model        banking
netflix      reported     capital      data         management   vehicle      asset
wireless     net          credit       sale         estimate     new          deposit

Topic 8      Topic 9      Topic 10     Topic 11     Topic 12     Topic 13     Topic 14
apple        earnings     dividend     report       technology   cash         ge
iphone       trading      value        option       product      revenue      user
aapl         short        earnings     case         intel        margin       google
product      day          ratio        risk         amd          flow         facebook
phone        week         valuation    information  solar        operating    revenue
device       analyst      cash         issue        device       management   mobile
new          position     return       investment   solution     million      platform
sale         buy          rate         short        new          cost         ad
watch        month        pe           doe          chip         acquisition  twitter
smartphone   report       yield        make         power        debt         advertising

Topic 15     Topic 16     Topic 17     Topic 18     Topic 19     Topic 20
microsoft    like         store        amazon       million      oil
game         just         sale         home         deal         production
disney       think        brand        walmart      value        energy
starbucks    going        food         online       shareholder  gas
gaming       make         retail       amzn         management   natural
window       thing        new          sale         acquisition  cost
new          dont         product      ecommerce    new          project
revenue      way          consumer     retail       board        million
movie        thats        restaurant   buy          ceo          asset
msft         new          customer     retailer     capital      barrel
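
The topic weights behind Figure 2 feed the two focus measures defined in the Appendix. The sketch below computes both from a matrix of topic weights with one row per report and one column per topic. It assumes the population standard deviation, which the Appendix does not specify, and the function names and toy weights are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Sequence


def within_report_focus(topic_weights: Sequence[Sequence[float]]) -> float:
    """Std. dev. of topic weights inside each report, averaged over reports."""
    return fmean([pstdev(report) for report in topic_weights])


def across_report_focus(topic_weights: Sequence[Sequence[float]]) -> float:
    """Std. dev. of each topic's weight across reports, averaged over topics."""
    by_topic = list(zip(*topic_weights))      # transpose: one tuple per topic
    return fmean([pstdev(topic) for topic in by_topic])


# Toy example with two topics instead of twenty.
same_topic = [[1.0, 0.0], [1.0, 0.0]]       # both reports load on topic 1
switching = [[1.0, 0.0], [0.0, 1.0]]        # each report loads on a different topic

within_same = within_report_focus(same_topic)
across_same = across_report_focus(same_topic)
within_switch = within_report_focus(switching)
across_switch = across_report_focus(switching)
```

In the toy data both authors write concentrated reports, so within-report focus is identical, while the across-report measure is zero for the author who never changes topic and positive for the author who switches.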