Exploratory Trading Harvard
Adam Clark-Joseph∗
Abstract
Empirical research suggests that high-frequency traders (HFTs) tend to be better
informed in some respects than their lower-frequency counterparts, but the precise con-
nection between rapid trading and superior information remains unclear. Anecdotal
accounts suggest that at least some HFTs use their own trades to gather information,
but such “exploratory trading” is poorly understood. In this paper, I formalize the in-
tuitive concept of exploratory trading in a simple model, and I show that exploratory
trading and high-frequency trading bear a natural relationship to one another. My
model sheds light on the broad question of how HFTs could parlay their speed into
valuable information, and it provides explanations for a variety of empirical findings
about high-frequency trading. In addition, the exploratory trading model generates a
number of new testable predictions about HFT activity.
∗
Harvard University, E-mail: [email protected]. I thank Andrei Kirilenko and other seminar
participants at the Commodity Futures Trading Commission, as well as seminar participants at Harvard
University for their useful feedback, and I thank John Campbell, Andrei Shleifer, Alp Simsek and Jeremy
Stein for their invaluable advice and comments. I gratefully acknowledge the support of an NSF Graduate
Research Fellowship.
1 Introduction
1.1 High Frequency Trading
Over the past few decades, information technology has permeated and reshaped financial
markets. Several recent papers, such as Hendershott and Riordan (2009) [20] and Hendershott
et al. (2011) [19], document the prevalence of algorithmic trading in modern (electronic)
financial markets. The research of Hendershott et al. provides compelling evidence that
general algorithmic trading tends to improve liquidity and aid the price-discovery process.
However, algorithmic trading can potentially be used in myriad ways, some of which are
socially desirable, and some of which may be deleterious to the public good.
Although the research of Hendershott et al. suggests that the positive effects of algo-
rithmic trading outweigh the negative effects in aggregate, this leaves open the possibility
that some subset of algorithmic traders is doing something harmful. The particular subset
of algorithmic traders that has attracted the greatest scrutiny in this regard is the
so-called “high-frequency traders” (“HFTs”).
1.1.2 Empirical Results
High-frequency trading is notoriously difficult to study empirically, because data with suf-
ficient temporal resolution are typically anonymous. Although we can use 13-F forms to
track the behavior of institutional investors at a quarterly frequency, there is no general,
simple way to track the behavior of an HFT at second- or millisecond-frequency. Nevertheless,
there are at least three datasets3 in which HFT-activity can be distinguished
from non-HFT-activity, and analyses of these datasets have established some foundational
empirical results about HFTs.
Perhaps the most striking and consistent finding in this empirical literature is that
an enormous fraction of trading volume can be attributed to HFTs. Estimates vary, but
even the most conservative estimates suggest that HFTs account for more than 30% of
total trading volume in U.S. equities, and most estimates fall in the 50% − 70% range (see
[10, 3]). These trading volume figures suggest that HFT activity constitutes an important
facet of modern financial markets, which naturally raises the question of exactly how HFTs
affect markets.
Three recent studies—Hasbrouck and Saar (2011), Brogaard (2010), and Kirilenko et
al. (2010)—all examine the same general question of how HFT activity affects markets,
but each study takes a very different approach to answering this question. Hasbrouck and
Saar study order-level NASDAQ data, and they develop statistical techniques to identify
“strategic runs” of orders that they attribute to HFTs. In a more direct approach, Brogaard
uses a novel dataset of order-level data for 120 stocks (60 listed on NASDAQ, 60 listed on
the NYSE); this dataset distinguishes messages from 26 firms that had been identified by
NASDAQ as engaging primarily in high frequency trading. Finally, Kirilenko et al. analyze
audit-trail, transaction-level data for the E-mini S&P 500 stock index futures market, from
May 3 to May 6, 2010. Kirilenko et al. use this data to sort over 15,000 trading accounts
into six categories on the basis of realized trading behavior (one category consisted of
HFTs, and the remaining categories consisted of different types of non-HFTs4 ).
Both Brogaard (2010) and Hasbrouck and Saar (2011) reach similar conclusions about
the effects of HFT activity. Hasbrouck and Saar conclude that increased HFT activity
tends to improve traditional measures of market quality such as short-term volatility and
spreads. Brogaard likewise finds evidence that HFTs may dampen intra-day volatility,
and that HFTs frequently provide inside quotes. Although Brogaard’s results suggest that
HFTs supply less additional book-depth than their inside-quote provision would typically
imply, he finds no evidence that HFTs flee the market in volatile times. Furthermore,
Brogaard finds that HFTs contribute significantly to the price-discovery process. Both
3
Specifically, the datasets are 1) the computerized trade reconstruction data from the Commodity
Futures Trading Commission (CFTC) that Kirilenko et al. (2010) [24] use to analyze the Flash Crash, 2)
Order-level data for 120 stocks (60 listed on NASDAQ, 60 listed on the NYSE) which flags the orders from
26 firms known by NASDAQ to primarily engage in high-frequency strategies, and 3) Transaction data
from the Deutsche Boerse that identifies whether or not each trade’s buyer and seller were participants in
the “Automated Trading Program”—i.e., whether the orders were generated by an algorithm. See [24, 11]
for details on (1), [3] for details on (2), and [14, 20] for details on (3).
4
Kirilenko et al. define “high-frequency trader” much more precisely than do either Hasbrouck and Saar
or Brogaard. The accounts that Kirilenko et al. classify as HFTs are archetypes of the SEC characteriza-
tion, but they are not necessarily close analogues of the “HFTs” that the other papers consider.
studies suggest that in aggregate, HFT activity tends to have a positive influence on
markets.
Whereas Brogaard (2010) and Hasbrouck and Saar (2011) examine HFT activity across
a large number of assets, over relatively long time spans, Kirilenko et al. focus on a single
asset during a short time span, because their primary interest is the role (if any) that HFTs
played in the Flash Crash of May 6, 2010. Kirilenko et al. conclude that HFT activity did
not actually trigger the Flash Crash, but HFTs did exacerbate market volatility through
their response to the unusually large selling pressure that day. These conclusions are
not incompatible with the broader findings of the other two studies, and some of the
more detailed results agree extremely well. For example, Brogaard’s analysis suggests that
rather than fleeing the market in volatile times, HFTs actually increase their participation
somewhat; Kirilenko et al. find that HFT trading volume increased dramatically in both
absolute and relative terms during the short interval on May 6 when the largest price changes
occurred. Nevertheless, the discrepancy between the rosy conclusions of the two more
general studies and the darker conclusions of the more specific study highlights the need
for a more thorough and detailed understanding of HFT activities.
employ a variety of different strategies5 , so these aggregate results potentially occlude im-
portant heterogeneity. More importantly, although traditional, formally-registered market-
makers typically have unique information about order-flow, almost no HFTs are formally
registered as market-makers [10]. Even if each HFT behaves exactly like a traditional
market-maker, the puzzle of how they use speed as a replacement for privileged order-flow
information would remain.
Many of the general techniques/strategies disclosed by and imputed6 to HFTs are
standard elements of the non-high-frequency realm. These include arbitrage, statistical
arbitrage, directional bets, and market-making. However, the high-frequency activities
commonly classified as “liquidity detection” lack low-frequency analogues. This classifica-
tion covers a number of slightly different techniques— “pinging,” “sniffing,” etc.—but all of
these techniques fundamentally entail the use of orders for the express purpose of gathering
information about the market. Obviously, such techniques point to a mechanism by which
HFTs might obtain superior information. Less obviously, but much more importantly, this
mechanism illuminates the connection between information and trading speed.
traded.
In principle, the trading process could reveal information about either the future asset-
intrinsic component of price, or the future price-impact of trades. However, essentially
all of the research in the spirit of Romer (1993) focuses on the question of how past trading
can reveal information about the asset-intrinsic component of future prices. Although the
revelation of information about price impact is a topic of both practical and theoretical
importance, the mechanism by which past trading reveals information about future price
impact is relatively transparent. The sole paper on this topic, Hong and Rady (2002),
covers this basic mechanism fairly comprehensively.
The theoretical question that I address in the present paper—how optimal initial trades
depend on the amount of information that they are expected to reveal—is strictly more
complicated than the analogous, converse, Romer (1993)-style questions. In the simplest,
two-period setting, the Romer (1993)-style question must be addressed for period 2, but
then to answer my style of question, we must determine how optimal period-2 trading
profits depend on the amount of information that period-1 trade reveals. Next, we must
roll back to period 1, solve a highly non-standard inference problem to find the relationship
between the period-1 trade and the amount of information the trade is expected to reveal,
use this to express the expected conditionally optimal period-2 trading profits in terms of
the period-1 trade, and then combine this with the expected direct trading profits from
the period-1 trade, and finally maximize the expected total profit with respect to the
period-1 trade. The very simplicity that makes learning about price-impact unattractive
for studying Romer (1993)-style issues makes “learning about price impact” an ideal setting
to develop a baseline model of exploratory trading.
Hong and Rady (2002) briefly discuss the topic that I address:
By using a slight variation on the Hong and Rady (2002) model (which is in turn a slightly
modified version of the Kyle (1985) model), I derive a closed-form solution for optimal
exploratory trading, I formalize Hong and Rady’s conjecture that informed traders may
have an incentive to engage in costly experiments in period 1 to increase their expected
profits in period 2, and I establish conditions under which the conjecture holds.
With the inner workings of exploratory trading laid bare, the connections between
exploratory and high-frequency trading follow easily. These connections shed light on the
general question of how trading speed relates to superior information, and they also suggest
explanations for a variety of existing empirical findings, as well as generating a number of
testable predictions.
2 A Model of Exploratory Trading
For purposes of tractability and expositional clarity, I initially consider exploratory trading
in the context of learning about price-impact. Although this is not necessarily the most
interesting application, the formal and conceptual results that I derive about exploratory
trading in this context extend to more general settings.
$$y_1 = \alpha - \beta^{-1} p_1 \tag{1}$$
$$y_2 = \alpha - \beta^{-1} p_2 \tag{2}$$
where $p_t$ denotes the price in period $t$ (see footnote 7). The parameter $\beta$ is drawn from a distribution
with strictly positive support, bounded away from zero, with mean $b$ and finite variance
$\sigma_\beta^2 > 0$; intuitively, $\beta$ represents the marginal price-impact of a market order. The parameter
$\alpha$ depends on $\beta$ and follows a distribution such that
$$\alpha \equiv \frac{\mu + \xi}{\beta} \tag{3}$$
for some zero-mean random variable $\xi \in L^1$ that is independent of $\beta$ and has finite variance
$\sigma_\xi^2 > 0$. The product $\alpha\beta$, which I will denote by $p_0 \equiv \alpha\beta$, corresponds to the price at
which $y_t = E[s_t]$. Equation (3) implies that $p_0 \equiv \mu + \xi$, and the distributional assumptions
on $\xi$ imply that $E[p_0 \mid \beta] = E[p_0] = \mu$.
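It can help to rewrite the demand curves in inverse form. The following is only a rearrangement of equations (1) through (3); the reading of $\beta$ as a slope of price in quantity follows from it, while the remark afterwards about how the HFT's order enters is an assumption, since the market-clearing condition is not part of this excerpt.

```latex
\begin{aligned}
p_t &= \beta(\alpha - y_t) = \alpha\beta - \beta y_t = p_0 - \beta y_t, \qquad t = 1, 2, \\
p_0 &= \alpha\beta = \mu + \xi, \qquad E[p_0 \mid \beta] = \mu + E[\xi \mid \beta] = \mu .
\end{aligned}
```

In this form, each unit by which $y_t$ falls raises the price by $\beta$. If the HFT's purchase is absorbed one-for-one by the market demand curve (an assumed reading), this is the sense in which $\beta$ acts as a marginal price impact.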
Consider a single agent—call him the “high-frequency trader,” or “HFT”—who submits
market orders8 in periods 1 and 2 with the objective of maximizing his expected aggregate
net profits. Denote by $x_t$ the number of shares that the HFT purchases in period $t$. Assume
7
From a theoretical standpoint, it might be nicer to have a second-period demand curve of the form
y2 = α − β −1 p2 − y1 , but this would make the algebra much messier. To the extent that the interesting
features of this model revolve around the estimation of β, the algebraically simpler case still delivers the
same intuitive conclusions.
8
Nothing fundamental would change if the HFT used limit orders instead of market orders. If we assume
both that there is no supply noise in period 2, and that the HFT knows $\mu - p_0$ perfectly, then allowing
the HFT to use limit orders would lead to a trivial solution wherein the HFT offers to purchase/sell an
unlimited quantity at the price $p_0 + \frac{\mu - p_0}{2}$. However, as long as the HFT is uncertain about either the net
supply in period 2, or the true value of $\mu - p_0$, allowing the HFT to submit limit orders would not produce
any pathologies. The HFT’s optimization problem would be less intuitive and far less tractable, but the
same basic concepts would apply.
that the HFT knows $\mu$ and the product $p_0 \equiv \alpha\beta$ (see footnote 9), but not $\alpha$ or $\beta$ individually. The HFT
observes his own trades, as well as the market-clearing price in each period, but he observes
neither the market demand curves, nor the quantity s1 . When the HFT observes p1 he
can use his knowledge of x1 and p0 to estimate β, but his uncertainty about s1 prevents
him from learning β exactly.
3 Solving the Model
Intuitively, the HFT’s basic strategy entails buying (selling) the asset when the price is
below (above) the terminal value µ. However, the HFT faces downward-sloping market
demand curves, so the market-clearing price pt depends on the HFT’s purchase xt . Con-
sequently, the HFT faces a trade-off between the number of shares that he buys, and the
spread µ − pt that he earns on each share. If the HFT knows the shape of the market
demand curve, then his optimization problem is isomorphic to that of a profit-maximizing
monopolist facing a downward-sloping demand curve.
If the HFT does not know the shape of the market demand curve, his task is more
complex than the standard monopolist’s problem. Although the HFT still faces the quan-
tity/spread trade-off, his ability to optimally balance this trade-off depends on the quality
of his information about the market demand curve. The price impact of the HFT’s trade
in the first period provides information about the slope of the market demand curve that the
HFT can use to better choose his trade in period 2, but this information comes at the
expense of trading costs in the first period.
In the remainder of this section, I derive the HFT’s optimal trading strategy through
backward induction.
3.1 Date t = 2
The HFT pays $x_1 p_1$ at date 1, and $x_2 p_2$ at date 2, then he receives a payoff of $\mu (x_1 + x_2)$
at the end of period 2, so the HFT’s total realized profit is $x_1 (\mu - p_1) + x_2 (\mu - p_2)$.
At date 2, the HFT chooses x2 to maximize the conditional expectation of his total
profit, E1 [x1 (µ − p1 ) + x2 (µ − p2 )]. Since x1 and p1 are determined before the second
period, the HFT’s choice of x2 depends only on his conditional expectation of his period-2
trading profits, E1 [x2 (µ − p2 )]. Thus the HFT solves
$$\max_{x_2} \; E_1[x_2 (\mu - p_2)] \quad \text{s.t.} \quad p_2 = p_0 + \beta x_2$$
Equation (4) confirms our basic intuitions about the HFT’s trading strategy; he trades
against perceived price dislocations, but he accounts for the anticipated price impact of
his trade.
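The display that equation (4) refers to is not reproduced in this excerpt. A sketch of the first-order condition, assuming only the constraint $p_2 = p_0 + \beta x_2$ and that $\mu$ and $p_0$ are known to the HFT at date 2, would run as follows. It is a reconstruction under those assumptions, not the author's typeset equation.

```latex
\begin{aligned}
E_1[x_2(\mu - p_2)] &= x_2(\mu - p_0) - E_1[\beta]\, x_2^2, \\
0 &= (\mu - p_0) - 2 E_1[\beta]\, \hat{x}_2
\quad\Longrightarrow\quad
\hat{x}_2 = \frac{\mu - p_0}{2 E_1[\beta]} .
\end{aligned}
```

The sign of $\hat{x}_2$ matches the sign of $\mu - p_0$ (trading against the perceived dislocation), and a larger expected price impact $E_1[\beta]$ shrinks the trade, which is the intuition stated in the text.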
Next, we wish to determine how the profit that the HFT earns from trading $\hat{x}_2$ depends
on the quality of his information about the demand curve. Let $x_2^* \equiv \frac{\mu - p_0}{2\beta}$ denote the
infeasible optimal trade that the HFT would select if he knew the true value of $\beta$, and let
$\pi_2^* \equiv \beta (x_2^*)^2$ denote the associated infeasible maximized profit. Intuitively, as $\hat{x}_2$ deviates
from the infeasible optimum $x_2^*$, we should expect the HFT’s realized profits to decline
relative to $\pi_2^*$. Indeed, in the appendix I show that the true profit that the HFT would
earn from the optimal feasible trade $\hat{x}_2$ can be expressed as
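The HFT’s expected feasible profit at a given value of $\beta$ can be expressed naturally in
terms of the mean-square error of $\frac{1}{E_1[\beta]}$ relative to $\frac{1}{\beta}$, and approximated in terms of the
conditional variance of $E_1[\beta]$:
\begin{align}
E[\hat{\pi}_2 \mid \beta, p_0] &= \pi_2^* - \frac{\beta (\mu - p_0)^2}{4}\, E\left[\left(\frac{1}{E_1[\beta]} - \frac{1}{\beta}\right)^2 \,\middle|\, \beta\right] \tag{6} \\
&\approx \pi_2^* - \frac{(\mu - p_0)^2}{4\beta^3}\, Var(E_1[\beta] \mid \beta) \tag{7}
\end{align}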
Equation (6) illustrates how variability in the HFT’s estimate of the market demand curve
reduces the profits that the HFT earns from his optimal feasible trading strategy in period
2.
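The profit decomposition can be checked numerically. The sketch below assumes that the feasible trade takes the form $(\mu - p_0)/(2E_1[\beta])$, which is what maximizing the stated period-2 objective under $p_2 = p_0 + \beta x_2$ gives. The function names and numerical values are illustrative and do not come from the paper.

```python
def realized_profit(mu, p0, beta, beta_hat):
    """Period-2 profit when the HFT sizes his trade using the estimate beta_hat."""
    x_hat = (mu - p0) / (2.0 * beta_hat)  # assumed form of the feasible trade
    p2 = p0 + beta * x_hat                # constraint p2 = p0 + beta * x2
    return x_hat * (mu - p2)


def profit_via_loss_term(mu, p0, beta, beta_hat):
    """Infeasible maximum minus the squared-error loss term appearing in (6)."""
    pi_star = (mu - p0) ** 2 / (4.0 * beta)
    loss = beta * (mu - p0) ** 2 / 4.0 * (1.0 / beta_hat - 1.0 / beta) ** 2
    return pi_star - loss


def profit_via_variance_approx(mu, p0, beta, beta_hat):
    """First-order approximation in the spirit of (7), for a single realization."""
    pi_star = (mu - p0) ** 2 / (4.0 * beta)
    return pi_star - (mu - p0) ** 2 / (4.0 * beta ** 3) * (beta_hat - beta) ** 2


# Hypothetical values: true beta = 2.0, estimate = 2.5, dislocation mu - p0 = 1.
exact = realized_profit(101.0, 100.0, 2.0, 2.5)
decomposed = profit_via_loss_term(101.0, 100.0, 2.0, 2.5)
approx = profit_via_variance_approx(101.0, 100.0, 2.0, 2.5)
```

With these inputs the realized profit and the decomposed profit agree up to floating-point error, and both fall short of the infeasible maximum of 0.125. The approximation is close but not exact, because it replaces the error in $1/E_1[\beta]$ by a linearization around $\beta$.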
3.2 Date t = 1
In section (3.1), I related the HFT’s optimal period-2 trading strategy and associated
profits to the estimator E1 [β]. In particular, I showed that the HFT’s expected profits
from trading in period 2 increased as E1 [β] became a better estimator of β. As I will show
below, the quality of E1 [β] as an estimator of β depends on x1 , the HFT’s order in period
1. Consequently, the HFT’s optimal trading strategy in period 1 will depend not only on
the direct revenues associated with the trade, but also on the extent to which the trade is
expected to improve the HFT’s information about the market demand curve in the next
period.
$$\Delta p = \beta (x_1 + s_1) \tag{8}$$
where I define $\Delta p \equiv p_1 - p_0$.
Since the single observation $(\Delta p, x_1)$ constitutes the entirety of the HFT’s empirical
data, $\beta$ is underidentified from the HFT’s perspective, regardless of the value of $x_1$.
However, the underidentification of $\beta$ means simply that the HFT cannot perfectly (i.e.,
consistently) estimate $\beta$ from a single, noisy observation. Although the traditional binary
identified/underidentified classification is well-suited for asymptotic analyses, the present
setting calls for finer distinctions.
No finite choice of x1 will allow the HFT to completely disentangle the effects of β from
those of supply noise on the basis of a single observation, but the value of x1 determines
how well the HFT can separate the effects of β from those of s1 . Intuitively, we might think
of x1 determining “how well” β is identified. We can make this intuitive notion precise by
using equation (6), and considering how x1 affects the HFT’s expected period-2 profits.
The exact effects of x1 will depend both on the distributions of β and s1 , and on the
HFT’s knowledge about these distributions, but the HFT will generally tend to learn more
about β the larger is the magnitude of x1 . Although I cannot invoke a central limit theorem
to sidestep these distributional details in the usual manner, I accomplish something similar
by considering the case in which |x1 | becomes large, and applying integrability/moment
conditions to characterize tail behavior. We can always bound the variance of $E_1[\beta]$ by
$E\left[(E_1[\beta] - \beta)^2\right] \le \frac{(\sigma_\beta^2 + b^2)\sigma_s^2}{x_1^2}$, and under mild regularity conditions, this bound becomes
tight as $|x_1|$ becomes large (see mathematical appendix for details). To avoid a morass
of unenlightening algebra, I will appeal to this tight bound and make the simplifying
assumption that $Var(E_1[\beta] \mid \beta) \equiv E\left[(E_1[\beta] - \beta)^2 \mid \beta\right]$ is given by
$$Var(E_1[\beta] \mid \beta) = K^2 \frac{\beta^2 \sigma_s^2}{x_1^2} \tag{9}$$
for $x_1^2 \ge K^2 \sigma_s^2$, where $1 \ge K > 0$ is some positive constant that depends on the unconditional
distributions of $\beta$ and of $s_1$.
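One way to see where a bound of this form could come from (the appendix derivation is not in this excerpt) is to look at the naive estimator $\Delta p / x_1$. Under the assumption that $s_1$ has mean zero and variance $\sigma_s^2$ and is independent of $\beta$, its error is $\beta s_1 / x_1$, so its mean-square error is $(\sigma_\beta^2 + b^2)\sigma_s^2 / x_1^2$. A conditional expectation minimizes mean-square error among functions of the observed data, so $E_1[\beta]$ can do no worse. The Monte Carlo sketch below uses hypothetical distributions (uniform $\beta$, Gaussian $s_1$) chosen only for illustration.

```python
import random


def naive_estimator_mse(x1, b=2.0, half_width=0.5, sigma_s=1.0,
                        n_draws=100_000, seed=0):
    """Simulated mean-square error of the naive estimator delta_p / x1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        beta = rng.uniform(b - half_width, b + half_width)  # strictly positive support
        s1 = rng.gauss(0.0, sigma_s)                        # zero-mean supply noise
        delta_p = beta * (x1 + s1)                          # equation (8)
        total += (delta_p / x1 - beta) ** 2
    return total / n_draws


def mse_bound(x1, b=2.0, half_width=0.5, sigma_s=1.0):
    """(sigma_beta^2 + b^2) * sigma_s^2 / x1^2 for the uniform beta used above."""
    var_beta = half_width ** 2 / 3.0  # variance of a uniform on [b - h, b + h]
    return (var_beta + b ** 2) * sigma_s ** 2 / x1 ** 2


mse_small = naive_estimator_mse(x1=2.0)
mse_large = naive_estimator_mse(x1=4.0)
```

Doubling $|x_1|$ cuts the simulated error by a factor of four, and the simulated values sit close to the analytical expression. That is the sense in which a larger first-period trade identifies $\beta$ "better".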
We can combine (9) with the approximation (7) to characterize the relationship between
$x_1$ and the HFT’s expected period-2 profits. Taking (7) to hold exactly (see footnote 11), we obtain
$$E[\hat{\pi}_2 \mid p_0] = \frac{(\mu - p_0)^2\, E[\beta^{-1}]}{4} \left(1 - \frac{K^2 \sigma_s^2}{x_1^2}\right) \tag{10}$$
Since the HFT could always choose not to trade at all in period 2, his expected period-2
profits must be non-negative. The right-hand side of equation (10) is negative for $x_1^2 < K^2 \sigma_s^2$,
so the model is not applicable in that region (this is why I only assume that (9) holds for
$x_1^2 \ge K^2 \sigma_s^2$).
The expected direct trading profit from trading x1 is
enter through the “$-b x_1^2$” term. Second, as the HFT trades more at date 1, he obtains
better information about $\beta$ that he can use to trade more profitably (in expectation) at
date 2; in equation (12), the “$-\frac{K^2 \sigma_s^2}{x_1^2}$” term reflects this informational benefit to trading.
This informational benefit pushes the optimal value of $|x_1|$ away from zero. Finally, the
direct gains from trading in period 1, reflected by the “$(\mu - p_0) x_1$” term in equation (12),
tend to push the optimal value of $x_1$ away from zero.
Unlike the informational gains from trading, the direct gains from trading in period 1
are completely standard and are thoroughly understood. We can isolate the informational
motive for trading from the “direct gain” motive by supposing that the HFT only observes
µ − p0 after he has selected x1 . Recall that I assume E [µ − p0 ] = 0, so the HFT’s initial
expectation of his total profit is
$$E[\pi_1 + \hat{\pi}_2] = -b x_1^2 + \frac{\sigma_\xi^2\, E[\beta^{-1}]}{4} \left(1 - \frac{K^2 \sigma_s^2}{x_1^2}\right) \tag{13}$$
Intuitively, (13) represents the profit that the HFT would expect to obtain if he had to
select the value of x1 before learning p0 . Since E [µ − p0 ] = 0, the “(µ − p0 ) x1 ” term from
(12) vanishes—in expectation, there is no direct gain from a “blind trade”.
In the mathematical appendix, I show that the $x_1^*$ that maximizes the HFT’s unconditional
expected profit, (13), is characterized by
$$(x_1^*)^2 = \begin{cases} \dfrac{1}{2}\, \sigma_\xi \sigma_s K \sqrt{\dfrac{E[\beta^{-1}]}{b}} & \text{if } \sigma_s^2 < \dfrac{E[\beta^{-1}]}{16 K^2 b}\, \sigma_\xi^2 \\[2ex] 0 & \text{if } \sigma_s^2 \ge \dfrac{E[\beta^{-1}]}{16 K^2 b}\, \sigma_\xi^2 \end{cases} \tag{14}$$
The condition $\sigma_s^2 < \frac{E[\beta^{-1}]}{16 K^2 b} \sigma_\xi^2$ ensures that the maximized expected profit is non-negative
(if the maximized expected profit is negative, the HFT would simply not participate in the
market). Also, the condition “$\sigma_s^2 < \frac{E[\beta^{-1}]}{16 K^2 b} \sigma_\xi^2$” implies $(x_1^*)^2 > 2K^2 \sigma_s^2 > K^2 \sigma_s^2$, and
therefore guarantees that we are in a valid region of our model. Since I have removed the
“direct gain” motive for trading in period 1, the optimal trade characterized in equation
(14) is driven entirely by information-seeking concerns.
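The closed form in (14) can be sanity-checked against a brute-force maximization of (13). The sketch below treats (13) as a function of $x_1^2$ and searches a grid over the region where the model applies. The parameter values, the function names, and the grid bounds are all hypothetical choices made for illustration.

```python
import math


def expected_total_profit(x1_sq, sigma_s2, b, e_inv_beta, sigma_xi2, K):
    """Right-hand side of equation (13), written as a function of x1^2."""
    info_value = sigma_xi2 * e_inv_beta / 4.0 * (1.0 - K ** 2 * sigma_s2 / x1_sq)
    return -b * x1_sq + info_value


def optimal_x1_sq(sigma_s2, b, e_inv_beta, sigma_xi2, K):
    """Closed form for (x1*)^2 from equation (14)."""
    threshold = e_inv_beta * sigma_xi2 / (16.0 * K ** 2 * b)
    if sigma_s2 >= threshold:
        return 0.0  # exploratory trading is not worthwhile
    return 0.5 * math.sqrt(sigma_xi2 * sigma_s2) * K * math.sqrt(e_inv_beta / b)


def grid_argmax_x1_sq(sigma_s2, b, e_inv_beta, sigma_xi2, K, upper=2.0, n=200_000):
    """Brute-force maximizer of (13) over x1^2 in [K^2 * sigma_s^2, upper]."""
    lower = K ** 2 * sigma_s2  # the model is only assumed valid above this point
    step = (upper - lower) / n
    candidates = (lower + i * step for i in range(n + 1))
    return max(
        candidates,
        key=lambda u: expected_total_profit(u, sigma_s2, b, e_inv_beta, sigma_xi2, K),
    )


# Illustrative parameter values, not taken from the paper.
params = dict(sigma_s2=0.25, b=1.0, e_inv_beta=1.1, sigma_xi2=4.0, K=0.5)
closed_form = optimal_x1_sq(**params)
numeric = grid_argmax_x1_sq(**params)
```

For these values the grid maximizer and the closed form agree to within the grid spacing, and the maximized expected profit is positive, consistent with the participation condition discussed in the text.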
To rephrase this more bluntly, equation (14) validates the concept of exploratory trad-
ing by explicitly characterizing the phenomenon in a simple model. However, the concept
of exploratory trading is merely a tool to better unravel the mysteries of high-frequency
trading. Although equation (14) marks the end of my analysis of exploratory trading in the
abstract, it also marks the beginning of my analysis of the connection between exploratory
and high-frequency trading.
“importance” (in some sense) of these confounding factors depends on the clock-time dura-
tion of the interval in which trade occurs. This point remains valid, but we can now state
it more precisely. As equation (10) reveals, the HFT’s expected period-2 profit depends
on $\sigma_s^2$, the variance of period-1 supply noise. The relevant measure of the “importance” of
period-1 supply noise is simply the variance $\sigma_s^2$.
The link between the parameter $\sigma_s^2$ and the implicit duration of period 1 makes it possible
to investigate chronological-time considerations elsewhere in the model. In particular,
we can use the results from section 3.2 to examine the relationship between exploratory
trading and chronological time in detail, which will in turn reveal the connections between
exploratory trading and high-frequency trading.
When the variance of supply noise exceeds some threshold ($\sigma_s^2 \ge \frac{E[\beta^{-1}]}{16 K^2 b} \sigma_\xi^2$), the optimal
level of exploratory trading drops to zero ($(x_1^*)^2 = 0$). Exploratory trading only arises in
the model when the variance of supply noise is sufficiently small, or equivalently when the
duration of the first trading period is sufficiently short.
Although a trade reveals some amount of valuable information, trading is also costly
(due to price impact). Both the informational gain and the price-impact cost of a trade de-
pend on the trade’s magnitude, but only the informational gain depends on the variance of
supply noise. As the variance of supply noise increases, the informational gain from a given
magnitude of trade decreases. Up to a point, a trader can partially offset this reduction
in the informational gain by increasing the magnitude of his trade, but this also increases
his trading costs. Eventually, when $\sigma_s^2$ becomes sufficiently large, the informational gains
become too small to justify the cost of any non-zero level of exploratory trade.
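The trade-off described above can be traced out by sweeping the duration of period 1. The sketch below assumes, purely for illustration, that supply-noise variance accumulates linearly in clock time, so that $\sigma_s^2 = \text{noise\_rate} \times \text{duration}$. That functional form, the name `noise_rate`, and all numerical values are hypothetical and are not given in this excerpt.

```python
import math


def optimal_x1_sq(sigma_s2, b, e_inv_beta, sigma_xi2, K):
    """(x1*)^2 from equation (14): positive below the noise threshold, zero above."""
    threshold = e_inv_beta * sigma_xi2 / (16.0 * K ** 2 * b)
    if sigma_s2 >= threshold:
        return 0.0
    return 0.5 * math.sqrt(sigma_xi2 * sigma_s2) * K * math.sqrt(e_inv_beta / b)


def max_duration(noise_rate, b, e_inv_beta, sigma_xi2, K):
    """Longest period-1 duration with positive exploratory trading (assumed linear noise)."""
    threshold = e_inv_beta * sigma_xi2 / (16.0 * K ** 2 * b)
    return threshold / noise_rate


# Hypothetical parameters; noise_rate is supply-noise variance per second.
params = dict(b=1.0, e_inv_beta=1.1, sigma_xi2=4.0, K=0.5)
noise_rate = 0.5
durations = [0.1, 0.5, 1.0, 2.0, 3.0]  # seconds
sizes = [optimal_x1_sq(noise_rate * tau, **params) for tau in durations]
cutoff = max_duration(noise_rate, **params)
```

Under these assumptions the optimal exploratory trade grows with the duration up to the cutoff (about 2.2 seconds here), as the trader offsets noisier observations with larger trades, and then drops to zero once the price-impact cost outweighs the informational gain.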
The relationship between exploratory trading and high-frequency trading can also be
understood in terms of equation (13). As noted earlier, the “$-\frac{K^2 \sigma_s^2}{x_1^2}$” term reflects the
informational gain from trading in period 1, expressed in terms of the HFT’s expected
total profit. There are two ways to make $\frac{\sigma_s^2}{x_1^2}$ small: make $x_1^2$ large, or make $\sigma_s^2$ small. Up to
this point, we have treated $\sigma_s^2$ as fixed, and considered the optimal choice of $x_1^2$. However,
to the extent that $\sigma_s^2$ depends on trading speed, there is another dimension along which
optimization is possible.
This raises two important points. First, the costs associated with reducing $\sigma_s^2$ by
increasing trading speed are largely fixed; these include direct data feeds, colocation services,
some proprietary software development, and so on. Because these fixed costs of increasing speed
are considerable, and the cost reduction per trade would likely be small, increased trading
speed would be most valuable (from the standpoint of exploratory trading) for a trader
who intended to engage in a large number of trades. Hence the clear connection between
exploratory trade and low-latency trading also suggests a similar connection between
exploratory trade and high-frequency trading per se.
The second important point that arises from the two possible approaches to making
$\frac{\sigma_s^2}{x_1^2}$ small is that the potential value associated with superior trading speed
does not necessarily imply that we should observe a Bertrand-competition-style latency
arms race (or at least not one driven by exploratory trading). An HFT could potentially
overcome some minute latency disadvantage by slightly increasing the magnitude of his
exploratory trades.
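As a rough illustration of this offset, suppose a latency disadvantage shows up only as a somewhat larger supply-noise variance (an assumption made for this sketch). Holding $\sigma_s^2 / x_1^2$ fixed then requires scaling $|x_1|$ by the square root of the variance ratio, at the expense of a larger expected price-impact cost $b x_1^2$. The function names and numbers below are hypothetical.

```python
import math


def offsetting_trade_size(x1_fast, sigma_s2_fast, sigma_s2_slow):
    """|x1| a slower trader needs so that sigma_s^2 / x1^2 matches the faster trader's."""
    return abs(x1_fast) * math.sqrt(sigma_s2_slow / sigma_s2_fast)


def extra_expected_impact_cost(x1_fast, sigma_s2_fast, sigma_s2_slow, b):
    """Additional expected cost b * x1^2 incurred by trading the larger size."""
    x1_slow = offsetting_trade_size(x1_fast, sigma_s2_fast, sigma_s2_slow)
    return b * (x1_slow ** 2 - x1_fast ** 2)


# Hypothetical case: the slower trader faces 21% more supply-noise variance.
x1_slow = offsetting_trade_size(10.0, 1.00, 1.21)
extra_cost = extra_expected_impact_cost(10.0, 1.00, 1.21, b=1.0)
```

In this example a 21% increase in noise variance is matched by a 10% larger trade. Whether that is cheaper than investing in speed depends on the fixed costs of lower latency, which this sketch does not model.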
5 Discussion
5.1 Relation to Empirical Findings
The preceding analysis of exploratory trading sheds light on a variety of empirical results
concerning HFT activity.
The “exploratory trading + Romer” extension also helps to explain the finding of Kir-
ilenko et al. that HFTs appear to profitably trade in the same direction as contemporaneous
price changes (i.e., anticipate price changes). When agents deduce the relative precision
of their signal, they can use this information to infer whether the current market-price
properly reflects aggregate information that is appropriately weighted by its precision, and
thereby potentially uncover temporary mispricings.
By comparison, Brogaard estimates the annual profit of the 26 HFTs from trading the 120
stocks is around $74 million15 .
Once again, the baseline exploratory trading model provides some insight. First, the
result that all high-frequency trades generate some form of potentially valuable information
suggests that there are natural economies of scale for HFTs. Roughly speaking, an HFT
who trades more gets better information.
We can take this analysis still further by noting that if there were more than one HFT
in the exploratory trading model, then one HFT’s trades would look like supply noise to
the other HFTs (and vice versa). If there is one incumbent HFT, then his trades would
look like noise to a potential entrant, and at least reduce the incumbent’s prospective
profits. As the number of HFTs in a given market increases, the apparent supply noise
from the perspective of a potential entrant also increases. In other words, high-frequency
trading inherently imposes barriers to entry. An interesting possibility that arises from
this conclusion is that HFTs might engage in excessive trading for purely anti-competitive
purposes16 .
an agent who has some private knowledge about future prices. In terms of the model, I
examine how the HFT can best make use of his knowledge of µ − p0 during periods 1 and
2, but I take the HFT’s knowledge of µ − p0 to be exogenously given. I assume that the
HFT exogenously knows µ − p0 purely to simplify the exposition, and to focus on the novel
aspect of my model. In other words, this assumption is not crucial.
In earlier drafts of this paper, the HFT inferred both the intercept and the slope, but
inference about the intercept turns out to be a standard type of problem, and it complicates
the more interesting and unusual inference about the slope. Another, potentially more
attractive way to remove the “exogenous private information” assumption is to tack a
slight variation of the model in section 1 of Romer (1993) onto the basic exploratory
trading model. I am currently working on the details of this extension.
Alternatively, the baseline exploratory trading model can be viewed as an extension of
the Kyle (1985) model to a setting in which the informed trader is uncertain about market
depth, and the market-maker doesn’t act quickly enough to alter his initial strategy. This
is not necessarily an attractive option if our ultimate goal is to use exploratory trading to
illuminate the inner workings of high-frequency trading, but it is reasonable if we simply
want to think about exploratory trading for its own sake.
change between dates 0 and 1 could be attributed to the price impact of his trade, and how
much could be attributed to some persistent component of the preceding price change.
6 Conclusion
In this paper, I address a central puzzle about high-frequency trading, namely how trad-
ing speed could translate to superior information. To this end, I analyze the idea of
“exploratory trading”—how an agent might use his own trades to gather valuable informa-
tion. A central result of my analysis is that exploratory trading bears a natural connection
to high-frequency trading, and conversely, that high-frequency trading inherently raises
issues analogous to those of exploratory trading. This connection between exploratory
and high-frequency trading ultimately illuminates the issue of how trading speed could
translate to superior information. Beyond these general results, the model of exploratory
trading also helps to explain a variety of specific empirical findings and generates a number
of testable implications.
References
[1] Alex Boulatov, Terrence Hendershott, and Dmitry Livdan. Informed trading and
portfolio returns. June 2011.
[2] Bruno Biais, Thierry Foucault, and Sophie Moinas. Equilibrium algorithmic trading.
Working Paper, October 2010.
18
Although I abstract away from any sort of strategic or adaptive behavior by the agents who (in
aggregate) submit the market demand curve, we could relax this simplifying assumption somewhat without
dramatically altering the qualitative behavior of the model.
[3] Jonathan A. Brogaard. High frequency trading and its impact on market quality.
Northwestern University Kellogg SOM Working Paper, November 2010.
[4] Bruce Ian Carlin, Miguel Sousa Lobo, and S. Viswanathan. Episodic liquidity crises:
Cooperative and predatory trading. The Journal of Finance, LXII(5), October
2007.
[5] Álvaro Cartea and José Penalva. Where is the value in high frequency trading? Banco
De España, Documentos de Trabajo N.º 1111, June 2011.
[6] Jeff Castura, Robert Litzenberger, Richard Gorelick, and Yogesh Dwivedi. Market
efficiency and microstructure evolution in U.S. equity markets: A high-frequency
perspective. Working Paper from RGM Advisors, LLC, October 2010.
[7] Giovanni Cespa and Thierry Foucault. Insiders-outsiders, transparency and the value
of the ticker. Queen Mary University Dept. of Economics Working Paper No. 628,
April 2008.
[8] Tarun Chordia, Richard Roll, and Avanidhar Subrahmanyam. Recent trends in trading
activity. UCLA Working Paper, January 2010.
[9] Adam Clark-Joseph and Brock Mendel. A model and analysis of high-frequency trad-
ing. Harvard University Working Paper, February 2011.
[10] Securities Exchange Commission. Concept release on equity market structure, concept
release no. 34-61358. FileNo. 17 CFR Part 242 [Release No. 34-61358; File No. S7-02-
10] RIN 3235-AK47, January 2010.
[11] U.S. Securities Exchange Commission. Findings regarding the market events of May
6, 2010, September 2010.
[12] Jaksa Cvitanic and Andrei Kirilenko. High frequency traders and asset prices. Cal-
Tech/CFTC Working Paper, March 2011.
[13] David Easley, Marcos M. López de Prado, and Maureen O’Hara. The microstructure
of the ‘flash crash’ flow toxicity, liquidity crashes and the probability of informed
trading. Journal of Portfolio Management, Forthcoming.
[14] Peter Gomber and Markus Gsell. Algorithmic trading engines versus human traders
– do they behave different [sic] in securities markets? Published by the Center for
Financial Studies at Goethe-University Frankfurt, April 2009.
[15] Sanford J. Grossman and Merton H. Miller. Liquidity and market structure. The
Journal of Finance, 43(3), July 1988.
[16] L.E. Harris. Optimal dynamic order submission strategies in some stylized trading
problems. Financial Markets, Institutions and Instruments, 7:1–76, 1998.
[17] Joel Hasbrouck and Gideon Saar. Technology and liquidity provision: The blurring
of traditional definitions. Journal of Financial Markets, 12:143–172, 2009.
[18] Joel Hasbrouck and Gideon Saar. Low-latency trading. NYU Stern/ Cornell GSM
Working Paper, May 2011.
[19] Terrence Hendershott, Charles M. Jones, and Albert J. Menkveld. Does algorithmic
trading improve liquidity? The Journal of Finance, 66(1), February 2011.
[20] Terrence Hendershott and Ryan Riordan. Algorithmic trading and information. NET
Institute Working Paper No. 09-08, September 2009.
[21] Robert Jarrow and Philip Protter. A dysfunctional role of high frequency trading in
electronic markets. Johnson School Research Paper Series No. 08-2011, March 2011.
[22] Boyan Jovanovic and Albert J. Menkveld. Middlemen in limit-order markets. Working
Paper, June 2010.
[23] Michael Kearns, Alex Kulesza, and Yuriy Nevmyvaka. Empirical limitations on high
frequency trading profitability. University of Pennsylvania CIS Working Paper, 2010.
[24] Andrei Kirilenko, Mehrdad Samadi, Albert S. Kyle, and Tugkan Tuzun. The flash
crash: The impact of high frequency trading on an electronic market. October 2010.
[25] Thomas McInish and James Upson. Strategic liquidity supply in a market with fast
and slow traders. September 2011.
[26] Ciamac C. Moallemi, Beomsoo Park, and Benjamin Van Roy. Strategic execution in
the presence of an uninformed arbitrageur. Columbia GSB/ Stanford EE Working
Paper, December 2010.
[27] Ciamac C. Moallemi and Mehmet Saglam. The cost of latency. Columbia GSB
Working Paper, June 2010.
[28] Peter Gomber, Björn Arndt, Marco Lutat, and Tim Uhle. High-frequency trading. Published
by Deutsche Börse AG Market Policy and European Public Affairs, March 2011.
[29] David Romer. Rational asset-price movements without news. The American Economic
Review, 83(5):1112–1130, December 1993.
[30] Australian Securities and Investments Commission. Australian equity market struc-
ture. REPORT 215, November 2010.
[31] X. Frank Zhang. High-frequency trading, stock volatility, and price discovery. Yale
SOM Working Paper, December 2010.
A Mathematical Appendix
A.1 Calculations Related to x2 and π2
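Solving for $x_2^*$:
\begin{align*}
\frac{\partial}{\partial x_2}\left(-\beta\left(\alpha + x_2\right)x_2 + \mu x_2\right)
  &= -\beta\alpha - 2\beta x_2 + \mu \\
0 &\equiv -\beta\alpha - 2\beta x_2^* + \mu \\
x_2^* &= \frac{\mu - \beta\alpha}{2\beta} \\
      &= \frac{\mu - p_0}{2\beta}
\end{align*}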
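Solving for $\pi_2^*$:
\begin{align*}
\pi_2^* &= -\beta\left(\alpha + \frac{\mu - \beta\alpha}{2\beta}\right)\frac{\mu - \beta\alpha}{2\beta}
           + \mu\,\frac{\mu - \beta\alpha}{2\beta} \\
        &= -\beta\left(\frac{\beta\alpha + \mu}{2\beta}\right)\frac{\mu - \beta\alpha}{2\beta}
           + \mu\,\frac{\mu - \beta\alpha}{2\beta} \\
        &= \frac{(\beta\alpha)^2 - \mu^2}{4\beta} + \frac{2\mu^2 - 2\mu\beta\alpha}{4\beta} \\
        &= \frac{(\beta\alpha)^2 + \mu^2 - 2\mu\beta\alpha}{4\beta} \\
        &= \beta\left(\frac{\mu - \beta\alpha}{2\beta}\right)^2 \\
        &= \beta\left(x_2^*\right)^2
\end{align*}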
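The algebra above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the period-2 profit $-\beta(\alpha + x_2)x_2 + \mu x_2$ on a grid and compares the result with the closed forms for $x_2^*$ and $\pi_2^*$. The parameter values and the function names are illustrative assumptions.

```python
def period2_profit(x2, alpha, beta, mu):
    """Period-2 profit -beta*(alpha + x2)*x2 + mu*x2, as in the derivation above."""
    return -beta * (alpha + x2) * x2 + mu * x2


def optimal_x2(alpha, beta, mu):
    """Closed-form maximizer x2* = (mu - beta*alpha) / (2*beta)."""
    return (mu - beta * alpha) / (2.0 * beta)


# Illustrative parameter values (assumed, not taken from the paper).
alpha, beta, mu = 1.5, 0.8, 3.0

x_star = optimal_x2(alpha, beta, mu)
pi_star = period2_profit(x_star, alpha, beta, mu)
closed_form = beta * x_star ** 2  # pi2* = beta * (x2*)^2

# Brute-force check: the best point on a grid centred at x2* should be x2* itself.
grid = [x_star + 0.01 * k for k in range(-200, 201)]
best_on_grid = max(grid, key=lambda x: period2_profit(x, alpha, beta, mu))

print("x2* =", x_star)
print("profit at x2* =", pi_star, " beta*(x2*)^2 =", closed_form)
print("grid maximizer =", best_on_grid)
```

Because the profit is a strictly concave quadratic in $x_2$ whenever $\beta > 0$, the grid maximizer coincides with the closed-form $x_2^*$.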
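Expected Feasible Period-2 Profit Taylor Approximation:
\begin{align*}
\left(\frac{1}{z} - \frac{1}{\beta}\right)^2
  &= \left(\frac{1}{\beta} - \frac{1}{\beta}\right)^2
     + 2\left(\frac{1}{z} - \frac{1}{\beta}\right)\left(\frac{-1}{z^2}\right)\bigg|_{z=\beta}(z - \beta) \\
  &\quad + \frac{1}{2}\cdot 2\left(\frac{-2}{\beta z^3} + \frac{3}{z^4}\right)\bigg|_{z=\beta}(z - \beta)^2
     + \frac{2}{6}\left(\frac{6}{\beta\zeta^4} - \frac{12}{\zeta^5}\right)(z - \beta)^3 \\
  &= \frac{1}{\beta^4}(z - \beta)^2 + 2\zeta^{-5}\,\frac{\zeta - 2\beta}{\beta}\,(z - \beta)^3 \\
  &\leq \frac{1}{\beta^4}(z - \beta)^2 + \frac{32}{3125\beta^5}(z - \beta)^3 \\
  &= \frac{z^2 - 2z\beta + \beta^2}{\beta^4}
     + \frac{z^3 - 3z^2\beta + 3z\beta^2 - \beta^3}{\frac{3125}{32}\beta^5}
\end{align*}
\begin{align*}
\left(\frac{1}{z} - \frac{1}{\beta}\right)^2
  &= \frac{1}{\beta^4}(z - \beta)^2 + 2\beta^{-5}\,\frac{\beta - 2\beta}{\beta}\,(z - \beta)^3
     + \frac{2}{24}\left(\frac{10}{\eta^6} - \frac{4}{\beta\eta^5}\right)(z - \beta)^4 \\
  &= \frac{1}{\beta^4}(z - \beta)^2 - \frac{2}{\beta^5}(z - \beta)^3
     + \frac{\eta^{-6}}{6}\,\frac{5\beta - 2\eta}{\beta}\,(z - \beta)^4 \\
  &\geq \frac{1}{\beta^4}(z - \beta)^2 - \frac{2}{\beta^5}(z - \beta)^3 - \frac{1}{4374\beta^6}(z - \beta)^4
\end{align*}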
The last line uses the fact that the conditional expectation of $\beta$ will be at least as good
an estimator (in the $L^2$-sense) as the expectation of $\beta$ conditioned upon the value of $\frac{\Delta p}{x_1}$,
which in turn will be at least as good an estimator of $\beta$ (in the $L^2$-sense) as $\frac{\Delta p}{x_1}$ itself.
Next, since $x_1$ is known (and we shall assume $x_1 \neq 0$) we can normalize equation (8)
by dividing through by $x_1$:
\[
\frac{\Delta p}{x_1} = \beta + \beta\epsilon_x
\]
where we define $\epsilon_x \equiv \frac{s_1}{x_1}$.
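Suppose that for a given fixed value of $x_1$, the conditional expectation
$E_1[\beta]$ is a smooth function of $\frac{\Delta p}{x_1}$, say $E_1[\beta] = g\left(\frac{\Delta p}{x_1}\right)$, with continuous derivative
$g'(\cdot)$ that is not identically zero. By standard delta-method-type Taylor approximation
arguments, it is easy to show that
\begin{align*}
(x_1)^2\, E\left[\left(E_1[\beta] - \beta\right)^2\right]
  &\to E\left[g'(\beta)^2\right] \mathrm{Var}(\Delta p) \\
  &= E\left[g'(\beta)^2\right]\left(\sigma_\beta^2 + b^2\right)\sigma_s^2
\end{align*}
as $|x_1| \to \infty$. Since we assume that the derivative $g'(\cdot)$ is not identically zero, the term
on the right-hand side of the equation above, $E\left[g'(\beta)^2\right]\left(\sigma_\beta^2 + b^2\right)\sigma_s^2$, is strictly positive.
Therefore we can choose $J > 0$ such that for all $x_1$ satisfying $(x_1)^2 > J$, we have
\[
(x_1)^2\, E\left[\left(E_1[\beta] - \beta\right)^2\right] > \frac{1}{2}\, E\left[g'(\beta)^2\right]\left(\sigma_\beta^2 + b^2\right)\sigma_s^2
\]
Hence as $|x_1| \to \infty$, we can bound $E\left[\left(E_1[\beta] - \beta\right)^2\right]$ below by a function of the form $A\frac{\sigma_s^2}{x_1^2}$.
\begin{align*}
f(\theta \mid \Delta p = c;\, x_1)
  &= \frac{p_s\left(\frac{x_1}{\theta}\left(\frac{\Delta p}{x_1} - \theta\right)\right) f(\theta)}
          {\int p_s\left(\frac{x_1}{t}\left(\frac{\Delta p}{x_1} - t\right)\right) f(t)\, dt} \\
  &= \frac{p_{\epsilon_x}\left(\frac{1}{\theta}\frac{\Delta p}{x_1} - 1\right) f(\theta)}
          {\int p_{\epsilon_x}\left(\frac{1}{t}\frac{\Delta p}{x_1} - 1\right) f(t)\, dt}
\end{align*}
\[
E\left[\beta \mid \Delta p = c;\, x_1\right]
  = \frac{\int \theta\, p_{\epsilon_x}\left(\frac{1}{\theta}\frac{\Delta p}{x_1} - 1\right) f(\theta)\, d\theta}
         {\int p_{\epsilon_x}\left(\frac{1}{t}\frac{\Delta p}{x_1} - 1\right) f(t)\, dt}
\]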
Of course, the probability of encountering this break-point decreases as $|x_1|$ increases, and
Chebychev's inequality implies that this probability is bounded above by a function of the
form $\frac{H}{x_1^2}$ for some strictly positive constant $H$. Since we are only interested in deriving a
lower bound for $E\left[\left(E_1[\beta] - \beta\right)^2\right]$, the basic result above should probably hold under more
general conditions, but the proof would be more delicate and more involved.
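To make the posterior-mean expression $E[\beta \mid \Delta p = c; x_1]$ concrete, the sketch below evaluates the ratio of integrals numerically, exactly in the form written above. It rests on assumptions that the appendix does not make: the prior $f$ is taken to be uniform on a bounded interval, the noise $s_1$ is taken to be Gaussian (so that $\epsilon_x = s_1/x_1$ has standard deviation $\sigma_s/|x_1|$), and the integrals are approximated with a midpoint rule. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(u, sd):
    """Density of a zero-mean normal with standard deviation sd (assumed noise law)."""
    return math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_mean_beta(dp, x1, sigma_s, beta_lo, beta_hi, n=20000):
    """Midpoint-rule evaluation of E[beta | dp; x1] as the ratio of integrals above.

    Assumptions (not from the paper): uniform prior f on [beta_lo, beta_hi]
    with beta_lo > 0, and Gaussian eps_x = s1/x1 with std sigma_s/|x1|.
    """
    sd_eps = sigma_s / abs(x1)
    ratio = dp / x1
    h = (beta_hi - beta_lo) / n
    num = 0.0
    den = 0.0
    for i in range(n):
        t = beta_lo + (i + 0.5) * h
        # Weight p_eps((1/t) * dp/x1 - 1); the constant uniform prior density cancels.
        w = normal_pdf(ratio / t - 1.0, sd_eps)
        num += t * w
        den += w
    return num / den


# Illustrative experiment: a fixed true beta and noise draw, two exploratory trade sizes.
true_beta, s1, sigma_s = 1.0, 0.3, 0.5
estimates = {}
for x1 in (1.0, 25.0):
    dp = true_beta * (x1 + s1)  # price change implied by dp/x1 = beta + beta*eps_x
    estimates[x1] = posterior_mean_beta(dp, x1, sigma_s, beta_lo=0.5, beta_hi=2.0)
    print("x1 =", x1, " posterior mean of beta =", estimates[x1])
```

Under these assumptions the estimate from the larger trade lies much closer to the true $\beta$, which illustrates the appendix's point that the estimation error shrinks as $|x_1|$ grows.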
A.3 Maximizing Total Expected Profits
Expected Period-2 Profits as a Function of $x_1$: Using equation (9), we can replace
$\mathrm{Var}\left(E_1[\beta] \mid \beta\right)$ with $K^2\frac{\beta^2\sigma_s^2}{x_1^2}$ in equation (6). Then, taking the approximation to hold
exactly for simplicity, we get
\[
E\left[\hat{\pi}_2 \mid \beta, p_0\right] = \pi_2^* - K^2\,\frac{(\mu - p_0)^2}{4\beta}\,\frac{\sigma_s^2}{x_1^2}
\]
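By taking expectations of both sides of the above equation with respect to $\beta$, we obtain
an expression for the unconditional expectation of $\hat{\pi}_2$ as a function of $x_1$:
\begin{align*}
E\left[\hat{\pi}_2 \mid p_0\right]
  &= E\left[\pi_2^*\right] - \frac{K^2 E\left[\beta^{-1}\right]\sigma_s^2\xi^2}{4x_1^2} \\
  &= \frac{\xi^2 E\left[\beta^{-1}\right]}{4} - \frac{K^2 E\left[\beta^{-1}\right]\sigma_s^2\xi^2}{4x_1^2} \\
  &= \frac{\xi^2 E\left[\beta^{-1}\right]}{4}\left(1 - \frac{K^2\sigma_s^2}{x_1^2}\right)
\end{align*}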
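\begin{align*}
\pi_1 &= x_1\left(\mu - p_0 - \beta x_1\right) \\
      &= (\mu - p_0)\,x_1 - \beta x_1^2 \\
      &= \xi x_1 - \beta x_1^2
\end{align*}
\[
\max_{x_1}\left\{-b x_1^2 + \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4}\left(1 - \frac{K^2\sigma_s^2}{x_1^2}\right)\right\}
\]
\begin{align*}
FOC:\quad & -2b x_1^* + 2\,\frac{\sigma_\xi^2 E\left[\beta^{-1}\right] K^2\sigma_s^2}{4\left(x_1^*\right)^3} \equiv 0 \\
\Rightarrow\quad & \frac{\sigma_\xi^2 E\left[\beta^{-1}\right] K^2\sigma_s^2}{4} = b\left(x_1^*\right)^4 \\
\Rightarrow\quad & \sigma_\xi\sigma_s K\sqrt{\frac{E\left[\beta^{-1}\right]}{4b}} = \left(x_1^*\right)^2
\end{align*}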
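The maximized value of the objective function is therefore given by:
\begin{align*}
\max_{x_1} E\left[\pi_1 + \hat{\pi}_2\right]
  &= -b\left(x_1^*\right)^2 + \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4}\left(1 - \frac{K^2\sigma_s^2}{\left(x_1^*\right)^2}\right) \\
  &= -b\left(x_1^*\right)^2 + \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4}
     - \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4}\,\frac{K^2\sigma_s^2}{\left(x_1^*\right)^2} \\
  &= \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4} - b\,\sigma_\xi\sigma_s K\sqrt{\frac{E\left[\beta^{-1}\right]}{4b}}
     - \frac{\sigma_\xi^2 E\left[\beta^{-1}\right] K^2\sigma_s^2}{4}\,\frac{2\sqrt{b}}{\sigma_\xi\sigma_s K\sqrt{E\left[\beta^{-1}\right]}} \\
  &= \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4}
     - \frac{\sigma_\xi\sigma_s K\sqrt{E\left[\beta^{-1}\right] b}}{2}
     - \frac{\sigma_\xi\sigma_s K\sqrt{E\left[\beta^{-1}\right] b}}{2} \\
  &= \frac{\sigma_\xi^2 E\left[\beta^{-1}\right]}{4} - \sigma_\xi\sigma_s K\sqrt{E\left[\beta^{-1}\right] b} \\
  &= \frac{\sigma_\xi\sqrt{E\left[\beta^{-1}\right]}}{4}\left(\sigma_\xi\sqrt{E\left[\beta^{-1}\right]} - 4\sigma_s K\sqrt{b}\right)
\end{align*}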
Hence the maximized value of the expected total profit (in the absence of direct trading
gains) will be positive when $\sigma_\xi^2$ is sufficiently larger than $\sigma_s^2$, specifically, when
\[
\sigma_\xi^2 > \frac{16K^2 b}{E\left[\beta^{-1}\right]}\,\sigma_s^2
\]
If $\sigma_\xi^2$ does not satisfy the condition above, the HFT would obtain a higher expected
profit (zero) by not participating in the market than he would by participating in an
otherwise optimal manner.
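The closed forms in this subsection can be verified numerically. The sketch below is only an illustrative check under assumed parameter values: it evaluates the objective $-b x_1^2 + \frac{\sigma_\xi^2 E[\beta^{-1}]}{4}\left(1 - \frac{K^2\sigma_s^2}{x_1^2}\right)$ at the $x_1^*$ given by the first-order condition, compares it with the factored expression for the maximized value, and computes the participation threshold for $\sigma_\xi^2$. The function names and numbers are hypothetical; `e_inv_beta` stands in for $E[\beta^{-1}]$.

```python
import math


def expected_total_profit(x1, b, sigma_xi, sigma_s, K, e_inv_beta):
    """Objective -b*x1^2 + (sigma_xi^2 * E[1/beta] / 4) * (1 - K^2 sigma_s^2 / x1^2)."""
    return -b * x1 ** 2 + 0.25 * sigma_xi ** 2 * e_inv_beta * (1.0 - (K * sigma_s / x1) ** 2)


def optimal_x1_squared(b, sigma_xi, sigma_s, K, e_inv_beta):
    """(x1*)^2 = sigma_xi * sigma_s * K * sqrt(E[1/beta] / (4b)), from the FOC."""
    return sigma_xi * sigma_s * K * math.sqrt(e_inv_beta / (4.0 * b))


def maximized_profit(b, sigma_xi, sigma_s, K, e_inv_beta):
    """Factored closed form for the maximized expected total profit."""
    root = sigma_xi * math.sqrt(e_inv_beta)
    return 0.25 * root * (root - 4.0 * sigma_s * K * math.sqrt(b))


def participation_threshold(b, sigma_s, K, e_inv_beta):
    """Value of sigma_xi^2 above which the maximized profit is positive."""
    return 16.0 * K ** 2 * b * sigma_s ** 2 / e_inv_beta


# Illustrative parameter values (assumed, not taken from the paper).
params = dict(b=1.0, sigma_xi=4.0, sigma_s=0.5, K=1.0, e_inv_beta=1.2)

x1_star = math.sqrt(optimal_x1_squared(**params))
value_at_optimum = expected_total_profit(x1_star, **params)
closed_form_value = maximized_profit(**params)
threshold = participation_threshold(
    params["b"], params["sigma_s"], params["K"], params["e_inv_beta"]
)

print("x1* =", x1_star)
print("objective at x1* =", value_at_optimum, " closed form =", closed_form_value)
print("sigma_xi^2 =", params["sigma_xi"] ** 2, " threshold =", threshold)
```

With these values $\sigma_\xi^2$ exceeds the threshold, so the maximized expected profit is positive; setting $\sigma_\xi^2$ equal to the threshold drives the closed-form value to zero, matching the participation condition stated above.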