Staged Investments in Entrepreneurial Financing

Korok Ray
McDonough School of Business
Georgetown University
Washington, DC 20057
[email protected]
May 28, 2010
Abstract
Venture capitalists deliver investments to entrepreneurs in stages. This paper
shows staged financing to be efficient. Staging lets investors abandon ventures
with low early returns, and thus sorts good projects from bad. The primary
implication from staging is that it is efficient to invest more in later rounds. The
model yields a number of empirical implications on how the ratio of early to late
round financing varies with uncertainty, the outside options of both parties, the
value of the venture, the costs of investment, and project difficulty. The main
results generalize in a model that includes the entrepreneur’s unknown ability.
JEL Classification code: G, D.
Keywords: Entrepreneurship, venture capital, staged financing, optimal stopping,
performance evaluation, financial contracting.
1 Introduction
“You got to know when to hold ’em, know when to fold ’em,
Know when to walk away and know when to run.”
-Kenny Rogers, The Gambler
This paper gives an efficiency-based explanation of staged financing in venture capital.
Sahlman (1990), Gompers (1995), and Gompers and Lerner (1999) all document the
extensive practice of venture capitalists delivering investments to new firms in stages.
The current view in the venture capital literature is that staging mitigates moral hazard.
Here, I argue that venture capitalists use staging as a sorting instrument. Staging
investments provides the venture capitalist (VC) with the option of ending projects with
low early returns. This sorts ventures into two groups: stay or quit. It is efficient to quit
if the early returns are weak, and to stay otherwise. If the entrepreneur’s early output
is low, it is in the interests of both the VC and the entrepreneur to discontinue work
and collect their respective outside options.
The main result shows that it is efficient to assign more resources to the later stages
of the project. Thus, one consequence of staging is that it is efficient for investment
levels to increase in later rounds. Staging creates the possibility of termination after the
early stage, and this reduces the project’s expected return. This lowers the marginal
return from investment, and so the VC shades his investment downward in the early
stage. Once the entrepreneur advances, the possibility of termination vanishes and the
marginal return to investment rises, so the VC invests more. Those who make it to the
second stage are more valuable precisely because their first stage output was sufficiently
high. The VC invests more in the later stage because the new venture is “in the running”
to become highly successful. Said differently, it is inefficient for the VC to bet big on
a horse that won’t finish the race.
The distinguishing feature of this paper is that the analysis operates entirely within
a first-best setting. The model abstracts from conflicts of interest and agency problems
between the VC and entrepreneur. While the relationship between a VC and an en-
trepreneur is no doubt rife with moral hazard, it is unnecessary to resort to a complex
agency model to explain staged financing. A first-best analysis predicts that funding
levels increase over stages, an empirical regularity that Gompers (1995) documents. This
suggests that asymmetric information is not necessary to explain all aspects of staged
financing, as prior work in this literature claims. Instead, optimal decision making under
uncertainty gives a clean, robust, and intuitive explanation of staged financing.
The model consists of a risk-neutral venture capitalist funding an entrepreneur over
two stages. Each party invests resources (capital and labor) into each stage, and the
output from the project is the total investment plus a noise term in each stage. Both
parties are symmetrically uninformed on the project’s uncertainty. The project earns a
positive return if the total output across both stages clears an exogenous hurdle. For
example, consider a software company developing a new search engine. If the search
engine is of sufficiently high quality, it has positive value and the company has potential
to be taken public; otherwise, the product is worth nothing.
Both the VC and the entrepreneur have outside options in each stage. For example,
the VC can fund other ventures and the entrepreneur can work on other projects. Staging
investments gives the VC the option to discontinue the project, at which point both
parties collect their respective outside options. The primary implication from staging
investments is that it skews the efficient allocation of resources towards the later stages.
The VC deliberately withholds investments in the early stages precisely because of the
uncertainty from the early stage. In particular, the model shows that the VC will set a
milestone after the first stage, and if the project’s output clears this milestone, the VC
knows the project is sufficiently successful and therefore invests more.
In addition to this primary implication that investments increase in later rounds,
the model generates several testable predictions for future empirical work. First, as the
outside options of both parties increase, the VC will skew its investments even more into
later rounds. Intuitively, as the parties’ outside opportunities improve, the VC has a high
opportunity cost from investing, and therefore can adopt a “wait and see” approach and
can postpone investments into the future. Second, as uncertainty increases, it is efficient
for the VC to invest more towards the early stage. While this may seem counterintuitive,
the logic follows from the option value of continuing. Because the stages are sequential,
an increase in uncertainty increases the upside benefit from continuing, and this gives an
extra benefit to investing in the early stage rather than the late stage. And finally, as the
difficulty of project completion increases (because of market or technology factors), the
VC will invest more resources into later rounds. This occurs because the VC is reluctant
to invest too much money in early stage projects which are unlikely to “make it.” All
of these comparative statics give concrete predictions on the ratio of early to late round
financing. Future empirical work can therefore test these predictions in industries with
cross-sectional heterogeneity in technological uncertainty, outside options of VCs and
entrepreneurs, and market risk. None of the prior theoretical work in staged financing
generates testable predictions on how the investment levels per stage vary with the
economic and technological environment of the firm.
I extend the benchmark model by including an ability parameter for the entrepreneur
that persists across both stages. The model considers output measures that cannot
disentangle ability, investment, and uncertainty, where ability is unknown to all parties.
Persistent ability induces correlation over time. If early stage output is low, both the VC
and entrepreneur update their Bayesian priors on ability. They infer that low underlying
ability drives low first stage output. Since ability persists into the second stage, it also
drives low second stage output. They have outside options, so it is efficient to quit early
on. Moreover, low ability entrepreneurs push more of their investment into later stages
than high ability entrepreneurs. Thus, the main results still hold: it is still efficient to
quit if the early returns are weak, and it is efficient to allocate more resources in later
rounds.
The existing literature on staged financing sits exclusively within agency models of
asymmetric information. A landmark paper is Neher (1999), who claims that en-
trepreneurs threaten to hold up VCs by reneging on investments, so VCs stage payments
to reduce their bargaining power. Dividing investments into a number of stages creates
inefficiencies but is necessary in overcoming the commitment problem. Landier (2002)
argues that staging is one way of protecting an investor from risk when entrepreneurs
have a high exit option, i.e. when bankruptcy laws are lenient and when there is little
stigma associated with business failure. Bergemann and Hege (1998, 2005) study the dynamics
of the optimal contract and equilibrium funding decisions in arm’s length versus
relationship financing. In other work, Bergemann and Hege (2003) show that the du-
ration of funding, though not necessarily the level of funding, increases in later stages.
Cornelli and Yosha (2003) look at “window dressing,” the manipulation of information
on project performance, which entrepreneurs may practice in order to continue to receive
funds. Wang and Zhou (2004) find that there are cases in which up-front financing
may be superior to staging; under staged financing, VCs will underinvest in low quality
projects and potentially doom them to failure. In Yerramilli (2006), each party can
hold up the other and threaten to walk away in order to press for a renegotiation of
the contract. Finally, without the ability for investors to unilaterally cancel projects,
Admati and Pfleiderer (1994) argue that entrepreneurs with outside financing will be
reluctant to quit unproductive ventures. All of these models take place in moral hazard
and asymmetric information settings, and therefore staging is an instrument to mini-
mize agency costs. None of the prior theoretical work explores the efficiency properties
of staged financing.
The empirical literature is consistent with the primary implication of the model. For
example, Sahlman (1990) and Gompers (1995) analyze the Venture Economics database
and find that VCs disburse more money to firms in later stages of development. The
existing theoretical papers on staged financing give mixed predictions on whether in-
vestments will increase in later rounds. In Neher (1999), investments increase over time
because the VC is willing to invest more as the firm’s collateral grows. Yet Giat et al.
(2009) let the VC and entrepreneur hold asymmetric beliefs about the potential value of
a project, and find that staged investments can increase over time, decrease over time,
or rise and then fall. Hsu (2002) analyzes staging in an options valuation framework,
assuming that agents act to maximize the probability of advancing stages. Hsu (2002)
finds computationally that staging tends to be more profitable to investors when ven-
tures are in early stages and will need greater amounts of capital in the future.1 Yet,
none of these papers make predictions on how the ratio of early to late stage financing
changes with exogenous parameters of the environment, such as increase in uncertainty,
project difficulty, or outside opportunities.
The paper is organized as follows. Section 2 presents the benchmark model and
shows that staging investments increases total surplus. Section 3 explores the effects of
staged financing on the ratio of early to late stage funding levels. Section 4 contains the
comparative statics with respect to the model parameters, and delivers secondary implications
on how the funding level changes with the uncertainty in the model, technology
or market risk, or the outside options of the VC or entrepreneur. Section 5 expands the
model to include an ability parameter for the entrepreneur. Section 6 concludes.

1
Other empirical work in venture capital documents different features of the venture capital environment.
Krohmer and Lauterbach (2005) find empirically that in the final stages of a project, investment
managers may be too unwilling to pull the plug on failing projects. Cuny and Talmor (2005) and Bienz
and Hirsch (2009) look at the differences between the two types of staged financing that are commonly
observed, staging with milestones or with rounds.
2 The Model
Consider an entrepreneur working on a project (a new venture) over time. The en-
trepreneur seeks funding for the project from a venture capitalist (VC). Both parties
are risk neutral. Production takes place across two stages, and there is no discounting.
It takes time to establish a business, and the stages represent distinct phases in pro-
duction. For example, the early stage involves establishing the founder’s initial business
plan, while the later stage involves marketing the plan and generating advertising rev-
enue. Let kt be the total resources invested in the project at stage t. This reflects the
sum of both the entrepreneur’s and the VC’s resources (labor and capital) invested in
the project. Though I call kt investment, it includes human resources as well as financial
resources. Since the focus of the analysis is on efficient resource allocation, it is not
necessary to specify the entrepreneur’s and venture capitalist’s resources separately.
Total resources kt in stage t = 1, 2 carry a resource cost C(kt ). This
is the total social cost of resources in stage t. Costs are separable across stages, and
C ′ and C ′′ are strictly positive, so costs are increasing and convex. The convexity of the cost
function reflects a convex cost of investment for the venture capitalist and a convex cost
of effort for the entrepreneur. A convex cost of effort is a standard assumption, while
a convex cost of investment simply reflects that the VC cannot invest arbitrarily large
amounts without cost.2 The convexity of the supply curve represents all the costs of
raising capital to deliver funds to the entrepreneur. Output from the project is
\[ q_t = k_t + \epsilon_t. \]
The noise terms εt are i.i.d., and distributed symmetrically around a mean of zero and
over infinite support, with cdf G(·) and density function g(·). Interpret εt as a
stage-specific shock unknown to anyone. The εt captures all of the market and technological
uncertainty in raising profits: novelty of the founder’s idea, viability of the business plan,
2
VCs draw from dedicated pools of capital that institutional investors supply. In particular, the
VC raises capital in blocks (“funds”), usually targeted towards investments in a specific industry or
technology. If the VC exhausts the fund and wants to invest more he must raise a new fund, which
involves soliciting interest from limited partners (institutional investors), advertising the fund through
business networks, or transferring capital from other preexisting funds. See Prowse (1998) for a full
description on the capital raising process.
existence of a potential market, quality of human and physical capital, etc. Even though
the noise terms are independent, later in the paper I will add an ability parameter which
persists across stages. This is essentially equivalent to making the noise terms correlated
over time.
A project is a pair (V, q̄), where V > 0 is the value of the project and q̄ > 0 is the
final hurdle. After stage two, the VC takes the firm public if it is of sufficiently high
quality. Therefore, the final hurdle represents the minimum quality necessary for a new
venture to capture a positive market price when its shares are traded on public stock
markets. The value of the venture is

\[ V(q_1, q_2) = \begin{cases} V & \text{if } q_1 + q_2 > \bar{q}, \\ 0 & \text{otherwise.} \end{cases} \]
Output (e.g., profits, quality, sales) has no value unless it is sufficiently high. In most
new ventures, the venture is worth little unless it can eventually be taken public, or
at least generate profits. Assets of firms that have either failed to go public or have
not successfully obtained later round financing are usually sold at low (firesale) prices;
I simply normalize these low prices to zero. Thus, qt is the project’s internal output
(prototypes, beta versions, etc) while V (q1 , q2 ) measures the project’s external value
based on market valuation. Throughout, call qt the project’s output, and call V (q1 , q2 )
the project’s value. Since information is symmetric in this model, both parties know the
true value V but do not know whether output from the project is sufficiently high to
clear the hurdle q̄. Observe that output levels across stages are perfect substitutes. This
isolates the effects of staging on investment from the effects of technology on investment.
Suppose that both the VC and the entrepreneur have outside options in each stage.
These outside options capture the value of the outside opportunities of both parties.
For example, the VC has many competing investments to fund and can allocate his
capital and his time elsewhere. Similarly, the entrepreneur can either work on other
new ventures, or even collect a wage as an employee for another organization. Let ūt be
the sum of the outside options of the VC and entrepreneur in stage t. So ūt measures
the opportunity cost of the project (time, labor, capital) to both parties.3 The venture
3
The outside options are independent of early stage output. The results of the model generalize
easily if outside options increase linearly in output.
capitalist may conduct an evaluation of the venture after stage one. In fact, the purpose
of staged financing is to give the venture capitalist an intermediate reading on the new
venture, with the option of ending the venture if the early returns are weak.
2.1 Upfront Financing

As a benchmark, suppose that the VC does not conduct an evaluation after the first
stage. Importantly, there are no grounds for terminating the project after the first stage.
So the VC gives all the funds for the project upfront; call this “upfront financing.” To
calculate the social payoff, observe that both parties receive positive surplus only if the
project is a success, i.e. that q1 + q2 > q̄. The probability of success is
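\[ P = \Pr(q_1 + q_2 > \bar{q}) = \Pr(\epsilon_1 + \epsilon_2 > \bar{q} - k_1 - k_2) = \int_{-\infty}^{\infty} \int_{\bar{q} - k_1 - k_2 - \epsilon_1}^{\infty} g(\epsilon_1)\, g(\epsilon_2)\, d\epsilon_2\, d\epsilon_1 \]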
by the independence of the errors. After integrating and using the symmetry of the
errors around zero,
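\[ P = \int_{-\infty}^{\infty} g(\epsilon_1)\left[1 - G(\bar{q} - k_1 - k_2 - \epsilon_1)\right] d\epsilon_1 = \int_{-\infty}^{\infty} g(\epsilon_1)\, G(\epsilon_1 + k_1 + k_2 - \bar{q})\, d\epsilon_1. \]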
Therefore, the marginal effect of increasing investment on improving the probability of
success is
\[ \frac{\partial P}{\partial k_t} = \int_{-\infty}^{\infty} g(\epsilon_1)\, g(\epsilon_1 + k_1 + k_2 - \bar{q})\, d\epsilon_1. \]
This expression is positive, so increasing investment makes it more likely that the project
will clear the final hurdle. Moreover, observe that the right-hand side of the equality
above is independent of t, and therefore so is the left-hand side. The VC can fund either
in stage one or stage two, as it has the same effect on the project clearing the final
hurdle. Thus, the probability of success increases by the same amount with investment
in either stage. Since total investment is additive, stage one and stage two investment
are perfect substitutes.
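As a rough numerical check of these two properties, the sketch below assumes standard normal noise and hypothetical parameter values; the model itself only requires a symmetric density g. Under that assumption the sum of the two shocks is normal with variance 2, so P has a closed form that can be compared with the integral expression above.

```python
from math import sqrt
from statistics import NormalDist

G = NormalDist()  # assumed noise distribution: standard normal

def success_prob(k1, k2, q_bar, steps=2000, lim=8.0):
    """P = integral of g(e1) * G(e1 + k1 + k2 - q_bar) de1, by the trapezoid rule."""
    h = 2 * lim / steps
    total = 0.0
    for i in range(steps + 1):
        e1 = -lim + i * h
        w = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += w * G.pdf(e1) * G.cdf(e1 + k1 + k2 - q_bar)
    return total * h

k1, k2, q_bar = 0.8, 1.1, 2.0  # hypothetical values, not taken from the paper

p_numeric = success_prob(k1, k2, q_bar)
p_closed = G.cdf((k1 + k2 - q_bar) / sqrt(2))  # e1 + e2 ~ N(0, 2)

# Central finite differences for the marginal effect of each stage's investment.
d = 1e-4
dP_dk1 = (success_prob(k1 + d, k2, q_bar) - success_prob(k1 - d, k2, q_bar)) / (2 * d)
dP_dk2 = (success_prob(k1, k2 + d, q_bar) - success_prob(k1, k2 - d, q_bar)) / (2 * d)

print(p_numeric, p_closed, dP_dk1, dP_dk2)
```

Because P depends on investment only through the sum k1 + k2, the two finite-difference derivatives should agree up to rounding, which is the perfect-substitutes property in numerical form.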
Since the objective of the analysis is to understand the efficient allocation of resources,
it is necessary to consider the social planner’s problem, i.e., the joint payoff of the
entrepreneur and the VC combined. This is the expected benefit from investments less
the cost of investment in each stage.4 The social planner maximizes total surplus, so
the problem is
\[ \max_{k_t} \; P V - C(k_1) - C(k_2), \]
which yields the first-order condition

\[ \left. \frac{\partial P}{\partial k_t} \right|_{k_t = \hat{k}_t} V = C'(\hat{k}_t), \]
where k̂t denotes the optimal investment level.

The marginal cost of investment is equal to its marginal return, which is the marginal
probability of success times the value of the project. Since the left-hand side is indepen-
dent of t, the right hand side must be as well. Hence k̂1 = k̂2 ≡ k̂; this is the efficient
investment under upfront financing, and is the same in each stage. It is efficient to
split investment evenly across stages since the cost of investment per stage is the same.
Because the model is symmetric with respect to the VC and entrepreneur, it is possible
to implement this first best solution, so the VC will split its investment evenly across
stages and the entrepreneur will exert effort and deploy resources evenly across stages.
For example, this is the outcome under a contracting game where the venture capitalist
is the principal who proposes a contract to the agent, the entrepreneur. In this setting,
since both parties are risk neutral, it is straightforward to construct a contract that
implements the first-best.5
Note that convexity of the cost function is not what guarantees that investment in
both stages is the same. Investment is the same because (1) convexity of the cost function
guarantees a unique solution, (2) the marginal return to investment in each period is
the same, and (3) the cost function is separable and identical across stages. Convexity does,
however, guarantee that efficient investment increases with V . Collecting terms, the
efficient per-stage investment level k̂ solves
\[ C'(\hat{k}) = V \int_{-\infty}^{\infty} g(\epsilon_1)\, g(\epsilon_1 + 2\hat{k} - \bar{q})\, d\epsilon_1. \]
4
Observe that even though the production function V (q1 , q2 ) is discontinuous at the point q1 +q2 = q̄,
the planner’s expected payoff P V is continuous in kt .
5
Since contracting issues are not central to this analysis, I do not outline the details of the contracting
game, such as the contract space, the bargaining power between the two parties, etc. Such a game is
straightforward to construct, as the principal will pay the agent W for success (q1 + q2 > q̄) and L for
loss (q1 + q2 < q̄). To guarantee full incentives to exert first best, the principal will set W − L = V .
Further details on this contract are available from the author upon request.
The remaining constraint is a bound on the reservation utilities. The total surplus from
having the entrepreneur undertake the project must be at least as large as the total
outside options across both stages. So,
P V − 2C(k̂) ≥ ū1 + ū2 .
Call this the project feasibility constraint.
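To make the upfront benchmark concrete, here is a minimal sketch that solves the first-order condition for k̂ by bisection. It assumes standard normal noise, a quadratic cost C(k) = k²/2, and hypothetical values of V and q̄; none of these numbers come from the paper. Under normal noise the integral in the first-order condition is the N(0, 2) density evaluated at q̄ − 2k̂.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

V, Q_BAR = 4.0, 2.0  # hypothetical project value and final hurdle

def cost(k):
    return 0.5 * k * k  # assumed cost function C(k) = k^2 / 2

def marginal_cost(k):
    return k  # C'(k) for the assumed quadratic cost

def marginal_return(k):
    """V * dP/dk at k1 = k2 = k, assuming standard normal noise."""
    m = 2 * k - Q_BAR
    return V * exp(-m * m / 4) / (2 * sqrt(pi))

def solve_k_hat(lo=0.0, hi=5.0, tol=1e-10):
    """Bisection on C'(k) - V dP/dk = 0 (negative at lo, positive at hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if marginal_cost(mid) - marginal_return(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

k_hat = solve_k_hat()
p_success = NormalDist().cdf((2 * k_hat - Q_BAR) / sqrt(2))
upfront_surplus = p_success * V - 2 * cost(k_hat)
print(k_hat, p_success, upfront_surplus)
```

The project feasibility constraint then amounts to comparing `upfront_surplus` with the sum of the outside options ū1 + ū2.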
2.2 Efficiency of Staging Investments

The main reason to conduct an evaluation halfway through a project is that it provides
the option to abandon the project if the early returns are low. The two parties will
use first stage output to compute the expected project value V (q1 , q2 ) after the second
stage. This yields an expected value of continuing. Because of the outside options, it is
efficient to continue only if this value exceeds these outside options.
If the VC and entrepreneur observe q1 after the first stage and must decide whether to
continue or not, their decision will depend on the observed q1 . Therefore the probability
of continuing and the total surplus from continuing will also depend upon this observed
q1 . The probability of clearing the final hurdle, conditional on a realized value q1 , is
\[ P(q_1) \equiv \Pr(q_1 + q_2 > \bar{q} \mid q_1) = \Pr(\epsilon_2 > \bar{q} - q_1 - k_2) = G(q_1 + k_2 - \bar{q}). \]
So the total surplus conditional on a realized q1 is
\[ S(q_1, k_2) = E_2 V(q_1, k_2 + \epsilon_2) - C(k_2) = P(q_1) V - C(k_2), \]
where Et denotes the expectation taken over εt . Call this the continuation surplus
function. For clarity, let S(q1 ) ≡ S(q1 , k2∗ ) be the continuation surplus evaluated at the
efficient investment level k2∗ .6 This continuation surplus function reflects the expected
total surplus from continuing after a realization of first stage output q1 . The continuation
6
Note that in general kt∗ , which is efficient under staged financing, differs from the k̂t from the
previous section, which is efficient under upfront financing.
decision rests entirely on this function. In particular, it is efficient to continue if and
only if S(q1 ) ≥ ū2 . The first result below shows that the continuation surplus function
is strictly increasing. This means there exists a unique cut-off output level q ∗ such that
S(q1 ) > ū2 if and only if q1 > q ∗ . In words, the planner sets the optimal target q ∗ such
that he is indifferent between continuing and abandoning the project at q1 = q ∗ . All proofs are
in the appendix.
Proposition 1 There exists a target q ∗ such that it is efficient only for entrepreneurs
with q1 > q ∗ to advance to the second stage.
Because q1 + q2 must exceed q̄ for the parties to collect positive surplus, the stages are connected;
output in the early stage signals final project value. Said differently, a successful early
stage (high q1 ) means that the project will have an easier time clearing the final hurdle,
and therefore a higher chance of both parties collecting surplus. Proposition 1 shows
that the continuation surplus is monotonic, and this generates the cutoff target q ∗ . In
practice, this q ∗ represents the milestone in between rounds of venture financing.7 If the
quality of the project clears this milestone, then the entrepreneur qualifies for the next
round of funds. The assumption on outside options is key. Without outside options, it
would be efficient to continue for any q1 since all parties get nothing by quitting and are
at least as well off continuing (recall that V is always nonnegative).
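The cutoff rule can be illustrated with a short sketch. It assumes standard normal noise, a quadratic cost, hypothetical parameter values, and a fixed second-stage investment k2 (in the model k2 is chosen efficiently). Under these assumptions the indifference condition S(q*, k2) = ū2 can be inverted in closed form.

```python
from statistics import NormalDist

G = NormalDist()              # assumed noise distribution: standard normal
V, Q_BAR, U2 = 4.0, 2.0, 1.0  # hypothetical value, hurdle, and stage-two outside option

def cost(k):
    return 0.5 * k * k        # assumed cost function C(k) = k^2 / 2

def continuation_surplus(q1, k2):
    """S(q1, k2) = G(q1 + k2 - q_bar) * V - C(k2)."""
    return G.cdf(q1 + k2 - Q_BAR) * V - cost(k2)

def milestone(k2):
    """q* solving S(q*, k2) = u2; needs 0 < (u2 + C(k2)) / V < 1."""
    return Q_BAR - k2 + G.inv_cdf((U2 + cost(k2)) / V)

k2 = 1.2                      # hypothetical second-stage investment
q_star = milestone(k2)

# Continuing beats the outside option only above the milestone.
for q1 in (q_star - 0.5, q_star, q_star + 0.5):
    print(round(q1, 3), round(continuation_surplus(q1, k2), 3))
```

Raising ū2 in this sketch raises the milestone, and as ū2 + C(k2) approaches zero the milestone falls without bound, in line with the remark that outside options are what make quitting efficient.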
Proposition 1 shows that staged financing generates more surplus than upfront fi-
nancing. Under upfront financing, the VC does not collect information on early stage
output, and therefore advances all projects regardless of their early performance. Under
staged financing, the investor sorts projects into two groups: stay or quit. The target q ∗
conducts the sorting, in that it allows only entrepreneurs with high output to proceed.
The VC can always set the target arbitrarily low, which permits continuation for all
output levels, and hence replicates upfront financing. By setting the target optimally,
the VC has an additional instrument to maximize total surplus, and therefore must be
weakly better off. This suggests that staged financing does more than simply minimize
7
For example, milestones separate early round financing (series A) from later round financing (series
B). New ventures must meet certain targets, such as number of employees hired, free cash flow, research
and development investments, progress on business plan, etc. These targets constitute the milestone
q∗ .
agency costs, as the prior literature has argued. Instead, staging is a tool to make both
VCs and entrepreneurs better off.
3 Effects of Staged Financing

Now that we know staged financing increases surplus, what is the efficient investment
level per stage under the staged financing regime? The previous section shows that
the continuation decision will take the form of a cut-off rule. Precisely, the VC sets
some target (or milestone) q ∗ after the first stage, and advances the entrepreneur only
if q1 > q ∗ . The probability of clearing the target is
P1 = Pr(q1 > q ∗ ) = G(k1 − q ∗ ).
As expected, this probability increases in first-stage investment since ∂P1 /∂k1 = g(k1 −
q ∗ ) > 0. The ex-ante probability of success is
\[ P \equiv \Pr(q_1 + q_2 > \bar{q},\; q_1 > q^*) = \int_{q^*}^{\infty} P(q_1)\, g(q_1 - k_1)\, dq_1, \]
where P (q1 ) = G(q1 + k2 − q̄) is the interim probability of clearing the final hurdle and
capturing V for each realization of q1 . Notice that
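\begin{align}
\frac{\partial P}{\partial k_2} &= \int_{q^* - k_1}^{\infty} g(\epsilon_1)\, g(\epsilon_1 + k_1 + k_2 - \bar{q})\, d\epsilon_1 > 0, \notag \\
\frac{\partial P}{\partial k_1} &= \frac{\partial P}{\partial k_2} + G(q^* + k_2 - \bar{q})\, \frac{\partial P_1}{\partial k_1} \tag{1} \\
&> \frac{\partial P}{\partial k_2}. \notag
\end{align}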
The returns to investment are positive for both stages, but are higher for the first
stage. Additional first stage investment increases the probability of success in two ways.
First, it increases q1 + q2 and thus directly increases the probability of final success. Second,
it increases first stage output (q1 = k1 + ε1 ) and so improves the chance of advancing
to the second stage. Hence the marginal benefit of first stage investment exceeds the
marginal benefit of second stage investment. It is incorrect to conclude from this, how-
ever, that it is efficient to invest more in the first stage, since this analysis both ignores
the cost of investment, and takes q ∗ as given, whereas in fact q ∗ is determined simulta-
neously with the optimal kt . Increasing first stage investment increases the chances of
advancing to the second stage, and thus increases the probability of bearing the cost of
a second stage investment. At the optimum, this cost is enough to push k1 below k2 .
To see this, it is necessary to solve the social planner’s problem.
The conditional probability of clearing the final hurdle, given that the entrepreneur
has reached the target, is
Q = Pr(q1 + q2 > q̄|q1 > q ∗ ).
So the ex-ante probability of clearing the hurdle q̄ is P = P1 Q. If the entrepreneur

passes the intermediate target q ∗ , the planner gets V if he clears q̄ and zero otherwise,
and bears cost C(k2 ). If he doesn’t pass the intermediate target, the planner gets only
ū2 . So the total surplus is

\[ P_1 \left[ Q V - C(k_1) - C(k_2) \right] + (1 - P_1) \left[ \bar{u}_2 - C(k_1) \right]. \]
Rearranging terms gives the planner’s problem
\[ \max_{k_t,\, q} \; P V - C(k_1) + (1 - P_1)\bar{u}_2 - P_1 C(k_2), \]
subject to project feasibility. The last term above is the cost of advancing to the second
stage. This cost is increasing in first-stage investment. As the VC invests more in stage
one, he increases the expected second-stage cost, since larger first-stage investments
increase the probability of making it to the second-stage. This cost forces first-stage
investment downward, ultimately below even second-stage investment. More generally,
it is possible to write the planner’s objective function in terms of the continuation surplus
function. So the planner solves
\[ \max_{k_t,\, q} \; \int_{q}^{\infty} S(q_1, k_2)\, g(q_1 - k_1)\, dq_1 + (1 - P_1)\bar{u}_2 - C(k_1), \tag{2} \]
where investment levels kt and the target q are the planner’s choice variables, and (kt∗ , q ∗ )
denote the efficient choices. The first term is the expected value of continuing: the
continuation surplus function integrated over all realizations of q1 > q ∗ . The middle term
(1 − P1 )ū2 is the expected value of abandoning the project. Both parties collect their
outside options if the project does not clear the target, which occurs with probability
1 − P1 . Note that C(k2 ) does not appear in the objective function explicitly because it
is embedded in S(q1 , k2). The planner bears the cost of C(k2 ) only in the event that the
entrepreneur advances.
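The planner's problem (2) can be explored numerically. The following is only a sketch under stated assumptions: standard normal noise, quadratic cost C(k) = k²/2, hypothetical values of V, q̄ and ū2, and a coarse grid search in place of a proper optimizer. For each k2 the target is set by the indifference condition S(q*, k2) = ū2, and the objective is integrated with the trapezoid rule.

```python
from statistics import NormalDist

N = NormalDist()              # assumed noise distribution: standard normal
V, Q_BAR, U2 = 4.0, 2.0, 1.0  # hypothetical value, hurdle, and outside option

def cost(k):
    return 0.5 * k * k        # assumed cost function C(k) = k^2 / 2

def cutoff(k2):
    """Target q* with S(q*, k2) = u2, or None if continuing never pays."""
    p = (U2 + cost(k2)) / V
    if p >= 1.0:
        return None
    return Q_BAR - k2 + N.inv_cdf(p)

def total_surplus(k1, k2, steps=300, upper=8.0):
    """Objective (2), written in terms of the first-stage shock e1 = q1 - k1."""
    q_star = cutoff(k2)
    if q_star is None:
        return U2 - cost(k1)  # always abandon after stage one
    lo = q_star - k1          # advance when e1 exceeds this value
    h = (upper - lo) / steps
    integral = 0.0
    for i in range(steps + 1):
        e1 = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        s = V * N.cdf(e1 + k1 + k2 - Q_BAR) - cost(k2)  # S(q1, k2)
        integral += w * s * N.pdf(e1)
    prob_quit = N.cdf(lo)     # 1 - P1
    return integral * h + prob_quit * U2 - cost(k1)

grid = [i / 10 for i in range(26)]  # k in {0.0, 0.1, ..., 2.5}
best, k1_star, k2_star = max(
    (total_surplus(a, b), a, b) for a in grid for b in grid
)
print(best, k1_star, k2_star, cutoff(k2_star))
```

With these illustrative parameters the grid optimum should place less investment in the first stage than in the second, and the maximized surplus should exceed the payoff ū2 from always abandoning after stage one.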
3.1 Primary Implication: Dynamic Capital Allocation

The following theorem solves the planner’s problem (2) for the efficient allocation of
resources across stages, and is the main result.
Theorem 1 It is efficient to invest more in the second stage (k1∗ < k2∗ ).
Since the planner sets the target optimally, the marginal return of an entrepreneur
who cleared the target exceeds the marginal return of an entrepreneur in the first stage.
Formally,
C ′ (k1∗ ) = E[S ′ (q1 )] < E[S ′ (q1 )|q1 > q ∗ ] = C ′ (k2∗ ).
The mean marginal return conditional on q1 > q ∗ exceeds the unconditional mean.
Since marginal costs are increasing, this implies that k1∗ < k2∗ . The marginal return to
investment is lower in stage one precisely because the entrepreneur may not advance to
the second stage. In this case, he bears the cost C(k1 ) but acquires the benefit V not
with certainty but with probability less than one. This lowers the marginal return in
stage one relative to stage two. At the optimum the VC selects kt to set the marginal
costs equal to the marginal returns, and so he shades investment downward in the early
stages. He will allocate more resources in the later stages of the project, where the
marginal return is higher. Rewriting the first order conditions in terms of the specific
production function here yields
\[ C'(k_1^*) = V \frac{\partial P}{\partial k_2} < V \frac{\partial P / \partial k_2}{P_1} = V \frac{\partial Q}{\partial k_2} = C'(k_2^*). \]
Thus, those who make it to the second stage are more valuable precisely because their
first stage output was sufficiently high. The VC invests more in the later stage because
the new venture is “in the running” to become highly successful. It is inefficient for
the VC to dump too many resources into a horse that won’t finish the race.
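The chain of equalities above can be traced back to the planner's first-order conditions. What follows is a sketch of that step, assuming an interior optimum and writing W for the planner's objective (a label introduced here for convenience); the formal proof is in the appendix.

```latex
\begin{align*}
W &= P V - C(k_1) + (1 - P_1)\bar{u}_2 - P_1 C(k_2), \\[4pt]
\text{target } q: \quad
  & G(q^* + k_2 - \bar{q})\, V - C(k_2) = \bar{u}_2
  \quad \text{(i.e. } S(q^*, k_2^*) = \bar{u}_2 \text{)}, \\[4pt]
\text{stage two } k_2: \quad
  & V \frac{\partial P}{\partial k_2} = P_1\, C'(k_2^*), \\[4pt]
\text{stage one } k_1: \quad
  & C'(k_1^*) = V \frac{\partial P}{\partial k_1}
      - \bigl[\bar{u}_2 + C(k_2^*)\bigr] \frac{\partial P_1}{\partial k_1} \\
  &\phantom{C'(k_1^*)} = V \frac{\partial P}{\partial k_2}
      + \bigl[ G(q^* + k_2^* - \bar{q})\, V - C(k_2^*) - \bar{u}_2 \bigr]
        \frac{\partial P_1}{\partial k_1}
      \quad \text{by (1)} \\
  &\phantom{C'(k_1^*)} = V \frac{\partial P}{\partial k_2}
      = P_1\, C'(k_2^*) < C'(k_2^*).
\end{align*}
```

The bracketed term vanishes because the target is set optimally, so the extra first-stage benefit in equation (1) is exactly offset by the expected second-stage cost and forgone outside option. Since P1 < 1 and C′ is increasing, k1∗ < k2∗ follows.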
Stage          Amount ($ thousands)   Number
Seed                            921      122
Early Stage                    1054      114
First Stage                    1928      288
Other early                    2182      221
Expansion                      2343      377
Second Stage                   2507     2482
Third Stage                    2784      181
Bridge                         2702      454

Figure 1: Data on Staged Financing from Gompers (1995).
Gompers (1995) provides data on staged financing, which fits the predictions of this
model. His data comes from a sample of 794 venture capital backed firms randomly
selected from the Venture Economics database. Figure 1 shows his summary statistics
on amounts of funding by stage for all firms in the sample. Average funding rises
almost monotonically from the seed stage to the later stages. This is consistent with k1 < k2 . Moreover,
his regression results (Panel B of page 1479) show that average late stage investments
exceed average early stage investments by $1.3 to $2.03 million, and exceed average
middle stage investments by $0.7 to $1.21 million. Therefore, the predictions of this
model are at least consistent at a first pass with the existing empirical work of Gompers
(1995).
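The pattern in Figure 1 can be checked with a few lines of arithmetic. The sketch below copies the average amounts from the figure and assumes, for illustration only, that the row order reflects the order of the stages; the helper names `declines` and `seed_to_peak_ratio` are hypothetical and do not come from Gompers (1995).

```python
# Average round size by stage, in $ thousands, copied from Figure 1.
# Assumption: the row order is treated as the chronological order of stages.
ROUNDS = [
    ("Seed", 921),
    ("Early Stage", 1054),
    ("First Stage", 1928),
    ("Other early", 2182),
    ("Expansion", 2343),
    ("Second Stage", 2507),
    ("Third Stage", 2784),
    ("Bridge", 2702),
]


def declines(rounds):
    """Return the (earlier, later) stage pairs where the average amount falls."""
    return [
        (earlier[0], later[0])
        for earlier, later in zip(rounds, rounds[1:])
        if later[1] < earlier[1]
    ]


def seed_to_peak_ratio(rounds):
    """Ratio of the largest average round to the seed round."""
    seed = rounds[0][1]
    peak = max(amount for _, amount in rounds)
    return peak / seed


if __name__ == "__main__":
    print("Stage pairs with a decline:", declines(ROUNDS))
    print("Peak-to-seed ratio:", round(seed_to_peak_ratio(ROUNDS), 2))
```

Under this reading of the table, the only decline is from the third stage to the bridge stage, and the largest average round is roughly three times the seed round.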
The premise of Gompers (1995) is to explain staged financing as an instrument to
mitigate agency conflicts. While the data is consistent with such an interpretation, there
is no formal model in his paper. In contrast, our formal model not only generates the
same prediction that venture capitalists will stage investments and such investments
rise with time, but it also generates new comparative statics on the outside options and
output variance. The agency models on staged financing cited above say nothing on
these last two points. We now turn to these comparative statics of the model.
4 Secondary Implications: Comparative Statics
While some prior theoretical work has made explicit predictions on the ratio of investment
levels over stages (Neher (1999), Giat et al. (2009), Hsu (2002)), these papers
have not made predictions on how this ratio $k_1/k_2$ varies in different environments, such
as industries with different levels of market risk, projects with different levels of
technological feasibility, or more prestigious venture capitalists with better outside options.
The objective of this section is to explore how the endogenous variables (kt∗ , q ∗ ) vary
with the exogenous parameters (ū2 , v, q̄). In particular, the aim is to produce a number
of secondary implications that can be tested against real venture capital data. If this
ratio k1∗ /k2∗ increases, the VC skews the investment mix towards the early stage, and
vice versa. To get traction on the model, parameterize the cost function as $C(k_t) = \lambda k_t^{\gamma}$
for some constants $\lambda > 0$, $\gamma > 1$. The next result predicts how the ratio of investment
levels varies with the outside options of both parties.
Proposition 2 The ratio of early to late round financing k1∗ /k2∗ decreases in the outside
options ū2 .
As the outside options improve, it is efficient to invest even more money in later
rounds. The outside options represent the opportunity cost of alternative investments
in the second stage. As this opportunity cost increases, investors are more reluctant
to fund ventures since alternative opportunities are promising. This causes the efficient
investment level k1∗ to sink. However, once those ventures do clear the hurdle it is efficient
to invest more, so k2∗ rises. The net effect is that the ratio k1∗ /k2∗ sinks. Ultimately, good
outside opportunities allow VCs to withhold early round investments relative to later
round investments. Similarly, good outside opportunities for the entrepreneur make it
tempting for him to abandon projects with low early returns, and therefore this will cause
him to invest less effort and resources into the project in the early stage. Ultimately,
the outside options capture the opportunity cost of investment, and therefore measure
the tolerance for poor projects. With high outside options, this tolerance is low, and
therefore both parties invest less in the first stage.
In practice, there is wide heterogeneity among venture capitalists and entrepreneurs
in terms of their outside options. For example, VC firms with successful records in bring-
ing new ventures to an IPO and generating outside profits for their limited partners will
often enjoy high outside options. These VC firms are routinely flooded with capital from
limited partners as well as with proposals from many different entrepreneurs.8 Similarly,
entrepreneurs vary in their outside options as well. Successful managers at existing com-
panies, or entrepreneurs with a prior record of performance in new companies, will no
doubt enjoy multiple offers from management teams and VCs alike. Provided it is pos-
sible to measure the outside options of the VC or the entrepreneur, Proposition 2 gives
a clear prediction on how the ratio of investment levels over stages will vary with these
outside options.
To what extent does the first best analysis here prevent or hinder empirical veri-
fication of the predictions? In particular, is it necessary to specify the breakdown of
the outside options between the two parties in order to test Proposition 2? Sorensen
(2007) structurally estimates a two-sided matching model of Silicon Valley VCs and
entrepreneurs. He finds that high quality VCs match with high quality entrepreneurs,
i.e. he finds evidence of positive sorting. This suggests that the outside options of the
entrepreneur and VC “move together” — when the VC has high outside options (high
quality), so does the entrepreneur. This makes Proposition 2 especially relevant, as the
parameter of interest is the total outside option of both parties. These options will be
high among parties of high quality and low among parties of low quality.
Now consider what happens with an increase in the output variance, i.e. the variance
on the error distribution g. For the remaining implications, let the cost of investment
be quadratic (γ = 2) and the error distribution be uniform.
Proposition 3 As the output variance increases, $k_t^*$ decreases while $k_1^*/k_2^*$ increases. If
$\bar{u}_2 > V/2$, then $q^*$ increases.
As the variance of g increases, it is clear that this will choke off investment in both
stages. This is the same intuition from the Lazear and Rosen (1981) tournament model,
in which increased noise reduces effort incentives. What is not obvious is whether the
decrease in investment is larger in stage one versus stage two. It seems plausible that an
increase in noise will cause the entrepreneur to withhold investment in the early round,
and work harder in the later rounds. While this logic is compelling, it is misleading.
8 The VC firms that funded the major internet companies in the late 1990s, such as Kleiner Perkins
or Sequoia Capital, have higher opportunity costs than less successful VC firms.
An increase in output variance affects later stage investment more than first stage
investment. This occurs because the marginal benefit to first stage investment exceeds
the marginal benefit to second stage investment (see (1)), since investment in the first
stage not only affects the probability of clearing the final hurdle q̄ but also of clearing
the milestone q ∗ . Because the investor and the entrepreneur can quit the project if
output does not clear q ∗ , an increase in the output variance increases the upside benefit
from continuing. A larger first stage investment increases the chance of capturing this
upside, and this gives an extra benefit to investing in the early stage rather than the later
stage. Therefore, an increase in output variance decreases investment in both stages,
but decreases late stage investment more than early stage investment.
Recall that the stage specific noise terms represent market and technological uncer-
tainty at each stage. In reality, there is clearly a variation between different industries
on this uncertainty. For example, some industries may have high uncertainty at the mar-
ket level, possibly reflecting difficulty in bringing a new firm to market because of the
strategic position of incumbents. On the other hand, some industries may exhibit high
technological uncertainty, deriving from the production function itself; for example, the
biological process of drug development may impose higher uncertainty on new biotech
firms than technological uncertainty in other industries. This variation in market and
technological uncertainty can be exploited to predict variation in the ratio of investment
levels over stages. Finally, observe from Proposition 3 that if the investor and
entrepreneur have sufficiently good outside options ($\bar{u}_2 > V/2$), then they will set higher
targets when the output variance increases. So in more risky industries, it is efficient to
set a higher milestone to justify later round financing.
Proposition 4 As the value of the venture V increases, q ∗ decreases while k1∗ and k2∗
both increase.
This comparative static is perhaps the most straightforward. As the venture becomes
more valuable, it is efficient to invest more in each round. Said differently, each party is
more willing to invest more and bear a higher cost of investment if the resulting benefit
increases. The proof of the proposition shows that as the variance in the noise terms
becomes sufficiently large, the ratio of early to late investments k1∗ /k2∗ does not vary with
V . Therefore, even though the VC invests more in each stage, the ratio of investments
across stages eventually stays constant. On top of this, higher valuation ventures should
exhibit lower milestones (q∗) between early and late stages. Since q∗ decreases while kt∗
increases, the probability of clearing the milestone, P1 = G(k1∗ − q∗), rises. Intuitively,
the venture is more valuable, and so it becomes more desirable to pass at the interim
stage, as this generates surplus for both parties. Passing the interim hurdle is made easier
by simultaneously increasing investments in each stage and decreasing the milestone,
thereby increasing the probability of continued investment. It is efficient to do this
precisely because the end game prize V is worth more.
In practice, measuring q ∗ directly may be difficult, as VCs may not have hard, ob-
jective criteria when deciding whether to continue funding projects or not. For example,
part of the evaluation may be based on instinct for whether the project will be success-
ful or not. Nonetheless, higher milestones are harder to clear than lower milestones,
and therefore VCs that set high milestones will abandon many ventures at the interim
stage. Similarly, VCs with low milestones will tell most of their entrepreneurs to continue.
Therefore one such empirical proxy for q ∗ is the number of firms abandoned at the in-
terim stage divided by the total number of firms funded at the outset. In this sense, the
milestone q ∗ reflects the quit rate or abandonment rate of the VC and entrepreneur.
Proposition 5 As the final hurdle q̄ increases, k1∗ decreases, q ∗ increases, and k2∗ is
unchanged.
Recall that q̄ is the final hurdle that output must clear in order for both parties to
receive value from the venture, and therefore reflects the fundamental difficulty of project
completion (because of market or technology factors). As the hurdle increases, it is
efficient to decrease first stage investments and leave late stage investments unchanged,
thus decreasing the ratio of early to late stage investments. Formally, q̄ affects the
planner’s payoffs only through the probability of success P . Specifically, for every q1 ,
P (q1 ) = G(q1 + k2 − q̄) decreases in q̄. Therefore the expected benefit P V decreases
in q̄. As the benefit sinks, the VC lowers costly investment k1 . In fact, a marginal
increase in q̄ has the opposite effect of a marginal increase in V . As q̄ increases, the VC
simultaneously decreases k1 and increases q ∗ , thus lowering the probability of clearing
the target, since P1 = G(k1 − q ∗ ). In other words, when the project’s difficulty increases,
this lowers the expected benefit to the VC, so he reduces the probability of advancing
the entrepreneur at the intermediate stage.
The concrete empirical prediction is that industries with higher final hurdles should
observe more funding in later stages (lower $k_1/k_2$). Observe that this is the opposite
prediction from an increase in variance, as predicted by Proposition 3. As an empirical matter,
it will be important to distinguish high hurdles from high risk industries. The empirical
measures of these variables may be close even though the theoretical concepts are quite
different. For example, consider the market for an AIDS vaccine. It is plausible that
AIDS research is both highly risky (high variance on g), and that it is very difficult to
discover an actual vaccine (high final hurdle q̄). Finally, the comparative static with
respect to the intermediate target is particularly elegant: $\partial q^*/\partial \bar{q} = 1$. For every unit increase
in the final hurdle, it is efficient to increase the milestone by exactly that amount.
The last comparative static of this section involves the cost of investment.9 On
the VC side, the cost of investment includes the transaction costs of deploying capital
from existing funds, as well as time and labor spent attracting additional capital from
institutional investors through raising a new fund. For example, the large influx of
capital from public equity into private equity over the last twenty years (Prowse (1998))
has made it easier for VCs to raise new funds, and constitutes a reduction in the cost
of investment λ. As such, Proposition 6 predicts that second stage investments will
increase.10 For the entrepreneur, the cost of investment includes his cost of effort as well
as the cost of deploying his own capital in the firm. For example, suppose in the very
early stage of the venture, the entrepreneur finances the project with his own savings
and a small loan from the bank. Interest rates that govern the bank loan will affect his
cost of investment, and hence higher interest rates correspond to higher λ. According
to Proposition 6, this results in the VC setting a higher milestone and making a smaller
second round investment.
Proposition 6 As the cost of investment λ increases, k2∗ decreases and q ∗ increases.
Market factors which increase the cost of investment will decrease later round invest-
ments and increase the milestone q ∗ .
9 Recall that the cost of investment is $C(k_t) = \frac{\lambda}{2} k_t^2$, so λ is a parameter that scales the cost of
investment.
10 According to Prowse (1998), the venture capital market grew from $600 in 1980 to over $4 billion
in 1984. In fact, the market grew five-fold from the years 1980 to 1984 itself. The volume of venture
capital available post-1980 dwarfs the pre-1980 levels.
5 Ability
When projects fail, it is difficult to separate technology failure from management failure.
This happens because output measures cannot disentangle the entrepreneur’s ability
from technological uncertainty when assessing the performance of a project. This section
models optimal dynamic decision-making when ability and uncertainty are impossible
to observe separately. Ability is a shorthand for project-specific ability, measuring the
quality of the match between the entrepreneur and the project.
The benchmark model now includes an underlying ability parameter that persists
through both stages. Ability a is distributed according to a prior distribution f (·) with
support A. Since neither the VC nor the entrepreneur knows the entrepreneur’s ability,
the uncertainty is symmetric. Suppose that
$$q_t = a + k_t + \epsilon_t.$$
Clearly, higher ability increases output. More precisely, ability and investment are
substitutes, so more able (higher a) entrepreneurs can invest less to generate the same
output as less able entrepreneurs. Including a persistent ability term induces correlation
in output across stages, as the error term is now effectively $a + \epsilon_t$. High ability which
generates high output today will also generate high output tomorrow. This is the main
intuition which drives the results of this section.
For each $a \in A$, the probability of passing the intermediate target $q^*$ is
$$\Pr(q_1 > q^*) = \Pr(\epsilon_1 > q^* - a - k_1) = G(a + k_1 - q^*).$$
This probability increases in a, so higher ability entrepreneurs are more likely to clear
the target. The ex-ante probability of passing the target, averaging over A, is
$$P_1 \equiv \int_A G(a + k_1 - q^*)\, f(a)\, da.$$
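As an illustration of this averaging, the sketch below evaluates $P_1$ numerically under the additional assumption, not imposed at this point in the text, that both ability and the error are normal: $a \sim N(a_0, t^2)$ and $\epsilon_1 \sim N(0, s^2)$. In that special case $a + \epsilon_1$ is itself normal, so the integral should agree with the normal CDF evaluated at $(a_0 + k_1 - q^*)/\sqrt{s^2 + t^2}$. The function names and the parameter values in the usage lines are hypothetical.

```python
import math


def normal_cdf(x, sd):
    """CDF of a mean-zero normal with standard deviation sd."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def normal_pdf(x, mean, sd):
    """Density of a normal with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def pass_probability_numeric(k1, q_star, a0, t, s, n=4000):
    """Trapezoid-rule approximation of P1 = integral of G(a + k1 - q*) f(a) da.

    Assumption: normal prior on ability (sd t) and normal error (sd s);
    the integral is truncated at eight prior standard deviations.
    """
    lo, hi = a0 - 8.0 * t, a0 + 8.0 * t
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        a = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * normal_cdf(a + k1 - q_star, s) * normal_pdf(a, a0, t)
    return total * h


def pass_probability_closed_form(k1, q_star, a0, t, s):
    """Same probability, using that a + eps is N(a0, s^2 + t^2)."""
    return normal_cdf(a0 + k1 - q_star, math.sqrt(s * s + t * t))


if __name__ == "__main__":
    args = dict(k1=0.5, q_star=1.0, a0=0.2, t=1.0, s=0.8)
    print(pass_probability_numeric(**args))
    print(pass_probability_closed_form(**args))
```

The two functions should return nearly identical values, and both rise with the prior mean $a_0$, which mirrors the statement that more able entrepreneurs are more likely to clear the target.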
Conditional on output realization $q_1$, the probability of passing the final hurdle $\bar{q}$ for
each $a \in A$ is
$$\Pr(q_1 + q_2 > \bar{q} \mid q_1) = \Pr(\epsilon_2 > \bar{q} - q_1 - a - k_2) = G(k_2 + a + q_1 - \bar{q}).$$
Similarly, high ability entrepreneurs are more likely to clear the final hurdle as well.
The ex-ante probability of clearing $\bar{q}$ averaging over all $a \in A$ is
$$P(q_1) \equiv \int_A G(k_2 + a + q_1 - \bar{q})\, f(a \mid q_1)\, da.$$
Let $S(q_1, k_2)$ be the continuation surplus function, i.e. the expected surplus of
continuing, given that first stage output is $q_1$. This is the same function as before, except
now it is necessary to take expectations over a. So
$$S(q_1, k_2) = V \int_A G(k_2 + a + q_1 - \bar{q})\, f(a \mid q_1)\, da - C(k_2).$$
It is efficient for the entrepreneur to continue on the project if the continuation
surplus exceeds the outside options, or if $S(q_1) \equiv S(q_1, k_2^*) \ge \bar{u}_2$. As before, a cutoff
strategy is a target $q^*$ that the planner sets, such that the entrepreneur continues to
work if $q_1 > q^*$ but not otherwise. If $S(q_1)$ is strictly increasing, then the entrepreneur
will use a cutoff strategy.
The posterior mean E[a|q1 ] measures the entrepreneur’s revised estimate of his ability
after the first stage. If the posterior mean increases with q1 , then high early output
signals high ability. In other words, with higher realizations of q1 , the entrepreneur
updates his posterior mean, giving him a revised estimate of his ability. A high q1 is
more likely to emerge from an entrepreneur with a high ability, so the entrepreneur will
revise his posterior mean upward. A fairly weak condition that guarantees that E[a|q1 ]
increases with q1 is the monotone likelihood ratio property of the posterior density.
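Definition 1 The posterior $f(a \mid q_1)$ satisfies the monotone likelihood ratio property (MLRP) if the
likelihood ratio $L(a) \equiv f_{q_1}(a \mid q_1)/f(a \mid q_1)$ increases in a, where $f_{q_1}$ denotes the partial
derivative of the posterior density with respect to $q_1$.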
Proposition 7 Under MLRP, the entrepreneur uses a cutoff strategy. There exists a q ∗
such that it is efficient for entrepreneurs with q1 > q ∗ to continue into the second stage.
Intuitively, introducing persistent ability induces correlation across stages. This cor-
relation makes it worthwhile to cut projects with low output. The posterior mean E[a|q1 ]
increases in q1 under MLRP. So if an entrepreneur sees a high q1 , he learns (in a precise
Bayesian sense) of his high ability. Because ability persists into the second stage, he
will most likely get a high q2 , and is therefore more likely to clear the final hurdle. So
it makes sense for him to stay after a high q1 . The entrepreneur stays not only because
output is high today, but because he learns of his high ability that leads to high output
tomorrow. Since the analysis solves the social planner’s problem, the interests of the
entrepreneur and VC are aligned. Thus the VC would also like to keep an entrepreneur
with a high q1 because he learns of the entrepreneur’s high ability, which will affect both
second stage output and the probability of clearing the hurdle. This ability-induced
correlation across stages is sufficient to guarantee the use of a cutoff strategy.
5.1 Efficient Dynamic Capital Allocation

This section analyzes the efficient amount of investment across stages. The main result
of the paper is robust even after including ability in the model. Specifically, the sorting
effect of the efficient target q ∗ will bias investment upwards in the second stage.
To gain intuition on the problem, suppose that errors are distributed normally, so
$\epsilon_t \sim N(0, s^2)$, and the ability parameter is distributed $a \sim N(a_0, t^2)$. Call $s^2$ the error
variance and $t^2$ the prior variance, i.e. the variance on the prior distribution of ability.
Calculation shows that the posterior density is normal with moments
$$E[a \mid q_1] = \frac{t^2}{t^2 + s^2}\,(q_1 - k_1) + \frac{s^2}{t^2 + s^2}\, a_0;$$
$$\mathrm{Var}(a \mid q_1) = \frac{s^2 t^2}{t^2 + s^2}.$$
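The updating rule above can be sketched in a few lines. The function name `posterior_ability` and the numerical values in the usage lines are illustrative assumptions; the formulas are the posterior mean and variance just stated.

```python
def posterior_ability(q1, k1, a0, prior_var, error_var):
    """Posterior mean and variance of ability a after observing q1.

    Normal-normal updating: the signal is q1 - k1 = a + eps, with
    a ~ N(a0, prior_var) and eps ~ N(0, error_var).
    """
    weight_on_signal = prior_var / (prior_var + error_var)
    mean = weight_on_signal * (q1 - k1) + (1.0 - weight_on_signal) * a0
    var = prior_var * error_var / (prior_var + error_var)
    return mean, var


if __name__ == "__main__":
    # Equal variances: the posterior mean is halfway between signal and prior mean.
    print(posterior_ability(q1=3.0, k1=1.0, a0=0.0, prior_var=1.0, error_var=1.0))
    # Very precise prior: the posterior mean stays close to a0.
    print(posterior_ability(q1=3.0, k1=1.0, a0=0.0, prior_var=1e-9, error_var=1.0))
    # Very precise signal: the posterior mean stays close to q1 - k1.
    print(posterior_ability(q1=3.0, k1=1.0, a0=0.0, prior_var=1.0, error_var=1e-9))
```

The three calls correspond to equal weights, a nearly degenerate prior, and a nearly noiseless signal, respectively.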
The posterior mean is a linear function of the output realization q1 − k1 and the prior
mean a0 , placing more weight on the term with smaller variance. For example, as the
prior variance decreases to zero, the posterior mean converges to the prior mean. So as
the entrepreneur’s information on his prior improves, his posterior places little weight
on his output realization and more weight on the prior mean. Similarly, as the error
variance decreases to zero, the posterior mean converges to the realization q1 − k1 . With
a low error variance, the output realization gives an accurate signal of ability, and so
the posterior mean reflects this. Solving the social planner’s problem shows the paper’s
main result is robust.
Proposition 8 It is efficient to invest more in the second stage (k1∗ < k2∗ ).
The same intuition from the main result earlier in the paper holds here: the possibility
of halfway termination lowers the marginal return to investment in stage one, and so
the VC shades investment downward in the first stage. Sorting guarantees that the
more able entrepreneurs advance, and they invest more because they no longer face the
threat of termination that they did in the first stage. The analysis here suggests that it
is efficient to invest more once all parties learn of the entrepreneur’s higher output and
hence higher ability.
This section has shown that the benchmark model is robust to including an ability
parameter for the entrepreneur, arguably a more realistic setting. The only additional
assumption needed is MLRP of the posterior density.
6 Conclusion
Staged financing is a fundamental feature of the venture capital market. VCs do not
fund new ventures all at once, but instead deliver the investments in stages, forcing the
project to clear a sequence of milestones in order to guarantee future funding. While
the relationship between a VC and an entrepreneur is no doubt plagued by agency
problems and asymmetric information, this paper shows that staged financing can be
explained with a simpler and more robust efficiency argument. VCs stage investments
not necessarily to mitigate moral hazard nor as a response to private information, but
simply because doing so maximizes total surplus. A common critique of agency models
is that they take the information environment as given, and cannot explain why both
parties do not bargain or trade from second-best outcomes towards first-best outcomes.
The argument here is immune to that criticism.
Not only is staged financing efficient, but it skews the allocation of investment to-
wards later stages. This backloading of investments is an empirical regularity established
in the venture capital literature. Staged financing creates the possibility of termination
after the early stage, and this introduces uncertainty into the early stage. This uncer-
tainty decreases the expected surplus in stage one, and therefore, it is efficient to invest
less in stage one. Once the entrepreneur has proven his first stage output to be high
(q1 > q ∗ ), this uncertainty vanishes, and expected surplus rises. Because of this, it is
efficient to invest more in the later stage.
The existing empirical literature on venture capital documents that investments in
later rounds exceed those in earlier rounds. Therefore, the model is consistent with ex-
isting empirical work. Moreover, the model produces a number of empirical implications
that have not yet been tested, and hopefully will stimulate future empirical papers. The
secondary implications of the model all predict how the ratio of investment levels over
stages (k1∗ /k2∗ ) varies with the parameters of the model, such as the outside options of
both parties, the variance in the error distribution, and the difficulty of project comple-
tion. To summarize, it is efficient to invest even more in the later stages if the outside
options of both parties increase, the error variance decreases, or the final hurdle q̄ in-
creases. I provide several suggestions for empirical proxies for the exogenous parameters
of the model. This hopefully provides guidance to explain some of the variation between
firms and industries in terms of their outside options, market and technological volatility,
difficulty of project completion, and cost of investment. Prior agency models of venture
capital have been unable to produce these testable implications. That the model here
can be tested is both its distinguishing feature and its primary strength.
Future work in this area can extend this model in a number of promising directions.
For example, the valuation of the project V is known by both parties at the outset,
though in practice the firm’s valuation is highly uncertain prior to the initial public
offering. Also, it is an open question as to how syndicates of venture capitalists investing
simultaneously in a firm will change the conclusions of this paper. I assumed throughout
that the VC acts as a single entity, though in practice a lead venture capitalist provides
the majority of the financing while secondary VCs share the risk by holding a minority
share of the equity in the firm. Ultimately, the contribution of this paper is conceptual
in nature: efficiency can explain certain features of venture capital markets more cleanly
and simply than complex moral hazard arguments. Agency theory has done much to
improve our understanding of financial contracting over the last quarter century, but
there are times when a first-best explanation can do just as well, if not better.
7 Appendix
Proof of Proposition 1
Let $S(q_1) \equiv S(q_1, k_2^*)$ and $q_1^* = k_1^* + \epsilon_1$. Since $g > 0$,
$$S'(q_1) = \frac{\partial S(q_1, k_2^*)}{\partial q_1} = P'(q_1)V = V g(q_1 + k_2^* - \bar{q}) > 0.$$
So continuation surplus is strictly increasing and continuous in $q_1$. Recall that $V(q_1, q_2) \to 0$
as $q_t \to -\infty$ for some t. Since $\bar{u}_2 > 0$, there exists an x small enough such that
$0 < S(x) < \bar{u}_2$. Now
$$\begin{aligned}
PV - C(k_1^*) - C(k_2^*) &= E_1 S(q_1^*) - C(k_1^*) \\
&= E_1\big[E_2 V(q_1^*, k_2^* + \epsilon_2) - C(k_2^*)\big] - C(k_1^*) \\
&= E\big[V(q_1^*, q_2^*)\big] - C(k_2^*) - C(k_1^*) \\
&\ge \bar{u}_1 + \bar{u}_2,
\end{aligned}$$
where the inequality follows from project feasibility. Therefore E1 S(q1∗ ) > ū2 . By
the mean value theorem there exists a y ∈ R such that S(y) = ES(q1∗ ), and hence
S(y) > ū2 > S(x). By the intermediate value theorem there exists a q ∗ ∈ (x, y) such
that S(q ∗ ) = ū2 . If S(q1 ) < ū2 , it is efficient to terminate the project. Since S(q1 ) is
monotonically increasing in q1 , this holds for q1 < q ∗ as well.
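The existence argument suggests a simple way to locate the cutoff numerically: because $S$ is continuous and strictly increasing, bisection on $S(q_1) - \bar{u}_2$ converges to $q^*$. The sketch below is only an illustration under assumptions that are not part of the proof: the error is standard normal with a chosen standard deviation, the cost is quadratic, and second stage investment $k_2$ is held fixed rather than optimized. All function names and parameter values are hypothetical.

```python
import math


def normal_cdf(x, sd=1.0):
    """CDF of a mean-zero normal with standard deviation sd."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def continuation_surplus(q1, k2, V, q_bar, lam, sd):
    """S(q1) = V * G(q1 + k2 - q_bar) - C(k2), with C(k) = lam * k**2 / 2."""
    return V * normal_cdf(q1 + k2 - q_bar, sd) - 0.5 * lam * k2 ** 2


def cutoff(u2, k2, V, q_bar, lam, sd, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the q* that solves S(q*) = u2 (k2 taken as given)."""
    def gap(q):
        return continuation_surplus(q, k2, V, q_bar, lam, sd) - u2

    if not (gap(lo) < 0.0 < gap(hi)):
        raise ValueError("S(q1) - u2 does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid   # surplus below the outside option: cutoff lies above mid
        else:
            hi = mid   # surplus at or above the outside option: cutoff lies below
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    params = dict(k2=1.0, V=10.0, q_bar=3.0, lam=1.0, sd=1.0)
    q_star = cutoff(u2=3.0, **params)
    print(q_star, continuation_surplus(q_star, **params))
```

With these illustrative values the continuation surplus at the returned cutoff equals the outside option up to numerical tolerance, and it is below the outside option for smaller $q_1$ and above it for larger $q_1$.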
Proof of Theorem 1. Recall that
$$S(q_1) = P(q_1)V - C(k_2) \quad \text{and} \quad P(q_1) \to 0 \text{ as } q_1 \to -\infty.$$
The planner solves
$$\max_{k_t,\, q} \int_q^{\infty} S(q_1, k_2)\, g(q_1 - k_1)\, dq_1 + (1 - P_1)\bar{u}_2 - C(k_1).$$
The first order conditions with respect to $q$, $k_2$, $k_1$ are
$$S(q^*) = \bar{u}_2;$$
$$\int_{q^*}^{\infty} \frac{\partial S(q_1, k_2)}{\partial k_2}\, g(q_1 - k_1^*)\, dq_1 = 0;$$
$$C'(k_1^*) = -\int_{q^*}^{\infty} S(q_1)\, g'(q_1 - k_1^*)\, dq_1 - g(q^* - k_1^*)\,\bar{u}_2,$$
where $S(q_1) = S(q_1, k_2^*)$, and $S'(q_1) = \partial S(q_1, k_2^*)/\partial q_1$.
In what follows, we write $k_t$ for $k_t^*$ for visual clarity. Substituting $S(q^*) = \bar{u}_2$ into
the last equation and integrating by parts gives
$$C'(k_1) = \int_{q^*}^{\infty} S'(q_1)\, g(q_1 - k_1)\, dq_1.$$
From the continuation surplus function $S(q_1) = P(q_1)V - C(k_2)$,
$$S'(q_1) = g(q_1 + k_2 - \bar{q})V;$$
$$\frac{\partial S}{\partial k_2} = g(q_1 + k_2 - \bar{q})V - C'(k_2).$$
Combining these gives
$$\frac{\partial S}{\partial k_2} = S'(q_1) - C'(k_2).$$
Integrating both sides and combining with the FOC for $k_2$ yields
$$0 = \int_{q^*}^{\infty} \frac{\partial S(q_1)}{\partial k_2}\, g(q_1 - k_1)\, dq_1 = \int_{q^*}^{\infty} S'(q_1)\, g(q_1 - k_1)\, dq_1 - P_1 C'(k_2),$$
where $P_1 = \Pr(q_1 > q^*)$. Now combining with FOC for $k_1$ gives
$$C'(k_1) = \int_{q^*}^{\infty} S'(q_1)\, g(q_1 - k_1)\, dq_1 = P_1 C'(k_2) < C'(k_2).$$
And, since marginal costs are increasing, this means $k_1 < k_2$.
Proof of Proposition 2
Consider possible values $\bar{u}_a$ and $\bar{u}_b$ for $\bar{u}_2$, with $\bar{u}_a$ corresponding to $q_a^*$, $k_1^a$, and $k_2^a$.
Let $S_a(q_1) = S(q_1, k_2^a)$. And let $\bar{u}_b$ correspond to $q_b^*$, $k_1^b$, and $k_2^b$, where $S_b(q_1) = S(q_1, k_2^b)$.
Lemma 1 If $\bar{u}_a < \bar{u}_b$, then $q_a^* - k_1^a < q_b^* - k_1^b$.
Proof: Suppose the contrary, that $q_a^* - k_1^a \ge q_b^* - k_1^b$. For clarity, write $\varepsilon$ for $\varepsilon_1$. Let
$F(k_1^*, k_2^*, q^* \mid \bar{u}_2)$ be the value function of the social planner’s objective function, so
$$F(k_1^*, k_2^*, q^* \mid \bar{u}_2) = \max_{k_t,\, q} \int_q^{\infty} S(q_1, k_2)\, g(q_1 - k_1)\, dq_1 + (1 - P_1)\bar{u}_2 - C(k_1),$$
where $P_1 = 1 - G(q^* - k_1^*)$.
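By optimality of $(q_a^*, k_1^a, k_2^a)$ and $(q_b^*, k_1^b, k_2^b)$,
$$F(k_1^a, k_2^a, q_a^* \mid \bar{u}_a) > F(k_1^b, k_2^b, q_b^* \mid \bar{u}_a) \quad \text{and} \quad F(k_1^b, k_2^b, q_b^* \mid \bar{u}_b) > F(k_1^a, k_2^a, q_a^* \mid \bar{u}_b).$$
Expanding,
$$\begin{aligned}
&\bar{u}_a \int_{-\infty}^{q_b^* - k_1^b} g(\varepsilon)\, d\varepsilon + \int_{q_b^* - k_1^b}^{\infty} S_b(\varepsilon + k_1^b)\, g(\varepsilon)\, d\varepsilon - C(k_1^b) \\
&\quad < \bar{u}_a \int_{-\infty}^{q_a^* - k_1^a} g(\varepsilon)\, d\varepsilon + \int_{q_a^* - k_1^a}^{\infty} S_a(\varepsilon + k_1^a)\, g(\varepsilon)\, d\varepsilon - C(k_1^a);
\end{aligned} \tag{A1}$$
$$\begin{aligned}
&\bar{u}_b \int_{-\infty}^{q_a^* - k_1^a} g(\varepsilon)\, d\varepsilon + \int_{q_a^* - k_1^a}^{\infty} S_a(\varepsilon + k_1^a)\, g(\varepsilon)\, d\varepsilon - C(k_1^a) \\
&\quad < \bar{u}_b \int_{-\infty}^{q_b^* - k_1^b} g(\varepsilon)\, d\varepsilon + \int_{q_b^* - k_1^b}^{\infty} S_b(\varepsilon + k_1^b)\, g(\varepsilon)\, d\varepsilon - C(k_1^b).
\end{aligned} \tag{A2}$$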
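Now, (A1) implies
$$\int_{q_b^* - k_1^b}^{q_a^* - k_1^a} \big[\bar{u}_a - S_b(\varepsilon + k_1^b)\big] g(\varepsilon)\, d\varepsilon + \int_{q_a^* - k_1^a}^{\infty} \big[S_a(\varepsilon + k_1^a) - S_b(\varepsilon + k_1^b)\big] g(\varepsilon)\, d\varepsilon - C(k_1^a) + C(k_1^b) > 0$$
$$\Longrightarrow \quad \int_{q_b^* - k_1^b}^{q_a^* - k_1^a} \big[S_b(\varepsilon + k_1^b) - \bar{u}_a\big] g(\varepsilon)\, d\varepsilon + \int_{q_a^* - k_1^a}^{\infty} \big[S_b(\varepsilon + k_1^b) - S_a(\varepsilon + k_1^a)\big] g(\varepsilon)\, d\varepsilon + C(k_1^a) - C(k_1^b) < 0.$$
And, if $q_b^* - k_1^b \le q_a^* - k_1^a$, then it also holds that the left-hand side is negative when
$\bar{u}_a$ is replaced by $\bar{u}_b$, since $\bar{u}_a < \bar{u}_b$. So
$$\int_{q_b^* - k_1^b}^{q_a^* - k_1^a} \big[S_b(\varepsilon + k_1^b) - \bar{u}_b\big] g(\varepsilon)\, d\varepsilon + \int_{q_a^* - k_1^a}^{\infty} \big[S_b(\varepsilon + k_1^b) - S_a(\varepsilon + k_1^a)\big] g(\varepsilon)\, d\varepsilon + C(k_1^a) - C(k_1^b) < 0.$$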
But a similar calculation subtracting the left-hand side of (A2) from the right shows
that the term above is positive. Contradiction. Thus, qb∗ − k1b > qa∗ − k1a .
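By the lemma, if $\bar{u}_a < \bar{u}_b$, then
$$\frac{(q_b^* - k_1^b) - (q_a^* - k_1^a)}{\bar{u}_b - \bar{u}_a} > 0.$$
Taking the limits gives
$$\frac{\partial (q^* - k_1)}{\partial \bar{u}_2} \equiv \lim_{\bar{u}_b \to \bar{u}_a} \frac{(q_b^* - k_1^b) - (q_a^* - k_1^a)}{\bar{u}_b - \bar{u}_a} \ge 0.$$
Now $P_1 = 1 - G(q^* - k_1)$, so
$$\frac{\partial P_1}{\partial \bar{u}_2} = -g(q^* - k_1)\, \frac{\partial (q^* - k_1)}{\partial \bar{u}_2} < 0.$$
Since $C'(k_1)/C'(k_2) = P_1$, this means
$$\frac{\partial \big(C'(k_1)/C'(k_2)\big)}{\partial \bar{u}_2} = \frac{\partial P_1}{\partial \bar{u}_2} < 0.$$
For $C(k) = \lambda k^{\gamma}$, $\dfrac{C'(k_1)}{C'(k_2)} = \left(\dfrac{k_1}{k_2}\right)^{\gamma - 1}$, so
$$\frac{\partial P_1}{\partial \bar{u}_2} = (\gamma - 1) \left(\frac{k_1}{k_2}\right)^{\gamma - 2} \frac{\partial (k_1/k_2)}{\partial \bar{u}_2} < 0 \quad \Longrightarrow \quad \frac{\partial (k_1/k_2)}{\partial \bar{u}_2} < 0,$$
since $k_t > 0$, $\gamma > 1$.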
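The last step of this proof rests on the identity $(k_1/k_2)^{\gamma - 1} = P_1$ for the power cost function, which can be inverted to read the investment ratio directly off the probability of clearing the milestone. The sketch below simply evaluates that inversion; the function name `investment_ratio` and the sample values are illustrative assumptions, and the probabilities are inputs rather than solutions of the planner's problem.

```python
def investment_ratio(p1, gamma):
    """Early-to-late ratio k1/k2 implied by C'(k1)/C'(k2) = P1 with C(k) = lam * k**gamma.

    Since C'(k) is proportional to k**(gamma - 1), the ratio solves
    (k1/k2)**(gamma - 1) = P1.
    """
    if not 0.0 < p1 <= 1.0:
        raise ValueError("p1 must be a probability in (0, 1]")
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 so that marginal cost is increasing")
    return p1 ** (1.0 / (gamma - 1.0))


if __name__ == "__main__":
    # Quadratic cost: the ratio equals the pass probability itself.
    print(investment_ratio(0.5, 2.0))
    # A lower pass probability (e.g. after better outside options) lowers the ratio.
    print(investment_ratio(0.4, 2.0), investment_ratio(0.6, 2.0))
    # Cubic cost: the ratio is the square root of the pass probability.
    print(investment_ratio(0.25, 3.0))
```

Because the map from $P_1$ to $k_1/k_2$ is increasing for any $\gamma > 1$, anything that lowers $P_1$, such as a higher $\bar{u}_2$, also lowers the ratio, which is the content of Proposition 2.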
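Proof of Proposition 3
Let g be uniform over $[-\beta, \beta]$ for $\beta > 0$, and take the cost function to be quadratic,
so $C(x) = \lambda x^2/2$.
Then,
$$S(q_1, k_2) = V \int_{\bar{q} - q_1}^{\infty} g(q_2 - k_2)\, dq_2 - \frac{\lambda}{2} k_2^2;$$
$$S'(q_1) = V g(\bar{q} - q_1 - k_2);$$
$$\frac{\partial S(q_1)}{\partial k_2} = S'(q_1) - \lambda k_2.$$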
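This gives first order conditions
$$S(q^*) = \bar{u}_2 \iff V \int_{\bar{q} - q^*}^{\infty} g(q_2 - k_2)\, dq_2 - \frac{\lambda}{2} k_2^2 = \bar{u}_2.$$
$$\int_{q^*}^{\infty} \frac{\partial S(q_1)}{\partial k_2}\, g(q_1 - k_1)\, dq_1 = 0 \iff V \int_{q^*}^{\infty} g(\bar{q} - q_1 - k_2)\, g(q_1 - k_1)\, dq_1 = \lambda k_2 \int_{q^*}^{\infty} g(q_1 - k_1)\, dq_1.$$
Because $g'$ is not defined, use the equivalent formulation
$$C'(k_1) = \int_{q^*}^{\infty} S'(q_1)\, g(q_1 - k_1)\, dq_1 \iff \lambda k_1 = V \int_{q^*}^{\infty} g(\bar{q} - q_1 - k_2)\, g(q_1 - k_1)\, dq_1.$$
Take candidate $\tilde{q}^*$, $\tilde{k}_1$, $\tilde{k}_2$ values as:
$$\tilde{q}^* = \frac{8\beta \bar{u}_2 - 4\beta V + 4\bar{q} V - \frac{V^2}{\beta\lambda}}{4V}; \tag{A3}$$
$$\tilde{k}_1 = \frac{-8\beta \bar{u}_2 + 8\beta V - 4\bar{q} V + \frac{V^2}{\beta\lambda}}{4(4\beta^2\lambda - V)}; \tag{A4}$$
$$\tilde{k}_2 = \frac{V}{2\beta\lambda}. \tag{A5}$$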
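We claim that for β large enough, these values satisfy the three FOCs. For sequences
$x(n)$, $y(n)$, say $x(n)$ is asymptotically equal to $y(n)$ (i.e. $x(n) \sim y(n)$) if
$$\lim_{n \to \infty} \frac{x(n)}{y(n)} = 1.$$
Observe first that as $\beta \to \infty$,
$$\lim_{\beta \to \infty} \tilde{k}_1 = 0, \qquad \lim_{\beta \to \infty} \tilde{k}_2 = 0, \qquad \tilde{q}^* \sim \left(\frac{2\bar{u}_2}{V} - 1\right)\beta.$$
Because $\bar{u}_2 < V$, $\frac{2\bar{u}_2}{V} - 1 \in (-1, 1)$. Observe that
$$\tilde{k}_1 - \beta < \tilde{q}^* \text{ for large enough } \beta. \tag{A6}$$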
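This holds because $\tilde{k}_1 - \beta \sim -\beta$, but $\tilde{q}^* \sim \alpha\beta$ for $\alpha \in (-1, 1)$. And
$$\tilde{k}_1 < \bar{q} - \tilde{k}_2 \text{ for large enough } \beta, \tag{A7}$$
since $\bar{q} > 0$, and $\tilde{k}_1 \to 0$, $\tilde{k}_2 \to 0$. Moreover,
$$\tilde{q}^* > \bar{q} - \tilde{k}_2 - \beta \text{ for large enough } \beta, \tag{A8}$$
since $\tilde{q}^* \sim \alpha\beta$ for $\alpha > -1$, and $\bar{q} - \tilde{k}_2 - \beta \sim -\beta$. Finally,
$$\bar{q} - \tilde{q}^* > \tilde{k}_2 - \beta \text{ for large enough } \beta, \tag{A9}$$
since $\bar{q} - \tilde{q}^* \sim (-\alpha)\beta$ for $\alpha < 1$, and $\tilde{k}_2 - \beta \sim -\beta$.
Now observe that for sufficiently large β
$$\begin{aligned}
V \int_{\bar{q} - \tilde{q}^*}^{\infty} g(q_2 - \tilde{k}_2)\, dq_2 - \frac{\lambda}{2}\tilde{k}_2^2
&= \frac{V}{2\beta}\big(\tilde{k}_2 + \beta - \max(\tilde{k}_2 - \beta,\ \bar{q} - \tilde{q}^*)\big) - \frac{\lambda}{2}\tilde{k}_2^2 \\
&= \frac{V}{2\beta}\big(\tilde{k}_2 + \beta - \bar{q} + \tilde{q}^*\big) - \frac{\lambda}{2}\tilde{k}_2^2 \quad \text{by (A9)} \\
&= \bar{u}_2,
\end{aligned}$$
where the last equality follows from plugging in $\tilde{k}_2$, $\tilde{q}^*$.
Claim 1. For large enough β,
$$V \int_{\tilde{q}^*}^{\infty} g(\bar{q} - q_1 - \tilde{k}_2)\, g(q_1 - \tilde{k}_1)\, dq_1 = \lambda \tilde{k}_2 \int_{\tilde{q}^*}^{\infty} g(q_1 - \tilde{k}_1)\, dq_1.$$
Proof: This holds iff
$$\frac{V}{4\beta^2}\Big[\min(\bar{q} - \tilde{k}_2 + \beta,\ \beta + \tilde{k}_1) - \max(\bar{q} - \tilde{k}_2 - \beta,\ -\beta + \tilde{k}_1,\ \tilde{q}^*)\Big] = \frac{\lambda \tilde{k}_2}{2\beta}\Big[\tilde{k}_1 + \beta - \max(\tilde{q}^*,\ \tilde{k}_1 - \beta)\Big],$$
which, by (A6), (A7), and (A8) holds iff
$$\frac{V}{4\beta^2}(\tilde{k}_1 + \beta - \tilde{q}^*) = \frac{\lambda \tilde{k}_2}{2\beta}(\tilde{k}_1 + \beta - \tilde{q}^*),$$
which holds iff $\dfrac{V}{4\beta^2} = \dfrac{\lambda \tilde{k}_2}{2\beta}$, which holds iff $\tilde{k}_2 = \dfrac{V}{2\beta\lambda}$.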
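Claim 2. For large enough β,
$$\lambda \tilde{k}_1 = V \int_{\tilde{q}^*}^{\infty} g(\bar{q} - q_1 - \tilde{k}_2)\, g(q_1 - \tilde{k}_1)\, dq_1.$$
Proof: As before in Claim 1, this holds iff
$$\frac{V}{4\beta^2}(\tilde{k}_1 + \beta - \tilde{q}^*) = \lambda \tilde{k}_1.$$
Plugging in $\tilde{k}_1$ and $\tilde{q}^*$ confirms that this holds.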
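The two claims and the milestone condition can be spot-checked numerically. The sketch below evaluates the candidate values (A3), (A4), and (A5) at one illustrative parameter point and computes the residual of each first order condition, using the fact that integrals of uniform densities reduce to lengths of overlapping intervals. The parameter values ($V = 10$, $\bar{u}_2 = 3$, $\bar{q} = 3$, $\beta = 5$, $\lambda = 1$) and the function names are assumptions chosen so that (A6) through (A9) hold; they are not taken from the paper.

```python
def candidates(V, u2, q_bar, beta, lam):
    """Candidate (q*, k1, k2) from (A3), (A4), (A5)."""
    q_star = (8 * beta * u2 - 4 * beta * V + 4 * q_bar * V - V ** 2 / (beta * lam)) / (4 * V)
    k1 = (-8 * beta * u2 + 8 * beta * V - 4 * q_bar * V + V ** 2 / (beta * lam)) / (
        4 * (4 * beta ** 2 * lam - V)
    )
    k2 = V / (2 * beta * lam)
    return q_star, k1, k2


def interval_length(lo, hi):
    """Length of [lo, hi], or zero if the interval is empty."""
    return max(0.0, hi - lo)


def foc_residuals(V, u2, q_bar, beta, lam):
    """Residuals of the three FOCs with g uniform on [-beta, beta]."""
    q_star, k1, k2 = candidates(V, u2, q_bar, beta, lam)
    g = 1.0 / (2.0 * beta)  # height of the uniform density

    # Milestone condition: V * Pr(q2 > q_bar - q*) - (lam/2) k2^2 = u2,
    # where q2 = k2 + eps2 is uniform on [k2 - beta, k2 + beta].
    p_final = g * interval_length(max(q_bar - q_star, k2 - beta), k2 + beta)
    r_q = V * p_final - 0.5 * lam * k2 ** 2 - u2

    # Integral of g(q_bar - q1 - k2) g(q1 - k1) over q1 >= q*:
    # g*g times the length of the region where both densities are positive.
    lo = max(q_star, k1 - beta, q_bar - k2 - beta)
    hi = min(k1 + beta, q_bar - k2 + beta)
    joint = g * g * interval_length(lo, hi)

    # Probability of clearing the milestone, Pr(q1 > q*).
    p1 = g * interval_length(max(q_star, k1 - beta), k1 + beta)

    r_k2 = V * joint - lam * k2 * p1   # Claim 1
    r_k1 = lam * k1 - V * joint        # Claim 2
    return r_q, r_k2, r_k1


if __name__ == "__main__":
    point = dict(V=10.0, u2=3.0, q_bar=3.0, beta=5.0, lam=1.0)
    print(candidates(**point))
    print(foc_residuals(**point))
```

At this parameter point the candidates are $\tilde{q}^* = 0.5$, $\tilde{k}_1 = 0.5$, and $\tilde{k}_2 = 1$, all three residuals vanish up to rounding, and $\tilde{k}_1 < \tilde{k}_2$ as Theorem 1 requires.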
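So, for large enough $\beta$, $\tilde k_1$, $\tilde k_2$, $\tilde q^*$ satisfy the three FOCs. Observe that $\tilde k_2 \sim \gamma/\beta$, $\tilde k_1 \sim \delta/\beta$, and $\tilde k_1/\tilde k_2$ approaches $(V - \bar u_2)/V$. In fact,
\[
\frac{\tilde k_1}{\tilde k_2} = \frac{\beta\lambda\left(8\beta(V - \bar u_2) - 4\bar q V + \frac{V^2}{\beta\lambda}\right)}{2V(4\beta^2\lambda - V)};
\]
\[
\frac{\partial (k_1/k_2)}{\partial \beta} = \frac{2\lambda\left(4\bar u_2\beta - 6V\beta + \bar q(V + 4\lambda\beta^2)\right)}{(V - 4\lambda\beta^2)^2} > 0.
\]
As $\beta \to \infty$, this goes as $J/\beta^2$, where $J$ is positive. Eventually, increasing the width of the support of $\epsilon_t$ makes $k_1/k_2$ increase to some asymptote $1 - \bar u_2/V < 1$.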
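A quick numerical illustration of this limit, under hypothetical parameter values (V = 1, ū2 = 1/4, q̄ = 1, λ = 1) that are not taken from the paper: the sketch evaluates the closed form for k̃1/k̃2 at increasing β and compares it with the asymptote 1 − ū2/V.

```python
from fractions import Fraction as F

def investment_ratio(beta, V, u2, qbar, lam):
    """k1/k2 at the candidate values, using the closed form stated in the text."""
    num = beta * lam * (8*beta*(V - u2) - 4*qbar*V + V**2/(beta*lam))
    den = 2 * V * (4*beta**2*lam - V)
    return num / den

# Hypothetical parameter values chosen only for illustration.
V, u2, qbar, lam = F(1), F(1, 4), F(1), F(1)
betas = (10, 100, 1000)
ratios = [investment_ratio(F(b), V, u2, qbar, lam) for b in betas]
limit = 1 - u2 / V

for b, r in zip(betas, ratios):
    # Print the ratio and its remaining gap to the asymptote.
    print(b, float(r), float(limit - r))
```

With these values the ratio rises with β and stays below the asymptote, consistent with the sign of the derivative above.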
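The same argument from the proof of Proposition 3 shows that the candidate values $\tilde q^*$, $\tilde k_1$, and $\tilde k_2$ given by (A3), (A4), (A5) satisfy the first-order conditions. Straightforward computations show that
\[
\frac{\partial q^*}{\partial V} < 0 \quad \text{and} \quad \frac{\partial k_2^*}{\partial V} > 0.
\]
Furthermore, for large $\beta$,
\[
\frac{\partial k_1}{\partial V} \sim \frac{1}{\beta\lambda} > 0 \quad \text{and} \quad \frac{\partial (k_1/k_2)}{\partial V} \to 0.
\]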
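Straightforward computations on the candidate values $\tilde q^*$, $\tilde k_1$, and $\tilde k_2$ defined in (A3), (A4), and (A5) in the proof of Proposition 3 show that
\[
\frac{\partial k_2}{\partial \bar q} = 0, \qquad
\frac{\partial q^*}{\partial \bar q} = 1, \qquad \text{and} \qquad
\frac{\partial k_1}{\partial \bar q} = -\frac{V}{4\beta^2\lambda - V}
\]
for sufficiently large $\beta$.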
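Straightforward calculations on the candidate values $\tilde q^*$, $\tilde k_1$, and $\tilde k_2$ from (A3), (A4), and (A5) from the proof of Proposition 3 show that
\[
\frac{\partial q^*}{\partial \lambda} = \frac{V^2}{\beta\lambda^2} > 0 \quad \text{and} \quad
\frac{\partial k_2}{\partial \lambda} = \frac{-V}{2\beta\lambda^2} < 0.
\]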
Lemma 2 Let (MLRP) hold. If $z(a)$ is strictly increasing, then
\[
\int_A z(a)\, f_{q_1}(a \mid q_1)\, da > 0.
\]
Proof: Let $f(\cdot \mid \cdot)$ denote the conditional density of $a$, given $q_1$, and let $\varphi(\cdot \mid \cdot)$ be the conditional density of $q_1$, given $a$. By definition, the likelihood ratio is given by
\[
L(a) \equiv \frac{f_{q_1}(a \mid q_1)}{f(a \mid q_1)}.
\]
By Bayes' Rule,
\[
f(a \mid q_1) = \frac{\varphi(q_1 \mid a) f(a)}{\int_A \varphi(q_1 \mid a') f(a')\, da'}.
\]
Now
\[
f_{q_1}(a \mid q_1) = \frac{\varphi'(q_1 \mid a) f(a) \int_A \varphi(q_1 \mid a') f(a')\, da' - \varphi(q_1 \mid a) f(a) \int_A \varphi'(q_1 \mid a') f(a')\, da'}{\left(\int_A \varphi(q_1 \mid a') f(a')\, da'\right)^2}.
\]
Integrating over $A$ gives
\[
\int_A f_{q_1}(a \mid q_1)\, da = 0. \tag{A10}
\]
So $f_{q_1}$ assumes positive and negative values, and therefore so does $L(a)$. Let $\underline a$, $\bar a$ be the lower and upper limits of $A$, respectively (they may be $\pm\infty$). By (MLRP), $L(a)$ is increasing, so there exists an $a^*$ such that $\{a : L(a) < 0\} = (\underline a, a^*)$ and $\{a : L(a) > 0\} = (a^*, \bar a)$. By definition of $L(a)$, $f_{q_1} > 0$ if and only if $L(a) > 0$. Hence $\{a : f_{q_1}(a \mid q_1) > 0\} = (a^*, \bar a)$ and $\{a : f_{q_1}(a \mid q_1) < 0\} = (\underline a, a^*)$. Rewrite (A10) as
\[
\int_{a^*}^{\bar a} f_{q_1}(a \mid q_1)\, da = -\int_{\underline a}^{a^*} f_{q_1}(a \mid q_1)\, da. \tag{A11}
\]
Now $f_{q_1}(a \mid q_1) < 0$ if $a < a^*$, so $|f_{q_1}(a \mid q_1)| = -f_{q_1}(a \mid q_1)$ for $a < a^*$. Integrate both sides over $(\underline a, a^*)$ and combine with (A11) to get
\[
\int_{a^*}^{\bar a} f_{q_1}(a \mid q_1)\, da = \int_{\underline a}^{a^*} |f_{q_1}|(a \mid q_1)\, da. \tag{A12}
\]
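Now $z(a)$ is strictly increasing, so
\[
\int_{\underline a}^{a^*} z(a)\, |f_{q_1}|(a \mid q_1)\, da
< \int_{\underline a}^{a^*} z(a^*)\, |f_{q_1}|(a \mid q_1)\, da
= \int_{a^*}^{\bar a} z(a^*)\, f_{q_1}(a \mid q_1)\, da
< \int_{a^*}^{\bar a} z(a)\, f_{q_1}(a \mid q_1)\, da .
\]
Taking the left hand side over to the right gives
\[
\begin{aligned}
\int_A z(a)\, f_{q_1}(a \mid q_1)\, da
&\equiv \int_{a^*}^{\bar a} z(a)\, f_{q_1}(a \mid q_1)\, da + \int_{\underline a}^{a^*} z(a)\, f_{q_1}(a \mid q_1)\, da \\
&= \int_{a^*}^{\bar a} z(a)\, f_{q_1}(a \mid q_1)\, da - \int_{\underline a}^{a^*} z(a)\, |f_{q_1}|(a \mid q_1)\, da > 0.
\end{aligned}
\]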
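The lemma can be illustrated numerically in a simple special case. The sketch below assumes, purely for illustration, a normal prior a ~ N(0, 1) and normal noise ε1 ~ N(0, 1) with q1 = a + ε1, so that a | q1 ~ N(q1/2, 1/2) and the likelihood ratio L(a) = a − q1/2 is increasing in a. The choice z(a) = tanh(a), the value q1 = 0.7, and all function names are hypothetical, not from the paper.

```python
import math

def posterior_density(a, q1):
    """f(a | q1) when a ~ N(0, 1) and q1 = a + eps, eps ~ N(0, 1): a | q1 ~ N(q1/2, 1/2)."""
    return math.exp(-(a - q1 / 2) ** 2) / math.sqrt(math.pi)

def posterior_density_dq1(a, q1):
    """Partial derivative of f(a | q1) with respect to q1."""
    return (a - q1 / 2) * posterior_density(a, q1)

def integrate(fn, lo=-12.0, hi=12.0, n=20000):
    """Composite trapezoid rule on [lo, hi]; the tails beyond are negligible here."""
    h = (hi - lo) / n
    total = 0.5 * (fn(lo) + fn(hi))
    for i in range(1, n):
        total += fn(lo + i * h)
    return total * h

q1 = 0.7  # hypothetical first-period output

# (A10): the derivative of a density integrates to zero.
mass = integrate(lambda a: posterior_density_dq1(a, q1))

# Lemma 2: weighting by a strictly increasing z(a) gives a positive integral.
weighted = integrate(lambda a: math.tanh(a) * posterior_density_dq1(a, q1))

print(mass, weighted)
```

In this special case the sign change of the derivative occurs at a* = q1/2, which plays the role of the cutoff a* in the proof.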
Let $X \equiv \{q_1 : S(q_1) > \bar u_2\}$ be the continuation set and $P_1 = \Pr(X)$. That is, the entrepreneur advances to the second stage if $q_1 \in X$. Then the entrepreneur solves
\[
\max_{k_t,\, q^*} \; E_a \int_X S(q_1)\, g(q_1 - a - k_1)\, dq_1 + \bar u_2 (1 - P_1) - C(k_1),
\]
where
\[
S(q_1) = V \int_A G(k_2 + a + q_1 - \bar q)\, f(a \mid q_1 - k_1)\, da - C(k_2).
\]
Taking the derivative with respect to $q_1$ and substituting in the previous expression for $C'(k_2)$ gives $S'(q_1) = W + Y$, where
\[
W \equiv V \int_A g(k_2 + a + q_1 - \bar q)\, f(a \mid q_1 - k_1)\, da;
\]
\[
Y \equiv V \int_A G(k_2 + a + q_1 - \bar q)\, f_{q_1}(a \mid q_1 - k_1)\, da.
\]
Since $V > 0$, $W$ is positive. Now the function
\[
z(a) \equiv V\, G(k_2 + a + q_1 - \bar q)
\]
increases in $a$. By Lemma 2, this means
\[
Y = \int_{-\infty}^{\infty} z(a)\, f_{q_1}(a \mid q_1 - k_1)\, da > 0.
\]
Thus, $S'(q_1) > 0$. Clearly $S$ is continuous. By the intermediate value theorem, there exists a $q^*$ such that $S(q^*) = \bar u_2$. Hence $X = \{q_1 : q_1 > q^*\}$.
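Let us define (for $t = 1, 2$)
\[
\xi_t \equiv a + \epsilon_t, \qquad \Psi_t(x) \equiv \Pr(\xi_t \le x), \qquad \text{and} \qquad \psi_t(x) \equiv \Psi_t'(x).
\]
Let
\[
P_1 \equiv \Pr(q_1 \ge \underline q \mid \underline q, k_1) = 1 - \Psi_1(\underline q - k_1);
\]
\[
P_2 \equiv \Pr(q_1 + q_2 \ge \bar q \mid (q_1 \ge \underline q), k_t, \underline q)
= \frac{1}{P_1} \int_{\underline q - k_1}^{\infty} \left[1 - \Psi_2(\bar q - k_1 - k_2 - \xi_1)\right] \psi_1(\xi_1)\, d\xi_1;
\]
\[
P \equiv P_1 P_2 .
\]
Then the problem is
\[
\max_{k_t,\, \underline q} \; P V - C(k_1) + (1 - P_1)\bar u_2 - P_1 C(k_2),
\]
with FOCs
\[
-V\left[1 - \Psi_2(\bar q - k_2 - \underline q)\right]\psi_1(\underline q - k_1) + \psi_1(\underline q - k_1)\bar u_2 + \psi_1(\underline q - k_1) C(k_2) = 0; \tag{A13}
\]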
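\[
V \frac{\partial P}{\partial k_1} - C'(k_1) - \psi_1(\underline q - k_1)\bar u_2 - \psi_1(\underline q - k_1) C(k_2) = 0; \tag{A14}
\]
\[
V \frac{\partial P}{\partial k_2} - P_1 C'(k_2) = 0. \tag{A15}
\]
Summing up (A13) and (A14) and integrating by parts gives
\[
V \int_{\underline q - k_1}^{\infty} \psi_2(\bar q - k_1 - k_2 - \xi_1)\, \psi_1(\xi_1)\, d\xi_1 = C'(k_1). \tag{A16}
\]
Now
\[
\frac{\partial P}{\partial k_1} = \frac{\partial P}{\partial k_2} + \left(1 - \Psi_2(\bar q - k_2 - \underline q)\right)\psi_1(\underline q - k_1); \tag{A17}
\]
\[
\frac{\partial P}{\partial k_2} = \int_{\underline q - k_1}^{\infty} \psi_2(\bar q - k_1 - k_2 - \xi_1)\, \psi_1(\xi_1)\, d\xi_1. \tag{A18}
\]
Hence, by (A15) and (A16), $P_1 C'(k_2) = C'(k_1)$. Since $P_1 < 1$, this gives $C'(k_1) < C'(k_2)$. So if $C'' > 0$, $k_1 < k_2$.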
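As a worked illustration, assume the quadratic cost $C(k) = \frac{\lambda}{2}k^2$ used in the uniform example (an assumption for this sketch; the argument above needs only $C'' > 0$). The relation between marginal costs then pins down the ratio of early to late investment directly:

```latex
\[
C'(k) = \lambda k
\;\Longrightarrow\;
P_1 \lambda k_2 = \lambda k_1
\;\Longrightarrow\;
\frac{k_1}{k_2} = P_1 .
\]
```

Under this assumption the early-to-late investment ratio equals the probability of reaching the second stage, which is below one whenever the project is abandoned with positive probability.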
References
Anat Admati and Paul Pfleiderer. Robust financial contracting and the role of venture
capitalists. Journal of Finance, 49(2):371–402, Jun 1994.
Dirk Bergemann and Ulrich Hege. Venture capital financing, moral hazard, and learning.
Journal of Banking and Finance, 22(3):703–735, 1998.
Dirk Bergemann and Ulrich Hege. The value of benchmarking. In J. McCahery and
L. Renneboog, editors, Venture Capital Contracting and the Valuation of High Tech
Firms, pages 83–107. Oxford University Press, 2003.
Dirk Bergemann and Ulrich Hege. The financing of innovation: Learning and stopping.
The RAND Journal of Economics, 36(4):719–752, 2005.
Carsten Bienz and Julia Hirsch. The dynamics of venture capital contracts. EFA 2008
Athens Meetings Paper, Feb 2009.
Francesca Cornelli and Oved Yosha. Stage financing and the role of convertible securities.
Review of Economic Studies, 70(1):1–32, Jan 2003.
Charles Cuny and Eli Talmor. The staging of venture capital financing: Milestone vs.
rounds. EFA 2005 Moscow Meetings Paper, Apr 2005.
Yahel Giat, Steve Hackman, and Ajay Subramanian. Venture capital investment under
uncertainty and asymmetric beliefs: A continuous-time, stochastic principal-agent
model. Working paper, March 2009.
Paul A. Gompers. Optimal investment, monitoring, and the staging of venture capital.
The Journal of Finance, 50(5):1461–1489, Dec 1995.
Paul A. Gompers and Josh Lerner. The Venture Capital Cycle. The MIT Press, Sep
1999. ISBN 0262071940.
Yaowen Hsu. Staging of venture capital investment: A real options analysis. EFMA
2002 London Meetings, 2002.
Philip Krohmer and Rainer Lauterbach. Private equity post-investment phases - the
bright and dark side of staging. Working paper, J.W. Goethe-Universität Frankfurt,
Frankfurt, Germany, Aug 2005.
Augustin Landier. Start-up financing: From banks to venture capital. Unpublished
manuscript, University of Chicago Graduate School of Business, 2002.
Edward P. Lazear and Sherwin Rosen. Rank-order tournaments as optimum labor con-
tracts. The Journal of Political Economy, 89(5):841–864, Oct 1981.
Darwin V. Neher. Staged financing: An agency perspective. The Review of Economic
Studies, 66(2):255–274, Apr 1999.
Stephen Prowse. The economics of the private equity market. Economic Review, (3):
21–35, 1998.
William Sahlman. The structure and governance of venture-capital organizations.
Journal of Financial Economics, 27(2):473–521, Oct 1990.
Morten Sorensen. How smart is smart money? A two-sided matching model of venture
capital. The Journal of Finance, 62(6):2725–2762, 2007.
Susheng Wang and Hailan Zhou. Staged financing in venture capital: Moral hazard and
risks. Journal of Corporate Finance, 10(1):131–155, Jan 2004.
Vijay Yerramilli. Staging of investments: Flexibility vs incentives. Working paper, 2006.