The Price Impact of Generalized Order Flow Imbalance: Yuhan Su, Zeyu Sun, Jiarong Li, Xianghui Yuan
Abstract
Order flow imbalance can explain short-term changes in stock prices. This paper accounts for quote changes larger than the minimum
quotation unit in real transactions, and proposes a generalized order flow imbalance construction method that improves Order Flow
Imbalance (OFI) and Stationarized Order Flow Imbalance (log-OFI). Using high-frequency order book snapshot data, we conduct an
empirical analysis of the CSI 500 constituent stocks; for ease of presentation, we select 10 stocks for comparison. Both indicators, once
improved by the generalized construction method, show a better ability to explain changes in stock prices. In particular, for Generalized
Stationarized Order Flow Imbalance (log-GOFI) under a linear regression model, the average out-of-sample coefficient of determination
on time scales of 30 seconds, 1 minute, and 5 minutes rises from 32.89%, 38.13%, and 42.57% for OFI to 83.57%, 85.37%, and 86.01%,
respectively. In addition, we find that the explanatory power of log-GOFI is more stable on all three time scales.
Keywords: High-frequency Trading, Limit Order Book, Chinese Financial Market, Order Flow Imbalance
1. Introduction
The maturity of China's electronic trading systems and the availability of high-frequency data have made high-frequency trading
increasingly prominent in China's financial market. The availability of high-frequency trade and quote records has inspired a large
empirical and theoretical literature on the relationship between order flow, liquidity, and price movements in order-driven
markets [1]. A particularly important issue in practice is the impact of orders on prices: models of the optimal liquidation of a
position within a specific time frame, in which the trades themselves affect the stock price, laid a solid foundation for the
optimal execution problem with price impact [2-4]. Ma et al. [5] discussed the optimal portfolio execution problem with
stochastic price impact. With the rise of high-frequency trading, the order book observed at high frequency provides valuable
information for exploring changes in stock prices. High-frequency trading is a major innovation in financial markets and
accounts for a high proportion of stock trading in the United States and Europe [6]. O’Hara [7] pointed out that technology and
high-frequency trading have changed the market, and discussed at length the impact of these changes on market
microstructure. Jones [8] suggested that high-frequency trading does not destabilize the market, and can even improve overall
market quality and reduce transaction costs. Brogaard et al. [9] examined the role of high-frequency traders (HFTs) in price
discovery and price efficiency, and found that HFTs overall facilitate price efficiency. According to Berger et al. [10], the
presence of high-frequency trading programs significantly increases trading volume and the depth at the bid price, so its
impact on the market is benign and, from these two perspectives, arguably positive. Chinese stock exchanges provide snapshot
data of the order book at a frequency of 3 seconds. These snapshots of the limit order book disclose investors' information
about the short-term micro-state of the market and reveal the competitive behavior of different investors. A better
understanding of how the structure of the limit order book affects price formation therefore has both theoretical and practical
significance.
Cao et al. [11] proposed Depth Imbalance (QR) and Width Imbalance (HR) to describe the shape of the limit order book.
Cont et al. [12] proposed Order Flow Imbalance, and found that changes in the stock's mid-price can be explained by this
imbalance, with greater explanatory power than Trade Imbalance (VOL). The Volume Order Flow Imbalance (VOI) constructed by
Shen [13] is a feature that measures the incremental difference between the order volumes at the best bid and best ask prices
within a certain period of time, reflecting the supply and demand of investors at those prices. Xu et al. [14] proposed
Multi-level Order Flow Imbalance (MLOFI), a vector that measures the net flow of buy and sell orders at different price levels
in a limit order book, and further described how order-flow activity in the limit order book affects price formation.
Sirignano and Cont [15] use path
The limit order book is composed of timestamps, execution prices, and the order quantities corresponding to those execution
prices. It is an important tool for analyzing the behavior of market participants. Cont [1] focused in particular on models that
describe the limit order book as a queuing system. First, we simplify the way the order book works: assume that the number of
orders at each price level does not exceed D. When the order quantity at the best bid or best ask (the optimal buyer's or
seller's execution price) reaches D, the order book creates a new best bid or best ask. From then on, order placements,
cancellations, and transactions accumulate at the new price until the quantity again reaches D or zero, at which point the best
bid or best ask moves once more. Taking the arrival of new buy orders as an example, the process is illustrated as follows:
Figure 1: The buyer's new orders arrive and the price moves while maximum depth at each price level is equal to D.
As shown in the figure, the order book starts in state (a). When a new batch of buy limit orders arrives (b), the number of
orders at the best bid increases. If that number exceeds D, subsequent orders accumulate at a new price, as shown in panel (c).
Note that some sell orders rest at the new price level in the figure, so the buy orders arriving there are matched against them
according to the trading rules. In the end, buyers and sellers reach a new balance, which yields the new order book (d). This
is only one example of the assumptions of the model.
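Based on the above settings, Cont et al. proposed a linear model relating order flow imbalance to price changes [12].
Consider a time interval $[t_{k-1}, t_k]$. During this period, the number of orders arriving at the current best bid (the
optimal buyer's execution price) is recorded as $L^b_{t_{k-1},t_k}$, the number of cancelled orders as $C^b_{t_{k-1},t_k}$,
and the number of transactions matched against market orders from sellers as $M^s_{t_{k-1},t_k}$. The relationship between the
change in buy orders and the change in the best bid is:
$$\Delta P^b_{t_{k-1},t_k} = \delta \left\lceil \frac{L^b_{t_{k-1},t_k} - C^b_{t_{k-1},t_k} - M^s_{t_{k-1},t_k}}{D} \right\rceil$$
where δ represents the minimum price unit. In the same way, the relationship between the change in sell orders and the change
in the best ask (the optimal seller's execution price) can be obtained, which differs only in the direction of trading:
$$\Delta P^s_{t_{k-1},t_k} = -\delta \left\lceil \frac{L^s_{t_{k-1},t_k} - C^s_{t_{k-1},t_k} - M^b_{t_{k-1},t_k}}{D} \right\rceil$$
In the above formulas, $\Delta P^b_{t_{k-1},t_k} = P^b_{t_k} - P^b_{t_{k-1}}$ and $\Delta P^s_{t_{k-1},t_k} = P^s_{t_k} - P^s_{t_{k-1}}$,
where $P^b_t$ represents the best bid and $P^s_t$ represents the best ask. Define the mid-price as $P_t = (P^b_t + P^s_t)/2$,
and define Order Flow Imbalance in this period of time as
$$\mathrm{OFI}_{t_{k-1},t_k} = L^b_{t_{k-1},t_k} - C^b_{t_{k-1},t_k} - M^s_{t_{k-1},t_k} - L^s_{t_{k-1},t_k} + C^s_{t_{k-1},t_k} + M^b_{t_{k-1},t_k}$$
With $\varepsilon_{t_{k-1},t_k}$ representing the truncation error, we get:
$$\Delta P_{t_{k-1},t_k} = P_{t_k} - P_{t_{k-1}} = \frac{\delta}{2D}\,\mathrm{OFI}_{t_{k-1},t_k} + \varepsilon_{t_{k-1},t_k}$$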
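The stylized model can be checked numerically. The following is a minimal sketch, assuming the ceiling-rounded form of the model in Cont et al. [12], where the best quote moves by one tick for every D units of net order flow. The function names and the numbers (D = 1000 shares, tick 0.01) are hypothetical and chosen only for illustration.

```python
import math

def bid_change(limit_buys, cancels, market_sells, depth, tick):
    # Net flow into the best-bid queue, converted to whole ticks by the ceiling.
    return tick * math.ceil((limit_buys - cancels - market_sells) / depth)

def ask_change(limit_sells, cancels, market_buys, depth, tick):
    # Mirror image: net sell-side flow pushes the best ask down.
    return -tick * math.ceil((limit_sells - cancels - market_buys) / depth)

def order_flow_imbalance(lb, cb, ms, ls, cs, mb):
    # Buy-side net flow minus sell-side net flow over the interval.
    return (lb - cb - ms) - (ls - cs - mb)

def linear_mid_change(ofi, depth, tick):
    # Linear approximation of the mid-price change: tick * OFI / (2 * depth).
    return tick * ofi / (2 * depth)

# Hypothetical interval: assumed depth D = 1000 shares, tick = 0.01.
exact = 0.5 * (bid_change(2500, 300, 200, 1000, 0.01)
               + ask_change(400, 900, 500, 1000, 0.01))
approx = linear_mid_change(
    order_flow_imbalance(2500, 300, 200, 400, 900, 500), 1000, 0.01)
print(exact, approx)
```

In this example the net flows are exact multiples of D, so the exact mid-price change and the linear approximation agree (both are about 0.015). When the net flow is not a multiple of D, the ceiling introduces the truncation error.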
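The model shows that there is a linear relationship between the change in the mid-price and Order Flow Imbalance. To test
this relationship empirically, Cont et al. gave a measure of order flow imbalance. Still considering the time interval
$[t_{k-1}, t_k]$, divide this period of time into N small observation intervals [12], and define:
$$\mathrm{OFI}_{t_{k-1},t_k} = \sum_{n=1}^{N} e_n, \qquad e_n = \Delta W_n - \Delta V_n$$
$$\Delta W_n = \begin{cases} q^b_n, & P^b_n > P^b_{n-1} \\ q^b_n - q^b_{n-1}, & P^b_n = P^b_{n-1} \\ -q^b_{n-1}, & P^b_n < P^b_{n-1} \end{cases}
\qquad
\Delta V_n = \begin{cases} q^s_n, & P^s_n < P^s_{n-1} \\ q^s_n - q^s_{n-1}, & P^s_n = P^s_{n-1} \\ -q^s_{n-1}, & P^s_n > P^s_{n-1} \end{cases}$$
Among them, $q^b_n$ and $q^b_{n-1}$ represent the order quantity at the best bid $P^b_n$ and $P^b_{n-1}$ at time $n$ and time
$n-1$, respectively, and $q^s_n$ and $q^s_{n-1}$ represent the order quantity at the best ask $P^s_n$ and $P^s_{n-1}$ at time
$n$ and time $n-1$. $\Delta W_n$ represents the increase in buyer power in the nth observation interval, and $\Delta V_n$
represents the increase in seller power in the nth observation interval.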
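The empirical measure can be computed directly from consecutive best-quote snapshots. The following is a minimal sketch of the per-interval contributions in the spirit of Cont et al. [12]. The array names, the function name, and the sample quotes are assumptions made for illustration. In practice prices are better stored as integer ticks so that the equality test is exact.

```python
import numpy as np

def ofi_increments(bid_px, bid_qty, ask_px, ask_qty):
    """Per-interval OFI contributions from best-quote snapshots (oldest first)."""
    pb, qb = np.asarray(bid_px, float), np.asarray(bid_qty, float)
    pa, qa = np.asarray(ask_px, float), np.asarray(ask_qty, float)
    # Buyer side: new queue if the bid rose, queue change if flat, lost queue if it fell.
    d_w = np.where(pb[1:] > pb[:-1], qb[1:],
                   np.where(pb[1:] == pb[:-1], qb[1:] - qb[:-1], -qb[:-1]))
    # Seller side: the same logic with the price direction reversed.
    d_v = np.where(pa[1:] < pa[:-1], qa[1:],
                   np.where(pa[1:] == pa[:-1], qa[1:] - qa[:-1], -qa[:-1]))
    return d_w - d_v

# Four hypothetical snapshots, i.e. three observation intervals.
e = ofi_increments(bid_px=[10.00, 10.00, 10.01, 10.00],
                   bid_qty=[100, 150, 80, 120],
                   ask_px=[10.02, 10.02, 10.02, 10.01],
                   ask_qty=[200, 180, 180, 90])
print(e, e.sum())
```

Summing the contributions over the N observation intervals of a window gives the OFI for that window. Here the three contributions are 70, 80 and -170, so the window OFI is -20.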
Note that the linear model imposes a limit on the number of orders at each price level, but in reality no such limit exists.
The number of orders differs greatly across levels and is highly volatile, which to some extent makes the model inaccurate.
Wang et al. [16] applied a logarithmic transformation to the classic order flow imbalance indicator to make the data more
stable. This scheme effectively reduces the bias that the threshold hypothesis (that the maximum depth at each price level is
equal to D) introduces into the empirical analysis. Based on the symbols mentioned above, the stationarized order flow
imbalance (log-OFI) takes the logarithm of each q in the original OFI.
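As a hedged illustration of the logarithmic correction, the buyer-side increment with each quantity replaced by its logarithm can be written as below. The symbols $\Delta W_n^{\log}$, $q^b_n$ and $P^b_n$ are notation assumed here for the buyer-side increment, the best-bid quantity and the best-bid price. They are not taken from [16].

```latex
\Delta W_n^{\log} =
\begin{cases}
\ln q^b_n, & P^b_n > P^b_{n-1} \\
\ln q^b_n - \ln q^b_{n-1}, & P^b_n = P^b_{n-1} \\
-\ln q^b_{n-1}, & P^b_n < P^b_{n-1}
\end{cases}
% Worked example with the best bid unchanged:
%   q: 100   -> 150     raw increment 50,    log increment ln(1.5) ~ 0.405
%   q: 10000 -> 15000   raw increment 5000,  log increment ln(1.5) ~ 0.405
```

Under this reading, the same proportional change in depth produces the same increment whatever the absolute queue size, which is one way to see why the logarithm dampens the volatility of the raw quantities.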
Because the real market has a minimum observation interval (for example, the order book snapshot data provided by the
Chinese stock market arrive every three seconds), the number N of observation intervals cannot be chosen arbitrarily. As a
result, the best price often moves discontinuously, that is, by more than one δ between consecutive observations. Consider
again the process, illustrated earlier, by which the best bid rises. Since the whole process usually takes place in a short
time, it may occur several times within a three-second observation interval, so the best bid can move by more than one δ.
Figure 2: Buyer’s new order arrival and optimal execution price movement in a small observation interval while maximum depth at each price level is
equal to D.
The figure omits the order matching step described earlier and shows the results directly: the best bid moves by 2δ, all
within a short time. Panels (a) and (d) are the initial and final states that can be observed in the interval, while (b) and
(c) indicate that two batches of new orders have entered during this period. The figure is only an illustration; many other
situations cause the best quote to move in a similar way.
In response to this situation, this paper relaxes the restriction, implied by the order flow imbalance, that the best price
can move by only one δ within the observation interval, and proposes a new order flow indicator called Generalized Order Flow
Imbalance (GOFI). This indicator is no longer based on the position of the best price, but focuses on the value of the best
price. Consider the time interval $[t_{k-1}, t_k]$ and divide this period into N small observation intervals. Generalized
Order Flow Imbalance is defined as follows:
$$\mathrm{GOFI}_{t_{k-1},t_k} = \sum_{n=1}^{N} \left( \Delta W_n - \Delta V_n \right)$$
$$\Delta W_n = \sum_{i=1}^{m^b_n} q^b_{n,i}\, \mathbb{1}_{\{P^b_{n,1} \ge P^b_{n-1,1}\}} - \sum_{i=1}^{m^b_n} q^b_{n-1,i}\, \mathbb{1}_{\{P^b_{n,1} \le P^b_{n-1,1}\}}$$
$$\Delta V_n = \sum_{i=1}^{m^s_n} q^s_{n,i}\, \mathbb{1}_{\{P^s_{n,1} \le P^s_{n-1,1}\}} - \sum_{i=1}^{m^s_n} q^s_{n-1,i}\, \mathbb{1}_{\{P^s_{n,1} \ge P^s_{n-1,1}\}}$$
In the formula, $\Delta W_n$ reflects the increase in buyer power in a small observation interval, $\Delta V_n$ reflects the
increase in seller power in a small observation interval, and the two are subtracted to measure generalized order flow
imbalance within a small observation interval. To explain the new symbols, still taking the buyer as an example, $q^b_{n,i}$
is the quantity of buy orders at the ith price level at time $n$ (the end of the nth small observation interval), and
$q^b_{n-1,i}$ is the quantity of buy orders at the ith price level at time $n-1$ (its start). The meanings of $q^s_{n,i}$ and
$q^s_{n-1,i}$ follow in the same way, and $P^b_{n,1}$ and $P^s_{n,1}$ are the best bid and best ask at time $n$. There are two
cases. When $q$ uses the order quantity directly, the indicator obtained is $\mathrm{GOFI}_{t_{k-1},t_k}$; with the
stationarity correction, when $q$ takes the logarithm of the order quantity, the indicator obtained is
$\text{log-GOFI}_{t_{k-1},t_k}$. For example, if $i = 1$, $q$ represents the (logarithmic) order quantity at the best bid or
the best ask; if $i = 2$, $q$ represents the (logarithmic) order quantity at the second bid price or the second ask price.
$m$ determines the range of $i$ in the above formula, and its superscript expresses the buying or selling direction. When the
best bid changes in the nth small observation interval, $m^b_n$ is the number of bid levels whose price at time $n$ is greater
than or equal to the best bid at time $n-1$ (if the bid rose), or whose price at time $n-1$ is greater than or equal to the
best bid at time $n$ (if the bid fell); when the best bid is unchanged, $m^b_n = 1$. In short, $m^b_n$ measures how many levels
the best bid has moved.
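The construction can be sketched in code under one reading of the definition, and this reading is an assumption rather than the authors' confirmed formula. When a best quote improves, the sketch sums the quantities at all current levels at or beyond the previous best price. When it retreats, it subtracts the quantities at all previous levels at or beyond the current best price. When it is unchanged, it takes the change in the best-level quantity. The function names, the snapshot layout (lists of `(price_in_ticks, quantity)` pairs, best level first) and the sample book are hypothetical.

```python
import math

def side_increment(prev_levels, curr_levels, is_bid, log_qty=False):
    """Increment of buyer (is_bid=True) or seller power over one interval."""
    f = math.log if log_qty else float   # log_qty=True gives the log-GOFI variant
    s = 1 if is_bid else -1              # makes "better price" mean larger s * price
    prev_best, curr_best = prev_levels[0][0], curr_levels[0][0]
    if s * curr_best > s * prev_best:    # quote improved, possibly by several ticks
        return sum(f(q) for p, q in curr_levels if s * p >= s * prev_best)
    if s * curr_best < s * prev_best:    # quote retreated
        return -sum(f(q) for p, q in prev_levels if s * p >= s * curr_best)
    return f(curr_levels[0][1]) - f(prev_levels[0][1])  # quote unchanged

def gofi(bid_snaps, ask_snaps, log_qty=False):
    """Sum of buyer-minus-seller increments over consecutive snapshots."""
    total = 0.0
    for n in range(1, len(bid_snaps)):
        d_w = side_increment(bid_snaps[n - 1], bid_snaps[n], True, log_qty)
        d_v = side_increment(ask_snaps[n - 1], ask_snaps[n], False, log_qty)
        total += d_w - d_v
    return total

# Hypothetical book: the best bid jumps two ticks (1000 -> 1002) in one snapshot.
bids = [[(1000, 500), (999, 300)],
        [(1002, 200), (1001, 400), (1000, 600), (999, 300)]]
asks = [[(1003, 700), (1004, 100)],
        [(1003, 700), (1004, 100)]]
print(gofi(bids, asks))
```

In this example three bid levels sit at or above the old best bid, so the buyer-side increment is 200 + 400 + 600 = 1200 and the ask side contributes nothing. Prices are kept as integer ticks so that the comparisons are exact.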
3. Comparison of Interpretability Between Four Order Flow Imbalances and Mid-price Changes
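$$\Delta P_{t_{k-1},t_k} = \alpha_1 + \beta_1\, \mathrm{OFI}_{t_{k-1},t_k} + \varepsilon$$
$$\Delta P_{t_{k-1},t_k} = \alpha_2 + \beta_2\, \text{log-OFI}_{t_{k-1},t_k} + \varepsilon$$
$$\Delta P_{t_{k-1},t_k} = \alpha_3 + \beta_3\, \mathrm{GOFI}_{t_{k-1},t_k} + \varepsilon$$
$$\Delta P_{t_{k-1},t_k} = \alpha_4 + \beta_4\, \text{log-GOFI}_{t_{k-1},t_k} + \varepsilon$$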
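The comparison rests on the coefficient of determination of a univariate linear regression of the mid-price change on each imbalance indicator. The following is a minimal sketch of an out-of-sample evaluation. The chronological 70/30 split, the function names and the synthetic data are assumptions made for illustration, since the sample split used in the study is not stated here.

```python
import numpy as np

def fit_line(x, y):
    # np.polyfit returns coefficients highest power first: slope, then intercept.
    beta, alpha = np.polyfit(x, y, deg=1)
    return alpha, beta

def r_squared(x, y, alpha, beta):
    # 1 - SSE / SST, with SST taken around the mean of the evaluated sample.
    resid = y - (alpha + beta * x)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - np.mean(y)) ** 2)

def out_of_sample_r2(x, y, train_frac=0.7):
    # Fit on the first part of the sample, evaluate on the remainder.
    x, y = np.asarray(x, float), np.asarray(y, float)
    split = int(len(x) * train_frac)
    alpha, beta = fit_line(x[:split], y[:split])
    return r_squared(x[split:], y[split:], alpha, beta)

# Synthetic stand-in: an indicator x and a noisy linear mid-price change y.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + 0.1 * rng.normal(size=500)
print(out_of_sample_r2(x, y))
```

Running the same evaluation with OFI, log-OFI, GOFI and log-GOFI in turn as `x`, for each stock and time scale, would produce figures comparable in kind to those reported in the tables.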
Table 1: The coefficient of determination of the linear regression between four order flow imbalances and mid-price change within 30 seconds.
Table 3: The coefficient of determination of the linear regression between four order flow imbalances and mid-price change within 5 minutes.
Figure 4: Comparison of the coefficient of determination of the linear regression between four order flow imbalances and mid-price change within 1
minute.
Figure 5: Comparison of the coefficient of determination of the linear regression between four order flow imbalances and mid-price change within 5
minutes.
4. Conclusion
References
[1] Cont, R. (2011). Statistical modeling of high-frequency financial data. IEEE Signal Processing Magazine, 28(5), 16-25.
[2] Bertsimas, D., & Lo, A. W. (1998). Optimal control of execution costs. Journal of Financial Markets, 1(1), 1-50.
[3] Almgren, R., & Chriss, N. (2001). Optimal execution of portfolio transactions. Journal of Risk, 3, 5-40.
[4] Obizhaeva, A. A., & Wang, J. (2013). Optimal trading strategy and supply/demand dynamics. Journal of Financial Markets, 16(1), 1-32.
[5] Ma, G., Siu, C. C., Zhu, S. P., & Elliott, R. J. (2020). Optimal portfolio execution problem with stochastic price impact. Automatica, 112, 108739.
[6] Biais, B., & Woolley, P. (2011). High frequency trading. Manuscript, Toulouse University, IDEI.
[7] O’Hara, M. (2015). High frequency market microstructure. Journal of Financial Economics, 116(2), 257-270.
[8] Jones, C. M. (2013). What do we know about high-frequency trading? Columbia Business School Research Paper, (13-11).
[9] Brogaard, J., Hendershott, T., & Riordan, R. (2014). High-frequency trading and price discovery. The Review of Financial Studies, 27(8), 2267-2306.
[10] Berger, N., DeSantis, M., & Porter, D. (2020). The Impact of High-Frequency Trading in Experimental Markets. The Journal of Investing, 29(4), 7-
18.
[11] Cao, C., Hansch, O., & Wang, X. (2009). The information content of an open limit‐order book. Journal of Futures Markets: Futures, Options, and
Other Derivative Products, 29(1), 16-41.
[12] Cont, R., Kukanov, A., & Stoikov, S. (2014). The price impact of order book events. Journal of Financial Econometrics, 12(1), 47-88.
[13] Shen, D. (2015). Order imbalance based strategy in high frequency trading (Doctoral dissertation, University of Oxford).
[14] Xu, K., Gould, M. D., & Howison, S. D. (2018). Multi-level order-flow imbalance in a limit order book. Market Microstructure and Liquidity, 4(03n04),
1950011.
[15] Sirignano, J., & Cont, R. (2019). Universal features of price formation in financial markets: perspectives from deep learning. Quantitative Finance,
19(9), 1449-1459.
[16] Wang, Q., Teng, B., Hao, Q., & Shi, Y. (2021). High-frequency Statistical Arbitrage Strategy Based on Stationarized Order Flow Imbalance. Procedia
Computer Science, 187, 518-523.