This article provides a technical overview of Kaiko’s recently launched tick-by-tick historical order book product.
- L1 vs. L2 vs. L3 Order Book Data
- L3 Data in Cryptocurrency Markets
- Order Book Snapshots vs. Tick-Level Order Books
- Kaiko’s Tick-level Order Book Data
- Use Cases
- Request Data Sample
L1 vs. L2 vs. L3 Order Book Data
Financial market order book data can be divided into three categories: Level 1, Level 2, and Level 3. In cryptocurrency markets, these categories are often blurred and differ slightly depending on the data formatting provided by the exchange. However, for the sake of clarity, we will provide an overview of the differences between these data types as they are generally defined in cryptocurrency markets.
Level 1: L1 data refers to the best bid / best ask of a trading pair’s order book. This data is commonly referred to as “quote” data, and is accessed in real time. Traders subscribing to L1 quote data receive real-time changes to the best bid / best ask of an order book, including the price level and order amount.
Level 2: L2 data is more granular than L1 data, and includes bids and asks at all price levels of a trading pair’s order book. This data is commonly referred to as “depth of book” data. In cryptocurrency markets, market makers place limit orders at price levels that differ from the current price of an asset. L2 data includes all bids and asks placed by these market makers, aggregated by price level. For example, if 10 traders placed bids of various sizes at $8,000 while the current price of BTC was $8,500, this would appear as an L2 quote of [price: $8,000, amount: 23.4]. The example amount of 23.4 would be the sum of all orders placed by the 10 traders. Because L2 data is aggregated by price level, it can also be referred to as “market by price level” data.
Level 3: L3 data provides even deeper information than L2 data. L3 data refers to non-aggregated bids and asks placed by individual market makers. An L3 data feed would include every individual bid and ask, whereas an L2 data feed would include these bids and asks aggregated by price level. For example, L3 data for the quotes placed by the 10 traders above would result in 10 different data points with each individual order amount, rather than a single data point aggregated by price level. Thus, L3 data can be referred to as “market by order” data.
Each consecutive level of data provides an increased level of granularity. Some traders may find L1 data sufficient for their trading strategies, while others require L2 data. L3 data is the most granular level of order book data, frequently used by the most sophisticated traders.
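As a rough illustration of how the three levels relate, the Python sketch below starts from a handful of individual bids (L3), sums them by price level (L2), and then takes the top level (L1). The order IDs, prices, amounts, and function names are invented for the example and do not correspond to any exchange's format.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical L3 ("market by order") bids: one entry per resting order.
l3_bids = [
    {"order_id": "a1", "price": Decimal("8000"), "amount": Decimal("1.5")},
    {"order_id": "a2", "price": Decimal("8000"), "amount": Decimal("0.9")},
    {"order_id": "a3", "price": Decimal("7990"), "amount": Decimal("2.0")},
]


def aggregate_to_l2(orders):
    """Sum order amounts per price level (L3 -> L2, "market by price level")."""
    levels = defaultdict(Decimal)  # Decimal() is zero, so new levels start empty
    for order in orders:
        levels[order["price"]] += order["amount"]
    return dict(levels)


def best_bid(l2_levels):
    """Return (price, amount) for the highest bid price (L2 -> L1)."""
    price = max(l2_levels)
    return price, l2_levels[price]


l2_bids = aggregate_to_l2(l3_bids)  # two price levels: 8000 and 7990
l1_bid = best_bid(l2_bids)          # the 8000 level, with amount 1.5 + 0.9
```

Going in the other direction is not possible: once orders are summed into a price level, the individual order sizes cannot be recovered, which is why L3 has to be collected as its own feed.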
L3 Data in Cryptocurrency Markets
Nearly all exchanges provide L1 and L2 data feeds; an L2 feed comprises all bids and asks on a trading pair’s order book, aggregated by price level. However, only three top exchanges currently provide L3 data feeds: Coinbase, Bitstamp, and Bitfinex.
Why is this the case? One reason may be the sheer volume of data. At each individual price level, there can be dozens (at times, hundreds) of individual market makers placing bids or asks. L2 data collapses these orders into a single data point per price level, while L3 data consists of raw, individual orders. Most traders find L2 data sufficient for their use case, which may be why exchanges frequently do not provide this data unaggregated. The L3 feeds from these three exchanges can also be unwieldy at times, which suggests that exchanges struggle to maintain stable WebSocket connections for this data type.
There is currently no strong consensus across exchanges for defining this data type. Each of the three exchanges that provide an L3 data feed labels it differently: Coinbase calls its L3 feed the ‘Full’ channel, Bitfinex calls its feed the ‘Raw Book’ channel, and Bitstamp calls its feed the ‘Live Orders’ channel. All three of these feeds have completely different formatting. However, the one commonality is that this data comprises non-aggregated, raw bids and asks.
At Kaiko, we hope to create a common language around L1, L2, and L3 data and define its role in cryptocurrency markets.
Tick-level Order Book Data
Order book data can be collected from cryptocurrency exchanges in a number of ways. Every exchange provides both a REST API, where users can make requests for market data, and a streaming WebSocket, where users can subscribe to live market data feeds. All exchanges provide order book data through both of these delivery channels.
Since 2015, we have collected what we call “order book snapshots” from exchanges’ REST APIs. Twice per minute, we query each exchange’s order book API endpoint and record all bids and asks live on a trading pair’s order book at that moment. The data is essentially stored as a long list of bids and asks that share a single timestamp, which marks the moment the snapshot was taken. Different timestamps correspond to different snapshots.
Order book snapshots allow traders and researchers to replicate order books as one would view them on an exchange, but only in 30-second increments (the frequency at which we currently take snapshots). Between two snapshots, it is impossible to know whether orders were added, changed, or removed. For liquid markets, this can be problematic because hundreds, if not thousands, of changes are frequently made every minute. In particular, high-frequency traders often place limit orders that are executed within milliseconds. Order book snapshots are simply not granular enough to capture these rapid changes to an exchange’s order book.
Thus, we decided to build a tick-level order book product to ensure that our sophisticated clients have the granular data they need. Tick-level order book data is collected from exchange WebSocket feeds. Rather than make requests in 30-second increments, we simply subscribe to the order book feed and receive a constant stream of all updates made to the order book. Each data packet we receive represents a new, deleted, or changed order that can be “added” to or “subtracted” from an order book’s current state.
In order to maintain an internal order book, you need an order book snapshot as a base point from which to add and subtract the “ticks.” Thus, for each tick-level WebSocket feed we receive data from, we also poll the order book REST API endpoint to collect an order book snapshot. All ticks come with a millisecond (or microsecond) timestamp from the exchange, a sequence ID, or both. These timestamps and sequence IDs are crucial for maintaining the correct sequence of events: when adding or subtracting ticks from an order book snapshot, the ticks must be applied in the order in which the exchange processed them.
To summarize, order book snapshots and tick-level updates can be used in tandem to rebuild an order book at any given point in time. For example, with just one order book snapshot and all tick-level updates collected, you can replay every single change made to the order book over the time interval you are interested in.
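The replay logic can be sketched in a few lines of Python. This is a simplified illustration, not our production code or any exchange's specification: the field names (`sequence`, `type`, `order_id`, `side`, `price`, `amount`) and the three event types are assumptions made for the example, and it presumes every tick carries a sequence ID that is comparable with the snapshot's.

```python
def apply_events(snapshot, events):
    """Rebuild an L3 book: start from a snapshot, then apply ticks in sequence order.

    snapshot: {"sequence": int, "orders": {order_id: (side, price, amount)}}
    events:   iterable of dicts with the hypothetical keys "sequence",
              "type" ("new" | "change" | "delete"), "order_id" and,
              for new/change events, "side", "price", "amount".
    """
    book = dict(snapshot["orders"])  # copy so the snapshot itself stays untouched
    for ev in sorted(events, key=lambda e: e["sequence"]):
        if ev["sequence"] <= snapshot["sequence"]:
            continue  # assumed to be already reflected in the snapshot
        if ev["type"] == "delete":
            book.pop(ev["order_id"], None)
        else:
            # "new" and "change" both set the order's current state
            book[ev["order_id"]] = (ev["side"], ev["price"], ev["amount"])
    return book


snapshot = {"sequence": 100, "orders": {"a1": ("bid", 8000.0, 1.5)}}
events = [  # deliberately out of order to show why sorting matters
    {"sequence": 103, "type": "delete", "order_id": "a1"},
    {"sequence": 101, "type": "new", "order_id": "b7",
     "side": "ask", "price": 8500.0, "amount": 0.4},
    {"sequence": 102, "type": "change", "order_id": "a1",
     "side": "bid", "price": 8000.0, "amount": 1.0},
    {"sequence": 99, "type": "new", "order_id": "zz",
     "side": "bid", "price": 7000.0, "amount": 9.0},
]
book = apply_events(snapshot, events)
```

In this example, event 99 is skipped because it predates the snapshot, order `a1` is changed and then deleted, and only the new ask `b7` remains. Applying the same events in arrival order instead would delete `a1` first and then resurrect it with the later-arriving change, leaving a stale order in the book.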
Kaiko’s Tick-level Order Book Data
Based on client requests and market research, we decided that the first version of our tick-level order book data would be historical L3 data from Coinbase, Bitstamp, and Bitfinex, for the pairs btc-usd and eth-usd. For now, this data will be delivered in historical .csv files. In the future, we plan to collect L2 data from a wider range of exchanges, and add live WebSocket feeds.
This dataset comes in two parts: one file includes all tick-level updates, and one file includes order book snapshots taken once per hour. These two datasets can be combined to rebuild historical market states.
Collection: Ticks are collected from each exchange’s L3 WebSocket channel. Order book snapshots are taken once per hour, collected from each exchange’s order book REST API endpoint.
Normalization: The data is not normalized due to the large differences in formatting across exchanges. Each raw data packet collected includes details about the individual bid or ask placed by the market maker.
File Structure: Each individual data packet is stored in rows in compressed .csv files. One file is generated per day, for both tick updates (which we call “events”) and snapshots.
Delivery: Data is delivered through either AWS or GCP. Files for both “events” and “snapshots” will be delivered, as shown below. We can configure weekly delivery so that once per week, the previous week’s worth of L3 data is automatically pushed to your cloud bucket.
Every exchange has different formatting, and we include examples of the diverging formats below. Because not every “event” has the same format, we do not include headers in the files. You can view details about headers in our documentation for this data type here.
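Because the files ship without a header row, column names have to be supplied by the reader. The Python sketch below shows one way to do that. The column names are placeholders invented for the example and are not the actual layout of any exchange's file (the real layouts are in the documentation), and the gzip wrapper assumes the compressed files are gzip archives.

```python
import csv
import gzip

# Placeholder column names for illustration only; the real layout differs
# per exchange and must be taken from the documentation.
ASSUMED_COLUMNS = ["timestamp", "event_type", "order_id", "side", "price", "amount"]


def parse_rows(lines, columns):
    """Yield one dict per CSV row, pairing values with caller-supplied names."""
    for row in csv.reader(lines):
        yield dict(zip(columns, row))


def read_gzip_csv(path, columns=ASSUMED_COLUMNS):
    """Stream rows from a gzip-compressed, headerless CSV file (gzip is assumed)."""
    with gzip.open(path, mode="rt", newline="") as handle:
        yield from parse_rows(handle, columns)


# In-memory demonstration with two made-up rows.
sample = [
    "1590019200000,new,a1,bid,8000,1.5\n",
    "1590019200250,delete,a1,bid,8000,0\n",
]
rows = list(parse_rows(sample, ASSUMED_COLUMNS))
```

Every value comes back as a string, so prices and amounts still need converting (for example with `decimal.Decimal`) before any arithmetic. Keeping one column list per exchange is a simple way to cope with the diverging formats.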
Use Cases for Tick-level Order Book Data
Despite the sheer volume of L3 data and resulting complexities when working with it, we received significant demand for L3 data over the past year, in particular for the Coinbase ‘Full’ data feed. Most order book analyses can be performed on L2 data, but L3 data provides even further precision. Traders and researchers can better project market demand and support, simulate entries and exits, train machine learning algorithms, search for arbitrage opportunities, determine at what price level a trade is likely to be filled, and more.
Some traders want to maintain internal order books. This can be done by applying the tick-level updates to an order book snapshot, which we provide in a companion file. Some analyses can be performed with just the raw ticks, but others need fully recreated order book states, which requires an original snapshot to which all tick-level order book updates are applied. There are simple algorithms described by exchanges for maintaining an internal order book (Coinbase’s documentation includes one here).
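To rebuild the book at an arbitrary moment, a natural first step is to pick the most recent snapshot at or before that moment and the ticks that fall between the two. The sketch below shows this selection using Python's standard `bisect` module. It works purely on timestamps (assumed here to be epoch milliseconds in sorted lists), and the function name and sample values are invented for the example. Where an exchange provides sequence IDs, those are a safer key than timestamps.

```python
from bisect import bisect_right


def select_replay_inputs(snapshot_times, event_times, target):
    """Return (base_snapshot_time, event_times_to_apply) for a target time.

    All values are timestamps in the same unit (epoch milliseconds here),
    and both lists must be sorted in ascending order.
    """
    i = bisect_right(snapshot_times, target)
    if i == 0:
        raise ValueError("no snapshot at or before the target time")
    base = snapshot_times[i - 1]
    # Events stamped exactly at the snapshot time are assumed to be included
    # in the snapshot already, so only events strictly after it are kept.
    lo = bisect_right(event_times, base)
    hi = bisect_right(event_times, target)
    return base, event_times[lo:hi]


hourly_snapshots = [0, 3_600_000, 7_200_000]
tick_times = [3_599_000, 3_600_500, 3_900_000, 4_000_000, 7_300_000]
base, to_apply = select_replay_inputs(hourly_snapshots, tick_times, 3_950_000)
```

Here the snapshot at 3,600,000 ms is chosen as the base, and only the two ticks at 3,600,500 and 3,900,000 need to be applied. With hourly snapshots, at most one hour of ticks ever has to be replayed to reach any target time.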
Request Data Sample
We began collecting this data on May 21. Thus, we are happy to provide this data for free for the first few weeks in order to receive feedback and perfect the product. If you are interested in trying out our tick-level order book data, please email us at email@example.com or fill out our contact form here. We can provide individual data samples or set up a full trial of the service with weekly pushes of historical data.
We are eager for any and all feedback, including any comments about our data formatting, details about your use case, additional exchanges to begin collection for, etc.
Data Dictionary: https://www.kaiko.com/pages/cryptocurrency-data-types
Full Documentation: https://docs.kaiko.com/#introduction
API Documentation: https://www.kaiko.com/pages/market-data-api
Historical Data: https://www.kaiko.com/pages/historical-data
Charts and Analytics: https://www.kaiko.com/pages/research-factsheet
Contact Form: https://www.kaiko.com/pages/contact-kaiko
Tick-Level Order Books: Technical Product Overview was originally published in Kaiko Data on Medium.