Engineered for speed. Built for insight.
Live dashboard running against the Binance BTC/USDT feed: 7 panels, 50ms refresh, ~497 ticks captured
The Hybrid HFT Microstructure Engine is a full-stack quantitative trading research system that bridges the gap between theoretical market microstructure and production-grade systems engineering.
The architecture is intentionally split into two specialized runtimes:
- A C++ backend that ingests live Binance WebSocket data at sub-millisecond latency, maintains a real-time Limit Order Book, and runs a 6-layer quantitative signal stack.
- A Python frontend that subscribes to the signal stream via ZeroMQ and renders an 8-panel real-time analytics dashboard using PyQtGraph.
This project is not a black-box algorithmic trader. It is a microstructure observatory: a system for understanding how markets move, who is trading, and why prices change, all in real time.
| Category | Feature | Details |
|---|---|---|
| Performance | Sub-millisecond C++ Core | Lock-free data flow, compiled with -O2, no heap allocations on the hot path |
| Connectivity | Binance WebSocket | Partial Book Depth stream @ 100ms (btcusdt@depth10@100ms) via ixwebsocket |
| Order Book | Limit Order Book (LOB) | std::map keyed by price, with Bids sorted descending and Asks ascending |
| Signal Layer 1 | Rolling Return Z-Score | Welford's online algorithm for numerically stable, real-time mean/variance estimation |
| Signal Layer 2 | Parkinson Volatility | High-low range volatility estimator for intraday regime detection |
| Signal Layer 3 | VPIN | Volume-Synchronized Probability of Informed Trading (Easley, López de Prado, O'Hara 2012) |
| Signal Layer 4 | Kalman Filter (OBI) | Adaptive noise reduction on the raw Order Book Imbalance signal |
| Signal Layer 5 | Kyle's Lambda | Measures price impact per unit of order flow: a real-time liquidity gauge |
| Signal Layer 6 | Predictive PDF | Location-Scale Student-t distribution fit to next-tick Δprice; outputs P(up), P(down), and directional edge |
| Composite Signal | Multi-Factor Score | Gates the Kalman-smoothed OBI by VPIN, Parkinson volatility, and Kyle's lambda to produce a single trading signal |
| IPC | ZeroMQ PUB/SUB | Non-blocking, high-throughput message passing at tcp://127.0.0.1:5555 |
| Dashboard | 8-Panel Live View | PyQtGraph powered, 50ms refresh, dark-mode, colour-coded signal thresholds |
┌──────────────────────────────────────────────────────────────────────┐
│                     C++ BACKEND (trading_engine)                     │
│                                                                      │
│  ┌─────────────┐    ┌──────────────┐    ┌─────────────────────┐      │
│  │  Binance    │    │ Limit Order  │    │  6-Layer Signal     │      │
│  │  WebSocket  │───▶│ Book (LOB)   │───▶│  Stack              │      │
│  │  (100ms)    │    │              │    │                     │      │
│  └─────────────┘    │ bids: map    │    │  L1: Return Z-Score │      │
│                     │ asks: map    │    │  L2: Parkinson Vol  │      │
│  ixwebsocket        │              │    │  L3: VPIN           │      │
│  nlohmann/json      │ midPrice()   │    │  L4: Kalman OBI     │      │
│                     │ obi()        │    │  L5: Kyle's Lambda  │      │
│                     └──────────────┘    │  L6: Predictive PDF │      │
│                                         └──────────┬──────────┘      │
│                                                    │                 │
│                                         ┌──────────▼──────────┐      │
│                                         │  Composite Signal   │      │
│                                         │  + JSON Payload     │      │
│                                         └──────────┬──────────┘      │
└────────────────────────────────────────────────────┼─────────────────┘
                                                     │  ZeroMQ PUB
                                                     │  tcp://127.0.0.1:5555
┌────────────────────────────────────────────────────┼─────────────────┐
│                    PYTHON FRONTEND (dashboard.py)                    │
│                                                    │                 │
│                                         ┌──────────▼──────────┐      │
│                                         │  ZmqThread (SUB)    │      │
│                                         │  data_queue         │      │
│                                         └──────────┬──────────┘      │
│                                                    │                 │
│  ┌──────────┬──────────┬────────────────┬──────────▼──────────┐      │
│  │ Panel 1  │ Panel 2  │ Panel 3        │ QTimer (50ms poll)  │      │
│  │ Mid-Price│ Return Z │ Return Distrib.│                     │      │
│  ├──────────┼──────────┼────────────────┤ RollingBuffer[500]  │      │
│  │ Panel 4  │ Panel 5  │ Panel 6        │                     │      │
│  │ Park Vol │ VPIN     │ Kalman OBI     │ PyQtGraph UI        │      │
│  ├──────────┼──────────┼────────────────┘                     │      │
│  │ Panel 7  │ Panel 8  │                                      │      │
│  │ Kyle's λ │ PDF Edge │                                      │      │
│  └──────────┴──────────┴──────────────────────────────────────┘      │
└──────────────────────────────────────────────────────────────────────┘
The heart of the engine is a sequential signal pipeline that transforms raw order book data into a calibrated composite trading score.
Layer 1 (Rolling Return Z-Score) tracks the tick-by-tick return distribution using Welford's numerically stable online algorithm. Each new return is expressed as a z-score relative to the rolling mean and standard deviation. Z-scores beyond ±2.5σ indicate statistically anomalous price moves.
ret_t = (mid_t - mid_{t-1}) / mid_{t-1}
z_t = (ret_t - μ) / σ # Welford rolling mean/stddev
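The Layer 1 update can be sketched in Python as follows. This is an illustration only, not the engine's C++ code: the class name `WelfordZ`, the helper `tick_return`, and the choice to score each return against the statistics of the returns seen before it are assumptions, and the sketch accumulates over all ticks rather than a fixed window.

```python
import math


class WelfordZ:
    """Running mean/variance via Welford's update, used to z-score returns."""

    def __init__(self):
        self.n = 0        # number of returns folded in so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Score x against the statistics so far, then fold x into them."""
        z = 0.0
        if self.n >= 2:
            std = math.sqrt(self.m2 / (self.n - 1))  # sample standard deviation
            if std > 0.0:
                z = (x - self.mean) / std
        # Welford's numerically stable update
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return z


def tick_return(mid_prev, mid_now):
    """Simple return between two consecutive mid-prices."""
    return (mid_now - mid_prev) / mid_prev
```

A caller would feed `tick_return(prev_mid, mid)` into `update` on every tick and flag values whose magnitude exceeds 2.5.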
Layer 2 (Parkinson Volatility) uses the best ask and best bid as stand-ins for the period high and low in the range-based volatility estimator proposed by Parkinson (1980). Range-based estimators are more efficient than close-to-close estimators because they use intra-period price extremes.
park_vol = √( 1/(4·ln2) · ln²(ask/bid) ) # rolling average
The raw OBI is then normalized by Parkinson Vol to make it volatility-regime-aware:
obi_normalized = obi_raw / park_vol
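A minimal Python sketch of the two formulas above. The window length of 100, the class name `ParkinsonVol`, the `floor` guard against division by zero, and the decision to average the per-tick variance terms before taking the square root are all assumptions made for illustration; the source only states that a rolling average is used.

```python
import math
from collections import deque

PARK_CONST = 1.0 / (4.0 * math.log(2.0))  # 1 / (4 ln 2)


class ParkinsonVol:
    """Rolling Parkinson-style volatility using ask/bid as the high/low proxy."""

    def __init__(self, window=100):  # window length is a hypothetical default
        self.samples = deque(maxlen=window)  # per-tick variance terms

    def update(self, best_bid, best_ask):
        log_range = math.log(best_ask / best_bid)
        self.samples.append(PARK_CONST * log_range ** 2)
        # Average the variance terms, then take the square root.
        return math.sqrt(sum(self.samples) / len(self.samples))


def normalize_obi(obi_raw, park_vol, floor=1e-12):
    """Volatility-adjusted OBI; `floor` avoids dividing by zero."""
    return obi_raw / max(park_vol, floor)
```

Because the spread is a small fraction of price, `park_vol` is typically tiny, so `obi_normalized` can be large in magnitude; the sketch leaves any further scaling to the caller.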
Layer 3 (VPIN) implements the Volume-Synchronized Probability of Informed Trading from Easley, López de Prado & O'Hara (2012). Because the engine receives LOB snapshots rather than individual trades, tick direction is classified via the Lee-Ready rule:
- mid ↑ → buyer-initiated (aggressive buyer lifted the ask)
- mid ↓ → seller-initiated (aggressive seller hit the bid)
- mid = → neutral (split 50/50)
VPIN is computed as a rolling average over 50 volume buckets (each 50 ticks). Values above 0.6 signal dangerously toxic order flow.
bucket_vpin = |buy_vol - sell_vol| / (buy_vol + sell_vol)
rolling_vpin = avg(last 50 bucket_vpins)
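The classification and bucketing described above might look like this in Python. It is a sketch under assumptions: the class name `Vpin` is hypothetical, and since LOB snapshots carry no traded volume, each tick is given a unit `volume` unless the caller supplies one.

```python
from collections import deque


class Vpin:
    """Tick-classified VPIN over fixed-size tick buckets."""

    def __init__(self, bucket_ticks=50, n_buckets=50):
        self.bucket_ticks = bucket_ticks
        self.buckets = deque(maxlen=n_buckets)  # completed bucket VPINs
        self.buy = 0.0
        self.sell = 0.0
        self.count = 0
        self.prev_mid = None

    def update(self, mid, volume=1.0):
        if self.prev_mid is not None:
            if mid > self.prev_mid:        # uptick: buyer-initiated
                self.buy += volume
            elif mid < self.prev_mid:      # downtick: seller-initiated
                self.sell += volume
            else:                          # unchanged: split 50/50
                self.buy += 0.5 * volume
                self.sell += 0.5 * volume
            self.count += 1
            if self.count == self.bucket_ticks:
                total = self.buy + self.sell
                imbalance = abs(self.buy - self.sell) / total if total > 0 else 0.0
                self.buckets.append(imbalance)
                self.buy = self.sell = 0.0
                self.count = 0
        self.prev_mid = mid
        # Rolling average over the completed buckets (0.0 until one exists).
        return sum(self.buckets) / len(self.buckets) if self.buckets else 0.0
```

With the defaults, the first non-zero reading appears only after 50 classified ticks, and the average covers at most the last 50 buckets.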
Layer 4 (Kalman Filter on OBI) applies a 1D Kalman filter to the raw OBI signal to produce a smoothed estimate that adapts to changing market conditions without look-ahead bias. The innovation (prediction error) is also tracked and normalized as its own z-score.
prediction = x̂_{t|t-1} = F·x̂_{t-1}
innovation = z_t - H·x̂_{t|t-1}
update = x̂_t = x̂_{t|t-1} + K_t · innovation
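A scalar Kalman filter matching the three steps above can be sketched as follows. The noise parameters `q` and `r`, the initial state, and the class name `Kalman1D` are illustrative assumptions; dividing the innovation by the square root of its predicted variance is one common normalization and may differ from what the engine does.

```python
import math


class Kalman1D:
    """Scalar Kalman filter: x_t = F x_{t-1} + w,  z_t = H x_t + v."""

    def __init__(self, f=1.0, h=1.0, q=1e-4, r=1e-2, x0=0.0, p0=1.0):
        self.f, self.h = f, h   # state transition and observation gains
        self.q, self.r = q, r   # process and measurement noise variances
        self.x, self.p = x0, p0  # state estimate and its variance

    def update(self, z):
        # Predict
        x_pred = self.f * self.x
        p_pred = self.f * self.p * self.f + self.q
        # Innovation and its predicted variance
        innovation = z - self.h * x_pred
        s = self.h * p_pred * self.h + self.r
        # Correct
        k = p_pred * self.h / s
        self.x = x_pred + k * innovation
        self.p = (1.0 - k * self.h) * p_pred
        return self.x, innovation, innovation / math.sqrt(s)
```

Each call uses only the current and past observations, which is what keeps the smoothed OBI free of look-ahead bias.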
Layer 5 (Kyle's Lambda) estimates the price impact coefficient (Kyle, 1985), the cost of trading one unit of volume. A high lambda indicates an illiquid market where even small orders move prices significantly.
Δprice = λ · signed_volume + ε
λ = OLS estimate (rolling window, online update)
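One way to keep a rolling OLS slope up to date in constant time per tick is to maintain running sums and evict the oldest pair. The sketch below assumes a regression through the origin (the formula above has no intercept) and a hypothetical window of 200 ticks; `KyleLambda` is an illustrative name.

```python
from collections import deque


class KyleLambda:
    """Rolling no-intercept OLS slope of price change on signed volume."""

    def __init__(self, window=200):  # window length is a hypothetical default
        self.window = window
        self.pairs = deque()
        self.sxy = 0.0  # running sum of signed_volume * d_price
        self.sxx = 0.0  # running sum of signed_volume ** 2

    def update(self, d_price, signed_volume):
        self.pairs.append((d_price, signed_volume))
        self.sxy += signed_volume * d_price
        self.sxx += signed_volume * signed_volume
        if len(self.pairs) > self.window:
            old_dp, old_v = self.pairs.popleft()  # evict the oldest sample
            self.sxy -= old_v * old_dp
            self.sxx -= old_v * old_v
        # lambda = sum(x*y) / sum(x*x); 0.0 when there is no flow in the window
        return self.sxy / self.sxx if self.sxx > 0.0 else 0.0
```

Subtracting evicted terms can accumulate floating-point drift over very long runs, so a production version would likely recompute the sums periodically.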
Layer 6 (Predictive PDF) models the next-tick price change as a location-scale Student's t-distribution, chosen because HFT returns are known to exhibit heavy tails that Gaussian models cannot capture.
In market microstructure, a short-term price move decomposes into a deterministic drift (driven by order flow imbalance and price impact) and a stochastic diffusion (driven by realized volatility). This layer fuses all upstream signals into a single probabilistic forecast:
x = ΔP_{t+Δt} ~ t(μ, σ, ν)
μ = λ_kyle · OBI_kalman · Δt # drift: flow imbalance × impact × time
σ = σ_parkinson · √Δt · (1 + VPIN) # scale: vol widened by flow toxicity
ν ∈ [3, 5] # degrees of freedom (tail thickness)
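Why Student-t and not Gaussian? Because crypto microstructure is leptokurtic: returns have sharper peaks and fatter tails than a normal distribution. For ν > 4 the excess kurtosis of a Student-t is 6/(ν − 4), so ν = 5 gives an excess kurtosis of 6. At the default ν = 4 the fourth moment no longer exists and the tails are heavier still, which the project treats as a close match for empirical BTC/USDT tick distributions.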
The CDF is evaluated in closed form using the regularised incomplete beta function (Lentz continued fraction, O(1) per tick, no external dependencies). This yields exact uptick/downtick probabilities:
P(up) = P(x > 0) = 1 - F_t(-μ/σ; ν) # CDF evaluated at standardised threshold
P(dn) = 1 - P(up)
edge = P(up) - P(dn) = 2·P(up) - 1 # signed directional edge ∈ (-1, 1)
The edge output is the key actionable quantity: edge > 0 implies a bullish lean, edge < 0 a bearish one, and |edge| quantifies confidence.
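The probability calculation can be sketched in pure Python. The continued fraction follows the modified Lentz scheme described in Numerical Recipes §6.4; the function names, the default `dt=1.0`, and the iteration and tolerance constants are assumptions for illustration and are not taken from `predictive_pdf.hpp`.

```python
import math

_TINY = 1e-300


def _guard(v):
    """Keep Lentz terms away from zero to avoid division blow-ups."""
    return _TINY if abs(v) < _TINY else v


def _betacf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _guard(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))       # even step
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))  # odd step
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betai(a, b, x):
    """Regularised incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def student_t_cdf(t, nu):
    """CDF of the standard Student-t with nu degrees of freedom."""
    tail = 0.5 * betai(0.5 * nu, 0.5, nu / (nu + t * t))  # P(T > |t|)
    return 1.0 - tail if t >= 0.0 else tail


def predictive_edge(kyle_lambda, obi_kalman, park_vol, vpin, dt=1.0, nu=4.0):
    """Return (P(up), P(dn), edge) for the location-scale Student-t forecast."""
    mu = kyle_lambda * obi_kalman * dt
    sigma = park_vol * math.sqrt(dt) * (1.0 + vpin)
    if sigma <= 0.0:
        raise ValueError("scale must be positive")
    p_up = 1.0 - student_t_cdf(-mu / sigma, nu)
    return p_up, 1.0 - p_up, 2.0 * p_up - 1.0
```

With zero drift the function returns an edge of 0, and the edge takes the sign of `kyle_lambda * obi_kalman`, consistent with the interpretation above.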
The final signal gates the Kalman-smoothed OBI through three multiplicative factors:
composite = obi_kalman × (1 - vpin) × min(1/park_vol, 5.0) × liq_gate
            [direction]  [toxicity]   [vol regime]           [liquidity]
liq_gate = 1.0 if kyle_lambda < 0.5 (liquid)
           0.5 otherwise (illiquid penalty)
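Expressed as a small Python function, the gating looks like this. The function and parameter names are illustrative, and treating a non-positive `park_vol` as the capped value is an assumption added to avoid division by zero.

```python
def composite_signal(obi_kalman, vpin, park_vol, kyle_lambda,
                     vol_cap=5.0, lambda_threshold=0.5):
    """Gate the Kalman-smoothed OBI by toxicity, vol regime, and liquidity."""
    toxicity = 1.0 - vpin                      # shrinks as flow gets toxic
    if park_vol <= 0.0:                        # assumed guard, not in the source
        vol_regime = vol_cap
    else:
        vol_regime = min(1.0 / park_vol, vol_cap)
    liq_gate = 1.0 if kyle_lambda < lambda_threshold else 0.5
    return obi_kalman * toxicity * vol_regime * liq_gate
```

For example, `composite_signal(0.4, 0.25, 0.5, 0.1)` multiplies 0.4 by 0.75, 2.0, and 1.0, while raising `kyle_lambda` above 0.5 halves the result.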
| Panel | Metric | Description |
|---|---|---|
| 1 | Mid-Price | Live BTC/USDT mid-market price time series |
| 2 | Return Z-Score | Tick returns normalized to ±σ units, with ±2.5σ alert lines |
| 3 | Return Distribution | Rolling 500-sample histogram with overlaid Student-t PDF fit (via scipy) |
| 4 | Parkinson Vol & OBI/Vol | Realized volatility (pale blue) vs. volatility-adjusted OBI (purple) |
| 5 | VPIN | Flow toxicity tracker with a 0.6 danger-zone threshold line |
| 6 | Kalman OBI | Raw OBI (grey) vs. Kalman-smoothed OBI (blue) vs. Innovation (orange) |
| 7 | Kyle's Lambda | Market liquidity/impact coefficient; spikes indicate thin order books |
| 8 | Predictive PDF Edge | Rolling directional edge P(up) - P(dn) from the Student-t fit; ±0.5 threshold bands |
The top stat bar provides a live heads-up display with colour-coded values for Mid-Price, OBI, Return-Z, VPIN, Parkinson Vol, Composite Signal, and PDF Edge.
| Requirement | Version | Notes |
|---|---|---|
| C++ Compiler | GCC 10+ / Clang 12+ | C++17 standard required |
| CMake | 3.20+ | vcpkg ships a compatible version |
| vcpkg | latest | Used for all C++ dependencies |
| Python | 3.8+ | For the analytics dashboard |
nlohmann-json # JSON parsing of Binance payloads
cppzmq # ZeroMQ C++ bindings
ixwebsocket # WebSocket client for Binance streams
pyzmq # ZeroMQ subscriber socket
pyqtgraph # Ultra-fast real-time plotting
PyQt6 # Qt6 bindings for Python
scipy # Optional: Student-t PDF fitting in return histogram
numpy # Array operations for rolling buffers
git clone <your-repo-url>
cd hft-engine
# Build using the convenience script (uses vcpkg's bundled cmake)
chmod +x build.sh
./build.sh

Or manually with CMake:

cmake -B build -S . \
  -DCMAKE_TOOLCHAIN_FILE=./vcpkg/scripts/buildsystems/vcpkg.cmake
cmake --build build

python -m venv venv
source venv/bin/activate
pip install pyzmq pyqtgraph PyQt6 scipy numpy

Open two terminals and run in order:

# Terminal 1: Start the C++ engine (connects to Binance, starts publishing)
./build/trading_engine

# Terminal 2: Launch the Python dashboard
source venv/bin/activate
python dashboard/dashboard.py

The engine will print a live order book view to the terminal with all signal layers, while the dashboard renders the full graphical analytics view.
hft-engine/
├── engine/
│   ├── main.cpp                 # Entry point: orchestrates all components
│   ├── orderbook/
│   │   ├── orderbook.hpp        # LOB data structure & JSON serialization
│   │   └── orderbook.cpp
│   ├── analytics/
│   │   ├── analytics.hpp        # SignalState struct: shared data model
│   │   ├── welford.hpp          # Layer 1: Welford online mean/variance
│   │   ├── parkinson.hpp        # Layer 2: Parkinson volatility estimator
│   │   ├── vpin.hpp             # Layer 3: VPIN order flow toxicity
│   │   ├── kalman.hpp           # Layer 4: 1D Kalman filter on OBI
│   │   ├── kyle.hpp             # Layer 5: Kyle's lambda (price impact)
│   │   └── predictive_pdf.hpp   # Layer 6: Location-Scale Student-t PDF
│   ├── publisher/
│   │   └── publisher.hpp        # ZeroMQ PUB socket wrapper
│   └── websocket/
│       ├── wb.hpp               # ixwebsocket client wrapper
│       └── wb.cpp
├── dashboard/
│   ├── dashboard.py             # 8-panel PyQtGraph analytics dashboard
│   └── requirements.txt         # Python dependency list
├── CMakeLists.txt               # Build configuration
├── build.sh                     # Convenience build script
├── vcpkg/                       # C++ package manager (submodule)
└── live_view.png                # Dashboard screenshot
| Model | Paper |
|---|---|
| VPIN | Easley, D., López de Prado, M. M., & O'Hara, M. (2012). Flow Toxicity and Liquidity in a High-Frequency World. Review of Financial Studies. |
| Kyle's Lambda | Kyle, A. S. (1985). Continuous Auctions and Insider Trading. Econometrica, 53(6), 1315–1335. |
| Parkinson Vol | Parkinson, M. (1980). The Extreme Value Method for Estimating the Variance of the Rate of Return. Journal of Business. |
| Kalman Filter | Kalman, R. E. (1960). A New Approach to Linear Filtering and Prediction Problems. Journal of Basic Engineering. |
| Welford's Algorithm | Welford, B. P. (1962). Note on a Method for Calculating Corrected Sums of Squares and Products. Technometrics. |
| Student-t Microstructure | Cont, R. (2001). Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance, 1(2), 223–236. |
| Incomplete Beta (CDF) | Press, W. H. et al. (2007). Numerical Recipes: The Art of Scientific Computing (3rd ed.) §6.4. Cambridge University Press. |
C++ for speed · Python for insight · ZeroMQ for everything in between.