You're Quoting Symmetrically. Your Flow Isn't.
Inventory-aware skew, symmetry drag, and the alpha hiding in your bridge configuration.
Symmetric quoting is the zero-information special case. Once a broker's book carries measurable inventory, client-tier, volatility, and hedge-cost information, continuing to quote symmetrically is an unmanaged risk-transfer decision.
Most retail brokers are running a static markup configuration that hasn't been touched in months, sometimes years. The bid markup equals the ask markup. The same numbers apply at the Asia open and at 8:30 a.m. in New York. The same numbers apply whether the book is flat, long fifty lots of gold, or short five hundred lots of EURUSD. The quoting policy is symmetric in time, symmetric in inventory, and symmetric in counterparty. The flow is none of those things.
Symmetric quoting is the zero-information special case. It is optimal in efficient markets — or close to optimal in real markets — when inventory is zero, flow is symmetric across sides, quote sensitivity is symmetric across clients, volatility is stable, hedge cost is negligible, and client flow is uninformed. In production, none of these conditions hold for more than a few minutes at a time. Once a broker has measurable inventory, client-tier signal, volatility regime, and hedge-cost information, continuing to quote symmetrically is no longer a sensible default. It is an unmanaged risk-transfer decision.
This is what I call symmetry drag. The narrow form is internalization PnL leakage — captured spread that should have been retained, externalization cost that should have been avoided. The broader form is the combined cost of failing to condition quote and routing policy on inventory, flow composition, volatility, and hedge cost: unnecessary capital usage, elevated inventory variance, missed natural-internalization opportunities, quote information leakage to sophisticated counterparties, and the slow-bleed conduct risk of a system that responds to nothing.
Operators almost never attribute it correctly. They blame predatory clients. They blame their LPs. They blame slippage. The real cause is upstream of all of those: the broker is broadcasting the same prices to informed and uninformed flow, in every inventory state, at every volatility regime, and then expressing surprise when realized PnL underperforms theoretical edge.
In many cases, the operator has been sold a "C-book" or "hybrid" risk solution that is, under the hood, either static B-booking with a monthly PDF report, or A-book routing to an LP that rebates volume back to the broker. Both are valid business models when disclosed. Neither is inventory-aware market-making. The operator thinks they bought a controller. They bought a narrative — complex enough to discourage hard questions, and simple enough under the hood to be static B-booking.
The good news is that the first-principles controller has been understood in the academic literature for over forty years. The bad news is that "first-principles controller understood" and "production-grade broker implementation" are separated by a wall of engineering, calibration, and operational governance work that most retail brokers don't have the in-house capability to build, scale, and maintain. That gap is the entire point of this essay.
The Risk Manager's copy-pasta special
Take a broker quoting XAUUSD with a 30-point markup on either side of the LP mid, where one point is $0.01. If gold mid is 4720.00, the broker shows clients a bid of 4719.70 and an ask of 4720.30. Spread is 0.60. Markup is symmetric: 0.30 above, 0.30 below.
This is the default configuration on essentially every retail bridge I've ever opened. It is the configuration shipped by the vendor. It is the configuration the broker's risk manager inherited from a sample setup three years ago, or copy-pasted after looking at what other brokers were marketing for their trading conditions.
Now suppose a burst of correlated client sell flow has just left the broker long fifty lots of gold against the internalized book. Under the standard retail CFD contract size (1 lot = 100 oz), that is 5,000 oz of gold exposure: roughly $23.6M of notional at a spot price around $4,720. The reason for the inventory matters and belongs in the toxicity/adverse-selection layer, but for the inventory layer the position itself is the immediate signal.
Under symmetric quoting, the broker continues to show 4719.70 / 4720.30. The next client to lift the offer shrinks the broker's long position. The next client to hit the bid grows it. Both events happen at the same prices. The market does not care that the broker is now carrying 5,000 oz of inventory risk. The flow does not self-correct. The result is not neutrality — it is an unpriced transfer of risk capacity to whichever side of the flow happens to arrive next.
This is the moment where most operators reach for external hedging — externalize the position to the LP, reduce inventory to a sensible level of VaR, and crystallize the hedge economics at the current external price. That can work, but it leaves money on the table. Externalizing through the LP costs the LP spread, the LP commission, market impact on the hedge, and any reject/last-look risk. If offsetting client flow arrives before adverse mid-price movement dominates the spread captured, the broker can capture the client-side spread without paying any of those costs. The tradeoff is precisely the inventory risk carried while waiting for that offsetting flow. The question is how to make that offsetting flow more likely to arrive — without rejecting orders or quoting prices that look obviously off-market.
The answer is to skew. The right way to skew has been worked out in the academic literature in three waves, and the most directly applicable formulation is from 2023.
Three papers and forty years of not having to reinvent the wheel
The foundational paper on inventory-driven dealer pricing is Ho and Stoll (1981), published in the Journal of Financial Economics. Working in a continuous-time stochastic dynamic-programming framework — Poisson jump processes for transaction arrivals plus diffusion for return uncertainty — they showed that a dealer maximizing expected utility of terminal wealth optimally adjusts bid and ask quotes in response to inventory: long inventory shifts the dealer's reservation price downward (incentivizing client buying), short inventory shifts it upward. This was the original framework, predating the modern Avellaneda-Stoikov ("AS") reformulation by twenty-seven years. It is the first landmark result, and the underlying intuition has not been overturned in four decades of follow-up work.
The second wave is Avellaneda and Stoikov (2008), published in Quantitative Finance. They reformulated the Ho-Stoll problem in continuous time on a limit order book, with a Brownian mid-price and Poisson order arrivals whose intensity decays exponentially with distance from mid. Under exponential utility, they derive a Hamilton-Jacobi-Bellman ("HJB") system whose finite-horizon approximation gives the formulas practitioners actually use: a reservation price (mid shifted by an inventory penalty) and a half-spread (a markup around the reservation price). The simple displayed formulas are the approximation, not the full HJB solution; the approximation is what gets implemented because it is closed-form and tractable. AS does not, by itself, model informed flow — adverse selection enters as an explicit object only in later extensions.
The third wave — the one most directly relevant to a retail broker — is Barzykin, Bergault, and Guéant (2023), published in Mathematical Finance. The first two waves model a pure internalizer: a dealer who only sets quotes and waits for clients. But real dealers, including FX brokers, have a second tool: they can externalize inventory through an LP or interdealer market at any time, paying transaction costs and market impact for the privilege. Barzykin et al. extend the framework to a dealer with both options, and their central result is the one that matters for any operator running a hybrid book:
Within a certain inventory range, the dealer optimally internalizes the flow by adjusting quotes (skewing). Outside that range, the dealer optimally externalizes by hedging in the interdealer market. The size of the range scales with the depth of the dealer's franchise.
My street-Ph.D. translation: the band is sized by your balance-sheet capacity and the depth of your client franchise — essentially, how much inventory you can warehouse before the cost of carrying it exceeds the value of waiting for your flow to naturally offset it. Inside the band, don't just warehouse; skew aggressively to pull in the offsetting side. Outside the band, the position is too large for quote adjustments to clear it before drift eats you — externalize.
This is the conceptual controller every serious hybrid-book broker should be approximating, even if the production implementation is necessarily messier than the academic model. Barzykin et al. derive the bands and the optimal skew jointly as a coupled quoting-and-hedging control. In production, you usually end up running them as separate-but-coupled subsystems because of how bridge architectures are built — the hedging side often becomes routing percentages, pass-through rules, or external hedge schedules. That is an implementation proxy, not the model itself.
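A minimal sketch of the band logic, under stated assumptions. `band_lots` is a hypothetical, pre-calibrated band half-width, and hedging the whole excess back to the band edge in one step is a simplification chosen for illustration. The academic model derives the band and the hedging behavior jointly; this sketch does not attempt that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandDecision:
    action: str        # "skew" inside the band, "externalize" outside it
    hedge_lots: float  # signed lots to trade with the LP (negative = sell)


def band_controller(inventory_lots: float, band_lots: float) -> BandDecision:
    """Skew inside the band; hedge the excess back to the band edge outside it.

    band_lots is an assumed, externally calibrated half-width (hypothetical).
    """
    if band_lots <= 0:
        raise ValueError("band_lots must be positive")
    if abs(inventory_lots) <= band_lots:
        # Inside the band: no external hedge, quote skew does the work.
        return BandDecision("skew", 0.0)
    excess = abs(inventory_lots) - band_lots
    # Long inventory is reduced by selling to the LP, short by buying.
    sign = -1.0 if inventory_lots > 0 else 1.0
    return BandDecision("externalize", sign * excess)
```

With a hypothetical 80-lot band, a 50-lot long stays in skew mode, while a 120-lot long produces a 40-lot sell to the LP.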
There is also a 2025 extension from the same authors with Lemmel that adds adverse selection — explicit modeling of informed flow and the information leakage that comes from your own quotes revealing your inventory direction. I won't expand on it here, but it is the right reference for anyone who wants to take per-account markout scoring beyond heuristic post-trade analytics. It formalizes the two effects production toxicity layers usually approximate by hand: informed-flow adverse selection and price-reading by sophisticated counterparties.
The empirical grounding for all of this in the FX context is Butz and Oomen (2019) in Quantitative Finance, who study real internalization behavior of electronic FX spot dealers using a queuing-theory model calibrated with BIS and market-volume data. Under their assumptions, the EURUSD internalization horizon for a representative tier-1 dealer is estimated at 1.39 minutes; typical liquid-G10 horizons are in the single-digit minutes; less-liquid pairs can extend to tens of minutes. That number is one answer to the operator's question, "how long do I need to hold inventory before client flow naturally clears it?", and it is much shorter than most retail brokers assume. A retail CFD broker should not import the EURUSD figure directly: gold, indices, exotic crosses, and crypto behave differently, and the right horizon needs to be estimated per symbol, per session, and per client tier from your own data.
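One rough way to get a first per-symbol horizon from your own data is a mean-reversion half-life on sampled inventory. This is not the queuing model used in the paper; it is a swapped-in AR(1) heuristic, and the function name and sampling interval are assumptions for illustration.

```python
import math


def inventory_half_life(inventory, dt_seconds):
    """Half-life (seconds) of inventory under an AR(1) fit q[t+1] ~ phi * q[t].

    Heuristic only: least squares through the origin on evenly sampled
    inventory. Not the queuing-theory estimator from the literature.
    """
    x = inventory[:-1]
    y = inventory[1:]
    denom = sum(a * a for a in x)
    if denom == 0:
        raise ValueError("inventory series is identically zero")
    phi = sum(a * b for a, b in zip(x, y)) / denom
    if not 0 < phi < 1:
        raise ValueError("no stable mean reversion in this sample")
    return math.log(0.5) / math.log(phi) * dt_seconds
```

Running this separately per symbol, session, and client tier gives a crude starting point that can be checked against the published horizons rather than imported from them.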
The algebra of not being flat
The commonly used AS finite-horizon approximation produces a reservation price and a half-spread:

$$r(s, q, t) = s - q\,\gamma\,\sigma^2\,(T - t)$$

$$\delta = \frac{1}{\gamma}\ln\!\left(1 + \frac{\gamma}{k}\right) + \frac{1}{2}\,\gamma\,\sigma^2\,(T - t)$$

Where:
- $s$ is the LP mid at time $t$
- $q$ is signed inventory (positive long, negative short)
- $\gamma$ is risk aversion
- $\sigma$ is instantaneous volatility of the underlying
- $T$ is the relevant risk horizon; in production this is typically a session close, rollover window, event horizon, or model-reset horizon, not necessarily the literal end of a trading day
- $k$ is the order-arrival intensity decay (governs how price-sensitive client flow is)
Quotes are then $p^b = r - \delta$ and $p^a = r + \delta$.
The first term in $\delta$, the half-spread, is the execution-intensity component: the markup that emerges from balancing per-fill markup against fill probability under the exponential arrival model. It is not an explicit adverse-selection term; AS does not model informed flow. The second term is the inventory-and-volatility component: the additional widening that compensates the dealer for carrying inventory through residual price risk.
When you are long ($q > 0$), the inventory term is positive and gets subtracted, so $r < s$: your reservation price sits below LP mid, biasing both quotes downward. Volatility and time-remaining widen the half-spread; calm markets near horizon close compress it. The Guéant-Lehalle-Fernandez-Tapia (2013) closed-form approximation made this practically computable and is one of the standard production-friendly templates that real systems wrap in empirical calibrations, guardrails, and routing logic.
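As a sketch, the finite-horizon approximation can be written directly in code. The formulas implemented are the standard AS ones: reservation price r = s − q·γ·σ²·(T − t) and half-spread δ = (1/γ)·ln(1 + γ/k) + ½·γ·σ²·(T − t). The function names are illustrative, and `tau` stands for the remaining horizon T − t.

```python
import math


def reservation_price(mid, q, gamma, sigma, tau):
    """Mid shifted against signed inventory q; tau is remaining horizon T - t."""
    return mid - q * gamma * sigma ** 2 * tau


def half_spread(gamma, sigma, tau, k):
    """Execution-intensity term plus inventory-and-volatility term."""
    intensity_term = math.log(1 + gamma / k) / gamma
    inventory_vol_term = 0.5 * gamma * sigma ** 2 * tau
    return intensity_term + inventory_vol_term


def quotes(mid, q, gamma, sigma, tau, k):
    """Return (bid, ask) centered on the reservation price."""
    r = reservation_price(mid, q, gamma, sigma, tau)
    d = half_spread(gamma, sigma, tau, k)
    return r - d, r + d
```

Note that in this approximation inventory moves only the center of the quote; the spread depends on γ, σ, the remaining horizon, and k, not on q.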
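To put numbers on the example, suppose:
- $s = 4720.00$ (LP mid for XAUUSD)
- $q = +50$ lots (5,000 oz)
- $\gamma\,\sigma^2\,(T - t) = 0.006$ per lot (calibrated number absorbing risk aversion, realized vol, and remaining horizon)
- Execution-intensity half-spread (the first term in $\delta$) = 0.30
- Inventory-and-volatility half-spread contribution (the second term) = 0.10, set here as a separately calibrated input
Then $r = 4720.00 - 50 \times 0.006 = 4719.70$ and $\delta = 0.30 + 0.10 = 0.40$, which gives: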
| Quote | Bid | Ask | Spread |
|---|---|---|---|
| Static symmetric | 4719.70 | 4720.30 | 0.60 |
| Inventory-skewed | 4719.30 | 4720.10 | 0.80 |
| Change vs static | −0.40 | −0.20 | +0.20 |
The center of the quote shifted downward by 0.30, and the spread widened from 0.60 to 0.80. The ask is 0.20 better than baseline — clients buying gold from this broker right now are getting tighter prices than they would get from a competitor running static spreads. The bid is 0.40 worse than baseline. The asymmetry of the move — ask tighter, bid much wider — is the entire mechanism. The broker is biasing arrival probability toward client buying (which shrinks the long position) and against client selling (which would grow it). The wider spread compensates the broker for carrying inventory through residual volatility while the rebalance happens.
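The table can be reproduced in a few lines. This is a sketch with the example's calibrated inputs passed in directly; the per-lot shift of 0.006 is back-solved from the 0.30 center shift over 50 lots, and the function name is illustrative.

```python
def skewed_quote(mid, inventory_lots, shift_per_lot,
                 exec_half_spread, inventory_half_spread):
    """Return (bid, ask) rounded to the 0.01 point size used in the example."""
    reservation = mid - inventory_lots * shift_per_lot
    half = exec_half_spread + inventory_half_spread
    return round(reservation - half, 2), round(reservation + half, 2)


# Static symmetric baseline: flat inventory, no inventory widening.
static_bid, static_ask = skewed_quote(4720.00, 0, 0.006, 0.30, 0.0)

# Inventory-skewed quote: long 50 lots, extra 0.10 of half-spread.
skew_bid, skew_ask = skewed_quote(4720.00, 50, 0.006, 0.30, 0.10)

print(static_bid, static_ask, skew_bid, skew_ask)
```

The two calls correspond to the "Static symmetric" and "Inventory-skewed" rows of the table.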
Notice what is not happening: the broker is not refusing trades. The controller still needs hard bounds against the LP top of book, peer/venue benchmarks, and reasonable price-deviation limits, so production systems should not rely on intuition for "looks off-market" — they need explicit price-reasonability rules. Within those bounds, the math sets the skew.
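An explicit price-reasonability rule might look like the following sketch. The LP top-of-book inputs, the `max_dev` tolerance, and the choice to clamp with a flag rather than reject outright are all assumptions for illustration; a real policy would take its limits from the firm's documented execution policy.

```python
def enforce_price_bounds(bid, ask, lp_bid, lp_ask, max_dev):
    """Clamp controller quotes to within max_dev of the LP top of book.

    Returns (bid, ask, was_clamped). Raises if the LP book or the
    resulting client quote is locked or crossed.
    """
    if lp_bid >= lp_ask:
        raise ValueError("LP book is locked or crossed")
    clamped_bid = min(max(bid, lp_bid - max_dev), lp_bid + max_dev)
    clamped_ask = min(max(ask, lp_ask - max_dev), lp_ask + max_dev)
    if clamped_bid >= clamped_ask:
        raise ValueError("controller produced a locked or crossed quote")
    was_clamped = (clamped_bid != bid) or (clamped_ask != ask)
    return clamped_bid, clamped_ask, was_clamped
```

The `was_clamped` flag is there so that every intervention can be logged and reviewed rather than silently absorbed.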
This is the part most operators miss. Skew is not a way of charging clients more. It is a way of pricing inventory-improving flow more competitively and inventory-aggravating flow less competitively, while preserving a coherent risk budget. The natural-hedge cohort — clients whose flow tends to reduce the dealer's risk in the current state — receives consistently competitive quotes because their flow is reducing the dealer's risk. The risk-aggravating cohort — clients whose flow tends to add to the dealer's existing exposure — receives less competitive quotes because their flow is consuming risk capacity. For clients trading in the inventory-reducing direction at any given moment, execution can be better than under static symmetric quoting. Whether aggregate client experience improves is an empirical question that depends on flow mix and should be measured directly, not assumed.
When skew fails, route
Skew is probabilistic. It changes arrival intensities, not outcomes. A determined client can still hit your bid and grow your inventory regardless of how wide you've made it. Routing is deterministic. You need both because probability alone won't get you back to flat.
This is exactly the externalization-band result of Barzykin, Bergault, and Guéant (2023). When inventory exceeds the optimal internalization band, the dealer externalizes — pushes incremental inventory-worsening flow out to the LP rather than internalizing it.
The control logic is straightforward: as $|q|$ grows, the proportion of inventory-worsening flow that routes to the LP increases. At small inventory, you internalize aggressively. At large inventory, incremental inventory-worsening flow is increasingly externalized, leaving the broker with markup net of LP spread, commission, slippage, and reject/fill risk. That is still better than letting inventory grow further while waiting for skew to clear it.
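One simple shape for that relationship is a linear ramp between two inventory thresholds. The linear form and the `soft_limit` / `hard_limit` names are assumptions for illustration, not something the literature prescribes.

```python
def externalize_fraction(inventory_lots, soft_limit, hard_limit):
    """Share of inventory-worsening flow to route to the LP, in [0, 1].

    Below soft_limit everything is internalized; above hard_limit
    everything inventory-worsening is externalized; linear in between.
    """
    if not 0 <= soft_limit < hard_limit:
        raise ValueError("require 0 <= soft_limit < hard_limit")
    size = abs(inventory_lots)
    if size <= soft_limit:
        return 0.0
    if size >= hard_limit:
        return 1.0
    return (size - soft_limit) / (hard_limit - soft_limit)
```

The fraction applies only to flow that would grow the position; inventory-reducing flow stays internal regardless.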
When I first typed out an example of this before writing the article, I got the routing direction backwards in the text (I described the wrong leg getting externalized), and it took putting the example on the whiteboard to catch the inversion. The takeaway is twofold: 1) the price logic and the routing logic are not the same thing, and both have to be wired up correctly; and 2) production risk controllers need guardrails, not just intuition. When you are long and want to mean-revert, you keep the client buys (each one shrinks you) and externalize the client sells (each one would grow you). When you are short, you flip. When inventory is flat and the client tier carries no directional or adverse-selection signal, the inventory component collapses back toward symmetric quoting, though tier pricing, volatility regime, and adverse-selection controls may still move the quote surface.
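Because the direction is so easy to invert, it is worth pinning it down in code with assertions beside it. A minimal sketch, assuming `client_side` is expressed from the client's perspective and that flat inventory defaults to internalizing:

```python
def route_by_inventory(client_side, inventory_lots):
    """Return "internalize" or "externalize" for one client order.

    A client buy means the broker sells, which lowers broker inventory.
    A client sell means the broker buys, which raises it.
    """
    if client_side not in ("buy", "sell"):
        raise ValueError("client_side must be 'buy' or 'sell'")
    inventory_change = -1 if client_side == "buy" else 1
    # The order worsens inventory when it pushes further from flat.
    worsens = inventory_lots * inventory_change > 0
    return "externalize" if worsens else "internalize"


# Guardrail checks for the direction described in the text.
assert route_by_inventory("buy", 50) == "internalize"
assert route_by_inventory("sell", 50) == "externalize"
assert route_by_inventory("buy", -50) == "externalize"
assert route_by_inventory("sell", -50) == "internalize"
```

This covers only the inventory layer; tiering and adverse-selection rules would sit on top of it.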
The third layer, which I won't fully expand here, is per-account markout-based scoring. Not all flow is the same. Flow that consistently predicts post-trade price movement is informed flow, and you want to externalize it whether or not you have inventory. Flow that doesn't is uninformed, and you can internalize it cheaply. Modern bridges expose Market Impact Analysis or equivalent post-trade tick-yield analytics that let you decompose your client base into adverse-markout tiers, and the skew-and-route logic gets parameterized per tier: wider quotes and aggressive externalization for accounts with consistently adverse markout, tighter quotes and aggressive internalization for the risk-reducing cohort. Operators call this "toxic flow" management; the academic objects are adverse selection and price-reading, formalized in Barzykin et al. (2025).
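A heuristic version of markout scoring can be sketched as follows. The fill-record fields (`account`, `side`, `fill_price`, `mid_after`) are hypothetical, and a single markout horizon is assumed; production scoring usually looks at several horizons and needs enough fills per account to be meaningful.

```python
from collections import defaultdict
from statistics import mean


def markout(side, fill_price, mid_after):
    """Signed markout from the client's perspective.

    Positive means the market moved in the client's favor after the fill,
    which is adverse for a broker that internalized the trade.
    """
    sign = 1 if side == "buy" else -1
    return sign * (mid_after - fill_price)


def average_markout_by_account(fills):
    """Average client-favorable markout per account from fill records."""
    buckets = defaultdict(list)
    for fill in fills:
        buckets[fill["account"]].append(
            markout(fill["side"], fill["fill_price"], fill["mid_after"])
        )
    return {account: mean(values) for account, values in buckets.items()}
```

Accounts with persistently positive averages are candidates for the adverse-markout tier; the thresholds themselves are a policy decision.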
Conduct, governance, and price reasonability
This is not a license to use client-specific pricing as a conduct bypass. In any regulated retail CFD or rolling-spot environment — and increasingly in any jurisdiction at all — tiering logic has to sit inside the firm's documented execution policy, fair-value framework, and conflict-of-interest controls. The defensible posture is narrow:
- Tiers are defined on objective, measurable, auditable state variables: realized markout, fill quality, volume, latency, instrument behavior. Not protected characteristics. Not commercial retaliation. Not ad hoc operator judgment.
- Quote and routing decisions sit inside pre-trade price-reasonability bounds anchored to the LP top of book and peer/venue references, with hard rejection limits if the controller produces an out-of-policy price.
- The controller's outputs, parameter changes, and tier assignments are logged, versioned, and reviewable. Model governance applies the same way it would to any other risk model.
- Periodic testing verifies that the policy is managing inventory and adverse-selection risk without producing systematically unfair client outcomes at the cohort or aggregate level. "Did our risk-reducing clients get better prices on average than our risk-aggravating clients?" is a defensible test; "did our profitable clients get worse prices than our unprofitable clients?" is not.
The substantive distinction is between using inventory and markout as state variables for risk management — which is what every credible market maker has done since 1981 — and using client P&L as a pricing input, which is something else entirely. The first is simply following best practices. The second is a conduct problem dressed up as a quant problem. The framework I'm describing here is the first; any implementation that drifts toward the second has stopped being market-making and started being something a regulator will name correctly when they find it.
What actually breaks in production
It is worth being honest about what the canonical models don't handle, because anyone running this in production will hit these walls within the first month.
Mid-price dynamics are wrong by assumption. AS assumes the LP mid follows arithmetic Brownian motion. Real FX and metals mids have jumps, intraday seasonality, and event-driven discontinuities (NFP, CPI, FOMC, PMI). Fodra and Labadie (2012) extended the framework to non-martingale mid-price processes and added a directional-bet term. In production, you handle this with regime-switching volatility models and event-aware parameter rotation, not by trusting the closed-form approximation as written.
Order arrivals are not Poisson. Real flow clusters. Hawkes-process extensions exist, but most production systems just use a moving-window estimator for the arrival intensity and accept the approximation error.
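A moving-window intensity estimator of the kind described is only a few lines. This sketch assumes timestamps arrive in non-decreasing order, in seconds, and that `rate` is called with a current time no earlier than the last recorded fill; the class name is illustrative.

```python
from collections import deque


class RollingIntensity:
    """Arrivals per second over a trailing window of fixed length."""

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window = window_seconds
        self.times = deque()

    def record(self, timestamp):
        """Register one order arrival at the given time (seconds)."""
        self.times.append(timestamp)

    def rate(self, now):
        """Drop arrivals older than the window and return the current rate."""
        while self.times and self.times[0] <= now - self.window:
            self.times.popleft()
        return len(self.times) / self.window
```

Keeping one estimator per symbol and side lets the controller see when one side's arrival rate diverges from the other, at the cost of ignoring the clustering a Hawkes model would capture.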
Risk aversion is not identifiable from data. The parameter $\gamma$ is a free choice in theory and a calibration headache in practice. Cao, Šiška, Szpruch, and Treetanthiploet (2024) gave a regret upper bound of order $\ln^2 T$ for learning the price-sensitivity parameter in the ergodic AS setting, but $\gamma$ itself is typically tuned operationally: you set it tighter when capital is constrained and looser when it's flush, and you accept that it doesn't map cleanly onto a measurable utility function.
The retail context is not the textbook context. AS was designed for a market maker on a public limit order book. Retail brokers quote into captive client populations with their own conduct, gap-risk, margin-liquidation feedback, last-look policy, and regulatory constraints. Barzykin-Bergault-Guéant (2023) closes most of the LOB-vs-dealer gap by adding the externalization option and tiered pricing ladders, but the per-client customization, conduct controls, and platform-level constraints that every real FX dealer runs with are their own engineering problems on top of the math.
None of these are fatal. All of them require ongoing maintenance, not a one-time configuration. A skew system that is set up once and abandoned will degrade as flow mix, volatility regime, LP quality, and client behavior change. The question is not whether it needs maintenance; the question is how quickly drift becomes material — and that depends on the book.
What your bridge needs to expose
For any of this to be operable, the underlying execution infrastructure has to give you the right actuators. At minimum:
- On-the-fly markup updates, ideally via API. Update latency should be commensurate with the product: milliseconds for fast symbols and event windows, seconds for slower symbols. Session-boundary config reloads are not a serious control surface.
- Per-symbol, per-account-cohort skew parameters. Not just a single bid/ask offset, but separate offsets per cohort, with an override mechanism for individual accounts.
- Real-time inventory readouts at the asset class and underlying level, not just per-symbol. Gold inventory and silver inventory move together; the controller needs to see the cluster, not the leaves.
- Programmable LP-routing percentages that can be tied to inventory state or external signals.
- Per-account inventory limits, with both pass-through and reject modes. Pass-through routes overflow to the LP, reject blocks the trade. Both have their place.
- Pre-trade price-reasonability bounds anchored to the LP top of book and peer/venue references.
- Post-trade tick-yield analytics, exportable to whatever markout-scoring system you build.
- Action scheduling for session-based and event-based parameter rotation. You want different parameters at NY open than at Asia open; you want different parameters during NFP than on a quiet Wednesday.
- Logging and audit trail sufficient to support model governance, compliance review, and post-incident reconstruction.
If your current "hybrid" solution cannot do the above in real time, you are not running a hybrid book. You are running a B-book with a dashboard. The better multi-LP bridge stacks expose much of this. If yours doesn't, the missing actuator becomes part of the implementation cost. Trying to run an inventory-aware skew system on a bridge that doesn't have the right actuators is like trying to drive a car with no steering wheel. The migration to a capable bridge is upstream of the quantitative work, and skipping it doesn't save you money — it just means you'll spend the quantitative budget building things your bridge should have shipped with.
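The action-scheduling actuator from the list above can be sketched as a profile lookup. The session boundaries below are hypothetical UTC placeholders, not a statement about real market hours, and the profile names are illustrative.

```python
from datetime import time

# Hypothetical session table in UTC: (start, end, profile name).
SESSIONS = [
    (time(0, 0), time(7, 0), "asia"),
    (time(7, 0), time(12, 0), "london"),
    (time(12, 0), time(21, 0), "new_york"),
]


def active_profile(now, event_window=False):
    """Pick the parameter profile for the current time of day.

    An active event window (for example, a scheduled data release)
    overrides the session profile.
    """
    if event_window:
        return "event"
    for start, end, name in SESSIONS:
        if start <= now < end:
            return name
    return "default"
```

Each profile name would map to a versioned parameter set, so that every rotation shows up in the audit trail.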
What to do if you suspect this applies to you
If your markup configuration hasn't been updated since the last time you onboarded a new LP, and if your last conversation about adverse markout ended with "we'll just hedge the whole thing for now," there is almost certainly meaningful execution edge hiding in your current setup. The size of that edge depends on your inventory turnover, your client mix, and the volatility profile of your symbols. In stale static-markup environments, a low-double-digit percentage improvement in internalization economics is a plausible target in some books, but the right number only comes from fill-level diagnostics and shadow-mode validation. I would underwrite the opportunity as a testable hypothesis rather than a promise. If your last risk review ended with a vendor showing you a "C-book allocation report" that turned out to be 90% static B-book, the diagnostic starts with verifying which of the actuators above you actually have.
The diagnostic is the right starting point, but it has to be done correctly. A naive replay of historical fills is biased: different quotes change arrival probabilities, fill mix, reject behavior, and client response, so you cannot just rerun your flow with new quote parameters and trust the output. The right framing is off-policy evaluation. You need fill data, quote history, order attempts, markouts, LP execution outcomes, and a calibrated model of how client flow responds to quote changes. Shadow-mode pricing — running the proposed controller live alongside the existing one without actually executing on the new quotes — gives the cleanest validation before any production routing changes.
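Shadow mode is mostly a plumbing discipline: compute the proposed quote, log it next to the live one, and publish only the live one. A minimal sketch, in which the record fields and the two quoter callables are assumptions for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowRecord:
    ts: float
    inventory_lots: float
    live_bid: float
    live_ask: float
    shadow_bid: float
    shadow_ask: float


def shadow_step(ts, mid, inventory_lots, live_quoter, shadow_quoter, log):
    """Log live and shadow quotes side by side; return only the live quote.

    Each quoter is a callable (mid, inventory_lots) -> (bid, ask).
    The shadow quote is recorded for analysis and never published.
    """
    live_bid, live_ask = live_quoter(mid, inventory_lots)
    shadow_bid, shadow_ask = shadow_quoter(mid, inventory_lots)
    log.append(ShadowRecord(ts, inventory_lots,
                            live_bid, live_ask, shadow_bid, shadow_ask))
    return live_bid, live_ask
```

The resulting log, joined with order attempts and markouts, is the raw material for the off-policy evaluation; the log alone does not tell you how flow would have responded to the shadow quotes.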
The reframe worth taking away is this: the optimal broker does not choose between A-book and B-book. It runs a state-dependent market-making controller. Skew inside the internalization band. Externalize outside it. Parameterize by client-tier markout and price sensitivity. Suppress quote behavior that leaks inventory direction to sophisticated counterparties. Recalibrate continuously as regimes shift. All of it inside a documented governance framework with audit trail, price-reasonability bounds, and periodic fairness review.
The claim is not that AS or BBG can be dropped into a bridge and magically improve broker PnL. The claim is narrower: inventory, volatility, hedge cost, and client markout are observable state variables, and quoting and routing policies that ignore them are leaving a measurable, governable control problem unmanaged. The reason most operators don't run this is not disbelief. It's that the quant team, the engineering team, the ops team, and compliance are four different teams, and the risk-management vendor claiming to own the controller is a fifth. Nobody owns the full stack.
The scarce skill was never the HJB equation; plenty of quants can derive that. The scarcity is in building the interdisciplinary iteration loop: the feedback between inventory state, quote surface, routing policy, and post-trade markout. Code the controller, yes, but also build the governance, audit trail, price-reasonability bounds, regime-switching schedule, and a risk budget that matches your balance-sheet reality. The broker that treats this as a one-time config change will see the edge decay in months. The broker that treats it as a living system will compound it. Closing that ownership gap is what I do.
This article is about execution-system design and market-making controls, not investment, legal, or regulatory advice. Retail-broker implementations require jurisdiction-specific compliance review.
References
Avellaneda, M., & Stoikov, S. (2008). High-frequency trading in a limit order book. Quantitative Finance, 8(3), 217–224.
Barzykin, A., Bergault, P., & Guéant, O. (2023). Algorithmic market making in dealer markets with hedging and market impact. Mathematical Finance, 33(1), 41–79.
Barzykin, A., Bergault, P., & Guéant, O. (2023). Market making by an FX dealer: tiers, pricing ladders and hedging rates for optimal risk control. arXiv preprint arXiv:2112.02269 (revised June 2023).
Barzykin, A., Bergault, P., Guéant, O., & Lemmel, M. (2025). Optimal quoting under adverse selection and price reading. arXiv preprint arXiv:2508.20225.
Butz, M., & Oomen, R. (2019). Internalisation by electronic FX spot dealers. Quantitative Finance, 19(1), 35–56.
Cao, J., Šiška, D., Szpruch, Ł., & Treetanthiploet, T. (2024). Logarithmic regret in the ergodic Avellaneda-Stoikov market making model. arXiv preprint arXiv:2409.02025.
Cartea, Á., Jaimungal, S., & Penalva, J. (2015). Algorithmic and high-frequency trading. Cambridge University Press.
Fodra, P., & Labadie, M. (2012). High-frequency market-making with inventory constraints and directional bets. arXiv preprint arXiv:1206.4810.
Guéant, O., Lehalle, C.-A., & Fernandez-Tapia, J. (2013). Dealing with the inventory risk: a solution to the market making problem. Mathematics and Financial Economics, 7(4), 477–507.
Ho, T., & Stoll, H. R. (1981). Optimal dealer pricing under transactions and return uncertainty. Journal of Financial Economics, 9(1), 47–73.