Machine Learning Applications in DEX Aggregation and Smart Order Routing

Introduction

This article explores Deeplink’s ongoing research into machine learning approaches and their applications in both smart order routing and DEX aggregation. A brief overview of preliminary concepts is provided, along with a detailed explanation of how this field of study applies to the problems at hand; finally, a collection of related works is investigated.

In previous articles we have outlined both smart order routing and DEX aggregation in some depth; to fully grasp the necessity for pathfinding algorithms in enhancing these spaces, we recommend reading those articles. However, a brief summary of each is also provided here, along with a few important preliminary concepts.

Smart Order Routing


Smart order routing (SOR) is an automated process in which orders on exchanges are handled with the intent of attaining the most desirable path across trading venues. In a DEX, this generally takes the form of finding the optimal path of swaps across a set of liquidity pools in order to take advantage of the liquidity depth of those pools and mitigate the effects of fragmented liquidity. The primary cause for concern regarding this liquidity fragmentation is negative slippage: losses incurred on account of a change in spot price in the time between an order being placed and that order being executed.

DEX Aggregation


A DEX is a trading venue consisting of a set of liquidity pools that facilitate the exchange of assets without a central authority or the need for users to forfeit custody over their assets. DEXs are prone to the aforementioned issues of liquidity fragmentation, as their asset pairs are segmented into liquidity pools; as more and more venues arise, the overall market liquidity becomes increasingly thinly spread.

DEX aggregators expose traders to more liquidity than any one DEX could provide by aggregating and connecting the services of multiple DEXs. A DEX aggregator can be thought of analogously to services like Expedia or Google Flights, which aggregate offerings from numerous airlines into one comparative service, giving users access to the best possible options for their needs.

Artificial Intelligence, Machine Learning, and Deep Learning
0_UGNL_MlM-OOHRBUk.png

Artificial intelligence refers to intelligence demonstrated by machines, where ‘intelligence’ denotes the decision-making capabilities generally associated with biological intelligence. Machine learning is a subset of artificial intelligence comprising algorithms that are able to ‘learn’ from data. Deep learning is a subset of machine learning, originating from the McCulloch-Pitts neuron and Rosenblatt’s perceptron, which models the way in which biological neurons process information.

How Can Machine Learning Improve Smart Order Routers and DEX Aggregators?


In DEX aggregation and SOR, liquidity is key. Allowing your algorithms to uncover deep correlations between liquidity concentration, distribution, and volatility will allow your systems to outperform those which do not consider these factors in such depth.

Machine learning techniques are used broadly across the traditional finance sector, and it is only a natural progression that these beneficial applications should eventually make their way into DeFi. Machine learning is often used in traditional SOR to assess and identify factors pertaining to liquidity and volatility in order to ascertain opportunistic routes, pricing, and order sizing; we believe that many of these practices are directly translatable to DEX aggregation and SOR.

By introducing liquidity indicators via the high-speed, high-granularity data feeds provided by L3 Atom, machine learning models can identify complex correlations between these factors at dimensionalities beyond the capabilities of human traders, and can use these correlations to outperform services that lack such depth of insight. One major area which stands to benefit from this technology is the avoidance of slippage.

Slippage


The nuances and causes of slippage are explored in more depth in our article on smart order routing, but in essence, slippage is the difference between expected and actual execution price when placing an order on a trading venue. This generally occurs when the asset’s price changes between the time an order is placed and the time it is executed. Slippage can be positive or negative, meaning that the price difference can be of either benefit or detriment to the margins of the trader in question; however, negative slippage is generally what is meant when the term is used.

Machine learning techniques can be employed to provide predictive insight into price movements, volatility, and liquidity indicators, all of which play major roles in slippage, and in avoiding it. The following sections outline some areas which can be factored into machine learning-driven predictive analytics and agent-based decision-making algorithms to provide users with optimal trades.

Liquidity Concentration


In the context of DEXs, liquidity concentration refers to the available liquidity within a given liquidity pool. Due to the nature of AMMs in DEXs, slippage from placing large orders is often even more of an issue than in other types of venues, as the liquidity concentration of DEX liquidity pools is generally on the smaller side relative to other sources of liquidity. When swapping tokens in a DEX, you are essentially adding one asset to the liquidity pool while simultaneously removing another; the AMM’s conservation function then automatically rebalances the ratio of these two tokens. This means that in pools with shallow liquidity, the conservation function can overcompensate and cause dramatic price swings, as the proportional ratio is more heavily affected; this is conducive to major slippage. This can be thought of analogously as the fluid displacement of a boat (an order) in a body of water (a liquidity pool); a dinghy in a river will have virtually no impact on the water level, but a yacht in a pool certainly will.
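
To illustrate, the following is a minimal sketch of how a constant-product (x · y = k) conservation function, as used by Uniswap V2-style pools, produces this effect; the reserve figures are illustrative assumptions, and fees are ignored for simplicity.

def swap_output(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Tokens received when adding amount_in to one side of an x * y = k pool (fees ignored)."""
    k = reserve_in * reserve_out
    new_reserve_out = k / (reserve_in + amount_in)  # conservation function rebalances the ratio
    return reserve_out - new_reserve_out

def slippage(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Relative shortfall of the executed price versus the pre-trade spot price."""
    spot_out = amount_in * reserve_out / reserve_in   # output if the ratio never moved
    return 1.0 - swap_output(reserve_in, reserve_out, amount_in) / spot_out

# A 'dinghy' in a deep pool versus a 'yacht' in a shallow one:
print(f"{slippage(5_000_000, 5_000_000, 10_000):.2%}")  # large pool: ~0.20% price impact
print(f"{slippage(50_000, 50_000, 10_000):.2%}")        # small pool: ~16.67% price impact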

Liquidity concentration primarily gives us information on the potential slippage within a single pool. Factoring in the liquidity concentration of individual pools when aggregating and routing through a sea of venues is a crucial element in facilitating optimal trade opportunities. The introduction of this data to machine learning models can open avenues for routing whose implications of liquidity depth are beyond a human trader’s ability to comprehend. This is similar in concept to Balancer’s smart order routing technique of linearizing spot prices in liquidity pools in order to estimate the change in spot price resulting from an order being placed; these predictions are then used in selecting the optimal set of pools to route through. A more detailed breakdown of how this works can be found in our article on automated market makers.

Liquidity Volatility


With the concept of liquidity concentration in mind, we can generalize that larger liquidity pools tend to be less volatile than smaller ones, as each individual trade makes less of an impact on the pool’s overall liquidity. However, liquidity concentration is only one factor in a pool’s volatility (in both price and liquidity). Volatility metrics are a broadly studied econometric subject with much nuance of their own; a pool’s volatility is tied to its liquidity depth, the price volatility of its underlying assets, its volume, its reputability, and many more external and internal factors. Due to its unwieldy and complex nature, volatility is also a topic to which machine learning is particularly suited.

 

One example is ‘Forecasting Volatility of Stock Index: Deep Learning Model with Likelihood-Based Loss Function’, a research article published in Hindawi Complexity in 2021, which employed long short-term memory (LSTM) deep learning techniques to predict the volatility of stock indices. In this project, Fang Jia and Boli Yang fed historical volatility data points into their deep learning models and compared their performance against a popular traditional econometric model known as autoregressive moving average with generalized autoregressive conditional heteroscedasticity (ARMA-GARCH).

The models created were a likelihood-based loss LSTM and deep neural network (DNN), and a mean squared error (MSE) LSTM and DNN (four models in total, compared against ARMA-GARCH). The likelihood-loss LSTM outperformed ARMA-GARCH along with the other deep learning models.

0_Oe336Va6Hi2n3Vwx.png

0_JrCuiWc5n-WsbFth.png

Deep Reinforcement Learning


Deep reinforcement learning (DRL) builds on the foundations of deep learning by further introducing aspects from biology and psychology via the introduction of reinforcement learning techniques. DRL agents learn to solve problems or make predictions by attempting to maximize a cumulative reward, akin to an incentive and penalty model. An example reward function for a DRL agent learning to play the game Snake might reward the agent for closing the distance between the snake’s head and the apple, and penalize the agent for crashing into itself or walls.
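
To make this concrete, here is a minimal sketch of such a reward function; the inputs (distances to the apple, collision and apple flags) are hypothetical names for state that the training loop would supply, not part of any particular library.

def snake_reward(prev_dist: float, new_dist: float, collided: bool,
                 ate_apple: bool) -> float:
    """Illustrative reward shaping for a Snake-playing DRL agent."""
    if collided:          # penalize crashing into walls or the snake itself
        return -100.0
    if ate_apple:         # large reward for reaching the apple
        return 50.0
    # small shaping reward for closing the distance to the apple
    return 1.0 if new_dist < prev_dist else -1.0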

0_9JOpUIUkYMUK_QBU.png
Deep Reinforcement Learning

These techniques can be adapted to fit the DEX aggregation and SOR problem by creating a volatility-prediction LSTM model which uses historical data on the volatility of individual pools. This data would essentially be a time series mapping orders to their respective impact on the market over time within a given pool, creating a continuous historical record of that pool’s volatility in response to orders; such data can be used to estimate the impact a given order would have on the pool’s liquidity, and the potential slippage it may incur. This can also be augmented with historical price movement data to provide the model with a broader view of the system.

Once trained, the model could predictively assess a pool’s volatility by running over recent transaction data, but it could also tap into unfilled transactions by accessing the blockchain’s mempool: a staging area in which transactions wait before being processed and permanently appended to the chain.
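
As a rough illustration of what such a model might look like, the following PyTorch sketch wires an LSTM over windows of per-pool order-impact features; the feature set, layer sizes, loss, and synthetic training data are all illustrative assumptions rather than a tuned design.

import torch
import torch.nn as nn

class PoolVolatilityLSTM(nn.Module):
    """Hypothetical sketch: forecast a pool's next-period volatility from its order-impact history."""
    def __init__(self, n_features: int = 3, hidden: int = 64):
        super().__init__()
        # n_features per time step, e.g. order size, realized price impact, mid-price return
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, n_features) -> predicted next-period volatility
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)

model = PoolVolatilityLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # a likelihood-based loss, as in Jia and Yang, could be swapped in here

# Synthetic stand-in for an (orders -> impact) time series from a single pool.
x = torch.randn(32, 50, 3)   # 32 windows of 50 time steps
y = torch.rand(32)           # next-period realized volatility targets

for _ in range(10):          # minimal training loop
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()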

Liquidity Distribution


If liquidity concentration can be thought of as a body of water, liquidity distribution can be thought of as a system of bodies of water connected by rivers and streams. A user may want to take a bird’s eye view of these individual bodies of water (liquidity pools) and consider them as a network, before deciding which one would best fit their boat (order). It may even be the case that their boat is too big for any of the pools, and would be best broken down into smaller boats and placed across several.

In other words, liquidity distribution gives us insight into potential slippage across a network of liquidity pools. In the context of a DEX (and even more so in a DEX aggregator), it is not hard to see why this information would be of great value when routing orders.

Liquidity distribution is key whenever large transactions are involved, particularly when the number of assets involved is above a threshold at which orders of such quantity would immensely disrupt a pool, or may even exceed the entire liquidity of that pool. In such cases, it would be best for both the trader and the ecosystem at large to disperse that order across multiple channels, so as not to bring any single pool to a grinding halt; a trade of that size would almost certainly cause a price movement incurring unfavorable slippage on the trader’s end, and could cascade into larger negative impacts across the network.
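
A naive sketch of this idea, assuming constant-product pools with illustrative reserves: the order is broken into small chunks, each routed to whichever pool currently offers the best marginal output.

def swap_output(r_in: float, r_out: float, dx: float) -> float:
    """Output of an x * y = k pool for input dx (fees ignored)."""
    return r_out - (r_in * r_out) / (r_in + dx)

def split_order(pools: list[list[float]], order: float, chunks: int = 100) -> list[float]:
    """Greedily allocate the order chunk-by-chunk; mutates pool reserves as it fills."""
    dx = order / chunks
    fills = [0.0] * len(pools)
    for _ in range(chunks):
        # route this chunk to the pool with the highest marginal output
        best = max(range(len(pools)), key=lambda i: swap_output(*pools[i], dx))
        out = swap_output(*pools[best], dx)
        pools[best][0] += dx
        pools[best][1] -= out
        fills[best] += dx
    return fills

pools = [[1_000_000, 1_000_000], [250_000, 250_000], [50_000, 50_000]]
print(split_order(pools, 100_000))  # most flow goes to the deepest pool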

Deep Reinforcement Learning for Smart Order Routers and DEX Aggregators


Using machine learning models to generate predictive metrics for use in more traditional algorithmic approaches has proven to be quite an effective solution; for instance, such models could be used to generate the weights connecting liquidity pool nodes as described in our article on pathfinding algorithms for DEX aggregators and SOR.
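
As a hedged sketch of how predicted metrics could plug into such a pathfinding approach, the snippet below treats pools as weighted edges between token nodes and runs Dijkstra’s algorithm over the predicted costs; predict_cost is a hypothetical stand-in for a trained model, and the pool names and cost figures are illustrative.

import heapq
from collections import defaultdict

def predict_cost(pool_id: str, order_size: float) -> float:
    """Stand-in for an ML model estimating slippage plus fees for routing through a pool."""
    return {"ETH-USDC": 0.003, "ETH-DAI": 0.005, "DAI-USDC": 0.002}[pool_id]

def best_route(edges, src, dst, order_size):
    graph = defaultdict(list)
    for pool_id, a, b in edges:
        w = predict_cost(pool_id, order_size)   # model output becomes the edge weight
        graph[a].append((b, w))
        graph[b].append((a, w))
    # Dijkstra over predicted costs
    dist, heap = {src: 0.0}, [(0.0, src, [src])]
    while heap:
        d, node, path = heapq.heappop(heap)
        if node == dst:
            return d, path
        for nxt, w in graph[node]:
            if d + w < dist.get(nxt, float("inf")):
                dist[nxt] = d + w
                heapq.heappush(heap, (d + w, nxt, path + [nxt]))
    return float("inf"), []

edges = [("ETH-USDC", "ETH", "USDC"), ("ETH-DAI", "ETH", "DAI"), ("DAI-USDC", "DAI", "USDC")]
print(best_route(edges, "ETH", "USDC", 10_000.0))  # (0.003, ['ETH', 'USDC'])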

On-Chain Agents


It may be the case that the use of agent-based machine learning techniques such as deep reinforcement learning allows models to emerge that interpret and react to correlated variables in ways that we, as humans, may fail to detect. In essence, an on-chain agent is a blockchain-native computational agent, capable of processing data, learning, and carrying out actions such as transacting on the blockchain.

Amalgamating Web 3 and Machine Learning


Bringing deep reinforcement learning execution to blockchains is an area of great interest to Deeplink, and has been the subject of direct research for some time now. Stay tuned to our publication channels for an update on a project centered on exactly such techniques.

 

Reinforcement learning agents require an environment in which to act. As such, in order to facilitate on-chain agents, we must first convert our Web 3 problem into a reinforcement-learning-compatible environment. The most popular way of converting any given problem into a reinforcement learning environment is via OpenAI’s Gym API, a reinforcement learning class framework for Python [12]. This involves breaking the given problem into iterable steps which can be represented by the following Python methods (a minimal skeleton is sketched after this list):

__init__(self)

  • Used to establish the variables used for reinforcement learning, namely, the observation space (the space of all possible observations an agent can make about this environment) and the action space (the space of all possible actions an agent can make in this environment).

 

step(self, action)

  • This is called once every ‘step’ of the environment (in the case of a game, this may be one frame or one turn); it applies the agent’s action and returns the resulting observation, reward, and termination status.

 

reset(self)

  • This both starts the environment on its first run and resets it once an episode has concluded.
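
Below is a minimal skeleton of what such an environment might look like for order routing, using the classic (pre-0.26) Gym API; the observation and action spaces, the constant-product pool dynamics, and the reward are illustrative assumptions, not a finished design.

import gym
import numpy as np
from gym import spaces

class DexRoutingEnv(gym.Env):
    """Hypothetical sketch: fill an order across N constant-product pools."""
    def __init__(self, n_pools: int = 4, order_size: float = 1_000.0):
        super().__init__()
        self.n_pools, self.order_size = n_pools, order_size
        # Observation: each pool's two reserves plus the remaining order size.
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(n_pools * 2 + 1,), dtype=np.float32)
        # Action: fraction of the remaining order routed to each pool this step.
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(n_pools,), dtype=np.float32)

    def reset(self):
        self.reserves = np.full((self.n_pools, 2), 100_000.0, dtype=np.float32)
        self.remaining = self.order_size
        return self._obs()

    def step(self, action):
        routed = self.remaining * np.clip(action, 0.0, 1.0)
        routed *= min(1.0, self.remaining / max(routed.sum(), 1e-9))  # never over-fill
        out = self.reserves[:, 1] - (self.reserves[:, 0] * self.reserves[:, 1]) / (self.reserves[:, 0] + routed)
        self.reserves[:, 0] += routed
        self.reserves[:, 1] -= out
        self.remaining -= routed.sum()
        reward = float(out.sum() - routed.sum())  # (negative) slippage versus the 1:1 spot price
        done = self.remaining <= 1e-6             # episode ends when the order is fully filled
        return self._obs(), reward, done, {}

    def _obs(self):
        return np.append(self.reserves.flatten(), self.remaining).astype(np.float32)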


In the context of Web 3 problems such as DEX aggregation and SOR, this can be a relatively complex process: not only must the scenario be translated into discrete, iterable steps, but Web 3 connectivity must also be natively built into the environment itself. Essentially, this process can be thought of as building a DApp for your reinforcement learning environment; a DApp that either transacts directly on its own via Web3.py/Web3.js, or which acts more as a keeper for smart contracts, instructing them to transact via TX variables or oracle integration. This script may also need to read data from the blockchain such as balances, transaction hashes, etc., all of which can also be handled via the Web 3 libraries.
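
For illustration, a minimal Web3.py sketch of this plumbing follows (v6-style snake_case naming); the RPC endpoint, addresses, and the use of an unlocked account are placeholders, and a real agent would call a DEX router contract’s swap function rather than making a bare transfer.

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))  # e.g. a local Ganache node

agent = "0x0000000000000000000000000000000000000001"   # placeholder address
balance_wei = w3.eth.get_balance(agent)                 # read state from the chain
print(w3.from_wei(balance_wei, "ether"))

# Acting on the environment: a raw value transfer as a stand-in for a swap call.
tx = {
    "from": agent,
    "to": "0x0000000000000000000000000000000000000002",
    "value": w3.to_wei(0.1, "ether"),
    "nonce": w3.eth.get_transaction_count(agent),
}
tx_hash = w3.eth.send_transaction(tx)                   # assumes an unlocked test-net account
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)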

It is also likely that in use cases such as DEX aggregation and SOR, model training is best done in simulated environments, rather than on live exchanges with real assets. This can be done on private, command-line test nets such as Ganache, or on live test nets such as Rinkeby or Goerli. The benefit of private test nets is the ability to control funds without needing to request testnet funds via a faucet; however, this may limit capabilities somewhat, as private test nets such as Ganache are incompatible with oracle functionality. Additionally, if the acquisition of testnet funds via faucets is an issue for your project, a suitable workaround may be to deploy representative ERC20 (or other) tokens as stand-ins for testing purposes.

DApps can then be built on top of that test net to simulate the scenario which you wish to optimize. Generally, this will be built into the reinforcement learning environment script, but it may also be external to it, in which case the environment script can simply fetch data from the script running the simulations. For instance, it is likely that a proper backtesting engine (a paper-trading, simulated exchange, which may or may not use real data from the exchange/s it is mirroring) would be required to properly train and test a DEX aggregation and SOR deep reinforcement learning agent. Backtesting engines can be more complex than simple APIs which read from an exchange; for instance, it may be wise to incorporate functionality such that the agent’s own orders incur an impact on the market itself. One way of approaching this is to aggregate real market orders which add up to the agent’s order size, and to consider those as the agent’s order in that time period, as sketched below.
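
A minimal sketch of that impact model, assuming the backtesting engine exposes a chronological tape of (price, size) trades from the mirrored venue:

def fill_from_tape(trades, order_size: float):
    """Consume real trades until their combined size covers the agent's order;
    return the volume-weighted average price as the agent's fill price."""
    filled, cost = 0.0, 0.0
    for price, size in trades:
        take = min(size, order_size - filled)
        filled += take
        cost += take * price
        if filled >= order_size:
            break
    if filled == 0.0:
        return None
    return cost / filled

tape = [(100.0, 40.0), (100.5, 30.0), (101.0, 50.0)]
print(fill_from_tape(tape, 100.0))  # ~100.45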

Related Work


In addition to domain-specific conceptual research on the ways in which machine learning can be applied to our specific problem, Deeplink is also conducting an extensive literature review into cutting-edge machine learning techniques that may be leveraged for our purposes. In this section, written by Priyanka Pursani Israni, some of our more interesting findings are explored.

Reinforcement Learning for Optimization Problems


Various studies have utilized reinforcement learning for optimization problems in areas like game development and network optimization. NVIDIA (Roy et al., 2021) has unveiled a technique that makes use of artificial intelligence to construct circuits that are more effective, quicker, and smaller, resulting in an increased level of functionality with each new generation of chips. The work shows that deep reinforcement learning can teach AI how to build these circuits from scratch. Fig. 2 depicts the architecture of the proposed approach, while Fig. 3 shows that the proposed method, PrefixRL, outperforms other state-of-the-art techniques.

0_JsiRg0BDCCQXC7bI.png

Fig 2. The architecture of the proposed model [2].

0_Ijs2N2WE0_YLhpch.png

Fig 3. Comparison of the proposed method with state-of-the-art methods.

 


The authors of MLGO (Trofin et al., 2021) propose a framework for systematically integrating ML techniques into an industrial compiler, LLVM. For demonstration purposes, they describe how and why machine learning models are used in place of heuristics to improve LLVM’s inlining-for-size optimization. The inlining-for-size model was trained using two different ML algorithms, Policy Gradient and Evolution Strategies, and compared against the state-of-the-art LLVM -Oz. The same model, trained on a single corpus, generalizes well to a variety of real-world targets, as well as to the same set of targets after months of active development. This property of the trained model is useful for deploying ML techniques in real-world settings.

DeepMind researchers have proposed DeepNash (Perolat et al., 2022), an autonomous agent capable of mastering Stratego, a game of imperfect information, at a level comparable to that of a human expert. To learn Stratego on its own, DeepNash employs a model-free deep reinforcement learning approach that does not rely on search. By directly modifying the underlying multi-agent learning dynamics, the Regularized Nash Dynamics (R-NaD) algorithm, a core component of DeepNash, converges to an approximate Nash equilibrium rather than ‘cycling’ around it. On the Gravon games platform, where it faced off against human Stratego experts, DeepNash achieved a yearly (2022) and all-time top-3 rank, outclassing the state-of-the-art AI methods currently in use. An overview of Stratego and the proposed algorithm is given in Fig. 4.

0_VcyqKX9rYZsnnqih.png

Fig. 4. Overview of Stratego game and DeepNash approach.

 


In their proposed research, Humphreys et al. (2022) develop a semiparametric model-based agent that can forecast future policies and values conditioned on future behavior in a given state. They also incorporate a retrieval mechanism that allows the model to draw on a sizable dataset to help inform its predictions. The authors examined this strategy in Go, a difficult game whose large combinatorial state space favors generalization over direct matching to previous experiences, and used fast approximate nearest-neighbor techniques to retrieve useful information from a dataset containing tens of millions of states from expert demonstrations. This is a compelling demonstration of the value of large-scale retrieval in RL agents: attending to retrieved data significantly improves prediction accuracy and game-play performance compared to simply using these demonstrations as training trajectories.

The AI Economist (Zheng et al., 2022), introduced by Salesforce AI, is a reinforcement learning (RL) system that outperforms alternative tax systems by learning dynamic tax policies to maximize equality and productivity in simulated economies. The AI Economist significantly outperforms baselines in improving both utilitarian social welfare and the trade-off between equality and productivity in spatiotemporal economies, even as agents discover new ways of avoiding taxes, and it accounts for emergent specialization of labor, interactions between agents, and changes in behavior. The findings show that a two-level, deep RL approach to economics is complementary to economic theory, and paves the way for an AI-based strategy to design and understand economic policy.

Genetic Algorithms for Smart Order Routing and Automated Trading


The research reviewed in this section focuses on genetic algorithms and machine learning techniques for automated trading, in contrast to the use of reinforcement learning for general optimization problems discussed above.

Xu (2015) proposed a continuous-time, partial-equilibrium model of the optimal strategies of HFTs, without any learning or manipulative ingredients, to rationalize the pinging activities observed in the data. By analyzing past message traffic, the author reconstructs limit order books and characterizes the optimal strategies employed by HFTs when the model is solved using a viscosity-solution approach. The model’s implications for pinging activities are then compared against the data. The results show that pinging is not necessarily manipulative, and can instead be seen as part of HFTs’ dynamic trading strategies.

Liu (2015) proposed an Implementation Shortfall (IS) strategy using an agent-based simulation technique. The author focused on creating an artificial stock market in which to analyze optimal execution strategies, developing mechanisms for order formation, market clearing, and information dissemination for that market. Genetic algorithms are utilized for numerical optimization.

Xu and Carruthers (2018) proposed machine learning methods (random forest regressor, gradient boosting regressor, multilayer perceptron regressor, and logistic regression) for placing aggressive orders (orders intended to be filled immediately) while minimizing client transaction fees and achieving the best price from a transaction. The proposed method also determines the appropriate venue for an aggressive order. Moreover, ensemble voting is employed to combine all four ML models in making a decision. The data was collected from their trading systems and includes level II data for all on-the-run US Treasury bonds from multiple venues in 2017.

(Kearns & Nevmyvaka, 2013) has introduced the advantages and disadvantages of a machine learning approach to HFT and market microstructure. The authors have also taken into account the issues of pure execution across time and space, as well as the challenges of forecasting profitable shifts in strategy. They also discussed ML approaches for smart order routing in dark pools and reinforcement learning for optimized trade execution. From the study, the authors concluded that the ML techniques cannot give better optimization due to their black box nature but if focused on feature engineering and fine-tuning the hyperparameters then it’s good to go for ML methods.

 

 

References

 

  1. Sarker, I. (2021). Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective. SN Computer Science, 2. doi:10.1007/s42979-021-00535-6.

  2. Roy, R., Raiman, J., Kant, N., Elkin, I., Kirby, R., Siu, M., … & Catanzaro, B. (2021, December). PrefixRL: Optimization of parallel prefix circuits using deep reinforcement learning. In 2021 58th ACM/IEEE Design Automation Conference (DAC) (pp. 853–858). IEEE.

  3. Trofin, M., Qian, Y., Brevdo, E., Lin, Z., Choromanski, K., & Li, D. (2021). MLGO: A machine learning guided compiler optimizations framework. arXiv preprint arXiv:2101.04808.

  4. Perolat, J., de Vylder, B., Hennes, D., Tarassov, E., Strub, F., de Boer, V., … & Tuyls, K. (2022). Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. arXiv preprint arXiv:2206.15378.

  5. Humphreys, P. C., Guez, A., Tieleman, O., Sifre, L., Weber, T., & Lillicrap, T. (2022). Large-Scale Retrieval for Reinforcement Learning. arXiv preprint arXiv:2206.05314.

  6. Zheng, S., Trott, A., Srinivasa, S., Parkes, D. C., & Socher, R. (2022). The AI Economist: Taxation policy design via two-level deep multiagent reinforcement learning. Science advances, 8(18), eabk2607.

  7. Xu, J. (2015, November). Optimal strategies of high frequency traders. In AFA 2015 Boston Meetings Paper.

  8. Liu, C. (2015). Optimal Execution Strategies: A Computational Finance Approach (Master’s thesis, University of Waterloo).

  9. Xu, R., & Carruthers, I. (2018, June). Machine Learning for Limit-Order Routing in Cash Treasury. Quantitative Brokers.

  10. Kearns, M., & Nevmyvaka, Y. (2013). Machine learning for market microstructure and high frequency trading. High Frequency Trading: New Realities for Traders, Markets, and Regulators.

  11. Jia, F., & Yang, B. (2021). Forecasting Volatility of Stock Index: Deep Learning Model with Likelihood-Based Loss Function. Complexity, 2021. https://downloads.hindawi.com/journals/complexity/2021/5511802.pdf

  12. OpenAI. (2022). Gym. https://github.com/openai/gym
