Data modeling and machine learning are an important part of the research we do at Aspect. We were therefore delighted to welcome a cohort of students into the world of algorithmic trading for a 2022 machine learning challenge in collaboration with the Oxford Strategy Group Digital (OSGD), an Oxford-based machine learning consulting society run by students. The OSGD x Aspect Capital Quant ML Challenge was Aspect’s second event run in collaboration with a student organization, following the success of the 2021 Algothon run in partnership with the Imperial Algorithmic Trading Society.
The OSGD x Aspect Capital Quant ML Challenge was hosted as a Kaggle community contest. Kaggle is a popular platform in the data science community and allowed excellent collaboration between the teams who entered and those reviewing submissions.
Aiming to give contestants an insight into the world of algorithmic trading, we wanted to set a problem that was both challenging and realistic. We settled on the challenge of forecasting the overnight performance of a selection of liquid futures contracts. Contestants would need to try to understand how trading activity during the daytime relates to subsequent activity during the night-time session, and in particular which assets experience shifting dynamics as trading volume moves around the globe.
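To make the session split concrete: the overnight return of a contract runs from one day's close to the next day's open, while the intraday return runs from open to close. A minimal sketch with hypothetical prices (the contest's actual data format may have differed):

```python
import numpy as np

# Hypothetical daily open and close prices for one futures contract
opens = np.array([100.0, 101.2, 100.8, 102.0])
closes = np.array([100.9, 100.5, 101.7, 102.4])

# Overnight return: previous day's close -> today's open
overnight = opens[1:] / closes[:-1] - 1
# Intraday return: today's open -> today's close
intraday = closes / opens - 1

print(np.round(overnight, 4))
print(np.round(intraday, 4))
```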
Contestants were then assessed on their ability to predict the overnight Sharpe ratio of an asset. Over the month that the contest was active, the top spot on the leaderboard changed hands many times as submissions were produced and refined. The top three teams at the end of the event were invited to present their work to a panel of judges led by Darius Horbrian, one of Aspect's researchers, alongside machine learning experts and quantitative researchers from his team.
Many congratulations to the top two teams, Chris C and ZQZ, who placed first and second respectively.
Approaches to the Challenge
To produce a top solution, teams needed to have a significant understanding of the data they were working with. Some assets trade globally, whereas others are predominantly traded in Europe. They needed to assess questions such as: How big are the correlations between assets and sectors? What is the biggest source of risk?
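As an illustration of that kind of analysis, one can estimate pairwise correlations between assets and ask how much of the total variance a single common factor explains. A sketch on synthetic returns with a hypothetical shared market factor (not contest data):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical daily returns for 4 assets sharing one common market factor
market = rng.normal(0, 0.01, 250)
returns = np.column_stack([market + rng.normal(0, 0.005, 250) for _ in range(4)])

corr = np.corrcoef(returns, rowvar=False)          # pairwise correlation matrix
eigvals = np.linalg.eigvalsh(np.cov(returns, rowvar=False))
risk_share = eigvals[-1] / eigvals.sum()           # variance share of the top factor
print(f"largest factor explains {risk_share:.0%} of variance")
```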
Having grasped the data, teams then needed to decide on a sensible benchmark and a model evaluation framework to robustly assess model performance.
Three key themes were persistent across the top five submissions:
- Rapid iteration: ensuring that, once built, a model can be deployed quickly and then updated as and when needed
- Robust evaluation: Having the ability to retrain the model on a new data set and still achieve the desired behavior
- Feature selection: The process of selecting a subset of relevant variables for use during model construction
The contest ran for over a month, meaning that teams needed a sensible methodology for quickly evaluating and improving models; a robust evaluation framework is key to this.
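One common way to build such a framework for time-series data is walk-forward evaluation, where the model is only ever trained on data that precedes the test window, so there is no look-ahead. A minimal sketch (the splitter and the naive baseline below are illustrative, not any team's actual method):

```python
import numpy as np

def walk_forward_splits(n_samples, n_folds=5, min_train=50):
    """Yield (train_idx, test_idx) pairs where the train set only ever
    contains data that precedes the test window -- no look-ahead."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = min(train_end + fold_size, n_samples)
        yield np.arange(train_end), np.arange(train_end, test_end)

# Example: score a naive "yesterday's sign" baseline on synthetic returns
rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, 300)
scores = []
for train_idx, test_idx in walk_forward_splits(len(returns)):
    preds = np.sign(returns[test_idx - 1])          # hypothetical baseline model
    scores.append(np.mean(np.sign(returns[test_idx]) == preds))
print(f"mean hit rate: {np.mean(scores):.2f}")
```

Scoring each fold on strictly out-of-sample data is what makes leaderboard-style iteration trustworthy: a change that helps in-sample but hurts across folds is discarded.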
Feature engineering is the process of using domain knowledge to extract more information from raw data. The teams who engineered financially relevant features tended to have stronger model performance. This highlights the importance of a strong grounding in the mechanics of financial markets when building algorithms to interpret market data, as a good financial intuition helps to cut through the noise.
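As a toy example of financially motivated features, one might derive returns, momentum, realised volatility, and distance from a moving average out of a raw price series. The column names and window lengths below are illustrative choices, not features from any submission:

```python
import numpy as np
import pandas as pd

# Hypothetical daily close prices for one futures contract
prices = pd.Series(
    100 * np.cumprod(1 + np.random.default_rng(1).normal(0, 0.01, 120)),
    name="close",
)

features = pd.DataFrame({
    "ret_1d": prices.pct_change(),                        # daily return
    "mom_5d": prices.pct_change(5),                       # short-term momentum
    "vol_20d": prices.pct_change().rolling(20).std(),     # realised volatility
    "dist_ma20": prices / prices.rolling(20).mean() - 1,  # gap to 20-day average
}).dropna()
print(features.tail(3))
```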
Tips for Interested Students
To engage further in the world of machine learning, we recommend first grasping some of the mathematics behind machine learning itself. The Deep Learning Book by Ian Goodfellow, Yoshua Bengio and Aaron Courville is freely available online and provides an excellent introduction to the techniques.
To obtain real world exposure, Kaggle is a fantastic platform where you can learn and practice your machine learning skills through the community competitions that run regularly.
Wrap Up and Key Takeaways
The machine learning contest encapsulated key themes of building realistic machine learning models. We are delighted with the number of students who engaged and the quality of the solutions they produced.
To further improve the models produced during the contest, more emphasis would need to be placed on a robust model evaluation framework and deeper risk analysis, to ensure the models would complement an existing portfolio.
The key lessons learned from this challenge include:
- The importance of understanding the underlying data well and where the risk comes from
- Ensuring you have a robust framework for model evaluation
- The critical role of feature engineering when your data has a low signal-to-noise ratio
Note: Any opinions expressed are subject to change and should not be interpreted as investment advice or a recommendation. Any person making an investment in an Aspect Product must be able to bear the risks involved and should pay particular attention to the risk factors and conflicts of interests sections of each Aspect Product’s offering documents. No assurance can be given that any Aspect Product’s investment objective will be achieved.