Approaches to the Challenge
To produce a top solution, teams needed a solid understanding of the data they were working with. Some assets trade globally, whereas others are predominantly traded in Europe. Teams had to assess questions such as: How strong are the correlations between assets and between sectors? What is the biggest source of risk?
Having grasped the data, teams then needed to decide on a sensible benchmark and a model evaluation framework to robustly assess model performance.
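As a rough illustration of what a benchmark plus an evaluation framework can look like, the sketch below scores a candidate forecaster against a naive benchmark using walk-forward (expanding-window) splits, so every test block lies strictly after the data used to fit. Everything in it is an assumption made for illustration: the function names, the zero-return benchmark, the historical-mean model, the mean squared error metric and the made-up returns are not taken from the contest or from any team's submission.

```python
from statistics import mean


def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) with an expanding training window.

    Each test block comes strictly after its training data, so no future
    information leaks into the fit.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield range(0, train_end), range(train_end, test_end)


def mean_squared_error(actual, predicted):
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def evaluate(series, predict, n_folds=3, min_train=4):
    """Average out-of-sample MSE of `predict(history) -> forecast`."""
    scores = []
    for train_idx, test_idx in walk_forward_splits(len(series), n_folds, min_train):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        forecast = predict(history)
        scores.append(mean_squared_error(actual, [forecast] * len(actual)))
    return mean(scores)


# Hypothetical benchmark: always forecast a zero return.
def zero_benchmark(history):
    return 0.0


# Hypothetical candidate model: forecast the historical mean return.
def historical_mean(history):
    return mean(history)


# Made-up daily returns, for illustration only.
returns = [0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 0.03, 0.01, 0.02]
benchmark_mse = evaluate(returns, zero_benchmark)
model_mse = evaluate(returns, historical_mean)
```

A model is only worth keeping if its out-of-sample score beats the benchmark consistently across folds; comparing `model_mse` with `benchmark_mse` is the simplest version of that check.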
Three key themes ran through the top five submissions:
- Rapid iteration: Ensuring that, once built, a model is quickly deployed and then updated as and when needed.
- Robust evaluation: Having the ability to retrain the model on a new data set and still achieve the desired behavior.
- Feature selection: The process of selecting a subset of relevant variables for use during model construction.
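To make the feature selection theme concrete, here is a minimal sketch of one simple approach, a correlation filter, which ranks candidate features by the absolute value of their Pearson correlation with the target and keeps the top k. This is an illustrative technique, not necessarily what any team used; the feature names and values are invented. In practice the ranking should be computed on training data only, so that the selection step does not leak information from the test period.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_top_k(features, target, k):
    """Return the names of the k features most correlated (in absolute value) with target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Made-up data: names and values are illustrative only.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "momentum": [1.1, 1.9, 3.2, 3.9, 5.1],    # tracks the target closely
    "inverse": [5.0, 4.0, 3.0, 2.0, 1.0],     # perfectly anti-correlated
    "noise": [2.0, -1.0, 2.0, -1.0, 2.0],     # essentially unrelated
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],    # carries no information
}
selected = select_top_k(features, target, k=2)
```

Using the absolute value means a strongly anti-correlated feature is kept as well, since it is just as informative as a positively correlated one.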
The contest ran for over a month, so teams needed a sensible methodology for quickly evaluating and improving their models; a robust evaluation framework is key to this.
Feature engineering is the process of using domain knowledge to extract more information from raw data. The teams who engineered financially relevant features tended to have stronger model performance. This highlights the importance of a strong grounding in the mechanics of financial markets when building algorithms to interpret market data: good financial intuition helps to cut through the noise.
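As a hedged example of what financially relevant features might look like, the sketch below derives three common ones from a raw price series: log returns, rolling volatility and momentum. The choice of features, the window lengths and the prices are assumptions for illustration; the contest data and the features the teams actually built are not described here.

```python
from math import log, sqrt


def log_returns(prices):
    """Log return between consecutive prices: ln(p_t / p_{t-1})."""
    return [log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def rolling_volatility(returns, window):
    """Sample standard deviation of returns over each trailing window (window >= 2)."""
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        avg = sum(chunk) / window
        out.append(sqrt(sum((r - avg) ** 2 for r in chunk) / (window - 1)))
    return out


def momentum(prices, lookback):
    """Simple return over the previous `lookback` periods."""
    return [prices[t] / prices[t - lookback] - 1.0 for t in range(lookback, len(prices))]


# Made-up closing prices, for illustration only.
prices = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0]
rets = log_returns(prices)
vol = rolling_volatility(rets, window=3)
mom = momentum(prices, lookback=2)
```

Each feature uses only past and current prices at the point it is computed, which keeps it usable in a forecasting setting.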
Tips for Interested Students
To engage further in the world of machine learning, we recommend first grasping some of the mathematics behind it. The Deep Learning Book, by Ian Goodfellow, Yoshua Bengio and Aaron Courville, is a textbook that is freely available online and provides an excellent introduction to the techniques.
To obtain real-world exposure, Kaggle is a fantastic platform where you can learn and practice your machine learning skills through the community competitions that run regularly.