Adaptation and Opportunity: Responding to the rise in Alternative Data

Waqar Rashid, Director of Data at Aspect Capital, discusses some of the opportunities and challenges presented by the increasing prevalence of alternative datasets.

How do you define alternative data?

Interesting question: we could write a whole thesis just on this topic! We recently undertook an exercise to attempt to define the concept, and we realised how difficult it is. It depends to a large extent on the consumer: one person’s “alternative” data is another’s bread-and-butter. It also varies over time: what might be considered “alt data” today could be considered mainstream tomorrow, as we’re beginning to see with credit card data, for example.

Ultimately it is a case-by-case assessment for us whether we would consider a potential new dataset as “non-traditional”.

For how long has Aspect been using alternative data?

This comes back to how you define the concept. We first started using sentiment data in 2007, which for us represented our first foray into what we considered at the time to be “alternative” data.

However, our focus on this area has really increased in the last few years, in particular since around 2016 as we have diversified the business through the development of new systematic investment strategies. Our Systematic Global Macro Programme has been especially active in allocating risk to models that use these sorts of data inputs and this represents a key opportunity to generate alpha within that programme going forward.

What opportunities does alternative data present for systematic managers such as Aspect?

Data is obviously fundamental to how we build our systematic investment models. The opportunity set presented by alternative data will vary depending on the nature of the models and the research being conducted, but you can easily see how alternative data can be used in macro strategies, for example, to identify behavioural or technical effects in markets. To give an example, our Systematic Global Macro Programme uses data derived from natural language processing of news sources as an input into a set of models seeking to exploit investor sentiment. Overall, as much as 50% of the research resource allocation for that programme is now focused on using alternative datasets to forecast the markets it trades.

Looking at the statistics (see Table 1), discretionary funds still predominate in terms of their interest in alternative data. However, interest from systematic managers is quickly catching up as they continue to explore different use cases. If I were gazing into my crystal ball, I could see, for example, how it might be useful in some of the execution research we continually do to enhance our trading systems.

Table 1: Buyside engagement with an Alternative Data Aggregator by strategy type

Source: Eagle Alpha

What are some of the challenges of working with alt data, and how do you address these?

There are at least three key challenges in handling alternative datasets.

The first is the size of the data. In the past we were dealing in megabytes but now we are talking about terabytes of data, which needs to be carefully managed within our infrastructure. One way we have tackled this challenge is to make use of a virtual private cloud, which allows us to handle vast quantities of data without having to invest directly in the hardware necessary to consume it. This is changing the way we do things and enables us to be more flexible and agile. Using this approach, we recently onboarded a dataset that is around eight terabytes in size - it would have been impossible to ingest this straight into our existing infrastructure in a short period of time.

The second challenge relates to handling the morphological characteristics of alternative data. Traditional data has an identifiable set of metrics that can be anticipated in the design of data processing systems: for instance, a daily close bar for a futures market will always have Open, High, Low, Close, Volume and Open Interest. Traditional data can therefore be stored in a fixed system design. Alternative data, in contrast, needs to be handled using a flexible system design: the dimensions of the data can be orders of magnitude larger than for traditional data, and those dimensions likely cannot be anticipated in advance as each dataset will have its own idiosyncrasies. For example, flow data will have multiple metrics that are unique even when compared against other flow datasets.

Additionally, the dimensions may vary over time. Take for example Nowcasting data, which is commonly generated using an econometric model. The vendor might be forced to re-calibrate its model in certain circumstances, and with re-calibration they might release a new historical dataset, deprecating the previous history. Discarding the old history and back-testing only on the new one is a recipe for overfitted back-tests, because the re-calibrated history reflects information that was not available at the time. Such time-varying complexity must be handled elegantly to avoid pitfalls, so again flexibility in the system that handles alternative data is crucial.
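To make the re-calibration point concrete, here is a minimal Python sketch of one way such data could be stored so that a back-test only ever sees values that had been published by the simulated date. It is an illustration under stated assumptions, not a description of Aspect's systems: the names `Observation`, `VintageStore` and `as_of`, the metric names and all the values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Observation:
    """One published value of a dataset (hypothetical structure)."""
    reference_date: date       # the period the value describes
    release_date: date         # when the vendor published this version
    metrics: Dict[str, float]  # free-form metrics, so each dataset can differ


class VintageStore:
    """Keeps every published version instead of overwriting old history."""

    def __init__(self) -> None:
        self._rows: List[Observation] = []

    def add(self, obs: Observation) -> None:
        self._rows.append(obs)

    def as_of(self, reference_date: date, knowledge_date: date) -> Optional[Observation]:
        """Return the latest version published on or before knowledge_date."""
        candidates = [
            o for o in self._rows
            if o.reference_date == reference_date and o.release_date <= knowledge_date
        ]
        return max(candidates, key=lambda o: o.release_date, default=None)


# Illustrative values only: an original nowcast and a later re-calibrated one.
store = VintageStore()
jan = date(2020, 1, 31)
store.add(Observation(jan, date(2020, 2, 1), {"gdp_nowcast": 1.8}))
# The re-calibrated release revises the value and adds a new metric.
store.add(Observation(jan, date(2020, 6, 15), {"gdp_nowcast": 2.4, "gdp_nowcast_upper": 2.9}))

early = store.as_of(jan, date(2020, 3, 1))   # only the original was known then
late = store.as_of(jan, date(2020, 7, 1))    # the re-calibrated value is now known
unknown = store.as_of(jan, date(2020, 1, 15))  # nothing had been published yet
```

Because each observation carries a free-form dictionary of metrics, a new dimension introduced by the vendor does not require a schema change, and because old versions are never deleted, a back-test run "as of" March 2020 still sees the value that was actually available at that point.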

Finally, the complexity and detail of some of the newer datasets can require a greater degree of collaboration between the Data team and the Research teams. We can no longer just process the data and make it available to the researchers for their use: we need to work closely with them to ensure the data is correctly processed and is made available to them in the most useful way. It’s no longer just a research problem; it’s also an engineering problem.

Then of course there is the issue of assessing the return on investment of new datasets. This is an interesting problem to solve and one which primarily falls to the investment teams to assess before we pull the trigger on a new dataset.

What about the compliance challenges? Do you have a formal data policy in place?

Yes, we have a data procurement policy in place for non-traditional datasets, with the Data, Legal and Compliance teams all having a role to play. We adopt a risk-based approach, considering both the nature of the data itself and the data vendor.

Some of the obvious things we are trying to address through this process are whether the data has been legitimately obtained, whether the vendor has the right to sell it to us and whether it might contain material non-public information. Other, less obvious red flags include whether the data is being offered to us exclusively, as this may convey an unfair informational advantage. The UK’s FCA is alive to the possibility that access to alternative data can impact the integrity of financial markets, whilst admitting this is a grey area and acknowledging the need for market participants to innovate[1]. So, it’s important that we self-police in this area and take a thoughtful approach when appraising new, non-traditional datasets.

Our process is based on our years of experience and addresses the compliance and regulatory challenges head-on. This is a cross-departmental process to ensure we have considered all risks and addressed all potential concerns before the data enters the organisation.

[1] https://www.fca.org.uk/insight/turning-data-inside-out

Why is Aspect well-placed to capitalise on the growth of alt data? Does it represent a competitive advantage?

We have a dedicated, centralised Data team with many years’ experience of working on complex data challenges. The team provides a unique technical service in maintaining the data systems that drive our front, middle and back office functions. Within the team, data engineers are concerned with creating and maintaining data applications/data pipelines that process the data before it is used for signal generation, while data analysts are more involved with the analysis of datasets themselves and assessing new business requirements around data. The Data team designs our overall strategy and also works with a cross-departmental group to analyse new datasets and establish whether they may be of use to one or more of our research teams.

We also naturally find that we are approached by a vast number of data vendors who recognise Aspect as a leading name in the systematic investment industry. This acts as a natural competitive advantage and ensures that, if there is an interesting new dataset out there, we should generally know about it.

Finally, as an organisation employing Agile software practices, we are able to act nimbly and to push into new areas of data and technology in a very reactive way.

What proportion of datasets that are reviewed ultimately make it into your models? How many datasets have you reviewed over the past year or so?

That is a difficult question to answer, because for some datasets, the bank of data is vast, so we could have multiple looks at the data and by intelligently slicing and dicing it, we can come up with a multitude of uncorrelated features each time. But I would say that we have looked in detail at around 15 alternative datasets since the beginning of 2019 and we are currently using four of those in production.

In terms of actual results, what impact has the use of alternative data had on the performance of your investment programmes?

2020 has been a great out-of-sample test of the efficacy of these datasets. We have seen traditional, slow-moving macro datasets being quite ineffective in predicting market direction due to the unprecedented economic and macro impact of the pandemic and the associated central bank action. Conversely, alternative data-based models have captured the oscillating market sentiment well and contributed strongly – and importantly, in an uncorrelated fashion – to performance within our Systematic Global Macro Programme.

Do you see the use of alternative data expanding in the coming years?

Absolutely. The opportunity is ever-growing: not only is the number of available datasets increasing (see Figure 1), but so is the length of their histories. If you take a dataset that began in the aftermath of the GFC, for example, a user of that dataset will now be able to see how that data has played out over more or less a full market cycle, encompassing the huge equity bull-run of the last ten years or so, its dramatic dislocation in the wake of the Covid-19 crisis and the aftermath of that shock so far. This ability to test and validate hypotheses through different market cycles and events is vital and will make these newer datasets all the more appealing over time. For these reasons, the use of alternative data is spreading rapidly across the financial industry in one form or another.

Figure 1: Number of alternative datasets available via Alternative Data Aggregator

Source: Eagle Alpha

This trend is also not confined to investment management, or even to the wider financial industry. As a case in point, central banks are increasingly making use of new datasets to provide a timelier view on economic conditions, which in turn drives monetary policy decisions. Like us, they are looking for anything which will give them an informational advantage in order to appropriately parameterise their approach.

Waqar Rashid - Director of Data

Waqar Rashid is Director of Data at Aspect Capital, having joined the systematic investment manager in December 2006. He is responsible for the entire set of data requirements for all of Aspect’s quantitative programmes. With over twenty years of experience in the industry, Waqar Rashid has held various roles including: Quantitative Analyst in the Equity Strategy Team at Goldman Sachs; Quantitative Analyst in the Equity Quant-Macro Trading Desk at Nations Bank; Head of Quant Research at IBJ Asset Management and Principal of Equinox Capital Management, where he ran European and US Equity Statistical Arbitrage. Waqar Rashid has an M.Sc. in Mathematical Economics and Econometrics from the London School of Economics and a B.A. in Economics and Mathematics from Sussex University.

Disclaimer

Note: Any opinions expressed are subject to change and should not be interpreted as investment advice or a recommendation. Any person making an investment in an Aspect Product must be able to bear the risks involved and should pay particular attention to the risk factors and conflicts of interests sections of each Aspect Product’s offering documents. No assurance can be given that any Aspect Product’s investment objective will be achieved.

SEC Marketing Rule

With effect from 1st November 2022, Aspect came into compliance with the U.S. Securities and Exchange Commission’s (SEC’s) new “Marketing Rule”. This document was created prior to this date (“Old Material”) and therefore may not reflect certain requirements of the Marketing Rule. Please refer to the following website here for important disclaimers and other information required by the Marketing Rule, which are hereby incorporated into the Old Material by reference, to the extent applicable.
