AutoML applied to commodity markets: Translating raw data into predictive, actionable signals
Updated: Apr 18, 2021
What data is additive to your strategy?
Particle 1α AutoRank is an interactive research tool that our team developed to sift through the overwhelming amount of available data and surface the most predictive datasets (the alpha) on the market.
We apply over 30 years of combined experience in Quant Finance, High-Performance Computing, Machine Learning, Natural Language Processing (NLP) and Alternative Data Management to financial markets and futures trading.
AutoML (automated machine learning) is one of the most talked-about technologies in financial markets.
Data is the foundation of our AutoML tool. We collect information from original sources in real time and maintain a point-in-time database of over 150 million time series.
We apply NLP to the metadata attached to each time series to populate the Knowledge Graph.
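To illustrate what "point-in-time" means in practice, here is a minimal sketch of an as-of lookup: given the revision history of one observation, it returns the value that was actually known on a given date, so a backtest never sees a later revision. The revision data and the `value_as_of` helper are hypothetical and are not part of our platform's API.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical revision history for a single weekly observation:
# (publication date, value as published on that date), sorted by publication date.
revisions = [
    (date(2021, 3, 3), 1001.0),   # first release
    (date(2021, 3, 10), 1004.5),  # later revision
]

def value_as_of(revisions, as_of):
    """Return the latest value published on or before `as_of`, or None if nothing was out yet."""
    pub_dates = [pub for pub, _ in revisions]
    idx = bisect_right(pub_dates, as_of)
    return revisions[idx - 1][1] if idx else None

first_print = value_as_of(revisions, date(2021, 3, 5))     # sees only the first release
revised = value_as_of(revisions, date(2021, 3, 12))        # sees the revision
before_release = value_as_of(revisions, date(2021, 3, 1))  # nothing published yet
```

Storing every revision with its publication date, rather than overwriting values, is what makes this kind of look-ahead-free query possible.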
Automated machine learning empowers users to build models without having to hire PhDs
Our AutoML technology sits on top of our Knowledge Graph and data, and with it you can:
Rank predictors according to their estimated predictive power (AutoML is a search tool helping you focus attention on the most impactful data)
Find data that is maximally uncorrelated with your existing basket of predictors (AutoML accelerates your research process)
Using proprietary NLP technology we make data investigation intuitive and efficient
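The two capabilities above (ranking predictors and finding uncorrelated data) can be sketched in a few lines. This is only an illustration under simple assumptions: predictive power is approximated by the absolute Pearson correlation between a predictor and the next period's return, and "maximally uncorrelated" means the smallest worst-case correlation with the existing basket. The series, names and helper functions are made up for the example and do not describe the scoring AutoRank uses internally.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def rank_predictors(candidates, target):
    """Sort candidate names by |corr(predictor[t], target[t+1])|, strongest first."""
    scores = {name: abs(pearson(series[:-1], target[1:]))
              for name, series in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

def least_correlated(candidates, basket):
    """Name of the candidate whose largest |corr| with any basket series is smallest."""
    def worst_overlap(series):
        return max(abs(pearson(series, held)) for held in basket.values())
    return min(candidates, key=lambda name: worst_overlap(candidates[name]))

# Toy data (hypothetical): "leader" is built to equal next period's target value.
target = [0.0, 1.0, -1.0, 2.0, -2.0, 1.0, -1.0, 0.5]
candidates = {
    "leader": [1.0, -1.0, 2.0, -2.0, 1.0, -1.0, 0.5, 0.0],
    "noise": [0.3, 0.1, 0.4, 0.1, 0.5, 0.9, 0.2, 0.6],
    "trend_copy": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0],
}
basket = {"existing_trend": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}

ranking = rank_predictors(candidates, target)
diversifier = least_correlated(candidates, basket)
```

In this toy setup "leader" ranks first by construction, while "trend_copy" is rejected as a diversifier because it duplicates what the basket already holds.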
Case study: Crude Oil
Selection of candidate time series
After applying several filters, the list of 65,000 different time series relevant to crude oil was narrowed down to 20 economically-motivated predictors.
The open question is whether a portfolio can be constructed from this data.
The preliminary discovery phase analyzes time series individually and cross-sectionally
Model construction and testing
Using the candidate time series as predictors, we start building models in the second phase of AutoML. A variety of signal transformations may be performed based upon the modeling approach.
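As an example of the kind of signal transformation mentioned above, the sketch below turns a weekly inventory level into week-over-week changes and then into a trailing z-score. The choice of these two transformations, the window length and the sample numbers are assumptions made for illustration; the actual transformations depend on the modeling approach.

```python
from statistics import mean, pstdev

def weekly_change(levels):
    """Week-over-week difference of a level series."""
    return [curr - prev for prev, curr in zip(levels, levels[1:])]

def rolling_zscore(values, window):
    """Z-score of each value against its trailing window (current value included).

    Returns None until `window` observations are available, so no future data is used.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
            continue
        hist = values[i + 1 - window:i + 1]
        sd = pstdev(hist)
        out.append((values[i] - mean(hist)) / sd if sd > 0 else 0.0)
    return out

# Hypothetical weekly stock levels, in thousand barrels.
stocks = [500.0, 503.0, 501.0, 498.0, 502.0, 507.0]
changes = weekly_change(stocks)
zscores = rolling_zscore(changes, window=3)
```

Standardizing against a trailing window keeps series with very different units (thousand barrels versus thousand barrels per day) on a comparable scale before they are fed to a model.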
The selected predictors are part of the Energy Information Administration’s (EIA) ‘Weekly Petroleum Status Report’, which is released every Wednesday.
The outcome of this stage consists of a small set of candidate predictors and models
U.S. Product Supplied of Petroleum Products, Weekly – Thousand Barrels per Day: Measures the removal of petroleum products from the primary supply chain for ultimate delivery to consumers. A proxy for petroleum products’ consumption.
U.S. Net Imports of Crude Oil, Weekly – USA – Thousand Barrels per Day: Measures the weekly difference between imports and exports of crude oil in the United States.
U.S. Ending Stocks of Crude Oil, Weekly – Thousand Barrels: Measures the weekly change in the number of barrels of commercial crude oil held by US firms.
Cushing, OK Ending Stocks excluding SPR of Crude Oil, Weekly – Thousand Barrels: Measures the weekly change in the number of barrels of commercial crude oil held in the Cushing, Oklahoma region. Cushing is the largest oil-storage tank farm in the world and is the pricing point for WTI.
U.S. Refiner Net Input of Crude Oil, Weekly – Thousand Barrels per Day: Measures the weekly difference between gross refinery input and gross refinery production in the United States.
Building an indicator
The third phase combines the selected models into a single indicator. To evaluate its characteristics, we simulate trading a portfolio based on the indicator alone.
Once the portfolio is created, AutoML assesses the strategy’s performance
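The third phase can be pictured with a small sketch: several model signals are averaged into one indicator, a position is taken in the direction of the indicator, and the resulting returns are summarized. The equal weighting, the sign-based position rule, the annualized Sharpe ratio on 52 weekly periods and all numbers below are assumptions chosen to keep the example short, not a description of how AutoML builds or scores its portfolios.

```python
from math import sqrt
from statistics import mean, stdev

def combine(signals):
    """Equal-weight average of the model signals, period by period."""
    return [mean(vals) for vals in zip(*signals)]

def backtest(indicator, returns):
    """Hold sign(indicator[t]) over period t+1 and return the per-period strategy returns."""
    pnl = []
    for sig, next_ret in zip(indicator[:-1], returns[1:]):
        position = (sig > 0) - (sig < 0)  # +1 long, -1 short, 0 flat
        pnl.append(position * next_ret)
    return pnl

def annualized_sharpe(pnl, periods_per_year=52):
    """Mean over standard deviation of returns, scaled for weekly data."""
    sd = stdev(pnl)
    return mean(pnl) / sd * sqrt(periods_per_year) if sd > 0 else 0.0

# Hypothetical outputs of two models and the asset's weekly returns.
model_a = [0.5, -0.2, 0.4, -0.6, 0.1]
model_b = [0.3, -0.4, 0.2, -0.2, -0.3]
returns = [0.0, 0.01, -0.02, 0.015, -0.01]

indicator = combine([model_a, model_b])
pnl = backtest(indicator, returns)
sharpe = annualized_sharpe(pnl)
```

Lagging the indicator by one period before applying it to returns is the important detail: the position for a given week is decided only with information available the week before.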
Using the Particle 1α AutoRank tool, it is possible to narrow the list of potentially interesting time series from more than 60,000 to only 20 and then test the predictive power of each indicator. The time series are orthogonal (or complementary) to the existing strategies of our Fundamental Customer.
Using AutoML in model construction improves data coverage and alpha generation while speeding up the research process.