Knowledge Graph: Creating a global commodity data catalog
Updated: Apr 18, 2021
Particle.One is building the global knowledge graph for commodities, which includes:
Analytics and Models
A core difference between Particle.One and traditional data providers lies in our knowledge graph technology. Traditional data providers collect and deliver only time series data. We annotate the data with useful metadata (e.g., supply, geography, associated commodities) and use a knowledge graph, a set of vertices connected by edges, to link that metadata and surface the otherwise unseen economic relationships between the time series we collect.
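As a rough illustration of the idea, a knowledge graph can be sketched as vertices (time series and concepts) joined by labeled edges (economic relationships). The vertex names and relation labels below are hypothetical examples, not the actual Particle.One schema:

```python
# Minimal sketch of a metadata graph: each vertex maps to a list of
# (neighbor, relation) edges. All names here are illustrative.
graph = {
    "crude_oil_price": [
        ("crude_oil_supply", "driven_by"),
        ("crude_oil_inventory", "driven_by"),
    ],
    "crude_oil_supply": [("saudi_arabia", "located_in")],
    "gasoline_price": [("crude_oil_price", "derived_from")],
}

def neighbors(vertex):
    """Return the vertices directly connected to `vertex`."""
    return [v for v, _ in graph.get(vertex, [])]

print(neighbors("crude_oil_price"))
# ['crude_oil_supply', 'crude_oil_inventory']
```

In a real deployment the edges would be typed and weighted, but even this toy structure shows how a price series can be linked to the supply and inventory series that help explain its moves.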
Data is sourced from more than 100 providers, covering more than 100 million time series, in categories such as:
Supply, demand, and inventory
Trade and supply chain
Public company disclosures
The Particle.One Knowledge Graph goes beyond traditional financial terminals because it lets you reason about the data and its relationships, answer questions about financial events, and move from observing an effect (e.g., a price has moved) to understanding the causes behind it. We empower you to see the unseen:
Find the causes of an increase in oil consumption in China on 2020-06-03
Find all data pertaining to sugar production
Identify the variables affecting the current volatility of gold
Rank the top 5 commodities needed for manufacturing cars
Predict the local supply of soybeans over the next 6 months
Forecast the demand for ethylene across China's provinces
Find what US public companies are most affected by the price of nickel
It encodes knowledge that can be used to:
explore relationships between economic quantities (e.g., price, supply, inventory, demand for commodity)
build predictive models for economic quantities
collect economically motivated data for further use in research, investment strategies, etc.
build data-driven market reports
map economic quantities to commodity and equity instruments
assess drivers of portfolio risk
track the different stages of a commodity’s industry chain (e.g., upstream, midstream, downstream)
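The last capability, tracking a commodity's industry chain, can be sketched as a traversal over directed edges running from upstream to downstream stages. The chain below (crude oil to polyethylene) is a hypothetical example, not data from the graph:

```python
# Illustrative industry chain: directed edges from upstream inputs to
# downstream products. Stage names are hypothetical examples.
chain = {
    "crude_oil": ["naphtha"],       # upstream -> midstream
    "naphtha": ["ethylene"],        # midstream -> downstream feedstock
    "ethylene": ["polyethylene"],   # downstream product
}

def downstream(commodity):
    """Collect every product reachable downstream of `commodity`."""
    seen, stack = [], [commodity]
    while stack:
        for nxt in chain.get(stack.pop(), []):
            if nxt not in seen:
                seen.append(nxt)
                stack.append(nxt)
    return seen

print(downstream("crude_oil"))
# ['naphtha', 'ethylene', 'polyethylene']
```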
Investigate the data
A significant share of model-development time is spent on basic data onboarding and analysis.
We build tools that save hours of research. Using our notebooks, you can immediately see data distributions, investigate outliers, and check whether the data is consistent:
Different representations of numbers (e.g., in thousands, millions)
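One common consistency check, reconciling values reported in different magnitudes, can be sketched as a simple unit normalization. The unit labels below are hypothetical:

```python
# Hedged example: normalize heterogeneous number representations
# (values reported in thousands vs. millions) to a common base unit.
UNIT_SCALE = {"units": 1, "thousands": 1_000, "millions": 1_000_000}

def normalize(value, unit):
    """Convert a reported value to base units."""
    return value * UNIT_SCALE[unit]

# Three observations of the same quantity, reported in different units.
observations = [(12.5, "millions"), (8_400, "thousands"), (3_100_000, "units")]
normalized = [normalize(v, u) for v, u in observations]
print(normalized)
# [12500000.0, 8400000, 3100000]
```

After normalization, outliers caused purely by unit mismatches disappear, so any remaining anomalies reflect the data itself.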
Number of current data providers: 84
Number of time series currently published: 1.6 million
Number of time series to be published: more than 500 million
Number of commodities covered: 69
Top commodities by number of time series available (chart)
Top countries by number of time series available (chart)
Original source of the data
Many existing data providers rely on each other to source the data. The reason for this approach is that it is simpler and cheaper to resell data captured by someone else, rather than capturing it from the original source.
Particle.One always collects the data from the original sources, so it can be:
published as soon as it is available, without artificial delays
delivered without any alteration
annotated with point-in-time semantics
We source the data by connecting to each original provider, handling the complexity of heterogeneous semantics and formats, and presenting the data in a uniform, consistent manner.
Point-in-time means that each piece of data is presented "as-of-date", i.e., the view of the data reflects what an observer would have seen at that specific time. This means capturing the evolution of the data over time, not just its most recent view.
Data sources often issue amendments, restatements, corrections, and changes of methodology, which effectively rewrite history.
A point-in-time semantic is essential for any accurate and representative backtesting.
Only by running the capture system in real time is it possible to have a point-in-time view of the data. It is often difficult or impossible to reconstruct a point-in-time view from historical data alone.
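An as-of query over such point-in-time data can be sketched as follows. Each revision of a data point carries the timestamp at which it became known, and the query returns what an observer would have seen on a given date. Field names are illustrative, not the actual API schema:

```python
from datetime import date

# Revisions of the same data point, where the value was later restated.
# "known_at" records when each revision became observable.
revisions = [
    {"known_at": date(2020, 4, 15), "value": 100.0},  # first release
    {"known_at": date(2020, 7, 10), "value": 97.5},   # restatement
]

def as_of(revisions, when):
    """Return the latest value known on or before `when`, or None."""
    visible = [r for r in revisions if r["known_at"] <= when]
    if not visible:
        return None
    return max(visible, key=lambda r: r["known_at"])["value"]

print(as_of(revisions, date(2020, 5, 1)))  # 100.0 (restatement not yet known)
print(as_of(revisions, date(2020, 8, 1)))  # 97.5
```

A backtest that used the restated 97.5 for May 2020 would be peeking into the future; the as-of query prevents exactly that kind of look-ahead bias.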
The timestamps used by original data sources often confuse the period of report (i.e., the end of the period that a data point refers to) with the publication timestamp.
The Particle.One Knowledge Graph contains a publication timestamp that represents when the data became available to the customer. We record both when the data was sampled from the original source and when it became available to the customer.
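The distinction can be illustrated with a single record carrying all three timestamps. The field names below are hypothetical placeholders, not the documented schema:

```python
# Illustrative record separating the period a value covers from the
# timestamps describing when it became known. Field names are hypothetical.
record = {
    "period_end": "2020-05-31",               # end of the period the value covers
    "sampled_at": "2020-06-03T08:00:00Z",     # when we captured it from the source
    "published_at": "2020-06-03T08:05:00Z",   # when it reached the customer
    "value": 42.0,
}
```

Conflating `period_end` with `published_at` would make May data look usable on May 31, days before it actually existed.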
Over 1 million time series are available in real time via a REST API or a Python library, so captured data flows directly into your data science workflow.
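A typical integration might look like the sketch below. The base URL, path, and parameters are hypothetical placeholders, not the documented Particle.One API; only the general shape (authenticated GET returning JSON) is assumed:

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://api.example.com/v1"  # placeholder, not the real endpoint

def build_series_url(series_id, start=None):
    """Build the query URL for a time series (illustrative only)."""
    url = f"{BASE_URL}/timeseries/{series_id}"
    return f"{url}?start={start}" if start else url

def fetch_series(series_id, api_key, start=None):
    """Fetch a time series as parsed JSON; raises on HTTP errors."""
    req = Request(
        build_series_url(series_id, start),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

From there the JSON payload would typically be loaded into a DataFrame for the distribution and outlier checks described above.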