The many shades of crude oil – Part I

Correlation describes the strength and direction of a linear relationship between two quantitative variables. In other words, it measures the degree of association between two sets of numbers, that is, how closely they track one another.
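As an illustration of how such a measure can be computed, the sketch below implements the standard Pearson correlation coefficient using only the Python standard library. The function name pearson_r and the price figures are made up for illustration; they are not the data or tooling used in this study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the two means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cross / sqrt(ss_x * ss_y)

# Made-up illustrative prices, NOT the data used in this study
crude = [90.0, 95.0, 100.0, 110.0, 105.0]
other = [800.0, 820.0, 850.0, 900.0, 880.0]
r = pearson_r(crude, other)
print(f"r = {r:.3f}")
```

Because the two made-up series rise and fall together, the result is close to +1. Reversing one of the series would give a value close to -1.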

Except for Crude Palm Oil, the commodity data are cash prices taken from the Wall Street Journal Asia edition website, that is, prices of the physical commodity rather than prices of futures contracts on that commodity. For Crude Palm Oil we have used the daily settlement prices of futures contracts (FCPO prices) for the spot month, as available on the Bursa Malaysia website. All commodity prices except those for Silver and Crude Palm Oil are denominated in US dollars. Silver prices are denominated in Sterling, whereas Crude Palm Oil prices are denominated in Malaysian Ringgit.

The correlations have been calculated on data for the period January 2008 to December 2009. Besides the numerical measure of correlation (the correlation coefficient, or r), correlations have also been depicted graphically:

Firstly, by the simple trend in price levels over time (in general, prices of Brent Crude Oil on the left-hand vertical axis and the other commodity’s prices on the right-hand vertical axis), and

Secondly, by looking at scatter plots of a given commodity’s price against that of Crude Oil.

Scatter plots are a visual representation of the correlation between two items, in this instance Brent Crude Oil prices (on the horizontal axis) and the given commodity’s prices (on the vertical axis). The plots are useful in assessing the form, strength and direction of the relationship between the two variables. In addition they are helpful in identifying any outliers in the data.

Form: The way the data points lie in the scatter plot tells us about the functional form of the relationship, i.e. whether or not a linear relationship exists.

Strength: A line of best fit is used in the scatter plot to assess the strength or weakness of a linear relationship. To determine how strong the relationship is, we see how closely a non-horizontal straight line fits the data points of the scatter plot. The greater the dispersion of data points around the line of best fit, the weaker the correlation between the two items. A horizontal line of best fit indicates that there is no linear relationship between the prices of the two commodities.

For the numerical measure, strength is represented by the magnitude of the correlation coefficient, which lies between -1 and +1; the stronger the relationship, the larger the absolute value. One way of interpreting the correlation coefficient is to apply the following “Rules of Thumb” to the absolute value of the calculated r:

“r” ranging from zero to about 0.20 indicates “no or negligible correlation”,
“r” ranging from about 0.20 to 0.40 indicates a “low degree of correlation”,
“r” ranging from about 0.40 to 0.60 indicates a “moderate degree of correlation”,
“r” ranging from about 0.60 to 0.80 indicates a “marked degree of correlation”,
“r” ranging from about 0.80 to 1.00 indicates “high correlation”.
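The rules of thumb above can be sketched as a small lookup function. The name describe_correlation is hypothetical, and because the bands are stated as "about" values, the choice to place a boundary value such as 0.40 in the higher band is an assumption made here for illustration.

```python
def describe_correlation(r):
    """Map a correlation coefficient to its rule-of-thumb label, using |r|."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the rules apply to the absolute value, ignoring the sign
    # Assumption: a value exactly on a boundary falls into the higher band
    if size < 0.20:
        return "no or negligible correlation"
    if size < 0.40:
        return "low degree of correlation"
    if size < 0.60:
        return "moderate degree of correlation"
    if size < 0.80:
        return "marked degree of correlation"
    return "high correlation"

print(describe_correlation(-0.75))
```

A coefficient of -0.75 and one of +0.75 receive the same label, since the label reflects strength only and not direction.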

Direction: Does the scatter plot, or alternatively the line of best fit, slope upwards or downwards? A line sloping upwards from left to right represents positive correlation, i.e. it suggests that as the price of one commodity increases, the price of the other tends to increase as well. A line sloping downwards from left to right indicates negative correlation. For the numerical measure, direction is represented by the sign of the correlation coefficient: a negative sign indicates negative correlation and a positive sign indicates positive correlation.
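To make the link between the line of best fit and direction concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function names best_fit_line and direction, and the price figures, are illustrative assumptions rather than anything taken from the study.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cross / ss_x
    return slope, mean_y - slope * mean_x

def direction(xs, ys):
    """Classify the direction of the relationship from the sign of the fitted slope."""
    slope, _ = best_fit_line(xs, ys)
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "none"  # horizontal line: no linear relationship

# Made-up illustrative prices, NOT the data used in this study
crude = [60.0, 70.0, 80.0, 90.0]
print(direction(crude, [30.0, 36.0, 39.0, 45.0]))
print(direction(crude, [45.0, 39.0, 36.0, 30.0]))
```

The fitted slope and the correlation coefficient share the same numerator (the sum of cross-products of deviations), so they always have the same sign.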

Outliers: These are individual values that fall outside the overall pattern of the relationship and could lead to overstated or understated correlation values. They may be due to errors, anomalies or exceptions in the data.
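The effect of a single outlier can be seen in a small numerical experiment. The sketch below uses made-up numbers, not prices from this study, and a hypothetical helper pearson_r: five points with almost no linear relationship appear highly correlated once one extreme point is added.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Made-up values with little linear relationship between them
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 1.0, 4.0, 2.0, 3.0]
r_without = pearson_r(xs, ys)

# The same data plus one extreme point far from the rest
r_with = pearson_r(xs + [20.0], ys + [20.0])

print(f"without outlier: r = {r_without:.2f}")
print(f"with outlier:    r = {r_with:.2f}")
```

Here the outlier inflates the coefficient. An extreme point lying away from an otherwise tight pattern would have the opposite effect and understate the correlation.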

In addition to assessing correlations over the entire period, we have also looked at how the correlation coefficients (i.e. the numerical measures) varied over different sub-periods. This was done to assess the degree of stability in the correlation figures and to identify periods when correlations broke down or changed. Correlations in any period that are higher than those normally experienced could be indicative of increased systematic risk in the market. The periods considered were:

January 2008 to December 2009
January to December 2008
January to December 2009
January to June 2008
July to December 2008
January to June 2009
July to December 2009
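A calculation over these windows could be organised along the following lines. This is only a sketch under stated assumptions: the row layout (date, crude price, other price), the names PERIODS and correlation_by_period, and the synthetic monthly series are all hypothetical and are not the study's data or method.

```python
from datetime import date
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# The seven windows listed above, as (label, first day, last day)
PERIODS = [
    ("Jan 2008 to Dec 2009", date(2008, 1, 1), date(2009, 12, 31)),
    ("Jan to Dec 2008", date(2008, 1, 1), date(2008, 12, 31)),
    ("Jan to Dec 2009", date(2009, 1, 1), date(2009, 12, 31)),
    ("Jan to Jun 2008", date(2008, 1, 1), date(2008, 6, 30)),
    ("Jul to Dec 2008", date(2008, 7, 1), date(2008, 12, 31)),
    ("Jan to Jun 2009", date(2009, 1, 1), date(2009, 6, 30)),
    ("Jul to Dec 2009", date(2009, 7, 1), date(2009, 12, 31)),
]

def correlation_by_period(rows, periods=PERIODS):
    """rows: list of (day, crude_price, other_price) tuples."""
    results = {}
    for label, start, end in periods:
        # Keep only the observations that fall inside this window
        window = [(c, o) for d, c, o in rows if start <= d <= end]
        if len(window) < 2:
            results[label] = None  # too few points to compute a correlation
            continue
        crude, other = zip(*window)
        results[label] = pearson_r(crude, other)
    return results

# Synthetic monthly series, NOT real prices: the second series moves with
# crude during 2008 and against it during 2009
rows = []
for i in range(24):
    day = date(2008 + i // 12, i % 12 + 1, 1)
    crude_price = 100.0 + i
    other_price = 50.0 + 2 * i if i < 12 else 200.0 - 3 * i
    rows.append((day, crude_price, other_price))

for label, r in correlation_by_period(rows).items():
    print(f"{label}: r = {r:+.2f}")
```

In this synthetic example the full-period figure masks a change of sign between 2008 and 2009, which is the kind of breakdown that comparing sub-periods is meant to reveal.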