## Saturday, 2 March 2013

### Hierarchical structure in financial markets: Clustering stocks

#### INTRODUCTION

Financial markets are well-defined complex systems. The paradigm of mathematical finance is that the time series of stock returns are unpredictable. Within this paradigm, the time evolution of stock returns is well described by random processes. A key question is whether the random processes governing the returns of different stocks are uncorrelated or, conversely, whether economic factors are present in financial markets and drive several stocks at the same time.

#### FORMULA USED

In the present analysis, a hierarchical structure in a portfolio of n stocks traded in a financial market is detected using the synchronous correlation coefficient of the daily difference of the logarithm of the closing price, computed for all pairs of stocks in the portfolio over a given time period. The goal is to obtain the taxonomy of the portfolio using only the information contained in the time series of stock prices. The degree of similarity between the synchronous time evolution of a pair of stock prices is quantified by the correlation coefficient

$$\rho_{ij} = \frac{\langle Y_i Y_j \rangle - \langle Y_i \rangle \langle Y_j \rangle}{\sqrt{\left(\langle Y_i^2 \rangle - \langle Y_i \rangle^2\right)\left(\langle Y_j^2 \rangle - \langle Y_j \rangle^2\right)}}$$

where i and j are the numerical labels of the stocks, $Y_i = \ln P_i(t) - \ln P_i(t-1)$, and $P_i(t)$ is the closing price of stock i on day t. The statistical average $\langle \cdot \rangle$ is a temporal average over all the trading days of the investigated time period.
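
The computation described above can be sketched in a few lines of plain Python. This is only an illustration: it assumes the correlation coefficient is the usual one built from temporal averages, (⟨YiYj⟩ − ⟨Yi⟩⟨Yj⟩) divided by the product of the two standard deviations, and the function names and toy prices are hypothetical rather than taken from the original analysis.

```python
import math

def log_returns(prices):
    """Y(t) = ln P(t) - ln P(t-1) for a list of daily closing prices."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def mean(xs):
    """Temporal average over all trading days in the sample."""
    return sum(xs) / len(xs)

def correlation(y_i, y_j):
    """Synchronous correlation coefficient of two return series."""
    m_i, m_j = mean(y_i), mean(y_j)
    cov = mean([a * b for a, b in zip(y_i, y_j)]) - m_i * m_j
    var_i = mean([a * a for a in y_i]) - m_i ** 2
    var_j = mean([b * b for b in y_j]) - m_j ** 2
    return cov / math.sqrt(var_i * var_j)

def correlation_matrix(price_series):
    """n x n matrix of correlation coefficients for n price series."""
    returns = [log_returns(p) for p in price_series]
    n = len(returns)
    return [[correlation(returns[i], returns[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical closing prices for three stocks (illustrative data only).
toy_prices = [
    [100.0, 102.0, 101.0, 105.0, 107.0],
    [50.0, 51.0, 50.5, 52.5, 53.5],   # same relative moves as the first stock
    [80.0, 79.0, 80.0, 78.0, 77.0],   # moves against the first stock
]
rho = correlation_matrix(toy_prices)
```

In this toy data the second series is a rescaled copy of the first, so their log returns coincide and the coefficient is 1 up to rounding, while the third series moves in the opposite direction and gets a negative coefficient.
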
The n × n matrix of correlation coefficients for the daily logarithmic price differences is then determined. Its elements $\rho_{ij}$ can vary from $-1$ (completely anti-correlated pair of stocks) to $1$ (completely correlated pair of stocks); when $\rho_{ij} = 0$ the two stocks are uncorrelated. A metric can be defined by using a function of the correlation coefficient as the distance. An appropriate function is

$$d(i, j) = \sqrt{2\,(1 - \rho_{ij})}$$

With this choice, d(i, j) fulfills the three axioms of a metric distance: (i) d(i, j) = 0 if and only if i = j; (ii) d(i, j) = d(j, i); and (iii) d(i, j) ≤ d(i, k) + d(k, j).
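
A short derivation may help show why these axioms hold. It takes the distance to be $d(i, j) = \sqrt{2(1 - \rho_{ij})}$, the form used in reference [1]. The normalized series $\tilde{Y}_i$, the number of trading days $T$ and the standard deviation $\sigma_i$ are notation introduced here for illustration only.

$$\tilde{Y}_i(t) = \frac{Y_i(t) - \langle Y_i \rangle}{\sqrt{T}\,\sigma_i}, \qquad \sigma_i^2 = \langle Y_i^2 \rangle - \langle Y_i \rangle^2$$

$$\sum_{t=1}^{T} \tilde{Y}_i(t)^2 = 1, \qquad \sum_{t=1}^{T} \tilde{Y}_i(t)\,\tilde{Y}_j(t) = \rho_{ij}$$

$$\sum_{t=1}^{T} \left(\tilde{Y}_i(t) - \tilde{Y}_j(t)\right)^2 = 1 + 1 - 2\rho_{ij} = 2\,(1 - \rho_{ij}) = d(i, j)^2$$

Under these assumptions d(i, j) is the ordinary Euclidean distance between two unit-length vectors, so symmetry and the triangle inequality are inherited from Euclidean geometry. The distance vanishes only when the two normalized series coincide.
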
The distance matrix D is then used to determine the minimal spanning tree (MST) connecting the n stocks of the portfolio. Constructing an MST that links a set of n objects is straightforward: the MST of a set of n elements is a graph with $n - 1$ links, and it can be built from D using any standard minimum spanning tree algorithm.
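
As a minimal sketch of this step, the snippet below converts a correlation matrix into distances and builds the MST with Kruskal's algorithm. Kruskal is chosen here only as one convenient option, since the text allows any minimum spanning tree algorithm. The distance is assumed to be sqrt(2(1 − ρ)), and the function names and the four-stock matrix are hypothetical.

```python
import math

def distance_matrix(rho):
    """d(i, j) = sqrt(2 * (1 - rho_ij)); max() guards against tiny negative rounding."""
    n = len(rho)
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - rho[i][j]))) for j in range(n)]
            for i in range(n)]

def minimum_spanning_tree(dist):
    """Kruskal's algorithm: returns n - 1 links (i, j, d) in order of increasing d."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # link only stocks not yet connected
            parent[root_i] = root_j
            tree.append((i, j, d))
    return tree

# Hypothetical correlation matrix for four stocks (illustrative values only).
toy_rho = [
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]
mst = minimum_spanning_tree(distance_matrix(toy_rho))
```

Because the distance decreases as the correlation increases, the most correlated pairs are linked first. Here the tree joins stocks 0 and 1, then 2 and 3, and finally connects the two pairs through the link between stocks 1 and 2.
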

#### INFERENCE

a) DOW JONES

The minimal spanning tree associated with the distance matrix D is of great interest from an economic point of view. The most evident and strongly connected group consists of the stocks CHV, TX and XON, namely Chevron, Texaco and Exxon. These three companies work in the same industry (energy) and in the same subindustry (international oils). AA and IP, namely Alcoa (in the nonferrous metals subindustry) and International Paper (in the paper and lumber subindustry), form a second group; both companies provide raw materials. The third group involves companies in industry sectors that deal with consumer nondurables (Procter & Gamble, PG) and food drink and tobacco (Coca-Cola, KO).

b) S&P 500

Fig. 3. Main structure of the hierarchical tree of the portfolio of stocks used to compute the S&P 500 index. Groups are labeled with integers ranging from 1 to 44. 1. Metals (nonferrous metals, gold); 2. Construction (residential builders); 3. No common industry sector; 4. Travel and transport (trucking and shipping); 5. Consumer nondurables (photography and toys); 6. No common industry sector; 7. Metals (steel); 8. Consumer durables (automotive parts); 9. Travel and transport (airlines); 10. Entertainment and information (broadcasting and cable); 11. Financial services (lease and finance); 12. Energy (oilfield services); 13. Energy (international oils); 14. No common industry sector; 15. Capital goods (heavy equipment); 16. Business services and supplies (environmental and waste); 17. Construction (commercial builders); 18. Consumer durables (automobiles and trucks); 19. Food drink and tobacco (tobacco); 20. Entertainment and information (publishing); 21. Forest products and packaging (paper and lumber); 22. Metals (nonferrous materials); 23. Metals (nonferrous materials); 24. Metals (nonferrous materials); 25. Computer and communications (peripherals & equipment or software); 26. Electric utilities (regional area); 27. Computer and communications (telecommunications); 28. Retailing (department stores and drug & discount); 29. No common industry sector; 30. Travel and transport (railroads); 31. Food drink and tobacco (food processors); 32. No common industry sector; 33. Insurance (property & casualty and diversified); 34. Health (drugs); 35. Health (drugs); 36. Consumer nondurables (personal products); 37. Food drink and tobacco (beverages); 38. Retailing (no common subindustry sector (SS)); 39. Capital goods (electrical equipment); 40. Financial services (no common SS); 41. Financial services (thrift institutions); 42. Financial services (multinational banks); 43. Financial services (regional banks); 44. Financial services (multinational banks).

The same investigation is repeated for the set of stocks used to compute the S&P 500 index, as shown in Fig. 2. A strongly connected group of stocks is observed, comprising financial services, capital goods, retailing, food drink & tobacco and consumer nondurables companies. A detailed inspection of the hierarchical tree associated with the MST provides a large amount of economic information. With only a few exceptions, the groups are homogeneous with respect to industry, and often also subindustry, sectors. This suggests that sets of stocks working in the same industry and subindustry sectors respond, in a statistical way, to the same economic factors.

The tree can also separate companies that share a formal classification. For example, ores, aluminum and copper companies are all classified as metals by industry and as nonferrous metals by subindustry, yet the analysis detects that they respond to quite different economic factors. Specifically, ores companies are grouped in a cluster that is the most distant from all the other groups of stocks in the tree, while aluminum and copper companies constitute a subgroup of the group containing raw materials companies.

The detection of a hierarchical structure in a broad portfolio of stocks traded in a financial market is consistent with the assumption that the time series of returns of a stock is affected by a number of economic factors. In general, stocks or groups of stocks departing early from the tree (at high values of the ultrametric distance $d^{<}(i, j)$ of the hierarchical tree) are mainly controlled by economic factors specific to the considered group. An example is the gold price for the stocks of group 1 of the tree (see Fig. 3), which is composed only of companies involved in gold mining. When departure occurs at (moderately) low values of $d^{<}$, the stocks are affected both by economic factors common to all stocks and by other economic factors specific to the considered set of stocks.
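
The distance d< that sets the height at which groups depart from the tree is, in reference [1], the subdominant ultrametric distance: the largest single link met on the MST path between two stocks. Assuming that reading, a minimal sketch of how it could be computed from the MST links follows. The function name and the toy links are hypothetical, and the approach is equivalent to single-linkage merging.

```python
def subdominant_ultrametric(n, mst_links):
    """Return the n x n matrix of ultrametric distances from MST links (i, j, d)."""
    members = {k: [k] for k in range(n)}   # cluster label -> stocks in that cluster
    label = list(range(n))                 # stock -> current cluster label
    ultra = [[0.0] * n for _ in range(n)]

    # Merge clusters from the shortest link to the longest. The link that first
    # joins two clusters is the largest link on the MST path between their members.
    for i, j, d in sorted(mst_links, key=lambda link: link[2]):
        a, b = label[i], label[j]
        if a == b:
            continue                       # cannot happen for a proper tree
        for p in members[a]:
            for q in members[b]:
                ultra[p][q] = ultra[q][p] = d
        for q in members[b]:
            label[q] = a
        members[a].extend(members.pop(b))
    return ultra

# Hypothetical MST links for four stocks (illustrative distances only).
toy_links = [(0, 1, 0.63), (2, 3, 0.89), (1, 2, 1.10)]
ultra = subdominant_ultrametric(4, toy_links)
```

In this toy tree the pairs (0, 1) and (2, 3) join at low distances, while any stock from the first pair meets any stock from the second only at 1.10, the height at which the two groups separate.
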

The detected hierarchical structure might be useful in the description of financial markets and in the search for economic factors affecting specific groups of stocks. The taxonomy associated with this hierarchical structure is obtained using only the information present in the time series of stock prices. This result shows that time series of stock prices carry valuable (and detectable) economic information.

#### REFERENCES:

[1] R. N. Mantegna, “Hierarchical structure in financial markets,” Eur. Phys. J. B, vol. 11, pp. 193–197, 1999.
[2] J. Liu, C. K. Tse and K. He, “Detecting Stock Market Fluctuation from Stock Network Structure Variation.”
[3] S. A. Ross, J. Econ. Theory 13, 341 (1976).
[4] B. B. Mandelbrot, J. Business 36, 394 (1963).
[5] L. P. Kadanoff, Simulation 16, 261 (1971).
[6] R. N. Mantegna, Physica A 179, 232 (1991).