Question

I am looking for a package or method that would let me download index compositions from various websites. Index compositions change rarely and are easy to find, but I can't find any CSV available online.

How can I load, say, the CAC 40 composition?

PS: What I care about are the names/ISIN/SICOVAM codes, not really the weights in the index.

Solution

The composition of the CAC 40 is listed on Wikipedia, and you can download and parse the page with the XML package.

The function readHTMLTable() is particularly useful, since it finds and parses every table on the page. In this case the relevant table is the second one, hence the index [[2]] in the code (the position depends on the page layout, so it may need adjusting if the page changes). Try:

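library(XML)
url <- "http://en.wikipedia.org/wiki/CAC_40"
# readHTMLTable() returns a list of tables; keep the second one
dat <- readHTMLTable(url)[[2]]

# Show the first rows of the first three columns
head(dat[, 1:3])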
        Company           ICB Sector Ticker symbol
1         Accor               hotels            AC
2   Air Liquide  commodity chemicals            AI
3        Alstom industrial machinery           ALO
4 ArcelorMittal                steel            MT
5           AXA  full line insurance            CS
6   BNP Paribas                banks           BNP

The same code also works for the FTSE 100:

url <- "http://en.wikipedia.org/wiki/FTSE_100_Index"
dat <- readHTMLTable(url)[[2]]
head(dat[, 1:3])
                   Company          Sector Market cap (£bn)
1        Royal Dutch Shell     Oil and gas                 135
2                     HSBC         Banking                 129
3                       BP     Oil and gas                  85
4           Vodafone Group       Telecomms                  83
5          GlaxoSmithKline Pharmaceuticals                  73
6 British American Tobacco         Tobacco                  69
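
Because the hard-coded [[2]] depends on where the table happens to sit on the page, it may be more robust to pick the table by its column names. The following is only a sketch in base R: pick_table() is a hypothetical helper (not part of the XML package), and the small `tables` list is a made-up stand-in for what readHTMLTable(url) returns, so that the example runs without network access.

```r
# Hypothetical helper: return the first data frame in `tables` whose
# column names include every name listed in `required`.
pick_table <- function(tables, required) {
  for (tbl in tables) {
    if (is.data.frame(tbl) && all(required %in% names(tbl))) {
      return(tbl)
    }
  }
  stop("no table contains columns: ", paste(required, collapse = ", "))
}

# Stand-in for readHTMLTable(url): the first element mimics an unrelated
# table, the second mimics the constituents table (placeholder rows only).
tables <- list(
  data.frame(V1 = c("a", "b"), V2 = c("x", "y")),
  data.frame(
    Company = c("Accor", "Air Liquide"),
    `ICB Sector` = c("hotels", "commodity chemicals"),
    `Ticker symbol` = c("AC", "AI"),
    check.names = FALSE  # keep column names with spaces as-is
  )
)

dat <- pick_table(tables, c("Company", "Ticker symbol"))
head(dat)
```

With real data, the same call would presumably be pick_table(readHTMLTable(url), c("Company", "Ticker symbol")), assuming the page still uses those column headers.
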
Licensed under: CC-BY-SA with attribution