Question

I have used a CodeProject article to get share data from Yahoo ( http://www.codeproject.com/Articles/37550/Stock-quote-and-chart-from-Yahoo-in-C ).

Yahoo Finance also has 'Key Statistics' that I would like to use, but they are not available by this means (e.g. the data at http://uk.finance.yahoo.com/q/ks?s=BNZL.L ). Is there any way to get this information directly? I would really rather not screen scrape if possible.

I am using C#/.NET4.

Solution

You can use my .NET library, Yahoo! Managed. It has the MaasOne.Finance.YahooFinance.CompanyStatisticsDownload class, which does exactly what you want.

P.S. You need to use the latest version (0.10.1). The Key Statistics download in v0.10.0.2 is obsolete.

OTHER TIPS

I landed on this question a couple of days ago while searching for an answer, so I thought I would share a solution I wrote in R (and posted on R-Bloggers). I know it is not in C#, but XPath and XML are supported in most languages, so the same approach can be ported. The blog post is at http://www.r-bloggers.com/pull-yahoo-finance-key-statistics-instantaneously-using-xml-and-xpath-in-r/

#######################################################################
##Alternate method to download all key stats using XML and x_path - PREFERRED WAY
#######################################################################

# Set the working directory to wherever the CSV should be written, e.g.:
# setwd("C:/Users/i827456/Pictures/Blog/Oct-25")
require(XML)
require(plyr)
getKeyStats_xpath <- function(symbol) {
  yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
  html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")

  #search for <td> nodes anywhere that have class 'yfnc_tablehead1'
  nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")

  if(length(nodes) > 0 ) {
   measures <- sapply(nodes, xmlValue)

   #Clean up the column name
   measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))   

   #Remove dups
   dups <- which(duplicated(measures))
   #print(dups) 
   #seq_along() avoids looping over 1:0 when there are no duplicates
   for(i in seq_along(dups)) 
     measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")

   #use siblings function to get value
   values <- sapply(nodes, function(x)  xmlValue(getSibling(x)))

   df <- data.frame(t(values))
   colnames(df) <- measures
   return(df)
  } else {
    #'break' is an error outside a loop; return NULL so ldply() adds no row for this symbol
    return(NULL)
  }
}

tickers <- c("AAPL")
stats <- ldply(tickers, getKeyStats_xpath)
rownames(stats) <- tickers
write.csv(t(stats), "FinancialStats_updated.csv",row.names=TRUE)  

#######################################################################
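
To illustrate that the approach is not tied to R, here is a minimal Python sketch of the same idea: find the label cells with class 'yfnc_tablehead1', clean the label, and read the value from the next cell in the row. It is not the method from the blog post. The SAMPLE markup, its values, the 'yfnc_tabledata1' class and the function names clean_label and key_stats are all made up for illustration, and the standard-library XML parser is used only to keep the example self-contained; a real page would need a tolerant HTML parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup with made-up values (assumption:
# label cells carry class 'yfnc_tablehead1', as in the R code above).
SAMPLE = """
<html><body><table>
  <tr><td class="yfnc_tablehead1">Market Cap (intraday)5:</td><td class="yfnc_tabledata1">1.0B</td></tr>
  <tr><td class="yfnc_tablehead1">Trailing P/E (ttm, intraday):</td><td class="yfnc_tabledata1">12.34</td></tr>
  <tr><td class="yfnc_tablehead1">Beta:</td><td class="yfnc_tabledata1">0.90</td></tr>
  <tr><td class="yfnc_tablehead1">Beta:</td><td class="yfnc_tabledata1">0.95</td></tr>
</table></body></html>
"""


def clean_label(label):
    """Strip a parenthesised qualifier, footnote digits and the trailing colon."""
    label = re.sub(r" \(.*?\)[0-9]*:", "", label)
    return re.sub(r" *[0-9]*:", "", label)


def key_stats(markup):
    """Return a dict mapping each cleaned label to the text of the next cell."""
    root = ET.fromstring(markup)
    stats = {}
    dup_count = 0
    for row in root.iter("tr"):
        cells = row.findall("td")
        for i, cell in enumerate(cells[:-1]):
            if cell.get("class") != "yfnc_tablehead1":
                continue
            name = clean_label("".join(cell.itertext()))
            if name in stats:
                # Make repeated labels unique by appending a counter
                dup_count += 1
                name = f"{name} {dup_count}"
            stats[name] = "".join(cells[i + 1].itertext()).strip()
    return stats


if __name__ == "__main__":
    for name, value in key_stats(SAMPLE).items():
        print(name, "=", value)
```

The dictionary plays the role of the one-row data frame in the R version: keys are the cleaned measure names and values are the cell texts.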

If you don't mind using the key statistics from BarChart.com, here is a simple function:

library(XML)

getKeyStats <- function(symbol) {
  barchart.URL <- "http://www.barchart.com/profile.php?sym="
  barchart.URL.Suffix <- "&view=key_statistics"
  html_table <- readHTMLTable(paste(barchart.URL, symbol, barchart.URL.Suffix, sep = ""))
  # The fifth table on the page is assumed to hold the key statistics
  df_keystats <- html_table[[5]]
  print(df_keystats)
}
Licensed under: CC-BY-SA with attribution