Question

I'm new to R and even newer to using it with Excel. I want to get a list of all the worksheet names (Notes, Weights, Lengths) in an .xls file. You can see what I'm trying below. The problem is that each name in the output has a dollar sign ($) at the end for some reason, and is sometimes also surrounded by single quotes.

library(RODBC)

FileToImport <- "C:\\folder\\filetoimport.xls"

z <- odbcConnectExcel(FileToImport, readOnly = TRUE)

sqlTables(z)
                     TABLE_CAT TABLE_SCHEM TABLE_NAME   TABLE_TYPE REMARKS
1 C:\\folder\\filetoimport.xls        <NA>     Notes$ SYSTEM TABLE    <NA>
2 C:\\folder\\filetoimport.xls        <NA> 'Weights$'        TABLE    <NA>
3 C:\\folder\\filetoimport.xls        <NA> 'Lengths$'        TABLE    <NA>

sqlTables(z)[,"TABLE_NAME"]

[1] "Notes$"     "'Weights$'" "'Lengths$'"

I could try to clean these characters up, but I don't really know how to go about it since the quoting is inconsistent: some of the worksheets are listed as "SYSTEM TABLE" and some as just "TABLE". Could someone explain what the difference between these worksheets is and give me an idea of how to recreate just the 'clean' tab names?

Solution

Thanks to the nudge in the right direction from the other answer, I managed to use a regular expression to get the worksheet names in the desired form (with all punctuation removed).

gsub("[[:punct:]]","",sqlTables(z)[,"TABLE_NAME"]) 
[1] "Sheet1" "Sheet2" "Sheet3"
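
Note that [[:punct:]] removes every punctuation character, so a sheet called My_Sheet-2 would come back as MySheet2. A narrower variant, sketched below, strips only the wrapping single quotes and the trailing dollar sign. The helper name clean_sheet_names and the sample vector are made up for illustration; the first three values are copied from the question and the fourth is a hypothetical sheet name.

```r
# Sample values standing in for sqlTables(z)[, "TABLE_NAME"];
# "'My_Sheet-2$'" is a hypothetical name added to show the difference.
raw_names <- c("Notes$", "'Weights$'", "'Lengths$'", "'My_Sheet-2$'")

# Hypothetical helper: remove only the outer quotes and the final "$"
clean_sheet_names <- function(x) {
  x <- gsub("^'|'$", "", x)  # drop a leading and/or trailing single quote
  sub("\\$$", "", x)         # drop one trailing dollar sign
}

cleaned <- clean_sheet_names(raw_names)
cleaned
```

With the sample vector this yields Notes, Weights, Lengths and My_Sheet-2, leaving the underscore and hyphen inside the last name untouched.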

Other tips

I don't have much experience with RODBC, but is the following output what you mean by clean?

 data.frame(sqlTables(z))$TABLE_NAME
 [1] "Sheet1$"  "Sheet2$"  "Sheet3$"  "ZRDaten1"

If you save that in a vector, say b, you can access the individual names with b[i]. If you only need a certain type, what about:

 na.omit(ifelse(data.frame(sqlTables(z))$TABLE_TYPE=='SYSTEM TABLE', data.frame(sqlTables(z))$TABLE_NAME, NA))
 [1] "Sheet1$" "Sheet2$" "Sheet3$"

Admittedly inelegant.
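
The same filter can probably be written with plain logical indexing instead of ifelse() and na.omit(). The sketch below uses a mock data frame so it runs without an ODBC connection; the two columns and their values are copied from the sqlTables(z) output in the question, and the names tabs and system_tabs are made up for illustration. With a live connection, tabs <- sqlTables(z) would take the place of the mock.

```r
# Mock of the relevant columns of sqlTables(z), values taken from the question
tabs <- data.frame(
  TABLE_NAME = c("Notes$", "'Weights$'", "'Lengths$'"),
  TABLE_TYPE = c("SYSTEM TABLE", "TABLE", "TABLE"),
  stringsAsFactors = FALSE  # keep the names as character, not factor
)

# Logical indexing keeps only the names whose type matches
system_tabs <- tabs$TABLE_NAME[tabs$TABLE_TYPE == "SYSTEM TABLE"]
system_tabs
```

Querying sqlTables(z) once and reusing the result also avoids hitting the connection twice.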

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow