Question

I have a dataset containing a daily time series, and I want to arrange it into a single column. This is my data:

Date       Day 1    Day 2   Day 3   Day 4   Day 5   Day 6 .... Day 31
01/01/1964  0         0       0       0       0       0         3
01/02/1964  NA       NA      NA       NA      NA      NA ...                
01/03/1964  195      445    329      121     61,6     44 ...
01/04/1964  17,2    14,9    17,1     102     54,3    9,33 ...

I want this:

 Day1  0
 Day2  0
       .
       .
       .
 Day31 3

I'm having problems because of leap years, which have 366 days. I tried the code below, but without success. Thanks in advance.

EDIT:

I finally got it working, but if anyone knows an easier way, using some package or function, I'd be grateful. Otherwise I'll write my own function.

EDIT 2:

Now I have a problem when the data do not start in the first month of a year.

rm(list = ls())
cat("\014")


setwd("C:/")

require(XLConnect)

# Load Streamflow Gauging Station

wb <- loadWorkbook("rainfall.xls")
Data<- readWorksheet(wb, sheet = "rainfall",header = FALSE,region = "B02:AF517")

R<- Data; ##1964 - 2006

sum(R[is.na(R)==FALSE])

# Number of days in each month

Ny<- c(31,28,31,30,31,30,31,31,30,31,30,31); # Normal Year
Ly<- c(31,29,31,30,31,30,31,31,30,31,30,31); # Leap/bissextile Year

S1<- c(1,0,0,0) # Leap year, normal year...
S2<- c(0,1,0,0) # Normal year, leap year...
S3<- c(0,0,1,0) #...
S4<- c(0,0,0,1) #...

Iab<- rep(S1,times=ceiling((nrow(R)/12)/4)); # Index of years
Iab<- Iab[1:(nrow(R)/12)];
Rnew<- matrix(numeric(0), 0,0);


# Organize the data into a single column

for(i in 1:(nrow(R)/12)){
  for(j in 1:12){
    if(Iab[i]==0){
      Rnew<-c(Rnew, t(R[12*(i-1)+j,1:Ny[j]]))
    }else{
      Rnew<-c(Rnew, t(R[12*(i-1)+j,1:Ly[j]]))
    }
  }
}

sum(R[is.na(R)==FALSE])==sum(Rnew[is.na(Rnew)==FALSE]) # Test that the reorganization succeeded
sum(R[is.na(R)==FALSE])
sum(Rnew[is.na(Rnew)==FALSE])
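
The fixed four-year pattern above only works when the first row is January of a leap year. One possible way around that, sketched here under the assumption that the start year and month of the table are known, is to let R's Date arithmetic count the days. The helper name days_per_month is hypothetical and not part of the original code.

```r
# Hypothetical helper (not from the original post): number of days in each of
# n_months consecutive months, starting at the given year and month.
days_per_month <- function(start_year, start_month, n_months) {
  first <- as.Date(sprintf("%04d-%02d-01", start_year, start_month))
  # First day of every month, plus one extra month to close the last interval
  starts <- seq(first, by = "month", length.out = n_months + 1)
  # Differences between consecutive month starts are the month lengths
  as.integer(diff(starts))
}

# A series that starts in March 1964 and runs for five months
nd <- days_per_month(1964, 3, 5)

# February in a leap year and in a normal year
feb_leap   <- days_per_month(1964, 2, 1)
feb_normal <- days_per_month(1965, 2, 1)
```

With such a vector, row i of R would contribute its first nd[i] columns, regardless of where the series starts.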

Solution

I have a similar problem, though in a way an even worse one, since I have discharge data (from a Brazilian ANA station) with several interruptions lasting several months or years. Vazao01 stands for the discharge on the first day of the month, Vazao02 for the second, and the data frame goes up to Vazao31 (which is obviously NA for months with fewer days, but can also be NA for existing days without a record). The data, stored in the data.frame "ANAday", looks like this:

  Date Vazao01  Vazao02  Vazao03...
20 01.05.1989 3463.00 3476.500 3463.000
21 01.06.1989 1867.70 1835.900 1809.400
22 01.07.1989  809.90  798.200  774.800
23 01.08.1989  344.60  308.700  297.900
24 01.11.1989  376.50  388.100  391.000
25 01.12.1989  279.00  289.800  319.500
26 01.01.1990 1715.00 1649.000 1573.200
27 01.02.1990 1035.20 1005.800  972.200
28 01.03.1990 2905.60 2962.100       NA
29 01.06.1990      NA       NA       NA
30 01.07.1990  297.90  284.400  271.200
31 01.08.1990  228.00  223.200  218.400
32 01.08.1999      NA       NA  144.000
33 01.09.1999   20.74   18.620   16.500
34 01.10.1999  119.85  111.450   95.385
35 01.11.1999   11.20   23.705   48.370
36 01.12.1999  160.10  179.000  187.400
37 01.01.2000  843.00  865.300  914.500
38 01.02.2000 1331.30 1368.900 1387.800
39 01.04.2000 1823.60 1808.000 1789.800
40 01.05.2000 1579.00 1524.100 1445.700

I made a list of the months with data:

ANAm=as.Date(ANAday[,1], format="%d.%m.%Y")
format(ANAm, format="%Y-%m")

Then I used the "monthDays" function of the Hmisc package to list the number of days in each month:

require(Hmisc)
nodm=monthDays(ANAm)
Nodm=cbind.data.frame(ANAm,nodm)

I prepared a data.frame for the data I want to have with 3 columns for "YEAR MONTH", "DAY" and "DISCHARGE"

ANATS=array(NA,c(1,3))
colnames(ANATS)=c("mY","d","Q")

And used a simple "for" loop to extract the data into one column according to the number of days in each month

for(i in 1:nrow(Nodm)){
ndays=as.numeric(Nodm[i,2])##number of days in this month
selectANA=as.vector(t(ANAday[i,1+(1:ndays)]))##skip the Date column; t() generates a simple vector
dayANA=c(1:ndays)
monthANA=rep(format(as.Date(Nodm[i,1]),format="%Y-%m"),times=ndays)
ANAts=cbind(monthANA,dayANA,selectANA)
ANATS=rbind(ANATS,ANAts)
}

The ANATS can then be converted into a time series:

require(xts)
ANATS=ANATS[-1,]##drop the NA placeholder row created with array()
combine.date=paste(ANATS[,1],ANATS[,2],sep="-")
DATE=as.Date(combine.date, format="%Y-%m-%d")
rownames(ANATS)=as.character(DATE)
ANAXTS=as.xts(ANATS)
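
The same reshaping can probably be done without a loop and without Hmisc. The following is only a minimal base R sketch on an invented toy table (ANAtoy and its values are assumptions, standing in for ANAday). It builds a candidate date for every day column and keeps only the dates that stay inside their own month, so the 29th to 31st of short months drop out automatically.

```r
# Toy stand-in for ANAday: a Date column plus 31 day columns (values invented)
ANAtoy <- data.frame(Date = c("01.02.1990", "01.06.1990"),
                     matrix(seq_len(62), nrow = 2, byrow = TRUE))
names(ANAtoy)[-1] <- sprintf("Vazao%02d", 1:31)

month_start <- as.Date(ANAtoy$Date, format = "%d.%m.%Y")

# One row per (month, day) pair, ordered month by month
long <- data.frame(
  month = rep(month_start, each = 31),
  day   = rep(1:31, times = nrow(ANAtoy)),
  Q     = as.vector(t(as.matrix(ANAtoy[, -1])))
)

# Candidate calendar date; a day that does not exist spills into the next month
long$date <- long$month + (long$day - 1)

# Keep only the dates that still belong to the month of their row
long <- long[format(long$date, "%Y-%m") == format(long$month, "%Y-%m"),
             c("date", "Q")]
```

Because every row carries its own date, gaps of months or years between rows do not matter. Days that exist but have no record simply stay NA in Q.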

Other suggestions

Maybe I'm having trouble understanding exactly what you're looking for, but are you trying to transpose the data?

t(data)
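
As a small illustration of how the transpose could lead to a single column, here is a sketch on an invented toy table (the values and column names are assumptions). Note that it keeps the padding cells for days that do not exist in a month, so those would still have to be dropped separately.

```r
# Toy table: two months (rows) by three "day" columns, values invented
data <- data.frame(Day1 = c(1, 4), Day2 = c(2, 5), Day3 = c(3, NA))

# t() puts days in rows and months in columns; as.vector() then reads the
# result column by column, that is, month after month
single_column <- as.vector(t(as.matrix(data)))
```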
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow