Question

I created a vector:

totalDeposit <- cumsum(testd$TermDepositAMT[s1$ix])

This calculates the cumulative sum of the TermDepositAMT values in the testd data frame and stores it in totalDeposit. This works fine.

I then need to calculate the average of the deposit amount and I use the following code:

avgDeposit <- totalDeposit / (1:testd)

But I get an error message:

Error in 1:testd : NA/NaN argument
In addition: Warning message:
In 1:testd : numerical expression has 19 elements: only the first used

testd has some 8000 observations and 19 variables.

Could someone help me get past this issue? I've searched for this error message online, but all I have understood so far is that 1:testd makes R treat testd as a number, which it isn't, hence the error. Would simply taking mean(totalDeposit) do the trick? I tried it, but the figure I get is absurd and nowhere near representative of the average.

Thank you for your help.


Solution

The error message is, in this case, helpful.

When you write 1:N, you're telling R "give me the sequence of integers from 1 to N", so both sides of the colon need to be single numbers. testd isn't a single number; it's a data frame with 19 columns. That is what the warning is saying: the expression has 19 elements, and R discards all but the first when calculating the sequence. That first element is an entire column rather than one number, so R cannot use it as an endpoint, hence the NA/NaN error. The alternative would be either a horrible error or a set of sequences: one between 1 and the first value in testd, another between 1 and the second value in testd, and so on.

What you want instead is 1:nrow(testd), if testd is a data frame, and either 1:length(testd) or seq_along(testd) if it's a list or vector.
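As a rough illustration of the difference between these helpers, here is a minimal sketch on a toy data frame. The data and the Age column are made up for the example; only the name TermDepositAMT comes from the question.

```r
# Toy stand-in for testd (hypothetical values; the real frame has
# about 8000 rows and 19 columns)
testd <- data.frame(TermDepositAMT = c(100, 200, 300, 400),
                    Age            = c(25, 40, 33, 51))

n_rows <- nrow(testd)     # number of observations (rows)
n_cols <- length(testd)   # a data frame is a list of columns, so this counts columns

# Safe ways to build the index 1, 2, ..., n
row_index <- seq_len(n_rows)                   # one index per row
amt_index <- seq_along(testd$TermDepositAMT)   # one index per element of the column
```

Note that length() on a data frame counts columns, not rows, which is why nrow() is the one to use for a data frame. seq_len() and seq_along() also behave sensibly when the input is empty, whereas 1:0 counts backwards.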

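Based on the question, though (the need to calculate an average), you're approaching this the wrong way: you don't want a sequence of values, you want just one. Since average = total / number of elements that went into that total, you only need the number of elements, which nrow(testd) gives you, and the total is the sum of the amounts (equivalently, the last value of totalDeposit). mean(testd$TermDepositAMT) does both steps at once. mean(totalDeposit) looks absurd because it averages the running totals rather than the deposits themselves.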
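To make the arithmetic concrete, here is a small sketch with invented amounts (the four values are an assumption for illustration, not the asker's data). It compares the overall average with the mean of the cumulative sums.

```r
# Hypothetical deposits standing in for testd$TermDepositAMT
testd <- data.frame(TermDepositAMT = c(100, 200, 300, 400))

totalDeposit <- cumsum(testd$TermDepositAMT)   # running totals: 100 300 600 1000

# Overall average: total divided by the number of observations
overall_avg <- sum(testd$TermDepositAMT) / nrow(testd)   # 250

# The built-in does the same thing in one step
same_avg <- mean(testd$TermDepositAMT)                   # 250

# Averaging the running totals is a different (much larger) quantity
avg_of_running_totals <- mean(totalDeposit)              # 500
```

If the real column contains missing values, mean(..., na.rm = TRUE) may be what is wanted, but that depends on how the NAs should be treated.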

OTHER TIPS

It's pretty clear that testd is a dataframe or a list since you didn't get an error from testd$. If you had a testd in which the first element were a number but it was longer than one element you would only have gotten a warning. You perhaps wanted to write:

avgDeposit <- totalDeposit / 1:nrow(testd)

... although I admit that doesn't seem very useful. At least it won't throw an error.
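For what it's worth, dividing a cumulative sum by 1, 2, 3, ... gives a running average, which may be what the original code was after. The sketch below assumes that s1 came from sort(..., index.return = TRUE) on some ranking column; that column (score here) and all values are hypothetical, since the question does not show how s1 was built.

```r
# Hypothetical data: amounts plus a made-up ranking column
testd <- data.frame(TermDepositAMT = c(100, 200, 300, 400),
                    score          = c(0.2, 0.9, 0.5, 0.7))

# Assumed origin of s1: sort() with index.return = TRUE returns a list
# whose $ix component holds the ordering permutation
s1 <- sort(testd$score, decreasing = TRUE, index.return = TRUE)

# Cumulative deposits in ranked order, as in the question
totalDeposit <- cumsum(testd$TermDepositAMT[s1$ix])   # 200 600 900 1000

# Running average: divide each running total by how many values went into it.
# seq_along(totalDeposit) stays correct even if s1$ix covers only some rows.
avgDeposit <- totalDeposit / seq_along(totalDeposit)  # 200 300 300 250
```

Using seq_along(totalDeposit) instead of 1:nrow(testd) keeps the divisor the same length as the numerator, so there is no silent recycling if the index selects a subset of the rows.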

Licensed under: CC-BY-SA with attribution