I'm getting these programming errors in R - attempt to apply non-function, adding class "factor" to an invalid object

StackOverflow https://stackoverflow.com/questions/22954377

Question

I'm new to R programming. I have a CSV file that lists countries with their life expectancy and region, and I have to do the following:

  1. List the number of countries per region and draw a bar chart
  2. Draw a boxplot for each region
  3. Cluster the countries by life expectancy using the k-means algorithm
  4. Name the countries that have the minimum and maximum life expectancy

input.csv

Country,LifeExpectancy,Region
India,60,Asia
Srilanka,62,Asia
Myanmar,61,Asia
USA,65,America
Canada,65,America
UK,68,Europe
Belgium,67,Europe
Germany,69,Europe
Switzerland,70,Europe
France,68,Europe

What I did?

1.

mydata <- read.table("input.csv", header=TRUE, sep=",")
barplot(mydata$ncol(Region))

and I get the error: Error in barplot(mydata$ncol(Region)) : attempt to apply non-function

  2. boxplot(LifeExpectancy ~ Region, data=mydata)  ## This is correct

  3. I have no idea how to do this!

  4. min(mydata$LifeExpectancy); max(mydata$LifeExpectancy)  ## This is correct

Solution

As I pointed out in my comments, this question is really multiple questions, and does not reflect the title. In future, please try to keep questions manageable and discrete. I'm not going to attempt to answer your third point (about K-means clustering) here. Search SO and I'm sure you will find some relevant questions/answers.
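
As a starting point for that search, a minimal sketch of one-dimensional k-means with base R's kmeans() might look like the following. The choice of three clusters, the seed, and the nstart value are assumptions made for illustration; the question does not specify any of them, and the data frame is rebuilt inline only so the sketch runs on its own.

```r
# Data from the question, rebuilt inline so the sketch is self-contained.
d <- data.frame(
  Country = c("India", "Srilanka", "Myanmar", "USA", "Canada",
              "UK", "Belgium", "Germany", "Switzerland", "France"),
  LifeExpectancy = c(60, 62, 61, 65, 65, 68, 67, 69, 70, 68),
  Region = rep(c("Asia", "America", "Europe"), times = c(3, 2, 5)),
  stringsAsFactors = FALSE
)

# k-means starts from randomly chosen centres, so fix the seed for repeatability.
set.seed(1)

# Assumption: k = 3 clusters, picked for illustration only.
# nstart = 25 reruns the algorithm from 25 random starts and keeps the best fit.
km <- kmeans(d$LifeExpectancy, centers = 3, nstart = 25)

# Attach the cluster label of each country to the data frame.
d$Cluster <- km$cluster

# Countries grouped by their assigned cluster.
split(d$Country, d$Cluster)

# Mean life expectancy of each cluster.
km$centers
```

The cluster numbers themselves are arbitrary labels; only the grouping is meaningful.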

Regarding your other questions, have a careful look at the following. If you don't understand what a particular function is doing, refer to ?function_name (e.g. ?tapply). For further insight, run nested code from the inside out: for foo(bar(baz(x))), you could examine baz(x), then bar(baz(x)), and finally foo(bar(baz(x))). This is an easy way to get a handle on what's going on, and it is also useful when debugging code that produces errors.

d <- read.csv(text='Country,LifeExpectancy,Region
India,60,Asia
Srilanka,62,Asia
Myanmar,61,Asia
USA,65,America
Canada,65,America
UK,68,Europe
Belgium,67,Europe
Germany,69,Europe
Switzerland,70,Europe
France,68,Europe', header=TRUE)

barplot(with(d, tapply(Country, Region, length)), cex.names=0.8, 
        ylab='No. of countries', xlab='Region', las=1)

[Figure: bar plot of the number of countries per region]
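
To make the inside-out advice concrete, here is a small sketch that evaluates the inner part of the barplot() call above on its own, and also shows why the original mydata$ncol(Region) failed. The data frame is rebuilt inline so the block runs by itself; the table() line is an alternative way to get the same counts, not something the call above uses.

```r
# Data from the question, rebuilt inline so the sketch is self-contained.
d <- data.frame(
  Country = c("India", "Srilanka", "Myanmar", "USA", "Canada",
              "UK", "Belgium", "Germany", "Switzerland", "France"),
  LifeExpectancy = c(60, 62, 61, 65, 65, 68, 67, 69, 70, 68),
  Region = rep(c("Asia", "America", "Europe"), times = c(3, 2, 5)),
  stringsAsFactors = FALSE
)

# Inner step: count the countries within each region.
# with(d, ...) lets the expression refer to the columns of d by name.
counts <- with(d, tapply(Country, Region, length))
counts
# America    Asia  Europe
#       2       3       5

# Alternative: table() produces the same counts directly.
tab <- table(d$Region)

# Why the original call failed: `$` looks for a column named "ncol".
# No such column exists, so the lookup returns NULL, and calling NULL
# as if it were a function raises "attempt to apply non-function".
is.null(d$ncol)
# [1] TRUE

# Outer step: plot the counts once the inner result looks right.
barplot(counts, ylab = "No. of countries", xlab = "Region", las = 1)
```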

boxplot(LifeExpectancy ~ Region, data=d, las=1, 
        xlab='Region', ylab='Life expectancy')

[Figure: boxplot of life expectancy by region]

d$Country[which.min(d$LifeExpectancy)]

# [1] India
# Levels: Belgium Canada France Germany India Myanmar Srilanka Switzerland UK USA

d$Country[which.max(d$LifeExpectancy)]

# [1] Switzerland
# Levels: Belgium Canada France Germany India Myanmar Srilanka Switzerland UK USA
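
Two caveats about the lookups above, sketched here under stated assumptions. First, which.min() and which.max() return only the first matching position, so a tie for the extreme value would report a single country; comparing against min() and max() keeps every tied row. Second, the Levels lines in the output above appear because Country was read as a factor; from R 4.0.0 onwards read.csv() defaults to stringsAsFactors = FALSE, so the same code prints a plain character string instead. The data frame and the names lowest and highest below are introduced only for this illustration.

```r
# Data from the question, rebuilt inline so the sketch is self-contained.
d <- data.frame(
  Country = c("India", "Srilanka", "Myanmar", "USA", "Canada",
              "UK", "Belgium", "Germany", "Switzerland", "France"),
  LifeExpectancy = c(60, 62, 61, 65, 65, 68, 67, 69, 70, 68),
  stringsAsFactors = FALSE
)

# Keep every country that shares the minimum (or maximum) value.
# as.character() drops factor levels if Country happens to be a factor.
lowest  <- as.character(d$Country[d$LifeExpectancy == min(d$LifeExpectancy)])
highest <- as.character(d$Country[d$LifeExpectancy == max(d$LifeExpectancy)])

lowest
# [1] "India"

highest
# [1] "Switzerland"
```
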
Licensed under: CC-BY-SA with attribution