Interpreting Density Plot in R

Question 1

The density differs between the two plots because one of them has 365 times as many units along the horizontal axis, so its vertical units must be 1/365th those of the other plot. A probability density function must integrate to one, that is, the area under each of these curves must equal one.

This is easier to think about in terms of bins rather than density curves. If you have one bin replacing 365 bins, the probability of landing in that one bin is the sum of the 365 individual probabilities, so it is 365 times the average probability of landing in one of the individual bins.
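To make the bin argument concrete, here is a minimal sketch in base R. The ages are simulated (an assumption, not the data from the question), spread uniformly over ten 365-day years, and the names age_days, p_year and p_day are made up for illustration.

```r
# Simulated ages in days over ten 365-day years (assumed data, not the asker's)
set.seed(1)
age_days <- runif(10000, min = 0, max = 365 * 10)

# One bin per year versus one bin per day, covering the same range
year_breaks <- seq(0, 365 * 10, by = 365)
day_breaks  <- seq(0, 365 * 10, by = 1)

h_year <- hist(age_days, breaks = year_breaks, plot = FALSE)
h_day  <- hist(age_days, breaks = day_breaks,  plot = FALSE)

# Probability of landing in each bin = count in the bin / sample size
p_year <- h_year$counts / length(age_days)
p_day  <- h_day$counts  / length(age_days)

# Ratio of the average per-bin probabilities: yearly bins versus daily bins
ratio <- mean(p_year) / mean(p_day)
print(ratio)
```

The ratio comes out at 365 because each yearly bin collects the probability of 365 daily bins. A density divides each bin's probability by its width, which is why the curve in days sits 365 times lower than the curve in years.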

For the specific sample data you provide, we can see the conversion between the vertical units by looking at the peaks of both functions:

> max(density(df$age)$y) # max of density in days, more horizontal units
[1] 0.0002178977
> df$ageinyears <- df$age/365 # create an age-in-years variable
> max(density(df$ageinyears)$y) # max of density in years, fewer horizontal units
[1] 0.07953267
> max(density(df$age)$y)*365 # rescale the days density by 365 to match
[1] 0.07953267
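The same conversion can be checked on any sample. The sketch below uses simulated ages (an assumption, since df from the question is not reproduced here) and a hypothetical helper area() that applies the trapezoid rule to the output of density(). Both curves should enclose an area close to one, and the peak in days multiplied by 365 should match the peak in years.

```r
# Simulated ages in days (assumed data standing in for df$age)
set.seed(42)
age_days  <- rnorm(500, mean = 365 * 40, sd = 365 * 10)
age_years <- age_days / 365

d_days  <- density(age_days)   # default bandwidth scales with the data
d_years <- density(age_years)

# Hypothetical helper: trapezoid-rule area under a density estimate
area <- function(d) {
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

area_days  <- area(d_days)
area_years <- area(d_years)
peak_ratio <- max(d_days$y) * 365 / max(d_years$y)

print(c(area_days = area_days, area_years = area_years, peak_ratio = peak_ratio))
```

The areas are only approximately one because density() evaluates the estimate on a finite grid that extends a few bandwidths past the data.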

The practical reason this is an issue in plotting (and possibly the main thrust of your question) is that the function estimating the density for ggplot inherits the x aesthetic from the parent aes(), so it knows nothing about the custom x-axis you are using. Rather than just relabeling the x-axis in your first plot, you could explicitly tell geom_density not to use the inherited x values:

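library(ggplot2)
# after_stat(density) replaces the deprecated ..density.. (ggplot2 >= 3.3.0)
ggplot(data = df, aes(x = age)) +
    geom_density(aes(x = age/365, y = after_stat(density)))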

Question 2

The best advice is to just ignore the tick labels on the y-axis. They don't help at all with interpreting the density plot and, as you have seen, are more likely to confuse than to help.

My preference would be for the default behavior of density plots, histograms, and any similar plots to be not labeling the y-axis tick marks, since the labels generally don't mean anything, tend to distract from the important parts of the graph, and often cause confusion. Even when they are scaled to values intended to be meaningful, they are not helpful for the main purpose of the plot and can still cause confusion (I changed the number of bins in my histogram and now my y-tick labels are very different, panic! panic!). Unfortunately, there is so much inertia behind plotting them that I alone am unlikely to get this changed.