Question

Model Selection and Train/Validation/Test Sets - Stanford University | Coursera:

At 10:59~11:10

One final note: I should say that in the machine learning as of this practice today, there aren't many people that will do that early thing that I talked about, and said that, you know...

Is my comprehension correct? The English subtitles on Coursera are sometimes inaccurate. As far as I know, the Chinese subtitle here means the opposite of the English one, so I am not sure whether Andrew Ng said "there aren't" or "there are".

Thanks for reading.


I would like to ask another question.

Diagnosing Bias vs. Variance - Stanford University | Coursera:

At 02:34~02:36, what Andrew Ng says is not very clear, and neither is the English subtitle.

My comprehension is as follows:

If d equals 1,.... to be high training error.

My transcription is incomplete.

Could anyone fill in the missing part?

Thank you...


Solution

No, he actually says the opposite:

One final note: I should say that in the machine learning as of this practice today, there are many people that will do that early thing that I talked about, and said that, you know...

Then he says (the "early thing" he talked about):

selecting your model as a test set and then using the same test set to report the error ... unfortunately many people do that


In this lesson he explains how to separate the data set:

  1. training set to train the model;
  2. cross validation set to find the right parameters;
  3. test set to find the final generalization error (of the function with the best parameter values found using the cross validation set).

So Andrew Ng is complaining that many people use the same data set to find the right parameters, and then report the error on that data set as the final generalization error.
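
To make the distinction concrete, here is a minimal sketch of the three-way split in Python with scikit-learn: the polynomial degree d is chosen on the cross validation set, and the error is reported only once, on the held-out test set. The synthetic data, the candidate degrees 1 to 10, and the 60/20/20 split are my own illustrative assumptions, not details from the lecture.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=300)

# 1. training set, 2. cross validation set, 3. test set (60/20/20 split).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_cv, X_test, y_cv, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# Select the degree d on the cross validation set only.
best_degree, best_cv_error = None, np.inf
for d in range(1, 11):
    model = make_pipeline(PolynomialFeatures(degree=d), LinearRegression())
    model.fit(X_train, y_train)
    cv_error = mean_squared_error(y_cv, model.predict(X_cv))
    if cv_error < best_cv_error:
        best_degree, best_cv_error = d, cv_error

# Report the generalization error on the test set, which was never
# used to fit the model or to choose d.
final_model = make_pipeline(
    PolynomialFeatures(degree=best_degree), LinearRegression())
final_model.fit(X_train, y_train)
test_error = mean_squared_error(y_test, final_model.predict(X_test))
print(f"chosen degree d = {best_degree}, test MSE = {test_error:.4f}")
```

The mistake Andrew Ng describes would be computing `best_cv_error` and the reported error on the same set; here the test set stays untouched until the very last line.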

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange