Question

Hey, I am really new to the field of machine learning and recently started reading the book Machine Learning by Tom Mitchell. I am stuck on a particular section in the first chapter where he talks about estimating training values and adjusting the weights. An explanation of the concept of estimating training values would be great, but I understand that it is not easy to explain all this, so I would be really obliged if someone could point me towards a resource (a lecture video, simple lecture slides, or a text snippet) that covers estimating training values and the like.

Again, I am sorry I cannot provide more information about what I am asking. The relevant sections are 1.2.4.1 and 1.2.4.2 of Machine Learning by Tom Mitchell, in case anyone has read the book and has had the same trouble understanding these sections.

Thanks in advance.


Solution

Ah. Classic textbook. My copy is a bit out of date but it looks like my section 1.2.4 deals with the same topics as yours.

First off, this is an introductory chapter that tries to be general and non-intimidating, but as a result it is also very abstract and a bit vague. At this point I wouldn't worry too much that you didn't understand the concepts; it is more likely that you're overthinking them. Later chapters will flesh out the things that seem unclear now.

"Value" in this context should be understood as a measure of the quality or performance of a certain state or instance, not as "values" in the sense of numbers in general. Using his checkers example, a state with a high value is a board position that is good/advantageous for the computer player.

The main idea here is that if you can assign a value to every possible state that can be encountered, and there is a set of rules that defines which states can be reached from the current state by which actions, then you can make an informed decision about which action to take.
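As a rough sketch (not code from the book; the helpers legal_moves and result, and the value estimate v_hat, are hypothetical stand-ins), choosing an action then just means picking the successor state with the highest estimated value:

    def choose_move(state, v_hat, legal_moves, result):
        """Pick the legal move whose successor state has the highest estimated value."""
        return max(legal_moves(state), key=lambda move: v_hat(result(state, move)))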

But assigning values to states is a trivial task only for the end states of the game. The value attained at an end state is often called the reward, and the goal is of course to maximize it. Estimating training values refers to the process of assigning guessed values to intermediate states based on the results you obtain later in the game.
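For checkers, Mitchell pins the values down only for finished games: +100 for a win, -100 for a loss, and 0 for a draw. In code that might look like this (the state attributes here are hypothetical):

    def terminal_value(state):
        """Exact values are known only for end states (Mitchell uses +/-100 and 0)."""
        if state.is_win:    # hypothetical attribute of the game state
            return 100.0
        if state.is_loss:   # hypothetical attribute of the game state
            return -100.0
        return 0.0          # draw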

So, while playing many, many training games you keep a trace of which states you encounter, and if you find that some state X leads to state Y, you can change your estimated value of X a bit, based on the current estimate for X and the current estimate for Y. This is what estimating training values is all about. Through repeated training the model gains experience and the estimates should converge to reliable values: it will start to avoid moves that lead to defeat and favor moves that lead to victory. There are many different ways of doing such updates, and many different ways to represent the game state, but that is what the rest of the book is about.
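To make that concrete, Mitchell's checkers example represents the value estimate as a linear function of board features and adjusts its weights with the LMS rule, w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i, where the training value of an intermediate state is simply the current estimate for its successor. A minimal Python sketch of just the update (the feature vector x is whatever you extract from the board; this is an illustration, not the book's code):

    def v_hat(weights, x):
        """Linear value estimate: V_hat(b) = w0 + w1*x1 + ... + wn*xn."""
        return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

    def lms_update(weights, x, v_train, eta=0.1):
        """Mitchell's LMS rule: w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i."""
        error = v_train - v_hat(weights, x)
        weights[0] += eta * error          # bias weight (its feature is the constant 1)
        for i, xi in enumerate(x, start=1):
            weights[i] += eta * error * xi
        return weights

After each training game you replay the recorded trace: for every intermediate state the training value is the current estimate of its successor's value, for the final state it is the actual game outcome, and each pair is passed through lms_update.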

I hope this helps!
