The idea of passing the next handler along makes no sense to me. Say I want a chain of:
A - B - C - D
How does C know about D? If it's in the code for C, then any change to the chain is going to be a huge hassle to implement.
The chain needs to either follow a path that already exists, as responders do when each one bubbles a request for help up to its parent (the example in the Gang of Four book), or be constructed explicitly. That is why, at the end of that section, the Gang of Four mention the Composite pattern as a natural companion: a component's parent can act as its successor.
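To make the "construct the chain" option concrete, here is a minimal sketch in which no handler names its successor in its own code. The names Handler, NamedHandler, ChainDemo, setNext and passOn are all made up for illustration, and matching a request by its first letter is just a stand-in for real handling logic.

```java
// Hypothetical base type: each link is handed its successor from outside,
// so no handler's code mentions a concrete successor class.
abstract class Handler {
    private Handler next; // set by whoever assembles the chain

    // Returns the successor so calls can be chained: a.setNext(b).setNext(c)
    Handler setNext(Handler next) {
        this.next = next;
        return next;
    }

    // Forward to the successor, or report that nobody handled the request.
    protected String passOn(String request) {
        return next == null ? "unhandled: " + request : next.handle(request);
    }

    abstract String handle(String request);
}

// One concrete link; it handles requests that start with its own name.
class NamedHandler extends Handler {
    private final String name;

    NamedHandler(String name) {
        this.name = name;
    }

    @Override
    String handle(String request) {
        if (request.startsWith(name)) {
            return name + " handled " + request;
        }
        return passOn(request);
    }
}

class ChainDemo {
    // The only place that knows the order A - B - C - D.
    static Handler buildChain() {
        Handler a = new NamedHandler("A");
        a.setNext(new NamedHandler("B"))
         .setNext(new NamedHandler("C"))
         .setNext(new NamedHandler("D"));
        return a;
    }

    public static void main(String[] args) {
        Handler chain = buildChain();
        System.out.println(chain.handle("D-request"));
        System.out.println(chain.handle("X-request"));
    }
}
```

With this layout, C only ever sees "whatever I was given as next", so reordering or extending the chain means touching buildChain and nothing else.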
Note also that one of the main reasons for using Chain of Responsibility is that the types that might operate on the item are different, which makes a Java interface a perfect fit for implementing it.
To answer your main question: the benefit of using Chain of Responsibility in this case is twofold. 1. You are not creating a god object that knows about everything that could ever happen on the way to the goal (the successful construction of a Policy). 2. You do not have to write a lot of ugly checking code to detect when you have reached the end of the chain, because whichever handler handles the request simply does not call its successor, which prompts the return of the finished item.
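Both points can be sketched roughly as follows. Everything here is hypothetical: the Policy class is only a stand-in for the one in your question, and PolicyHandler, the "auto"/"home" request strings and the concrete handler classes are invented for the example.

```java
import java.util.Optional;

// Hypothetical stand-in for the Policy from the question.
final class Policy {
    final String kind;

    Policy(String kind) {
        this.kind = kind;
    }
}

// The interface is all that a link knows about its successor.
interface PolicyHandler {
    Optional<Policy> build(String request);
}

// Builds "auto" policies; anything else goes to the successor.
final class AutoPolicyHandler implements PolicyHandler {
    private final PolicyHandler successor;

    AutoPolicyHandler(PolicyHandler successor) {
        this.successor = successor;
    }

    @Override
    public Optional<Policy> build(String request) {
        if (request.equals("auto")) {
            // Handled here: the successor is never called, so the
            // finished item returns straight back up the chain.
            return Optional.of(new Policy("auto"));
        }
        return successor.build(request);
    }
}

// Builds "home" policies; a different type behind the same interface.
final class HomePolicyHandler implements PolicyHandler {
    private final PolicyHandler successor;

    HomePolicyHandler(PolicyHandler successor) {
        this.successor = successor;
    }

    @Override
    public Optional<Policy> build(String request) {
        if (request.equals("home")) {
            return Optional.of(new Policy("home"));
        }
        return successor.build(request);
    }
}

// End of the chain: nobody could build a Policy for this request.
final class NoPolicyHandler implements PolicyHandler {
    @Override
    public Optional<Policy> build(String request) {
        return Optional.empty();
    }
}

final class PolicyDemo {
    // Assembly happens in one place; no handler knows the whole chain.
    static PolicyHandler assemble() {
        return new AutoPolicyHandler(new HomePolicyHandler(new NoPolicyHandler()));
    }

    public static void main(String[] args) {
        PolicyHandler chain = assemble();
        System.out.println(chain.build("home").map(p -> p.kind).orElse("none"));
        System.out.println(chain.build("boat").map(p -> p.kind).orElse("none"));
    }
}
```

There is no central object listing every kind of Policy, and no loop checking "are we done yet": the caller makes one call and gets back either the finished item or an empty result.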