First, the best documentation on the subject is Part V of CLR via C#.
Secondly, I would expect Parallel.ForEach to perform better, because it does not just create Tasks, it also groups them. In Jeffrey Richter's book, he explains that tasks started individually are put on the thread pool's global queue, and there is some overhead to locking that queue. To combat this, each thread pool worker thread also has its own local queue, which holds the Tasks created by the Task it is currently running. Work on this local queue can often be picked up without taking a lock!
I would have to read that chapter again (Chapter 27), so I am not sure that Parallel.ForEach works this way, but this is what I would expect it to do.
Locking, he explains, is expensive because a contended lock may have to fall back on a kernel-mode construct.
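To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The class name, method names, and the `Square` helper are made up for illustration, and the sketch says nothing about how either approach is scheduled internally; it only shows that both compute the same result from the same input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TaskVersusParallelForEach
{
    // Hypothetical stand-in for real per-item work.
    private static long Square(int n) => (long)n * n;

    // One Task per item, each started individually from the calling thread.
    public static long SumWithIndividualTasks(IEnumerable<int> items)
    {
        long total = 0;
        var tasks = new List<Task>();
        foreach (int item in items)
        {
            // Interlocked.Add keeps the shared total consistent across threads.
            tasks.Add(Task.Run(() => Interlocked.Add(ref total, Square(item))));
        }
        Task.WaitAll(tasks.ToArray());
        return total;
    }

    // The same work handed to Parallel.ForEach, which decides how to split it up.
    public static long SumWithParallelForEach(IEnumerable<int> items)
    {
        long total = 0;
        Parallel.ForEach(items, item => Interlocked.Add(ref total, Square(item)));
        return total;
    }

    public static void Main()
    {
        var items = Enumerable.Range(1, 100).ToList();
        Console.WriteLine(SumWithIndividualTasks(items)); // 338350
        Console.WriteLine(SumWithParallelForEach(items)); // 338350
    }
}
```

To see which one is actually faster for your workload, wrap each call in a `Stopwatch` and measure; the difference will depend on how much work each item does.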
In either case, do not expect the items to be processed in order. Parallel.ForEach is even less likely to process them in order than a foreach loop that starts the tasks one by one, due to the aforementioned internals.
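If you want to see the ordering for yourself, a small sketch like the following records the order in which items are actually handled. The class and method names are hypothetical, and the printed order will vary between runs and machines, so no particular output should be expected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelOrderDemo
{
    // Processes 0..count-1 with Parallel.ForEach and returns the items
    // in the order in which the loop body handled them.
    public static int[] CompletionOrder(int count)
    {
        var order = new ConcurrentQueue<int>(); // thread-safe, keeps arrival order
        Parallel.ForEach(Enumerable.Range(0, count), i => order.Enqueue(i));
        return order.ToArray();
    }

    public static void Main()
    {
        // Every item appears exactly once, but usually not as 0, 1, 2, ...
        Console.WriteLine(string.Join(", ", CompletionOrder(20)));
    }
}
```

If the results must come back in source order, collect them by index (or use PLINQ's `AsOrdered`) rather than relying on the order of execution.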