You should probably read a bit about how the task scheduler works.
http://msdn.microsoft.com/en-us/library/ff963549.aspx (latter half of the page)
"The .NET thread pool automatically manages the number of worker
threads in the pool. It adds and removes threads according to built-in
heuristics. The .NET thread pool has two main mechanisms for injecting
threads: a starvation-avoidance mechanism that adds worker threads if
it sees no progress being made on queued items and a hill-climbing
heuristic that tries to maximize throughput while using as few threads
as possible.
The goal of starvation avoidance is to prevent deadlock. This kind of
deadlock can occur when a worker thread waits for a synchronization
event that can only be satisfied by a work item that is still pending
in the thread pool's global or local queues. If there were a fixed
number of worker threads, and all of those threads were similarly
blocked, the system would be unable to ever make further progress.
Adding a new worker thread resolves the problem.
A goal of the hill-climbing heuristic is to improve the utilization of
cores when threads are blocked by I/O or other wait conditions that
stall the processor. By default, the managed thread pool has one
worker thread per core. If one of these worker threads becomes
blocked, there's a chance that a core might be underutilized,
depending on the computer's overall workload. The thread injection
logic doesn't distinguish between a thread that's blocked and a thread
that's performing a lengthy, processor-intensive operation. Therefore,
whenever the thread pool's global or local queues contain pending work
items, active work items that take a long time to run (more than a
half second) can trigger the creation of new thread pool worker
threads."
You can mark a task as LongRunning, but this has the side effect of allocating a thread for it from outside the thread pool, which means that the task cannot be inlined.
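As a rough illustration of the LongRunning point, the sketch below starts a task with a given `TaskCreationOptions` value and reports whether it ran on a thread-pool thread. The class and method names (`LongRunningDemo`, `RanOnPoolThread`) are made up for this example, and the behaviour shown is that of the default scheduler in current .NET implementations, not something the API contract guarantees.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class LongRunningDemo
{
    // Starts a trivial work item on the default scheduler with the given
    // options and returns whether it executed on a thread-pool thread.
    public static bool RanOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);

        // Blocks until the task has finished and returns its result.
        return task.Result;
    }
}
```

Calling `LongRunningDemo.RanOnPoolThread(TaskCreationOptions.LongRunning)` should normally return `false`, because the default scheduler gives the task a dedicated thread. With `TaskCreationOptions.None` the answer can vary: reading `Result` may inline a still-queued task onto the calling thread, which is exactly the inlining that a LongRunning task gives up.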
Remember that Parallel.For treats the work it is given as blocks, so even if the work in one loop iteration is fairly small, the overall work done by the task invoked by the loop may appear longer to the scheduler.
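One way to see this chunking for yourself is to count how many worker tasks a loop actually uses. The sketch below relies on the `Parallel.For` overload with thread-local state, whose `localInit` delegate runs once per worker task; the class and method names are hypothetical, and the exact count will differ between machines and runs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class ParallelForChunks
{
    // Runs an empty loop body `iterations` times and returns how many
    // worker tasks Parallel.For used to do so.
    public static int CountWorkerTasks(int iterations)
    {
        int workerTasks = 0;
        long completed = 0;

        Parallel.For(0, iterations,
            // localInit: runs once per worker task.
            () => { Interlocked.Increment(ref workerTasks); return 0L; },
            // body: counts the iterations handled by this worker task.
            (i, state, local) => local + 1,
            // localFinally: folds the per-task count into the total.
            local => Interlocked.Add(ref completed, local));

        if (completed != iterations)
            throw new InvalidOperationException("Iteration count mismatch.");

        return workerTasks;
    }
}
```

For a large iteration count, `CountWorkerTasks` typically returns a number far smaller than `iterations`, which means each task runs many iterations back to back. From the scheduler's point of view that is one long work item, not many short ones.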
Most calls to the GC aren't blocking in and of themselves (it runs on a separate thread), but waiting for a collection to complete does block. Remember also that the GC rearranges memory, so trying to allocate while a collection is running may have side effects, including blocking. I don't have specifics here, but I know the PPL has memory allocation features designed specifically for concurrent memory management for this reason.
Looking at your code's output, it seems that things are running for many seconds, so I'm not surprised that you are seeing thread injection. However, I seem to remember that the default thread pool size is roughly 30 threads (probably depending on the number of cores on your system). A thread takes up roughly 1 MB of memory before your code allocates anything more, so I'm not clear why you would get an out of memory exception here.
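Rather than relying on my memory of the default pool size, you can ask the runtime directly. This is a minimal sketch using `ThreadPool.GetMinThreads` and `ThreadPool.GetMaxThreads`; the wrapper class `ThreadPoolLimits` is made up for the example, and the values it returns depend on the runtime version and the machine.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
public static class ThreadPoolLimits
{
    // Returns the current minimum and maximum number of worker threads
    // configured for the managed thread pool (I/O completion threads ignored).
    public static (int Min, int Max) GetWorkerLimits()
    {
        ThreadPool.GetMinThreads(out int minWorkers, out _);
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        return (minWorkers, maxWorkers);
    }
}
```

Logging `ThreadPoolLimits.GetWorkerLimits()` alongside the process's live thread count while your loop runs should show how close thread injection gets to the configured maximum, and whether thread stacks alone could plausibly account for the memory pressure.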