Can a single process run in parallel on a single thread, without forking itself?
That is up to the compiler. The compiler is free to implement the operations between sequence points any way it wants, as long as the observable behavior stays the same, although most compilers don't parallelize across multiple cores in single-threaded programs.
Example: Compilers for vector processors (like the Cray supercomputers) or for GPUs (graphics cards) unroll and vectorize loops, so that several iterations execute in parallel within one thread.
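As a rough illustration of the kind of loop a compiler may parallelize on its own, here is a minimal sketch in C. The function name `add_arrays` is made up for this example. Whether the loop actually gets vectorized depends on the compiler, the target and the optimization level, so treat this as a candidate, not a guarantee.

```c
#include <stddef.h>

/* Hypothetical example function: element-wise sum of two arrays.
 * Each iteration is independent of the others, and `restrict` promises
 * the compiler that the arrays do not overlap. Together these make it
 * legal to compute several elements per instruction (SIMD) while the
 * program still has only one thread. */
void add_arrays(float *restrict out, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

With GCC, building at `-O3` and adding `-fopt-info-vec` should report whether the loop was vectorized; Clang has `-Rpass=loop-vectorize` for the same purpose. Either way the result is identical to the plain sequential loop, which is exactly why the compiler is allowed to do it.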