Question

I do know the difference between concurrency (CPU process swapping) and parallelism (processes running in real-time parallel on multiple cores). What I wish to know is what role threads and processes play in all of this. I am aware that every OS is different, and that CPU scheduling varies from OS to OS and VM to VM. In general, threads have much less overhead, and CPU swapping is generally quicker for threads, compared to processes. But when I read about multi-process computing, everyone seems to agree that this is the only alternative for computing on multiple CPUs in parallel.

Does this mean that threads are not capable of running in real parallel on multiple CPU cores -- or does it mean that multi-process computing is the only viable option if you need to run calculations on multiple physical CPU chips, such as cluster network supercomputers?

I would appreciate a clarification!


Solution

First, to clarify the terminology that you are using:

  • a process is an entity managed by an operating system, typically the execution of a program;
  • a thread is an entity within a process that executes instructions sequentially.

In this context, a process has state that is maintained by the operating system to record details of registers, memory, permissions, etc. This state is typically larger than that of a thread, and therefore the overhead of managing processes (as you say) is greater. See Wikipedia for more details.

So, to answer your question, threads and processes (as defined above) can be executed in parallel on multiple processors, if the operating system or underlying architecture by which they are executed supports it.

Conventional parallel processors are shared-memory machines, and Linux is a representative conventional operating system. Linux supports the parallel execution of both processes and threads on shared-memory (symmetric) multiprocessors, but it does not support the execution of processes (or threads) across multiple processors unless those processors share memory. There have been a number of distributed operating systems designed to support the execution of processes or threads over multiple processors without shared memory, but these never caught on; see Wikipedia.

Conventional cluster-based systems (such as supercomputers) employ parallel execution between processors using MPI. MPI is a communication interface between processes that execute in operating system instances on different processors; it doesn't support other process operations such as scheduling. (At the risk of complicating things further, because MPI processes are executed by operating systems, a single processor can run multiple MPI processes and/or a single MPI process can also execute multiple threads!)

Finally, a simple (although unconventional) example, where threads and processes have a slightly different meaning, is the XMOS processor architecture. This allows multiple processor chips to be connected together and for multiple threads of sequential execution to execute over and communicate between them, without an operating system.

OTHER TIPS

everyone seems to agree that this is the only alternative for computing on multiple CPUs in parallel.

I have never heard this. In any case it is not true.

Does this mean that threads are not capable of running in real parallel on multiple CPU cores

The thread is the unit of scheduling in most OSes. Processes are not scheduling units; at most, they come into play as inputs to scheduling heuristics. Threads run on CPUs (in parallel), not processes.
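One observable consequence: because each thread is dispatched independently, blocking one thread does not block the process's other threads. A small Python sketch (timings are approximate):

```python
import threading
import time

def napper():
    time.sleep(0.5)  # blocks only this thread; the kernel keeps scheduling the rest

start = time.monotonic()
threads = [threading.Thread(target=napper) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# ~0.5 s rather than ~2.0 s: the four sleeps overlap because the
# scheduler dispatches threads, not whole processes.
print(elapsed < 1.8)  # True
```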

or does it mean that multi-process computing is the only viable option if you need to run calculations on multiple physical CPU chips, such as cluster network supercomputers?

No. Processes do not enhance the scheduling capabilities of the OS.

The question was not asked very precisely, but I hope this clarifies the important points.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow