Threads share a single address space; processes each have their own. If your executors frequently change the memory map itself (for example, by mapping and unmapping regions), a multi-threaded approach can be slower than a multi-process approach because the threads contend for the locks that internally protect that shared map.
Threads also share file descriptors. If you have a case where files are frequently opened and closed, threads could wind up blocking each other for access to the process file descriptor table. A multi-process approach won't have this issue.
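To make the sharing concrete, here is a minimal sketch, assuming a POSIX system with pthreads and a writable /dev/null. The function names are hypothetical and chosen only for illustration. It shows that a descriptor opened by one thread is immediately usable from another, which is exactly why the descriptor table has to be protected against concurrent updates.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Worker thread: opens a file and hands the descriptor number back. */
static void *open_in_thread(void *arg)
{
    int *fd_out = arg;
    assert(fd_out != NULL);
    *fd_out = open("/dev/null", O_WRONLY);
    return NULL;
}

/* Returns 1 if a descriptor opened by another thread is usable here. */
int fd_is_shared_between_threads(void)
{
    pthread_t worker;
    int fd = -1;

    if (pthread_create(&worker, NULL, open_in_thread, &fd) != 0)
        return 0;
    if (pthread_join(worker, NULL) != 0)
        return 0;
    if (fd < 0)
        return 0;

    /* The calling thread never opened fd, yet it can write to it,
       because both threads use the same descriptor table. */
    ssize_t written = write(fd, "x", 1);
    close(fd);
    return written == 1;
}
```

Call fd_is_shared_between_threads() from main and build with -pthread. A descriptor opened in a child process after fork would not be visible to the parent in this way.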
There can also be internal synchronization overhead in library functions. In a single-threaded process, locks that protect process-level structures can be no-ops. In a multi-threaded process, these locks may require expensive atomic operations.
Lastly, multi-threaded processes may require frequent access to thread-local storage to implement things like errno. On some platforms, these accesses can be expensive and can be avoided in a single-threaded process.
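As a rough illustration of why errno lives in thread-local storage, the sketch below (again assuming POSIX threads, with hypothetical function names) records the address of errno as seen from a worker thread and compares it with the address seen by the caller. The two differ, so every use of errno has to locate the current thread's copy first.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Worker thread: records where its own errno lives. */
static void *record_errno_address(void *arg)
{
    uintptr_t *out = arg;
    assert(out != NULL);
    *out = (uintptr_t)&errno;
    return NULL;
}

/* Returns 1 if the worker's errno is a different object from ours. */
int errno_is_per_thread(void)
{
    pthread_t worker;
    uintptr_t worker_addr = 0;

    if (pthread_create(&worker, NULL, record_errno_address, &worker_addr) != 0)
        return 0;
    if (pthread_join(worker, NULL) != 0)
        return 0;

    /* The calling thread was alive while the worker took the address,
       so the two errno objects cannot have shared storage. */
    return worker_addr != 0 && worker_addr != (uintptr_t)&errno;
}
```

This only shows that the storage is per-thread. How much each access costs depends on the platform's thread-local storage implementation and is not measured here.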