Question

Background:

pthreads follow pre-emptive scheduling, whereas C++ fibers follow cooperative scheduling.

With pthreads: the current execution path may be interrupted or preempted at any time. This means that data integrity is a big issue for threads, because one thread may be stopped in the middle of updating a chunk of data, leaving it in a bad or incomplete state. It also means that the operating system can take advantage of multiple CPUs and CPU cores by running more than one thread at the same time, leaving it up to the developer to guard data access.

Using C,

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                          void *(*start_routine) (void *), void *arg);

Using threads, an application may exhibit concurrency.

Properties of Concurrency:

1) Multiple actors

2) Shared resource

3) Rules for access (atomic operations / condition synchronization)


With C++ fibers: the current execution path is only interrupted when a fiber yields execution. This means that fibers always start and stop in well-defined places, so data integrity is much less of an issue. Also, because fibers are usually managed in user space, expensive context switches and CPU state changes need not be made, making switching from one fiber to the next extremely efficient. On the other hand, since no two fibers can run at exactly the same time, using fibers alone will not take advantage of multiple CPUs or multiple CPU cores.

In Win32, a fiber is a sort of user-managed thread. A fiber has its own stack, its own instruction pointer, and so on, but fibers are not scheduled by the OS: you have to call SwitchToFiber explicitly. Threads, by contrast, are pre-emptively scheduled by the operating system.

So, roughly speaking, a fiber is a thread that is managed at the application/runtime level rather than being a true OS thread.

Using C,

// Fiber entry point: it only runs when another fiber switches to this fiber.
void __stdcall MyScheduler(void *param){
  ....
}

// CreateFiber returns an LPVOID fiber handle; the creating thread must call
// ConvertThreadToFiber before it can SwitchToFiber to the new fiber.
LPVOID FiberScheduler = CreateFiber(0, MyScheduler, NULL);

Why C++ fibers?

OS threads give us everything we want, but for a heavy performance penalty: switching between threads involves jumping back and forth from user to kernel mode, possibly even across address space boundaries. These are expensive operations partly because they cause TLB flushes, cache misses and CPU pipelining havoc: that’s also why traps and syscalls can be orders of magnitude slower than regular procedure calls.

In addition, the kernel schedules threads (i.e. assigns their continuation to a CPU core) using a general-purpose scheduling algorithm, which might take into account all kinds of threads, from those serving a single transaction to those playing an entire video.

Fibers, because they are scheduled at the application layer, can use a scheduler that is more appropriate for their use case. As most fibers are used to serve transactions, they are usually active for very short periods of time and block very often. Their behavior is often to be awakened by IO or by another fiber, run a short processing cycle, and then transfer control to another fiber (using a queue or another synchronization mechanism). Such behavior is best served by a scheduler employing an algorithm called “work-stealing”; when fibers behave this way, work-stealing ensures minimal cache misses when switching between fibers.


Fibers alone do not exploit the power of multiple cores, because as far as the OS knows it is running a single-threaded process.

In Go, we start goroutines using the go keyword:

package main

func f() { println("in f") }

func main() {
  go f() // f runs as a new goroutine
  f()    // note: main may exit before the goroutine gets to run
}

Question:

1) Is a goroutine (f) a fiber that is non-preemptively scheduled by the Go runtime, in user space?

2) If yes, do concurrency issues arise in the Go environment?

3) Does Go provide an API for OS-level threads?


Solution

Question #1: Not really

Goroutines are a bit weird. They are somewhat similar to fibers, but also somewhat similar to threads.

  • They might be preempted.
  • They might be concurrent.
  • They might share resources.
  • They often block on a queue (channel).
  • They have their own stacks.
  • They are not directly scheduled by the OS, but by the golang runtime.

The Go runtime starts up to GOMAXPROCS threads by default and schedules your goroutines among them. On any given thread, a goroutine will run until it completes or blocks on a channel.

That means you can think of goroutines as fibers shared among threads.

This means that you can think of goroutines that don't access global state the way you would think of fibers. But for goroutines that do access global state, you need to treat them like threads.
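To make the scheduling side of this concrete, the standard runtime package can report those numbers; here is a minimal sketch:

package main

import (
  "fmt"
  "runtime"
)

func main() {
  // GOMAXPROCS(0) queries the current setting without changing it.
  fmt.Println("CPU cores:      ", runtime.NumCPU())
  fmt.Println("GOMAXPROCS:     ", runtime.GOMAXPROCS(0))
  fmt.Println("live goroutines:", runtime.NumGoroutine())
}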

Question #2: Yes

You need to be mindful when accessing global state!
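As a minimal sketch of why (the shared counter below is purely illustrative): several goroutines incrementing a global variable without synchronization is a data race, and a sync.Mutex is one way to guard it.

package main

import (
  "fmt"
  "sync"
)

var (
  mu      sync.Mutex
  counter int // shared global state
)

func main() {
  var wg sync.WaitGroup
  for i := 0; i < 1000; i++ {
    wg.Add(1)
    go func() {
      defer wg.Done()
      mu.Lock() // without the lock these increments would race
      counter++
      mu.Unlock()
    }()
  }
  wg.Wait()
  fmt.Println(counter) // 1000, because access was synchronized
}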

However, the default communication mechanism, channels, synchronizes access to shared resources, which eases concurrent programming in Go by a lot.
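For contrast, here is a sketch of the same counter owned by a single goroutine, with the other goroutines communicating over a channel instead of touching shared memory (the channel names are illustrative):

package main

import (
  "fmt"
  "sync"
)

func main() {
  inc := make(chan struct{})
  done := make(chan int)

  // One goroutine owns the counter; everyone else just sends on a channel.
  go func() {
    counter := 0
    for range inc {
      counter++
    }
    done <- counter
  }()

  var wg sync.WaitGroup
  for i := 0; i < 10; i++ {
    wg.Add(1)
    go func() {
      defer wg.Done()
      for j := 0; j < 100; j++ {
        inc <- struct{}{}
      }
    }()
  }
  wg.Wait()
  close(inc)
  fmt.Println(<-done) // 1000
}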

Question #3: Not in the standard library

If you really want to, you could start threads by writing a C library for Go that gives you access to the underlying OS thread functions (like pthread_create).
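For illustration only, here is a rough cgo sketch of that idea (the worker and start_thread names are made up, and this assumes cgo and a C toolchain are available):

package main

/*
#cgo LDFLAGS: -pthread
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) {
  printf("hello from a raw pthread\n");
  return NULL;
}

// Starts one OS thread directly, bypassing the Go scheduler entirely.
static void start_thread(void) {
  pthread_t t;
  pthread_create(&t, NULL, worker, NULL);
  pthread_join(t, NULL);
}
*/
import "C"

func main() {
  C.start_thread() // the Go runtime knows nothing about this thread
}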

However, I strongly doubt you could use goroutines and channels on threads created in this way, as the golang runtime scheduler has no knowledge of them.

This might also cause problems with calls to other libraries (like the standard library!) that assume access to goroutines and channels.

In conclusion, I don't think directly creating and managing threads in Go is a good idea.

Other tips

Go's goroutines are a compiler facility. Conceptually, goroutines and fibers are both cooperative multitasking methods with respect to their environment. Fibers are an OS-level concept, whereas goroutines are a compiler-level concept.

Goroutines may match one specification of a fiber (there seem to be many), but not necessarily a fiber in the strictest sense. In fact, goroutines may be spread across multiple threads, as decided by the scheduler.

  1. For the explanation of a fiber you provided: yes, goroutines could be called an implementation of that fiber spec.

  2. You should ideally be thinking in terms of CSPs rather than concurrency.

Communicating Sequential Processes (CSP) is Go's model of concurrency, and goroutines enable it. Isolating the goroutine component and thinking about concurrency on its own is a bit off from Go's perspective.

You are encouraged to use channels to maintain the flow of data and not rely on any other syncing mechanism (as Pike says, do not communicate by sharing memory; share memory by communicating). You can still use mutexes if you wish, but you still cannot create threads on your own.
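As a small sketch of that style (the generate helper is illustrative), a producer goroutine hands values to a consumer over a channel and signals completion by closing it, with no locks involved:

package main

import "fmt"

// generate sends the integers [0, n) on its own channel and closes it,
// so the consumer learns about completion by communicating rather than
// by inspecting shared state.
func generate(n int) <-chan int {
  out := make(chan int)
  go func() {
    for i := 0; i < n; i++ {
      out <- i
    }
    close(out)
  }()
  return out
}

func main() {
  for v := range generate(5) {
    fmt.Println(v * v) // consume each value as it arrives
  }
}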

In fact, Go's approach to concurrency is closer to Erlang's than to that of other, more traditional languages and their implementations.

  3. For the same reason as above: Go doesn't let you create threads directly (you can spawn multiple processes, though). You are encouraged to use only goroutines with channels, to avoid the confusion of mixing multiple multitasking approaches.

Go uses as many OS threads as set by the GOMAXPROCS environment variable (which, as of Go 1.5, defaults to the number of CPU cores).

Here is a great discussion on it

Licensed under: CC-BY-SA with attribution