Concurrency in Go is built around two concepts: goroutines and channels (note that the code examples below are roughly taken from these links). A goroutine is a fairly simple concept: it simply allows you to spawn a function that runs concurrently. For example, the statement go list.Sort() spawns a goroutine that sorts the list. The nice thing about this is that Go abstracts away the details of creating and managing threads while still efficiently utilizing the CPU. It is able to identify when a goroutine is blocking on I/O, for example, and ensure that the OS thread that is executing that goroutine is freed up to execute a different one. A world in which functions run concurrently and independently is great, but the interesting stuff comes when they need to communicate, which leads us to channels. It turns out that they are not very complex either; a channel is more or less just a blocking queue that multiple goroutines can take from and put into. It allows them to synchronize at certain points and send messages to each other. For example, if you wanted to wait for a goroutine to finish (e.g. the equivalent of joining a thread in Java), you would do something like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
c := make(chan int) | |
go func() { | |
list.Sort() | |
c <- 1 | |
}() | |
<-c |
In the above code snippet, the c <- 1 line sends the value 1 (arbitrary) to the channel, while the <-c line blocks on receiving a value from the channel (which is ignored). By default channels have no buffer (imagine a blocking queue of size 0) so sends and receives wait for each other, but they can be given a capacity as we will see next.
Let's look at a slightly more involved example with a channel being used as a semaphore. In the code below we simulate a system that is processing requests from a queue; we would like to process requests in parallel, but limit the maximum number of simultaneous requests being processed (i.e. a typical semaphore use case).
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
var sem = make(chan int, MaxOutstanding) | |
func init() { | |
for i := 0; i < MaxOutstanding; i++ { | |
sem <- 1 | |
} | |
} | |
func Serve(queue chan *Request) { | |
for req := range queue { | |
<-sem | |
go func() { | |
process(req) | |
sem <- 1 | |
}() | |
} | |
} |
The number of elements in the channel indicates how many permits the semaphore has available, so getting a permit is receiving from the channel and releasing a permit is sending to the channel. We spawn a goroutine for each request in the queue (which is itself a channel) that processes the request and releases a permit.
The more interesting thing about this example, however, is that it contains an insidious bug. You might expect the closure that the goroutine executes to capture the current request that is being dispatched, but unfortunately it doesn't. In this case, the language closes over the reference to req rather than the value at the time of dispatch, which means that you get undefined behavior over which requests are actually processed. Each goroutine will process whatever request the req variable points to when it executes instead of when it was created. It is unfortunate that such a common pattern has a pitfall like this, and the pain is evidenced by numerous mentions of the problem (e.g. here and here) that suggest rather ugly solutions like defining another variable in the loop body or passing the value explicitly as a parameter to the goroutine.
As far as simple and familiar models of concurrent programming go, CSP based on goroutines and channels is not bad. The danger really lies in the fact that Go has a memory model similar to that of C++ (i.e. pointers and references) which can lead to bad code with the use of channels. The article explains it as, "Go enables simple, safe concurrent programming but does not forbid bad programming." This feels accurate to me and may be a practical choice for the language, but given our understanding of how easy it is to "program badly," one might wish that Go took a more progressive approach in addressing the problems around concurrency.
No comments:
Post a Comment