I've been giving a lot of thought to how we tackle concurrency in Fan. After I finish porting the compiler to Fan, I believe concurrency is the next major feature to design. Concurrency is going to be a major opportunity for Fan to leapfrog mainstream programming languages to boost productivity and to handle CPUs going multi-core. This post is a bit of out loud thinking to capture my current thoughts and get you guys thinking about the problem.
I've been mulling over three basic approaches:
Low-level locks like the Java synchronized keyword
Software Transactional Memory
Message queues
Both Java and C# stick with basic low-level primitives, which require manual locking of shared resources. This mechanism is completely out of the question in my opinion. Moving from pointers to GCed references was a huge stride in boosting developer productivity. Today the number one problem is race conditions and deadlocks - using low-level locks is just too damn hard to get right. My goal is to eliminate this class of errors in Fan applications, so that we aren't stuck wasting time trying to reproduce race conditions.
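To make the manual-locking burden concrete, here is a minimal Java sketch (the Counter class and counts are illustrative, not Fan code). Every method that touches the shared field must remember to declare synchronized; forget it on even one method and the race quietly comes back - which is exactly the class of bug that is so painful to reproduce.

```java
// Minimal sketch: correctness depends on the developer remembering
// to mark EVERY access to shared state as synchronized.
class Counter {
    private int count = 0;
    public synchronized void increment() { count++; }  // drop "synchronized" here
    public synchronized int get() { return count; }    // and updates get lost
}

public class LockDemo {
    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Correct only because both methods are locked; nothing in the
        // language forces that discipline.
        System.out.println(c.get());
    }
}
```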
Perhaps the most novel approach is software transactional memory. This is basically using database-like transactions for memory-based objects. It can be very efficient because it can use optimistic locking. However I'm having a hard time grokking it. Implementation is complicated, potentially requiring a version counter on all mutable object instances. Furthermore it still requires the developer to manually write code inside a transaction - if the developer forgets then we can still end up with race conditions.
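The optimistic idea behind STM can be sketched in a few lines of Java (TxBox and atomicUpdate are hypothetical names, not a real STM API): each update is computed against a snapshot and committed only if no other thread has committed in the meantime, otherwise it retries - which is also why the developer must remember to route every write through the transaction.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch of an optimistically-updated cell. A full STM
// would track versions per object and group reads/writes into one
// transaction; this shows only the read-compute-commit-retry loop.
class TxBox<T> {
    private final AtomicReference<T> ref;
    TxBox(T initial) { ref = new AtomicReference<>(initial); }
    T read() { return ref.get(); }
    // Re-read, recompute, and commit only if nobody else committed first.
    T atomicUpdate(UnaryOperator<T> fn) {
        while (true) {
            T old = ref.get();
            T next = fn.apply(old);
            if (ref.compareAndSet(old, next)) return next;
        }
    }
}
```

Note the failure mode the post worries about: if a thread writes via some path that bypasses atomicUpdate, the transaction machinery never sees it and the race returns.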
So I've kind of settled on message queues as the optimal solution - Erlang is the prototypical example. In this approach the only way for threads to share data is by posting messages to queues used to communicate asynchronously. Messages would be either immutable or serialized (could just be a deep copy for in-process). This design has a lot of strengths. First, it is pretty much impossible to mess up because two threads can never even have a reference to the same object - effectively you have a logical heap per thread. Another huge advantage is that it becomes trivial to transparently move threads onto different machines. It's also an in-your-face design - developers are forced to deal with concurrency, as opposed to hoping the developer thought about re-entrant code with proper synchronization (something I'm definitely guilty of not doing).
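The queue model above can be sketched with Java's standard BlockingQueue (the Msg record and POISON shutdown sentinel are illustrative choices, not part of any Fan design): the worker thread never sees the producer's mutable state, only immutable messages taken off its inbox.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the message-queue model: threads exchange immutable
// messages through a queue and never share a mutable reference.
public class QueueDemo {
    // Immutable message: the field is final, so it is safe to hand off.
    static final class Msg {
        final String text;
        Msg(String text) { this.text = text; }
    }
    static final Msg POISON = new Msg(null);  // shutdown sentinel

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Msg> inbox = new ArrayBlockingQueue<>(16);
        Thread worker = new Thread(() -> {
            try {
                for (Msg m = inbox.take(); m != POISON; m = inbox.take())
                    System.out.println("got: " + m.text);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        worker.start();
        inbox.put(new Msg("hello"));  // asynchronous post; no locks in user code
        inbox.put(POISON);            // tell the worker to shut down
        worker.join();
    }
}
```

Because the worker only ever communicates through its inbox, replacing the in-process queue with a network transport (and deep-copying or serializing messages) changes nothing about the programming model - which is what makes moving threads onto other machines plausible.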
brian Sat 19 Aug 2006