2.13.1 Falcon multithreading model.

Foreword

This document is by no means an exhaustive explanation of multithreading in general. Concepts such as "mutex", "synchronization primitive", "thread", and the like are taken for granted. The reader should already know the basics of multithreading, as this document deals only with the specifics of the Falcon approach to it.

Basic Principles

Falcon multithreading is aimed at maximizing the efficiency of the VM running in a multithreading context, and of the execution of scripts in separate threads.

The principles of efficient and robust multithreading can be summarized in two points:

Threads must run unhindered and free from synchronization with the rest of the application for the vast majority of their life. Data exchange with other threads must happen rarely, and it must take a fraction of the time needed to process the data.
Synchronization must happen at the topmost layers of the logic controlling threads, as it is a critical operation in their life, which must be given maximum care and control. Burying synchronization down in the lower layers of code, or worse, hiding its presence through class encapsulation, is to be avoided.

While the real world is not perfect and there can be exceptions to these rules, using these two simple principles as a guide makes it possible to write programs that fully exploit the parallel computing facilities of modern computers while avoiding the errors specific to multithreaded programming, such as races, deadlocks and the like.
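The two principles can be illustrated with a minimal sketch. It is written in Python rather than Falcon, and the names `agent` and `sum_of_squares` are hypothetical; it is only meant to show threads working on local data and exchanging a single result, with the synchronization visible in the top-level function.

```python
import queue
import threading


def agent(chunk, results):
    # All the work happens on data local to this thread: no locks are needed.
    total = 0
    for value in chunk:
        total += value * value
    # One brief exchange with the rest of the application, at the very end.
    results.put(total)


def sum_of_squares(values, workers=4):
    # Synchronization lives here, in the topmost layer, in plain sight.
    results = queue.Queue()
    chunks = [values[i::workers] for i in range(workers)]
    threads = [
        threading.Thread(target=agent, args=(chunk, results)) for chunk in chunks
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Every agent has put exactly one result in the queue by now.
    return sum(results.get() for _ in threads)
```

Each thread touches the shared queue exactly once, so the time spent synchronizing is negligible compared with the time spent computing.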

Scripting languages such as Falcon perform many operations in the background that are outside the script writer's control, which makes them difficult terrain for multithreading. A rule that can be considered a corollary to the two main principles, "mutexes must be held for the shortest possible time", is hard to respect when the simplest instruction in a scripting language can translate into hundreds of operations at machine code level.

For this reason, Falcon threading enforces the above principles with a "pure agent-based threading model".

Each Falcon thread is an agent, bound to perform non-trivial and long-lasting operations, which can exchange data with the other agents in the application through objects called "synchronization structures". Structures are relatively complex objects that allow safe communication and interaction between threads.

Each agent has its own application space where it is free to perform operations unhindered by any other thread or by the Falcon engine. Exchange of data with the rest of the application can happen only through synchronization structures. Some of these structures are quite straightforward to use; for example, it is possible to share plain memory which each thread can manipulate directly as it prefers.

Multithreading implementation

The Falcon threading module provides each agent with a new Virtual Machine created on the spot. Those VMs are created "empty", that is, they will contain only the modules that were linked by the VM that started the thread as it was right before starting its execution.

VM-related operations, such as setting garbage collection properties, termination requests, sleep requests, exceptions, memory pools and garbage collection loops, are all local to a given agent.

Exchange of Falcon items, such as objects, strings, vectors and so on, is performed through serialization in memory. As each item lives in a VM, it must stay consistent with the VM that created it. Giving other agents the ability to change an agent's data would require deep-level synchronization at very frequent points, and would rapidly make a multithreaded script application run almost one thread at a time instead of being fully parallel.
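Exchange through in-memory serialization can be sketched as follows. This is Python, not Falcon, and `send`, `receive` and `agent` are hypothetical helpers: the point is only that the receiving thread works on its own copy, so changes it makes never reach the sender's item.

```python
import pickle
import queue
import threading


def send(channel, item):
    # The item is flattened to bytes; no live reference crosses the boundary.
    channel.put(pickle.dumps(item))


def receive(channel):
    # The receiver rebuilds a private copy from the serialized bytes.
    return pickle.loads(channel.get())


def agent(inbox, outbox):
    item = receive(inbox)
    item["values"].append(99)  # changes only this agent's private copy
    send(outbox, item)


def demo():
    original = {"name": "config", "values": [1, 2, 3]}
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=agent, args=(inbox, outbox))
    worker.start()
    send(inbox, original)
    reply = receive(outbox)
    worker.join()
    return original, reply
```

Because nothing but bytes is exchanged, neither side needs a lock to read or modify its own item.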

However, the fact that items need to be serialized to be shared among agents doesn't necessarily mean that they can't share memory. Falcon items are shells, representations of inner data provided by system-level code. While the items that carry this data must be copied between agents, the inner core which interacts with the system can simply be shared, with proper synchronization provided at system level.

For example, it is possible to share Falcon streams between agents. Each agent will have its own copy of the stream object, with a local view of stream data (e.g. the number of bytes written in the last operation, or the position in a file), but the underlying system resource will be shared and concurrency will be regulated by system calls.
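The idea of separate handles over one shared system resource can be sketched with duplicated file descriptors. The example is in Python and is not Falcon's stream implementation; the `agent` and `demo` names are hypothetical, and append mode is an assumption chosen here so that the operating system positions every write.

```python
import os
import tempfile
import threading


def agent(descriptor, name, lines):
    # Each agent writes through its own descriptor; with O_APPEND the
    # operating system places every write at the current end of the file.
    for number in range(lines):
        os.write(descriptor, f"{name} {number}\n".encode())
    os.close(descriptor)  # closes this agent's descriptor only


def demo(lines=20):
    handle, path = tempfile.mkstemp()
    os.close(handle)
    try:
        first = os.open(path, os.O_WRONLY | os.O_APPEND)
        second = os.dup(first)  # second descriptor, same underlying open file
        threads = [
            threading.Thread(target=agent, args=(first, "agent-1", lines)),
            threading.Thread(target=agent, args=(second, "agent-2", lines)),
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        with open(path, "rb") as stream:
            return stream.read().decode().splitlines()
    finally:
        os.remove(path)
```

The order in which the two agents' lines appear is not fixed, but no script-level lock is involved: arbitration is left to the system calls.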

Embedding applications or other modules that want to work in a multithreading context can adopt the same strategy. Falcon gives embedding applications and modules the ability to carry their data in a Falcon object, and to receive relevant callbacks when a script needs to interact with that data. When an object is cloned for serialization, the embedding application is notified, and it may take proper action to prepare the data to be shared among threads. This may be as simple as MT-safe reference counting, or it may require more sophisticated operations; for example, sharing a Falcon Stream requires the inner layer of code to issue a dup() system call, asking the operating system to create a duplicate file resource.
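A clone notification combined with MT-safe reference counting might look like the sketch below. It uses Python's `__deepcopy__` hook as a stand-in for the callback Falcon delivers to the embedding application; `SharedCore` and `Carrier` are hypothetical names and do not correspond to Falcon classes.

```python
import copy
import threading


class SharedCore:
    """Hypothetical inner data owned by the embedding application."""

    def __init__(self, payload):
        self.payload = payload
        self._lock = threading.Lock()
        self._refs = 1

    def incref(self):
        with self._lock:
            self._refs += 1

    def decref(self):
        with self._lock:
            self._refs -= 1
            return self._refs

    def refs(self):
        with self._lock:
            return self._refs


class Carrier:
    """Hypothetical script-level shell that carries a SharedCore."""

    def __init__(self, core, label):
        self.core = core    # shared between clones
        self.label = label  # per-agent state, copied on clone

    def __deepcopy__(self, memo):
        # Clone notification: prepare the inner data for one more owner.
        self.core.incref()
        return Carrier(self.core, copy.deepcopy(self.label, memo))

    def close(self):
        # Drop this shell's reference and report how many owners remain.
        remaining = self.core.decref()
        self.core = None
        return remaining
```

The shell is copied, while the core is shared and counts its owners under a lock, so the last owner to call `close()` knows it can dispose of the resource.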

Once shared between threads in this way, application or module data must ensure its own proper synchronization. That is, property accessors and methods must be prepared to be called by different threads concurrently.

In this sense, it may be said that Falcon doesn't provide a "memory model", but allows each object to provide its own. While this may be thought of as "confusing" for script writers, once the overall rules of the system (nothing is shared but...) plus the specific rules of the shared objects actually used by the script (... except this thing, when you do so) are known and followed, the overall complexity of an MT application built following this approach is by no means higher than the complexity of an MT application built on a layer with a consistent and unique memory model.

The overall complexity of an MT application depends on the data flow, and primarily on the synchronization logic and on the interactions between threads. There is nothing preventing an application with local, object-specific memory models from being less complex than one with a built-in memory model. The constraints given by each synchronization structure, which may have different visibility and sharing rules, ensure that a simple set of rules is valid locally, while the rest of the program is simply "safe and local". This actually works towards the simplification and legibility of MT code.

This may require deep-level synchronization, which seems to contradict the principles stated at the beginning of this section; but as long as this synchronization is kept minimal (e.g. just to ensure visibility of shared properties), or as long as the synchronization rules are available, known and controllable by the topmost level, the overall agent-based model is not broken.

Synchronization structures

The Falcon agent-based model relies on the concept of non-primitive structures used to synchronize and coordinate threads. Each structure has a different working principle, which is explained in its detailed description, but they all share the concepts of wait, acquisition and release.

An agent can wait for one or more structures to be ready for acquisition. When this happens, the agent is notified and it atomically acquires one of the structures it was waiting for.

Once it has performed the work for which the structure was needed, the agent must release the structure, and is then free to wait again or terminate.
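The wait, acquire and release cycle can be sketched in Python as follows. This is not the Falcon API: `Token`, `wait_any` and the single shared monitor are hypothetical simplifications, used only to show an agent waiting on several structures and atomically acquiring one of them.

```python
import threading

# One monitor shared by every structure: a deliberate simplification.
_monitor = threading.Condition()


class Token:
    """Hypothetical exclusive structure: one holder at a time."""

    def __init__(self, name):
        self.name = name
        self.held = False

    def try_acquire(self):
        # Must be called with _monitor held.
        if self.held:
            return False
        self.held = True
        return True

    def release(self):
        with _monitor:
            self.held = False
            _monitor.notify_all()  # wake agents waiting on any structure


def wait_any(*structures):
    """Block until one structure is acquirable; return it, already acquired."""
    with _monitor:
        while True:
            for structure in structures:
                if structure.try_acquire():
                    return structure
            _monitor.wait()


def agent(tokens, log):
    acquired = wait_any(*tokens)
    try:
        log.append(acquired.name)  # the work the structure was needed for
    finally:
        acquired.release()


def demo(agent_count=5):
    tokens = (Token("a"), Token("b"))
    log = []
    threads = [
        threading.Thread(target=agent, args=(tokens, log))
        for _ in range(agent_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return tokens, log
```

Checking and acquiring happen under the same monitor, so two agents can never acquire the same token at once, and a release always wakes the agents that are still waiting.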

Acquisition and release are not necessarily bound to exclusive access. There are structures that may be acquired by many threads at once, and others that can only be acquired (their release is a no-op). The concept of acquisition is rather a semantic equivalent of "allowance" or "permission" to perform certain operations bound to the structure.
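Two non-exclusive flavors of acquisition can be sketched with Python's standard primitives. `Permit` and `Signal` are hypothetical names and are not Falcon structures; the first can be held by several agents at once, the second can only be acquired.

```python
import threading


class Permit:
    """Hypothetical structure that up to `slots` agents may hold at once."""

    def __init__(self, slots):
        self._semaphore = threading.BoundedSemaphore(slots)

    def try_acquire(self):
        # Non-blocking: True if a slot was free and is now taken.
        return self._semaphore.acquire(blocking=False)

    def acquire(self):
        self._semaphore.acquire()

    def release(self):
        self._semaphore.release()


class Signal:
    """Hypothetical structure that can only be acquired."""

    def __init__(self):
        self._event = threading.Event()

    def set(self):
        self._event.set()

    def acquire(self, timeout=None):
        # Returns True once the signal has been set, False on timeout.
        return self._event.wait(timeout)

    def release(self):
        # Nothing to give back: acquisition only grants permission to go on.
        pass
```

In both cases acquiring grants permission to proceed rather than exclusive ownership: a `Permit` with two slots admits two holders, and a `Signal` that has been set can be acquired by any number of agents.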

More details are explained in the description of each structure.