
Possible Syntax for C++ Threads

Introduction

At the time of this writing, C++ has no built-in support for threads. Many library-based approaches exist to fill this gap, ranging from low-level APIs (like pthreads) to higher-level library solutions (like Boost.Threads).

This article describes what could be a possible built-in support for threads in C++. This approach is not based on any library, but at the same time preserves "the C++ culture" in terms of the level of integration between thread-related constructs and the rest of the language.

This article does not discuss the necessary memory model and is not a formal language extension proposal.

What is the problem

The problem is that library-based solutions impose either the "thread is a function" or the "thread is an object" point of view. While such approaches have their merits, the threads themselves should be decoupled from any such programming paradigms.

In other words, if we consider the concurrent structure to be in some way a generalization of the sequential structure, then the language constructs that support concurrency should extend the set of constructs used for sequential code. Threads should be perceived only on the basis of their dynamic ability to perform some action. This is why threads should be handled at the level of program control statements, just like loops or branches. In this sense, threads cannot have good and satisfying support at the library level without fundamental support from the language instruction set.

Basics - control statements

The following is a control statement:

threaded
{
    // A
    // ...
}
other
{
    // B
    // ...
}
// with possibly more branches:
other
{
    // N
    // ...
}

// X
// ...

The semantics of the above is that:

- instructions A, B, ..., N are executed, possibly in parallel, each branch conceptually in its own thread;
- instructions X are executed only after all branches have completed;
- the threaded statement may have an arbitrary number of branches.

Note: A single-branch threaded statement does not introduce any concurrency.

The following is another control statement:

async threaded
{
    // A
    // ...
}

// X
// ...

and it has the following semantics: instructions A are executed, and the instructions X that follow are executed without waiting for A to complete - the spawning thread continues immediately.

Alternatives:

Example 1:

void fun()
{
    threaded
    {
        play("silentnight.mp3");
    }
    other
    {
        cout << "Merry Christmas!" << endl;
    }
}

Above, the "silent night" will be played and "Merry Christmas" will be displayed, possibly in parallel, where "possibly" is related to the actual ability of the underlying platform to provide true concurrency. The fun() function will return only when both of these actions are done.

The above example is not really concerned with whether a separate thread is physically created each time this code is executed - it might as well reuse some thread that was created previously and has already finished its work; this should not be a concern for the programmer. Moreover, if the instructions in all branches are relatively simple, the threaded statement may even map onto explicit parallel execution on a multicore CPU, if that is the target, without using anything like system-level threads at all. This shows that the threaded statement deals with concurrency at a purely conceptual level, decoupled from the underlying mechanisms.

Example 2:

void handleConnection(socket s);

void runWWWServer()
{
    socket s;
    s.listen(80);

    while (true)
    {
        socket a = s.accept();

        async threaded
        {
            handleConnection(a);
        }
    }
}

Above, the WWW server spawns a conceptually separate thread for each accepted client connection, and each connection is processed in that thread by the handleConnection() function.

Similarly to the previous example, it is not really important whether a new physical thread is actually created for each new connection - what matters is only that each connection is handled in parallel with the other actions in the program and that the main server loop does not have to wait while individual requests are processed.

Visibility of names

The tricky part is to provide some way for the threads to see names from the enclosing local scope.

Let's consider the threaded control statement first:

void fun()
{
    int i = 7;
    int j = 8;

    threaded
    {
        cout << i << endl;
    }
    other
    {
        j += i;
    }

    cout << j << endl;
}

The above statements are supposed to print the value of variable i (7) and add it to the variable j - possibly in parallel. After both of these actions are finished, the value of j is printed (which should be 15).

Since the threaded control structure does not conceptually leave the enclosing local scope (here, the scope of function fun()), all names available in that scope can also be visible in the scopes created by the branches of the threaded statement - the objects from the enclosing local scopes are guaranteed to live throughout the execution of all branches.

From the implementation point of view the visibility of names could be achieved by passing references to all used objects (but only from the enclosing local scopes) to all new threads. Of course, if any two branches of the threaded statement need to modify the same object from the enclosing local scope, they have to synchronize their actions.

The async threaded control statement is a bit more problematic:

void fun()
{
    int i = 7;

    async threaded
    {
        cout << i << endl;
    }
}

The above example is supposed to print the value of variable i (7), but the function fun() may actually return before the printing takes place. This means that there is a problem of referencing names from the enclosing scope that may no longer exist at the time they are used in the separate thread.

To solve this problem, the automatic objects from the enclosing scope (those which are used in the async threaded block) are copied to the separate thread's stack before the spawning thread is allowed to continue. Local objects with static storage duration can be accessed by reference. The copy performed for automatic local objects should be conceptually equivalent to the following:

T t; // some local object referenced inside the async threaded block

async threaded // at this point the main thread is blocked
{
    T __my_own_copy(t);

    // at this point both threads are allowed to continue

    // ...
}

This would mean that in the following example:

void fun()
{
    int i = 7;
    int &r = i;

    async threaded
    {
        i = 8; // (1) safe, but has no effect on the enclosing scope
        r = 9; // (2) dangerous
    }

    // ...
}

what actually happens is conceptually equivalent to this:

void fun()
{
    int i = 7;
    int &r = i;

    async threaded // at this point the calling thread (the one executing fun()) is blocked
    {
        int __i(i);
        int &__r(r);

        // at this point the calling thread is allowed to continue

    __i = 8; // (1) safe, but has no effect on the enclosing scope: working on a local copy
    __r = 9; // (2) might be a dangling reference if the enclosing scope no longer exists
    }

    // ...
}

Above, the line marked as (1) has no effect on the variable i from the enclosing scope, even when proper synchronization is used to ensure memory visibility between threads (and even assuming that the enclosing scope still exists when this is performed), because the variable __i is a local copy and it is that copy which is modified, not the original variable i. The line marked as (2) is dangerous if the threads do not perform any synchronization to ensure that the automatic object i still exists when this line is executed. The language is not in a position to "fix" this, since it is always possible to create a dangling reference (or a pointer), and threads are not special in this matter.

The subject of dangling references and its possible solutions appears also in another article, Possible Syntax for C++ Lambda.

Library support

The library support is needed to provide some set of synchronization objects. At the minimum, objects like a mutex (possibly with an RAII handler like mutex::lock) and a condition variable are needed. RW-locks and other specialized (or higher-level) synchronization objects are also welcome. In any case, the important thing is that threads synchronize themselves (and with each other) using objects.

It is possible to imagine further language (built-in) support for implementing monitors or protected objects, but this is orthogonal to the main subject of this article. The focus here was on presenting possible control structures and syntax for expressing concurrency in the program, and this seems to be independent of the various strategies threads may use to synchronize their work.

Feasibility and similar existing solutions

The syntax and control structures presented in this article can be implemented in terms of POSIX threads. Depending on the context, it may also map into the parallel execution in multicore CPUs.

Note also that the threaded/other construct is not conceptually new. It is in fact what is known as cobegin, and it exists in a very similar form, for example, in occam2 as the PAR construct.

Known questions

Problems