
Husband’s notes: C³ as a Systems Programming Language

Bjarne Stroustrup describes C++ as a “general purpose programming language with a bias towards systems programming”. I aim to make C³ simply a “general purpose programming language”. It will provide high-level features that I hope will qualify C³ as the best language for tasks that are currently best served by domain-specific, scripting, or other general purpose languages. This is an incremental improvement over C++: C³ will be suitable for a wider range of software projects than C++. It does not mean that systems programming will receive less care! Some languages make low-level tasks harder or even impossible because they try to provide high-level facilities. In C³, all “levels” will get full support! In today’s notes, I explain the design choices that will make C³ the best systems programming language: there will be no bias, but there will be no compromise either.

I want C³ to become the first choice for systems programming tasks such as heavy-load application infrastructures, cutting edge 3D games, real-time applications, and operating systems. To achieve this, C³ will follow the C++ design guideline “you don’t pay for what you don’t use” even more than its predecessor. (See The Design and Evolution of C++.)

I always hear about new languages that add “just a little bit” of overhead to enable easier programming techniques. For example, they have garbage collection and built-in types like strings and maps. I believe it is possible to design a core language that will make easy programming possible without these integrated features. The key is to make the “user-defined” features first-class in syntax and performance as if they were integrated. Some C++ features like copy constructors, operator overloading and inline functions are steps in this direction but the resulting types often have rough edges and are not as efficient as built-in types. C³ users will be able to define seamless and fully optimizable modules, data types, and even control structures! Features like communication facilities, dynamic arrays, and loop structures (including multi-threaded ones!) will be available in the C³ library.

Because I want to support all possible programming environments, there will be no core language features that require any kind of system runtime support. When coding an OS, you’re on your own; there is no system support! The only requirement is a compiler back-end that maps a set of “native” abstractions to the target machine language. Everything will be provided to build higher abstractions from the ground up. Of course, many facilities that require support will be provided! You’ll find them in the C³ library and they’ll be very easy to use, as explained in my previous point.

The C programming language succeeded for OS programming because its core abstractions were close enough to the available hardware platforms. But we can do even better! An innovation in C³ is that the set of native abstractions may change between target machines. This flexibility will allow the C³ system programmer to target different memory models (remember near and far pointers?) and new computer architectures (quantum computing, anyone?). Moreover, it will allow targets that are not necessarily hardware. Someone could implement a back-end that targets the JVM, the CLR, or even JavaScript (for web applications).

C is still the number one language for kernel-level programming. Unfortunately, it makes OS hacking harder than it should be because C does not provide high-level features. C++ tried to combine the best of both worlds but failed in this niche, in part because of its system support requirements, even if they are small. It also does not give enough control over the implementation of its complex features. For example, the exact layout of an object’s vtable must sometimes be under the control of the programmer. (See this interesting discussion featuring the Linux creator.) In C³, everything that is higher-level than C constructs will be fully customizable and implemented in C³. For known object-oriented features, usage will be mostly like in C++, but the definition will be accessible in the C³ library instead of the compiler. This will also enable programmers to design their own interoperability layer with other languages.
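To make the vtable point concrete, here is a minimal C++ sketch of a dispatch table written by hand, so that the programmer rather than the compiler decides which slots exist and in what order. All the names (ShapeVTable, Rect, MakeRect, AreaOf) are hypothetical and chosen only for illustration. It shows the C-style technique kernel code often uses today, not how C³ will expose the feature.

```cpp
// Hypothetical hand-written dispatch table. The slot order is fixed by
// this declaration, not by a compiler's ABI.
struct ShapeVTable {
    const char* (*name)(const void* self);  // slot 0
    int (*area)(const void* self);          // slot 1
};

struct Rect {
    const ShapeVTable* vtable;  // placed first by the programmer's choice
    int w;
    int h;
};

static const char* RectName(const void*) { return "rect"; }

static int RectArea(const void* self) {
    const Rect* r = static_cast<const Rect*>(self);
    return r->w * r->h;
}

// One shared, read-only table for every Rect instance.
static const ShapeVTable kRectVTable = { RectName, RectArea };

Rect MakeRect(int w, int h) {
    Rect r = { &kRectVTable, w, h };
    return r;
}

// "Virtual" call: an explicit indirection through the table.
int AreaOf(const Rect& r) { return r.vtable->area(&r); }
```

The price of this control is that nothing checks the casts or the table contents, which is exactly the kind of boilerplate a library-defined object model could hide.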

I could continue to write for hours about various details that will make systems programming in C³ both effective and enjoyable but I also have some compiler code to write. Maybe next time I will talk more about the C³ library that I keep mentioning without further explanations… See you soon!

Why would someone build their own compiler?

Many programmers live in a very fluffy world, where there is an API for everything and where a nice compiler turns all that nice code into something that the machine will interpret correctly. Others will go further, and hack their way through difficult code. Only a few will come to the end of the darkness, where no compiler fits their needs.

I admit this isn’t something that is often required, but once you have faced it, you never want to be in that position again. You have to choose between making your own compiler and a big, ugly hack that you will regret for a long time, telling yourself “if only I had access to the compiler sources, I could change this one little thing and everything would be perfect”.

It happened to me once in my short career as a programmer, back when I was working on porting someone else’s application to a new platform. We were working on a library that handled the generic part of most of our tasks, not an unusual strategy. For each project, we received frequent drops of the client’s code, so we wanted to minimize the changes to their code. In one particular project, we had a nasty crash that was hard to reproduce: it took a while to happen, it seemed random, and no specific reproduction steps could be identified. The stack trace didn’t give much information about what it was, but we finally identified the source: the memory management system in the client’s code.

The client’s code used a memory pool, which is very common in this kind of program. They defined a global new and delete in order to route everything through their memory management system. This means everything we created in our library with the global new and delete (int, long, etc.) was also managed by the memory pool. The buffers were already at their limit, so linking with our library made everything explode. Blindly increasing the size of the buffers solved this problem, but created others. A brief analysis of upcoming projects showed us that a similar problem could happen again. We had to make our library “global new and delete proof”.
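The effect is easy to reproduce. The following sketch assumes a stand-in "pool" that only counts requests (g_pool_requests, LibraryMakeInt and PoolRequests are hypothetical names, not the client's code). It shows that once a program replaces the global operator new, every new-expression in every linked library goes through the replacement.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the client's pool: it only counts requests.
static std::size_t g_pool_requests = 0;

// Replacing the global operator new affects the whole program,
// including libraries that know nothing about the pool.
void* operator new(std::size_t size) {
    ++g_pool_requests;
    if (void* p = std::malloc(size ? size : 1)) {
        return p;
    }
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical "library" function written without any pool in mind.
int* LibraryMakeInt() { return new int(42); }

std::size_t PoolRequests() { return g_pool_requests; }
```

Calling LibraryMakeInt() raises the counter even though the function never mentions the pool, which is how a library can silently exhaust buffers sized for the client's code alone.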

The clean solution would have been to create a new and delete specific to a namespace, but this is impossible without modifying the compiler. We didn’t have the budget or the manpower to buy or create a compiler, so we resigned ourselves to another solution. We replaced everything with malloc, added a note to the training program that new and delete were banned, and gave up on integrating other libraries in which we couldn’t make this ugly modification.
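For objects with constructors, a malloc-based workaround usually pairs malloc with placement new. The sketch below is one way that could look. LibCreate, LibDestroy and Packet are hypothetical names, and this is not necessarily what our library did. It assumes types with ordinary alignment, since malloc makes no promise for over-aligned ones.

```cpp
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical helper: allocate with malloc and construct in place, so a
// replaced global operator new is never involved.
template <typename T, typename... Args>
T* LibCreate(Args&&... args) {
    void* raw = std::malloc(sizeof(T));
    if (!raw) {
        throw std::bad_alloc();
    }
    try {
        // Placement new only runs the constructor; it does not allocate.
        return ::new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        std::free(raw);  // do not leak if the constructor throws
        throw;
    }
}

// Hypothetical counterpart: run the destructor, then release the memory.
template <typename T>
void LibDestroy(T* p) noexcept {
    if (!p) {
        return;
    }
    p->~T();
    std::free(p);
}

// Small example type used only to demonstrate the helpers.
struct Packet {
    int id;
    int size;
    Packet(int i, int s) : id(i), size(s) {}
};
```

A call such as LibCreate<Packet>(7, 128) must be matched by LibDestroy, never by delete. That discipline is what makes the workaround feel like a hack compared with a scoped new and delete.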

There are many other reasons to build your own compiler. The Excel team at Microsoft had its own C compiler, which allowed optimizations not possible otherwise and gave them total control over their software. You can read more about this in this very interesting article from Joel on Software.

My husband wants to make the compiler code open (the license has yet to be defined). Once he has bootstrapped it, he’ll make it available to the world. An important aspect of C3 is that you should not find yourself limited by the language itself. You may need to work hard for something unusual, but everything should be possible (code-wise).

The early days

The first ideas about C3 emerged while my husband was at university. He was a first-class student with very high marks even though he wasn’t the most studious. This led him to internships in major companies, but the inspiration didn’t really come from there; the dream was already in him before he entered university in 2000. At that time, he wanted to create a language as simple as Visual Basic, but as powerful as C++. Learning more about C++ made the idea evolve into something else: the more he coded, the more ideas he had about a new language that would get the name C3 three years later.

The first step into reality was a C3 parser made with Spirit, which is a C++ compile-time parser generator based on expression templates. This parser takes a C3 file, creates the abstract syntax tree and outputs it to a file in XML format. This is something of a proof of concept, a demonstration that the grammar works on practical examples. He acknowledges that this isn’t a full proof, because he hasn’t shown that it works for all possibilities. This isn’t a priority anyway; many languages haven’t proved that either. This experience proved two important things. First, the grammar is well designed and there is no need for hacks to generate the abstract syntax tree (hacks that many languages, including C++, do need). Second, he is a good enough programmer to handle such a project! I wouldn’t have married him if I wasn’t sure of that. Ah ah! At least, I wouldn’t be writing this blog.

There is a big void after this. He talked about it quite often and probably thought a lot about it, and things continued to grow in his head, but no coding happened for a while. That’s when I met him, though I don’t feel I was the distraction. The company we worked for was very small, and even though we had fun, we worked long hours, which didn’t leave much inspiration for coding at home. This lasted from 2004 to 2008. The company changed a lot in this period: there are now more than 10 times as many people as there were when we were hired. At the end of 2008, my soon-to-be husband completed a very important project and took a few weeks off, connecting with the Christmas vacation, which made a full month of spare time. In the second week, he started coding again, as if this essential need to create C3, buried for too long, had found the surface once again. He hasn’t stopped since, and now that he has transferred to a more stable department, I feel confident he won’t stop. I hope that my popularization work will help him keep his motivation; this is my main goal in writing this blog (and maybe to entertain you all a little, ah!).

Why the name C3

My husband told me last night the meaning of the name C3. He was surprised that he had not told me about this a long time ago. Of course, the notion of a third C comes to mind, the first and second ones being C and C++. That’s all right, and my husband thinks that it’s a worthy successor, one that keeps the spirit of its predecessors, at least more than other pretenders like C# and D.

You can call it “C three” if you want, but in fact the real pronunciation is “C cubed”. This notion of volume expresses the orthogonality of the language, the fact that all paradigms can be used in conjunction with the others. In fact Cⁿ would have suited the language better, but still, the notion of volume says something very important: what is drawn inside simple axes is bigger than the sum of its basic elements.

For some time he thought about changing the language name, maybe to break with the C legacy. But when you have called something by a name for that many years, it’s hard to get rid of it. I put an ultimatum on this when I bought my domain name. There is no intention to change it now.

Husband’s notes: Multiparadigm

C3 is a multi-paradigm language. It means that, like in C++, you can mix and match various programming styles. Here is the classical multi-paradigm example combining object-oriented, generic and functional programming and given by Bjarne Stroustrup himself:

for_each(shapes.begin(), shapes.end(), mem_fun(&Shape::Draw));

where Shape is a base class with a Draw virtual function and shapes is a container of pointers to different kinds of shape. The object-oriented part is the polymorphic call to Draw that will automatically dispatch to the appropriate Draw function depending on the actual type of the shapes. The generic part is the for_each algorithm that can be applied to any container of any type.

The mem_fun function is just an adapter that converts the member function pointer into a functor that accepts a single argument: an object pointer. This last part constitutes the “functional” part of the example but, as you can see, C++ does not handle this paradigm in an elegant way. The mem_fun function is a necessary evil because the object-oriented syntax is not directly compatible with function objects as first-class citizens.
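For readers who want to run the C++ one-liner, here is a self-contained sketch. It swaps in std::mem_fn (C++11) because std::mem_fun was removed in C++17; the adapter plays the same role. Circle, Square, DrawAll and the log string are hypothetical additions used only to make the dispatch observable.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

struct Shape {
    virtual ~Shape() {}
    virtual void Draw() = 0;
};

// Hypothetical shapes that record what was drawn in a shared log.
struct Circle : Shape {
    explicit Circle(std::string& log) : log_(log) {}
    void Draw() override { log_ += "circle;"; }
    std::string& log_;
};

struct Square : Shape {
    explicit Square(std::string& log) : log_(log) {}
    void Draw() override { log_ += "square;"; }
    std::string& log_;
};

// Object-oriented (virtual Draw), generic (for_each) and functional
// (member function adapted into a callable) in a single statement.
void DrawAll(const std::vector<Shape*>& shapes) {
    std::for_each(shapes.begin(), shapes.end(), std::mem_fn(&Shape::Draw));
}
```

With a Circle followed by a Square in the container, the log ends up as "circle;square;", which confirms that each call is dispatched on the dynamic type of the pointed-to object.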

Please compare this last example with the same thing, written in C3:

foreach(shapes) {
    {1}->Draw();
}

The following syntax is also valid and equivalent:

foreach(shapes, {1}->Draw());

The first syntax is valid because the last parameter of a function can be placed outside the parentheses when it is itself a function. This allows us to define basic language constructs like for and foreach loops as functions.

The {1} refers to the first parameter of a lambda procedure that is implicitly defined. The following example makes more explicit use of a lambda procedure:

foreach(shapes) void(Shape* pShape) {
    pShape->Draw();
}

And, if we decide to name the procedure and make it global:

void DrawShape(Shape* pShape) {
    pShape->Draw();
}

foreach(shapes, DrawShape);

Since a method is automatically also a procedure with an additional “this” parameter, we can also skip the additional procedure and simply pass the Draw method as the parameter:

foreach(shapes, Draw);

But the previous examples will be more common since we usually need to do a little more than call a single function inside the loop.

C3 algorithms act on ranges. Containers are interpreted as ranges for all their content but there are other ranges that can be passed. For example, there is a method in random access containers to get a reference to a subrange of their content. It is used here to get the first 5 elements:

foreach(shapes.subrange(0, 5), Draw);
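For comparison, the closest C++ equivalent passes a pair of iterators instead of a range object. The helper below is a hypothetical sketch (ForEachSubrange is not an STL function), and it assumes that subrange(0, 5) means the half-open index interval [0, 5).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: apply `action` to the elements at indexes
// [first, last) of a random access container, clamped to its size.
template <typename Container, typename Action>
void ForEachSubrange(Container& c, std::size_t first, std::size_t last,
                     Action action) {
    typedef typename Container::difference_type Diff;
    const std::size_t size = static_cast<std::size_t>(c.size());
    last = std::min(last, size);
    first = std::min(first, last);
    std::for_each(c.begin() + static_cast<Diff>(first),
                  c.begin() + static_cast<Diff>(last), action);
}
```

In C++ the clamping and iterator arithmetic must be spelled out by the caller or by a helper like this one. A range object that carries both ends hides that bookkeeping.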

As a bonus for those interested, without additional explanations, here is the definition of the foreach procedure:

template <type T>
void foreach(Range<T> _range, void _action(T)) {
    for(Iterator<T> i = GetIterator(_range); !AtEnd(_range, i); ++i) {
        _action(*i);
    }
}
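To give an idea of how this definition could map onto today's C++, here is a rough model. Range<T>, GetIterator, AtEnd and MakeRange are hypothetical C++ stand-ins written for this sketch, with a range reduced to a pair of pointers. The real C3 library types are not shown in this post and may look quite different.

```cpp
#include <vector>

// Hypothetical C++ model of a range: a pair of pointers into
// contiguous storage.
template <typename T>
struct Range {
    T* first;
    T* last;
};

template <typename T>
T* GetIterator(const Range<T>& r) { return r.first; }

template <typename T>
bool AtEnd(const Range<T>& r, T* i) { return i == r.last; }

// A container is viewed as a range over all of its content.
template <typename T>
Range<T> MakeRange(std::vector<T>& v) {
    Range<T> r = { v.data(), v.data() + v.size() };
    return r;
}

// Same loop shape as the C3 definition; the action is any callable.
template <typename T, typename Action>
void ForEach(Range<T> range, Action action) {
    for (T* i = GetIterator(range); !AtEnd(range, i); ++i) {
        action(*i);
    }
}
```

The main difference is the action parameter: C++ needs a second template parameter to accept lambdas and functors, while the C3 signature above declares it directly as a procedure taking a T.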

I will explain more details of the C3 algorithm library in another post. Meanwhile, make sure you know STL since my library will be mostly inspired by Alex Stepanov’s masterpiece.