Specifying C++ Concepts

Gabriel Dos Reis and Bjarne Stroustrup. Specifying C++ Concepts. POPL06. January 2006.

We discussed work on improving the C++ template facility before. The basic notion in this paper is concepts, a type system for templates, which will increase the expressiveness of template parameters, and improve compile time error messages. Separate compilation is also an important concern.

I am happy to report that Ada is mentioned, which is a good sign. However, I think there's room for a more detailed discussion and comparison. Other LtU readers will be glad to see Haskell mentioned as well...

Bad template error reporting is not caused by lack of concepts.

All that is needed is to move the error reports outside of the template classes. For example:

template < class T > T square(T a, T b) {
    return a * b;
}

int main(int argc, char* argv[]) {
    const char *str = "hello world";
    square(str, str);
    return 0;
}

The error report of the MSVC++ 6.0 compiler is:

C:\temp\test_templ\test_templ.cpp(8) : error C2296: '*' : illegal, left operand has type 'const char *'
        C:\temp\test_templ\test_templ.cpp(15) : see reference to function template instantiation 'const char *__cdecl square(const char *,const char *)' being compiled
C:\temp\test_templ\test_templ.cpp(8) : error C2297: '*' : illegal, right operand has type 'const char *'
        C:\temp\test_templ\test_templ.cpp(15) : see reference to function template instantiation 'const char *__cdecl square(const char *,const char *)' being compiled

while it should have been this:

C:\temp\test_templ\test_templ.cpp(15) : error C7296: datatype 'const char *' does not support 'operator *'

or better:

C:\temp\test_templ\test_templ.cpp(15) : error C7296: multiplication is not defined over strings.

The idea above is to move error reporting from the template to the instantiation.
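
As a rough illustration of reporting at the instantiation rather than inside the template, here is a minimal sketch in standard C++11 (which postdates this thread). It assumes a hypothetical helper trait, has_multiply, that is not part of any library. Because the requirement `a * b` is stated in the signature of square, a call with 'const char *' is rejected at the call site as "no matching function" instead of failing inside the body.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper trait (an assumption for this sketch, not a library
// facility): true when `a * b` compiles for two values of type T.
template <class T, class = void>
struct has_multiply : std::false_type {};

template <class T>
struct has_multiply<T, decltype(void(std::declval<T>() * std::declval<T>()))>
    : std::true_type {};

// The requirement `a * b` is part of the declaration, so a type without
// operator * removes square() from overload resolution and the compiler
// reports the problem at the call, not inside the body.
template <class T>
auto square(T a, T b) -> decltype(a * b) {
    return a * b;
}

static_assert(has_multiply<int>::value, "int supports operator *");
static_assert(!has_multiply<const char*>::value,
              "const char * does not support operator *");

inline void example_usage() {
    assert(square(3, 4) == 12);
    // square("hello", "world");  // rejected at this line: no matching function
}
```

The wording of the diagnostic still depends on the compiler; the sketch only changes where the error is attributed.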

Of course better error reporting will not totally solve all the related problems, so concept checking is a good idea (plus it opens the door for better compile-time programming). If C++ had a proper type system, then everything would be a subtype of something, and concepts would not be needed for better error reporting. The division between user-defined types and language types is an artificial one that has harmed the language all these years.

But as it is, it would be helpful to have a concepts system. The concepts syntax seems peculiar though: wouldn't it be better to use a signature-based syntax? For example, the forward mutable iterator concept could be written like this:

template < class T > concept MutableForwardIterator {
    bool operator != (MutableForwardIterator < T > &) const;
    T operator * () const;
    MutableForwardIterator < T > operator = (const MutableForwardIterator < T > &);
    T operator ++();
};

I think it is a bad idea to have to declare variables in order to show to the compiler what I want to do. It's like showing somebody something by using examples, instead of a formal theory. The above seems to me much better than this (taken from the document):

concept Mutable_fwd < typename Iter, typename T > {
    Var < Iter > p;
    Var < const T > v;
    Iter q = p;
    bool b = (p != q);
    ++p;
    *p = v;
};
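
For comparison, the usage pattern quoted from the paper can be approximated in existing C++ by a detection trait that lists the same expressions. This is only a sketch under assumptions: the names is_mutable_fwd, make_void and void_t are hypothetical local helpers (void_t mirrors the C++17 std::void_t), and the trait checks that the expressions compile, nothing more.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// C++11 stand-in for std::void_t (local helper, assumed for this sketch).
template <class...> struct make_void { typedef void type; };
template <class... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: the usage pattern is not satisfied.
template <class Iter, class T, class = void>
struct is_mutable_fwd : std::false_type {};

// Specialization chosen only when every listed expression is well-formed.
template <class Iter, class T>
struct is_mutable_fwd<Iter, T, void_t<
    decltype(Iter(std::declval<Iter&>())),                           // Iter q = p;
    decltype(bool(std::declval<Iter&>() != std::declval<Iter&>())),  // bool b = (p != q);
    decltype(++std::declval<Iter&>()),                               // ++p;
    decltype(*std::declval<Iter&>() = std::declval<const T&>())      // *p = v;
>> : std::true_type {};

static_assert(is_mutable_fwd<int*, int>::value,
              "a plain pointer satisfies the pattern");
static_assert(is_mutable_fwd<std::vector<int>::iterator, int>::value,
              "a vector iterator satisfies the pattern");
static_assert(!is_mutable_fwd<std::vector<int>::const_iterator, int>::value,
              "a const_iterator fails at *p = v");

inline void example_usage() {
    assert((is_mutable_fwd<const int*, int>::value == false));
}
```

Note that the trait never names a member function signature; it accepts any type for which the expressions compile, which is the point of the usage-pattern style.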

If C++ had a proper type


If C++ had a proper type system, then everything would be a subtype of something, and concepts would not be needed for better error reporting. The division between user-defined types and language types is an artificial one that has harmed the language all these years.

Why they don't want to use explicitly declared subtyping for concepts is argued in the article quite clearly (and is in contrast to the whole design of templates in C++). If you mean that implicit subtyping should be used, then concepts can be seen as that, IMHO.

The concepts syntax seems peculiar though: wouldn't it be better to use a signature-based syntax?

Because declaring all the possible operators in question would take much more code. The problem (as described in the article) is that it is very hard to take care of all the possible ways in which an expression is interpreted (directly applying the function, using an intermediate conversion or two, et cetera).
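
One concrete case where an exact signature and a valid expression disagree is std::vector<bool>: dereferencing its iterator yields a proxy object rather than bool&, yet "*p++ = v" still compiles. The sketch below uses only standard library facilities; the names BitIter and write_first are made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

typedef std::vector<bool>::iterator BitIter;

// A signature such as `bool& operator*() const` would not match this
// iterator: the result of *p is a proxy class, not bool&.
static_assert(!std::is_same<decltype(*std::declval<BitIter&>()), bool&>::value,
              "dereferencing yields a proxy object, not bool&");

// The expression form still works, because the proxy is assignable from bool.
// Precondition: bits must not be empty.
inline void write_first(std::vector<bool>& bits, bool v) {
    BitIter p = bits.begin();
    *p++ = v;
}

inline void example_usage() {
    std::vector<bool> bits(2, false);
    write_first(bits, true);
    assert(bits[0] && !bits[1]);
}
```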

If C++ had a proper type system (2)

Why they don't want to use explicitly declared subtyping for concepts is argued in the article quite clearly

There is no mention of the word 'subtype' or its derivatives anywhere in the article. Despite that, I can understand the reasons subtyping is not equal to concepts: subtyping forces a strong hierarchical structure, but usually things do not fall nicely into hierarchical taxonomies.

But I have two things to point out:

1) subtyping would really help declare fewer concepts. For example, if I wanted an algorithm to accept only int-compatible types, I wouldn't have to declare the concept of int: I would use int subtypes.

2) there need not be new syntax, but only a new keyword: the keyword 'like'. It would be used to declare a type to be 'like' another one, i.e. to have the same signatures. For example:

class MyInterface {
    void method1();
    int method2();
};

template < class T like MyInterface > class MyTemplate {
};

Because declaring all the possible operators in question would take much more code. The problem (as described in the article) is that it is very hard to take care of all the possible ways in which an expression is interpreted

It's better having to write more code that is formal and follows established rules instead of having to write less code that is strange and introduces many unfamiliar paradigms. I expect lots of people to 'freak out' with concepts using variables, and once more C++ will be deemed 'complex'.

Concepts


1) subtyping would really help declare fewer concepts. For example, if I wanted an algorithm to accept only int-compatible types, I wouldn't have to declare the concept of int: I would use int subtypes.

I'm not quite sure what you mean here, but I'll say two things which both are connected to subtyping.

One of the proposed standard concepts is Usable_as<A, B>, which means that elements of type B are usable as elements of type A. (Usable_as<Shape*,B> being the standard example).

Regarding specifying concepts: in section 3.1 of the article, they write that a concept may refer to other concepts, effectively reusing their definition. This makes it possible to extend previously defined concepts.


2) there need not be new syntax, but only a new keyword: the keyword 'like'. It would be used to declare a type to be 'like' another one, i.e. to have the same signatures.

Again, there is the problem of growth in the number of functions one needs to specify. A good example is the expression "*p++ = v;" (from section 2.5). Specifying it fully using signatures and types requires 2 auxiliary types and 15 functions.

For more on the design decisions behind concepts, see N1522.


It's better having to write more code that is formal and follows established rules instead of having to write less code that is strange and introduces many unfamiliar paradigms. I expect lots of people to 'freak out' with concepts using variables, and once more C++ will be deemed 'complex'.

I don't really understand why you think that code specifying the valid expressions is so strange. For me, this is what I try to model when I write the interface of a class ("I want to be able to add this class with such an argument, I want to be able to call size() and get an integer, etc."). The strangest part of the proposal, in my view, is the explicit assertions.

If you want, you could take a look at the other proposal for concepts. It may be more to your liking.

I'm not quite sure what you

I'm not quite sure what you mean here

It's simple, really: we want a template parameter to follow the interface specified either in a concept or in a supertype. For classes, there is no problem. For primitive types, though, there is. For example, one concept over primitive types is 'number', i.e. all numbers. Had 'int', 'double', etc. been subtypes of 'number', there wouldn't be an issue of defining a concept for 'number'. In other words, the problem is that primitives are not classes.

One of the proposed standard concepts is Usable_as<A, B>, which means that elements of type B are usable as elements of type A. (Usable_as<Shape*,B> being the standard example).

It is an ugly syntax.

Regarding specifying concepts: in section 3.1 of the article, they write that a concept may refer to other concepts, effectively reusing their definition. This makes it possible to extend previously defined concepts.

Concepts could inherit from other concepts. If there was no separate concept declaration, the standard class declaration scheme would be enough; it would also allow using other classes as concepts.

Again, there is the problem of growth in the number of functions one needs to specify. A good example is the expression "*p++ = v;" (from section 2.5). Specifying it fully using signatures and types requires 2 auxiliary types and 15 functions.

First of all, only 'operator =', 'operator ++', 'operator *' are needed.
Secondly, I would rather have proper class syntax than this poor 'show by example' stuff.
Thirdly, it will create lots of problems for IDEs and other tools.
Fourthly, if C++ were suddenly found to be too verbose, we might as well dump it and start all over. It would be easier to make a source-to-source translator anyway.

I don't really understand why you think that code specifying the valid expressions is so strange.

1) it breaks all established concepts so far; for C++, the concept of class is central: interfaces, subtyping, encapsulation, inheritance, message passing, strong typing are all things revolving around classes. Suddenly we have another thing to digest: the concept thing (concept foo {}), which not only does not behave like a class, but also has weird syntax.

2) it will create problems for compiler vendors, since what is now code shall be treated as interface declarations. It's already complex to make a proper C++ compiler, now it shall be even more difficult. Of course that's not my problem really, but as a programmer, I have found compiler changes create unforeseen consequences. It will also take ages to correctly implement such a thing.

3) it's more difficult to think about interfaces in terms of variables than in terms of types. If we have types only, all we need to think about is what types are allowed in our operations. If we have variables, we have an infinite amount of trouble on our hands. Functional languages try to eliminate variables, and C++ is going to use them in places they are not meant for? Now that is strange!

4) newbies will be even more puzzled. They will see code that is not code, but it declares an interface. But that's what classes are for!

I want to be able to call size() and get an integer, etc.").

What's wrong with this?

class MyPreciousClass {
    int size() const;
};

template < class T like MyPreciousClass > void myFunction(T *a) {
    ...
}

Perfectly understandable C++ (even to newbies), does not break already established concepts, it only needs one more keyword, syntax easy on the eye...
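
Since 'like' is a proposed keyword and not valid C++, the constraint above can only be approximated today. A minimal sketch, assuming a hypothetical trait has_int_size and a hypothetical function sizeOf (neither is from the thread or any library), checks for the one signature MyPreciousClass declares:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when `t.size()` on a const T returns exactly int.
template <class T, class = void>
struct has_int_size : std::false_type {};

template <class T>
struct has_int_size<
    T, typename std::enable_if<std::is_same<
           decltype(std::declval<const T&>().size()), int>::value>::type>
    : std::true_type {};

// Stand-in for `template < class T like MyPreciousClass >`: the function is
// only available for types that pass the check.
template <class T>
typename std::enable_if<has_int_size<T>::value, int>::type
sizeOf(const T* a) {
    return a->size();
}

// Example types, invented for this sketch.
struct Box   { int  size() const { return 3; } };
struct Blob  { long size() const { return 3; } };  // size() exists, wrong type
struct Empty {};

static_assert(has_int_size<Box>::value, "Box matches the interface");
static_assert(!has_int_size<Blob>::value, "long size() is not int size()");
static_assert(!has_int_size<Empty>::value, "no size() at all");

inline void example_usage() {
    Box b;
    assert(sizeOf(&b) == 3);
}
```

Blob shows the trade-off argued over in this thread: an exact-signature check rejects a type whose size() result would convert to int without trouble.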

If you want, you could take a look at the other proposal for concepts. It may be more to your liking.

I agree with those guys, but I think the keyword 'concept' and the 'where' part are redundant. Concepts are all about interfaces; the current syntax is more than capable (with the addition of a keyword similar to 'like') of specifying any concept.

huh?

The division between user-defined types and language types is an artificial one that has harmed the language all these years.

How on earth does that follow? C++ goes way out of its way to ensure that abstract data types can behave just like built-in types. This division has been a great thorn in my side in Java, but doesn't exist in C++. Now, granted, not having this division makes the language pretty complex, what with different kinds of initializers, copy constructors, operator overloadings, &c., but the extent to which a class you code can be made to work just like a built-in unsigned char is one of the great strengths of C++. Maybe people don't like C++ because it's not as brain-dead as Java, but lack of ADT support is not among its faults!

Built-in types can not be subclassed.

Is 'int' a class? Can I do the following?

class MyInt : public int {
};

I can not do it, because 'int' is not a class. Why is 'int' not a class? C++ goes a long way to make classes behave like primitives, but primitives are not classes. Since, in C++, objects are value types, there is no reason why primitives should not be classes.

Let me remind you that C++ offers almost all the semantics of classes to primitives, except inheritance and operator (). Primitives have a default constructor (which initializes the primitive to a default value), copy constructors, assignment and arithmetic operators, logical operators, operator new and delete, etc. But they are not classes!
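
The class-like syntax available to 'int' can be checked directly. The sketch below uses only standard C++11 type traits; one caveat worth stating is that the "default constructor" behaviour applies to the explicit form int(), while a plain `int x;` at block scope leaves the value uninitialized.

```cpp
#include <cassert>
#include <type_traits>

static_assert(!std::is_class<int>::value, "int is not a class type");
static_assert(std::is_fundamental<int>::value, "int is a fundamental type");

inline void example_usage() {
    int zero = int();       // value-initialization: yields 0
    int copy(zero);         // constructor-style copy syntax
    copy = zero + 1;        // assignment and arithmetic operators
    int* heap = new int();  // operator new, also value-initialized to 0
    assert(zero == 0 && copy == 1 && *heap == 0);
    delete heap;            // operator delete
    // class MyInt : public int {};  // ill-formed: a base must be a class type
}
```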

Point taken

I hadn't thought about it from the other direction -- you're right.

However, with int declaring no virtual functions or anything of the sort, without turning on RTTI, your MyInt wouldn't be very useful. I think the reason is simply that C++ isn't purely object-oriented, but lets you write very C-like code if you want, and making all datatypes into classes would only be useful if you then make their operators into virtual member functions instead of standalone functions. Needing to have a vtbl (and the associated indirect dispatch) just to use an int would certainly be painful to some of the embedded users of C++.

Since one of the guiding lights of C++ is "you don't pay for what you don't use", I think that fixing this could only be done if you either:

A) abandoned C++'s goal to be a "better C" -- not likely

B) came up with some serious type-trait template hackery for built-ins -- beyond my skills

However, with int declaring

However, with int declaring no virtual functions or anything of the sort, without turning on RTTI, your MyInt wouldn't be very useful.

Actually, it would be VERY useful. It would allow us programmers to declare strong subtypes of primitives, without needing to type many unnecessary things.

Let me give you a real example: in one of the projects at work, we were using 'int' to express seconds and milliseconds. Since even 'typedef' does not really introduce a new type, at some obscure point in the code (we are talking about 120,000 lines of code), milliseconds were assigned to seconds, creating havoc and a very difficult to trace bug. If we could just inherit from 'int', we would have created two classes, specifying the conversion from/to seconds to milliseconds. Of course we could do that with existing C++, but we did not have time to go into typing all the code needed for a class to emulate int behaviour. If we could subclass 'int', then we would have to type much less code.
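
To give an idea of how much typing the existing-C++ route involves, here is a minimal sketch of a tagged wrapper. All names (Quantity, SecondsTag, MillisecondsTag, to_milliseconds, to_seconds) are hypothetical, and only a few operators are shown; a class that emulates all of int's behaviour would need more. The type traits used in the checks are C++11, which postdates the project described.

```cpp
#include <cassert>
#include <type_traits>

// One wrapper template; each Tag produces a distinct, unrelated type.
template <class Tag>
class Quantity {
public:
    explicit Quantity(int v) : value_(v) {}
    int count() const { return value_; }
    Quantity operator+(Quantity o) const { return Quantity(value_ + o.value_); }
    Quantity operator-(Quantity o) const { return Quantity(value_ - o.value_); }
    bool operator==(Quantity o) const { return value_ == o.value_; }
    bool operator<(Quantity o) const { return value_ < o.value_; }
private:
    int value_;
};

struct SecondsTag {};
struct MillisecondsTag {};
typedef Quantity<SecondsTag> Seconds;
typedef Quantity<MillisecondsTag> Milliseconds;

// Conversions must be spelled out, so they cannot happen by accident.
inline Milliseconds to_milliseconds(Seconds s) { return Milliseconds(s.count() * 1000); }
inline Seconds to_seconds(Milliseconds ms) { return Seconds(ms.count() / 1000); }

static_assert(!std::is_convertible<Milliseconds, Seconds>::value,
              "milliseconds do not silently become seconds");
static_assert(!std::is_assignable<Seconds&, Milliseconds>::value,
              "milliseconds cannot be assigned to seconds");
static_assert(!std::is_convertible<int, Seconds>::value,
              "a bare int does not silently become seconds");

inline void example_usage() {
    Seconds timeout(2);
    Milliseconds total = to_milliseconds(timeout) + Milliseconds(500);
    assert(total.count() == 2500);
    assert(to_seconds(total) == Seconds(2));
}
```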

In the context of concepts, having 'int' as a class that can be subclassed, would allow us to declare template parameters that map to 'int' concepts (for example) without much effort.

making all datatypes into classes would only be useful if you then make their operators into virtual member functions instead of standalone functions

C++ classes have vtables only if they have at least a virtual method. It is not required for a class to have a vtable.
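
This can be checked with a standard trait. In the sketch below, PlainInt and VirtualInt are made-up example types; std::is_polymorphic reports whether a class declares or inherits a virtual function, which is what makes an implementation add a vtable pointer.

```cpp
#include <cassert>
#include <type_traits>

// No virtual functions: not polymorphic, so no vtable is needed.
struct PlainInt {
    int value;
    int get() const { return value; }
};

// At least one virtual function: polymorphic.
struct VirtualInt {
    int value;
    virtual int get() const { return value; }
    virtual ~VirtualInt() {}
};

static_assert(!std::is_polymorphic<PlainInt>::value,
              "a class without virtual functions is not polymorphic");
static_assert(std::is_polymorphic<VirtualInt>::value,
              "one virtual function makes the class polymorphic");
static_assert(std::is_standard_layout<PlainInt>::value,
              "the plain wrapper keeps a C-compatible layout");

inline void example_usage() {
    PlainInt p = {7};
    assert(p.get() == 7);
}
```

On typical implementations sizeof(PlainInt) equals sizeof(int) while VirtualInt is larger, though the standard does not promise exact sizes.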

We're talking about 20-odd

We're talking about 20-odd lines including a macro or two in a 120,000 line project. Sounds to me like you did indeed have the time to do it, chose against and got bitten.

It was not the only data type. There were lots more.

It was not the only data type. There were lots more numeric datatypes based on primitives. What I mentioned was just an example.

So wrap the primitives once

So wrap the primitives once each and work from there with the tools you yourself were advocating as solutions.

hmmm

So the point is not to extend the int behavior, but to pick up all the existing behaviors without having to reinvent the wheel, like one does with template base-delegation (for example). It wouldn't have occurred to me to use a built-in type in such a manner, but I can see how that would be immensely useful.

Use enums

If we could just inherit from 'int', we would have created two classes, specifying the conversion from/to seconds to milliseconds.

Two words: Use enums.

More words: Values of integral types need to be explicitly converted to an enumeration type. Conversion from an enumeration to an integral type is implicit. The syntax required to introduce a new enumeration is relatively light.

I can not use enums to express Seconds/Milliseconds.

I can not use enums to express Seconds/Milliseconds. Enum variables take only the values specified within the enum block, and operations like addition/subtraction/multiplication/division etc are not allowed on them. Plus enums are 32-bit ints, whereas some of the types were not 32 bits.

And if using enums somehow solved the problem, then they can not be used in the context of concepts.

About enums

Sorry, I don't want to waste my time quoting passages from the C++ standard or TCPL, but...

Enum variables take only the values specified within the enum block

That is incorrect. The number of bits an enumeration can hold depends on the maximum and minimum values of its enumerators.

operations like addition/subtraction/multiplication/division etc are not allowed on them

Conversion from an enumeration type to an integral type is implicit. Thus you can easily perform those operations on values of an enumeration type and the result will be a value of an integral type. However, you need to explicitly cast the result to an enumeration type.

enums are 32-bit ints

Maybe in some specific C++ implementation, but, in general, that is implementation and enumeration type dependent. But, if the type needs to be some specific number of bits wide (for binary compatibility) then enumeration types can't help.

And if using enums somehow solved the problem, then they can not be used in the context of concepts.

Enums allow you to effectively define new integral types that are distinct from all other types. Thus you can have an integer type for Seconds and another type for Milliseconds. This means that a value of Seconds can not be implicitly assigned to a variable of Milliseconds and vice versa.
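
The conversion rules described here can be verified with type traits. The sketch below is an illustration under one stated assumption: it uses the C++11 fixed underlying type (`enum Seconds : int`), a feature that did not exist when this thread was written, so that every int value is a valid value of the enumeration. The names Seconds, Milliseconds and to_milliseconds are invented for the example.

```cpp
#include <cassert>
#include <type_traits>

// Two distinct types that share the representation of int.
enum Seconds : int {};
enum Milliseconds : int {};

static_assert(!std::is_convertible<int, Seconds>::value,
              "int to enum needs an explicit cast");
static_assert(!std::is_convertible<Milliseconds, Seconds>::value,
              "one enum does not implicitly convert to another");
static_assert(std::is_convertible<Seconds, int>::value,
              "enum to int is implicit");

// The arithmetic happens on int; the result is cast back explicitly.
inline Milliseconds to_milliseconds(Seconds s) {
    return static_cast<Milliseconds>(s * 1000);
}

inline void example_usage() {
    Seconds s = static_cast<Seconds>(2);
    Milliseconds ms = to_milliseconds(s);
    assert(ms == 2000);
    // s = ms;  // does not compile: Milliseconds is not convertible to Seconds
}
```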

I have no idea why "the context of concepts" has anything to do with this.

I do not know where you got that.

Here is a small C++ program to prove otherwise:

#include "stdafx.h"

enum ENUM {
    VALUE_FIRST = 0,
    VALUE_LAST = 255
};

int main(int argc, char* argv[])
{
    ENUM v;    
    v = 15;
    return 0;
}

And here is what MSVC++ 2005 compiler says (MSVC++ 2005 has 99% conformance to the current ISO C++ standard):

------ Build started: Project: test_enums, Configuration: Debug Win32 ------
Compiling...
test_enums.cpp
c:\temp\test_enums\test_enums\test_enums.cpp(14) : error C2440: '=' : cannot convert from 'int' to 'ENUM'
        Conversion to enumeration type requires an explicit cast (static_cast, C-style cast or function-style cast)
Build log was saved at "file://c:\temp\test_enums\test_enums\Debug\BuildLog.htm"
test_enums - 1 error(s), 0 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

The number of bits an enumeration can hold depends on the maximum and minimum values of its enumerators.

If I modify the program to print the size of the enum, I get '4'.

Then, if I do this:

enum ENUM {
    VALUE_FIRST = 0,
    VALUE_LAST = _I64_MAX
};

I get this from the compiler:

c:\temp\test_enums\test_enums\test_enums.cpp(9) : warning C4341: 'VALUE_LAST' : signed value is out of range for enum constant
c:\temp\test_enums\test_enums\test_enums.cpp(9) : warning C4309: 'initializing' : truncation of constant value

However, you need to explicitly cast the result to an enumeration type.

Effectively this disallows using enums instead of integers, unless one is willing to cast everything to the enum, which is not a good thing to have to do, for obvious reasons.

But, if the type needs to be some specific number of bits wide (for binary compatibility) then enumeration types can't help.

I have not yet seen a platform where an enum is not 32 bits. Have you?

Enums allow you to effectively define new integral types that are distinct from all other types. Thus you can have an integer type for Seconds and another type for Milliseconds. This means that a value of Seconds can not be implicitly assigned to a variable of Milliseconds and vice versa.

No, enums are not for 'new integral types'. Enums are homogeneous union types that are subtypes of scalars and that are automatically enumerated by the compiler.

I have no idea why "the context of concepts" has anything to do with this.

The discussion is about C++ concepts. True subtypes of integral types would allow for using them as concepts without the need to declare new classes.

Conversion to enumeration

Conversion to enumeration type requires an explicit cast

Yes, that's what I've been saying all the time. The point is that requiring explicit casts eliminates the possibility of implicitly assigning Seconds to Milliseconds or vice versa.

signed value is out of range for enum constant

A limitation of a particular compiler.

I have not yet seen a platform where an enum is not 32 bits. Have you?

Yes, I have, for example, seen compilers that use 16-bit enums. The C++ standard also allows an enumeration type to have more or fewer bits than the type int has. Unfortunately, many C++ implementations don't do that, because their legacy ABI doesn't allow it.
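
For what it is worth, later revisions of the language let the programmer choose the width instead of leaving it to the implementation. The sketch below assumes C++11 fixed underlying types, which postdate this thread; Flag and Wide are made-up names.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// The underlying type is stated explicitly, so the size is no longer
// an implementation choice.
enum Flag : unsigned char { Off, On };
enum Wide : long long { Big = LLONG_MAX };

static_assert(sizeof(Flag) == 1, "same size as unsigned char");
static_assert(sizeof(Wide) == sizeof(long long), "same size as long long");
static_assert(std::is_same<std::underlying_type<Flag>::type, unsigned char>::value,
              "the underlying type is the one requested");

inline void example_usage() {
    // The 64-bit enumerator is stored without truncation.
    assert(static_cast<long long>(Big) == LLONG_MAX);
    assert(On == 1);
}
```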

unless one is willing to cast everything to the enum

You can introduce overloaded operators for your enumeration type to avoid casting. (C++ 101)
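
A minimal sketch of that technique follows. It again assumes a C++11 fixed underlying type so that all int values are representable; the operator set shown is an arbitrary selection for illustration, not a complete arithmetic interface.

```cpp
#include <cassert>
#include <type_traits>

enum Seconds : int {};

// Overloads that keep the result in the enumeration type, so callers
// do not have to cast after every operation.
inline Seconds operator+(Seconds a, Seconds b) {
    return static_cast<Seconds>(static_cast<int>(a) + static_cast<int>(b));
}

inline Seconds& operator+=(Seconds& a, Seconds b) {
    a = a + b;  // uses the overload above
    return a;
}

inline Seconds operator*(Seconds a, int factor) {
    return static_cast<Seconds>(static_cast<int>(a) * factor);
}

static_assert(std::is_same<decltype(Seconds() + Seconds()), Seconds>::value,
              "the sum stays a Seconds instead of decaying to int");

inline void example_usage() {
    Seconds total = static_cast<Seconds>(30);
    total += static_cast<Seconds>(15);
    assert(total == 45);
    assert(total * 2 == 90);
}
```

Operators that are not overloaded still fall back to the built-in int versions, so `total - total` would have type int and need a cast.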

No, enums are not for 'new integral types'.

Sigh. What I'm saying here is that you can (mis)use enumeration types to introduce new integer-like types. The benefit, when compared to other solutions, is that it is syntactically light; only a single enum declaration is initially needed (and overloaded operators can be introduced later if explicit conversions are very common). I've used enumeration types to help distinguish between different kinds of integers in C++ many times. Although it isn't a panacea, it helps to eliminate bugs like the one you encountered.

Enums in C++ (Final Draft) Standard

(MSVC++ 2005 has 99% conformance to the current ISO C++ standard)

Really? Where did you get that number? I would bet that the C++ standard isn't even close to 99% conforming with itself.

boost:date_time

Achilleas Margaritis: Let me give you a real example: in one of the projects at work, we were using 'int' to express seconds and milliseconds. Since even 'typedef' does not really introduce a new type, at some obscure point in the code (we are talking about 120,000 lines of code), milliseconds were assigned to seconds, creating havoc and a very difficult to trace bug. If we could just inherit from 'int', we would have created two classes, specifying the conversion from/to seconds to milliseconds. Of course we could do that with existing C++, but we did not have time to go into typing all the code needed for a class to emulate int behaviour. If we could subclass 'int', then we would have to type much less code.

Sounds to me like you somehow overlooked Boost.Date_Time. Boost also has some approaches to "Concepts" that are geared towards existing compilers, so it's always worth taking a look at to see if it does what you need, like Date_Time, or whether you can build on some of its frameworks, e.g. its various Traits classes, to build what you need.

Your link leads to this site.

Boost was not an option in 1999 when the project was started. And it would not be an option today, because the customer does not want a multi-MB download that is extra to Visual Studio, nor does the licence permit it anyway.

Makes No Sense

Achilleas Margaritis: Boost was not an option in 1999 when the project was started. And it would not be an option today, because the customer does not want a multi-MB download that is extra to Visual Studio, nor does the licence permit it anyway.

I confess that I don't understand: I'm reasonably confident that Boost existed in 1999 (although I don't know whether Boost.Date_Time was present then even if Boost in some form was). Prior to the use of the Boost Software License, most Boost libraries used one or another of the non-viral Open Source licenses, albeit with variation as to whether an acknowledgement needed to be in the documentation etc. And Boost isn't monolithic; there'd be no need for a multi-megabyte download since Boost.Date_Time wouldn't have added multiple megabytes to your binary.

Or perhaps you mean that your product was distributed in source form, in which case you could just have included the then-current version of Boost on the CD.

Regardless, even if you don't use any of the code, Boost remains an excellent source of information as to how to use C++ features in popular (and some not-so-popular!) compilers to implement requirements like the one that you described, even if all you do is crib their approach, but do your own implementation.

Details:

Indeed, the contractor wants the executable and the source code. Back in 1999, they wanted the code on diskettes. The reason was that the target laptop computer did not have a CD drive. It was a military project, with a special laptop and a special version of Windows, and the contractor's programmers were familiar with Ada but not with C++, so that's the reason they only wanted the libraries Microsoft ships with Visual Studio.

alternative link