What new capabilities do user-defined literals add to C++?


At first sight, it seems to be simple syntactic sugar.

But when looking deeper, we see it's more than syntactic sugar, as it extends the C++ user's options to create user-defined types that behave exactly like distinct built-in types. In this way, this little "bonus" is a very interesting C++11 addition to C++.

Do we really need it in C++?

I have seen few uses for it in the code I wrote over the past years, but just because I didn't use it in C++ doesn't mean it's not interesting for another C++ developer.

In C++ (and in C, I guess), we already had compiler-defined literals: suffixes and prefixes to type integer numbers as short or long integers, real numbers as float or double (or even long double), and character strings as normal or wide chars.

In C++, we had the possibility to create our own types (i.e. classes), with potentially no overhead (inlining, etc.). We had the possibility to add operators to our types, to have them behave like similar built-in types, which enables C++ developers to use matrices and complex numbers as naturally as they would have had these been added to the language itself. We can even add cast operators (which is usually a bad idea, but sometimes it's just the right solution).

We were still missing one thing to make user-defined types behave like built-in types: user-defined literals.

So, I guess it's a natural evolution for the language, to be as complete as possible: "If you want to create a type, and you want it to behave as much as possible like a built-in type, here are the tools..."

I'd guess it's very similar to .NET's decision to make every primitive a struct, including booleans, integers, etc., and have all structs derive from Object. This decision alone puts .NET far beyond Java's reach when working with primitives, no matter how many boxing/unboxing hacks Java adds to its specification.

Do YOU really need it in C++?

This question is for YOU to answer. Not Bjarne Stroustrup. Not Herb Sutter. Not any other member of the C++ standard committee. This is why you have the choice in C++, and why they won't restrict a useful notation to built-in types alone.

If you need it, then it is a welcome addition. If you don't, well... Don't use it. It will cost you nothing.

Welcome to C++, the language where features are optional.

Bloated??? Show me your complexes!!!

There is a difference between bloated and complex (pun intended).

As shown by Niels at What new capabilities do user-defined literals add to C++?, being able to write a complex number is one of the two features added "recently" to C and C++:

// C89:
MyComplex z1 = { 1, 2 } ;

// C99: You'll note I is a macro, which can lead
// to very interesting situations...
double complex z1 = 1 + 2*I;

// C++:
std::complex<double> z1(1, 2) ;

// C++11: You'll note that "i" won't ever bother
// you elsewhere
std::complex<double> z1 = 1 + 2_i ;
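
For what it's worth, here is a hedged sketch of how that _i suffix might be written. It is an assumption for illustration, not the standard suffix (C++14 later added a plain i in std::literals::complex_literals):

#include <complex>

// One possible definition of the _i suffix used above.
std::complex<double> operator""_i(long double d)
{ return std::complex<double>(0.0, static_cast<double>(d)); }

std::complex<double> operator""_i(unsigned long long d)   // for integer forms such as 2_i
{ return std::complex<double>(0.0, static_cast<double>(d)); }

// Usage: std::complex<double> z1 = 1.0 + 2_i;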

Now, both C99 "double complex" type and C++ "std::complex" type are able to be multiplied, added, subtracted, etc., using operator overloading.

But in C99, they just added another type as a built-in type, and built-in operator overloading support. And they added another built-in literal feature.

In C++, they just used existing features of the language, saw that the literal feature was a natural evolution of the language, and thus added it.

In C, if you need the same notation enhancement for another type, you're out of luck until your lobbying to add your quantum wave functions (or 3D points, or whatever basic type you're using in your field of work) to the C standard as a built-in type succeeds.

In C++11, you just can do it yourself:

Point p = 25_x + 13_y + 3_z ; // 3D point
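
A rough sketch of one way those suffixes could be defined (the Point type and its operator+ below are assumptions, not part of the original):

struct Point { double x, y, z; };

constexpr Point operator+(Point a, Point b)
{ return Point{ a.x + b.x, a.y + b.y, a.z + b.z }; }

// Each suffix produces a point along a single axis,
// so summing the three literals builds the full 3D point.
constexpr Point operator""_x(unsigned long long v) { return Point{ double(v), 0.0, 0.0 }; }
constexpr Point operator""_y(unsigned long long v) { return Point{ 0.0, double(v), 0.0 }; }
constexpr Point operator""_z(unsigned long long v) { return Point{ 0.0, 0.0, double(v) }; }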

Is it bloated? No, the need is there, as shown by how both C and C++ complexes need a way to represent their literal complex values.

Is it wrongly designed? No, it's designed as every other C++ feature, with extensibility in mind.

Is it for notation purposes only? No, as it can even add type safety to your code.

For example, let's imagine a CSS oriented code:

css::Font::Size p0 = 12_pt ;       // Ok
css::Font::Size p1 = 50_percent ;  // Ok
css::Font::Size p2 = 15_px ;       // Ok
css::Font::Size p3 = 10_em ;       // Ok
css::Font::Size p4 = 15 ;         // ERROR : Won't compile !

It then becomes very easy to enforce strong typing on the assignment of values.
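
A minimal sketch of how those suffixes could be wired up, assuming a hand-written css::Font::Size type (the names mirror the example above and are not an existing library):

namespace css { namespace Font {
    enum class Unit { Pt, Percent, Px, Em };
    struct Size { double value; Unit unit; };   // hypothetical strong type
} }

// Only these suffixes can produce a css::Font::Size,
// which is why a bare 15 does not compile above.
constexpr css::Font::Size operator""_pt(unsigned long long v)
{ return { static_cast<double>(v), css::Font::Unit::Pt }; }

constexpr css::Font::Size operator""_px(unsigned long long v)
{ return { static_cast<double>(v), css::Font::Unit::Px }; }

// _percent and _em would follow the same pattern.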

Is it dangerous?

Good question. Can these functions be namespaced? If yes, then Jackpot!

Anyway, like everything, you can kill yourself if a tool is used improperly. C is powerful, and you can shoot your head off if you misuse the C gun. C++ has the C gun, but also the scalpel, the taser, and whatever other tool you'll find in the toolkit. You can misuse the scalpel and bleed yourself to death. Or you can build very elegant and robust code.

So, like every C++ feature, do you really need it? That is the question you must answer before using it. If you don't, it will cost you nothing. But if you really do need it, at least the language won't let you down.

The date example?

Your error, it seems to me, is that you are mixing operators:

1974/01/06AD
    ^  ^  ^

This can't be avoided, because / being an operator, the compiler must interpret it. And, AFAIK, it is a good thing.

To find a solution for your problem, I would write the literal in some other way. For example:

"1974-01-06"_AD ;   // ISO-like notation
"06/01/1974"_AD ;   // french-date-like notation
"jan 06 1974"_AD ;  // US-date-like notation
19740106_AD ;       // integer-date-like notation

Personally, I would choose the integer and the ISO dates, but it depends on YOUR needs. Which is the whole point of letting users define their own literal names.
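
As a hedged sketch of the integer form, with a made-up Date aggregate:

struct Date { int year, month, day; };

// Splits a YYYYMMDD literal into its components.
constexpr Date operator""_AD(unsigned long long yyyymmdd)
{
    return Date{ static_cast<int>(yyyymmdd / 10000),
                 static_cast<int>(yyyymmdd / 100 % 100),
                 static_cast<int>(yyyymmdd % 100) };
}

// Usage: constexpr Date d = 19740106_AD;   // { 1974, 1, 6 }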


Here's a case where there is an advantage to using user-defined literals instead of a constructor call:

#include <bitset>
#include <iostream>

template<char... Bits>
  struct checkbits
  {
    static const bool valid = false;
  };

template<char High, char... Bits>
  struct checkbits<High, Bits...>
  {
    static const bool valid = (High == '0' || High == '1')
                   && checkbits<Bits...>::valid;
  };

template<char High>
  struct checkbits<High>
  {
    static const bool valid = (High == '0' || High == '1');
  };

template<char... Bits>
  inline constexpr std::bitset<sizeof...(Bits)>
  operator"" _bits() noexcept
  {
    static_assert(checkbits<Bits...>::valid, "invalid digit in binary string");
    // Build the characters in a local array; the compound-literal form
    // (char []){Bits..., '\0'} is a GNU extension, not standard C++.
    const char str[] = {Bits..., '\0'};
    return std::bitset<sizeof...(Bits)>(str);
  }

int
main()
{
  auto bits = 0101010101010101010101010101010101010101010101010101010101010101_bits;
  std::cout << bits << std::endl;
  std::cout << "size = " << bits.size() << std::endl;
  std::cout << "count = " << bits.count() << std::endl;
  std::cout << "value = " << bits.to_ullong() << std::endl;

  //  This triggers the static_assert at compile time.
  auto badbits = 2101010101010101010101010101010101010101010101010101010101010101_bits;

  //  This throws at run time.
  std::bitset<64> badbits2("2101010101010101010101010101010101010101010101010101010101010101_bits");
}

The advantage is that a run-time exception is converted to a compile-time error. You couldn't add the static assert to the bitset ctor taking a string (at least not without string template arguments).


It's very nice for mathematical code. Off the top of my head I can see uses for the following operators:

deg for degrees. That makes writing absolute angles much more intuitive.

#include <cmath>   // for M_PI (a common, but non-standard, macro)

double operator ""_deg(long double d)
{
    // converts degrees to radians
    return d * M_PI / 180;
}

It can also be used for various fixed point representations (which are still in use in the field of DSP and graphics).

int operator ""_fix(long double d)
{
    // returns d as a 1.15.16 fixed point number (16 fractional bits)
    return (int)(d * 65536.0);
}
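
For illustration, possible uses of the two suffixes above (the values in the comments follow from the conversions):

double right_angle = 90.0_deg;   // about 1.5708 radians
int    half        = 0.5_fix;    // 32768, i.e. 0.5 in 1.15.16 fixed point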

These look like nice examples of how to use it. They help make constants in code more readable. It's another tool that can make code unreadable as well, but we already have so many tools to abuse that one more doesn't hurt much.


UDLs are namespaced (and can be imported by using declarations/directives, but you cannot explicitly namespace a literal like 3.14std::i), which means there (hopefully) won't be a ton of clashes.
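
To illustrate the namespacing point (the units namespace and the _km suffix below are made up):

namespace units
{
    constexpr long double operator""_km(long double d) { return d * 1000.0L; }
}

int main()
{
    // The suffix must be imported before use; it cannot be qualified
    // at the call site (there is no 3.0units::_km).
    using namespace units;                    // or: using units::operator""_km;
    long double metres = 3.0_km;              // 3000.0
    (void)metres;
}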

The fact that they can actually be templated (and constexpr'd) means that you can do some pretty powerful stuff with UDLs. Bigint authors will be really happy, as they can finally have arbitrarily large constants, calculated at compile time (via constexpr or templates).

I'm just sad that we won't see a couple of useful literals in the standard (from the looks of it), like s for std::string and i for the imaginary unit. (As it turned out, C++14 later added both, under std::literals.)
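
For reference, a quick sketch of what those standard literals ended up looking like once C++14 added them:

#include <string>
#include <complex>
#include <chrono>
#include <iostream>

int main()
{
    using namespace std::literals;            // pulls in s, i, ms, etc. (C++14)
    auto name  = "hello"s;                    // std::string, not const char*
    auto z     = 1.0 + 2.0i;                  // std::complex<double>
    auto delay = 250ms;                       // std::chrono::milliseconds
    std::cout << name << ' ' << z.imag() << ' ' << delay.count() << '\n';
}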

The amount of coding time that will be saved by UDLs is actually not that high, but the readability will be vastly increased and more and more calculations can be shifted to compile-time for faster execution.


Let me add a little bit of context. For our work, user-defined literals are much needed. We work on MDE (Model-Driven Engineering). We want to define models and metamodels in C++. We actually implemented a mapping from Ecore to C++ (EMF4CPP).

The problem comes when defining model elements as classes in C++. We are taking the approach of transforming the metamodel (Ecore) into templates with arguments. The arguments of the template are the structural characteristics of types and classes. For example, a class with two int attributes would be something like:

typedef ::ecore::Class< Attribute<int>, Attribute<int> > MyClass;

However, it turns out that every element in a model or metamodel usually has a name. We would like to write:

typedef ::ecore::Class< "MyClass", Attribute< "x", int>, Attribute<"y", int> > MyClass;

BUT, neither C++ nor C++0x allows this, as strings are prohibited as arguments to templates. You can write the name char by char, but that is admittedly a mess. With proper user-defined literals, we could write something similar. Say we use "_n" to identify model element names (I'm not using the exact syntax, just giving the idea):

typedef ::ecore::Class< MyClass_n, Attribute< x_n, int>, Attribute<y_n, int> > MyClass;
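
For completeness, here is a hedged sketch of how something close to this can be approximated with the string-literal operator template that GCC and Clang offer as a non-standard extension (the Name helper and the _n suffix are made up for illustration):

template<char... Chars>
struct Name { };                              // a type that encodes a compile-time name

// String-literal operator template: a GCC/Clang extension, not standard C++.
template<typename CharT, CharT... Chars>
constexpr Name<Chars...> operator""_n() { return {}; }

// The wished-for declaration could then be approximated as:
// typedef ::ecore::Class< decltype("MyClass"_n),
//                         Attribute< decltype("x"_n), int >,
//                         Attribute< decltype("y"_n), int > > MyClass;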

Finally, having those definitions as templates helps us a lot when designing algorithms for traversing the model elements, model transformations, etc., that are really efficient, because type information, identification, transformations, etc. are determined by the compiler at compile time.


Bjarne Stroustrup talks about UDLs in this C++11 talk, in the first section on type-rich interfaces, around the 20 minute mark.

His basic argument for UDLs takes the form of a syllogism:

  1. "Trivial" types, i.e., built-in primitive types, can only catch trivial type errors. Interfaces with richer types allow the type system to catch more kinds of errors.

  2. The kinds of type errors that richly typed code can catch have impact on real code. (He gives the example of the Mars Climate Orbiter, which infamously failed due to a dimensions error in an important constant).

  3. In real code, units are rarely used. People don't use them, because incurring runtime compute or memory overhead to create rich types is too costly, and using pre-existing C++ templated unit code is so notationally ugly that no one uses it. (Empirically, no one uses it, even though the libraries have been around for a decade).

  4. Therefore, in order to get engineers to use units in real code, we needed a device that (1) incurs no runtime overhead and (2) is notationally acceptable.


Supporting compile-time dimension checking is the only justification required.

auto force = 2_N; 
auto dx = 2_m; 
auto energy = force * dx; 

assert(energy == 4_J); 
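
A minimal sketch of how such literals might be backed by a dimension-tagged type, so the checking happens entirely in the type system (all names here are assumptions; a real library such as the one linked below is far more complete):

template<int M, int L, int T>                 // mass, length, time exponents
struct Quantity
{
    double value;
    constexpr explicit Quantity(double v) : value(v) {}
};

template<int M, int L, int T>
constexpr Quantity<M, L, T> operator+(Quantity<M, L, T> a, Quantity<M, L, T> b)
{ return Quantity<M, L, T>(a.value + b.value); }

template<int M1, int L1, int T1, int M2, int L2, int T2>
constexpr Quantity<M1 + M2, L1 + L2, T1 + T2>
operator*(Quantity<M1, L1, T1> a, Quantity<M2, L2, T2> b)
{ return Quantity<M1 + M2, L1 + L2, T1 + T2>(a.value * b.value); }

template<int M, int L, int T>
constexpr bool operator==(Quantity<M, L, T> a, Quantity<M, L, T> b)
{ return a.value == b.value; }

using Force  = Quantity<1, 1, -2>;            // kg·m/s²
using Length = Quantity<0, 1,  0>;
using Energy = Quantity<1, 2, -2>;            // kg·m²/s²

constexpr Force  operator""_N(unsigned long long v) { return Force(static_cast<double>(v)); }
constexpr Length operator""_m(unsigned long long v) { return Length(static_cast<double>(v)); }
constexpr Energy operator""_J(unsigned long long v) { return Energy(static_cast<double>(v)); }

int main()
{
    constexpr auto force  = 2_N;
    constexpr auto dx     = 2_m;
    constexpr auto energy = force * dx;       // Quantity<1,2,-2>, i.e. an Energy
    static_assert(energy == 4_J, "dimensions and value both check out");
    // force + energy;                         // would not compile: mismatched dimensions
}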

See for example PhysUnits-CT-Cpp11, a small C++11/C++14 header-only library for compile-time dimensional analysis and unit/quantity manipulation and conversion. It is simpler than Boost.Units, supports unit symbol literals such as m, g, s and metric prefixes such as m, k, M, depends only on the standard C++ library, is SI-only, and handles integral powers of dimensions.