Down with template (or not)!

If you are one of the few people in the world who, like me, actively follow the progress of C++ standardization, you know that the C++ standard changes significantly from one version to the next. The proposals often even have funny names like “I Stream, You Stream, We All Stream for istream_iterator”, “More trailing commas”, “std::array is a wrapper for an array!”, or “Down with typename!”.

When I first saw the paper “Down with typename!”, I thought that made a lot of sense. This change in C++20 allows you to write static_cast<std::vector<T>::value_type>(...) instead of static_cast<typename std::vector<T>::value_type>(...). Nice!

Recently I stumbled upon a similar weird-looking and unnecessary keyword in our code base:

template <typename T, typename F>
struct JoinFilterHasher {
   template <typename S>
   static uint64_t hashDict(const byte* data, unsigned index, const byte* arg) {
      const T* dict = unalignedLoad<const T*>(arg);
      const S pos = *reinterpret_cast<const S*>(data + index * sizeof(S));
      return F::template hash<type>(dict + pos, nullptr);
      //        ^^^^^^^^ what is this???
   }
};

When I saw this, I immediately thought that maybe the time had come for a new paper titled “Down with template!”. Unfortunately, after I fell into the rabbit hole of better understanding the template keyword, I understood why removing it isn’t (easily) possible.

However, I found an absolutely crazy workaround. Don’t try this at home, kids! Read on and follow me through this journey of C++ template madness!

The template Keyword

The template keyword itself is not something that will surprise any C++ programmer. Every time you define a template, you’ll have to use the template keyword. Pretty obvious.

However, there are also cases where you’ll have to tell the compiler whether a name refers to a template by prefixing the name with the template keyword. Usually, you don’t do that and just write std::vector<int> but if you really wanted, you could also write std::template vector<int>!

Why would you ever need that? To explain this, let’s look at this simplified example:

struct Foo {
    template <int I>
    static int myfunc(int a) { return a + I; }
};

template <typename T>
int genericFoo(int a) {
    return T::myfunc<42>(a);
}

int test() {
    return genericFoo<Foo>(1);
}

This code snippet does not compile! The reason is that in genericFoo the type T is generic, i.e., T is only known when you instantiate genericFoo with a specific type. So, the compiler can’t know in advance whether T::myfunc ends up referring to a member variable, a member function, a nested type specification, or a template (in C++ standardese this is called a “dependent name”). But it needs to know what T::myfunc is in order to correctly parse the body of genericFoo. So, you have to use the template keyword to make this unambiguous.

Any good compiler will tell you that you should write T::template myfunc to fix the compilation. If the compiler already knows, why can’t we just get rid of the template keyword for this altogether?
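For reference, here is a minimal sketch of the fixed version that the compiler suggests. The only functional change to the example is the added template keyword; I also marked the functions constexpr (my own addition, not part of the original example) so that the result can be checked at compile time.

```cpp
// Foo as in the example above; constexpr is an assumption added only
// so the result can be verified with static_assert.
struct Foo {
    template <int I>
    static constexpr int myfunc(int a) { return a + I; }
};

template <typename T>
constexpr int genericFoo(int a) {
    // `template` tells the parser that the `<` after the dependent name
    // opens a template argument list instead of a less-than comparison.
    return T::template myfunc<42>(a);
}

constexpr int test() {
    return genericFoo<Foo>(1); // calls Foo::myfunc<42>(1), i.e. 1 + 42
}

static_assert(test() == 43, "genericFoo<Foo>(1) should add 42 to its argument");
```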

Down with template?

In order to understand why we need the template keyword, let’s take a closer look at the compiler output:

<source>: In function 'int genericFoo(int)':
<source>:8:15: warning: expected 'template' keyword before dependent template name [-Wmissing-template-keyword]
    8 |     return T::myfunc<42>(a);
      |               ^~~~~~
      |               template

Ok, as I said, the compiler knows exactly that we need to add the template keyword. But wait a minute: This is just a compiler warning and not an error. Huh? If it’s just a warning, why doesn’t this code compile anyway? Let’s continue reading the compiler messages:

<source>:8:21: error: invalid operands of types '<unresolved overloaded function type>' and 'int' to binary 'operator<'
    8 |     return T::myfunc<42>(a);
      |               ~~~~~~^~~

Here we found the actual compilation error. But it doesn’t say anything about templates at all. Instead, it complains about invalid operands to a binary operator<, one of which is an “unresolved overloaded function type”. What is going on here?!

It turns out, if you don’t prefix such a dependent name with the template keyword, the compiler assumes that it refers to a value (such as a variable) and not to a template. So, what the compiler actually sees is this, broken down by tokens:

return // return keyword
T      // name referring to a type
::     // scope resolution operator

// !!! Here's the interesting part !!!
myfunc // name referring to a variable
<      // operator<
42     // integer literal
>      // operator>
// !!! end of interesting part !!!

(      // opening parenthesis (not a function call!)
a      // name referring to a variable
)      // closing parenthesis
;      // semicolon

So, the compiler doesn’t see any template at all, just the variables myfunc and a and the constant 42 that are being compared using < and >. This is an equivalent statement, just formatted and parenthesized differently:

return (T::myfunc < 42) > a;

Since the compiler can’t know whether you just wanted a weirdly formatted comparison or an actual template, you need to specify that you’re actually dealing with a template using the template keyword.
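To see that the comparison reading is perfectly legitimate, here is a small sketch with a hypothetical type Bar (not part of our code base) whose myfunc is a plain static variable. I wrote the expression in its explicitly parenthesized form so that it compiles without warnings; it is the same expression the parser sees for T::myfunc<42>(a) when the template keyword is missing.

```cpp
// Hypothetical type for illustration: myfunc is a static data member,
// not a member function template.
struct Bar {
    static constexpr int myfunc = 7;
};

template <typename T>
constexpr int genericCompare(int a) {
    // First compares T::myfunc with 42 (yielding a bool), then compares
    // that bool (promoted to 0 or 1) with a.
    return (T::myfunc < 42) > a;
}

// 7 < 42 is true, which is promoted to 1 for the second comparison.
static_assert(genericCompare<Bar>(0) == 1, "1 > 0 is true");
static_assert(genericCompare<Bar>(1) == 0, "1 > 1 is false");
```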

Case closed. Or, is it?

Template Madness

We know that syntactically the code without the template keyword is correct. It just parses it as comparison operators instead of template arguments. So, can we somehow find a way to make this compile without changing the genericFoo function or the Foo class?

The answer is: Yes, but only if you’re willing to give up your sanity, sacrifice all your RAM to your compiler, and generally don’t care about runtime performance at all. Sounds good?

To solve this, we need to write code that can handle being compared using < and > and somehow translates this again into an application of template arguments.

Since we don’t want to change Foo, we’ll create a new class called DownWithTemplate that will contain our evil hacks. Our goal is the following: Write DownWithTemplate so that calling genericFoo<DownWithTemplate>(1) is equivalent to a fixed version of genericFoo<Foo>(1) where we add the template keyword as suggested by the compiler.

Operator Overloading

Let’s start with the easy part: overloading operator< and operator>. We know that genericFoo wants to access a value called myfunc, so myfunc will be a static member variable whose type overloads operator<:

struct DownWithTemplate {
    /* ... */
    struct MyFunc {
        OtherHalf operator<(int i) const {
            return OtherHalf{i};
        }
    };
    static constexpr MyFunc myfunc{};
};

The value that operator< returns needs to implement the other half of our hack, namely it needs to overload operator>:

struct DownWithTemplate {
    struct OtherHalf {
        int value;
        int operator>(int i) const {
            return /* ??? */;
        }
    };
    /* ... */
};

In OtherHalf::operator> we now have the two values of our expression: value contains the “template argument” (42 in our implementation of genericFoo) and i gets the value of a in genericFoo.
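As a sanity check that the operator chain really delivers both numbers, here is a sketch with a stand-in class I called OperatorSketch (a hypothetical name). Its operator> simply adds the two values at runtime, which is a placeholder and not yet a call to Foo::myfunc. The comparison in genericFoo is written with explicit parentheses, matching how the parser reads T::myfunc<42>(a) without the template keyword.

```cpp
// Hypothetical stand-in for DownWithTemplate: it only demonstrates that
// both operands arrive in operator>.
struct OperatorSketch {
    struct OtherHalf {
        int value; // the would-be template argument (42)
        constexpr int operator>(int i) const {
            // Placeholder: plain runtime addition instead of the real
            // Foo::myfunc<value>(i) call.
            return i + value;
        }
    };
    struct MyFunc {
        // Captures the right-hand side of `<` for the next step.
        constexpr OtherHalf operator<(int i) const { return OtherHalf{i}; }
    };
    static constexpr MyFunc myfunc{};
};

template <typename T>
constexpr int genericFoo(int a) {
    return (T::myfunc < 42) > a;
}

static_assert(genericFoo<OperatorSketch>(1) == 43, "42 and 1 both reach operator>");
```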

Now, how do we call Foo::myfunc? Ideally we would like to just write Foo::myfunc<value>(i). If we do that, we’ll get the following compiler error:

<source>:8:16: note: template argument deduction/substitution failed:
<source>:27:32: error: '*(const DownWithTemplate::OtherHalf*)this' is not a constant expression
   27 |             return Foo::myfunc<value>(i);
      |                                ^~~~~

Since value is not a compile time constant, we can’t pass it as a template argument which must be known at compile time.

Runtime Template Arguments (or: Inner Circle of C++ Template Hell)

How do we bridge the gap between a value only known at runtime and a template that needs to know the value at compile time? There is no easy fix. Conceptually, a template must know its template arguments at compile time but a runtime value obviously is only known at runtime.

So, if we don’t know the template argument, we’ll just have to select the correct template at runtime. For this to work, we’ll have to generate all possible template instantiations with all possible values as template arguments.

This is what we want to implement (I hope you never want to do this):

switch (value) {
    case 0: return Foo::myfunc<0>(i);
    case 1: return Foo::myfunc<1>(i);
    /* ... */
    case 2147483647: return Foo::myfunc<2147483647>(i);
    case -1: return Foo::myfunc<-1>(i);
    case -2: return Foo::myfunc<-2>(i);
    /* ... */
    case -2147483648: return Foo::myfunc<-2147483648>(i);
}

Obviously, we don’t want to write this code manually. So, let’s see what the tool box of C++ templates gives us to help us out here: We’ll use std::integer_sequence to generate all possible integer values at compile time and we’ll use something called “Fold expressions” that work on “Parameter packs” to generate the code for all cases automatically.

If you’ve never heard of these three C++ features, that’s probably a good sign. Your colleagues will be thankful to never have to review code using them!

Anyway, there’s no easy way to prepare you for this, so I’ll just show you the code in all its C++ template hack ugliness and explain afterwards:

struct DownWithTemplate {
    template <int... Is>
    static int callMyFunc(std::integer_sequence<int, Is...>, int a, int b) {
        return (((Is == a) ? Foo::myfunc<Is>(b) : 0) + ...) +
            (((-Is-1 == a) ? Foo::myfunc<-Is-1>(b) : 0) + ...);
    }
    struct OtherHalf {
        int value;
        int operator>(int i) const {
            return callMyFunc(std::make_integer_sequence<int, std::numeric_limits<int>::max()>{}, value, i);
        }
    };
    /* ... */
};

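First, we create a sequence of non-negative integers using std::make_integer_sequence. There is no equivalent template that gives you negative integers, so we’ll just go through the same range twice and map each value Is to -Is-1 the second time. Note that std::make_integer_sequence<int, N> produces the values 0 to N-1, so with N set to the maximum int the two extreme values (the maximum and the minimum int) are not covered, which doesn’t matter for our example.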
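If you want to check which values the two passes enumerate, here is a small sketch using a tiny range of 4 instead of the full int range. The helper covered is a hypothetical name I made up for this illustration.

```cpp
#include <type_traits>
#include <utility>

// std::make_integer_sequence<int, N> expands to 0, 1, ..., N-1.
static_assert(std::is_same_v<std::make_integer_sequence<int, 4>,
                             std::integer_sequence<int, 0, 1, 2, 3>>,
              "the sequence starts at 0 and stops at N-1");

// Hypothetical helper: true if x is reached by either pass, i.e. by Is
// itself or by the mapped value -Is-1.
template <int... Is>
constexpr bool covered(std::integer_sequence<int, Is...>, int x) {
    return ((Is == x) || ...) || ((-Is - 1 == x) || ...);
}

constexpr auto seq = std::make_integer_sequence<int, 4>{};
static_assert(covered(seq, 3) && !covered(seq, 4), "first pass covers 0 to 3");
static_assert(covered(seq, -4) && !covered(seq, -5), "second pass covers -1 to -4");
```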

Unfortunately, fold expressions are just that: expressions, not statements. So, we can’t write a real switch statement. What we’ll do instead is to write a long list of additions like this:

((value == 0) ? Foo::myfunc<0>(i) : 0) +
((value == 1) ? Foo::myfunc<1>(i) : 0) +
/* ... */

So, we add zero for all “cases” that don’t match and only call Foo::myfunc for the correct value. The exact syntax of fold expressions is very weird so you’ll just have to trust me that the code above is equivalent to this sum.
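If you’d rather not trust me, here is a sketch of the same dispatch restricted to a tiny range, so the compiler only has to instantiate 16 functions. The wrapper dispatchSmall and the range of 8 are assumptions I picked for illustration, and the functions are marked constexpr only so the results can be checked at compile time.

```cpp
#include <utility>

struct Foo {
    template <int I>
    static constexpr int myfunc(int a) { return a + I; }
};

// Same fold expressions as in callMyFunc above: exactly one term of the
// two sums calls Foo::myfunc, all other terms contribute 0.
template <int... Is>
constexpr int callMyFunc(std::integer_sequence<int, Is...>, int a, int b) {
    return (((Is == a) ? Foo::myfunc<Is>(b) : 0) + ...) +
           (((-Is - 1 == a) ? Foo::myfunc<-Is - 1>(b) : 0) + ...);
}

// Hypothetical wrapper: handles runtime values from -8 to 7 only.
constexpr int dispatchSmall(int value, int b) {
    return callMyFunc(std::make_integer_sequence<int, 8>{}, value, b);
}

static_assert(dispatchSmall(3, 10) == 13, "selects Foo::myfunc<3>");
static_assert(dispatchSmall(-2, 10) == 8, "selects Foo::myfunc<-2>");
static_assert(dispatchSmall(8, 10) == 0, "no case matches outside the range");
```

Note that a value outside the generated range silently yields 0, because every term of the sum falls back to 0.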

The Fallout

Let us roughly estimate how much work the compiler will have to do: We have 2^32 different possible templates. Each template instantiation contains at least a new Foo::myfunc expression. In Clang, a C++ function template instantiation (the class FunctionTemplateSpecializationInfo) uses at least 32 bytes of memory. So, at a minimum, the compiler would need 2^32*32 bytes = 128 GiB of memory to compile our code!
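The arithmetic can be double-checked at compile time with a few constants. The names below are my own, and the 32 bytes per instantiation is the lower bound quoted above, not a measured value.

```cpp
#include <cstdint>

// One instantiation per possible 32-bit value.
constexpr std::uint64_t instantiations = std::uint64_t{1} << 32;
// Assumed lower bound per function template instantiation, as quoted above.
constexpr std::uint64_t bytesPerInstantiation = 32;
constexpr std::uint64_t totalBytes = instantiations * bytesPerInstantiation;
constexpr std::uint64_t GiB = std::uint64_t{1} << 30;

static_assert(totalBytes / GiB == 128, "2^32 * 32 bytes is 128 GiB");
```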

You can quickly confirm this by trying to compile this program:

clang++ -c -std=c++23 -O0 ./down_with_template.cpp

On my machine, this quickly leads to furious swapping of memory and eventually the OOM killer killing the compiler process (and a bunch of other processes as well). Don’t try this at home!

If you try to compile the same example with a 16-bit integer, Clang will not eat all your RAM but it will refuse to compile the program citing a “maximum nesting level for fold expressions”.

I tried compiling it with gcc on our largest machine as well: After consuming over 300 GiB of RAM, the OOM killer also got to gcc.

For now, our code only compiles if you are using 8-bit integers. You can find the full code at the bottom.

Conclusion

If you see the template keyword in an unexpected location in C++, you now know that it’s there to disambiguate between template arguments and comparison operators!

Still, C++ allows you to employ evil template hacks to work around this. If you are willing to sacrifice all of your sanity and RAM, you’ll be able to get rid of the template keyword!

Appendix

Full Code

down_with_template.cpp

#include <limits>
#include <utility>

// DANGER: changing this to any larger type will make the compiler eat all of your RAM!
using Int = signed char;

struct Foo {
    template <Int I>
    static Int myfunc(Int a) { return a + I; }
};

template <typename T>
Int genericFoo(Int a) {
    return T::myfunc<42>(a);
}

struct DownWithTemplate {
    template <Int... Is>
    static Int callMyFunc(std::integer_sequence<Int, Is...>, Int a, Int b) {
        return (((Is == a) ? Foo::myfunc<Is>(b) : 0) + ...) +
            (((-Is-1 == a) ? Foo::myfunc<-Is-1>(b) : 0) + ...);
    }

    struct OtherHalf {
        Int value;
        Int operator>(Int i) const {
            return callMyFunc(std::make_integer_sequence<Int, std::numeric_limits<Int>::max()>{}, value, i);
        }
    };
    struct MyFunc {
        OtherHalf operator<(Int i) const {
            return OtherHalf{i};
        }
    };
    static constexpr MyFunc myfunc{};
};

Int test() {
    return genericFoo<DownWithTemplate>(1);
}