October 30, 2024

How to Correctly Sum Up Numbers

When you learn programming, one of the first things every book and course teaches is how to add two numbers. So, developers working with large data probably don't have to think too much about adding numbers, right? Turns out it's not quite so simple!

Moritz Sichert
Moritz Sichert
Moritz Sichert
Moritz Sichert
How to Correctly Sum Up Numbers

How to Correctly Sum Up Numbers

One of the first examples for loops in probably every programming language course is taking a list of numbers and calculating the sum of them. And in every programming language, it’s trivial to write up a solution:

int sumUpNumbers(std::span<const int> numbers) {
    int sum = 0;
    for (int i : numbers)
        sum += i;
    return sum;
}

If you compile this with a modern optimizing compiler, you will even get very efficient vectorized code, that sums millions of integers per second. Nice!

Unfortunately, as you might have guessed, this is not the end of the story. There are several issues with this solution if used in any serious application where you want correct results, e.g., a database system:

  • Does this solution still work for unsigned integers?
  • How about floating point numbers?
  • What do you do when the result does not fit into the integer type of the argument? What if it doesn’t fit into any type?
  • How do the answers to any of these question change if we not only consider addition but also multiplication?

In this post, we are exploring the world of handling exceptional cases in numerical computations. We will focus only on the initial example of adding up a list of numbers as even this simple example comes with more than enough edge cases.

Numerical Overflow

Obviously, when we talk about edge cases in arithmetic on a computer, “numerical overflows” are the first thing most people will think of. Since most hardware only supports fixed-size values for the arithmetic operation, there is always a limit to the numbers the hardware can represent. Most CPU architectures support only up to 64-bit integers in their registers, for example.

For integers, overflow happens when you add two numbers where the sum is larger than the largest supported integer or lower than the smallest supported integer. For 8-bit twos-complement signed integers, all of these operations result in integer overflow: 127 + 1, -100 - 29, -100 + (-29), 100 * 2, -128 / -1.

Basically all relevant hardware uses twos-complement integers and handles integer overflow by “wrapping” the result into the supported range of values. This can often lead to surprising results, e.g., 127 + 1 = -128, -100 - 29 = 127, -128 / -1 = -128.

For IEEE floating point numbers, overflows are handled differently. The result of 2e38 + 2e38 does not fit into a 32-bit float but floating point operations do not wrap like integer operations. Instead, depending on the hardware and configuration, this operation results in the special “infinity” value.

Since multiplication with values between zero and one leads to a smaller result but the precision of float values is obviously limited, arithmetic with float values also has an additional edge case: If you calculate 1e-38 * 1e-8, for example, the result is exactly zero. So, multiplication of two non-zero numbers can lead to a zero result. This behavior is usually called “underflow” (not to be confused with overflow of negative integers!) and can easily lead to unexpected results or even program crashes if you then divide by zero.

Before we dive into how you should actually deal with any kind of overflow or underflow, let’s first look at how some programming languages handle these exceptional cases.

Overflow in C++

In C++, overflow is generally not allowed and is considered “undefined behavior”:

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

C++ Standard [expr.pre]

However, the actual behavior of overflows in C++ is more subtle. For example, overflow on unsigned integers silently wraps to the supported range of values (explicitly defined in [basic.fundamental]). Also, floating point operations are not required to be compatible with IEEE 754, but in practice, all compilers generate compatible instructions as they only target IEEE 754 compatible hardware.

As a result, C++ compilers generally generate a single add instruction for every addition and will leave you with the fallout caused by overflows.

Overflow in Rust

In Rust, overflowing integer operations lead to a “panic” (a controlled crash of the program) when compiled in debug mode. However, in release mode all integer operations wrap their results to their integer types, so overflows must be detected manually. Some people might consider this unsafe.

For floating point operations, the Rust compiler guarantees compatibility with IEEE 754 which means that generally overflows lead to infinity values and underflows are converted to zero.

Overflow in Python

The integer type in Python is implemented using a variable-sized integer which means it can store any integer value. So, integer arithmetic will never overflow in Python. All operations on variable-sized integers are more expensive than on fixed-size integers, so Python generally doesn’t perform that well when doing a lot of integer arithmetic.

Float values use the semantics of the underlying interpreter. The main Python interpreter, called CPython, is written in C, so floats in Python generally follow the IEEE 754 semantics, as well.

How to Handle Overflows

It should be obvious that not properly accounting for possible overflow can have unintended consequences. In extreme cases, unintended integer overflow can even lead to injury or death as in the infamous case of the Therac-25 X-ray machine.

Interestingly, not even all database systems guard against unintended overflows. Let’s look at a simple example:

with my_ints(i) as (
    -- 9e18 barely fits into a bigint, 10e18 is too large already
    values (9e18::bigint), (9.1e18::bigint)
)
select sum(i) from my_ints;
Try it in CedarDB

The temporary table my_ints contains two numbers that individually barely fit into a bigint but adding them would result in an overflow. So, what should the result of summing the numbers up be? If you run this on CedarDB, the query will abort and you get the error message “numeric overflow”.

Other systems that do not handle overflow will instead successfully execute the query and return the result -346744073709551616. You don’t need to look at the exact number to see that returning a negative result when summing up values of a table containing only positive numbers is not ideal.

Overflow Checking in Hardware

Now, the interesting question is: How do systems like CedarDB actually detect and handle overflow? We could carefully compare all operands of arithmetic operations before doing the operation to rule out overflows. However, that would add significant overhead to all numeric calculations.

Instead, we can leverage the fact that all modern hardware already tracks integer overflows on every operation. The add instruction of the x86 instruction set, for example, sets several CPU flags according to the result. One of these flags is called the “overflow flag” (OF), which is set exactly every time an integer overflow occurs.

Other CPU architectures have similar flags which can be used for cheap overflow detection. Fortunately, you don’t have to know the specifics of every CPU architecture to use these flags as most programming languages have special functions that abstract the arithmetic operations and checking of flag bits.

In C++, the GCC and Clang compilers support several builtin functions for checked integer arithmetic. A better version of the initial example that handles overflows could look like this:

std::optional<int> sumUpNumbersCorrectly(std::span<const int> numbers) {
    int sum = 0;
    for (int i : numbers) {
        if (__builtin_add_overflow(sum, i, &sum))
            return std::nullopt;
    }
    return sum;
}

Rust also has a similar API for checked arithmetic:

fn sum_up_numbers_correctly(numbers: &[i32]) -> Option<i32> {
    let mut sum: i32 = 0;
    for i in numbers {
        sum = sum.checked_add(*i)?;
    }
    Some(sum)
}

Now, this doesn’t have zero overhead as the compiler can’t vectorize the function using checked arithmetic. However, the new solution ensures that the function doesn’t return any unexpected results.

For floating point arithmetic, CPUs have different ways of handling or detecting overflows. Usually, a program can choose whether the CPU generates a hardware exception or just sets a flag bit. Unfortunately, most programming languages don’t allow you to control the behavior of overflows (or underflows) of floating point arithmetic or read the flag bits set by floating point operations. So, you will have to resort to writing machine code manually if you want to efficiently check overflows of float numbers.

Overflow Handling in Database Systems

In CedarDB, we are very strict about not allowing silent overflows. Like most database systems, we consider it an error if a query returns an unexpected result due to an overflow. Because overflow checking can degrade performance, other database systems, such as ClickHouse, choose to explicitly allow overflows instead.

Our friends at Firebolt summarized our opinion on this very well:

Throwing errors in these cases makes it easier to use the system: instead of returning potentially meaningless results, the query engine lets the user know that they aren’t using the functions in a way that’s safe.

Making a Query Engine Postgres Compliant Part I - Functions, Firebolt

Explicitly checking for overflows is not the only strategy that database systems have for preventing unexpected results. Often, intermediate results use larger types; for example, when summing 32-bit integers, the intermediate result might be stored in a 64-bit or even 128-bit integer. Not only does this make overflows much less likely (or even impossible with 128-bit integers), larger intermediate results also allow for better precision when calculating some aggregation functions, such as avg().

Summary

To summarize our short overview of overflow handling, let me answer the questions from the beginning:

Does this solution still work for unsigned integers? Integer overflow happens with signed and unsigned integers, so you have to handle overflows for both. The only significant difference is how C++ handles overflows for unsigned integers: It explicitly allows them for unsigned integers and disallows them for signed integers.

How about floating point numbers? Floating point operations can have overflow and even underflow as well. However, the default behavior of most programming languages and CPU architectures usually doesn’t lead to completely unexpected results.

What do you do when the result does not fit into the integer type of the argument? What if it doesn’t fit into any type? To avoid many cases of overflow, you can use a larger integer type for any intermediate result. Any additional problems can be solved precisely by checking for overflows. If the result of an operation doesn’t fit into a given integer type, you can use efficient overflow checking functions to handle the overflow, e.g., by printing an error message. Printing an error message is also much better than silently corrupting data!

How do the answers to any of these question change if we not only consider addition but also multiplication? Compilers usually offer special functions that handle overflow not only for integer addition but also for the other arithmetic operators. As programming languages don’t usually support handling overflows of floating point additions, they also don’t support overflow checking for multiplication.


CedarDB uses a few more tricks to ensure super-fast but correct query processing. Stay tuned for our next blog posts exploring more parts of our query execution engine and in the meantime, sign up for our waitlist!
Join our waitlist!

Supported by
BMWK logo
EU logo
Das Projekt LunaBase wird im Rahmen des EXIST-Programms durch das Bundesministerium für Wirtschaft und Klimaschutz und den Europäischen Sozialfonds gefördert.