How to Correctly Sum Up Numbers
One of the first examples for loops in probably every programming language course is taking a list of numbers and calculating the sum of them. And in every programming language, it’s trivial to write up a solution:
int sumUpNumbers(std::span<const int> numbers) {
int sum = 0;
for (int i : numbers)
sum += i;
return sum;
}
If you compile this with a modern optimizing compiler, you will even get very efficient vectorized code, that sums millions of integers per second. Nice!
Unfortunately, as you might have guessed, this is not the end of the story. There are several issues with this solution if used in any serious application where you want correct results, e.g., a database system:
- Does this solution still work for unsigned integers?
- How about floating point numbers?
- What do you do when the result does not fit into the integer type of the argument? What if it doesn’t fit into any type?
- How do the answers to any of these question change if we not only consider addition but also multiplication?
In this post, we are exploring the world of handling exceptional cases in numerical computations. We will focus only on the initial example of adding up a list of numbers as even this simple example comes with more than enough edge cases.
Numerical Overflow
Obviously, when we talk about edge cases in arithmetic on a computer, “numerical overflows” are the first thing most people will think of. Since most hardware only supports fixed-size values for the arithmetic operation, there is always a limit to the numbers the hardware can represent. Most CPU architectures support only up to 64-bit integers in their registers, for example.
For integers, overflow happens when you add two numbers where the sum is larger
than the largest supported integer or lower than the smallest supported
integer. For 8-bit twos-complement signed integers, all of these operations
result in integer overflow: 127 + 1
, -100 - 29
, -100 + (-29)
, 100 * 2
,
-128 / -1
.
Basically all relevant hardware uses twos-complement integers and handles
integer overflow by “wrapping” the result into the supported range of values.
This can often lead to surprising results, e.g.,
127 + 1 = -128
, -100 - 29 = 127
, -128 / -1 = -128
.
For IEEE floating point numbers, overflows are handled differently. The result
of 2e38 + 2e38
does not fit into a 32-bit float but floating point operations
do not wrap like integer operations. Instead, depending on the hardware and
configuration, this operation results in the special “infinity” value.
Since multiplication with values between zero and one leads to a smaller result
but the precision of float values is obviously limited, arithmetic with float
values also has an additional edge case: If you calculate 1e-38 * 1e-8
, for
example, the result is exactly zero. So, multiplication of two non-zero numbers
can lead to a zero result. This behavior is usually called “underflow” (not to
be confused with overflow of negative integers!) and can easily lead to
unexpected results or even program crashes if you then divide by zero.
Before we dive into how you should actually deal with any kind of overflow or underflow, let’s first look at how some programming languages handle these exceptional cases.
Overflow in C++
In C++, overflow is generally not allowed and is considered “undefined behavior”:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.
However, the actual behavior of overflows in C++ is more subtle. For example, overflow on unsigned integers silently wraps to the supported range of values (explicitly defined in [basic.fundamental]). Also, floating point operations are not required to be compatible with IEEE 754, but in practice, all compilers generate compatible instructions as they only target IEEE 754 compatible hardware.
As a result, C++ compilers generally generate a single add
instruction for every
addition and will leave you with the fallout caused by overflows.
Overflow in Rust
In Rust, overflowing integer operations lead to a “panic” (a controlled crash of the program) when compiled in debug mode. However, in release mode all integer operations wrap their results to their integer types, so overflows must be detected manually. Some people might consider this unsafe.
For floating point operations, the Rust compiler guarantees compatibility with IEEE 754 which means that generally overflows lead to infinity values and underflows are converted to zero.
Overflow in Python
The integer type in Python is implemented using a variable-sized integer which means it can store any integer value. So, integer arithmetic will never overflow in Python. All operations on variable-sized integers are more expensive than on fixed-size integers, so Python generally doesn’t perform that well when doing a lot of integer arithmetic.
Float values use the semantics of the underlying interpreter. The main Python interpreter, called CPython, is written in C, so floats in Python generally follow the IEEE 754 semantics, as well.
How to Handle Overflows
It should be obvious that not properly accounting for possible overflow can have unintended consequences. In extreme cases, unintended integer overflow can even lead to injury or death as in the infamous case of the Therac-25 X-ray machine.
Interestingly, not even all database systems guard against unintended overflows. Let’s look at a simple example:
with my_ints(i) as (
-- 9e18 barely fits into a bigint, 10e18 is too large already
values (9e18::bigint), (9.1e18::bigint)
)
select sum(i) from my_ints;
with my_ints(i) as (
-- 9e18 barely fits into a bigint, 10e18 is too large already
values (9e18::bigint), (9.1e18::bigint)
)
select sum(i) from my_ints;
The temporary table my_ints
contains two numbers that individually barely fit
into a bigint but adding them would result in an overflow. So, what should the
result of summing the numbers up be? If you run this on CedarDB, the query will
abort and you get the error message “numeric overflow”.
Other systems that do not handle overflow will instead successfully execute the query and return the result -346744073709551616. You don’t need to look at the exact number to see that returning a negative result when summing up values of a table containing only positive numbers is not ideal.
Overflow Checking in Hardware
Now, the interesting question is: How do systems like CedarDB actually detect and handle overflow? We could carefully compare all operands of arithmetic operations before doing the operation to rule out overflows. However, that would add significant overhead to all numeric calculations.
Instead, we can leverage the fact that all modern hardware already tracks
integer overflows on every operation. The add
instruction of the x86
instruction set, for example, sets several CPU
flags according to the
result. One of these flags is called the “overflow flag” (OF), which is set
exactly every time an integer overflow occurs.
Other CPU architectures have similar flags which can be used for cheap overflow detection. Fortunately, you don’t have to know the specifics of every CPU architecture to use these flags as most programming languages have special functions that abstract the arithmetic operations and checking of flag bits.
In C++, the GCC and Clang compilers support several builtin functions for checked integer arithmetic. A better version of the initial example that handles overflows could look like this:
std::optional<int> sumUpNumbersCorrectly(std::span<const int> numbers) {
int sum = 0;
for (int i : numbers) {
if (__builtin_add_overflow(sum, i, &sum))
return std::nullopt;
}
return sum;
}
Rust also has a similar API for checked arithmetic:
fn sum_up_numbers_correctly(numbers: &[i32]) -> Option<i32> {
let mut sum: i32 = 0;
for i in numbers {
sum = sum.checked_add(*i)?;
}
Some(sum)
}
Now, this doesn’t have zero overhead as the compiler can’t vectorize the function using checked arithmetic. However, the new solution ensures that the function doesn’t return any unexpected results.
For floating point arithmetic, CPUs have different ways of handling or detecting overflows. Usually, a program can choose whether the CPU generates a hardware exception or just sets a flag bit. Unfortunately, most programming languages don’t allow you to control the behavior of overflows (or underflows) of floating point arithmetic or read the flag bits set by floating point operations. So, you will have to resort to writing machine code manually if you want to efficiently check overflows of float numbers.
Overflow Handling in Database Systems
In CedarDB, we are very strict about not allowing silent overflows. Like most database systems, we consider it an error if a query returns an unexpected result due to an overflow. Because overflow checking can degrade performance, other database systems, such as ClickHouse, choose to explicitly allow overflows instead.
Our friends at Firebolt summarized our opinion on this very well:
Throwing errors in these cases makes it easier to use the system: instead of returning potentially meaningless results, the query engine lets the user know that they aren’t using the functions in a way that’s safe.
Making a Query Engine Postgres Compliant Part I - Functions, Firebolt
Explicitly checking for overflows is not the only strategy that database
systems have for preventing unexpected results. Often, intermediate results use
larger types; for example, when summing 32-bit integers, the intermediate
result might be stored in a 64-bit or even 128-bit integer. Not only does this
make overflows much less likely (or even impossible with 128-bit integers),
larger intermediate results also allow for better precision when calculating
some aggregation functions, such as avg()
.
Summary
To summarize our short overview of overflow handling, let me answer the questions from the beginning:
Does this solution still work for unsigned integers? Integer overflow happens with signed and unsigned integers, so you have to handle overflows for both. The only significant difference is how C++ handles overflows for unsigned integers: It explicitly allows them for unsigned integers and disallows them for signed integers.
How about floating point numbers? Floating point operations can have overflow and even underflow as well. However, the default behavior of most programming languages and CPU architectures usually doesn’t lead to completely unexpected results.
What do you do when the result does not fit into the integer type of the argument? What if it doesn’t fit into any type? To avoid many cases of overflow, you can use a larger integer type for any intermediate result. Any additional problems can be solved precisely by checking for overflows. If the result of an operation doesn’t fit into a given integer type, you can use efficient overflow checking functions to handle the overflow, e.g., by printing an error message. Printing an error message is also much better than silently corrupting data!
How do the answers to any of these question change if we not only consider addition but also multiplication? Compilers usually offer special functions that handle overflow not only for integer addition but also for the other arithmetic operators. As programming languages don’t usually support handling overflows of floating point additions, they also don’t support overflow checking for multiplication.
CedarDB uses a few more tricks to ensure super-fast but correct query processing. Stay tuned for our next blog posts exploring more parts of our query execution engine and in the meantime, sign up for our waitlist!