
Announcing the CedarDB Community Edition
We are thrilled to announce the general availability of our Community Edition today! Experience the power of CedarDB for free, forever!

24/04/2025
What It Takes to Be PostgreSQL Compatible
Many systems and tools, including CedarDB, claim to be “PostgreSQL compatible”, but what does that actually mean? In this article, we explain why PostgreSQL compatibility has several layers, what is required to achieve each layer, and where CedarDB fits into this hierarchy.

02/04/2025
Fast Compilation or Fast Execution: Just Have Both!
Learn the basics of code generation, which is one of the secrets behind CedarDB's performance. CedarDB creates custom machine code for every query. This keeps data in CPU registers as long as possible and minimizes unnecessary data transfers.

06/03/2025
To B or not to B: B-Trees with Optimistic Lock Coupling
B-Trees stand the test of time. In this article, we explore why we still use a 55-year-old data structure: it is still super efficient on modern hardware when we use contention-free optimistic lock coupling!

29/01/2025
Why Trees Without Branches Grow Faster: The Case for Reducing Branches in Code
In some of our blog posts, we explained what steps we take to reduce the number of branching instructions in our critical paths. However, we only ever claimed that this is much better and faster, and always omitted explaining why. So today we will fix this and take a deep dive into the gritty details of instruction execution on modern CPUs.

24/12/2024
Helping Christmas Elves Count Presents (or: Vectorized Overflow Checking)
In a previous post, we explained the importance of overflow checks when summing numbers, and mentioned that the usual approaches are not easily vectorized. Read here how to get 4x the performance when adding integers by using specialized vector instructions.

03/12/2024
The History of the Decline and Fall of In-Memory Database Systems
A decade ago, there was a sudden surge of high-performance in-memory systems dominating the world of interactive analytics. Today, almost everyone has gone back to using persistent storage. Does this mean that building these in-memory systems was a mistake?

19/11/2024
Offset Considered Harmful or: The Surprising Complexity of Pagination in SQL
Have you ever wondered why you sometimes see duplicate results when clicking on the second page of a website? In this blog post, we explore techniques for result pagination, how they impact the work necessary to compute the results, and why using the SQL OFFSET keyword for pagination is not a good idea.

30/10/2024
How to Correctly Sum Up Numbers
When you learn programming, one of the first things every book and course teaches is how to add two numbers. So, developers working with large data probably don't have to think too much about adding numbers, right? Turns out it's not quite so simple!

08/10/2024
Why You Shouldn't Forget to Optimize the Data Layout
The underlying data layout of your program can either help or hurt your algorithms. We explore how to improve the runtime of your program by analyzing the impact of the data layout on low-level properties such as cache line optimizations and SIMD execution, as well as other memory-level optimizations such as compression.

24/09/2024
The Hidden Cost of Data Movement
Moving data between disk, memory, caches, and CPU registers is one of the most critical paths when processing large amounts of data. In this post, we explore the often-overlooked costs of these data movements and show how they can be reduced for algorithms in general and specifically in a database system.

10/09/2024
Why I Prefer Exceptions to Error Values
Exceptions are often a better way to handle errors than returning them as values. We argue that traditional exceptions provide better user and developer experience, and show that they even result in faster execution.

27/08/2024
Reclaiming SQL’s Declarative Power
A good SQL optimizer can dramatically improve query performance, allowing you to focus on writing readable SQL instead of getting lost in the details of database optimization. This post discusses query optimizers, and query unnesting in particular, to emphasize the importance of query optimization.

13/08/2024
Can You Do Both: Fast Scans and Fast Writes in a Single System?
Most database systems focus on either analytical or transactional performance due to their contrasting data access patterns. CedarDB, however, achieves high performance in both areas (HTAP) with its unified storage system, Colibri, which combines compressed columnar data and row-based storage.

30/07/2024
A Deep Dive into German Strings
Our last post on "German Strings" received tremendous attention. We want to follow up on some of your comments and dive deeper into the reasons behind some of our string optimizations.

16/07/2024
Why German Strings are Everywhere
Many data processing systems have adopted our custom string format. Find out what makes it so special and why it is so relevant to them.

02/07/2024
Working with JSON and Graphs in CedarDB
The only thing that stops a bad guy with a database system is a good guy with a database system: Learn how to work with semi-structured and graph data in CedarDB by hunting for Germany's most wanted white-collar criminal, Jan Marsalek, on a real-world dataset.

18/06/2024
Why Your SSD (Probably) Sucks and What Your Database Can Do About It
SSDs have effectively replaced spinning disks as the go-to solution for persistent storage for database systems. While they offer amazing read and write throughput, they come with their own issues vendors don't like to talk about. We dive into those issues and explain how CedarDB overcomes them.

05/06/2024
Simple, Efficient, and Robust Hash Tables for Join Processing
Efficient join processing is at the heart of CedarDB. Join us for a deep dive into the tech behind the world's fastest hash join implementation.

28/05/2024
An ode to PostgreSQL, and why it is still time to start over
CedarDB is a relational-first database system that delivers best-in-class performance for all your workloads, from transactional to analytical to graph, accessible through PostgreSQL’s tools and SQL dialect. Here's the story of why we're doing what we're doing, how we got here, and why it should matter to you.