Blog
May 14, 2025 · 5 Minutes

Announcing the CedarDB Community Edition

We are thrilled to announce the general availability of our Community Edition today! Experience the power of CedarDB for free, forever!

Blog

24/04/2025

What It Takes to Be PostgreSQL Compatible

Many systems and tools, including CedarDB, claim to be “PostgreSQL compatible”, but what does that actually mean? In this article, we explain why PostgreSQL compatibility has several layers, what is required to achieve each layer, and where CedarDB fits into this hierarchy.

Blog

02/04/2025

Fast Compilation or Fast Execution: Just Have Both!

Learn the basics of code generation, which is one of the secrets behind CedarDB's performance. CedarDB creates custom machine code for every query. This keeps data in CPU registers as long as possible and minimizes unnecessary data transfers.

Blog

06/03/2025

To B or not to B: B-Trees with Optimistic Lock Coupling

B-Trees stand the test of time. In this article, we explore why we still use a 55-year-old data structure: it is still super efficient on modern hardware when we use contention-free optimistic lock coupling!

Blog

29/01/2025

Why Trees Without Branches Grow Faster: The Case for Reducing Branches in Code

In some of our blog posts, we explained what steps we take to reduce the number of branching instructions in our critical paths. However, we only ever claimed that this is much better and faster, and always omitted explaining why. So today we will fix this and take a deep dive into the gritty details of instruction execution on modern CPUs.

Blog

24/12/2024

Helping Christmas Elves Count Presents (or: Vectorized Overflow Checking)

In a previous post, we explained the importance of overflow checks when summing numbers, and mentioned that the usual approaches are not easily vectorized. Read here how to get 4x the performance when adding integers by using specialized vector instructions.

Blog

03/12/2024

The History of the Decline and Fall of In-Memory Database Systems

A decade ago, there was a sudden surge of high-performance in-memory systems dominating the world of interactive analytics. Today, almost everyone has gone back to using persistent storage. Does this mean that building these in-memory systems was a mistake?

Blog

19/11/2024

Offset Considered Harmful or: The Surprising Complexity of Pagination in SQL

Have you ever wondered why you sometimes see duplicate results when clicking on the second page of a website? In this blog post, we explore techniques for result pagination, how they impact the work necessary to compute the results, and why using the SQL OFFSET keyword for it is not a good idea.

Blog

30/10/2024

How to Correctly Sum Up Numbers

When you learn programming, one of the first things every book and course teaches is how to add two numbers. So, developers working with large data probably don't have to think too much about adding numbers, right? Turns out it's not quite so simple!

Blog

08/10/2024

Why You Shouldn't Forget to Optimize the Data Layout

The underlying data layout of your program can either help or hurt your algorithms. We explore how to improve the runtime of your program by analyzing the impact of the data layout on low-level properties such as cache line optimizations and SIMD execution, as well as other memory-level optimizations such as compression.

Blog

24/09/2024

The Hidden Cost of Data Movement

Moving data between disk, memory, caches, and CPU registers is one of the most critical paths when processing large amounts of data. In this post, we explore the often-overlooked costs of these data movements and show how they can be reduced for algorithms in general and specifically in a database system.

Blog

10/09/2024

Why I Prefer Exceptions to Error Values

Exceptions are often a better way to handle errors than returning them as values. We argue that traditional exceptions provide a better user and developer experience, and show that they even result in faster execution.

Blog

27/08/2024

Reclaiming SQL’s Declarative Power

A good SQL optimizer can dramatically improve query performance, allowing you to focus on writing readable SQL instead of getting lost in the details of database optimization. This post discusses query optimizers, and query unnesting in particular, to emphasize the importance of query optimization.

Blog

13/08/2024

Can You Do Both: Fast Scans and Fast Writes in a Single System?

Most database systems focus on either analytical or transactional performance due to their contrasting data access patterns. CedarDB, however, achieves high performance in both areas (HTAP) with its unified storage system, Colibri, which combines compressed columnar data and row-based storage.

Blog

30/07/2024

A Deep Dive into German Strings

Our last post on "German Strings" received tremendous attention. We want to follow up on some of your comments and dive deeper into the reasons behind some of our string optimizations.

Blog

16/07/2024

Why German Strings are Everywhere

Many data processing systems have adopted our custom string format. Find out what makes it so special and why it is so relevant to them.

Blog

02/07/2024

Working with JSON and Graphs in CedarDB

The only thing that stops a bad guy with a database system is a good guy with a database system: Learn how to work with semi-structured and graph data in CedarDB by hunting for Germany's most wanted white-collar criminal, Jan Marsalek, on a real-world dataset.

Blog

18/06/2024

Why Your SSD (Probably) Sucks and What Your Database Can Do About It

SSDs have effectively replaced spinning disks as the go-to solution for persistent storage for database systems. While they offer amazing read and write throughput, they come with their own issues that vendors don't like to talk about. We dive into those issues and explain how CedarDB overcomes them.