May 14, 2025 • 5 Minutes

Announcing the CedarDB Community Edition

We are thrilled to announce the general availability of our Community Edition today! Experience the power of CedarDB for free, forever!

24/04/2025

What It Takes to Be PostgreSQL Compatible

Many systems and tools, including CedarDB, claim to be “PostgreSQL compatible”, but what does that actually mean? In this article, we explain why PostgreSQL compatibility has several layers, what is required to achieve each layer, and where CedarDB fits into this hierarchy.

02/04/2025

Fast Compilation or Fast Execution: Just Have Both!

Learn the basics of code generation, which is one of the secrets behind CedarDB's performance. CedarDB creates custom machine code for every query. This keeps data in CPU registers as long as possible and minimizes unnecessary data transfers.

06/03/2025

To B or not to B: B-Trees with Optimistic Lock Coupling

B-Trees stand the test of time. In this article, we explore why we still use a 55-year-old data structure: it is still super efficient on modern hardware when we use contention-free optimistic lock coupling!

29/01/2025

Why Trees Without Branches Grow Faster: The Case for Reducing Branches in Code

In some of our blog posts, we explained what steps we take to reduce the number of branching instructions in our critical paths. However, we only ever claimed that this is much better and faster, and always omitted explaining why. So today we will fix this and take a deep dive into the gritty details of instruction execution on modern CPUs.

24/12/2024

Helping Christmas Elves Count Presents (or: Vectorized Overflow Checking)

In a previous post, we explained the importance of overflow checks when summing numbers, and mentioned that the usual approaches are not easily vectorized. Read here how to get 4x the performance when adding integers by using specialized vector instructions.

03/12/2024

The History of the Decline and Fall of In-Memory Database Systems

A decade ago, there was a sudden surge of high-performance in-memory systems dominating the world of interactive analytics. Today, almost everyone has gone back to using persistent storage. Does this mean that building these in-memory systems was a mistake?

19/11/2024

Offset Considered Harmful or: The Surprising Complexity of Pagination in SQL

Have you ever wondered why you sometimes see duplicate results when clicking on the second page of a website? In this blog post, we explore techniques for result pagination, how they impact the work necessary to compute the results, and why using the SQL OFFSET keyword for pagination is not a good idea.

30/10/2024

How to Correctly Sum Up Numbers

When you learn programming, one of the first things every book and course teaches is how to add two numbers. So, developers working with large data probably don't have to think too much about adding numbers, right? Turns out it's not quite so simple!

08/10/2024

Why You Shouldn't Forget to Optimize the Data Layout

The underlying data layout of your program can either help or hurt your algorithms. We explore how to improve the runtime of your program by analyzing the impact of the data layout on low-level properties such as cache line optimizations and SIMD execution, as well as other memory-level optimizations such as compression.

24/09/2024

The Hidden Cost of Data Movement

Moving data between disk, memory, caches, and CPU registers is one of the most critical paths when processing large amounts of data. In this post, we explore the often-overlooked costs of these data movements and show how they can be reduced for algorithms in general and specifically in a database system.

10/09/2024

Why I Prefer Exceptions to Error Values

Exceptions are often a better way to handle errors than returning them as values. We argue that traditional exceptions provide better user and developer experience, and show that they even result in faster execution.

27/08/2024

Reclaiming SQL’s Declarative Power

A good SQL optimizer can dramatically improve query performance, allowing you to focus on writing readable SQL instead of getting lost in the details of database optimization. This post discusses query optimizers, and query unnesting in particular, to emphasize the importance of query optimization.

13/08/2024

Can You Do Both: Fast Scans and Fast Writes in a Single System?

Most database systems focus on either analytical or transactional performance due to their contrasting data access patterns. CedarDB, however, achieves high performance in both areas (HTAP) with its unified storage system, Colibri, which combines compressed columnar data and row-based storage.

30/07/2024

A Deep Dive into German Strings

Our last post on our "German Strings" has received tremendous attention. We want to follow up on some of your comments and dive deeper into the reasons behind some of our string optimizations.

16/07/2024

Why German Strings are Everywhere

Many data processing systems have adopted our custom string format. Find out what makes it so special and why it is so relevant to them.

02/07/2024

Working with JSON and Graphs in CedarDB

The only thing that stops a bad guy with a database system is a good guy with a database system: Learn how to work with semi-structured and graph data in CedarDB by hunting for Germany's most wanted white-collar criminal, Jan Marsalek, on a real-world dataset.

18/06/2024

Why Your SSD (Probably) Sucks and What Your Database Can Do About It

SSDs have effectively replaced spinning disks as the go-to solution for persistent storage for database systems. While they offer amazing read and write throughput, they come with their own issues vendors don't like to talk about. We dive into those issues and explain how CedarDB overcomes them.

05/06/2024

Simple, Efficient, and Robust Hash Tables for Join Processing

Efficient join processing is at the heart of CedarDB. Join us for a deep dive into the tech behind the world's fastest hash join implementation.

28/05/2024

An ode to PostgreSQL, and why it is still time to start over

CedarDB is a relational-first database system that delivers best-in-class performance for all your workloads, from transactional to analytical to graph, accessible through PostgreSQL’s tools and SQL dialect. Here's the story of why we're doing what we're doing, how we got here, and why it should matter to you.
