July 2024

Summertime Progress

August 15, 2024

We hope you had a wonderful July. The heat of the German summer did not slow us down. During the last month, we focused mainly on increasing the compatibility with PostgreSQL tools in preparation for an upcoming alpha release.

But that is not all that is new:

What’s new at CedarDB

We kept busy with our Blog:

  • In Why German Strings are Everywhere, we explored why you may have already encountered some of our technologies in the wild, and what makes our string format so special that it has been adopted by technologies like DuckDB, Apache Arrow, Polars, and Facebook Velox.
  • Given the attention and interested questions generated by our intro to German Strings, we took the time to write A Deep Dive into German Strings, expanding on the motivation behind their design and taking a look under the hood of some of their optimizations.
  • Finally, we explored CedarDB’s hybrid storage format that combines compressed columnar data and row-based data in a single relation. This allows us to answer the question Can You Do Both: Fast Scans and Fast Writes in a Single System? with “yes!”.

Database Changes

Compatibility improvements & more

  • Continuing our focus on compatibility from last month, we added and extended more than 10 common PostgreSQL system tables and functions, increasing compatibility with many of our supported clients.
  • In addition, we now support all PostgreSQL timezone abbreviations and added parsing support for some PostgreSQL optimization hints such as CTE materialization and COPY FREEZE.
  • As a first step towards supporting upserts, CedarDB now supports the ON CONFLICT DO NOTHING option for inserts. This can help you avoid unnecessary aborts in bulk inserts in the case of duplicate or corrupt input data.
  • In preparation for supporting even more transactionally consistent DDL statements, we have separated the schema and database file versions. Unfortunately, this change breaks compatibility with older database files. While this is undoubtedly an inconvenience in the short term, it helps us reduce the need to break backward compatibility in the future.
  • We have optimized the handling of numeric types in JSONB, reducing their storage footprint while increasing access performance.
  • We have also added support for calling custom SQL table functions in SELECT statements. You can read more about custom SQL functions in our docs.
  • Finally, our prototype of vector support similar to pgvector is shaping up nicely. While mainline support is still some way off, we would love to get feedback on use cases for vector support that we can test internally. Please reach out to contact@cedardb.com.
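
To make the ON CONFLICT DO NOTHING item above more concrete, here is a minimal sketch of the behavior during a bulk insert. It is an illustration only: it runs against Python's built-in SQLite module as a stand-in, because SQLite (3.24 or newer) accepts the same clause, and it does not connect to CedarDB. The table name `users`, its columns, and the helper `bulk_insert_skipping_duplicates` are hypothetical names chosen for this example.

```python
import sqlite3


def bulk_insert_skipping_duplicates(rows):
    """Insert all rows in one transaction, skipping primary-key duplicates.

    Hypothetical helper for illustration. SQLite is used as a stand-in here;
    it is not CedarDB and only shares the ON CONFLICT DO NOTHING clause.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # The 'with' block commits on success and rolls back on an exception.
    with conn:
        conn.executemany(
            # Without ON CONFLICT DO NOTHING, the duplicate id would raise
            # an IntegrityError and abort the whole batch.
            "INSERT INTO users (id, name) VALUES (?, ?) ON CONFLICT DO NOTHING",
            rows,
        )
    stored = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    conn.close()
    return stored


# The third row reuses id 1 and is silently skipped instead of failing.
rows = [(1, "Ada"), (2, "Grace"), (1, "Ada again"), (3, "Edsger")]
stored = bulk_insert_skipping_duplicates(rows)
print(stored)
```

The first row for a given key wins and later duplicates are dropped, so the batch commits as a whole. Rows that are skipped are not reported back by the statement itself, so a loader that needs to know which inputs were rejected would have to compare row counts or check separately.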

Past & Future Events

Events we attended

Scientific Retreat

In keeping with our academic background, the entire CedarDB team attended the scientific retreat of the TUM database research group at Schliersee in the Bavarian Alps. We had the chance to hear about the latest research topics straight from the source, and to share what we have learned from conversations with many companies about their pressing needs and the research topics worth pursuing.

Podcast Time - Yet Another Infra Deep Dive

In another first for CedarDB, we were featured on a podcast. Moritz and Chris sat down with Tim Chen and Ian Livingstone on the Yet Another Infra Deep Dive Podcast to discuss the current state of data processing and how CedarDB will finally be able to reduce some of the tool sprawl that is common today. The episode is already available to listen to!

Events coming up

US Visit

Moritz and Chris will visit the German Accelerator in Palo Alto in September (15th to 21st) as part of a delegation from TUM to strengthen ties with the US market. While their schedule is quite busy during the day, get in touch if you want to meet for dinner or a drink!

Thank you!


That’s all for today. We’re looking forward to sharing more awesome progress in our next newsletter!

Until our next update!

Your CedarDB Team