Operate in the Cloud

You can easily deploy CedarDB on your own AWS EC2 or GCP instances.

Installation

Here’s a quick setup example for running CedarDB in the cloud.

We recommend the latest Ubuntu LTS release (Ubuntu 24.04 at the time of writing).

ℹ️
By using CedarDB, you agree to our Terms and Conditions.

Instance sizing guidelines

When deploying CedarDB in the cloud, performance depends on three key resource dimensions:

  • Main Memory: CedarDB caches hot data and intermediate query results in RAM. For best performance, choose an instance with enough memory to fit your working set.
  • CPU: CedarDB scales seamlessly from a single core to hundreds. Analytical workloads benefit significantly from more CPU cores.
  • Storage:
    • For analytical workloads, throughput is critical, especially for cold data not yet in memory.
    • For transactional workloads, durability and write latency are key.
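
To make the throughput point concrete, here is a rough back-of-the-envelope sketch of how long a single pass over cold data takes at a given sustained storage throughput. The function name, the 100 GiB data size, and the two throughput figures are illustrative assumptions, not CedarDB measurements; real scans also depend on compression, caching, and parallelism.

```python
def cold_scan_seconds(data_gib: float, throughput_mib_s: float) -> float:
    """Lower bound on the time to read `data_gib` GiB once from storage."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    # 1 GiB = 1024 MiB
    return data_gib * 1024 / throughput_mib_s


# Hypothetical example: 100 GiB of cold data at two assumed volume throughputs.
for mib_s in (125, 1000):
    print(f"{mib_s} MiB/s: {cold_scan_seconds(100, mib_s):.0f} s")
# prints "125 MiB/s: 819 s" and "1000 MiB/s: 102 s"
```

In this sketch, an eightfold increase in throughput cuts the cold scan from roughly 14 minutes to under 2 minutes, which is why throughput matters most for data that is not yet in memory.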

Recommended EC2 instance types

As a starting point:

  • Use the general-purpose m7a family, with m7a.4xlarge as a good baseline for larger workloads.
  • Choose the compute-optimized c7a family for compute-heavy workloads where RAM demand is lower.
  • Use the memory-optimized r7a family if you have a large working set and latency is less of a concern.
  • Use the network-optimized c6in or m6in family if you store your data on S3 and process large amounts of data.
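
The recommendations above can be read as a simple decision rule. The following is a minimal sketch of one possible encoding, not an official sizing tool: the function name, the boolean inputs, the order in which the rules are checked, and the choice between c6in and m6in are all assumptions made for illustration.

```python
def pick_ec2_family(compute_heavy: bool, large_working_set: bool, data_on_s3: bool) -> str:
    """Map a coarse workload profile to one of the EC2 families listed above."""
    if data_on_s3:
        # Assumption: pick the compute-oriented network-optimized family
        # for compute-heavy workloads, the general-purpose one otherwise.
        return "c6in" if compute_heavy else "m6in"
    if large_working_set:
        return "r7a"  # memory-optimized
    if compute_heavy:
        return "c7a"  # compute-optimized
    return "m7a"      # general-purpose starting point


print(pick_ec2_family(compute_heavy=True, large_working_set=False, data_on_s3=False))
# prints "c7a"
```

Treat the result as a starting point only, and benchmark your own workload before committing to an instance size.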

Recommended GCP instance types

As a starting point:

  • Use the general-purpose c4-standard family, with c4-standard-48 as a good baseline for medium workloads.
  • Choose the compute-optimized c4-highcpu family for compute-heavy workloads where RAM demand is lower.
  • Use the memory-optimized c4-highmem family if you have a large working set.

Storage guidelines

For an overview of AWS storage types, see the EBS volume types documentation. For an overview of GCP Compute Engine storage types, see the durable block storage documentation.

AWS recommendations by use case:

  • Analytical, read-heavy workloads: Use gp3 volumes. They are cost-efficient and sufficient when the working set fits into memory.
  • High durability and transactional throughput: Use io2 volumes with enough provisioned IOPS to ensure consistent latency and reliability.
  • Ephemeral storage for temporary workloads: If you don’t need persistence across instance shutdowns, instances with attached ephemeral NVMe SSDs offer fast, low-latency storage at a lower price. This is a good fit for batch workloads, temporary database instances, or situations where data is already backed up elsewhere.
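
As a compact summary of the AWS storage recommendations above, the sketch below maps two yes/no questions to a storage option. The function name, its parameters, and the returned labels are hypothetical and only mirror the three use cases listed; they are not part of any CedarDB or AWS API.

```python
def pick_aws_storage(needs_persistence: bool, transactional: bool) -> str:
    """Return the storage option suggested above for a coarse workload description."""
    if not needs_persistence:
        # Data may be lost on shutdown, so fast ephemeral disks are acceptable.
        return "instance-store NVMe"
    if transactional:
        # Durability and consistent write latency matter most.
        return "io2"
    # Analytical, read-heavy workloads whose working set fits in memory.
    return "gp3"


print(pick_aws_storage(needs_persistence=True, transactional=False))
# prints "gp3"
```
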
ℹ️
Want to store your data on AWS S3 or Google Cloud Storage instead for increased performance and much lower cost? Sign up for our Enterprise trial license or contact us!