When I decided to deploy all my services on Kubernetes, one of the critical components was the PostgreSQL database. Deploying databases on Kubernetes can be challenging due to their stateful nature. In this post, I’ll share how I deployed PostgreSQL on Kubernetes, the challenges I faced, and why I chose the CrunchyData PostgreSQL Operator.

Understanding Stateful vs. Stateless Applications

Before diving into deployment, it’s essential to understand the difference between stateful and stateless applications.

  • Stateless Applications: Do not store data or state between sessions. Each request is independent, and the application doesn’t need to remember previous interactions.
  • Stateful Applications: Maintain state across sessions. Databases like PostgreSQL are stateful because they need to store and retrieve data reliably.

Kubernetes was initially designed for stateless applications, but over time, support for stateful workloads has improved with features like StatefulSets and PersistentVolumes.
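To make that concrete, here is a minimal sketch of a stateful workload in Kubernetes. The names, image, and storage size are illustrative, not taken from my actual setup:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: example-db
    spec:
      serviceName: example-db
      replicas: 1
      selector:
        matchLabels:
          app: example-db
      template:
        metadata:
          labels:
            app: example-db
        spec:
          containers:
          - name: postgres
            image: postgres:16
            env:
            - name: POSTGRES_PASSWORD
              value: example # illustration only; use a Secret in practice
            volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi

Each pod gets its own PersistentVolumeClaim from the template, so its data survives pod restarts, which is exactly what a database needs and what a plain Deployment does not provide.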

PostgreSQL Architecture Overview

PostgreSQL typically follows a primary-replica architecture:

  • Primary Node: Handles all write operations and can also serve reads.
  • Replica Nodes: Handle read operations and replicate data from the primary node.

Replication can be configured as:

  • Synchronous Replication: The primary waits for confirmation from replicas before completing a transaction. This ensures data consistency but can impact performance.
  • Asynchronous Replication: The primary doesn’t wait for replicas, which improves performance but may risk data loss if the primary fails before replication.
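For reference, these two modes map to a couple of settings on the primary. This is a sketch of the relevant postgresql.conf lines; the standby name is hypothetical:

    # Asynchronous (the default): leave synchronous_standby_names empty,
    # and commits return as soon as the primary flushes its own WAL.
    synchronous_standby_names = ''

    # Synchronous: name the standby(s) that must confirm each commit.
    # synchronous_standby_names = 'standby1'
    # synchronous_commit = on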

Handling Failover

In production environments, it’s crucial to handle scenarios where the primary node fails:

  • Automatic Failover: A replica is promoted to primary automatically.
  • Monitoring and Orchestration: Tools are needed to monitor the cluster and manage failover processes.
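Operators typically delegate this orchestration to Patroni (more on that below). As a rough sketch, Patroni's failover behavior is driven by a few timing settings; the values shown are Patroni's documented defaults:

    # Excerpt from a Patroni configuration (bootstrap.dcs section)
    ttl: 30                 # leader lock expires after 30s without renewal
    loop_wait: 10           # seconds between cluster-management cycles
    retry_timeout: 10       # how long to retry DCS/PostgreSQL operations
    maximum_lag_on_failover: 1048576  # max replica lag in bytes to be promotable

If the primary stops renewing its leader lock within ttl seconds, a sufficiently caught-up replica is promoted in its place.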

Deploying PostgreSQL on Kubernetes

To manage PostgreSQL clusters on Kubernetes effectively, several operators have been developed:

  • Zalando Postgres Operator
  • CrunchyData PostgreSQL Operator
  • CloudNativePG
  • And more

These operators automate tasks like deployment, scaling, backups, and failover.

Using the Zalando Postgres Operator

Why Zalando?

I initially chose the Zalando Postgres Operator because:

  • Simplicity: It’s straightforward to deploy.
  • Active Community: It’s maintained by Zalando with good community support.
  • Patroni Integration: Uses Patroni for high availability and automatic failover.

Deployment Steps

  1. Add Helm Repository and Install Operator

    helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator
    helm repo update
    
    kubectl create namespace postgres
    helm install postgres-operator postgres-operator-charts/postgres-operator --namespace postgres
    
  2. Apply PostgreSQL Cluster Manifest

    kubectl apply -f minimal-postgres-manifest.yaml
    

PostgreSQL Cluster Manifest

Here’s the minimal-postgres-manifest.yaml:

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: postgres
  namespace: postgres
spec:
  teamId: "acid"
  enableLogicalBackup: true
  numberOfInstances: 2
  users:
    zalando:
    - superuser
    - createdb
    <admin>:
    - createdb
  databases:
    astring_dev: <admin>
  preparedDatabases:
    astring_dev: {}
  postgresql:
    version: "16"
  volume:
    size: "1Gi"

How It Works

  • Operator Handles Logic: The Zalando operator manages the primary-replica setup, failover, and backups using Patroni.
  • Automatic Failover: If the primary node fails, the operator promotes a replica to primary.
  • Scaling: You can adjust numberOfInstances to scale replicas.

Challenges Faced

While Zalando’s operator is excellent, I encountered some issues:

  • Backup Configuration: Difficulty in configuring backups to S3-compatible storage.
  • Documentation Gaps: Limited guidance on restoring from backups and disaster recovery.
  • Customization Limitations: Needed more control over backup schedules and retention policies.

Switching to CrunchyData PostgreSQL Operator

Why CrunchyData?

After facing challenges with Zalando, I explored the CrunchyData PostgreSQL Operator:

  • Advanced Backup Options: Supports full and incremental backups to S3.
  • Comprehensive Documentation: Clear instructions for backup, restore, and disaster recovery.
  • Enhanced Metrics: Provides detailed monitoring for connections, queries, and transactions.
  • Greater Control: More flexibility in configuration and management.

Deployment Steps

  1. Install the Operator

    Follow the CrunchyData installation guide to deploy the operator in your Kubernetes cluster.

  2. Create kustomization.yaml

    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    
    namespace: pgo
    
    secretGenerator:
    - name: pgo-s3-creds
      files:
      - s3.conf
    
    generatorOptions:
      disableNameSuffixHash: true
    
    resources:
    - postgres.yaml
    
  3. Create s3.conf

    [global]
    repo1-s3-key=<key>
    repo1-s3-key-secret=<secret>
    
  4. Create postgres.yaml

    apiVersion: postgres-operator.crunchydata.com/v1beta1
    kind: PostgresCluster
    metadata:
      name: postgres
    spec:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.3-1
      postgresVersion: 16
      instances:
        - name: instance1
          replicas: 2
          dataVolumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 1Gi
      backups:
        pgbackrest:
          image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.51-1
          configuration:
          - secret:
              name: pgo-s3-creds
          global:
            repo1-retention-full: "14"
            repo1-retention-full-type: time
            repo1-path: /pgbackrest/postgres-operator/postgres/repo1
          repos:
          - name: repo1
            schedules:
              full: "0 1 * * 0"
              differential: "0 1 * * 1-6"
            s3:
              bucket: <backup_repo>
              endpoint: "s3.ap-southeast-1.wasabisys.com"
              region: "ap-southeast-1"
      users:
        - name: postgres
          options: 'SUPERUSER'
        - name: <admin_user>
          databases: [astring-prod, astring-dev]
        - name: <admin_user>
          databases: [warehouse]
    
      patroni:
        dynamicConfiguration:
          postgresql:
            pg_hba:
              - "hostnossl all all all md5"
    
      monitoring:
        pgmonitor:
          exporter:
            image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.6.1-0
    
  5. Apply the Manifests

    kubectl apply -k ./
    

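A note on the pg_hba rule in the manifest above: each entry follows PostgreSQL's type / database / user / address / method format, and "hostnossl all all all md5" accepts password (md5) authentication on unencrypted connections from any host. A stricter, purely illustrative alternative would be:

    # TYPE    DATABASE  USER  ADDRESS      METHOD
    hostssl   all       all   0.0.0.0/0    scram-sha-256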
Key Features Configured

  • Instances: Set up with 2 replicas for high availability.
  • Backups:
    • pgBackRest: Configured for backups to S3-compatible storage.
    • Schedules:
      • Full Backups: Every Sunday at 1 AM.
      • Differential Backups: Monday to Saturday at 1 AM.
    • Retention: Keeps backups for 14 days.
  • Users and Databases:
    • Created users with specific roles and database access.
  • Patroni Configuration:
    • Manages automatic failover and replication.
  • Monitoring:
    • Enabled pgMonitor exporter for detailed metrics.

Benefits Experienced

  • Backup and Restore:
    • Seamless configuration of backups to S3.
    • Ability to restore backups in different clusters.
  • Detailed Metrics:
    • Access to comprehensive monitoring data.
  • Flexibility:
    • More control over cluster settings and behaviors.
  • Documentation:
    • Clear guidance on setup, maintenance, and troubleshooting.

Conclusion

Deploying PostgreSQL on Kubernetes requires careful planning, especially for stateful applications needing persistent storage and high availability. While the Zalando Postgres Operator is user-friendly, it didn’t meet all my requirements for backup and customization.

Switching to the CrunchyData PostgreSQL Operator provided the features and control I needed. With robust backup options, detailed metrics, and excellent documentation, it proved to be the better choice for my deployment.