When I decided to deploy all my services on Kubernetes, one of the critical components was the PostgreSQL database. Deploying databases on Kubernetes can be challenging due to their stateful nature. In this post, I’ll share how I deployed PostgreSQL on Kubernetes, the challenges I faced, and why I chose the CrunchyData PostgreSQL Operator.
Understanding Stateful vs. Stateless Applications
Before diving into deployment, it’s essential to understand the difference between stateful and stateless applications.
- Stateless Applications: Do not store data or state between sessions. Each request is independent, and the application doesn’t need to remember previous interactions.
- Stateful Applications: Maintain state across sessions. Databases like PostgreSQL are stateful because they need to store and retrieve data reliably.
Kubernetes was initially designed for stateless applications, but over time, support for stateful workloads has improved with features like StatefulSets and PersistentVolumes.
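To make that concrete, here is a minimal, purely illustrative StatefulSet (not part of my actual setup): each replica gets a stable identity and its own PersistentVolume through volumeClaimTemplates, which is the kind of primitive the operators discussed below build on. All names and sizes are placeholders.

# Illustrative only: a single-replica StatefulSet with per-pod storage.
# Assumes a headless Service named demo-db exists for stable DNS.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo-db
spec:
  serviceName: demo-db
  replicas: 1
  selector:
    matchLabels:
      app: demo-db
  template:
    metadata:
      labels:
        app: demo-db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: change-me   # placeholder; use a Secret in practice
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi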
PostgreSQL Architecture Overview
PostgreSQL typically follows a primary-replica architecture:
- Primary Node: Handles all write operations and can also serve reads.
- Replica Nodes: Handle read operations and replicate data from the primary node.
Replication can be configured as:
- Synchronous Replication: The primary waits for confirmation from replicas before completing a transaction. This ensures data consistency but can impact performance.
- Asynchronous Replication: The primary doesn’t wait for replicas, which improves performance but may risk data loss if the primary fails before replication.
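Outside Kubernetes, the switch between the two modes comes down to a couple of settings on the primary. A rough sketch of the relevant postgresql.conf lines (the standby names are placeholders):

# postgresql.conf on the primary (illustrative values)

# Asynchronous (default): commits return once WAL is flushed locally
synchronous_commit = on
synchronous_standby_names = ''

# Synchronous: wait for at least one named standby to confirm before commit
# synchronous_standby_names = 'FIRST 1 (replica1, replica2)'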
Handling Failover
In production environments, it’s crucial to handle scenarios where the primary node fails:
- Automatic Failover: A replica is promoted to primary automatically.
- Monitoring and Orchestration: Tools are needed to monitor the cluster and manage failover processes.
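Patroni, which the operators discussed below rely on, covers both: it watches the cluster and promotes a replica when the primary disappears. For reference, a switchover can also be driven manually from its CLI; the config path and member name here are assumptions:

# Inspect cluster state and trigger a failover with Patroni's CLI
patronictl -c /etc/patroni.yml list
patronictl -c /etc/patroni.yml failover --candidate replica-1

# On a plain streaming-replication standby (no Patroni), promotion is one call:
#   SELECT pg_promote();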
Deploying PostgreSQL on Kubernetes
To manage PostgreSQL clusters on Kubernetes effectively, several operators have been developed:
- Zalando Postgres Operator
- CrunchyData PostgreSQL Operator
- CloudNativePG
- And more
These operators automate tasks like deployment, scaling, backups, and failover.
Using the Zalando Postgres Operator
Why Zalando?
I initially chose the Zalando Postgres Operator because:
- Simplicity: It’s straightforward to deploy.
- Active Community: It’s maintained by Zalando with good community support.
- Patroni Integration: Uses Patroni for high availability and automatic failover.
Deployment Steps
- Add Helm Repository and Install Operator

helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator
kubectl create namespace postgres
helm install postgres-operator postgres-operator-charts/postgres-operator --namespace postgres

- Apply PostgreSQL Cluster Manifest

kubectl apply -f minimal-postgres-manifest.yaml
PostgreSQL Cluster Manifest
Here’s the minimal-postgres-manifest.yaml:
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: postgres
namespace: postgres
spec:
teamId: "acid"
enableLogicalBackup: true
numberOfInstances: 2
users:
zalando:
- superuser
- createdb
<admin>:
- createdb
databases:
astring_dev: <admin>
preparedDatabases:
astring_dev: {}
postgresql:
version: "16"
volume:
size: "1Gi"
How It Works
- Operator Handles Logic: The Zalando operator manages the primary-replica setup, failover, and backups using Patroni.
- Automatic Failover: If the primary node fails, the operator promotes a replica to primary.
- Scaling: You can adjust numberOfInstances to scale replicas.
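Scaling is just a spec change; this sketch assumes the cluster name and namespace from the manifest above (note that numberOfInstances counts the primary as well):

kubectl -n postgres patch postgresql postgres --type merge \
  -p '{"spec": {"numberOfInstances": 3}}'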
Challenges Faced
While Zalando’s operator is excellent, I encountered some issues:
- Backup Configuration: Difficulty in configuring backups to S3-compatible storage.
- Documentation Gaps: Limited guidance on restoring from backups and disaster recovery.
- Customization Limitations: Needed more control over backup schedules and retention policies.
Switching to CrunchyData PostgreSQL Operator
Why CrunchyData?
After facing challenges with Zalando, I explored the CrunchyData PostgreSQL Operator:
- Advanced Backup Options: Supports full and incremental backups to S3.
- Comprehensive Documentation: Clear instructions for backup, restore, and disaster recovery.
- Enhanced Metrics: Provides detailed monitoring for connections, queries, and transactions.
- Greater Control: More flexibility in configuration and management.
Deployment Steps
- Install the Operator

Follow the CrunchyData installation guide to deploy the operator in your Kubernetes cluster.
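For reference, at the time of writing the guide's kustomize-based install looks roughly like this; the repository layout can change between releases, so treat it as a sketch and follow the official docs:

git clone https://github.com/CrunchyData/postgres-operator-examples.git
cd postgres-operator-examples

kubectl apply -k kustomize/install/namespace
kubectl apply --server-side -k kustomize/install/default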
- Create kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: pgo
secretGenerator:
  - name: pgo-s3-creds
    files:
      - s3.conf
generatorOptions:
  disableNameSuffixHash: true
resources:
  - postgres.yaml
- Create s3.conf

[global]
repo1-s3-key=<key>
repo1-s3-key-secret=<secret>
- Create postgres.yaml

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: postgres
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.3-1
  postgresVersion: 16
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.51-1
      configuration:
        - secret:
            name: pgo-s3-creds
      global:
        repo1-retention-full: "14"
        repo1-retention-full-type: time
        repo1-path: /pgbackrest/postgres-operator/postgres/repo1
      repos:
        - name: repo1
          schedules:
            full: "0 1 * * 0"
            differential: "0 1 * * 1-6"
          s3:
            bucket: <backup_repo>
            endpoint: "s3.ap-southeast-1.wasabisys.com"
            region: "ap-southeast-1"
  users:
    - name: postgres
      options: 'SUPERUSER'
    - name: <admin_user>
      databases: [astring-prod, astring-dev]
    - name: <admin_user>
      databases: [warehouse]
  patroni:
    dynamicConfiguration:
      postgresql:
        pg_hba:
          - "hostnossl all all all md5"
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.6.1-0
- Apply These Resources

kubectl apply -k ./
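After applying, you can watch the cluster come up; the label and Secret names below assume the operator's default conventions (credentials land in a <cluster>-pguser-<user> Secret):

kubectl -n pgo get postgrescluster postgres
kubectl -n pgo get pods -l postgres-operator.crunchydata.com/cluster=postgres

# Connection details for each user are written to a Secret per user
kubectl -n pgo get secret postgres-pguser-postgres -o yaml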
Key Features Configured
- Instances: Set up with 2 replicas for high availability.
- Backups:
  - pgBackRest: Configured for backups to S3-compatible storage.
  - Schedules:
    - Full Backups: Every Sunday at 1 AM.
    - Differential Backups: Monday to Saturday at 1 AM.
  - Retention: Keeps backups for 14 days.
- Users and Databases:
  - Created users with specific roles and database access.
- Patroni Configuration:
  - Manages automatic failover and replication.
- Monitoring:
  - Enabled the pgMonitor exporter for detailed metrics.
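To sanity-check the exporter, you can port-forward an instance pod and scrape it directly; the pod name below is a placeholder, and the port assumes the usual postgres_exporter default of 9187:

kubectl -n pgo port-forward pod/postgres-instance1-xxxx-0 9187:9187
curl -s localhost:9187/metrics | grep pg_stat_database_xact_commit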
Benefits Experienced
- Backup and Restore:
  - Seamless configuration of backups to S3.
  - Ability to restore backups into a different cluster (sketched below).
- Detailed Metrics:
  - Access to comprehensive monitoring data.
- Flexibility:
  - More control over cluster settings and behaviors.
- Documentation:
  - Clear guidance on setup, maintenance, and troubleshooting.
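The restore-into-another-cluster workflow mentioned above works by pointing a new PostgresCluster at the existing pgBackRest repository through a data source. Here is a rough sketch based on the repo settings used earlier; verify the exact field names against the operator version you run:

spec:
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
        - secret:
            name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/postgres-operator/postgres/repo1
      repo:
        name: repo1
        s3:
          bucket: <backup_repo>
          endpoint: "s3.ap-southeast-1.wasabisys.com"
          region: "ap-southeast-1"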
Conclusion
Deploying PostgreSQL on Kubernetes requires careful planning, especially for stateful applications needing persistent storage and high availability. While the Zalando Postgres Operator is user-friendly, it didn’t meet all my requirements for backup and customization.
Switching to the CrunchyData PostgreSQL Operator provided the features and control I needed. With robust backup options, detailed metrics, and excellent documentation, it proved to be the better choice for my deployment.