Kubernetes Cluster Setup on AWS EKS
A step-by-step guide to creating a production-ready Kubernetes cluster on AWS EKS using eksctl, setting up core components (NGINX Ingress, Cert Manager, NATS, PostgreSQL), and configuring secrets and environment-specific configuration for application services.
This walkthrough builds on the high-level overview from Deploying microservices into a Kubernetes cloud.
Prerequisites
Install and configure:
- AWS CLI
- eksctl
- kubectl
- Helm
- AWS account with permissions to create EKS clusters, IAM roles, and policies
- (Optional) aws-iam-authenticator, if your environment requires it
Ensure you’re authenticated:
aws configure
aws sts get-caller-identity
Helm Repo Setup
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add jetstack https://charts.jetstack.io
helm repo add nats https://nats-io.github.io/k8s/helm/charts/
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
Step-by-Step Setup
1. Create EKS Cluster
eksctl create cluster \
--name xoxo-v1 \
--region us-east-2 \
--nodegroup-name standard-workers \
--node-type t3.medium \
--nodes 2 \
--nodes-min 1 \
--nodes-max 4 \
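--managed
What this does
Creates an EKS cluster in us-east-2 with a managed node group that starts at 2 nodes and can be sized between 1 and 4. The min/max values only set the bounds; automatic scaling still needs the Cluster Autoscaler (see Monitoring & Logging) or a similar tool.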
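The same cluster can also be described declaratively, which is easier to review and keep in version control than a long flag list. The following is a minimal sketch, assuming eksctl's ClusterConfig schema (eksctl.io/v1alpha5); the file name cluster.yaml is an arbitrary choice, and the values simply mirror the flags above.

```shell
# Write an eksctl ClusterConfig that mirrors the flags used above.
# cluster.yaml is a hypothetical file name; adjust values to your environment.
cat > cluster.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: xoxo-v1
  region: us-east-2
managedNodeGroups:
  - name: standard-workers
    instanceType: t3.medium
    desiredCapacity: 2   # same as --nodes
    minSize: 1           # same as --nodes-min
    maxSize: 4           # same as --nodes-max
EOF
echo "wrote cluster.yaml"
```

If this fits your workflow, the cluster would then be created with eksctl create cluster -f cluster.yaml instead of the flag-based command.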
2. Add Cluster to kubectl Context
aws eks update-kubeconfig --region us-east-2 --name xoxo-v1
aws eks describe-cluster --name xoxo-v1 --region us-east-2
kubectl get nodes -o wide
What this does
Adds the new EKS cluster to your kubeconfig and verifies connectivity.
3. Install NGINX Ingress Controller
helm install nginx-ingress ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.publishService.enabled=true
kubectl get svc -n ingress-nginx
kubectl create ns backend
What this does
Installs the NGINX Ingress Controller and creates the backend namespace for your application services.
4. Install Cert Manager
Install CRDs:
kubectl apply --validate=false \
-f https://github.com/cert-manager/cert-manager/releases/download/v1.14.3/cert-manager.crds.yaml
Install Cert Manager (pin the same version as the CRDs):
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.14.3
Apply your ClusterIssuer (example in Appendix):
kubectl apply -f cluster-issuer.yaml
What this does
Installs Cert Manager to automatically provision and renew TLS certificates (e.g., via Let’s Encrypt).
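Once a ClusterIssuer exists, cert-manager can issue certificates for an Ingress that references it. The following is a minimal sketch, assuming the letsencrypt-prod issuer from the Appendix and the backend namespace created in step 3. The host api.example.com, the Service name api, port 80, and the file name api-ingress.yaml are hypothetical placeholders, not values from this setup.

```shell
# Write an example Ingress that requests a TLS certificate from cert-manager.
# Hostname, Service name, and port are placeholders for illustration only.
cat > api-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api
  namespace: backend
  annotations:
    # Ask cert-manager to issue a certificate using the ClusterIssuer from the Appendix
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com            # hypothetical hostname
      secretName: api-example-com-tls  # cert-manager stores the issued certificate here
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api            # hypothetical Service
                port:
                  number: 80
EOF
echo "wrote api-ingress.yaml"
```

After kubectl apply -f api-ingress.yaml, kubectl get certificate -n backend should show whether issuance succeeded. The hostname must resolve to the ingress load balancer for the HTTP-01 challenge to pass.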
5. Install NATS Messaging Queue
helm install nats nats/nats \
--namespace nats \
--create-namespace
kubectl get pods,svc -n nats
kubectl exec -n nats -it nats-0 -- nslookup nats.nats.svc.cluster.local
What this does
Deploys NATS for inter-service messaging and validates cluster DNS.
6. Install PostgreSQL (EBS CSI + Bitnami)
6.1 EBS CSI Driver (for persistent volumes)
Create IAM role for the EBS CSI driver and attach policy:
aws iam create-role \
--role-name AmazonEKS_EBS_CSI_DriverRole \
--assume-role-policy-document file://trust-policy.json
aws iam attach-role-policy \
--role-name AmazonEKS_EBS_CSI_DriverRole \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
Install the addon (IRSA role ARN will vary by account):
eksctl create addon --name aws-ebs-csi-driver \
--cluster xoxo-v1 \
--region us-east-2 \
--service-account-role-arn arn:aws:iam::<YOUR_ACCOUNT_ID>:role/AmazonEKS_EBS_CSI_DriverRole
6.2 Install PostgreSQL (Bitnami)
kubectl create namespace postgres
helm install pgdb bitnami/postgresql \
--namespace postgres \
--values postgresdb-values.yaml
kubectl get pods,svc -n postgres
kubectl exec -n nats -it nats-0 -- nslookup pgdb-postgresql.postgres.svc.cluster.local
What this does
Deploys PostgreSQL using the Bitnami Helm chart with persistent volumes via EBS.
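Step 6.1 passes trust-policy.json to aws iam create-role without showing its contents. The following is a sketch of what that file typically looks like for an IRSA role bound to the EBS CSI controller's service account (ebs-csi-controller-sa in kube-system). The account ID and OIDC provider ID below are placeholders, and the environment variable names are chosen only for this example.

```shell
# Placeholders: replace with your account ID and the cluster's OIDC provider ID.
# The issuer URL (which ends in the ID) can be read with:
#   aws eks describe-cluster --name xoxo-v1 --region us-east-2 \
#     --query "cluster.identity.oidc.issuer" --output text
ACCOUNT_ID="${ACCOUNT_ID:-111122223333}"
REGION="${REGION:-us-east-2}"
OIDC_ID="${OIDC_ID:-EXAMPLED539D4633E53DE1B71EXAMPLE}"
OIDC_PROVIDER="oidc.eks.${REGION}.amazonaws.com/id/${OIDC_ID}"

# Trust policy: only the EBS CSI controller's service account may assume the role.
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com",
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa"
        }
      }
    }
  ]
}
EOF
echo "wrote trust-policy.json"
```

This only works if the cluster has an IAM OIDC provider associated with it. If it does not, eksctl utils associate-iam-oidc-provider --cluster xoxo-v1 --region us-east-2 --approve creates one.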
7. Access PostgreSQL
Internal access (from a temporary pod):
export POSTGRES_PASSWORD=$(kubectl get secret --namespace postgres pgdb-postgresql \
-o jsonpath="{.data.postgres-password}" | base64 -d)
kubectl run pgdb-postgresql-client --rm --tty -i --restart='Never' \
--namespace postgres \
--image docker.io/bitnami/postgresql:17.6.0-debian-12-r0 \
--env="PGPASSWORD=$POSTGRES_PASSWORD" \
--command -- psql --host pgdb-postgresql -U postgres -d backend -p 5432
External access (via local port-forward):
# forward Service in postgres namespace to localhost:5433
kubectl port-forward -n postgres svc/pgdb-postgresql 5433:5432 &
# then connect locally using psql (reuses POSTGRES_PASSWORD exported above)
PGPASSWORD="$POSTGRES_PASSWORD" psql --host 127.0.0.1 -U postgres -d backend -p 5433
8. Add Docker Registry Secret & Apply Configs
⚠️ Never hardcode tokens in scripts or repos. Use env vars or secret managers.
# Export credentials securely (replace values accordingly)
export DOCKER_USERNAME="criyadevops"
export DOCKER_PASSWORD="<your_docker_access_token>"
export DOCKER_EMAIL="devops@criya.co"
kubectl create secret docker-registry docker-reg -n backend \
--docker-username="$DOCKER_USERNAME" \
--docker-password="$DOCKER_PASSWORD" \
--docker-email="$DOCKER_EMAIL"
To use this secret, add the following to your Deployments:
spec:
  template:
    spec:
      imagePullSecrets:
        - name: docker-reg
Apply environment configs and secrets (adjust file names as needed):
kubectl apply -f staging-configmap.yaml
./staging-redeploy-secrets.sh
Security Recommendations
- Use AWS Secrets Manager or External Secrets Operator (ESO)
  Store app secrets outside the cluster and sync them into Kubernetes:
  - ESO example (sample manifest in Appendix)
  - Avoid kubectl create secret ... for long-lived credentials.
- Enable IRSA (IAM Roles for Service Accounts)
  Grant the minimum AWS permissions to the specific pods that need them:
  - Create a fine-grained IAM policy.
  - Create/annotate a K8s service account with the IAM role ARN.
  - Reference that service account in your Deployment.
- TLS Everywhere
  Use Cert Manager with Let’s Encrypt (HTTP-01 or DNS-01) and ensure all Ingress objects have TLS configured.
- Network Policies
  Restrict traffic between namespaces and workloads (e.g., only app pods can talk to PostgreSQL/NATS).
- RBAC & Least Privilege
  Provide minimal Kubernetes permissions to CI/CD and developers.
- Pod Security (PSA)
  Enforce the restricted Pod Security Standard (or at least baseline) via namespace labels, e.g., to disallow privileged pods.
- Encrypt at Rest
  Enable EBS volume encryption by default (KMS keys if required).
- Audit & Control Plane Logs
  Enable EKS control plane logging (API, audit, authenticator).
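As an illustration of the Network Policies recommendation, the sketch below writes a policy that lets only pods in the backend namespace reach PostgreSQL on port 5432. It assumes the PostgreSQL pods carry the label app.kubernetes.io/name: postgresql (check with kubectl get pods -n postgres --show-labels) and that the cluster's CNI actually enforces NetworkPolicy, which on EKS generally means enabling network policy support in the VPC CNI or installing a policy engine such as Calico. The policy and file names are hypothetical.

```shell
# Write a NetworkPolicy restricting PostgreSQL ingress to the backend namespace.
# The pod label below is an assumption; verify it against your running pods.
cat > postgres-allow-backend.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-to-postgres
  namespace: postgres
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: postgresql
  policyTypes:
    - Ingress
  ingress:
    - from:
        # kubernetes.io/metadata.name is set automatically on every namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: backend
      ports:
        - protocol: TCP
          port: 5432
EOF
echo "wrote postgres-allow-backend.yaml"
```

Once applied, this also blocks the temporary psql client pod from step 7 unless it runs in the backend namespace. A similar policy in the nats namespace would cover NATS.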
Monitoring & Logging
- Metrics (Prometheus + Grafana)
  Install the kube-prometheus-stack:
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo update
  helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
- Logging (CloudWatch / Fluent Bit)
  Enable CloudWatch Container Insights or deploy Fluent Bit to ship logs to CloudWatch/ELK/DataDog.
- Horizontal/Vertical Scaling
  Install metrics-server for HPA and consider VPA for right-sizing:
  kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
- Cluster Autoscaler (optional)
  Adds and removes nodes within the node group's min/max bounds as pods become unschedulable or nodes sit idle. It needs IAM permissions to modify the node group's Auto Scaling group. Download the AWS auto-discovery example manifest, replace the <YOUR CLUSTER NAME> placeholder, then apply it:
  curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
  kubectl apply -f cluster-autoscaler-autodiscover.yaml
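With metrics-server installed, a HorizontalPodAutoscaler can scale a Deployment on CPU utilization. The following is a minimal sketch using the autoscaling/v2 API; the Deployment name api, the replica bounds, and the 70% target are hypothetical values chosen for illustration, and the target pods must declare CPU requests for utilization to be computed.

```shell
# Write an example HPA for a hypothetical Deployment named "api" in the backend namespace.
cat > api-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api              # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the pods' CPU requests
EOF
echo "wrote api-hpa.yaml"
```

After kubectl apply -f api-hpa.yaml, kubectl get hpa -n backend shows current versus target utilization once metrics are available.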
Best Practices
- Pin chart versions for reproducibility.
- Separate namespaces by concern: ingress, cert-manager, nats, postgres, backend.
- Use values.yaml files per environment (dev/staging/prod).
- Set resource requests/limits on all workloads.
- Define readiness/liveness probes on app pods.
- Backups: Schedule PostgreSQL backups (e.g., pgBackRest) and consider Velero for cluster backup.
- Cost: Right-size nodes, enable autoscaler, use spot where appropriate (with interruption handling).
Cleanup
# Delete the EKS cluster and all managed resources
eksctl delete cluster --name xoxo-v1 --region us-east-2
# (Optional) Detach & delete EBS CSI role
aws iam detach-role-policy \
--role-name AmazonEKS_EBS_CSI_DriverRole \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
aws iam delete-role --role-name AmazonEKS_EBS_CSI_DriverRole
Note: Deleting the cluster can leave EBS volumes behind (for example, volumes whose PersistentVolumes use a Retain reclaim policy), so check for leftover volumes afterwards to avoid orphaned costs.
Troubleshooting
- Ingress not exposing IP/DNS
  Check controller logs: kubectl logs -n ingress-nginx -l app.kubernetes.io/component=controller
- Certificates stuck in Pending
  Run kubectl describe certificate -A and kubectl describe challenge -A to look for ACME issues.
- DNS issues
  Use a busybox pod: nslookup <service>.<namespace>.svc.cluster.local
- PersistentVolumeClaims pending
  Verify the EBS CSI addon and StorageClass.
Appendix
A. Sample cluster-issuer.yaml (Let’s Encrypt HTTP-01)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: devops@criya.co
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    solvers:
      - http01:
          ingress:
            class: nginx
B. Sample postgresdb-values.yaml (Bitnami)
global:
  postgresql:
    auth:
      username: postgres
      database: backend
      existingSecret: "" # Prefer external/ESO secrets; else leave empty to auto-generate
primary:
  persistence:
    enabled: true
    size: 20Gi
    storageClass: gp3 # must match an existing StorageClass (kubectl get storageclass)
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 1Gi
C. External Secrets Operator (optional)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: backend-env
  namespace: backend
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: backend-env
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: /prod/backend/DATABASE_URL
FAQ
- What does this AWS EKS cluster setup guide walk me through?
- It shows how to create an EKS cluster with eksctl, configure kubectl, install NGINX Ingress, Cert Manager, NATS, and PostgreSQL, and wire up DNS and TLS so you have a realistic environment for running backend services.
- Is this EKS configuration meant for production use or just local experiments?
- The guide aims for a production-leaning setup with managed node groups, persistent storage, ingress, TLS, and messaging, but you should still adapt security, backup, and cost settings to your own organization's requirements before using it in a real production environment.

