Stop Paying for Expensive Logging: Self-Hosted ClickHouse on Kubernetes
Centralized logging is crucial for understanding application behavior in Kubernetes, but the cost of storing those logs can quickly spiral out of control. ClickHouse is changing that narrative. It provides a high-performance, cost-effective alternative to traditional logging stacks, delivering lightning-fast queries while significantly reducing your infrastructure overhead.
In this blog post, we’re going to build a high-performance logging pipeline from the ground up. Let’s look at how to:
- Deploy ClickHouse on Minikube for lightning-fast, cost-effective log storage
- Collect logs with Fluent Bit, a lightweight log processor built to handle high-throughput Kubernetes telemetry with minimal CPU overhead
- Integrate Grafana to visualize your logs in real time
Why ClickHouse for Logs?
Before we dive in, let’s understand why ClickHouse is excellent for logging:
- Speed: Queries on billions of rows in seconds
- Compression: 10–100x compression ratios
- Cost-Effective: Lower storage and compute costs
- SQL Interface: No need to learn new query languages
- Scalability: Handles petabytes of data
Let’s dive into setting up ClickHouse on Minikube.
Step-1: Setting up Minikube
minikube start --cpus=4 --memory=8192 --driver=docker
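Once Minikube reports that it is running, a quick sanity check that the node is Ready helps before installing anything else:

minikube status
kubectl get nodes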
Step-2: Deploy ClickHouse
kubectl create namespace clickhouse
helm repo add clickhouse-operator https://docs.altinity.com/clickhouse-operator/
helm repo update
helm install clickhouse-operator clickhouse-operator/altinity-clickhouse-operator \
--namespace clickhouse
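Note that the Helm chart only installs the operator; the ClickHouse server itself is created by the ClickHouseInstallation resource in the next step. It's worth confirming the operator pod is up and its CRDs are registered before applying that resource:

kubectl get pods -n clickhouse
kubectl get crd | grep clickhouse.altinity.com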
Create a file named clickhouse-instance.yaml:
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: clickhouse
  namespace: clickhouse
spec:
  configuration:
    users:
      default/password: "changeme"
      default/networks/ip:
        - "::/0"
    clusters:
      - name: default
        layout:
          shardsCount: 1
          replicasCount: 1
  defaults:
    templates:
      podTemplate: clickhouse-pod
      dataVolumeClaimTemplate: data-volume
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:23.8
              ports:
                - name: http
                  containerPort: 8123
                - name: tcp
                  containerPort: 9000
    volumeClaimTemplates:
      - name: data-volume
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
kubectl apply -f clickhouse-instance.yaml
qvamjak% kubectl get pods -n clickhouse
NAME READY STATUS RESTARTS AGE
chi-clickhouse-default-0-0-0 2/2 Running 0 6h34m
clickhouse-operator-altinity-clickhouse-operator-6bcb844b765srr 2/2 Running 0 6h41m
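Optionally, confirm the HTTP interface on port 8123 answers before wiring up Fluent Bit. The service name below matches the clickhouse-clickhouse hostname used later in the Fluent Bit output, but verify it with kubectl get svc -n clickhouse if your installation is named differently:

kubectl port-forward -n clickhouse svc/clickhouse-clickhouse 8123:8123 &
curl --user default:changeme "http://localhost:8123/?query=SELECT%20version()"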
Step-3: Deploy Log Generator App
apiVersion: v1
kind: Namespace
metadata:
  name: demo-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-generator
  namespace: demo-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: log-generator
  template:
    metadata:
      labels:
        app: log-generator
    spec:
      containers:
        - name: log-generator
          image: busybox
          command: ["/bin/sh"]
          args:
            - -c
            - |
              i=0
              while true; do
                i=$((i+1))
                # Generate structured JSON logs
                echo "{\"level\":\"info\",\"message\":\"Processing request $i\",\"timestamp\":\"$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)\",\"request_id\":\"req-$i\",\"duration_ms\":$((RANDOM % 1000))}"
                # Occasionally generate errors
                if [ $((i % 20)) -eq 0 ]; then
                  echo "{\"level\":\"error\",\"message\":\"Failed to process request\",\"timestamp\":\"$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)\",\"request_id\":\"req-$i\",\"error\":\"Connection timeout\"}"
                fi
                sleep 3
              done
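Apply the manifest (saved here as log-generator.yaml; the file name is up to you) and spot-check that the pods are emitting JSON lines:

kubectl apply -f log-generator.yaml
kubectl logs -n demo-app deployment/log-generator --tail=5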
Step-4: Deploy Fluent Bit
apiVersion: v1
kind: Namespace
metadata:
  name: logging
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit
rules:
  - apiGroups: [""]
    resources:
      - namespaces
      - pods
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit
subjects:
  - kind: ServiceAccount
    name: fluent-bit
    namespace: logging
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush             5
        Daemon            Off
        Log_Level         info
        Parsers_File      parsers.conf

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On

    [FILTER]
        Name              kubernetes
        Match             kube.*
        Kube_URL          https://kubernetes.default.svc:443
        Kube_CA_File      /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File   /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix   kube.var.log.containers.
        Merge_Log         Off
        Keep_Log          On

    [FILTER]
        Name              nest
        Match             kube.*
        Operation         lift
        Nested_under      kubernetes
        Add_prefix        kubernetes_

    [FILTER]
        Name              modify
        Match             kube.*
        Rename            kubernetes_pod_name pod
        Rename            kubernetes_namespace_name namespace
        Rename            kubernetes_container_name container
        Rename            log log_message

    [OUTPUT]
        Name              http
        Match             *
        Host              clickhouse-clickhouse.clickhouse.svc.cluster.local
        Port              8123
        URI               /?query=INSERT%20INTO%20logs.application_logs%20FORMAT%20JSONEachRow
        Format            json_lines
        http_User         default
        http_Passwd       changeme
        Retry_Limit       2

  parsers.conf: |
    [PARSER]
        Name          docker
        Format        json
        Time_Key      time
        Time_Format   %Y-%m-%dT%H:%M:%S.%L%z
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    app: fluent-bit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:2.1
          resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config
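Apply the manifest (saved here as fluent-bit.yaml) and check that the DaemonSet has a running pod on the node. Until the ClickHouse table from the next step exists, the HTTP output will likely report failed inserts, which is expected at this point:

kubectl apply -f fluent-bit.yaml
kubectl get pods -n logging
kubectl logs -n logging daemonset/fluent-bit --tail=20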
Step-5: Create the Table in ClickHouse and Verify the Logs
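The table below lives in a logs database, which a fresh ClickHouse install does not have yet, so create it first (same pod name and password as above):

kubectl exec -n clickhouse chi-clickhouse-default-0-0-0 -- clickhouse-client --password changeme --query "CREATE DATABASE IF NOT EXISTS logs"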
kubectl exec -n clickhouse chi-clickhouse-default-0-0-0 -- clickhouse-client --password changeme --query "
CREATE TABLE logs.application_logs
(
    timestamp   DateTime DEFAULT now(),
    namespace   String   DEFAULT '',
    pod         String   DEFAULT '',
    container   String   DEFAULT '',
    log_message String   DEFAULT '',
    stream      String   DEFAULT ''
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, namespace, pod)
"
Check the logs:
qvamjak% kubectl exec -n clickhouse chi-clickhouse-default-0-0-0 -- clickhouse-client --password changeme --query "
SELECT
timestamp,
namespace,
pod,
log_message
FROM logs.application_logs
ORDER BY timestamp DESC
LIMIT 10
" --format=PrettyCompact
Defaulted container "clickhouse" out of: clickhouse, clickhouse-log
┌───────────timestamp─┬─namespace─┬─pod──────────────┬─log_message─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 2025-12-24 16:07:04 │ logging │ fluent-bit-58l57 │ [2025/12/24 16:06:59] [ info] [output:http:http.0] clickhouse-clickhouse.clickhouse.svc.cluster.local:8123, HTTP status=200 │
│ 2025-12-24 16:07:04 │ logging │ fluent-bit-58l57 │ [2025/12/24 16:06:59] [ info] [output:http:http.0] clickhouse-clickhouse.clickhouse.svc.cluster.local:8123, HTTP status=200 │
│ 2025-12-24 16:07:04 │ logging │ fluent-bit-58l57 │ [2025/12/24 16:06:59] [ info] [output:http:http.0] clickhouse-clickhouse.clickhouse.svc.cluster.local:8123, HTTP status=200 │
│ 2025-12-24 16:07:04 │ logging │ fluent-bit-58l57 │ [2025/12/24 16:06:59] [ info] [output:http:http.0] clickhouse-clickhouse.clickhouse.svc.cluster.local:8123, HTTP status=200 │
│ 2025-12-24 16:07:04 │ logging │ fluent-bit-58l57 │ [2025/12/24 16:06:59] [ info] [output:http:http.0] clickhouse-clickhouse.clickhouse.svc.cluster.local:8123, HTTP status=200 │
│ 2025-12-24 16:07:04 │ logging │ fluent-bit-58l57 │ [2025/12/24 16:06:59] [ info] [output:http:http.0] clickhouse-clickhouse.clickhouse.svc.cluster.local:8123, HTTP status=200 │
└─────────────────────┴───────────┴──────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌───────────timestamp─┬─namespace───┬─pod─────────────────┬─log_message───────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 2025-12-24 16:07:04 │ kube-system │ storage-provisioner │ W1224 16:07:02.874294 1 warnings.go:70] v1 Endpoints is deprecated in v1.33+; use discovery.k8s.io/v1 EndpointSlice │
│ 2025-12-24 16:07:04 │ kube-system │ storage-provisioner │ W1224 16:07:02.856919 1 warnings.go:70] v1 Endpoints is deprecated in v1.33+; use discovery.k8s.io/v1 EndpointSlice │
│ 2025-12-24 16:07:04 │ kube-system │ storage-provisioner │ W1224 16:07:00.854029 1 warnings.go:70] v1 Endpoints is deprecated in v1.33+; use discovery.k8s.io/v1 EndpointSlice │
│ 2025-12-24 16:07:04 │ kube-system │ storage-provisioner │ W1224 16:07:00.841958 1 warnings.go:70] v1 Endpoints is deprecated in v1.33+; use discovery.k8s.io/v1 EndpointSlice │
└─────────────────────┴─────────────┴─────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
qvamjak%
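Since log_message still contains the raw JSON emitted by the demo app, you can already slice it with ClickHouse's JSON functions. For example, a rough breakdown of log levels for the demo-app namespace (JSONExtractString returns an empty string for lines that are not JSON or lack the key):

kubectl exec -n clickhouse chi-clickhouse-default-0-0-0 -- clickhouse-client --password changeme --query "
SELECT
    JSONExtractString(log_message, 'level') AS level,
    count() AS events
FROM logs.application_logs
WHERE namespace = 'demo-app'
GROUP BY level
ORDER BY events DESC
" --format=PrettyCompact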
Step-6: Set up Grafana
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          ports:
            - containerPort: 3000
              name: http
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: admin
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: admin
            - name: GF_INSTALL_PLUGINS
              value: grafana-clickhouse-datasource
          volumeMounts:
            - name: grafana-storage
              mountPath: /var/lib/grafana
      volumes:
        - name: grafana-storage
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
    - port: 3000
      targetPort: 3000
      nodePort: 30300
  selector:
    app: grafana
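Apply the manifest (saved here as grafana.yaml) and open the UI. On Minikube the NodePort is easiest to reach through minikube service, which prints a working URL; log in with the admin/admin credentials set in the Deployment (change them for anything beyond a local demo):

kubectl apply -f grafana.yaml
minikube service grafana -n monitoring --url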
Step-7: Configure the ClickHouse Data Source and Visualize with SQL
Configure the ClickHouse data source:
- Click the ☰ menu → Connections → Data sources
- Click Add data source
- Search for ClickHouse and select it
- Configure:
  - Name: ClickHouse Logs
  - Server Address: clickhouse-clickhouse.clickhouse.svc.cluster.local
  - Server Port: 9000
  - Protocol: Native
  - Username: default
  - Password: xxxxxx
  - Default Database: logs
- Click Save & Test
In Grafana Explore, use the following query:
SELECT
timestamp as time,
concat('[', pod, '] ', log_message) as line
FROM logs.application_logs
WHERE namespace = 'demo-app'
AND timestamp > now() - INTERVAL 10 MINUTE
ORDER BY timestamp DESC
LIMIT 500
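When you move this query from Explore into a dashboard panel, it is usually nicer to let it follow the dashboard's time picker instead of a fixed 10-minute window. The ClickHouse data source plugin provides a $__timeFilter macro for that; a sketch using the same table and columns:

SELECT
    timestamp AS time,
    concat('[', pod, '] ', log_message) AS line
FROM logs.application_logs
WHERE namespace = 'demo-app'
  AND $__timeFilter(timestamp)
ORDER BY timestamp DESC
LIMIT 500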
You should now see logs streaming into Grafana in near real time.
No more paying premium prices for third-party logging services. By self-hosting ClickHouse on Kubernetes, you take full control of your observability stack, achieving enterprise-grade performance on infrastructure you own and operate. From collecting logs across your cluster with Fluent Bit to visualizing them in real time through Grafana, high-performance logging has never been more accessible or affordable.