ClickHouse 运维与排障
本文整理 clickhouse_ansible 公开文档中最常用的 ClickHouse 运维命令与排障查询,适合作为日常巡检和验收参考。
1. 建议先设置连接变量
export CH_HOST="192.0.2.131"
export CH_PORT="9000"
export CH_USER="default"
export CH_PASS="<clickhouse_default_password>"
export CH_CLUSTER="example_3shards_2replicas"
如果你习惯使用快捷函数,可定义:
chc() {
clickhouse-client --host "${CH_HOST}" --port "${CH_PORT}" --user "${CH_USER}" --password "${CH_PASS}" "$@"
}
2. 基础登录与版本检查
clickhouse-client --host "${CH_HOST}" --port "${CH_PORT}" --user "${CH_USER}" --password "${CH_PASS}"
SELECT version();
3. 集群健康检查
3.1 集群拓扑
SELECT cluster, shard_num, replica_num, host_name, host_address, port
FROM system.clusters
WHERE cluster = 'example_3shards_2replicas'
ORDER BY shard_num, replica_num;
3.2 副本状态
SELECT database, table, is_leader, is_readonly, queue_size, absolute_delay
FROM system.replicas
ORDER BY database, table;
3.3 协调服务连通性
SELECT count() FROM system.zookeeper WHERE path IN ('/', '/clickhouse');
3.4 所有节点版本
SELECT hostName(), version()
FROM clusterAllReplicas('example_3shards_2replicas', system.one)
ORDER BY hostName();
4. 常用排障 SQL
4.1 当前执行中的查询
SELECT query_id, user, elapsed, read_rows, memory_usage, query
FROM system.processes
ORDER BY elapsed DESC;
4.2 复制队列堆积
SELECT database, table, queue_size, inserts_in_queue, merges_in_queue, log_max_index - log_pointer AS log_lag
FROM system.replicas
ORDER BY queue_size DESC;
4.3 Mutation 状态
SELECT database, table, mutation_id, command, create_time, is_done, latest_failed_part, latest_fail_reason
FROM system.mutations
ORDER BY create_time DESC
LIMIT 20;
4.4 活跃数据分区
SELECT database, table, partition, count() AS parts, sum(rows) AS rows, formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active
GROUP BY database, table, partition
ORDER BY rows DESC
LIMIT 30;
5. 常见系统命令
systemctl status clickhouse-server --no-pager
systemctl status clickhouse-server-2 --no-pager
ss -lntp | egrep ':8123 |:8124 |:9000 |:9001 |:9009 |:9010 '
journalctl -u clickhouse-server -n 200 --no-pager
journalctl -u clickhouse-server-2 -n 200 --no-pager
6. 与备份恢复文档的关系
- 备份流程请看:ClickHouse 备份
- 恢复流程请看:ClickHouse 恢复
- 设计规范请看:ClickHouse 部署、备份与恢复规范