ClickHouse Backup

This article explains how to use the backup-related playbooks in clickhouse_ansible to establish a standardized backup process for ClickHouse clusters.

The commands in this article assume the portable Ansible distribution: setup_portable_ansible.sh has already been run and ~/.bashrc has been sourced, so ansible-playbook can be invoked directly.

1. Scope of application

The backup-related entry points are:

  • playbooks/prepare_backup_disk.yml: configures the backup disk.
  • playbooks/backup_cluster.yml: performs a full or incremental backup.

The examples in this document use NFS as the backup storage by default.

2. Preconditions

  1. The ClickHouse cluster has been deployed.
  2. The NFS server has been configured on 192.168.199.162 or another dedicated host.
  3. The backup nodes can access the backup mount directory, such as /backup.
  4. A dedicated backup inventory is used, such as inventory/hosts.backup.ini.
  5. If you plan to restore at a different site later, it is recommended to mount NFS and prepare the backup disk on the recovery target at the same time.
  6. Objects within the backup scope should preferably use replicated table engines, to avoid data gaps caused by backing up a single replica.

3. Backup inventory minimal example

[clickhouse_backup]
ck-131-1 ansible_host=192.0.2.131 shard=1 replica=1 clickhouse_tcp_port=9000
ck-131-2 ansible_host=192.0.2.131 shard=3 replica=2 clickhouse_tcp_port=9001
ck-132-1 ansible_host=192.0.2.132 shard=1 replica=2 clickhouse_tcp_port=9000
ck-132-2 ansible_host=192.0.2.132 shard=2 replica=1 clickhouse_tcp_port=9001
ck-133-1 ansible_host=192.0.2.133 shard=2 replica=2 clickhouse_tcp_port=9000
ck-133-2 ansible_host=192.0.2.133 shard=3 replica=1 clickhouse_tcp_port=9001

[all:vars]
dbbot_inventory_purpose=backup
ansible_python_interpreter=auto_silent
ansible_user=root
ansible_ssh_pass="'<your_ssh_password>'"

Note: In backup scenarios, it is recommended to set clickhouse_tcp_port explicitly for every host rather than rely on port derivation logic.
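
The recommendation above can be checked mechanically before a run. The following is a minimal sketch, not part of clickhouse_ansible: check_backup_inventory is a hypothetical helper that scans the [clickhouse_backup] group and reports any host line without an explicit shard, replica, or clickhouse_tcp_port. It assumes the simple one-host-per-line INI layout shown in the example.

```shell
# Sketch only: check_backup_inventory is a hypothetical helper, not shipped
# with clickhouse_ansible. It prints hosts in [clickhouse_backup] that lack
# shard=, replica= or clickhouse_tcp_port= and returns 1 if any are found.
check_backup_inventory() {
  # $1 = path to the backup inventory, e.g. ../inventory/hosts.backup.ini
  awk '
    /^\[/ { in_group = ($0 == "[clickhouse_backup]"); next }  # track the current section
    !in_group { next }                                        # ignore other sections
    /^[ \t]*$/ || /^[ \t]*[#;]/ { next }                      # skip blanks and comments
    {
      line = " " $0
      gsub(/\t/, " ", line)                                   # normalise separators
      missing = ""
      if (index(line, " shard=") == 0)               missing = missing " shard"
      if (index(line, " replica=") == 0)             missing = missing " replica"
      if (index(line, " clickhouse_tcp_port=") == 0) missing = missing " clickhouse_tcp_port"
      if (missing != "") { print $1 ": missing" missing; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}

# Example:
#   check_backup_inventory ../inventory/hosts.backup.ini && echo "inventory OK"
```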

4. Key parameters

Edit playbooks/vars/backup_config.yml and confirm the following parameters first:

  • backup_databases / backup_tables
  • backup_mode
  • backup_base_batch_id
  • backup_storage_disk
  • backup_mount_dir
  • backup_checkpoint_mode
  • backup_require_replicated_tables
  • backup_allow_partial_cluster

Additional notes:

  • backup_cluster.yml also checks clickhouse_default_password.
  • If you are still using the public default password Dbbot_default@8888, pre_tasks blocks the run by default. Explicitly setting fcs_allow_dbbot_default_passwd: true is recommended only in test environments.

5. Configure NFS and backup disk for the first time

5.1 Configure NFS server

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.nfs_server.ini \
  setup_nfs_server.yml

By default /srv/nfs/clickhouse_backup is exported on 192.168.199.162.

5.2 Mount NFS on the source cluster

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.backup.ini \
  setup_nfs_client_mount_rc_local.yml

5.3 Write the ClickHouse backup disk configuration on the backup nodes

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.backup.ini \
  prepare_backup_disk.yml \
  -e "backup_storage_disk=backup_nfs backup_mount_dir=/backup"

After the playbook finishes, confirm that the backup disk is visible in system.disks on each instance.
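
One way to perform this confirmation from the shell is sketched below. backup_disk_query and check_backup_disk are hypothetical helper names. The sketch assumes clickhouse-client is installed where you run it, and it leaves out authentication flags such as --user and --password, which you would add for your environment.

```shell
# Sketch only: both functions are hypothetical helpers, not part of clickhouse_ansible.

# Build the lookup query for a given disk name.
backup_disk_query() {
  # $1 = disk name, e.g. backup_nfs
  printf "SELECT name, path, type FROM system.disks WHERE name = '%s'" "$1"
}

# Run the query against one ClickHouse instance and fail if the disk is absent.
check_backup_disk() {
  # $1 = host, $2 = TCP port, $3 = disk name
  local rows
  rows=$(clickhouse-client --host "$1" --port "$2" \
           --query "$(backup_disk_query "$3")") || return 2
  if [ -z "$rows" ]; then
    echo "disk '$3' not found on $1:$2" >&2
    return 1
  fi
  printf '%s\n' "$rows"
}

# Example (repeat for every host and port pair in the backup inventory):
#   check_backup_disk 192.0.2.131 9000 backup_nfs
#   check_backup_disk 192.0.2.131 9001 backup_nfs
```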

5.4 Prepare the DR-side backup disk in advance if you plan to restore to the disaster recovery cluster

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.dr_backup.ini \
  setup_nfs_client_mount_rc_local.yml

ansible-playbook \
  -i ../inventory/hosts.dr_backup.ini \
  prepare_backup_disk.yml \
  -e "backup_storage_disk=backup_nfs backup_mount_dir=/backup"

6. Perform full backup

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.backup.ini \
  backup_cluster.yml \
  -e '{"backup_databases":["biz_db"],"backup_mode":"full"}'

To pin the batch ID to a known value, pass backup_batch_id explicitly:

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.backup.ini \
  backup_cluster.yml \
  -e '{"backup_databases":["biz_db"],"backup_mode":"full","backup_batch_id":"20260306T210000_CST_bk001"}'
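
If you prefer to generate the ID rather than type it by hand, a small helper can produce a string shaped like the example above. This is a sketch under assumptions: make_batch_id is a hypothetical name, and the layout (local timestamp, time-zone abbreviation, three-digit sequence suffix) is inferred from the sample value only, not from the playbook source.

```shell
# Sketch only: make_batch_id is a hypothetical helper. The field layout is
# inferred from the sample ID 20260306T210000_CST_bk001 in this document.
make_batch_id() {
  # $1 = sequence number in plain decimal without leading zeros (default 1)
  printf '%s_bk%03d\n' "$(date +%Y%m%dT%H%M%S_%Z)" "${1:-1}"
}

# Example:
#   batch_id=$(make_batch_id 1)
#   ansible-playbook -i ../inventory/hosts.backup.ini backup_cluster.yml \
#     -e "{\"backup_databases\":[\"biz_db\"],\"backup_mode\":\"full\",\"backup_batch_id\":\"$batch_id\"}"
```

Keeping the generated value in a shell variable also makes it easy to reuse as backup_base_batch_id for a later incremental run.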

7. Perform incremental backup

cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
  -i ../inventory/hosts.backup.ini \
  backup_cluster.yml \
  -e '{"backup_databases":["biz_db"],"backup_mode":"incremental","backup_base_batch_id":"20260306T210000_CST_bk001"}'
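
The value of backup_base_batch_id can be looked up from the manifests kept on the control node. The sketch below uses a hypothetical helper, latest_full_batch_id. It assumes that each manifest carries the batch_id and backup_mode fields queried in section 9.2, that full backups are recorded with backup_mode set to "full", and that batch IDs sort chronologically as plain strings, which holds only while they share one naming convention and time zone.

```shell
# Sketch only: latest_full_batch_id is a hypothetical helper. Requires jq.
# It prints the batch_id of the newest manifest whose backup_mode is "full".
latest_full_batch_id() {
  # $1 = manifest directory on the control node, e.g. artifacts/manifests/backup
  jq -r 'select(.backup_mode == "full") | .batch_id' "$1"/*.json | sort | tail -n 1
}

# Example:
#   base=$(latest_full_batch_id artifacts/manifests/backup)
#   echo "incremental backup will be based on: $base"
```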

8. Backup artifacts and time reference

After successful execution, the backup process typically outputs the following information:

  • backup_batch_id
  • safe_ts
  • manifest path

By default, the manifest is written to two locations:

  • Backup directory: /backup/<cluster>/<batch_id>/manifest/manifest.json
  • Control node: artifacts/manifests/backup/<batch_id>.json

Of these, safe_ts can serve as the time reference for any later business-side data backfill or data replay.

9. Validation before and after backup

9.1 Replica health check

SELECT database, table, is_readonly, queue_size, absolute_delay
FROM system.replicas
ORDER BY database, table;
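
To turn the query above into a pass or fail check, its tab-separated output can be piped through a small filter. This is a sketch: flag_unhealthy_replicas is a hypothetical helper, and the default thresholds (queue_size above 20, absolute_delay above 60 seconds) are illustrative assumptions to be tuned per cluster, not values defined by clickhouse_ansible.

```shell
# Sketch only: flag_unhealthy_replicas is a hypothetical helper.
# Input: tab-separated rows of database, table, is_readonly, queue_size, absolute_delay.
# Output: one line per suspicious replica; returns 1 if any row is flagged.
flag_unhealthy_replicas() {
  # $1 = max queue_size (default 20), $2 = max absolute_delay in seconds (default 60)
  awk -F '\t' -v max_queue="${1:-20}" -v max_delay="${2:-60}" '
    $3 == 1 || $4 > max_queue || $5 > max_delay {
      print $1 "." $2 ": is_readonly=" $3 " queue_size=" $4 " absolute_delay=" $5
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example:
#   clickhouse-client --port 9000 --format TSV --query \
#     "SELECT database, table, is_readonly, queue_size, absolute_delay FROM system.replicas" \
#     | flag_unhealthy_replicas
```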

9.2 Check backup results

jq '{batch_id, cluster_name, backup_mode, safe_ts, results}' \
  /backup/<cluster>/<batch_id>/manifest/manifest.json
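
Because the manifest is kept both in the backup directory and on the control node, the two copies can also be compared. The sketch below uses a hypothetical helper, compare_manifests. It assumes that both copies are meant to be byte-identical, which this document does not confirm, and that it runs on a host that can read both paths.

```shell
# Sketch only: compare_manifests is a hypothetical helper.
# Returns 0 if both files exist, are non-empty, and are byte-identical;
# 2 if either file is missing or empty; 1 if the contents differ.
compare_manifests() {
  # $1 = manifest in the backup directory, $2 = copy on the control node
  local f
  for f in "$1" "$2"; do
    [ -s "$f" ] || { echo "missing or empty: $f" >&2; return 2; }
  done
  cmp -s "$1" "$2" || { echo "manifest copies differ" >&2; return 1; }
}

# Example (substitute your own cluster name and batch ID):
#   compare_manifests "/backup/$cluster/$batch_id/manifest/manifest.json" \
#                     "artifacts/manifests/backup/$batch_id.json"
```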

10. Risks and Recommendations

  1. By default, only one replica of each shard is selected for physical backup, to reduce duplicate I/O.
  2. If the backup scope contains non-replicated local tables, it is recommended to fix this before running production backups.
  3. Running the backup playbooks in --check mode is not recommended.
  4. In production, it is recommended to use batch IDs that follow a fixed naming convention, to make auditing and recovery easier.
  5. Do not wait for a real disaster before mounting NFS and running prepare_backup_disk.yml on the recovery target for the first time. It is recommended to make both a routine part of recovery drills.