ClickHouse Restore
This article explains how to restore a backup batch to a target ClickHouse cluster using the restore playbook in clickhouse_ansible.
The commands in this article assume the portable Ansible distribution: run setup_portable_ansible.sh and source ~/.bashrc first, after which ansible-playbook can be used directly.
1. Overview of recovery principles
By default, the recovery process takes the backup manifest as input and follows the strategy of "full restore on one primary replica per shard, structure-only recovery on the rest":
- Select one primary replica per shard and perform a full restore on it.
- The other replicas perform structure_only recovery to rebuild the table schema and replication metadata.
- The ClickHouse replication mechanism then completes the remaining data synchronization.
This reduces the duplicate I/O that would be caused by every replica importing the full data from the backup disk at the same time.
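As an illustration only, the sketch below picks the replica=1 host of each shard from a restore inventory in the format shown in section 3, as one plausible choice of full-restore "primary"; the playbook's actual selection logic may differ.

```shell
# Illustration only: list the replica=1 host of each shard from a restore
# inventory as a plausible full-restore "primary". The playbook's real
# selection logic may differ.
pick_primaries() {
  awk '/shard=/ {
    host = $1; shard = ""; replica = ""
    for (i = 2; i <= NF; i++) {
      split($i, kv, "=")
      if (kv[1] == "shard")   shard   = kv[2]
      if (kv[1] == "replica") replica = kv[2]
    }
    if (replica == "1") print "shard " shard " primary: " host
  }' "$1"
}
```

For example, run `pick_primaries ../inventory/hosts.restore.ini` to see one full-restore candidate per shard.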
2. Preconditions
- A backup batch and manifest file already exist.
- The target cluster has its NFS mount in place and the backup disk, such as backup_nfs, is correctly configured.
- All target nodes can access the backup mount directory, such as /backup.
- Use a dedicated restore inventory, such as inventory/hosts.restore.ini.
- If the target is disaster recovery of the original cluster, stop business writes and clean up the affected objects first.
- In the default drill environment, the recovery targets are 192.168.199.141/142/143; backup disk preparation uses inventory/hosts.dr_backup.ini and recovery execution uses inventory/hosts.restore.ini.
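Before running the playbook, it can be worth verifying that the mount actually exposes the batch manifest. A minimal sketch, assuming a `<mount>/<batch_id>/manifest.json` layout (the real manifest path may differ; check your backup batch directory):

```shell
# Sketch only: verify the backup mount exposes the batch manifest.
# The layout <mount>/<batch_id>/manifest.json is an assumption; adjust it
# to match your actual backup batch directory structure.
check_manifest() {
  manifest="$1/$2/manifest.json"
  if [ -r "$manifest" ]; then
    echo "manifest found: $manifest"
  else
    echo "manifest missing: $manifest"
  fi
}

check_manifest /backup 20260306T210000_CST_bk001
```

To check every node at once, the same logic can be run via an ad-hoc `ansible ... -m shell` task against the restore inventory.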
3. Minimal example of restoring inventory
[clickhouse_restore]
ck-141-1 ansible_host=198.51.100.141 shard=1 replica=1 clickhouse_tcp_port=9000
ck-141-2 ansible_host=198.51.100.141 shard=3 replica=2 clickhouse_tcp_port=9001
ck-142-1 ansible_host=198.51.100.142 shard=1 replica=2 clickhouse_tcp_port=9000
ck-142-2 ansible_host=198.51.100.142 shard=2 replica=1 clickhouse_tcp_port=9001
ck-143-1 ansible_host=198.51.100.143 shard=2 replica=2 clickhouse_tcp_port=9000
ck-143-2 ansible_host=198.51.100.143 shard=3 replica=1 clickhouse_tcp_port=9001
[all:vars]
dbbot_inventory_purpose=restore
ansible_python_interpreter=auto_silent
ansible_user=root
ansible_ssh_pass="'<your_ssh_password>'"
4. Key parameters
Edit playbooks/vars/restore_config.yml and confirm:
- restore_batch_id
- restore_manifest_file
- restore_to_all_replicas
- restore_allow_non_empty_tables
- restore_require_manual_confirm
- restore_enable_two_phase_mv_compat
- restore_allow_partial_cluster
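A hypothetical example of what these settings might look like in playbooks/vars/restore_config.yml; the values and comments are illustrative, and the defaults in your copy may differ:

```yaml
# Illustrative values only -- confirm against your actual restore_config.yml
restore_batch_id: "20260306T210000_CST_bk001"
restore_manifest_file: ""                  # empty: assumed to be derived from the batch id
restore_to_all_replicas: true              # full restore on primaries, structure_only elsewhere
restore_allow_non_empty_tables: false      # refuse to restore into non-empty tables
restore_require_manual_confirm: true       # prompt for the RESTORE <batch_id> phrase
restore_enable_two_phase_mv_compat: true   # two-phase materialized-view compatible recovery
restore_allow_partial_cluster: false       # require all inventory hosts to be reachable
```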
Additional instructions:
- restore_cluster.yml also checks clickhouse_default_password.
- If you are still using the public default password Dbbot_default@8888, it will be intercepted by pre_tasks by default; explicitly setting fcs_allow_dbbot_default_passwd: true is recommended only in experimental environments.
5. Target cluster preparation before recovery
cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
-i ../inventory/hosts.dr_backup.ini \
setup_nfs_client_mount_rc_local.yml
ansible-playbook \
-i ../inventory/hosts.dr_backup.ini \
prepare_backup_disk.yml \
-e "backup_storage_disk=backup_nfs backup_mount_dir=/backup"
The above two steps must be completed on the recovery target cluster before entering the formal recovery.
6. Standard recovery commands
cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
-i ../inventory/hosts.restore.ini \
restore_cluster.yml \
-e "restore_batch_id=20260306T210000_CST_bk001"
By default, you will be prompted to enter a confirmation phrase, for example: RESTORE 20260306T210000_CST_bk001.
7. Single-replica recovery mode
If you only want to restore one replica of each shard, you can temporarily override:
cd /usr/local/dbbot/clickhouse_ansible/playbooks
ansible-playbook \
-i ../inventory/hosts.restore.ini \
restore_cluster.yml \
-e "restore_batch_id=20260306T210000_CST_bk001 restore_to_all_replicas=false"
This mode is more suitable for drills or special scenarios; before using it in production, confirm that the table structure and replication strategy meet your requirements.
8. Recommended sequence for disaster recovery of the original cluster
- Stop business writes and freeze the change window.
- Select the recovery batch and check safe_ts in the manifest.
- Use the restore-specific inventory to point to the target cluster.
- Manually clean up the databases or tables that need to be restored.
- Execute restore_cluster.yml.
- Verify row counts, replica status, and replication health.
- Use safe_ts as the starting point to replay or backfill business data.
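As an illustration of the last step, a hedged example of bounding the replay starting point with safe_ts; the table, column, and timestamp below are placeholders:

```sql
-- Placeholder names: confirm the newest restored row does not exceed safe_ts,
-- then replay business traffic from safe_ts onward.
SELECT max(log_time) AS newest_restored
FROM lab_ck_biz.mysql_slowlog_raw_local
WHERE log_time <= toDateTime('2026-03-06 21:00:00');  -- safe_ts from the manifest
```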
9. Validation after recovery
9.1 Total row count validation
SELECT count() FROM <db>.<distributed_table>;
9.2 Local row count on each replica
SELECT
hostName() AS host,
tcpPort() AS port,
count() AS rows
FROM clusterAllReplicas('<cluster>', '<db>', '<local_table>')
GROUP BY host, port
ORDER BY host, port;
9.3 Replica health
SELECT table, is_readonly, queue_size, absolute_delay
FROM system.replicas
WHERE database = '<db>'
ORDER BY table;
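In addition to system.replicas, the replication queue can show whether replicas are still catching up. A complementary check (replace <db> as above):

```sql
-- Pending replication queue entries and recent exceptions per table.
SELECT database, table, type, count() AS entries,
       countIf(last_exception != '') AS with_errors
FROM system.replication_queue
WHERE database = '<db>'
GROUP BY database, table, type
ORDER BY table, type;
```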
9.4 Acceptance criteria for TTL tables
If the restored objects have TTL ... DELETE, do not use count() over the whole table on the source and target clusters as a strict consistency check.
Reasons:
- RESTORE only restores the data present in the backup; TTL DELETE relies on asynchronous background merges for cleanup and will not complete instantly during recovery.
- Even if ALTER TABLE ... MATERIALIZE TTL is executed manually, its execution time on the source and target clusters may differ, so a small amount of data near the TTL boundary may continue to drift.
For example, mysql_slowlog_raw_local defaults to:
- data span: 60 days
- table TTL: 30 DAY DELETE
This combination is well suited to demonstrating TTL behavior, but not to an acceptance criterion of "the total row counts of all tables must match exactly".
For TTL tables, one of the following two approaches is recommended:
- Compare over a fixed time window, with both source and target using the same literal cutoff:
SELECT count()
FROM lab_ck_biz.mysql_slowlog_raw_local
WHERE log_time >= toDateTime('2026-02-07 00:00:00');
- Prioritize business tables without TTL for recovery consistency acceptance.
If you just want to speed up TTL cleanup, you can execute the following on both the source and the target:
ALTER TABLE lab_ck_biz.mysql_slowlog_raw_local
ON CLUSTER 'example_3shards_2replicas'
MATERIALIZE TTL
SETTINGS mutations_sync = 2;
Note that this command only accelerates TTL convergence; it is still not recommended to use "the whole-table counts of a TTL table match exactly" as the sole acceptance criterion.
10. Precautions
- The restore playbook by default requires that the inventory file name contains restore and that dbbot_inventory_purpose=restore is set.
- The playbook will not automatically execute high-risk DROP DATABASE or DROP TABLE statements; these require manual review before execution.
- For scenarios containing materialized views, two-phase compatible recovery logic is enabled by default.
- If the backup contains configuration snapshots, whether to replay the configuration should be decided based on the change window and an assessment of the target environment.
- The NFS mount and prepare_backup_disk.yml on the recovery target side are not handled automatically by restore_cluster.yml and must be prepared in advance.
- For cross-cluster acceptance between the source and target clusters, use the separate ClickHouse Post-Restore Data Validation document and validate_restore_consistency.yml.