Permanently remove a Kudu data disk

I needed to permanently remove a data disk from Kudu. In my case, this disk was handling far too many IOs and I needed Kudu to stop writing to it. This post explains how to do this safely.

Sanity checks

First, make sure that no table has a replication factor of 1. If some tablets of such a table happen to sit on the disk you are removing, the table becomes unavailable. Note that the user running this command must be in Kudu's superuser_acl list (replace ${kudu_master_host} with the real hostname).

kudu cluster ksck ${kudu_master_host} | grep '| 1 |' | cut -d ' ' -f2
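If the grep/cut pipeline above misses rows because of column padding, a slightly more tolerant filter can be sketched with awk. This is only a sketch: it assumes the ksck table summary prints rows shaped like " name | RF | status | ...", and the function name rf1_tables is mine, not part of Kudu.

```shell
# Print the names of tables whose second '|'-separated column is exactly 1.
# Assumption: ksck summary rows look like " name | RF | status | ...".
rf1_tables() {
  awk -F'|' 'NF >= 3 {
    name = $1; rf = $2
    gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the name
    gsub(/^[ \t]+|[ \t]+$/, "", rf)     # trim padding around the RF value
    if (rf == "1") print name
  }'
}

# Usage (needs a live cluster):
#   kudu cluster ksck "${kudu_master_host}" | rf1_tables
```

Check the result against the raw ksck output before you trust it, as the layout may differ between Kudu versions.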

If there are such tables, you need to

  • either DROP them
  • or recreate them with a higher replication factor. You cannot change the replication factor of an existing table.

Technically, there are other options, but they are trickier:

  • I could use kudu tablet change_config move_replica to move the tablets of all tables with RF 1 from, e.g., server 1 to server 2, then remove the directory on server 1, rebalance, and repeat from server 2 to 3 and so on. Note that you can only move tablets between servers, not between disks, so it can take a while if you have many servers.
  • I could move the data directory from one disk to another, since Kudu uses only subdirectories and not whole disks. As all other disks already had Kudu data directories in my case, this would have meant that one disk would receive twice as many IOs.
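For the first option, I did not go down this road, but a dry run could look like the sketch below. It only prints the move_replica commands so you can review them before running anything. The function name print_moves is hypothetical, and you have to supply the tablet ids and the tablet server UUIDs yourself.

```shell
# Print (do not run) one move_replica command per tablet id read from stdin.
# $1: master address(es), $2: source tserver UUID, $3: destination tserver UUID
print_moves() {
  while IFS= read -r tablet_id; do
    [ -n "$tablet_id" ] || continue   # skip empty lines
    printf 'kudu tablet change_config move_replica %s %s %s %s\n' \
      "$1" "$tablet_id" "$2" "$3"
  done
}

# Usage, with a hypothetical file holding one tablet id per line:
#   print_moves "${kudu_master_host}" "$from_uuid" "$to_uuid" < tablet_ids.txt
```

Printing first and executing later makes it easy to check the list, which matters here because a wrong move on an RF 1 table has no replica to fall back on.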

Start a rebalance. Afterwards the data is evenly spread and, more importantly, we know that a rebalance can complete.

kudu cluster rebalance ${kudu_master_host}

Stop kudu.

Remove a disk

Note: do this one node at a time! It should be possible to do 2 at a time, but I haven't tested it. If you use Cloudera Manager, you need to use config groups.

Remove the directory you want to drop from fs_data_dirs in the configuration of the tablet server.
While Kudu is still stopped, tell Kudu on the tablet server whose configuration you just changed that there is now one less disk:

sudo -u kudu kudu fs update_dirs --force --fs_wal_dir=[your wal directory] --fs_data_dirs=[comma separated list of remaining directories]
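Typing the list of remaining directories by hand is error-prone when a node has many disks. A small helper can derive it from the old list. This is a sketch: remaining_dirs is a name I made up, and the paths in the usage comment are placeholders, not my real layout.

```shell
# Remove one directory from a comma-separated list of directories.
# $1: current fs_data_dirs value, $2: directory to drop (exact match)
remaining_dirs() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

# Usage with placeholder paths:
#   old_dirs=/data/1/kudu,/data/2/kudu,/data/3/kudu
#   new_dirs="$(remaining_dirs "$old_dirs" /data/2/kudu)"
#   sudo -u kudu kudu fs update_dirs --force \
#     --fs_wal_dir=[your wal directory] --fs_data_dirs="$new_dirs"
```

The match is on the whole path, so dropping /data/1/kudu does not accidentally drop /data/1/kudu2.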

Restart kudu. Data will be automatically rebalanced.

Congrats, go to your next node once all tablets are happy (kudu cluster ksck ${kudu_master_host} does not return any error).
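Rather than rerunning ksck by hand, you can poll it. The sketch below relies on ksck exiting with a non-zero status while it still finds problems. The function name wait_until_healthy and the 30 second default are my own choices.

```shell
# Block until ksck reports a healthy cluster.
# $1: master address(es), $2: seconds between checks (default 30)
wait_until_healthy() {
  until kudu cluster ksck "$1" >/dev/null 2>&1; do
    sleep "${2:-30}"
  done
}

# Usage (needs a live cluster):
#   wait_until_healthy "${kudu_master_host}" 60 && echo "ready for the next node"
```

There is no timeout here, so if a node never recovers the loop runs forever. Keep an eye on it.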
