Blog

Simple and fast database backup thanks to Percona XtraBackup

Backup of a small database is not a problem. A few seconds of downtime in the application or at night is not a big problem for us. The difficulty arises when we really have a lot of records to archive. Such an example can be MySQL 5.5 database, which contains about 83 973 092 records, occupying nearly 8.2 GB. How to back up the database quickly so that users do not feel it? Fortunately, there is a solution!

The simplest method can be to use the MySQL database replication mechanism. Configure a second slave database, from which we make a backup. During the backup process we can stop the server, copy files or perform a dump using the mysqldump command. During the backup process we only load the clone of the main machine.

And what if we can’t afford to buy a second server, which will serve us as a backup base? The answer to this question is Percona XtraBackup application, which is released under GPLv2 license. It allows you to back up your database without downtime. We support database servers such as: Percona Server, MySQL, MariaDB, and Drizzle. This software allows for such operations as:

  • quick backup
  • not interrupting the transaction during the backup process
  • automatic verification of archived data
  • Faster recovery time
  • incremental backups
  • making replication simpler
  • Backup without server load
  • moving tables in real time between servers
  • backups only data stored in InnoDB

Percona XtraBackup software is used by large social networks such as Facebook:

“Facebook users create a vast amount of data every day. To make sure that data is stored reliably, we back up our databases daily. Facebook was an early adopter of incremental backup in XtraBackup.” – Vamsi Ponnekanti, Facebook Engineering

Percona XtraBackup installation

Installation of the software is very easy with binary packages available for Red Hat, CentOS, Debian and Ubuntu. There are also repositories for Yum and Apt.

The installation begins with adding a repository with the application to the system. Add an entry to /etc/yum.repos.d/percona.repo file:

[percona]
name = CentOS $releasever - Percona
baseurl=http://repo.percona.com/centos/$releasever/os/$basearch/
enabled = 1
gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-percona
gpgcheck = 1

We still need to copy the GPG key and save it to /etc/pki/rpm-gpg/RPM-GPG-KEY-percona. From this point on, we can start installing the software by issuing a command:

yum install xtrabackup

All software consists of three applications:

  • xtrabackup – software written in C that copies InnoDB and XtraDB data
  • innobackupex – a script that fills in the deficiencies of the previous program. Copies the entire contents of the MySQL server
  • tar4ibd – packs InnoDB data into tar format

Three similar commands will be installed to perform backups from the relevant database server versions:

  • xtrabackup – Percona Server 5.1 & MySQL 5.1 w/InnoDB plugin
  • xtrabackup_51 – Percona Server 5.0, MySQL 5.0 & MySQL 5.1
  • xtrabackup_55 – Percona Server 5.5 & MySQL 5.5

Time to perform the first backup with xtrabackup.

Backing up

Najszybszą metodą zrobienia kopii zapasowej jest wydanie polecenia xtrabackup z parametrem --backup. W naszym wypadku jest to serwer w wersji 5.5 zatem użyjemy polecenia:

The quickest way to back up is to issue the xtrabackup command with the --backup parameter. In our case it is a 5.5 server, so we will use the command:

xtrabackup_55 --backup --datadir=/var/lib/mysql/ --target-dir=/backup

In addition, you need to specify the target directory (--target_dir) in which you want to save the backup, and the directory in which the MySQL server data (--datadir) is located. These options can be saved in /etc/my.cnf, so that they are not always given as a parameter:

[mysqld]
datadir=/var/lib/mysql/
[xtrabackup]
target_dir=/backup

However, if an error occurs to us:

xtrabackup_55: ambiguous option '--innodb=FORCE' (innodb_adaptive_hash_index, innodb_doublewrite_file)

this means that in the file /etc/my.cnf, we have set the option innodb=force. A bypass is to replace it with loose-innodb=force. This error should be solved in subsequent versions of the application. If everything went well, we will see the result of the command:

xtrabackup_55 version 1.6.2 for Percona Server 5.5.9 Linux (x86_64) (revision id: 274)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql/
xtrabackup: Target instance is assumed as followings.
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 5242880
110812 11:11:41 InnoDB: Using Linux native AIO
>> log scanned up to (274992220655)
[01] Copying ./ibdata1 
     to /backup/ibdata1
>> log scanned up to (274992326877)
>> log scanned up to (274992483657)
>> log scanned up to (274992585754)
>> log scanned up to (274992665719)
>> log scanned up to (274992759458)
>> log scanned up to (274992852501)
>> log scanned up to (274992865265)
>> log scanned up to (274993036879)
>> log scanned up to (274993127087)
>> log scanned up to (274993241361)
>> log scanned up to (274993291339)
>> log scanned up to (274993427789)
>> log scanned up to (274993587569)
>> log scanned up to (274993678820)
>> log scanned up to (274993789123)
>> log scanned up to (274993835340)
>> log scanned up to (274994021991)
>> log scanned up to (274994033880)
>> log scanned up to (274994268323)
>> log scanned up to (274994376271)
>> log scanned up to (274994462281)
>> log scanned up to (274994529600)
>> log scanned up to (274994645157)
>> log scanned up to (274994727307)
>> log scanned up to (274994796009)
>> log scanned up to (274994898785)
>> log scanned up to (274994973583)
>> log scanned up to (274995041628)
>> log scanned up to (274995159543)
>> log scanned up to (274995244693)
>> log scanned up to (274995341883)
>> log scanned up to (274995439890)
>> log scanned up to (274995517633)
>> log scanned up to (274995608306)
>> log scanned up to (274995730882)
>> log scanned up to (274995798412)
>> log scanned up to (274995877561)
>> log scanned up to (274995983804)
>> log scanned up to (274996075140)
[01]        ...done
>> log scanned up to (274996152436)
xtrabackup: The latest check point (for incremental): '274994706269'
>> log scanned up to (274996152436)
xtrabackup: Stopping log copying thread.
xtrabackup: Transaction log of lsn (274991525290) to (274996152436) was copied.

It took about 1 minute to back up a database of 83,973,092 records, which take about 8.2 GB. By default, the software will try to back up the database as quickly as possible. This may overload the machine, so to limit the --throttle parameter. Too restrictive limitation of the program may cause the copy not to be made, because more records will reach the database server than it is archived.

Unfortunately, such a copy is not yet ready for use. Before restoring data, it should be specially prepared for this purpose. Please note, however, that these are only copied InnoDB data. To make the entire copy of the database, use the innobackupex script. First of all, it launches the xtrabackup program. After copying data is finished, it blocks tables with FLUSH TABLES WITH READ LOCK, copies MyISAM data and then removes the blockade.

Backup preparation

During backup, some data may have changed, so the backup may be inconsistent. Trying to run the database on such files will cause it to crash. That’s why XtraBackup uses crash-recovery to prepare the data.

The InnoDB engine runs the so-called redo log (transaction log). It contains a record of each change that occurred in the data. When the database starts, it reads this log and takes two steps. It accepts all data that have been atomized in the log, and reverses all changes that do not have a commitment.

Data preparation by xtrabackup works in a similar way using the built-in InnoDB engine. To start this process we issue a command:

xtrabackup_55 --prepare --target-dir=/backup

The result should be a similar fragment of the log:

xtrabackup_55 version 1.6.2 for Percona Server 5.5.9 Linux (x86_64) (revision id: 274)
xtrabackup: cd to /backup
xtrabackup: This target seems to be not prepared yet.
xtrabackup: xtrabackup_logfile detected: size=5210112, start_lsn=(274991525290)
xtrabackup: Temporary instance for recovery is set as followings.
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 1
xtrabackup:   innodb_log_file_size = 5210112
110812 11:33:21 InnoDB: Using Linux native AIO
xtrabackup: Starting InnoDB instance for recovery.
xtrabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
110812 11:33:21 InnoDB: The InnoDB memory heap is disabled
110812 11:33:21 InnoDB: Mutexes and rw_locks use GCC atomic builtins
110812 11:33:21 InnoDB: Compressed tables use zlib 1.2.3
110812 11:33:21 InnoDB: Using Linux native AIO
110812 11:33:21 InnoDB: Warning: innodb_file_io_threads is deprecated. Please use innodb_read_io_threads and innodb_write_io_threads instead
110812 11:33:21 InnoDB: Initializing buffer pool, size = 100.0M

110812 11:33:21 InnoDB: Completed initialization of buffer pool
110812 11:33:21 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 274991525290
110812 11:33:21  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Doing recovery: scanned up to log sequence number 274996152436 (99 %)
110812 11:33:23  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
110812 11:33:26  InnoDB: Waiting for the background threads to start
110812 11:33:27 Percona XtraDB (http://www.percona.com) 1.1.5-20.0 started; log sequence number 274996152436

[notice (again)]
  If you use binary log and don't use any hack of group commit,
  the binary log position seems to be:

xtrabackup: starting shutdown with innodb_fast_shutdown = 1
110812 11:33:27  InnoDB: Starting shutdown...
110812 11:33:32  InnoDB: Shutdown completed; log sequence number 274996207841

The last line indicates the correct preparation of the backup for restoration. This process can be run on a separate machine, so as not to overload the one on which we perform the backup. Just remember to use the same version of Perona XtraBackup. While preparing, xtrabackup launches a special version of the InnoDB engine.

The backup is ready for recovery. However, it is worth taking one more step to make restoring the database much faster. Make a second backup using the same command. The first execution of the command makes the data correct and continuous, but the log file InnoDB is not created. After restoring such a copy, MySQL server will have to create the log file itself. For large databases, this operation will take some time. Therefore, executing this command again should create these files.

This time we should see in the logs:

110812 13:43:49  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
110812 13:43:49  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait..

It’s time to restore data.

Rollback

The software does not have the option of restoring data, so this action must be done by yourself with the help of any tool. For example, you can use the rsync command to copy files to the right place. Then make sure they have the appropriate permissions.

Please note that xtrabackup copies only InnoDB data, so you should restore MyISAM data, table definitions (frm files) and everything else you need to run the server. To restore InnoDB data we will use commands:

cd /backup
rsync -rvt --exclude 'xtrabackup_checkpoints' --exclude 'xtrabackup_logfile' ./ /var/lib/mysql
chown -R mysql:mysql /var/lib/mysql/

After restoring the remaining data, you can start the database.

Incremental copies

Both applications (xtrabackup and innobackupex) support incremental backups. This means that it is only possible to archive data that has changed since the last backup. This allows us to implement an archiving policy of making full backups once a week and making incremental backups on a daily basis. At the beginning we will make a full backup with a command already known:

xtrabackup_55 --backup --datadir=/var/lib/mysql/ --target-dir=/backup/base

In addition to copying data, the application creates a file called xtrabackup_checkpoints, which contains the following data:

backup_type = full-backuped
from_lsn = 0
to_lsn = 1291135

It saves information about where the backup was made. When you make an incremental backup, the application starts to copy data from there. In order to start such an archiving, we issue a command:

xtrabackup_55 --backup --target-dir=/backup/inc1 --incremental-basedir=/backup/base --datadir=/var/lib/mysql/

The /backup/inc1 directory will now contain *.delta files, which contain data changed since the last full backup. Looking at the file xtrabackup_checkpoints we will see the entry:

backup_type = incremental
from_lsn = 1291135
to_lsn = 1291340

When making another incremental copy, we use the previous one. We give the same command:

xtrabackup_55 --backup --target-dir=/backup/inc2 --incremental-basedir=/backup/inc1 --datadir=/var/lib/mysql/

Preparing backup from incremental backup

The step --prepare is not the same as for the complete copy. In the previous method it was enough to issue the same command twice to get an efficient copy of InnoDB data. In this case it will look a bit different. In the example above, we made three backups:

  • /backup/base – full copy
  • /backup/inc1 – first incremental copy
  • /backup/inc2 – second incremental copy

We start by issuing a command:

xtrabackup_55 --prepare --apply-log-only --target-dir=/backup/base

Now you have to put a full backup, further incremental copies. We start to issue commands one by one:

xtrabackup_55 --prepare --apply-log-only --target-dir=/backup/base --incremental-dir=/backup/inc1
xtrabackup_55 --prepare --apply-log-only --target-dir=/backup/base --incremental-dir=/backup/inc2

Performing these commands we should be able to impose changes from each incremental copy. The finished copy will be located in /backup/base directory, and we should see the result of the last command on the screen:

incremental backup from 1291135 is enabled.
xtrabackup: cd to /backup/base/
xtrabackup: This target seems to be already prepared.
xtrabackup: xtrabackup_logfile detected: size=2097152, start_lsn=(1291340)
Applying /backup/inc1/ibdata1.delta ...
Applying /backup/inc1/test/table1.ibd.delta ...

Partial backup

The software supports partial backups when innodb_file_per_table is enabled on the server. There are two ways to approach this topic. For example, we will back up a database named test, which contains two tables: t1 and t2.

The first way is to use the --tables switch. This parameter supports regular expressions, so in order to back up the selected database we can use the command:

xtrabackup_55 --backup --datadir=/var/lib/mysql --target-dir=/backup --tables="^test[.].*"

To archive a selected table, use the command:

xtrabackup_55 --backup --datadir=/var/lib/mysql --target-dir=/backup --tables="^test[.]t1"

Similar results will be obtained using the --tables-file switch, indicating instead of the name of the table, the file in which it is stored. The parameter is case-sensitive. The next step is to prepare a copy for recovery. We will use the --prepare parameter.

The command is issued in the same way as in the previous steps, but there will be errors in the logs, that there are no other tables. This behavior is normal for applications, because we have not copied them. Such errors should not be worried about. Example result of the command:

InnoDB: Reading tablespace information from the .ibd files...
101107 22:31:30  InnoDB: Error: table 'test1/t'
InnoDB: in InnoDB data dictionary has tablespace id 6,
InnoDB: but tablespace with that id or name does not exist. It will be removed from

Limitations of XtraBackup

The software has a number of limitations that must be kept in mind when making a backup. This allows us to avoid many disappointments later on. This is a limitation:

  • If the xtrabackup_logfile file is larger than 4GB, during the –prepare step, the application crashes and fails to perform the job. The error affects only the 32 bit version of xtrabackup. This limit also appears in older versions of the 64-bit program.
  • Currently, the software does not create InnoDB logs (e.g. ib_logfile0) during the first boot –prepare. To create these files you need to do this step twice.
  • The software copies only InnoDB data and logs. In order to have a full backup of the database server, you have to manually copy the table definitions (.frm files, MyISAM data, users, privileges, i.e. everything that is not stored in InnoDB. The innobackupex script was created to make a copy of this data.
  • xtrabackup does not recognize very old syntax –set-variable in my.cnf file.
  • In some cases, the prepared data may be damaged if –target-dir points to an installed NFS resource with the async option. If you copy the data into such a directory and then run the preparation of files from another server, which also assembles the data in an asynchronous way, you may damage the backup files.

Statistics

The XtraBackup software also allows you to analyze InnoDB files in read-only mode. This allows you to collect various statistics about the database. The --stats option is used for this. To relieve the load on the server, only selected tables can be analyzed using --tables.

The analysis can be run on a running server, but it may not always succeed. It is recommended to back it up after printing –prepare. Example result of the command:

<INDEX STATISTICS>
  table: test/table1, index: PRIMARY, space id: 12, root page 3
  estimated statistics in dictionary:
    key vals: 25265338, leaf pages 497839, size pages 498304
  real statistics:
     level 2 pages: pages=1, data=5395 bytes, data/pages=32%
     level 1 pages: pages=415, data=6471907 bytes, data/pages=95%
        leaf pages: recs=25958413, pages=497839, data=7492026403 bytes, data/pages=91%

A special script has also been prepared, which formats the generated data to a more readable form for the user. Below is the content of the script tabulate-xtrabackup-stats.pl:

#!/usr/bin/env perl
    use strict;
    use warnings FATAL => 'all';
    my $script_version = "0.1";
     
    my $PG_SIZE = 16_384; # InnoDB defaults to 16k pages, change if needed.
    my ($cur_idx, $cur_tbl);
    my (%idx_stats, %tbl_stats);
    my ($max_tbl_len, $max_idx_len) = (0, 0);
    while ( my $line = <> ) {
       if ( my ($t, $i) = $line =~ m/table: (.*), index: (.*), space id:/ ) {
          $t =~ s!/!.!;
          $cur_tbl = $t;
          $cur_idx = $i;
          if ( length($i) > $max_idx_len ) {
             $max_idx_len = length($i);
          }
          if ( length($t) > $max_tbl_len ) {
             $max_tbl_len = length($t);
          }
       }
       elsif ( my ($kv, $lp, $sp) = $line =~ m/key vals: (\d+), \D*(\d+), \D*(\d+)/ ) {
          @{$idx_stats{$cur_tbl}->{$cur_idx}}{qw(est_kv est_lp est_sp)} = ($kv, $lp, $sp);
          $tbl_stats{$cur_tbl}->{est_kv} += $kv;
          $tbl_stats{$cur_tbl}->{est_lp} += $lp;
          $tbl_stats{$cur_tbl}->{est_sp} += $sp;
       }
       elsif ( my ($l, $pages, $bytes) = $line =~ m/(?:level (\d+)|leaf) pages:.*pages=(\d+), data=(\d+) bytes/ ) {
          $l ||= 0;
          $idx_stats{$cur_tbl}->{$cur_idx}->{real_pages} += $pages;
          $idx_stats{$cur_tbl}->{$cur_idx}->{real_bytes} += $bytes;
          $tbl_stats{$cur_tbl}->{real_pages} += $pages;
          $tbl_stats{$cur_tbl}->{real_bytes} += $bytes;
       }
    }
     
    my $hdr_fmt = "%${max_tbl_len}s %${max_idx_len}s %9s %10s %10s\n";
    my @headers = qw(TABLE INDEX TOT_PAGES FREE_PAGES PCT_FULL);
    printf $hdr_fmt, @headers;
     
    my $row_fmt = "%${max_tbl_len}s %${max_idx_len}s %9d %10d %9.1f%%\n";
    foreach my $t ( sort keys %tbl_stats ) {
       my $tbl = $tbl_stats{$t};
       printf $row_fmt, $t, "", $tbl->{est_sp}, $tbl->{est_sp} - $tbl->{real_pages},
          $tbl->{real_bytes} / ($tbl->{real_pages} * $PG_SIZE) * 100;
       foreach my $i ( sort keys %{$idx_stats{$t}} ) {
          my $idx = $idx_stats{$t}->{$i};
          printf $row_fmt, $t, $i, $idx->{est_sp}, $idx->{est_sp} - $idx->{real_pages},
             $idx->{real_bytes} / ($idx->{real_pages} * $PG_SIZE) * 100;
       }
    }

The previously generated data will be formatted into a form:

          TABLE           INDEX TOT_PAGES FREE_PAGES   PCT_FULL
art.link_out104                    832383      38561      86.8%
art.link_out104         PRIMARY    498304         49      91.9%
art.link_out104       domain_id     49600       6230      76.9%
art.link_out104     domain_id_2     26495       3339      89.1%
art.link_out104 from_message_id     28160        142      96.3%
art.link_out104    from_site_id     38848       4874      79.4%
art.link_out104   revert_domain    153984      19276      71.4%
art.link_out104    site_message     36992       4651      83.4%

Summary

Percona XtraBackup software is a very interesting solution in relation to others that we can find on the Internet. Simple and inexpensive, we can implement fast data archiving and use the technology used by large websites such as Facebook.

Add a comment!