MDEV-14992 BACKUP SERVER by dr-m · Pull Request #4817 · MariaDB/server

dr-m · 2026-03-17T14:24:11Z

The following SQL statements will be introduced:

BACKUP SERVER TO '/path/to/directory';
BACKUP SERVER TO '/path/to/directory' 1 CONCURRENT;
BACKUP SERVER WITH 'command';
BACKUP SERVER WITH 1 CONCURRENT 'command';

In place of the 1, any positive number of threads may be specified. For the first variant, '/path/to' must exist and '/path/to/directory' must not exist; that is where the backup will be written to.

For the second variant, 'command' must be the name of a script or command that will be executed in a child process. The standard input of that command will be in a format that is compatible with GNU tar --format=oldgnu (and also with the BSD tar variants that are part of Microsoft Windows and Apple macOS). The command is expected to optionally compress and encrypt the stream and redirect it to a file on a local or a remote server. BACKUP SERVER WITH will append an additional argument to the command, a positive base-ten number in ASCII, starting with 1, to identify the current thread. In this way, each concurrent stream can write a separate file.
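A minimal sketch of how a command invoked by BACKUP SERVER WITH might turn the appended stream number into a per-stream output file name. Only the convention that the last argument is a positive base-ten number comes from the description above; the helper name `stream_file_name`, the `.tar` suffix, the rejection of leading zeros and the 9-digit limit are assumptions made for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper (not part of this pull request): validate the
// stream number that BACKUP SERVER WITH appends as the last argument,
// and derive a distinct output file name for that stream.
std::optional<std::string> stream_file_name(const std::string &prefix,
                                            const std::string &stream_arg)
{
  assert(!prefix.empty());  // the caller chooses where the files go
  // Assumed policy: 1 to 9 ASCII digits, no leading zero, so that the
  // value is positive and the name is unambiguous.
  if (stream_arg.empty() || stream_arg.size() > 9 || stream_arg[0] == '0')
    return std::nullopt;
  for (char c : stream_arg)
    if (c < '0' || c > '9')
      return std::nullopt;
  return prefix + stream_arg + ".tar";
}
```

A real command would then copy its standard input, optionally through a compressor or an encryption tool, into the file named by the returned string, so that streams 1, 2, 3 and so on never overwrite each other.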

The backup or the first stream will contain a file backup.cnf, which includes parameters needed for restoring the backup. Currently, these are innodb_log_recovery_start and innodb_log_recovery_target. If innodb_log_recovery_target>0, InnoDB will be in read-only mode, not allowing any writes to persistent files other than via the log application.

To restore a streaming backup made with BACKUP SERVER WITH, an empty directory needs to be created and all streams must be extracted there using the standard tar utility of the operating system, after undoing any encryption or compression that had been added by the backup command. Then, the backup is prepared or the MariaDB server is started up on the extracted directory, just as if the BACKUP SERVER TO statement had been used.

Note: The parameter innodb_log_recovery_start in backup.cnf is STRICTLY NECESSARY TO AVOID CORRUPTION! By default, InnoDB crash recovery starts from the latest available log checkpoint. However, for restoring a backup, recovery must start from the checkpoint that was the latest when the backup was started. Starting recovery from a possible later checkpoint will result in a corrupted database!
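A restore tool could refuse to proceed when innodb_log_recovery_start is missing. The following is only a sketch under stated assumptions: it assumes that backup.cnf uses plain `name=value` lines, as MariaDB option files do, and the function names `find_option` and `restore_is_safe` are hypothetical, not part of this pull request.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: find "key=<decimal number>" in the text of
// backup.cnf. The name=value layout is an assumption.
std::optional<std::uint64_t> find_option(const std::string &cnf,
                                         const std::string &key)
{
  assert(!key.empty());
  const std::string prefix = key + "=";
  std::istringstream in(cnf);
  std::string line;
  while (std::getline(in, line))
  {
    if (line.compare(0, prefix.size(), prefix) != 0)
      continue;  // some other line, such as a [section] header
    const std::string value = line.substr(prefix.size());
    // At most 19 decimal digits always fit in 64 bits.
    if (value.empty() || value.size() > 19 ||
        value.find_first_not_of("0123456789") != std::string::npos)
      return std::nullopt;
    return std::stoull(value);
  }
  return std::nullopt;
}

// Recovery must not start from "the latest checkpoint" by default, so
// treat a missing innodb_log_recovery_start as a hard error.
bool restore_is_safe(const std::string &cnf)
{
  return find_option(cnf, "innodb_log_recovery_start").has_value();
}
```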

The following will be implemented separately:

MDEV-39061 mariadb-backup compatible wrapper script for BACKUP SERVER
MDEV-40163 Partial backup and restore
MDEV-39091 Back up ENGINE=RocksDB
MDEV-39092 Less blocking backup of ENGINE=Aria

The implementation introduces a basic driver Sql_cmd_backup, storage engine interfaces, and basic copying of the storage engines InnoDB, Aria, MyISAM, MERGE (MyISAM), Archive, CSV.

backup_target: A structured data type to represent a target directory. On Microsoft Windows, we must use directory paths because there is no variant of CopyFileEx() that would work on file handles.

backup_sink: Wraps a per-thread output stream as well as storage engine specific context.

handlerton::backup_start(), handlerton::backup_end(): Invoked at the start or end of a backup phase, in the thread that executes a BACKUP SERVER statement.

handlerton::backup_step(): A backup step that can be invoked from multiple threads concurrently, between the execution of the corresponding handlerton::backup_start() and handlerton::backup_end() of the same phase.
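The calling convention described for backup_start(), backup_step() and backup_end() can be sketched roughly as follows. The struct `engine_backup`, the function `run_phase`, the boolean return value of the step callback and the choice to let the statement's own thread also execute steps are all assumptions for illustration; the actual handlerton signatures are defined in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the backup callbacks of one storage engine.
struct engine_backup
{
  std::function<void()> start;  // like handlerton::backup_start()
  std::function<bool()> step;   // like handlerton::backup_step();
                                // assumed to return false when no work is left
  std::function<void()> end;    // like handlerton::backup_end()
};

// Run one backup phase: start and end execute in the calling thread,
// while step may execute in n_threads threads concurrently in between.
void run_phase(const engine_backup &e, unsigned n_threads)
{
  assert(n_threads > 0);
  e.start();
  std::vector<std::thread> workers;
  for (unsigned i = 1; i < n_threads; i++)
    workers.emplace_back([&e] { while (e.step()) {} });
  while (e.step()) {}  // the calling thread takes part as well
  for (std::thread &t : workers)
    t.join();          // every step has finished before end runs
  e.end();
}
```

Because step can run in several threads at once, the callback itself has to hand out units of work in a thread-safe way, for example with an atomic counter.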

copy_entire_file(): A file copying service for POSIX systems.

copy_file(): A partial or sparse file-copying service for all systems.

backup_stream_append(): Equivalent to copy_file(), but appending to a stream. On Linux, this uses sendfile(2), which assumes that the source data will not be changed before the data has been consumed from the pipe.

backup_stream_append_async(): A variant of backup_stream_append() where the source file region is guaranteed to be immutable after the call returns. We must not use Linux sendfile(2) for copying data files that may be modified in place, because it could introduce a race condition between a page write and a child process that is concurrently reading the data from the pipe.

InnoDB_backup::context: Backup context, attached to backup_sink so that context can continue to exist between the time a BACKUP SERVER releases all locks and another BACKUP SERVER starts executing, with innodb_backup pointing to the new backup, while the old backup is still being finished.

fil_space_t::write_or_backup: Keep track of in-flight page writes and pending backup operations. We must not allow them to run concurrently, because that could lead to torn pages in the backup.

fil_space_t::backup_end: The first page number that is not being backed up (by default 0, to indicate that no backup is in progress).

fil_space_t::BACKUP_BATCH_SIZE: The number of preceding pages that will be covered by fil_space_t::backup_end. This is the unit of "page range locking" during InnoDB backup.
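The "page range locking" expressed by backup_end and BACKUP_BATCH_SIZE amounts to a half-open interval test. The sketch below is an illustration only: the constant `BATCH` uses a made-up value of 64 pages, and the function name `page_is_being_backed_up` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Made-up value for illustration; the real fil_space_t::BACKUP_BATCH_SIZE
// is defined in the patch.
constexpr std::uint32_t BATCH = 64;

// backup_end is the first page number that is NOT being backed up, and
// 0 means that no backup batch is in progress. The batch therefore
// covers the half-open range [backup_end - BATCH, backup_end).
bool page_is_being_backed_up(std::uint32_t page_no, std::uint32_t backup_end)
{
  if (backup_end == 0)
    return false;
  assert(backup_end >= BATCH);  // a batch end is start + BATCH, start >= 0
  return page_no >= backup_end - BATCH && page_no < backup_end;
}
```

A page write that targets a page inside this range would have to wait until the batch has been copied, whereas writes to all other pages of the tablespace can proceed.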

log_sys.backup: Whether BACKUP SERVER is in progress. The purpose of this is to make BACKUP SERVER prevent the concurrent execution of SET GLOBAL innodb_log_archive=OFF or SET GLOBAL innodb_log_file_size when innodb_log_archive=OFF.

log_sys.archived_checkpoint: Keep track of the earliest available checkpoint, corresponding to log_sys.archived_lsn. This reflects SET GLOBAL innodb_log_recovery_start (which is settable now), for incremental backup.

buf_flush_list_space(): Check for concurrent backup before writing each page. This is inefficient, but this function may be invoked from multiple threads concurrently, and it cannot be changed easily, especially for fil_crypt_thread().

fil_system.have_all_spaces: Whether all tablespace metadata is guaranteed to be known. To speed up startup, InnoDB does not normally open all tablespace files.

CLAassistant · 2026-03-17T14:24:21Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

dr-m · 2026-06-26T13:05:54Z

I plan to rebase this once #5070 has been merged up to the 13.0 branch. @grooverdan pushed a merge to 10.11 today, and I pushed to 11.4 and 11.8. I hope that the conflicts for 12.3 and potentially 13.0 will have been resolved by Monday.

The ultimate merge target is main. For testing, it is better to be based on the oldest maintained branch that includes #4405 (innodb_log_archive=ON), which forms the foundation for this.

While rebasing, I will write a description based on the commit message of 4769a43, but mentioning actual MDEVs for the outstanding work. Soon after the rebase, we can include #5140 so that this can be tested more conveniently.

dr-m · 2026-06-29T14:29:53Z

+        const uint32_t end{start + fil_space_t::BACKUP_BATCH_SIZE};
+        backup_batch_start(node->space, end);
+        /* TODO: avoid copying freed page ranges */
+        err= copy_file(node->handle, f, start * uint64_t{page_size},
+                       std::min(end, file_size) * uint64_t{page_size});
+        backup_batch_stop(node->space);


If this is a ROW_FORMAT=COMPRESSED table, then the file may be 1024, 2048, or 3072 bytes shorter than calculated, and the copying could fail. This API as well as the one in stream() must be refactored so that we will know how much was actually copied. The reason for this short file is that fil_space_extend_must_retry() will only extend files to integer multiples of 4096 bytes.

In stream() we must pad with field_ref_zero so that the file size will match what was written to the header. The last page will be recovered from the redo log.
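The arithmetic behind this can be sketched as follows. This is not the refactored API, only an illustration of one possible shape for it; `copy_plan` and `plan_batch` are hypothetical names, and the padding policy (zero-fill everything that the file does not contain) is an assumption based on the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type: how many bytes can really be read from the
// file, and how many zero bytes a stream would need as padding.
struct copy_plan
{
  std::uint64_t copy_bytes;
  std::uint64_t pad_bytes;
};

// Plan the copying of pages [start_page, end_page) from a file whose
// actual size is file_bytes, which may be shorter than end_page * page_size.
copy_plan plan_batch(std::uint64_t start_page, std::uint64_t end_page,
                     std::uint64_t page_size, std::uint64_t file_bytes)
{
  assert(start_page <= end_page);
  const std::uint64_t begin = start_page * page_size;
  const std::uint64_t want_end = end_page * page_size;
  // Never read past the end of the file, and never before begin.
  const std::uint64_t have_end =
    std::min(want_end, std::max(begin, file_bytes));
  return {have_end - begin, want_end - have_end};
}
```

For example, a tablespace with 1024-byte pages and 7 pages has a calculated size of 7168 bytes, but the file may have been extended only to 4096 bytes; `plan_batch(0, 7, 1024, 4096)` then yields 4096 bytes to copy and 3072 bytes of padding.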

Note: We don’t currently keep track of the file size or the allocated file size as of the checkpoint when the backup started. If we did that, we could copy even less. That could be an even more elegant fix of this. I think we would create sparse files that match the current file size.

dr-m self-assigned this Mar 17, 2026

dr-m added the MariaDB Corporation label Mar 17, 2026

dr-m force-pushed the MDEV-14992 branch 2 times, most recently from 2723322 to 1703796 Compare March 18, 2026 11:01

vuvova reviewed Mar 18, 2026

Comment thread sql/sql_backup.cc

dr-m force-pushed the MDEV-14992 branch 2 times, most recently from 9a529de to 857edeb Compare March 23, 2026 08:28

dr-m changed the base branch from 11.4 to 12.3 March 24, 2026 11:51

dr-m force-pushed the MDEV-14992 branch 3 times, most recently from 8149b3d to c08d121 Compare March 27, 2026 09:48

dr-m force-pushed the MDEV-14992 branch from fcf4ee1 to b182d72 Compare April 8, 2026 08:56

dr-m mentioned this pull request Apr 8, 2026

MDEV-39101 Make BACKUP SERVER mutually exclusive with itself and BACKUP STAGE #4892

Open

dr-m commented Apr 15, 2026

Comment thread storage/innobase/handler/backup_innodb.cc Outdated

Comment thread mysql-test/suite/backup/backup_innodb.test

dr-m changed the base branch from 12.3 to main May 5, 2026 10:49

dr-m force-pushed the MDEV-14992 branch from bcbda03 to e0d850e Compare May 5, 2026 12:06

dr-m force-pushed the MDEV-14992 branch from e0d850e to 0c52540 Compare May 18, 2026 09:40

dr-m commented May 19, 2026

Comment thread sql/sql_backup.cc Outdated

dr-m force-pushed the MDEV-14992 branch from bdec600 to 45e7902 Compare May 20, 2026 06:52

dr-m mentioned this pull request May 21, 2026

MDEV-39092 Copy Aria data and logs as part of backup #4971

Open

dr-m commented May 27, 2026

Comment thread storage/innobase/handler/backup_innodb.cc Outdated

dr-m force-pushed the MDEV-14992 branch from d595ce0 to c0e48fc Compare May 29, 2026 14:17

dr-m mentioned this pull request Jun 5, 2026

MDEV-39861 innodb_log_recovery_target wrongly opens log in read-write mode #5185

Merged

dr-m force-pushed the MDEV-14992 branch from b98be03 to 10539aa Compare June 8, 2026 09:35

dr-m mentioned this pull request Jun 11, 2026

MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER #5140

Open

dr-m commented Jun 15, 2026

Comment thread storage/innobase/buf/buf0flu.cc

dr-m mentioned this pull request Jun 18, 2026

MDEV-40063 Corruption due to race in SET GLOBAL innodb_log_archive=ON #5253

Merged

dr-m force-pushed the MDEV-14992 branch from e81af7f to 6c8a37f Compare June 18, 2026 09:59

dr-m force-pushed the MDEV-14992 branch from 53cf648 to 87036f8 Compare June 25, 2026 13:10

dr-m force-pushed the MDEV-14992 branch from 87036f8 to 4769a43 Compare June 25, 2026 13:18

dr-m commented Jun 26, 2026

Comment thread storage/maria/ma_backup.cc Outdated

dr-m commented Jun 26, 2026

Comment thread storage/maria/ma_backup_server.cc Outdated

dr-m commented Jun 26, 2026

Comment thread storage/innobase/handler/backup_innodb.cc

dr-m changed the base branch from main to 13.0 June 26, 2026 12:54

dr-m force-pushed the MDEV-14992 branch from a7570c4 to 81b3ae7 Compare June 29, 2026 08:23

dr-m requested a review from Thirunarayanan June 29, 2026 08:34

dr-m changed the title ~~MDEV-14992 BACKUP SERVER to mounted file system~~ MDEV-14992 BACKUP SERVER Jun 29, 2026

dr-m marked this pull request as ready for review June 29, 2026 08:35

dr-m added 2 commits June 29, 2026 13:51

fixup! 81b3ae7

49f4a3e

squash! 81b3ae7

6c59b14

Observe aria_log_dir_path Patch based on code by Thirunarayanan Balathandayuthapani

dr-m commented Jun 29, 2026

dr-m wants to merge 3 commits into 13.0 from MDEV-14992

dr-m commented Mar 17, 2026 •

edited

Loading

Uh oh!

CLAassistant commented Mar 17, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dr-m commented Jun 26, 2026

Uh oh!

dr-m Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

Uh oh!

Uh oh!

Conversation

dr-m commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

CLAassistant commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dr-m commented Jun 26, 2026

Uh oh!

dr-m Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

dr-m commented Mar 17, 2026 •

edited

Loading

CLAassistant commented Mar 17, 2026 •

edited

Loading