Skip to content

InnoDB Recovery with PDRT

Active Data Loss?

If you have identified data loss on your server, skip directly to the First Response section immediately. The longer MySQL remains online writing to your data files, the less likely recovery becomes. Every second counts.

Percona provides a set of tools for recovering lost or corrupted MySQL data from InnoDB data files. The Percona Data Recovery Tool (PDRT) is freely available, and with some setup, can retrieve data that would otherwise be lost permanently or recover from InnoDB corruption that innodb_force_recovery cannot resolve.

PDRT is most useful in these scenarios:

  • A mistaken DROP TABLE, DELETE, TRUNCATE, or UPDATE
  • Deletion of the data file at the filesystem level
  • Partial corruption where InnoDB cannot start, even with innodb_force_recovery set to its maximum value

This guide walks through the complete PDRT recovery workflow in a cPanel/WHM environment, though the core techniques apply to any MySQL installation.


InnoDB File Architecture

Before attempting recovery, you need to understand what you are recovering from. InnoDB stores data across several file types, and knowing which files matter determines your recovery strategy.

The System Tablespace: ibdata1

The system tablespace (ibdata1) is InnoDB's central file. It always exists, regardless of configuration. It contains:

  • The data dictionary - metadata about every InnoDB table, index, and column
  • The doublewrite buffer - a crash-recovery mechanism that prevents partial page writes
  • The change buffer - cached changes to secondary indexes
  • Undo logs - records needed to roll back uncommitted transactions

When innodb_file_per_table is disabled (the pre-MySQL 5.6 default), ibdata1 also stores all table data and indexes. This makes it the single target for recovery - but also means it grows indefinitely and cannot be shrunk without rebuilding.

Per-Table Files: .ibd and .frm

With innodb_file_per_table=ON (the default since MySQL 5.6 and the standard on all cPanel servers), each table gets its own .ibd file containing that table's data and indexes. The .frm file stores the table's structure definition.

This per-table layout simplifies some recovery scenarios because each table's data is isolated. However, it complicates DROP TABLE recovery because dropping a table removes both the .ibd and .frm files from the filesystem entirely.

MySQL 8.0 changes

MySQL 8.0 replaced the .frm files with a transactional data dictionary stored inside InnoDB itself. The recovery techniques in this guide still apply to the data files, but .frm-based schema extraction only works for MySQL 5.x and MariaDB.

InnoDB Log Files: ib_logfile*

The redo logs (ib_logfile0, ib_logfile1) record every change before it reaches the data files. InnoDB replays these logs after a crash to bring the data files up to date. Deleting these files without understanding the server's state can cause permanent data loss if uncommitted transactions needed for recovery are still in the logs.


The PDRT Toolkit

PDRT includes three core utilities. If you are in an active data-loss scenario, skip to First Response and come back here after your files are backed up.

constraints_parser

The primary recovery tool. It scans raw InnoDB data files (either ibdata1 or individual .ibd files) and extracts rows that match a defined table structure.

constraints_parser -4|-5|-6 [-dDV] -f <InnoDB page or dir> [-T N:M] [-b <external pages dir>]
Flag Purpose
-4 REDUNDANT row format (MySQL 4.x)
-5 COMPACT row format (MySQL 5.0+)
-6 MySQL 5.6+ format
-f Path to the InnoDB data file or directory of pages
-d Process only pages that may contain deleted records
-D Recover deleted rows only
-U Recover undeleted rows only (default)
-V Verbose mode with debug information
-T N:M Filter to a specific index ID
-b <dir> Directory for external BLOB pages
-p prefix Prefix for the dump directory in LOAD DATA output
-o <file> Save dump to a specific file

page_parser

Splits a tablespace file into individual pages grouped by index ID. This is needed when recovering from ibdata1 in shared-tablespace configurations where you need to isolate specific table data.

page_parser -4|-5 [-dDhcCV] -f <innodb_datafile> [-T N:M] [-s size] [-t size]
Flag Purpose
-c Count pages and group by index ID
-C Count pages, ignoring invalid index IDs
-s size Disk cache size (e.g., 1G, 10M, default 100M)
-t size Override tablespace size detection

create_defs.pl

A Perl script that connects to a running MySQL instance and generates the table_defs.h header file that constraints_parser uses to identify row structures.

create_defs.pl --host=localhost --user=root --password=PASS --db=DATABASE --table=TABLE

First Response

Every data-loss scenario is different, but this section covers the critical first steps from the perspective of someone who has just discovered the loss and needs to act fast. As time passes after data loss, the likelihood of full recovery decreases - deleted data pages get reused, and the window for recovery closes.

Step 1: Stop MySQL

If MySQL is still running, stop it immediately:

systemctl stop mysql
# Or on older systems:
/etc/init.d/mysql stop

The goal is to prevent new writes from overwriting space in the data files that still contains recoverable data. Deleted data is marked as free space, and InnoDB will reuse it for new writes.

Step 2: Back Up InnoDB Data Files

Create a working copy of your data files. All recovery work should happen against copies, never the originals:

mkdir /root/innodb.bak
cd /var/lib/mysql
dd if=ibdata1 of=ibdata1.recovery conv=noerror
cp ibdata1.recovery /root/innodb.bak/
cp ib_logfile* /root/innodb.bak/

The dd command with conv=noerror creates a copy even if there are read errors on disk, which is essential when dealing with filesystem-level corruption.

Step 3: Back Up the Database Directory

Copy the database directory containing the per-table .ibd and .frm files (replace $db_name with your database name):

cd /var/lib/mysql
cp -a ./$db_name /root/innodb.bak/

On cPanel servers, innodb_file_per_table=1 is the default, so each table has its own .ibd file. This makes per-table recovery possible but also means a DROP TABLE removes the .ibd file entirely - recovery from a dropped table requires filesystem-level file recovery tools, which is beyond PDRT's scope.

Step 4: Restart MySQL and Dump

With copies safely made, you can restart MySQL if it is able to start:

systemctl start mysql

If MySQL starts, take full dumps immediately:

mysqldump --single-transaction --all-databases > /root/dump_wtrans.sql
mysqldump --all-databases > /root/dump.sql

The --single-transaction flag takes a consistent snapshot without locking tables, which is important for InnoDB. Run both variants - if the InnoDB data is partially corrupted, one may succeed where the other fails.

Filesystem corruption

If you are dealing with filesystem-level corruption, copy your backup files to a different disk drive or a remote host. Do not keep your only copies on the potentially failing drive.


Assessing the Situation

With your files backed up, the immediate time pressure is over. Take stock of what you are dealing with before proceeding.

Data Loss Severity

Scenario Recovery Outlook
Deleted rows Good - recoverable if MySQL was stopped quickly
Truncated table Moderate - partial to full recovery depending on time elapsed
Widespread corruption Surprisingly recoverable - PDRT can succeed where innodb_force_recovery fails
Dropped table (file-per-table on) Difficult - requires filesystem-level recovery of the deleted .ibd file first

Existing Backups

If you have older backups or dumps, they are valuable even if outdated. The table structure from an old backup can serve as the template for PDRT's table_defs.h, which is essential for the recovery process.

If the table still exists in a running MySQL instance, capture its structure now:

mysql -NBe "SHOW CREATE TABLE table_name" database_name

MySQL Version

Check your MySQL version - it determines which constraints_parser format flag to use:

mysql -V
Version Format Flag
MySQL 4.x (REDUNDANT format) -4
MySQL 5.0 - 5.5 (COMPACT format) -5
MySQL 5.6+ / MariaDB 10.x+ -6

Installing PDRT

The PDRT source is available from Percona's GitHub repository. The original Bazaar/Launchpad repository is no longer maintained.

Download and Compile

cd /root
git clone https://github.com/percona/percona-data-recovery-tool-for-innodb.git pdrt
cd pdrt/mysql-source
./configure
cd ..
make

Install Perl Dependencies

The create_defs.pl script requires the DBD::mysql Perl module. Check if it is installed:

perl -MDBD::mysql -e 1

If this prints an error about not finding DBD/mysql.pm, install it. On cPanel servers:

/scripts/perlinstaller DBD::mysql

On other systems:

cpanm DBD::mysql
# Or via package manager:
apt install libdbd-mysql-perl    # Debian/Ubuntu
dnf install perl-DBD-MySQL       # RHEL/Fedora

Getting Your Table Structure

Recovery requires a table definition that tells constraints_parser what row structure to look for. How you obtain this depends on whether MySQL is running.

If MySQL Is Running

1. Create a recovery database:

mysql -e "CREATE DATABASE mydb_recovered"

2. Extract the table structure:

mysql -NBe "SHOW CREATE TABLE customer" mydb

This produces the full CREATE TABLE statement. Save it - you will need it to recreate the table in your recovery database.

3. Recreate the table in the recovery database:

mysql mydb_recovered
SET foreign_key_checks=0;
CREATE TABLE `customer` (
  `customer_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `store_id` tinyint(3) unsigned NOT NULL,
  `first_name` varchar(45) NOT NULL,
  `last_name` varchar(45) NOT NULL,
  `email` varchar(50) DEFAULT NULL,
  `address_id` smallint(5) unsigned NOT NULL,
  `active` tinyint(1) NOT NULL DEFAULT '1',
  `create_date` datetime NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`customer_id`),
  KEY `idx_fk_store_id` (`store_id`),
  KEY `idx_fk_address_id` (`address_id`),
  KEY `idx_last_name` (`last_name`),
  CONSTRAINT `fk_customer_address` FOREIGN KEY (`address_id`)
    REFERENCES `address` (`address_id`) ON UPDATE CASCADE,
  CONSTRAINT `fk_customer_store` FOREIGN KEY (`store_id`)
    REFERENCES `store` (`store_id`) ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Disabling foreign key checks

SET foreign_key_checks=0 is required when recreating tables that reference other tables. Without it, CREATE TABLE fails if the referenced tables do not exist yet in the recovery database.

If MySQL Will Not Start

1. Try innodb_force_recovery:

Add to my.cnf under [mysqld]:

innodb_force_recovery = 1

Start MySQL and try increasing the value from 1 through 6 until it starts. See the MySQL documentation on forcing recovery for what each level does.

2. Reset the InnoDB files as a last resort:

If no recovery level works, move the InnoDB files aside so MySQL recreates fresh ones:

cd /var/lib/mysql
mv ibdata1 ibdata1.bak
mv ib_logfile0 ib_logfile0.bak
mv ib_logfile1 ib_logfile1.bak
systemctl start mysql

3. Extract structure from .frm files:

If you have the .frm file but MySQL cannot read the table, copy the .frm to a working MySQL instance (into a database directory where you have created a placeholder table with the same name), restart MySQL, and run SHOW CREATE TABLE.

Alternatively, the MySQL Utilities package includes mysqlfrm for extracting structure directly from .frm files.


Recovering Data

At this point you should have:

  • A copy of ibdata1.recovery and/or the relevant .ibd files
  • MySQL running with a recovery database containing the target table structures
  • PDRT compiled and ready

Automated Recovery

Generate the table definitions, compile, and run constraints_parser in one workflow:

cd /root/pdrt

# Generate table definitions from the recovery database
DB=mydb_recovered
TBL=customer
perl create_defs.pl \
  --host=localhost \
  --user=root \
  --password="$(grep pass /root/.my.cnf | cut -d= -f2 | sed -e 's/^"//' -e 's/"$//')" \
  --db=$DB \
  --table=$TBL > include/table_defs.h

# Recompile with the new definitions
make clean all

# Run constraints_parser against the data file
mkdir -p dumps/default
./constraints_parser -5 -f /root/innodb.bak/mydb/customer.ibd > dumps/default/customer

The output file dumps/default/customer contains tab-separated values for every row that matched the table definition. It will include both valid data and noise - rows where random data happened to match the column structure.

Manual Method with Custom Definitions

If the automated approach produces too much noise, you can run the tools manually with more control. Generate the table definitions:

cd /root/pdrt
DB=mydb_recovered; TBL=customer
perl create_defs.pl --host=localhost --user=root \
  --password="$(grep pass /root/.my.cnf | cut -d= -f2 | sed 's/^"//;s/"$//')" \
  --db=$DB --table=$TBL > include/table_defs.h

Recompile and run:

make clean all
./constraints_parser -5 -f /var/lib/mysql/ibdata1.recovery > recovery_output.tsv

To search specifically for deleted records, add the -D flag:

./constraints_parser -5 -Df /root/innodb.bak/mydb/customer.ibd

Tuning Table Definitions

The default table_defs.h generated by create_defs.pl uses broad constraints that match many false positives. Tightening the constraints dramatically reduces noise in the recovery output.

Here is the auto-generated definition for a first_name column:

{ /* varchar(45) */
    name: "first_name",
    type: FT_CHAR,
    min_length: 0,
    max_length: 135,

    has_limits: FALSE,
    limits: {
        can_be_null: FALSE,
        char_min_len: 0,
        char_max_len: 135,
        char_ascii_only: TRUE
    },

    can_be_null: FALSE
},

The key field is has_limits: FALSE - with limits disabled, constraints_parser accepts any data that structurally fits the column type. Enabling limits and tightening them filters out garbage data:

Field Default Tuned Reasoning
has_limits FALSE TRUE Enable constraint checking
can_be_null FALSE FALSE Every customer has a first name
char_min_len 0 2 No single-character first names in this dataset
char_max_len 135 30 No first name exceeds 30 characters
char_ascii_only TRUE TRUE Names in this table are ASCII

The tuned definition:

{ /* varchar(45) */
    name: "first_name",
    type: FT_CHAR,
    min_length: 0,
    max_length: 135,

    has_limits: TRUE,
    limits: {
        can_be_null: FALSE,
        char_min_len: 2,
        char_max_len: 30,
        char_ascii_only: TRUE
    },

    can_be_null: FALSE
},

After editing include/table_defs.h, recompile and run again:

cd /root/pdrt && make clean all
./constraints_parser -5 -Df /root/innodb.bak/mydb/customer.ibd

With tightened constraints, a search that previously returned thousands of false positives may return exactly the rows you need.


Importing Recovered Data

Once you have clean recovery output, you need to import it back into MySQL. The constraints_parser output includes both the data (on stdout) and the LOAD DATA INFILE SQL statement (on stderr).

Splitting Output and SQL

mkdir -p /root/pdrt/dumps/mydb_recovered

./constraints_parser -5 -Df /root/innodb.bak/mydb/customer.ibd -p mydb_recovered \
  > dumps/mydb_recovered/customer \
  2> customer.sql

# Remove progress lines from the SQL file
sed -i '/done/d' customer.sql

The -p mydb_recovered flag sets the dump path prefix in the generated LOAD DATA statement. The SQL file will contain something like:

SET FOREIGN_KEY_CHECKS=0;
LOAD DATA INFILE '/root/pdrt/dumps/mydb_recovered/customer'
  REPLACE INTO TABLE `customer`
  FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'
  LINES STARTING BY 'customer\t'
  (`customer_id`, `store_id`, `first_name`, `last_name`, `email`,
   `address_id`, `active`, `create_date`, `last_update`);

The LINES STARTING BY 'customer\t' clause is important - it skips all the page metadata comments in the dump file and only imports lines that begin with the table name followed by a tab.

Running the Import

mysql mydb_recovered < /root/pdrt/customer.sql

If you get an error about being unable to stat the dump file:

ERROR 13 (HY000) at line 2: Can't get stat of '/root/pdrt/dumps/mydb_recovered/customer' (Errcode: 13)

Change LOAD DATA INFILE to LOAD DATA LOCAL INFILE:

sed -i 's/LOAD DATA INFILE/LOAD DATA LOCAL INFILE/g' /root/pdrt/customer.sql
mysql mydb_recovered < /root/pdrt/customer.sql

Verifying the Results

mysql mydb_recovered -e "SELECT COUNT(*) FROM customer"
mysql mydb_recovered -e "SELECT * FROM customer LIMIT 5"

Compare row counts and spot-check data against any existing backups or dumps you have.


Recovery Decision Flowchart

Use this to determine your recovery path based on the scenario:

flowchart TD
    A[Data Loss Detected] --> B{MySQL Running?}
    B -->|Yes| C[Stop MySQL Immediately]
    B -->|No| D[Do NOT Start MySQL]
    C --> E[Backup ibdata1, .ibd, .frm, ib_logfile*]
    D --> E
    E --> F{What happened?}
    F -->|Deleted rows| G[Use constraints_parser with -D flag on .ibd file]
    F -->|Truncated table| H[Use constraints_parser on .ibd file]
    F -->|Dropped table| I{Have .ibd file backup?}
    F -->|Corruption| J[Try innodb_force_recovery first]
    I -->|Yes| G
    I -->|No| K[Filesystem recovery needed]
    J -->|MySQL starts| L[mysqldump + PDRT on ibdata1]
    J -->|Still fails| M[Reset ib* files + PDRT on ibdata1.recovery]
    G --> N[Tune table_defs.h and iterate]
    H --> N
    L --> N
    M --> N
    N --> O[Import via LOAD DATA INFILE]

Practical Exercise


Further Reading


Previous: Scaling & Architecture Patterns | Back to Index

Comments