MySQL service stops with [ERROR] InnoDB: CORRUPT LOG RECORD FOUND
I run Plesk Obsidian version 18.0.42 on CentOS Linux release 7.9.2009 with mysql version 10.2.43-MariaDB. All of a sudden the database stopped working, and the MySQL service never comes up on a start request. On checking the logs I found the information below, reporting "CORRUPT LOG RECORD FOUND". I tried starting MySQL with innodb_force_recovery set to each value from 1 to 6; every value except 6 failed. With 6 I can start the service in recovery mode. The logs showed that many tables in the four major databases, including the Plesk database, were corrupted. The two most important were Moodle databases holding critical data of 7 GB and 2 GB respectively.
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Completed initialization of buffer pool
2022-04-13 8:37:27 140464863237888 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Highest supported file format is Barracuda.
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Starting crash recovery from checkpoint LSN=158872609427
2022-04-13 8:37:27 140465444636864 [ERROR] InnoDB: ############### CORRUPT LOG RECORD FOUND ##################
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Log record type 65, page 158872609792:140721128394336. Log parsing proceeded successfully up to 158872609427. Previous log record type 128, is multi 0 Recv offset 0, prev 0
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Hex dump starting 0 bytes before and ending 100 bytes after the corrupted record: len 100; hex 4152330034003500360029a8eb3e00380001800880068007800080018001800180018008ffffffffffffffffffffffff80018000800080008000800080008000ffffffffffffffff8000800080008000ffff8008800880088008800080008008ffff7fff; asc AR3 4 5 6 ) > 8 ;
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Set innodb_force_recovery to ignore this error.
2022-04-13 8:37:27 140465444636864 [Warning] InnoDB: Log scan aborted at LSN 158872673280
2022-04-13 8:37:27 140465444636864 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2022-04-13 8:37:27 140465444636864 [Note] InnoDB: Starting shutdown...
2022-04-13 8:37:27 140465444636864 [ERROR] Plugin 'InnoDB' init function returned error.
2022-04-13 8:37:27 140465444636864 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
I followed the Plesk KB article https://support.plesk.com/hc/en-us/...s-for-the-MySQL-databases-on-Plesk-for-Linux- to resolve the case. My concern is that I had to proceed with the risky step of removing the MySQL data directory. Even the Plesk database was on the list of corrupted databases. I finally managed to recover everything from the daily backups, and had to restore the psa database from backup as well. It took four very stressful hours to get everything back to normal. This is the second time I have faced this issue on this Plesk server. Can someone tell me why this is happening? How can we prevent it? Is there some way to monitor for it?
I cannot find anything on https://jira.mariadb.org about this error that would affect the 10.2.43 version you are running. There are a few potential causes related to corrupted data on disk, but there isn't enough information here to search for them.
Given that you didn't fully recover from your first crash, there may be underlying issues that caused the second. As the 4 hours of recovery were unmanageable, the backup/restore strategy needs to be re-examined. It's never pleasant, but with planning and practice it's manageable. Look at complementing the backup with mariabackup and binary logs as a point-in-time recovery mechanism.
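As a rough illustration of the mariabackup plus binary log idea, here is a minimal sketch. It assumes binary logging is already enabled on the server (log_bin in the [mysqld] section), that credentials are supplied through an option file, and that mariabackup recorded the binary log file and position in xtrabackup_binlog_info inside the backup directory. The function names, paths, positions and timestamps are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names, paths and timestamps are assumptions,
# not values taken from this server.

# Take a physical backup, then prepare it so the data files are consistent.
full_backup() {
    target_dir="$1"
    mariabackup --backup --target-dir="$target_dir" &&
    mariabackup --prepare --target-dir="$target_dir"
}

# After restoring the backup, replay binary log events from the position
# noted at backup time up to just before the incident.
# Arguments: start position, stop datetime, then one or more binlog files.
replay_binlogs() {
    start_pos="$1"
    stop_datetime="$2"
    shift 2
    mysqlbinlog --start-position="$start_pos" \
        --stop-datetime="$stop_datetime" "$@" | mysql
}

# Example usage (hypothetical values):
# full_backup /var/backups/mariadb/full
# replay_binlogs 4 "2022-04-13 08:30:00" /var/lib/mysql/mysql-bin.000042
```

Restoring the prepared backup itself (mariabackup --copy-back into an empty data directory with the server stopped) is left out here; it is worth rehearsing on a spare machine before it is needed in an emergency.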
A logical mysqldump of both of your critical databases and a restore to a clean instance (brand new data-directory) at a non-emergency time would be a good first step in improving reliability.
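The dump-and-reload step could be sketched as follows. This assumes credentials come from a client option file, and that the clean instance is reachable on its own socket; the function names, database name, directory and socket path are placeholders rather than values from this server.

```shell
#!/bin/sh
# Sketch only: function names, database name, directory and socket path
# are assumptions.

# Dump one database to <out_dir>/<db>.sql.
# --single-transaction takes a consistent InnoDB snapshot without holding
# table locks for the whole dump; --databases adds CREATE DATABASE and USE
# statements so the file can be loaded into an empty instance as-is.
dump_database() {
    db="$1"
    out_dir="$2"
    mysqldump --single-transaction --routines --triggers --events \
        --databases "$db" > "$out_dir/$db.sql"
}

# Load a dump file into the clean instance through its socket.
restore_database() {
    dump_file="$1"
    socket="$2"
    mysql --socket="$socket" < "$dump_file"
}

# Example usage (hypothetical values):
# dump_database moodle /var/backups/sql
# restore_database /var/backups/sql/moodle.sql /var/run/mysqld-clean/mysqld.sock
```

If a dump fails partway through on a particular table, that in itself points at where remaining corruption might be, which is useful to know before the next emergency.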
The MariaDB 10.2 series will soon reach end of life, so planning an upgrade (perhaps to 10.6) at the same time will put you in a better position if you need MariaDB fixes.
Now that you have a recent clean install, if it does occur again, I recommend you create a new bug report for the issue. As this error appears during crash recovery, the original crash is also a very useful thing to include in the bug report.