掘金 后端 ( ) • 2024-05-05 17:13

因数广政务云华为业务存储固件升级,导致数据库产生坏块,业务SQL查询报错如下:

ERROR:  missing chunk number 0 for toast value 38166585 in SYS_TOAST_30170
CONTEXT:  PL/SQL function inline_code_block line 12 at FOR over EXECUTE statement

解决办法:

1、先查询出 SYS_TOAST_30170 属于哪张表的大字段。

select 
  n.nspname AS schema_name,
  c.relname AS table_name,
  a.attname AS column_name,
  bb.relname as TOAST_name
 from 
  sys_class c 
  inner join (select substring(sc.RELNAME from '[0-9]+') gg, sc.relname from sys_class sc) bb on c.oid = bb.gg
  inner join sys_attribute a on a.attrelid = c.oid
  inner join sys_namespace n ON n.oid = c.relnamespace
  WHERE 
    a.attnum > 0 
  AND NOT a.attisdropped
  AND a.atttypid = (SELECT oid FROM sys_type WHERE typname = 'text')  
  and bb.relname = 'SYS_TOAST_30170';

image.png

PostgreSQL使用以下脚本:

select 
  n.nspname AS schema_name,
  c.relname AS table_name,
  a.attname AS column_name,
  bb.relname as TOAST_name
 from 
    pg_class c 
    inner join (select substring(sc.RELNAME from '[0-9]+') gg, sc.relname from pg_class sc) bb on c.oid = bb.gg
    inner join pg_attribute a on a.attrelid = c.oid
    inner join pg_namespace n ON n.oid = c.relnamespace
    WHERE 
    a.attnum > 0 
    AND NOT a.attisdropped
    AND a.atttypid = (SELECT oid FROM pg_type WHERE typname = 'text')  
    and bb.relname = 'SYS_TOAST_30170';

2、创建一张记录含有坏块行的表、row_error_table

CREATE TABLE row_error_table (
    err_ctid tid,
    err_tablename varchar2(1000),
    err_cloum varchar2(500),
    err_CONTEXT text
);

3、使用下面脚本找到坏块在哪些行进行处理。


DO $$
DECLARE
    rec RECORD;        
    /* 需要检测坏块的表、手动输入 */
    v_tablename text := 'g_inbox';     
    /* 需要检测坏块的列、手动输入 */    
    v_columnname text := 'PROCESS_CONTENT';                
BEGIN
    FOR rec IN EXECUTE format('SELECT ctid, %I FROM %I ORDER BY ctid ', v_columnname, v_tablename) LOOP
        BEGIN
            RAISE NOTICE 'CTID Parameter is: %', rec.ctid;
            RAISE NOTICE 'ROW Parameter is: %', rec.PROCESS_CONTENT;
        EXCEPTION WHEN others THEN 
      /* 遇到 ERROR:  missing chunk number 0 for toast value ... 异常 直接记录在 row_error_table*/
            INSERT INTO row_error_table (err_ctid, err_tablename,err_cloum,err_CONTEXT) VALUES (rec.ctid, v_tablename,v_columnname,SQLERRM);
        END;
    END LOOP;
END;
$$ 
LANGUAGE plpgsql;

image.png

这个脚本会对需要检测坏块的表进行全表扫描,逐行验证是否有坏块。

v_tablename text := 'g_inbox';              -- 需要检测坏块的表、手动输入

v_columnname text := 'PROCESS_CONTENT';    --需要检测坏块的列、手动输入

检测坏块的时间取决于表的数据量。

image.png

脚本跑完以后会显示DO。

4、上面脚本跑完以后检查哪些行是有坏块。

SELECT * FROM row_error_table;

image.png

5、校验行是否有坏块。

image.png

查询整行记录报错,确定是有坏块

6、找到坏块的主键id。

image.png 找到主键ID以后,可以通过备份集找到原来的备份在异机进行恢复,然后通过主键ID找到着5行数据导出,然后在生产库导入即可。