Issue326

Title Postgres error: invalid page header ...
Priority critical Status resolved
Superseder Nosy List ajaksu2, izak, loewis, r.david.murray, roche
Assigned To Topics

Created on 2010-03-24.08:41:40 by loewis, last changed 2010-04-04.18:13:44 by loewis.

Files
File name Uploaded Type Edit Remove
ids loewis, 2010-04-01.23:10:22 application/octet-stream
Messages
msg1589 (view) Author: loewis Date: 2010-03-24.08:41:40
It seems that the postgres database is partially corrupted. For roundup_tracker, doing

select count(*) from _file where id='2'

gives

ERROR:  invalid page header in block 0 of relation "_file"

Likewise, pg_dumpall produces the same error, as does "vacuum".

Any help appreciated.
msg1590 (view) Author: r.david.murray Date: 2010-03-24.22:28:23
I've done some research on this.  This kind of error appears, as far as the postgresql mailing list is concerned, to be associated with hardware or kernel problems.  That is, something on the system corrupted the database file.  At least, I haven't found any indication that this kind of error message has ever been tracked down to anything else.  Perhaps someone with postgresql internals knowledge could make something out of the data in the corrupted file, though.  Is it something I could share with others (ie: can the information be considered sensitive in any way?  My thought is no)?  Assuming someone from the mailing list would be willing to look.

In addition to this, however, we apparently also have missing pg_clog files.  These are deleted by vacuum and autovacuum when they are no longer needed, apparently, but currently there is only one, that was created on the 24th...so at some point clog files were deleted that were still needed.  A post I found on the mailing list opines that this should only happen if the files are deleted by something outside postgresql.

Do we have a backup of that table somewhere?  We may be stuck with restoring it from backup.

We could also try (after making a full backup of the binary db files!) restarting postgres with zero_damaged_pages=1, which will get past the error but result in data loss, which could perhaps be patched to be less data loss working from a backup.  Since this is page 0, maybe all the data lost would be in the backup.  Conceivably I could try that on a test tracker install first, but I haven't set one up yet.

It may be possible to finesse the missing transaction data too, but again there may be data loss.

I'm concluding all this based on google-foo, so if someone with better postgresql chops can look at this that would of course be better.

And however we recover, I hope we can figure out what caused the corruption to begin with, otherwise it may just reoccur.
msg1591 (view) Author: loewis Date: 2010-03-24.22:36:15
> Do we have a backup of that table somewhere?  We may be stuck with
> restoring it from backup.

We don't have a postgres dump. However, there are backups, in
~roundup/backup (and probably to-tape backups of that from earlier
dates; I would also hope that there are backups of the raw database tables).

> And however we recover, I hope we can figure out what caused the
> corruption to begin with, otherwise it may just reoccur.

That would be for Upfront to answer.
msg1592 (view) Author: roche Date: 2010-03-25.15:28:05
On Wed, 2010-03-24 at 22:36 +0000, Martin v. Löwis wrote:
> Martin v. Löwis <martin@v.loewis.de> added the comment:
> 
> > Do we have a backup of that table somewhere?  We may be stuck with
> > restoring it from backup.
> 
> We don't have a postgres dump. However, there are backups, in
> ~roundup/backup (and probably to-tape backups of that from earlier
> dates; I would also hope that there are backups of the raw database tables).
> 
> > And however we recover, I hope we can figure out what caused the
> > corruption to begin with, otherwise it may just reoccur.
> 
> That would be for Upfront to answer.

I have asked Izak to have a look. Apologies for making you wait. I'm
currently on holiday and not checking my mail frequently and our whole
team was out of the office yesterday.
msg1593 (view) Author: izak Date: 2010-03-25.15:47:07
Martin v. Löwis wrote:
>> And however we recover, I hope we can figure out what caused the
>> corruption to begin with, otherwise it may just reoccur.
> 
> That would be for Upfront to answer.

We didn't change the kernel or the hardware. We did upgrade the virtual 
host on which psf runs, including the postgres installation. I suspect 
we need only shut it down, run a fsck on the filesystem to make sure 
that is in good shape. Let me know when would be a good time. I suppose 
any time and as soon as possible?

Unfortunately I am not an expert in postgresql on THAT low a level, so I 
would also recommend a restore from backup.
msg1594 (view) Author: loewis Date: 2010-03-25.23:34:07
> We didn't change the kernel or the hardware. We did upgrade the virtual 
> host on which psf runs, including the postgres installation. I suspect 
> we need only shut it down, run a fsck on the filesystem to make sure 
> that is in good shape. Let me know when would be a good time. I suppose 
> any time and as soon as possible?

Sure, go ahead.

> Unfortunately I am not an expert in postgresql on THAT low a level, so I 
> would also recommend a restore from backup.

I'd rather avoid that. The backup is not current enough, and new backups
aren't being made because of the corruption.
msg1595 (view) Author: loewis Date: 2010-03-26.06:46:54
I have now restarted postgres; that alone seems to have resolved some of the problems.
msg1596 (view) Author: izak Date: 2010-03-26.07:35:11
I've checked the filesystem, and though it wasn't checked in 420 days, 
it was still fine. I cleaned up some of the startup scripts that do 
nothing on a virtual host (things that set the hardware clock etc) and 
then restarted it.
msg1597 (view) Author: r.david.murray Date: 2010-03-26.15:43:31
If someone can restore one of the tape backups of file.csv to somewhere on the machine (the one in ~roundup/backup is now empty), I will try to make time this weekend to experiment with ways to recover that table.
msg1602 (view) Author: ajaksu2 Date: 2010-03-26.18:53:51
Martin v. Löwis wrote:
>> Unfortunately I am not an expert in postgresql on THAT low a level, so I
>> would also recommend a restore from backup.
>
> I'd rather avoid that. The backup is not current enough, and new backups
> aren't being made because of the corruption.

Martin, are we positive there's missing data or is it possible that we
just can't access the table?

I've found a message[1] suggesting we can still get the data if we
skip accessing the header on _file.

If you issue "SET enable_seqscan TO 'off'; ", does "select * from
_file where __retired__=0" work? And does the dump work?

HTH,
Daniel

[1] http://osdir.com/ml/pgsql-general/2010-02/msg00380.html
msg1603 (view) Author: r.david.murray Date: 2010-03-26.19:22:45
Unfortunately, no:

roundup_tracker=# set enable_seqscan to 'off';
SET
roundup_tracker=# select * from _file where __retired__=0;
ERROR:  invalid page header in block 0 of relation "_file"

I'm hoping that if there is any missing data it is old data that we have a backup of (somewhere), and that the rest will be readable after zeroing the invalid page.  But I'm not going to test that theory on the live database, so I need to set up a test instance in order to test my theory...
msg1604 (view) Author: r.david.murray Date: 2010-03-26.19:33:23
Hmm.  What I just reported makes me wonder, though.  Accessing the data through that index should have worked :(

Using file_retired=1 does not produce the header error, but gives an empty table.  (I don't know what values are valid).  Maybe retired=0 is trying to read an item from the first database page, and thus gets the error.
msg1606 (view) Author: ajaksu2 Date: 2010-03-26.20:07:55
David,
What does  "EXPLAIN  select * from _file where __retired__=0;" give you?

I get "Index Scan using _file_retired_idx on _file  ..." after setting
enable_seqscan to off, but "Seq Scan on _file ..." in the default
case.

IIRC, __retired__ is an integer used as a bool, defaulting to 0 (file
is 'alive').

You could also try to filter by id to avoid hitting page 0, something
like this seems to perform an index scan too:

select * from _file where __retired__=0 and id>15000;
msg1607 (view) Author: loewis Date: 2010-03-26.20:28:57
> What does  "EXPLAIN  select * from _file where __retired__=0;" give you?
> 
> I get "Index Scan using _file_retired_idx on _file  ..." after setting
> enable_seqscan to off, but "Seq Scan on _file ..." in the default
> case.

Same here.

> select * from _file where __retired__=0 and id>15000;

That does indeed produce a result. Not sure how this is going to help,
though.
msg1608 (view) Author: ajaksu2 Date: 2010-03-26.20:57:59
Martin v. Löwis wrote:
>> select * from _file where __retired__=0 and id>15000;
>
> That does indeed produce a result. Not sure how this is going to help,
> though.

Not sure it will help. It gives us some info on whether we lost data
(and if so, where) or just can't access records easily.

It seems to be possible to get a (lossy?) dump of the DB under
zero_damaged_pages='on', as David proposed.

If it makes any sense, this might help getting most of the data from
_file before zeroing pages, if only to know how useful an ancient
backup for _file would be.
msg1609 (view) Author: r.david.murray Date: 2010-03-26.21:10:20
That query works the same whether you use retired or not.  Looks like we at least have the current data from id 10336 on (creation date 2008-05-16).  If we have a backup of the earlier data we can at least recover the most of the data.  We'd still be missing any changes made to older objects, but I don't think file information changes very often.

We could also ask on the postgres forum to see if anyone is knowledgeable enough to actually repair the page for us.  It's possible just some bits got twiddled rather than wholesale corruption of the block, but I don't know how to tell.  I won't have time to get subscribed and put together a post until tomorrow sometime.
msg1610 (view) Author: loewis Date: 2010-03-26.21:24:46
> That query works the same whether you use retired or not.  Looks like
> we at least have the current data from id 10336 on (creation date
> 2008-05-16).  If we have a backup of the earlier data we can at least
> recover the most of the data.  We'd still be missing any changes made
> to older objects, but I don't think file information changes very
> often.

I'm tempted to create the lossy dump right away, drop the table, and
recreate it. We shouldn't invest too much time into this: the actual
file data are on disk, and hopefully Upfront can provide us with some
old backup within a few days (certainly not over the weekend, though).

As for changes, we also have file__journal, which seems to dump fine.
msg1611 (view) Author: loewis Date: 2010-03-26.22:49:54
I have now created a full database tgz if anybody wants to take a look.

I have then enabled the zero_damaged_pages setting, and vacuum'ed. That produced these messages:

invalid page header in block 0 of relation "_file"; zeroing out page
invalid page header in block 8 of relation "pg_type_typname_nsp_index"; zeroing out page

Dumping the database still failed with a missing pg_clog file. I created one that was zeroed out; this let me proceed. Postgres didn't decide to delete the log file afterwards; this worries me somewhat.

Dumping the database still doesn't work; it gives this error message:

ERROR:  invalid memory alloc request size 18446744073709551613

This happens when doing "select *" or "copy" on _file. Notice that this number is 0xfffffffffffffffd.

izak: can you please provide a copy of the latest backup of /home/roundup/backups/tracker-export where file.csv is not empty?
msg1613 (view) Author: ajaksu2 Date: 2010-03-29.00:20:38
> Dumping the database still doesn't work; it gives this error message:
>
> ERROR:  invalid memory alloc request size 18446744073709551613
>
> This happens when doing "select *" or "copy" on _file. Notice that this number is 0xfffffffffffffffd.

Ugh. It seems we're left with copy-ing the other tables to files,
salvaging _file and rebuilding it with a backup and/or file_journal.

I could try to roll a script to perform these tasks if you're interested.
msg1614 (view) Author: r.david.murray Date: 2010-03-29.14:13:47
I think any help will be welcome, Daniel :)

It turns out I can't even play around with this because I don't have a 64bit linux on which to load the binary copy of the database.  I should be able to set up a KVM instance (which I want to do anyway), but that will take me a bit.
msg1615 (view) Author: loewis Date: 2010-03-29.20:25:27
I'm going to try to write a script that selects each individual _file record, dumps them all into a pickle, so that I can drop the table afterwards. I hope that this will then allow to dump the database.
msg1631 (view) Author: loewis Date: 2010-04-01.07:15:55
I have now run a script that selects each _file record individually; this failed for 113 and 152. Some of them returned a NULL id (even though I selected them by id), namely 107, 116, 119, 126, 127, 10337. I have deleted all these records from the database; afterwards, pg_dump would create a dump of the table (I actually had to delete 119 from the dump instead). I dropped the table, and reloaded the dump.

Now I can create dumps of the entire postgres cluster.
msg1632 (view) Author: loewis Date: 2010-04-01.23:10:22
Upfront has provided a backup from Feb 15. I have checked that none of the issues backed up at that point had been modified after after Feb 15. I have restored all missing _file entries to that point.

Attached is a list (file:ids1) of _file records that have been restored.
msg1633 (view) Author: loewis Date: 2010-04-01.23:12:57
Here is a list of file ids which are missing from range(16731): [0, 9109, 9110, 9184, 9278, 9347, 9375, 9376, 9377, 9378, 10156, 10157, 10158, 10159, 10160, 10161, 10162, 10163, 10164, 10165, 10166, 10167, 10168, 10169, 10295, 11287, 11294, 11825, 13128, 13340, 13341, 13342, 13343, 13344, 13345, 14084, 14209, 14898, 15500, 16149]

It would probably be useful to find out which of these are referenced from issues. I'll do that when I have time.
msg1636 (view) Author: loewis Date: 2010-04-03.15:32:07
None of these fileids are linked to issues, so it appears these ids really don't exist. Closing this as resolved.
msg1637 (view) Author: r.david.murray Date: 2010-04-04.02:09:51
Interesting.  We just got a traceback email for one of them (13128), from an access by anonymous directly to the 'unnamed' file.  Looks related to some of the spam we get.
msg1638 (view) Author: loewis Date: 2010-04-04.07:26:24
I think that's actually a roundup bug. If somebody accesses fileXXX/path, it shouldn't choke, but report http 404.

I checked the apache logs; the access to file13128/unnamed comes from Googlebot. Unfortunately, they don't provide a referer. I guess Googlebot will continue trying until they eventually get a proper result.
msg1639 (view) Author: loewis Date: 2010-04-04.18:13:44
I have now fixed the roundup bug with r79748 and r79749.
History
Date User Action Args
2010-04-04 18:13:45loewissetstatus: chatting -> resolved
messages: + msg1639
2010-04-04 07:26:25loewissetstatus: resolved -> chatting
messages: + msg1638
2010-04-04 02:09:59r.david.murraysetstatus: chatting -> resolved
2010-04-04 02:09:51r.david.murraysetstatus: resolved -> chatting
messages: + msg1637
2010-04-03 15:32:08loewissetstatus: chatting -> resolved
messages: + msg1636
2010-04-01 23:12:58loewissetassignedto: izak ->
messages: + msg1633
2010-04-01 23:10:23loewissetfiles: + ids
messages: + msg1632
2010-04-01 07:15:55loewissetmessages: + msg1631
2010-03-29 20:25:27loewissetmessages: + msg1615
2010-03-29 14:13:48r.david.murraysetmessages: + msg1614
2010-03-29 00:20:39ajaksu2setmessages: + msg1613
2010-03-26 22:49:55loewissetmessages: + msg1611
2010-03-26 21:24:46loewissetmessages: + msg1610
2010-03-26 21:10:20r.david.murraysetmessages: + msg1609
2010-03-26 20:57:59ajaksu2setmessages: + msg1608
2010-03-26 20:28:57loewissetmessages: + msg1607
2010-03-26 20:07:55ajaksu2setmessages: + msg1606
2010-03-26 19:33:23r.david.murraysetmessages: + msg1604
2010-03-26 19:22:46r.david.murraysetmessages: + msg1603
2010-03-26 18:53:51ajaksu2setnosy: + ajaksu2
messages: + msg1602
2010-03-26 15:43:32r.david.murraysetmessages: + msg1597
2010-03-26 07:35:12izaksetmessages: + msg1596
2010-03-26 06:46:55loewissetmessages: + msg1595
2010-03-25 23:34:08loewissetmessages: + msg1594
2010-03-25 15:47:07izaksetmessages: + msg1593
2010-03-25 15:28:05rochesetmessages: + msg1592
2010-03-24 22:36:16loewissetmessages: + msg1591
2010-03-24 22:28:25r.david.murraysetstatus: unread -> chatting
nosy: + r.david.murray
messages: + msg1590
2010-03-24 08:41:40loewiscreate