Author Topic: OMG What happened?!  (Read 3837 times)

KwukDuck

  • Administrator
  • Legendary Member
  • *****
  • Thank You
  • -Given: 6003
  • -Receive: 7231
  • Posts: 5540
OMG What happened?!
« on: June 26, 2017, 09:19:32 pm »
What happened?
Essentially our database got messed up and with the help of poot222 we were able to rebuild it.
We believe the corruption was a result of a bug in the hypervisor that caused the virtual disk to disconnect.

Why did it take so long?
The system logs showed the server waiting on disk operations. At the same time, disk usage showed 100% even though no read or write operations were being done. This happened every few seconds, causing big lag on the site.
With the CPU stuck in a wait state the server becomes totally unresponsive, which also meant it was unresponsive to any debugging I was trying to do.
Every few hours or so the entire VM would crash (probably causing the database corruption). According to the logs, the cause was the system losing its connection to the hard disk.
While this was going on, the host system did not show these loads on either the hard disk or the CPU. It DID, however, show kernel panics related to the hard disk and the hypervisor driver.
I discovered this was a known bug that had been resolved in a newer version. We updated, which prevented further crashes.
Unfortunately the harm was already done, and our backups also contained the corrupted tables.
With the server now stable, we had to clean out the database and rebuild it. All is up and running again.
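
For anyone curious how the symptom above (CPU time going to disk wait) can be measured on a Linux guest, here is a minimal sketch. It is only an illustration, not the procedure that was actually used: the function names are made up, and it assumes the usual field order of the aggregate "cpu" line in /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal, ...).

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples of the 'cpu' line."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Sum only user..steal (first 8 fields); guest time is already part of user.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # iowait is the fifth counter (index 4) on the cpu line.
    return 100.0 * deltas[4] / total
```

Reading the first line of /proc/stat twice, a few seconds apart, and passing both strings to iowait_percent gives the share for that interval. A high value while the disk's read/write counters barely move would match the pattern described in the logs.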

This is where we are at now. Please report any problems you run into.

Thanks to poot22 and Indy for helping to fix this!
Thank you all for being patient!

Rv3SuperStar

  • Jr. Member
  • **
  • Thank You
  • -Given: 1
  • -Receive: 61
  • Posts: 42
Re: OMG What happened?!
« Reply #1 on: June 26, 2017, 10:25:10 pm »
I'd like to report a problem of lack of boobs and I expect KD to fix it pronto.

ThomasSmith

  • Hero Member
  • *****
  • Thank You
  • -Given: 49
  • -Receive: 2454
  • Posts: 2352
Re: OMG What happened?!
« Reply #2 on: June 27, 2017, 11:59:33 am »
It seems the unread topics were reset to sometime last year - for instance, I check the Bianca Beauchamp topic every couple of days, but when I clicked on the new postings button today, it jumped to page 26 of 33 and a posting from mid-2016.

KwukDuck

  • Administrator
  • Legendary Member
  • *****
  • Thank You
  • -Given: 6003
  • -Receive: 7231
  • Posts: 5540
Re: OMG What happened?!
« Reply #3 on: June 27, 2017, 12:47:54 pm »
That is correct. smf_log_topics was one of the tables that was corrupted. It's one of the most active tables, and a system crash is most likely to corrupt the tables that are being actively written to.
Instead of trying to restore it, we decided to purge it, since it takes up a lot of space and doesn't contain information that is too important for most users.

ThomasSmith

  • Hero Member
  • *****
  • Thank You
  • -Given: 49
  • -Receive: 2454
  • Posts: 2352
Re: OMG What happened?!
« Reply #4 on: June 27, 2017, 02:50:47 pm »
I keep track of new or updated threads through that feature and reset it with "mark all messages as read" before I log out.

How many new postings do our users post per day on average these days? I can only keep track of a fraction of the content nowadays.

Indy

  • Administrator
  • Hero Member
  • *****
  • Thank You
  • -Given: 1291
  • -Receive: 11322
  • Posts: 4486
Re: OMG What happened?!
« Reply #5 on: June 28, 2017, 06:46:27 pm »
The log of who read what at what time is about 3x bigger than the rest of our database combined. It's a thorn in our side.
You're basically the only one who resets it; most people don't. That means we need to store the read statuses of ALL our users for ALL our posts. This really is a huge amount of data.
It's very convenient, I do hear you, and I use it all the time as well, but we cannot promise we won't purge this table every once in a while. I think in the entire history of TPB we've purged it 4 times now.
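
To give a feel for why a per-user read log gets so big, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the member and topic counts are made up (not TPB's real numbers), the 12 bytes per row is a hypothetical figure for three 4-byte integers, and index overhead is ignored.

```python
def read_log_rows(members, topics, read_fraction=1.0):
    """Estimate rows in a log that keeps one entry per member per topic read.

    read_fraction is the assumed share of all member/topic pairs that
    actually have an entry (1.0 is the worst case: everyone read everything).
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    return round(members * topics * read_fraction)


def read_log_bytes(rows, bytes_per_row=12):
    """Raw storage for the rows, assuming a fixed (hypothetical) row size."""
    return rows * bytes_per_row


# Hypothetical forum: 50,000 members, 20,000 topics, 5% of pairs recorded.
rows = read_log_rows(50_000, 20_000, 0.05)
size = read_log_bytes(rows)
```

The point is that the row count grows with members times topics, so it keeps climbing even when nobody posts anything new, unless entries are regularly cleared out.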

ThomasSmith

  • Hero Member
  • *****
  • Thank You
  • -Given: 49
  • -Receive: 2454
  • Posts: 2352
Re: OMG What happened?!
« Reply #6 on: June 29, 2017, 02:36:52 pm »
vBulletin also gives you the option to store this information not in a server database table, but as a cookie on the user's computer. It is, of course, not as convenient, but it takes load off the server. As I've never used SMF, I do not know if they offer a similar option.
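
As a rough idea of what cookie-based read tracking could look like, here is a minimal sketch. It is not how vBulletin or SMF actually implement it: the cookie name, the "topic:message" encoding, and the function names are all made up for illustration, and the 4096-byte cap is the per-cookie size that browsers are commonly expected to support.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "topic_read"   # hypothetical cookie name
MAX_COOKIE_BYTES = 4096      # assumed per-cookie size limit


def encode_read_state(read_state):
    """Pack {topic_id: last_read_message_id} as 'topic:msg.topic:msg'."""
    payload = ".".join(f"{topic}:{msg}" for topic, msg in sorted(read_state.items()))
    if len(payload) > MAX_COOKIE_BYTES:
        raise ValueError("read state does not fit in a single cookie")
    return payload


def set_cookie_header(read_state, max_age=30 * 24 * 3600):
    """Build the value of a Set-Cookie header carrying the read state."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = encode_read_state(read_state)
    cookie[COOKIE_NAME]["max-age"] = max_age
    return cookie[COOKIE_NAME].OutputString()


def decode_read_state(cookie_header):
    """Recover the read state from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME not in cookie or not cookie[COOKIE_NAME].value:
        return {}
    pairs = (item.split(":") for item in cookie[COOKIE_NAME].value.split("."))
    return {int(topic): int(msg) for topic, msg in pairs}
```

With this encoding the server stores nothing per user, but the size cap means only a limited number of topics can be tracked per cookie, and the state lives only in that one browser.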

loveboobs

  • Global Moderator
  • Hero Member
  • *****
  • Thank You
  • -Given: 439
  • -Receive: 623
  • Posts: 1184
  • Plastic makes perfect...
Re: OMG What happened?!
« Reply #7 on: July 01, 2017, 12:57:02 am »
This is a much worse option - doesn't survive clearing cookies, doesn't persist across different computers.