Site Outage Today - 2020-10-16 - Site is Back Up
martyb writes:
We had an outage this morning -- "Internal Server Error" would appear when trying to load the main site.
I noticed this at about 0945 UTC from my mobile phone and immediately TXTed a message to "The Mighty Buzzard" (aka TMB) alerting him of the situation. Of course, it being 0545 EDT, he was sound asleep like any sane person would be.
I then booted up my computer, joined "#Soylent" on IRC, and discovered that others were already aware. It appears to have been first noted at 05:42:57 UTC by "SoyCow8732". That was followed not long after by "c0lo" and "lld". Soon after, "chromas" was on the scene and tried bouncing the front ends, but no joy. He sleuthed around and concluded it was likely a MySQL error, but our configuration is... interesting, and it was not obvious how to restart things.
My hands were mostly tied, as only a few days ago I managed to mess up Windows on my main system and would get a BSOD whenever I tried to boot it. I looked on from a system booted from an Ubuntu Live CD (well, actually, a USB stick).
Eventually, TMB appeared, took stock of the situation, and was able to get things running again in pretty short order. Thanks Buzz!
Synopsis (AIUI): our installation of MySQL is set up so that there are redundant copies of the DB running on two different servers. The intent is to provide redundancy so that if one instance goes down, the other can take over and carry things along until the failing system is recovered. That's great in theory, but not so good in practice. Thankfully, it does [mostly] work. We are continuing to monitor the situation. Be assured this is working its way up the priority queue! I mean, who likes to wake up and debug server issues before their first cup of coffee?
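For anyone unfamiliar with the idea, here is a rough sketch of what "the other can take over" means from a client's point of view. It is purely illustrative and is not our actual configuration: the host names, the `connect_with_failover` helper, and the `fake_connect` stand-in are all hypothetical, and a real setup would call a database driver instead.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(hosts: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each host in order and return the first successful connection."""
    errors = []
    for host in hosts:
        try:
            return connect(host)
        except ConnectionError as exc:
            # Remember why this host failed, then fall through to the next one.
            errors.append(f"{host}: {exc}")
    # Every redundant copy was unreachable, so report all failures together.
    raise ConnectionError("all database hosts failed: " + "; ".join(errors))


def fake_connect(host: str) -> str:
    # Hypothetical stand-in for a real driver call; "db1.example" is "down".
    if host == "db1.example":
        raise ConnectionError("refused")
    return f"connected to {host}"


result = connect_with_failover(["db1.example", "db2.example"], fake_connect)
```

In this sketch the first host refuses the connection and the second one answers, so the caller never sees the failure. If both were down, the caller would get a single error listing each host, which is roughly the "Internal Server Error" situation described above.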
So, that's my take on it. I'll leave it to TMB to add details/corrections should he deem it necessary.