Today’s network problems were caused by not 1, not 2, not even 3, but 4 separate issues.
First, one of our older distribution switches decided to give up on life.
After we replaced it with newer hardware, everything ran just fine for a couple of hours. Then the new switch decided our network topology had changed (apparently because default configurations have changed since the days when the old one was deployed) and started dropping a good percentage of traffic.
This issue also affected another switch that some of our file servers plug into, so customer data was unavailable to a good portion of our hosting servers.
Consequently, one of our more active mail servers tried to dump all of its backed-up email at once, saturating part of the network that it’s on.
Once these configuration issues were resolved, one of the network interface cards in our main firewall machine died, requiring a quick swap.
We’re still working on some of the after-effects, and there may be some slowness while things get caught up, but everything should be back to normal later tonight.
Looks like things are still totally down over there, which is a few steps worse than “some slowness” if you ask me. Might I take this moment to recommend Server Beach?
UPDATE: Tuesday, 5:30am – it’s back up!