We experienced multiple website outages on multiple servers over the weekend.
Multiple servers became overloaded at various times over the weekend, which caused their Apache web servers to stall and websites to either not load or show error pages.
Our analysis and trials (unfortunately with more errors) revealed the following chain of events and led us to make several improvements:
- A persistent flood of attacks on WordPress websites aimed at brute-forcing logins via the wp-login.php and xmlrpc.php scripts.
- Our web firewall, Mod Security, caught the attacks, but it became increasingly slow at processing the volume of attacks, which led to Apache maxing out several CPU cores and stalling.
- We pruned the database that Mod Security keeps of attacking IP addresses; it had grown very large. The database will now be pruned automatically as part of the daily maintenance task.
- We tweaked the Apache configuration to use Mod LSAPI (LiteSpeed Server API), which provides an efficient, state-of-the-art method of serving PHP. This immediately lessened the server load.
- A secondary advantage of using Mod LSAPI was improved integration with Mod Security, which resulted in even more attack vectors being identified. This is great. However, with the attacks continuing relentlessly, we still saw cases of overloaded servers.
- The next step was tweaking our network firewall while slimming down the rule sets in Mod Security. Our network firewall now blocks connections from known botnet IP addresses; these blacklists are maintained by SpamCop, Project Honeypot and others. These steps resulted in another huge improvement in server load.
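
For readers curious about the blacklist step, here is a minimal Python sketch of the idea: parse a list of blocked addresses and ranges, then test whether a connecting IP falls inside any of them. It is an illustration only and not our actual firewall configuration, which does this at the network level. The feed format (one IP or CIDR range per line, with optional # comments), the function names, and the sample addresses (taken from the reserved documentation ranges) are all assumptions made for this example.

```python
import ipaddress


def load_blocklist(lines):
    """Parse blocklist entries (single IPs or CIDR ranges).

    Assumed feed format: one entry per line, '#' starts a comment.
    """
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue  # skip blank lines and comment-only lines
        # strict=False tolerates entries whose host bits are set
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(ip, networks):
    """Return True if the address falls inside any blocklisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


# Hypothetical sample feed using reserved documentation addresses
sample_feed = [
    "# example botnet feed",
    "192.0.2.0/24",
    "198.51.100.7   # single host",
    "",
    "2001:db8::/32",
]
blocklist = load_blocklist(sample_feed)
```

With this sample feed, is_blocked("192.0.2.55", blocklist) is true, while is_blocked("203.0.113.9", blocklist) is false. A linear scan like this is fine for a small list; a real firewall handling large feeds would rely on a dedicated set structure in the kernel rather than a script.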
All in all, we believe that this ordeal has helped us improve the speed and stability of our Apache servers. We are hopeful that our servers will continue to weather the storm of these attacks and remain stable from this point forward.
Because it was Easter weekend, few of our clients noticed the issues we were dealing with. Nevertheless, we regret any inconvenience you may have experienced.