Server WWW-30 in Cape Town is back online after emergency maintenance, an unplanned extended outage, and a second round of hardware maintenance.
Original report Thursday 18:20 UTC
We are investigating loss of network connection.
Update 18:40 UTC
Network connection restored. All services are back to normal.
Update 20:00 UTC
We thought all was fine, but the network problem reappeared an hour later! We are working with the data centre to investigate the problem.
Update Friday 00:05 UTC
Server is currently offline while we run network diagnostics. We have yet to identify the problem! We will bring the server back online before the start of business in South Africa.
Update 04:15 UTC
The intermittent network trouble persists. We are in contact with the data centre to replace the network hardware; this will require the server to be taken offline again sometime this morning (SA time). We will post an update here once we know more.
Update 05:00 UTC
Server is currently offline to have the network cards replaced.
Update 06:30 UTC
Server is still offline. We hope to be back online again soon!
Update 06:55 UTC
It appears that the network problem may be off-server, e.g. a faulty cable or switch. Data centre technicians are still attending. This may take a while :(
Update 07:40 UTC
The data centre technician is switching out network cards again. This is becoming a process of elimination. It now seems unlikely to be resolved very soon — it will take as long as it takes.
Update 08:30 UTC
In desperation, the technician is switching back to the on-board network card that we started out with. This should allow us to bring the server back online, but likely with the same intermittent connection problems as yesterday. If so, we will just nurse the server through this business day before we have another look at repairs.
Update 09:00 UTC
The server is back online again, and all services are functioning normally. It is very possible (perhaps likely) that there will be more intermittent network connectivity problems today; we are monitoring and will step in as needed.
We plan to take it offline again to replace the motherboard (and its built-in network interface). We understand that our clients have been through a lot today, and therefore we will schedule this work for after-hours (South Africa) today or sometime tomorrow. Please stay tuned for the confirmed maintenance schedule.
We regret the inconvenience.
Update 09:35 UTC
We are investigating multiple reports that email logins are failing, resulting in our firewall blocking connections (because the failures seem like attacks). Please stay tuned.
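For readers wondering why failed logins lead to blocked connections, here is a minimal sketch of the general idea: count recent login failures per address and block once a threshold is reached. This is an illustration only, not our firewall's actual configuration; the class name, the threshold and the time window are all hypothetical.

```python
from collections import defaultdict, deque

MAX_FAILURES = 5       # hypothetical threshold, not our real setting
WINDOW_SECONDS = 600   # hypothetical window, not our real setting


class FailedLoginTracker:
    """Tracks failed logins per IP address within a sliding time window."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)

    def record_failure(self, ip, now):
        """Record a failed login at time `now` (seconds).

        Returns True if the address has reached the failure threshold
        within the window and would therefore be blocked.
        """
        times = self._failures[ip]
        times.append(now)
        # Discard failures that have aged out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times) >= self.max_failures
```

A rule like this cannot tell an attacker from a legitimate mail client that keeps retrying a login while the server is misbehaving, which is consistent with what clients were reporting.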
Update 10:00 UTC
All websites and all email logins should be fully operational at this time. If you find otherwise, please contact us at firstname.lastname@example.org.
Through this escapade, the server literally lost track of time. As a result some emails received may show a timestamp that is off by a couple of hours.
Update 11:05 UTC
The network connection is still unstable, requiring a restart every hour or so. Not good.
We have now scheduled the replacement of the server motherboard (with built-in network adaptor) for later today at 16:00 UTC (18:00 SAST). The server will be offline again, hopefully for no more than one hour.
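All times in these updates are in UTC. South African Standard Time (SAST) is UTC+2 with no daylight saving, so the conversion is a fixed two-hour shift. A small sketch for checking the other update times; the function name is our own, and the date inside it is only a placeholder:

```python
from datetime import datetime, timedelta, timezone

# SAST is a fixed UTC+2 offset with no daylight saving time.
SAST = timezone(timedelta(hours=2), "SAST")


def utc_to_sast(hour, minute=0):
    """Convert a UTC time of day to SAST, formatted as HH:MM."""
    # The date is a placeholder; only the time of day matters here.
    utc_time = datetime(2000, 1, 1, hour, minute, tzinfo=timezone.utc)
    return utc_time.astimezone(SAST).strftime("%H:%M")


print(utc_to_sast(16))  # 18:00
```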
Update 16:10 UTC
Server is offline for replacement of the motherboard.
Update 16:55 UTC
Server is back online with new hardware, and all services are running normally. We are hopeful that this will resolve the issue. We will continue monitoring.