CrazyEngineers
  • CE Goes Through Longest Unplanned Downtime - Here's What Happened

    Kaustubh Katdare

    Administrator

    Updated: Oct 26, 2024
    Views: 1.2K
    CEans,

    Yesterday, CrazyEngineers experienced the longest unplanned downtime in its history. To make matters worse, the outage also blocked our official email IDs the whole time. Though no emails were lost, we couldn't send or receive email, so we had no way to inform our members about the downtime. The aim of this post is to explain exactly what happened.

    You might be aware that CrazyEngineers is hosted in the cloud. Our web host wanted to move the instance that hosts CE to a new 'parent' infrastructure. This move usually takes about 5 hours and was therefore scheduled to commence at 1:00 AM IST on Monday. The idea was to get this done when CE experiences the lowest traffic every week.

    The server admins began the move at the planned time. Once the automated process starts, there's nothing the admins can do until it is over. The monitoring systems indicated that the move was progressing smoothly, but in reality it was far slower than it should have been, and at some point it got stuck without the indicators catching it. All along, the admins were under the impression that the move would be over 'very soon'. This went on throughout the day yesterday, and finally they had to cancel everything. It was then discovered that a nasty issue had prevented the process from completing.
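
    The failure described above, a job that keeps reporting "in progress" while its real progress has slowed or stopped, is the kind of thing a stall watchdog is meant to catch. Below is a minimal Python sketch of the idea. It is not our host's tooling: the `ProgressSample` shape, the `first_stall` function, and the threshold values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProgressSample:
    """One reading from a long-running job (hypothetical shape)."""
    timestamp: float  # seconds since the job started
    units_done: int   # cumulative work reported by the job


def first_stall(samples: List[ProgressSample],
                stall_after: float,
                min_progress: int) -> Optional[float]:
    """Return the timestamp at which the job should be flagged as stalled.

    The job is flagged when at least `stall_after` seconds have passed
    without it completing `min_progress` further units of work.
    Returns None if the job never looks stalled.
    """
    if not samples:
        return None
    anchor = samples[0]  # last sample at which real progress was confirmed
    for sample in samples[1:]:
        if sample.units_done - anchor.units_done >= min_progress:
            anchor = sample  # enough new work: reset the stall clock
        elif sample.timestamp - anchor.timestamp >= stall_after:
            return sample.timestamp  # too long without enough new work
    return None


# Illustrative readings: the job runs well, then crawls.
readings = [
    ProgressSample(0, 0),
    ProgressSample(60, 100),
    ProgressSample(120, 200),
    ProgressSample(180, 205),
    ProgressSample(240, 205),
    ProgressSample(300, 206),
]
stalled_at = first_stall(readings, stall_after=120, min_progress=50)
```

    With these made-up readings the job is flagged at the 240-second mark, even though its status would still read "in progress". The point of measuring work done rather than status is that a crawling job gets flagged as well as a frozen one.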

    The whole thing resulted in CE being inaccessible to everyone for more than 24 hours. The fixes are being worked out, and hopefully we'll be able to execute the operation this week with no downtime (or with a short downtime during off hours). Thank you for your patience.

    We usually make sure that CE stays up all the time, and over the past years we've maintained 99.99% uptime, with this recent exception. I hope we'll be able to manage this in a far better way in future.
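
    To put the 99.99% figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes a 365-day year and treats the outage as exactly 24 hours (the real one was somewhat longer); the function names are made up for this illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def allowed_downtime_minutes(uptime_percent: float,
                             period_hours: float = HOURS_PER_YEAR) -> float:
    """Minutes of downtime permitted in the period at a given uptime level."""
    return (1 - uptime_percent / 100) * period_hours * 60


def achieved_uptime_percent(downtime_hours: float,
                            period_hours: float = HOURS_PER_YEAR) -> float:
    """Uptime percentage for the period, given the total hours of downtime."""
    return (1 - downtime_hours / period_hours) * 100


budget = allowed_downtime_minutes(99.99)      # about 52.56 minutes per year
this_year = achieved_uptime_percent(24)       # about 99.726 percent
print(round(budget, 2), round(this_year, 3))
```

    In other words, 99.99% uptime leaves room for under an hour of downtime per year, so a single 24-hour outage on its own brings that year down to roughly 99.7%.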

    Thanks to all of those who sent emails and expressed concern about CE's unavailability. We had a few announcements planned for yesterday, which we will make today.

    Let the action resume! 👍
Replies
  • Anoop Kumar

    Member · May 6, 2013

    Finally .. welcome aboard 👍
  • lal

    Member · May 6, 2013

    Aye aye, captain! Replenished the supplies. Now let the sail resume!
  • Ankita Katdare

    Administrator · May 6, 2013

    It just felt so empty to not see this page the entire day. 😔

    It is GOOD to be back! 😁 Let's roll!
  • Kaustubh Katdare

    Administrator · May 6, 2013

    It looks like the troubles aren't over yet. We just had another downtime and had to reboot the server.

    We'll work on resolving the issues during off hours today.
  • Rehana Thakur

    Member · May 6, 2013

    I am always waiting for updates on this page.
  • Saandeep Sreerambatla

    Member · May 6, 2013

    Plan an upgrade on Saturday night or Sunday night IST, that's the best time 😀
  • Rehana Thakur

    Member · May 6, 2013

    Thanks for the quick reply.
  • Kaustubh Katdare

    Administrator · May 6, 2013

    English-Scared
    Plan an upgrade on Saturday night or Sunday night IST, that's the best time 😀
    We won't be able to hold on that long. We're planning a maintenance update tonight, around 1-2 AM.
  • Abhishek Rawal

    Member · May 6, 2013

    Kaustubh Katdare
    We won't be able to hold on that long. We're planning a maintenance update tonight, around 1-2 AM.
    How long will it take? 😨
  • Kaustubh Katdare

    Administrator · May 6, 2013

    Abhishek Rawal
    How long will it take? 😨
    Less than one hour.
  • Abhishek Rawal

    Member · May 6, 2013

    Kaustubh Katdare
    Less than one hour.
    That's good news for me.
  • Sanyam Khurana

    Member · May 6, 2013

    Oh, finally CE is back!!

    I was worried about where CE had gone yesterday, and now it's back...

    Yeah..😁
  • CE Designer

    Member · May 7, 2013

    Was much damage done as a result of this downtime, besides all the broken hearts? Haha, it's funny, I realised yesterday that CE has become such a part of my daily routine. It threw me off a bit yesterday.
  • Anoop Mathew

    Member · May 7, 2013

    Thanking all admins of CE for their combined efforts to stabilize the issue.

    Thanks #-Link-Snipped-# for quick response on Facebook.
  • Abhishek Rawal

    Member · May 7, 2013

    Oh gawd!
    The maintenance ran from 2:12 AM to 7:40 AM today. I was waiting the entire night.

    Never mind, CE is live!
    CrazyEngineers, let's hustle
  • Kaustubh Katdare

    Administrator · May 7, 2013

    Yes, it again took longer than expected. While the server is up, the error-fixing process is still running in the background.

    I expect another scheduled maintenance sometime around the weekend, during late-night hours. I will post an update if downtime is required.
  • Kaustubh Katdare

    Administrator · May 7, 2013

    Update: We had another ~4 hours of downtime this morning (IST). The load on the server spiked and services went down. Once we got the server back up, repairing the databases took extra time.

    I thank all CEans for their patience. We're now working to find out the causes of the server failures and fix them. I'll keep everyone posted through this thread.