It was time to do an OS upgrade on a webserver I maintain that hosts a couple of reasonably active sites. The two most active sites on the server bring in about 12,000 unique visitors a month and serve up about 4 million hits between them. Of course this is small potatoes compared to the really big boys, but it is pretty much a given that somebody is hitting the server at any given moment, and the demographics are such that there is really no reasonable time for the server to be down for any length of time.
Yesterday I went through a "rehearsal" upgrade on the development server to check for any possible gotchas. There were a couple that made me glad to have done the rehearsal. With the gotchas worked out, and satisfied that everything was going to go alright, today I tackled the production server, which lives in a datacentre about 2,500 km away.
The upgrade was from Fedora 8 to Fedora 9. In the Windows world this would be roughly equivalent to upgrading from Windows 2000 Server to Windows Server 2003, i.e. upgrading one major release.
Step 1: Installing some yum plugins to help things along
yum install yum-utils
yum install yum-fastestmirror
As the name implies, yum-utils adds some handy utilities for Yum. Notably I was after yum-complete-transaction as in testing on the development server I had run into an incomplete transaction. I wasn't expecting the production server to have the same problem, but better safe than sorry.
Yum-fastestmirror is useful for speeding things along, especially when you'll be downloading a lot of updates. It basically figures out which mirrors are responding best at the moment and latches on to the fastest one. For an upgrade like this it will save you many minutes of downloading.
Step 2: Clean up
yum-complete-transaction
yum update
yum clean all
The first command makes sure the updater hasn't got any outstanding half-finished work. As expected this was not the case on the production server and the command simply returned that there were no incomplete transactions. Good.
Next we make sure everything installed on the current release is up to date before upgrading to the next release. In my particular case, everything was up to date and yum reported that there was nothing to do.
Lastly we clear out yum's cached packages and metadata so the upgrade starts from a clean slate.
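If you would rather not babysit the three clean-up commands, they can be chained so that a failure stops the sequence before anything else runs. This is only a minimal sketch: run_steps is a hypothetical helper of my own naming, not something that ships with yum.

```shell
# Run each command in order; stop at the first one that fails.
# run_steps is a hypothetical helper, not part of yum or yum-utils.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        sh -c "$step" || { echo "FAILED: $step" >&2; return 1; }
    done
}

# Usage for the clean-up step (as root):
# run_steps "yum-complete-transaction" "yum update" "yum clean all"
```

The point of stopping early is that there is no sense cleaning caches or moving on to the next release if the current one could not be brought up to date.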
Step 3: Update the Yum repositories for the new release
Because I intentionally run a release behind the bleeding edge there's always somebody who has already done the legwork. As with past upgrades, I found the instructions and subsequent comments at http://www.ioncannon.com very helpful and just did a copy-paste of the following command:
rpm -Uhv http://mirror.liberty.edu/pub/fedora/linux/releases/9/Fedora/i386/os/Pac... http://mirror.liberty.edu/pub/fedora/linux/releases/9/Fedora/i386/os/Pac...
Because I was frankly too lazy to type in a better mirror and the response from this one was pretty slow, this was the most time-consuming bit so far. It took nearly a minute to download the packages.
I started step one at precisely 3:00 p.m. At the end of step three it was now 3:02 p.m.
Step 4: A little more prep
yum clean all
yum update fedora-release
Another "clean all" probably isn't necessary, but it sure doesn't hurt.
The next command, "yum update fedora-release" was recommended by a couple of users in the comments following the instructions at IonCannon. It goes a long way to eliminating dependency conflicts in the next step. In fact, the biggest gotcha in my rehearsal run on the development server was that the update failed with a metric crapload of dependency conflicts. I could have easily spent hours removing and reinstalling packages to get around them, but this cleared it all up in one fell swoop.
At this point it is now 3:05 p.m., no reboots, no down-time.
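Before committing to the big update, it doesn't hurt to confirm that the box now believes it is on the new release. A minimal sketch, assuming /etc/fedora-release carries the usual "Fedora release N (Codename)" wording; fedora_release_number is a hypothetical helper name.

```shell
# Print the release number from a fedora-release style file.
# Assumes the usual "Fedora release N (Codename)" wording.
# Defaults to /etc/fedora-release when no file is given.
fedora_release_number() {
    sed -n 's/^Fedora release \([0-9][0-9]*\).*/\1/p' "${1:-/etc/fedora-release}"
}

# Hypothetical guard before Step 5:
# [ "$(fedora_release_number)" = "9" ] || echo "fedora-release is not at 9 yet" >&2
```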
Step 5: The meat of the update, this is it
yum -y update
Now yum goes out and downloads all the new packages and installs them. In my case, 999 packages totalling 841 MB. The production server has a pretty fat pipe and the fastest-mirror plugin found a mirror that had some impressive throughput. It took 16 minutes and 1 second to download all the packages, and then another 47 minutes or so to install them all. None of this required any intervention on my part, not one keystroke. I spent the hour doing other things. And note well, this all goes on while the server is up, running, and happily serving pages.
At this point it is now 4:07 p.m., zero downtime.
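For the curious, the download figures from this step can be turned into an average rate with a bit of shell arithmetic. This is a rough sketch that assumes 1 MB = 1000 kB and uses integer maths, so the result is rounded down; avg_kb_per_sec is a hypothetical helper name.

```shell
# Average download rate in kB/s, given a size in MB and a duration in seconds.
# Assumes 1 MB = 1000 kB; integer arithmetic, so the result is rounded down.
avg_kb_per_sec() {
    echo $(( $1 * 1000 / $2 ))
}

# 841 MB in 16 minutes 1 second (961 seconds)
avg_kb_per_sec 841 $(( 16 * 60 + 1 ))   # prints 875
```

That is about 875 kB/s sustained, or roughly 7 Mbit/s averaged over the whole download.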
Step 6: Reboot
reboot
So here it is, the point where, in order to get up and running with the new release, we have one of the rare moments that a Linux server needs to be rebooted.
It takes a few minutes for all the processes to stop, especially the Lotus Domino processes, so the server is still serving up pages for the next seven minutes. At 4:14 p.m. the server stops responding to pings and I get the first "page not found" error as I'm refreshing the browser out of curiosity. Less than a full minute later it starts responding to pings again and Apache starts serving pages again. It takes another half-minute or so for Domino to start up fully.
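I timed the outage by watching pings and refreshing the browser, but the same thing can be logged automatically. Here is a minimal sketch of a loop that runs a probe command once a second and prints a timestamp whenever the state flips between up and down. watch_probe is a hypothetical helper, and the hostname in the usage comment is a placeholder.

```shell
# watch_probe PROBE TRIES: run PROBE once a second, TRIES times in total,
# and print a timestamped line whenever the result changes (up <-> down).
watch_probe() {
    probe=$1
    tries=$2
    last=""
    i=0
    while [ "$i" -lt "$tries" ]; do
        if sh -c "$probe" >/dev/null 2>&1; then state=up; else state=down; fi
        if [ "$state" != "$last" ]; then
            echo "$(date '+%H:%M:%S') $state"
            last=$state
        fi
        i=$((i + 1))
        if [ "$i" -lt "$tries" ]; then sleep 1; fi
    done
}

# Usage from another machine (hostname is a placeholder):
# watch_probe "ping -c 1 -W 1 www.example.com" 900
```

Swapping the ping for an HTTP request of your choice would measure when Apache is actually serving pages again rather than when the kernel answers pings.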
Step 7: Minor tweaks
After the reboot, when everything was up and running again, I had to remove a few lines from the odbc.ini to stop some "errors" (really information-only warnings) on the Domino console generated by a DECS connection to the MySQL server.
[MYODBCUtilReadDataSource.c][233][ERROR] Unknown attribute (DSN).
[MYODBCUtilReadDataSource.c][233][ERROR] Unknown attribute (ReadOnly).
[MYODBCUtilReadDataSource.c][233][ERROR] Unknown attribute (ServerType).
[MYODBCUtilReadDataSource.c][233][ERROR] Unknown attribute (FetchBufferSize).
[MYODBCUtilReadDataSource.c][233][ERROR] Unknown attribute (TraceFile).
[MYODBCUtilReadDataSource.c][233][ERROR] Unknown attribute (Trace).
This had come up on the development server. The warnings were just saying that the DSN, ReadOnly, ServerType, FetchBufferSize, TraceFile, and Trace options were unnecessary and being ignored. DECS was running just fine. Removing the offending lines from the odbc.ini and restarting DECS eliminated the warnings.
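I removed the lines by hand, but the edit can be scripted. The following is a minimal sketch that assumes each attribute sits on its own "Key = value" line; strip_ignored_odbc_attrs is a hypothetical helper, and the /etc/odbc.ini path in the usage comment is an assumption about where your file lives.

```shell
# Print an odbc.ini with the attributes MyODBC complained about removed.
# Assumes each attribute sits on its own "Key = value" line.
# Section headers such as [mydsn] are left alone because they start with "[".
strip_ignored_odbc_attrs() {
    grep -Ev '^[[:space:]]*(DSN|ReadOnly|ServerType|FetchBufferSize|TraceFile|Trace)[[:space:]]*=' "$1"
}

# Usage (path is an assumption; keep a backup and write from the copy):
# cp /etc/odbc.ini /etc/odbc.ini.bak
# strip_ignored_odbc_attrs /etc/odbc.ini.bak > /etc/odbc.ini
```

DECS still needs restarting afterwards for the warnings to go away.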
In conclusion
So there it is. Not counting the hour and three minutes the upgrade was running unattended, the whole procedure took less than ten minutes of my time, with just over one minute of downtime, to perform a major-release upgrade on a physical server half a continent away. What's not to like about that?