
Moving a Paravirtualized EC2 legacy instance to a modern HVM one

I had to try a few things before I could get this right, so I thought I'd write about it. These steps are what ultimately worked for me. I had tried several other things without success, which I'll list at the end of the post.

If you have Elastic Compute Cloud (EC2) instances on the "previous generation" paravirtualization-based instance types and want to convert them to the newer, cheaper, faster "current generation" HVM instance types with SSD storage, this is what you have to do:

You'll need a donor Elastic Block Store (EBS) volume so you can copy data from it. Either shut down the old instance and detach its EBS volume, or, as I did, snapshot the old system and then create a new volume from the snapshot so that you can mess up without worrying about losing data. (I was also moving my instances to a cheaper data center, which I could only do by moving snapshots around.) If you choose to create a new volume, make a note of which Availability Zone (AZ) you create it in.

Create a new EC2 instance of the desired instance type, configured with a new EBS volume set up the way you want it. Use a base image that's as similar to what you currently have as possible. Make sure you're using the same base OS version and CPU type, and that your instance is in the same AZ as your donor EBS volume. I mounted the ephemeral storage too, as a way to quickly roll back if I messed up without having to recreate the instance from scratch.

Attach your donor EBS volume to your new instance as sdf/xvdf, and then mount it on a new directory I'll call /donor:
mkdir /donor && mount /dev/xvdf /donor
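
For reference, the donor volume preparation described so far could be scripted with the AWS CLI roughly as follows. This is a sketch under stated assumptions, not what I actually ran: every ID and the zone name are hypothetical placeholders, and in a real run the snapshot and volume IDs come from the output of the preceding commands. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: all IDs and the zone below are hypothetical placeholders.
OLD_VOLUME="vol-0123456789abcdef0"    # root volume of the old PV instance
SNAPSHOT="snap-0123456789abcdef0"     # ID reported by create-snapshot
DONOR_VOLUME="vol-0fedcba9876543210"  # ID reported by create-volume
NEW_INSTANCE="i-0123456789abcdef0"    # the new HVM instance
ZONE="us-west-2a"                     # must match the new instance's AZ

# Print the commands by default; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_donor() {
    # 1. Snapshot the old system so the original volume stays untouched.
    run aws ec2 create-snapshot --volume-id "$OLD_VOLUME" --description "PV donor"
    # 2. Wait until the snapshot is complete before using it.
    run aws ec2 wait snapshot-completed --snapshot-ids "$SNAPSHOT"
    # 3. Create the donor volume from the snapshot, in the new instance's AZ.
    run aws ec2 create-volume --snapshot-id "$SNAPSHOT" --availability-zone "$ZONE"
    # 4. Attach it to the new instance as sdf (seen as /dev/xvdf in the guest).
    run aws ec2 attach-volume --volume-id "$DONOR_VOLUME" --instance-id "$NEW_INSTANCE" --device /dev/sdf
}

prepare_donor
```

Leaving DRY_RUN at its default is a cheap way to review the plan before touching anything.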

Suggested: Mount your ephemeral storage on /mnt
mount /dev/xvdb /mnt
and rsync / to /mnt
rsync -aPx / /mnt/
If something goes wrong in the next few steps, you can reverse it by running
rsync -aPx --delete /mnt/ /
to revert to a known working state. The rsync options tell rsync to run in (a)rchive mode, which recurses into directories and preserves symlinks, permissions, ownership, and modification times; to show (P)rogress and keep partially transferred files; and to not e(x)tend beyond a single file system (this leaves /proc, /sys, and your scratch and donor volumes alone).

Copy your /donor volume data to / by running
rsync -aPx /donor/ / --exclude /boot --exclude /etc/grub.d ...
You can include other excludes (use paths to where they would be copied on the final volume, not the path in the donor system). The excluded paths above are for an Ubuntu system. You should replace /etc/grub.d with the path or paths where your distro keeps its bootloader configuration files. I found that copying /boot was insufficient because the files in /boot are merely linked to /etc/grub.d.
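
To see how the exclude paths behave before running this against a live root file system, here is a small throwaway experiment. It is only a sketch: the directory layout under the temporary directory is made up for illustration. It shows that an exclude with a leading slash is anchored at the top of the transfer (so /boot means the donor's top-level boot directory), and that a directory named boot elsewhere in the tree is still copied.

```shell
#!/bin/sh
# Build a tiny fake donor tree in a scratch directory (hypothetical layout).
work=$(mktemp -d)
mkdir -p "$work/donor/boot" "$work/donor/etc/grub.d" \
         "$work/donor/home/boot" "$work/root"
echo kernel > "$work/donor/boot/vmlinuz"
echo grub   > "$work/donor/etc/grub.d/00_header"
echo hosts  > "$work/donor/etc/hosts"
echo note   > "$work/donor/home/boot/note.txt"

# Same shape as the real command, minus -P to keep the output quiet.
rsync -ax --exclude /boot --exclude /etc/grub.d "$work/donor/" "$work/root/"

# List what actually arrived in the fake root.
find "$work/root" -type f | sort
```

Only etc/hosts and home/boot/note.txt should show up under the fake root. Adding -n (dry run) to the real command is another way to preview what would be copied.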

Now you should be able to reboot your instance into your new, upgraded system. Do so, detach the donor EBS volume, and if you used the ephemeral storage as a scratch copy, reset it as you prefer. Switch your Elastic IP, or change your DNS configuration, test your applications, and then clean up your old instance artifacts. Congratulations, you're done.

Be careful of slashes. The rsync command treats /donor/ differently from /donor.
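
A quick way to convince yourself of the slash behavior is to try it in a scratch directory. This is a minimal sketch with made-up directory names: with a trailing slash rsync copies the contents of the source directory, and without it rsync copies the directory itself into the destination.

```shell
#!/bin/sh
# Scratch area with a tiny stand-in for the donor volume.
work=$(mktemp -d)
mkdir -p "$work/donor/etc" "$work/with_slash" "$work/without_slash"
echo hosts > "$work/donor/etc/hosts"

# Trailing slash on the source: copy the *contents* of donor.
rsync -a "$work/donor/" "$work/with_slash/"

# No trailing slash: copy the donor directory itself into the destination.
rsync -a "$work/donor" "$work/without_slash/"

ls "$work/with_slash"      # prints: etc
ls "$work/without_slash"   # prints: donor
```

Copying /donor (no slash) to / would therefore just produce another /donor directory rather than overlaying the root file system.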

What failed:
Converting the EBS snapshot to an AMI and setting the AMI virtualization type as HVM, then launching a new instance with this AMI, failed to boot. (I've had trouble with this with PV instances too with the Ubuntu base image unless I specified a specific kernel, so I'm not sure whether to blame HVM or the Ubuntu base images.)
Connecting a copy of the PV EBS volume to a running HVM system and copying /boot to the donor, then replacing sda1 with the donor volume, also failed to boot, though I think if I'd copied /etc/grub.d too it might have worked. This might not get you an SSD-backed EBS volume though, if that's desirable.

Web Site Hosting Advice

Occasionally friends, relatives, and clients ask me what they should do about creating and hosting a web site. When this happens, I find myself repeating, well, myself; so I thought I would put my thoughts on virtual paper for future reference. I will post a notice on this entry if my recommendations change at some future date. If you would like to consult with me about your particular setup, please contact me for consulting rates and availability.

Ok, you want a web site, good. First, get an idea of what your website will contain, how big it will be, what kind of content you will serve, and how much traffic it will receive. Will it DO something or SHOW something? If you're just starting out, or have no idea, any of the recommended plans will let you scale size and traffic for additional monthly fees, so don't worry too much about it.

If your goal is an informational, mostly text, but low volume, web site, just get a blog hosting account. They are free, minimally annoying, and with free image galleries and video hosting sites, can link to or embed video and photo content too. My Ward (a congregation in the LDS church) has a few of these sites for various extra activities. For example, the youth group is presenting a "Fancy Dance" and Dessert Auction on Saturday Feb 19, 2011 to raise money for camp and activities this year, and uses BlogSpot to advertise. By the way, everyone is invited to the dance, and babysitting is provided; see the site for more information.

If your goal is to sell something, sell through the Amazon marketplace or, if the products are crafty, a marketplace for handmade goods. Piggyback on top of an existing marketplace to jump start sales. If you're too big for that, I don't really have any advice; I don't have any experience in that space. I think that I would look for a host that provided merchant services (credit card processing for example) as part of the package.

If your goal is to host a medium volume dynamic application, use WebFaction. WebFaction is probably the best Shared Hosting service there is. They're one of the very few hosting providers that embraces Python application hosting, and I've run Pylons, TurboGears and CherryPy applications there. The hosting is cheap, fast, and it stays out of your way if you want it to. I host this blog, my personal e-mail and my business website on the base level account. I also host demo sites for clients when needed. The email service isn't spectacular, but it's functional as long as you have client side spam filtering like what is provided by Thunderbird. I like it because there are no set CPU limitations, the memory allotment is generous (email, OS, and even Database memory usage doesn't count against your quota, though the disk usage does), and the base disk space/bandwidth allocation is substantial. It also helps that WebFaction takes care of all data backups and operating system and hardware maintenance for you. WebFaction has one click installers for a large number of applications, so you don't have to know very much about Linux to get started, but if you do know what you're doing, you have SSH access, and everything that comes standard with a Linux shell account.

If you are planning on building a new application, take a look at Google App Engine. It lets you get going and host up to a certain threshold for free. Scaling up can be done fairly reasonably. Applications developed for App Engine can be run independently of Google, so you are not necessarily locked to Google as your hosting vendor.

I do not recommend any kind of Virtual Private Server hosting that isn't bundled as a Cloud offering. I've used three different VPS services, and two have been slow and had high network latency (the third, Slice Host, was bought and extended into Rackspace's cloud services, which I recommend below). Higher volume sites may do OK, but if the CPU, IO or Memory usage is too high for too long, your VPS can be rebooted or shut off. What this translates to is that you would have to hit a very small sweet spot to get good performance out of a VPS without getting shut down. Better hosting options exist.

If you do need system level access to a server of your own for some reason -- if for example you have an email processing system as part of your application -- or if you have requirements that extend beyond a single host, like high availability, then using a Cloud-based VPS is desirable. Cloud computing nodes are designed for high performance application hosting. The overhead of virtualization is minimized by the use of advanced virtualization techniques (paravirtualization, CPU instruction sets, etc.) and by dedicating virtual resources to physical hardware. The management tools are typically excellent and, in the case of my two favorite cloud providers, there is the inherent benefit of a content delivery network (CDN) and a storage area network (SAN), which can serve as scalable long-term application storage or hold system backups. These two tools are used by very large websites to deliver content faster and more efficiently, and they're available on the Cloud for even the lowest rate plans. The intro level computing node at Amazon Elastic Compute Cloud (EC2) starts at 3¢/hour. Rackspace, however, has nodes that start as low as $10.95/month (that's about 1.5¢/hour). There aren't as many third party software developers, and no external image providers (as far as I know) for Rackspace, but they have pretty good management tools and a good selection of base images to get you up and running quickly.

EC2 was built for running short-lived computing (i.e., processor intensive) tasks, and its pricing model and instance sizes reflect that. The instances and costs are very competitive for people looking at dedicated hosting. Rackspace's cloud is similarly designed, but has smaller instances, so it is cheap enough to use as a substitute for VPS or even shared hosting.

A former coworker of mine recently signed up for EC2 to host his blog using a promotional deal offered by Amazon's EC2. This deal lets you use the Micro instance for up to 750 hours per month for a whole year. Thereafter he's looking at a starting monthly rate of $21.60 plus storage and bandwidth charges. Of course using a Cloud node to host a blog is seriously overkill (as evidenced by his load average) unless he is doing much more with his site than is visible at first glance. If he is uncomfortable with a free or even a paid blog hosting account, either WebFaction or Rackspace Cloud would be sufficient to host his site at about half the cost of EC2.
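
As a rough check on the prices quoted in this post, the hourly and monthly figures can be converted into each other. The sketch below assumes a 30-day month (720 billable hours); that assumption is mine, not something either provider states here. Under it, 3¢/hour works out to the $21.60/month figure, and $10.95/month to about 1.5¢/hour.

```shell
#!/bin/sh
# Assumption: a 30-day month, i.e. 720 billable hours.
HOURS_PER_MONTH=720

# EC2 at 3 cents/hour, expressed as dollars per month.
ec2_monthly=$(awk -v h="$HOURS_PER_MONTH" 'BEGIN { printf "%.2f", 0.03 * h }')

# Rackspace at $10.95/month, expressed as cents per hour.
rackspace_hourly=$(awk -v h="$HOURS_PER_MONTH" 'BEGIN { printf "%.1f", 10.95 / h * 100 }')

echo "EC2:       \$$ec2_monthly per month"          # EC2:       $21.60 per month
echo "Rackspace: $rackspace_hourly cents per hour"  # Rackspace: 1.5 cents per hour
```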

There is also dedicated hosting, but with the price point and performance of EC2 and Rackspace Cloud, you'd have to be very big indeed, or have special criteria not available for cloud nodes for the benefits to outweigh the costs.

Here's what I use for myself and my clients, and why I don't recommend VPS hosting:

As I mentioned above, I currently host my blog, email and business website on a WebFaction Shared Hosting plan. Shared Hosting starts at less than $10/month, with steep discounts for prepayment. I moved all the services off my VPS at Linode and shut it down since WebFaction was working so well. I found Linode to be sluggish and its network traffic to be high latency, but haven't felt that way about WebFaction.

With InMotionHosting's VPS offerings, performance was similar to or worse than Linode's. I had a client on the fully managed VPS plan costing $90/month. The VPS would bog down during traffic peaks and InMotion's system administrators would reboot the box (without any advance warning, without notice after the fact and without explanation of why). Even when things were peaceful, logging in over SSH could take 30-45 seconds, and page loads for the main site or core application could take several seconds in spite of caching and the pages being rather lightweight. InMotion always seemed to want to upsell to dedicated hosting when I mentioned the problems to their customer service representatives.

This site/application just passed through its busiest season on a Rackspace Cloud Server instance, and it never even hiccuped. Final cost for hosting for the month? $24, and plenty of room to scale up if volume increases. I recommended the Rackspace Cloud Server because the application has an email processing system and the client has clients that could have been squeamish if their customers' names and email addresses were available on a shared host's shared database server (even though the database itself was not shared and was password protected).

Web Browser Posers

Ok, I'm not a novice when it comes to developing websites: I've been building web pages for close on 15 years. But within the last week, I've come across two browser behaviors (or perhaps they're browser addon behaviors) that make me scratch my head.

First, a request coming from something sending the User-Agent "Mozilla/4.0" -- yes, that's all, no clarifiers or parentheticals -- is lopping off the GET parameters when a popup is launched through a button click via an onclick handler. This site states that this is a Yahoo! search something, but the links are not something that a Bot would come across. On the other hand, there is no referrer sent, which makes me think it could be some kind of link preloader or some other browser add-on. Also, I saw a very similar error today coming from Firefox 3.0, though I'm not sure it's related.

Second, and this is really baffling: sometimes I'm getting requests from a browser identifying itself as IE 6.x that has the entire URL made lowercase. I use nice RESTful URLs for my application, so when an identifier comes across as lowercase, it throws off the lookup. Of course my own copy of IE 6 doesn't exhibit the behavior. For this particular case, I'm using JavaScript to build a URL, and then sticking it as the src attribute of an embedded iframe that is also being created by JavaScript. I'm seeing other errors in my logs though of IE6 and IE7 browsers going to different links (links that would typically be clicked or pasted from an e-mail) that are all lower case as well. Again, not sure if that's related, or if people are just typing them in (lazily) or if it's a browser bug. The only thing I can seem to find about this is this forum (news?) post from 2005 with no replies.

Of course my Google searching is revealing nothing to help me keep my hair, so I turn to the Lazy Web. :-) Any ideas?

Sort Tabular Data

I hate the way that tabular sorting is typically done in web apps (make links on every column with sort_order="columnname" or similar). It is tedious to code, and requires a lot of bandwidth and round trips from the server, not to mention additional load on the database.

Well, today I Googled a bit, and found SortTable.js. Add the script, and add a "sortable" class to the table you wish to sort, and you're done. It automatically detects string, numeric and date columns and sorts them using a very quick (though non-stable) sort.

I only had a very small problem (some of the CSS styles in FF3 stopped working) with the mechanism used to set the table up to be sortable (window.onload replacement), so I switched it to use jQuery(document).ready, which fires once the DOM has been parsed. Works nicely.

Check out the documentation for additional features.

Icon View Style Grid Layout?

For a couple of projects now, I've wanted a grid layout engine that is similar to how desktops display lists of icons: nearly fixed width items, but varying slightly in height, displayed on a variable width page, so your layout could end up with 1 column or 8 depending on the width of the browser window. Tables are no good because they're always a fixed number of columns. Div elements using float work, so long as you make all the elements a fixed width, but they also have to be the same height, or you'll end up with gaps. I'm thinking it's going to have to be JavaScript driven, including redrawing when the page size changes and manually sizing all items to the tallest item in the row, but I can't seem to find an example on the web anywhere (or my Google-fu is weak).

Dear lazy web, can you point me in the right direction?

Slice Host and rBuilder Online Images

I host this blog, as well as e-mail and a host of other services, on an old PC behind my cable modem at home. This has served me well for the most part, but it requires onsite maintenance when it goes down. This is bad when I'm at work, or on vacation, as happened this week. So, I bit the bullet and researched some Virtual Private Server (VPS) hosting providers.

I ended up choosing Slice Host as a no-frills, just the tech if you please, Linux/Xen-based VPS host. Their entry level plan (slice) gives you 256 MB RAM, 10GB storage, and 100GB of bandwidth for $20/month, and you can scale it with a reboot up to 4GB/160GB/1600GB for $280. /proc/cpuinfo shows that the host for my entry level slice is a two way "Dual-Core AMD Opteron(tm) Processor 2212" operating at 2.0 GHz. There's a separate swap partition (so swap doesn't count against the 10GB limit), as well as web based management tools for rudimentary Name Services, starting, stopping and rebuilding your slice, a web console (in case ssh isn't working for some reason), some statistics and reporting, and my favorite part, a rescue mode.

Rescue mode lets you boot your slice in a rescue environment, mount your root file system in an alternate location, and do what you want (or need) with it. This makes it pretty easy to run your appliance from rBuilder Online on a hosted slice. Here are the steps to get this working.

  • Choose a Xen Appliance Image (32 or 64 bit, though 64 bit is preferred) that is a single file system image.

  • Create a slice (doesn't matter what kind, we're going to blow it away anyway).

  • Reboot your slice into rescue mode

  • SSH or console in using the password mailed to you (yes, rescue mode gets started with a randomized password)

  • wget -O - <link to the rBO image> | tar -xz # This downloads and extracts the filesystem image

  • dd if=<path to filesystem image file> of=/dev/sda1

  • e2fsck -f /dev/sda1 # This forces a file system check; without this check the next step will fail.

  • resize2fs /dev/sda1 # Resize the file system image to match the available size

  • mount /dev/sda1 /mnt

  • copy the following networking configuration files from your rescue image to your new slice image mounted in /mnt

    • /etc/sysconfig/network /etc/sysconfig/network-scripts/ifcfg-eth0 /etc/resolv.conf

  • Edit /etc/sysconfig/network to fix the hostname to the desired value

  • chroot /mnt

  • passwd # changes the root password since the rBO images ship with root's password blanked
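
The copy step in the list above has no command attached, so here is one way it might look. This is a minimal sketch, not the exact commands I typed: copy_net_config is a hypothetical helper name, and it needs to run from the rescue environment before the chroot, while /mnt is still visible as the new image's root.

```shell
#!/bin/sh
# copy_net_config SRC_ROOT DST_ROOT
# Copies the three networking files named in the steps above from one
# root tree to another, preserving mode and timestamps.
copy_net_config() {
    src_root=$1
    dst_root=$2
    for f in etc/sysconfig/network \
             etc/sysconfig/network-scripts/ifcfg-eth0 \
             etc/resolv.conf
    do
        # Make sure the target directory exists in the new image.
        mkdir -p "$dst_root/$(dirname "$f")"
        cp -p "$src_root/$f" "$dst_root/$f"
    done
}

# In rescue mode, with the new image mounted on /mnt, the call would be:
#   copy_net_config "" /mnt
```

Passing the roots as arguments makes it possible to try the function on a scratch directory before pointing it at a real slice.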

At this point you can do any additional configuration you wish, such as adding additional users, making sure that openssh-server is installed and configured to start on boot, etc.

When that's done, shut down, exit rescue mode from the Slice Host panel, and log in to your new appliance.

There is quite a bit of noise when the slice boots up with an rPath Linux based appliance because the kernel in the image isn't used for booting, and modules.dep isn't located for the booting kernel, but that seems harmless.

Now to build an appliance to run on the thing... I used the rPL 2 beta 3 text devel image as my image while developing this HOWTO.

Catching up

First of all it sucks to be sick. I don't know what I caught, nor where I caught it from, but now I'll be coughing for at least a week and my lungs are already sore from the last two days of hacking. I just hope that my wife and son don't catch it. Alex already has ear infections and a nasty cough coinciding with the arrival of four new teeth. At least his black eye is better.

A lot has happened since December, and I won't even begin to enumerate it all, just to tell you that I was working feverishly to get VMware and QEMU/RAW Hard Drive images working in rBuilder Online. With that task completed I am no longer working on rBuilder, instead I'm working with Ken VanDine and Xiaowen on something to be announced soon.

Yesterday I modified the PHPAppPackageRecipe class to make it more friendly to software upgrades. Previously the tarball containing the PHP application was extracted directly to /srv/www using whatever root folder was in the tarball. Then the recipe made a link connecting /srv/www/ to whatever folder was extracted. The problem with this arrangement is that if that directory name contained the version, as many PHP apps do, the next version would be installed to a different location, requiring a manual step to move settings and data to the new folder.

Now the recipe extracts the application as most other packages do to the conary build folder. Then I copy the contents to /srv/www/ and eliminate the link. Hopefully this will ease upgrade pain. I rebuilt Gallery bumping its version to the newly released 2.1 and Serendipity in the LAMP rBuilder project so if you're running either of those be aware that there will be some one-time transitional pain. Feel free to contact me for help in upgrading to these new packages.

PHP Application Packaging

I spent some time this weekend packaging Gallery2 and Serendipity for rPath based Linux Distributions. As part of that effort, I built a conary recipe superclass to make packaging other PHP applications easier.

This class currently resides in the LAMP repository in the phpapppackage:source trove. The superclass handles creation of the apache configuration file to drop in to /etc/httpd/conf.d, provides stubs for creating empty files and directories for use by the PHP application, and sets up the requirement on PHP.

So, a new recipe for a simple application like gallery looks like the one here.

Notice the calls to MakeWriteableDirs and CreateWriteable. Those create the empty config files and directories needed by the application. These are created with ownership "apache", the user that the stock httpd server runs as. Also notice the r.macros.dirconf macro. This data gets inserted into the Apache configuration file between <Directory foo> directives so that you can set application specific PHP configuration values, or even set overrides.

If you use phpapppackage.recipe as a root class to package some other php application, shoot me an e-mail to let me know. I'd love to hear about it.

AJAX and rBuilder

For the past week or so, I have worked on creating a new AJAX-y group trove builder for rBuilder Online. This widget will allow anyone who can shop online to create new purpose built Linux images. The images can then be downloaded, burned to a CD-R, and installed, all in a day.

The best part of the work I've been doing is that it elegantly falls back to POSW (Plain old static web) when javascript is disabled. I do this by rewriting the urls on the page to make the AJAX calls to our XMLRPC interface to rBuilder Online. If the urls don't get rewritten, they will make page load calls to perform the same action.

XMLRPC Interface to Mailman

I have created an XMLRPC interface to allow the remote control of many of the Mailman functions previously only available via the CGI or command line interfaces. This patch, submitted as SF Patch #1244799, is available for anyone who wishes to download it and apply it to a 2.1.6 Mailman tarball. This entry is a (shameless) attempt to publicize the patch and try to get it integrated into Mailman CVS.

If you run Mailman on an rPath Linux system, this patch will soon be available from the conary repository.