A quick guide to installing an Ubuntu VM.
Setting up a template VM and saving it away will be invaluable once you start trying to install various different Open Source Big Data systems. Mess it up, delete it, copy back the template and you’re back in business!!
I’ve chosen Ubuntu for no particular reason other than I like it and it supports everything I’ve tried to install so far. I’d be interested to know if there’s a “lighter” install that could be used instead. The GUI is nice but a basic terminal window is all you really need once the VM is set up.
There are 6 simple steps. The whole thing, including the downloads, should take about an hour – depending on your download speed of course…
Step 1 – Download the files you need
Step 2 – Install and start the VMWare Player
Step 3 – Create a new Virtual Machine with the Ubuntu .iso file
Step 4 – Log in and Update the OS
Step 5 – Update/Install Java
Step 6 – Save the VM as a template for re-use
Step 1 – Download the files you need
Go to www.ubuntu.com
Download the 32bit iso image. It’s about 680MB and took 15 minutes with my Sky Broadband connection.
Note: I have installed Hadoop on the 64bit version of Ubuntu and had no problems, but Ubuntu recommend the 32bit version on their site. As we are installing a VM which is going to have limited memory (unless you have a monster machine!), I can’t see the need for a 64bit OS.
The latest version at the time of writing is 11.10. I actually preferred the GUI on the previous version, but like I said above, most of the time I just use a terminal window anyway.
Go to www.vmware.com
Download VMWare Player. It’s about 117MB and a 5 minute download.
You will need to set up an account with your email address to be able to access the downloads. I don’t mind doing this as long as I don’t end up with a load of spam, and I’ve not heard from VMWare since registering.
Windows and Linux 32 and 64 bit versions are available, I’m running 64bit Windows 7 on my laptop.
NOTE: there’s an easy cop out here. Search for Cloudera on the VMWare site and download their preconfigured VM with Hadoop, Pig and Hive, plus training exercises. You’ll need an extra tool called the OVF converter, which changes their generic VM format to be compatible with the VMWare Player. Using this you can be up and running in an hour and a half with a fully functional Hadoop pseudo-cluster… nice! But you won’t learn how to configure anything, and I found that the software on the downloaded VM was out of date and not compatible with some of the latest software and drivers.
Step 2 – Install and start the VMWare Player
VMWare Player
Step 3 – Create a new Virtual Machine with the Ubuntu .iso file
In the first step of the wizard, select the downloaded iso file. This will boot the new VM straight into the install process.
Select downloaded .iso file
On the next step create the initial ID and Password
Create the initial user name and password
Name the Machine and set the Path where you want the VM to be saved – I usually change this so the “user” folder in Windows doesn’t end up as a dumping ground for everything.
Name the Machine and set the Path
On the next screen you get the option to bring up the advanced configuration menu. Increase the memory if you can spare it.
Configuration Screen, Increase Memory if you can spare it.
OK, now you can finish the wizard and set the VM to start up. It’ll go straight into the install routine. After about 20 minutes you get this screen…
Nearly done!
Resist the urge to log in with the username and password you created above and start playing around. Let it finish; in another 2 minutes or so the screen will clear and just show the Ubuntu Login: prompt.
NOTE: I’ve just been installing a later version of Ubuntu and it keeps hanging at this stage – basically the graphical environment never launches and the screen stays at that login prompt. It seems to be an issue with the Easy Install on VMWare. To get round it I picked the “I will install an operating system later” option, then in the edit settings menu set the VM CD drive to the Ubuntu iso image, hit play virtual machine and it all installed perfectly.
From the VMWare Player Virtual Machine menu, choose Send Ctrl+Alt+Del and your new VM will reboot to the finished GUI.
Restart the VM with the Ctrl+Alt+Del command
There you go, a nice shiny new Linux VM. Just a few updates and installs to go and we’ll be done.
New VM ready to login and go.
Step 4 – Log in and Update the OS
Log in and wait a minute or two for the system to check for available security updates. There was around 75MB of updates to download and install when I last did it, which took about 10 minutes.
Now if you don’t know Linux or Unix commands then you should spend some time familiarising yourself with the basics of how to move around, and how to copy and create files and directories. Also check out how permissions work. This site has some nice tutorials on it: http://beginnerlinuxtutorial.com/ These are some common commands you’ll need, and don’t forget Linux is CaSe SeNsItIvE!!!!
ls, cd, cp, mv, rm, mkdir, chmod, cat, ps, export, tar
From the Ubuntu Dash Home Icon, search for Terminal, drag and drop the icon onto the side menu bar. We’ll be using the terminal window from now on… (only need to do this for version 11.10, earlier versions have a different interface – I always drag a terminal icon onto the desktop)
Find the Terminal App and drag it on to the sidebar
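Once the terminal is open, here’s a rough practice run of most of the common commands listed above (everything except ps), done in a throwaway directory so you can’t break anything. It’s only a sketch: the names sandbox, notes.txt, Notes.txt and GREETING are made up for illustration, so change them to whatever you like.

```shell
#!/bin/sh
# Practice run of the basic commands in a scratch directory.
# All names below (sandbox, notes.txt, GREETING) are made up for illustration.

WORK=$(mktemp -d)              # create a throwaway directory to play in
cd "$WORK"                     # cd: move into it

mkdir sandbox                  # mkdir: create a directory
echo "hello" > notes.txt       # create a small file to work with
cp notes.txt sandbox/          # cp: copy the file into the directory
mv notes.txt Notes.txt         # mv: rename it (notes.txt and Notes.txt are different names)
chmod 600 Notes.txt            # chmod: only the owner can read and write the file
ls -l                          # ls: list what we have, including permissions
cat sandbox/notes.txt          # cat: print the contents of a file

export GREETING="hi"           # export: set a variable that child processes can see
sh -c 'echo "$GREETING"'       # a child shell can read the exported variable

tar -cf sandbox.tar sandbox    # tar: bundle the directory into an archive
rm -r sandbox                  # rm: delete the directory...
tar -xf sandbox.tar            # ...and get it back from the archive
```

Run it a line at a time rather than all at once and do an ls after each step, so you can see what each command changed.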
If you find that when you’re typing, the keyboard isn’t right, it’s probably US format – try shift+2, if you get the @ rather than a ” you’ve got the keyboard set to US…
Go to the system setting icon on the sidebar and choose keyboard layout – add the English(UK) option and remove the English(US)
Change the Keyboard layout to UK
Obviously only do this if you have an English keyboard!!
Step 5 – Update/Install Java
We need to make sure we have a recent version of Java and that our paths are correct, so do this now and any copies of this VM will already be sorted. Ubuntu has a really nice system for installing software which uses online repositories and the “apt-get” command.
Note: you’ll see a lot of commands preceded by “sudo”. This means we are going to run the command as the superuser (or root) with full admin privileges. Remember this is not Windows, where the default position is you can do what you like; in Linux the default position is you’re a user and can’t muck about with the system.
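To see the difference for yourself, here’s a small sketch that reports whether the current shell is an ordinary user or root. The function name describe_user is my own invention for the example; id -u is a standard command that prints your numeric user ID, and root is always 0.

```shell
#!/bin/sh
# Report whether a given numeric user ID is root or an ordinary user.
# describe_user is a made-up helper name for this example.

describe_user() {
    # $1 is a numeric user ID; root is always 0
    if [ "$1" -eq 0 ]; then
        echo "root (full admin privileges)"
    else
        echo "ordinary user (needs sudo for system changes)"
    fi
}

# id -u prints the numeric ID of whoever is running this shell
describe_user "$(id -u)"

# With sudo in front, the same command runs as root instead, e.g.:
#   sudo id -u
# which prints 0 once you have typed your password.
```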
Open a Terminal window and follow the commands below. Everything after the $ sign should be typed (the $ is the prompt), and remember it’s case sensitive.
Type Commands in Terminal Window
Add a repository
$ sudo add-apt-repository "deb http://archive.canonical.com/ maverick partner"
Update the repository
$ sudo apt-get update
Install Java
$ sudo apt-get install sun-java6-jdk
Note: if you have to select OK and Yes to accept conditions then the tab key activates the button and Enter pushes it.
NOTE: 05/06/2012 update
I’ve just rerun this using a later version of Ubuntu (precise) and couldn’t get sun-java6-jdk installed using the above method. In the end I found this site http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ (very good BTW), which has a link to a different repository: sudo add-apt-repository ppa:ferramroberto/java
Run
$ sudo add-apt-repository "deb http://ppa.launchpad.net/ferramroberto/java/ubuntu natty main"
and follow with
$ sudo apt-get update
$ sudo apt-get install sun-java6-jdk
This should work for the newer versions of Ubuntu.
Make this version of Java the default
$ sudo update-java-alternatives -s java-6-sun
You may get some errors if you only have one version of Java, but if running java -version shows 1.6, it’s done.
$ java -version
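If you want something you can rerun on each copy of the VM, here’s a minimal sketch that checks whether a java command is on the PATH and prints the first line of its version output. The function name check_java is made up for this example. One thing that catches people out is that java -version writes to stderr rather than stdout, which is why the 2>&1 is there.

```shell
#!/bin/sh
# Check that a java command is on the PATH and report its version.
# check_java is a made-up helper name for this example.

check_java() {
    if command -v java >/dev/null 2>&1; then
        echo "java found at: $(command -v java)"
        # java -version writes to stderr, so redirect it to stdout first
        java -version 2>&1 | head -n 1
    else
        echo "java not found on PATH"
    fi
}

check_java
```

If a later install asks for JAVA_HOME, something like export JAVA_HOME=/usr/lib/jvm/java-6-sun is the likely value, but that path is an assumption on my part, so check what is actually there with ls /usr/lib/jvm first.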
Step 6 – Save the VM as a template for re-use
Go to the directory where the VM files are and copy the lot into a backup directory
Copy the VM Image files to a template/backup directory
Whenever you want a fresh vanilla install just copy the backup files in to a new directory and from the VMWare Player choose Open a Virtual Machine, then navigate to the directory you created and open the .vmx file
Create a new VM by opening the files from the new directory you created
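If your host machine is Linux (or you have a Unix-style shell on Windows), the copy can be scripted. This is only a sketch: the function name clone_vm and both directory paths are hypothetical, so point them at wherever your VM files really live, and make sure the VM is powered off before copying.

```shell
#!/bin/sh
# Clone a VM template directory into a fresh working copy.
# Usage: clone_vm TEMPLATE_DIR NEW_DIR
# clone_vm and the example paths are hypothetical names for this sketch.

clone_vm() {
    template=$1
    target=$2
    if [ ! -d "$template" ]; then
        echo "template directory not found: $template" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing: $target" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")"   # make sure the parent directory exists
    cp -R "$template" "$target"       # copy the whole directory, disks and all
    ls "$target"/*.vmx                # show the .vmx file to open in VMWare Player
}

# Example (hypothetical paths):
#   clone_vm "$HOME/vm-templates/ubuntu-base" "$HOME/vms/hadoop-test"
```

The function refuses to copy over an existing directory, so you can’t accidentally flatten a VM you’ve been working in. When you open the copy, VMWare Player will usually ask whether you moved or copied the VM; pick the copied option.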
Done!
If you’ve followed this, then please comment on what worked, what didn’t, too much info, not enough info, etc. I want to get the instructions perfect.