Friday, February 13, 2009

Installing OpenSolaris on a server with a HP Smart Array Controller

I'm quite new to OpenSolaris; insofar as it's like Linux, I'm pretty comfortable, but there are enough little gotchas to make the learning curve steeper than one might like. Take, for instance, the fact that OpenSolaris doesn't ship with drivers for the almost-ubiquitous HP Smart Array controller. That makes for an installation hiccup that, as a quick Google search shows, has stymied a lot of folks trying to install the OS. Like most things, the solution is easy once you know how to do it. These instructions were originally written for OpenSolaris 2008.11, but they've since been updated to work with 2009.06.

Boot to the Install CD

The first task is to boot to the OpenSolaris Live CD, which is also the install CD. If you haven't downloaded it yet, you can get it at http://opensolaris.org

Get the HP Smart Array Driver

Download the CPQary3 driver, which includes Solaris drivers for most of the recent (and not so recent) Smart Array controllers. As of this writing, the latest version of these drivers is v2.2.0. 
Update: I've tried a few times to get the v2.0.0 drivers to work on the latest-generation HP servers, and I've had no luck. The v2.2.0 drivers are out, and they support the latest that HP has to offer. I've also confirmed that the 2.1.0 drivers work on a BL460 G6.
So stay away from the 2.0.0 drivers with new HP servers: they can cause some serious heartburn. The HP CPQary drivers can be found here. The /tmp filesystem has a lot of space on the live CD, so it's probably best to save the file there.
I saved off the 1.9.2 drivers here, if that's helpful. As always, it's a better idea to go to the source (HP in this case) for files than to a second-hand location, but if you can't locate them at HP's site, they'll remain available here.

Install the Driver

Now that we've got the drivers, we'll unpack them and go about installing them:
jack@opensolaris:/tmp$ tar -zxf *.gz
jack@opensolaris:/tmp$ ls
CPQary3-2.0.0-solaris10-i386
dbus-D43WuQsGnK
iconf_entries.363
CPQary3-2.0.0-solaris10-i386.tar.gz
dbus-EmWPHCv5Ec
ogl_select471
Just for simplicity's sake, I renamed the directory, so that it was a bit less unwieldy.
jack@opensolaris:/tmp$ mv CPQary3-2.0.0-solaris10-i386 cpqary
jack@opensolaris:/tmp$ cd cpqary
jack@opensolaris:/tmp/cpqary$ ls
CPQary3.144
CPQary3.pkg
LICENSE.CPQary3
RELEASENOTES.CPQary3
CPQary3.iso
DU
README.CPQary3
tools
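The unpack-and-rename steps above can be wrapped in a small shell function. This is only a sketch: the function name unpack_driver is made up for illustration, and it assumes the archive unpacks into a directory named after the tarball (minus ".tar.gz"), as it does in the listing above.

```shell
# Hypothetical helper: unpack a CPQary3 tarball and give the directory a short name.
# Assumes the archive unpacks into a directory named after the tarball
# (minus ".tar.gz"), as in the listing above.
unpack_driver() {
    tarball=$1    # e.g. /tmp/CPQary3-2.0.0-solaris10-i386.tar.gz
    target=$2     # e.g. /tmp/cpqary (should not exist yet)
    workdir=$(dirname "$tarball")
    base=$(basename "$tarball" .tar.gz)

    # Decompress and extract next to the tarball
    gzip -dc "$tarball" | ( cd "$workdir" && tar -xf - ) || return 1

    # Rename the unwieldy directory
    mv "$workdir/$base" "$target"
}
```

For example, unpack_driver /tmp/CPQary3-2.0.0-solaris10-i386.tar.gz /tmp/cpqary should leave the package files in /tmp/cpqary. Naming the tarball explicitly also avoids surprises from the *.gz glob if more than one archive is sitting in /tmp.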
Note that the OpenSolaris Live CD logs in as the user 'jack', which doesn't have much in the way of privileges. Instead of sudo, use pfexec to run commands with elevated privileges.
Now here is where some persistent Googling paid off. There's a bug report at the OpenSolaris site (bug #5860) where a developer suggests a step (creating an empty file in the root dir) that makes things all OK.

Note that this step continues to be necessary with OpenSolaris 2009.06 and the v2.1.0 CPQary3 drivers. So here are the rest of the steps: 

Create a file on root:
jack@opensolaris:/tmp/cpqary$ pfexec touch /ADD_DRV_IGNORE_ROOT_BASEDIR

Once we've done that, we can install the driver we downloaded:
jack@opensolaris:/tmp/cpqary$ pfexec pkgadd -d ./CPQary3.pkg 
The following packages are available:
1 CPQary3 HP Smart Array Controller Driver (i386)
2.0.0,Rev=2008.12.05.01.09
 Select package(s) you wish to process (or 'all' to process all packages).
(default: all) [?,??,q]:

Processing package instance from HP Smart Array Controller Driver(i386) 2.0.0,Rev=2008.12.05.01.09

Copyright 2008 Hewlett-Packard Development Company, L.P.
## Executing checkinstall script. Using as the package base directory.
## Processing package information.
## Processing system information. 11 package pathnames are already properly installed.
## Verifying package dependencies.
## Verifying disk space requirements. WARNING: The /usr filesystem has 0 free blocks.
The current installation requires 158 blocks, which includes a required 150 block buffer for open deleted files.
158 more blocks are needed.
WARNING: The /usr filesystem has 0 free file nodes.
The current installation requires 26 file nodes, which includes a required 25 file node buffer for temporary files.
26 more file nodes are needed.
 Do you want to continue with the installation of [y,n,?] y 
## Checking for conflicts with packages already installed. 
## Checking for setuid/setgid programs. 
This package contains scripts which will be executed with super-user permission during the process of installing this package. 
 Do you want to continue with the installation of [y,n,?] y 
 Installing HP Smart Array Controller Driver as 
## Installing part 1 of 1. 
/kernel/drv/amd64/cpqary3 
/kernel/drv/cpqary3 
/kernel/drv/cpqary3.conf 
/usr/share/man/man7d/cpqary3.7d 
ERROR: attribute verification of failed pathname does not exist [ verifying class
ERROR: attribute verification of failed pathname does not exist [ verifying class ] [ verifying class
## Executing postinstall script. 
Installation of partially failed.
Note a couple of things: the defaults are sufficient, and there are errors in the install. Happily, the errors are in copying the man pages, which we'll not need, at least for now (the /usr filesystem is read-only in the live CD). The good news is that the driver now is installed for the Smart Array controller.
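If you do this on more than one server, the two commands above can be collected into a short script. What follows is a sketch rather than a tested installer: the function names and the DRY_RUN switch are my own inventions, and the only real commands in it are the touch and pkgadd invocations already shown in the transcript.

```shell
# Set DRY_RUN=1 to print the commands instead of running them (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the two install steps from this post.
install_cpqary3() {
    pkgdir=${1:-/tmp/cpqary}    # directory holding CPQary3.pkg

    # Empty marker file in the root directory, as suggested in the bug report
    run pfexec touch /ADD_DRV_IGNORE_ROOT_BASEDIR || return 1

    # Same pkgadd invocation as in the transcript; it still prompts interactively
    run pfexec pkgadd -d "$pkgdir/CPQary3.pkg"
}
```

Running DRY_RUN=1 install_cpqary3 /tmp/cpqary first prints the two commands, so you can check them before anything is changed. On the live CD, expect pkgadd to report the same man-page errors as above.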

Install OpenSolaris

Double-click on the "install OpenSolaris" icon, and you now should be able to see your drives for installation.

Troubleshooting

Reboot loop

If, after installing OpenSolaris as above, you find yourself in a reboot loop, the best first step in troubleshooting is to set a boot option so that OpenSolaris displays text as it boots, rather than a graphical progress screen. To enable this, type the letter e when presented with the GRUB boot loader menu. This will allow you to edit the boot options. You'll see a list of the steps that are used in booting OpenSolaris.
First, we want to get rid of the splash screen. Highlight the line that references "splashimage" and press the letter d. That deletes the line, so the image won't cover up the debugging text. Now select the kernel line (it begins with 'kernel$') and type the letter e again.
If you're unfamiliar with GRUB, you're getting a glimpse into how it works: basically, it's a series of commands that sets up the system and then passes control over to the operating system. It's a great system for dual booting (Windows, unbeknownst to most, uses something similar, if more mysterious). So now we've got a line that, by default, looks more or less like this:
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
We want to add verbosity and debugging to the boot options, so add -k -v to that line and remove the graphical console, such that it reads like this:
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k -v
Hit ENTER, and you'll be returned to the GRUB menu, where you can hit the letter b to boot the system. Now you can see the error that is causing the system not to boot (or to reboot). In many cases, you'll see that the error is "cannot mount root" because there's something wrong with the SCSI controller driver. In this case, try the installation again, using an older version of the CPQary3 drivers.
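The two GRUB edits described above (dropping the splashimage line, and swapping the graphical console for -k -v on the kernel$ line) can also be written as a text filter, which makes it easy to see exactly what changes. This is a sketch: the function name is hypothetical, and it only transforms text given on standard input. It doesn't touch any GRUB file on its own.

```shell
# Hypothetical filter: read GRUB entry lines on stdin, write the "verbose" variant.
verbose_boot_entry() {
    # 1. delete any splashimage line
    # 2. on kernel$ lines, drop the graphical console setting
    # 3. on kernel$ lines, append the -k -v options
    sed -e '/^[[:space:]]*splashimage/d' \
        -e '/^[[:space:]]*kernel\$/s/,console=graphics//' \
        -e '/^[[:space:]]*kernel\$/s/$/ -k -v/'
}
```

Feeding it the default kernel$ line shown above should produce the edited line shown above. Keep the single quotes: without them the shell would try to expand $ISADIR and $ZFS-BOOTFS itself.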

Enabling Event Logging

HP, for some reason, turned off storage controller event logging in the latest version of the CPQary3 driver. You can read the full HP Advisory here, along with instructions on enabling the logging.

47 comments:

  1. Thank you very much!
    Good article!

  2. hi,
    thx for the guide ;)..I'm trying to download the driver for the smart array but I think hp ftp has some troubles...do you still have the driver?? maybe you could send me...I'd really appreciate...

  3. I've updated the post with the new link for the file on HP's support site, and I've also added an alternative location, should the HP site not work for you. -Lane

  4. Worked !
    Installed on HP DL 360 G3 and Opensolaris and followed this post.

    Thanks alot !

  5. Thank you so much for this. This is a lot easier than the method I'd been using.

    It appears to be working on a DL360 G5, even as I comment.

  6. didn't work for me :(

    trying on a Compaq DL360
    with OPenSolaris 2008.11

  7. That's a bummer; happy to help, if I can. Let me know what's happening, if you come back this way again.

  8. The installation completed successfully, but upon rebooting, the machine fails to boot up completely and instead reboots continuously. Any ideas please?

  9. What do you see on the screen before it reboots again (in the reboot cycle)? Would you be installing this on a drive with existing data?
    If this system had a bootable partition on it before, that partition may not have gotten overwritten, and it'd still be trying to boot (but, of course, wouldn't find its OS to which to boot). That's a wild guess, but we've seen that before.
    If you're not trying to save data on the drives, I'd run HP's drive array utility and destroy the existing logical drives. Then I'd re-create them. You could be sure everything is gone, having done that.

  10. Does the HP Smart Array Controller allow you to put the disks in plain JBOD mode, to allow ZFS to "own" your storage system? Or do you have to do some trick like exporting single-drive RAID0 volumes?

  11. Lane, there was no OS on the machine prior to my installation. Upon booting up, it displays the OpenSolaris boot options with GRUB. After the selection, it shows the booting splash screen with progress bar for a second or two, then it reboots the machine. I am using a DL360 G5, in case that helps. As per your suggestion, I will be recreating the logical drive with HP's SmartArray utility and then I can give it another try. Will let you know how it goes.

  12. I just went through the entire process twice, and all went well. I guess I must have missed out on something the first time. Thanks a lot!

  13. Robert, I'm glad it's working! Sounds like the gremlin decided to haunt someone else's system. :)

    Eric, I don't believe the smart array controller will present the disks as JBOD. I've never seen that capability, and I've heard others complain about that lack, as well.

  14. Hey Lane

    It seems Robert's gremlins have come to haunt my system - I'm using a brand new HP DL380, your guide worked perfectly and I am able to install the OS, but it reboots a second or two after an entry is selected from the GRUB menu. It seems the kernel is loading but panicking for some reason. I've not worked out what is wrong yet but will let you know what I find.

  15. Andrew, I'm glad it (sortof!) worked. I (and others, I'm sure) would love to know what you find.
    If you _and_ Robert have experienced this, there must be others fighting it, as well.

  16. Worked on a ML530! You saved this machine from the garbage heap.
    Thank you for this!

  17. That's good to hear! I think that OpenSolaris is a great option for setting up a shared storage environment on older equipment (Whew: an ML530? Those are beasts!).

    We've been experimenting with using OpenSolaris as an OS-agnostic enterprise-wide shared storage system. I think it'll be great, should I have the time to devote to it. I'll post our experience here, should it ever happen. :)

  18. On a DL380 G3 with a 5i and 6402 controller installed, I tried the procedure listed with several versions of the driver. The result is the same for every version: When I get to the end of the install script after doing the pkgadd, the script says ##Executing postinstall script. and then the server reboots.

    I have a couple of MSA20s that I would like to use on this server so I wish I could get it to work. Similar things happen when I try to use official versions of Solaris.

    BTW, thanks for this tip.

  19. Scott, when it reboots, what happens then? Does it go into a reboot loop?

  20. Thanks for this write-up. Very much appreciated.

    Has saved me the equivalent time 'googling'.... :)

  21. I confirm this guide works with OpenSolaris 2009.06. I don't want to use Solaris 10 for this. I like OpenSolaris.

    I have a HP Smart Array E200i SAS Controller
    I installed the CPQary3-2.0.0-solaris10-i386.tar.gz driver.

    Thank you very much.

    Regards.

  22. This worked with CPQary v2.0.0 on a DL380G5 with Smart Array P400 but I first tried it (on 2 different machines) using CPQary v2.1.0 and in each case after installation it got stuck in the reboot loop described by others.

  23. pmd, that's really good info; thanks for taking the time to post that!
    I'm about to start a new OpenSolaris project for a test SQL Server cluster, and this is great to know.

  24. I am trying to do a similar thing to the posters on this thread. I configured the HP blade to boot using the serial console and the reboot loop is because of the following error:

    NOTICE: error reading device label
    NOTICE:
    * This device is not bootable
    * It is either offlined or detached or faulted
    * Please try to boot from a different device.
    NOTICE: spa_import_rootpool: error 19
    Cannot mount root on /blah blag

    panic[cpu0] etc

    (I had to retype this hence omitted some of the ascii art).

    I verified that there does appear to be the correct partition and labelled the disk etc. I think that the latest opensolaris and my blade/smartarray just do not work together.

    The way to get the serial console is to use console=ttya in the grub params to kernel btw. It's more reassuring to get error messages than a graphical progress bar and then a reboot.

    Chris Morgan
    cm@miihalis.net

  25. I posted my own email address wrong, it's cm@mihalis.net - just in case anyone needs to get in touch to share experience on this. I love Solaris, and I've got a nifty HP C3000 blade chassis, just can't get the two to play nice together right now.

  26. Chris, that's great information; thanks for sharing it here! Have you tried an earlier version of the driver? I'm curious if that makes a difference; it sounds like it did with pmd above.

    I'm guessing these are new BL460 G6 or BL480 G6 servers; is that right? I might be able to dig out the same config to see if we can put our heads together on getting it to work.

  27. I have some good news to report. I found an ISO of 2008.11 patch with the CPQary driver, and when I installed this it worked perfectly, including being able to install graphically, boot graphically etc etc.

    The image I used is this:
    http://www.szymonbanka.com/osol-0811-101b-cpq_2.0.0.iso

    From a thread on opensolaris.org here :
    http://www.opensolaris.org/jive/thread.jspa?messageID=318996&tstart=0#318996

    For my purposes, this is good enough to proceed with more exploration of OpenSolaris, so this was a big help. Of course I would like to see an ISO of the latest OpenSolaris with this driver installed out of the box, but can't have everything!

    Chris Morgan
    cm@mihalis.net

  28. Chris, that's really helpful information; I've updated the post with some troubleshooting suggestions. I've also confirmed that the v2.0.0 CPQary3 drivers don't work on our BL460 systems.
    I *can* say that the v1.9.2 drivers work well on that system. I'm hoping to try out the 2.1.0 drivers directly, and I'll update the post with what I find. My hope is that, since HP reports that support for these latest servers is included in that release, it'll work well.

    My hunch is that the difference with that ISO you used is the earlier (1.9.1) CPQAry driver that is installed on it. Awfully nice to have that as a part of the installation media, no?

    Thanks again for posting your experiences here!

  29. Thanks a lot! This saved me a lot of Work.
    I've created a bootable USB stick as described in this article http://blogs.sun.com/clayb/entry/creating_opensolaris_usb_sticks_is . After copying the image to the USB stick, I mounted the stick and put some different driver versions on it. For my HP DL785 G5, driver version 2.2.0 works fine. Thanks again for this article!

  30. Another reader directed me here from the opensolaris forum and thanks to you, I could install OpenSolaris 09/06 on a DL360 G5 quite easily.

    Thanks!!

  31. Hi, great post, have had it working on DL380 G5 for a while, but now I am struggling with a G6.

    Has anyone managed to get this working on a DL380 G6, with the 410i controller?

  32. Hi again, well managed it ourselves. What we _think_ solved it was upgrading the SmartArray controller to the 2.0 firmware.

  33. Ah, that's very helpful! Thanks for posting your findings!

  34. Tried the reference ISO image and it says it finds the 5i controller. However, I get pci0,0, ide0 etc. etc. errors that bring me to an unusable cmd line shell. HP DL360 G3. But thanks everyone.

  35. Hank, it sounds like upgrading the firmware might help; have you tried that? Additionally, there are some differences between the cpqary3 versions. Do you have similar results when you try an earlier version of the driver?

  36. Hank: I had the exact same problems with OpenSolaris 2009.06 on a DL380 G3. The problem has nothing to do with the storage array (5i in my case) but rather some bad interaction between that particular Solaris kernel and the IDE subsystem. Personally, I got around it by doing a network installation. You can read more about that here:
    http://dlc.sun.com/osol/docs/content/dev/AIinstall/

    In this case, it was actually easier than a CD install because I did not have to modify or burn an ISO. I just altered the x86.microroot (served now via TFTP) to contain the CPQary3 driver and, with some patience due to the problems with the IDE chipset, it installed successfully.

  37. Hank: As luck would have it, I just came across this forum link that describes the same problem we are seeing on the CDROM drives:

    http://opensolaris.org/jive/thread.jspa?messageID=236308

    The suggestion is to disable the DMA on the CDROM drive by adding the grub option:
    -B atapi-cd-dma-enabled=0

    I have tested this myself on my DL380 G3 and it does seem to have worked around the IDE controller timeout messages.

  38. Ben, thanks for the heads-up; that's very useful!

  39. Hi,

    Thank you very much for this article. It helped me very much in setting up opensolaris 0906 on a HP ML110 with a SmartArray E200.

    Regards

    Richard

  40. Lane,

    Thank you for the article, it's much appreciated.

    I am having a slight problem with the install. I have a Proliant 380 G4 and after installing the driver, the install seems to be stuck at discovering the disks. If I drop to a prompt, I can run format and see my drive there.

    Any help?

  41. Hi Lane,
    Thanks for this info, really helps.
    Do you know if there's a way to increase the number of supported disks per controller? The HP driver supports up to 16 disks but I have a MSA70 with 25 disks (yes, I would like to create a zpool using the raw disks).

    d

  42. Thanks for the helpful info.

    I installed OS Build 134 on a 2.5 y/o MBP, and it got stuck in a reboot loop.

    When I followed these instructions, to get rid of the splash screen and the graphical console, and enable verbose messages, the reboot loop vanished.

    Happy as a clam.

    Sunny Guy

  43. The HP smart array in my ML530 didnt work under any distribution of Linux other than RedHat out of the box. Not even the brand-new ubuntu server, yet it ran on 10-year old red hat enterprise 2.x ... strange.

  44. Thanks a ton. For 6 hours I was fighting with the 2.0.0 drivers. After reading your article, I downloaded the latest 2.5.3 and I could progress (at least see the disks).

    Thanks again.

    Replies
    1. PC, I'm glad it was helpful! It remains shocking to me that Oracle-nee-Sun isn't including such a common and basic driver in the stock installation media.

