Installing Solaris 10 booted from a Linux server

There are a couple of issues I had to struggle with when I installed Solaris 10 on a SPARC host.

Every time I have installed Solaris in the past, I have popped the CD into another Solaris machine and used the Tools/add_install_client script to set up everything on the Solaris host for the install client. This time it was a Linux machine that I had handy to use as an install server.

I was quite familiar with the setup that is required on the install server from many previous Solaris installs.

Although I suppose it is possible that the add_install_client script would have just worked, I decided to set this up manually, since I already had some of it set up (like /etc/ethers) from previous attempts to boot things on this machine.
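As a rough illustration of two of the pieces mentioned in this post, here is a small Python sketch that formats an /etc/ethers entry and a bootparams entry with root and install directories. The hostnames, MAC address and paths are all made up for the example, and the helper names are hypothetical; the real entries depend on your own network and on where the CD is mounted.

```python
def ethers_line(mac: str, hostname: str) -> str:
    """One /etc/ethers entry: the RARP server looks up the MAC here to find the host."""
    return f"{mac.lower()} {hostname}"


def bootparams_line(hostname: str, server: str, root_path: str, install_path: str) -> str:
    """One bootparams entry giving the client its root and install directories."""
    return f"{hostname} root={server}:{root_path} install={server}:{install_path}"


# Hypothetical values, for illustration only.
print(ethers_line("08:00:20:AB:CD:EF", "sparcbox"))
print(bootparams_line("sparcbox", "linuxsrv", "/mnt/cdrom/Tools/Boot", "/mnt/cdrom"))
```

The root directory also has to be exported by NFS from the server named in the entry, otherwise the client has nothing to mount.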

Here are the problems I encountered and how I solved them.

inetboot cannot load the kernel

Packet sniffing shows the server reporting an NFS4ERR_PERM error. I couldn't figure this out; other clients could mount the exported CD by NFS with no problems and it was exported read-only to the world.

Older versions of inetboot (from Solaris 7 and 8) did not have this problem but the Solaris 10 kernel did not like those inetboots. I guess you have to use the right inetboot for the kernel version.

I ended up replacing Linux's kernel-based NFS server with the user-space one, and that problem went away.

bpgetfile failed

/sbin/install-discover runs bpgetfile to get the root directory and install directory from the bootparams server after the IP interface is up and running under the Solaris 10 kernel. bpgetfile fails.

The reason it was failing, sniffing quickly revealed, was that it was sending to the wrong broadcast address (the interface autoconfiguration took a guess concerning the correct netmask because RARP does not provide one). The bootparams server wasn't responding to that address. I wonder: does the bootparams server on Solaris not care about the correctness of the broadcast address?
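To see how a guessed netmask leads to the wrong broadcast address, here is a small Python sketch. It assumes the guess is the old classful default for the address, which I did not confirm, and the client address and real netmask below are hypothetical.

```python
import ipaddress


def classful_prefix(ip: str) -> int:
    """Prefix length implied by the old class A/B/C rules."""
    first_octet = int(ip.split(".")[0])
    if first_octet < 128:
        return 8
    if first_octet < 192:
        return 16
    if first_octet < 224:
        return 24
    raise ValueError("class D/E addresses have no default netmask")


def broadcast(ip: str, prefix: int) -> str:
    """Broadcast address of the network containing ip under the given prefix."""
    net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(net.broadcast_address)


client = "172.16.5.20"  # hypothetical client address
real_prefix = 24         # hypothetical real netmask, 255.255.255.0

print(broadcast(client, classful_prefix(client)))  # broadcast under the guessed netmask
print(broadcast(client, real_prefix))              # broadcast the LAN actually uses
```

With these example numbers the two addresses differ, so a server that only answers requests sent to the real broadcast address would never reply.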

/sbin/install-discover gives you a shell when this happens, but I found that fixing the netmask with ifconfig and rerunning /sbin/install-discover didn't work very well. Maybe /sbin/install-discover does things near the beginning of its invocation that you can't get away with repeating.

I solved this by copying the Tools/Boot directory off the CD to a hard disk and exporting the hard disk copy instead of the CD copy by NFS (being sure to update bootparams to point to the new path). I then edited the copy of /sbin/install-discover with a hardcoded ifconfig command to set the correct netmask and broadcast address just before the invocation of bpgetfile. This worked.
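I made the edit by hand, but the same change can be sketched in Python: insert one command just before the first line that invokes bpgetfile. The interface name, the addresses and the sample script text are all hypothetical; the sample is only a stand-in, not the real contents of /sbin/install-discover.

```python
def insert_before_bpgetfile(script: str, command: str) -> str:
    """Insert command before the first non-comment line that mentions bpgetfile."""
    lines = script.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if "bpgetfile" in stripped and not stripped.startswith("#"):
            # Reuse the indentation of the line we are inserting in front of.
            indent = line[: len(line) - len(line.lstrip())]
            lines.insert(i, indent + command)
            break
    else:
        raise ValueError("no bpgetfile invocation found")
    return "\n".join(lines) + "\n"


# Hypothetical interface name and addresses.
fix = "ifconfig hme0 netmask 255.255.255.0 broadcast 172.16.5.255"

# Invented stand-in for the script on the hard disk copy.
sample = "#!/sbin/sh\n# look up root and install paths\nset -- `bpgetfile`\n"
print(insert_before_bpgetfile(sample, fix))
```

Editing the hard disk copy this way only works because the client mounts that copy by NFS; the CD itself is read-only.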

I noticed other pieces of the install process sending packets to the wrong broadcast address at a couple of other points, but each time the request seemed to time out and the install continued without issue, so I did not bother with those.

Cannot talk outside of the local LAN

That one was easy. The client was apparently using the install server's IP address as a default gateway but that wasn't the correct default gateway. I turned on IP forwarding temporarily on the install server so that it would bounce the packets along to the correct default gateway.
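On Linux, IPv4 forwarding is controlled by the file /proc/sys/net/ipv4/ip_forward. The sketch below shows one way to flip it from Python; the function names are made up for this example, and writing to the real file needs root.

```python
from pathlib import Path

# Standard Linux control file for IPv4 forwarding.
IP_FORWARD = Path("/proc/sys/net/ipv4/ip_forward")


def forwarding_enabled(path: Path = IP_FORWARD) -> bool:
    """True if the control file currently contains 1."""
    return path.read_text().strip() == "1"


def set_forwarding(enabled: bool, path: Path = IP_FORWARD) -> None:
    """Write 1 or 0 to the control file (needs root for the real /proc file)."""
    path.write_text("1\n" if enabled else "0\n")
```

Call set_forwarding(True) before starting the install and set_forwarding(False) once it is done, so the install server does not stay a router longer than needed.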

Solaris 10 installer impressions

The Solaris installer has changed very little over the years, it seems. A few not particularly important things got better (the choices for name service are no longer limited to NIS+, yp and None: the most probable answer to this question, DNS, is now actually an option). The interface is as slow as ever (don't they know the console is usually 9600bps?). Two important things got much worse though, which I'd like to mention.

The way I think an OS install should work follows the motto "Reboot early, reboot once."

Solaris has never been good with the first part of that motto. It installs absolutely everything from the distribution medium before booting into the new system. This has not changed.

Some people think an OS should be able to be installed without rebooting at all, but I don't agree. Usually for an OS install there are one or more things that are different between the installer environment and the final runtime environment that can only be changed by rebooting. The one you want to use as soon as you can is, of course, the final runtime environment. For some installers and operating systems this difference is more pronounced than for others. Examples of things that can be different are: having your root filesystem mounted on remote or removable media (Linux can change this live with pivot_root, but this is rare functionality) and using a stripped down kernel in the interest of size and complexity that lacks features which are definitely not needed for installing (like SMP, audio, or debugging symbols). Besides, you want to test that the installed image is indeed capable of booting and starting up (on machines with multiple fixed disks, getting the firmware configured to boot the correct disk is not always trivial, for example). If you don't want to have to reboot at all, then Knoppix is for you!

On the other hand there is no reason to reboot more than once during an operating system installation, but now Solaris has violated this part of the motto too. Frustratingly, it seems (just from watching the console) that it does very little at all the first time booting into the newly installed system before declaring that it must be booted again. It's possible that this requirement was forced by the complexity of the new service management framework, but it's certainly disappointing.

The second thing that is worse than it used to be is that the installer no longer demands that the user set a root password before starting services for the first time. Instead the root account is created WITHOUT A PASSWORD. Sure, the first thing that any intelligent person will do right after logging in is to set one, but I certainly liked it better before.