It's been a few months since I posted the idea of building a custom-made storage system on my blog. During this time we convinced people to give our idea a try, made some minor changes to the box layout, and ordered the parts. They finally arrived on 27 December 2009.
Building the box
We had a budget of approximately 12,000 euros for building the prototype of the storage box. We decided to build the prototype with 40 disks from the start and to fit a SAN HBA in it to try COMSTAR (this lets us export the storage via SAN to other servers). As the operating system we chose OpenSolaris. Two disks are dedicated to the operating system and are attached directly to the mainboard. The remaining 38 disks are attached to either an Adaptec 52445 or an Adaptec 51645 controller.
Below are some pictures of the components and the final assembled box. Click on the image for a larger version:
After putting everything in place we started to install OpenSolaris for the first time. But wait – we had forgotten a DVD drive. Fortunately, USB-attached DVD drives are widely available, so we attached one and started the installation.
This time we were able to boot from DVD, but the installation did not start. A short glance at the mainboard manufacturer's site yielded:
Fixed an issue where the system could not boot from a DVD ROM when an Adaptec Raid Card (ASR-5805) was installed on any PCI-e Slot
Yup, we had a similar controller installed. So – instead of installing the operating system right away and playing around a little – we patched all components to the most recent versions (BIOS, HBA firmware, and so on).
After doing so the installation went fine and the system booted OpenSolaris 2009.06 for the first time.
OpenSolaris 2009.06 is quite old, and we needed the latest ZFS and COMSTAR features, so we upgraded our OpenSolaris. There are basically two versions available:
- A “release” version (as of December 2009: build 111) of OpenSolaris (Repository Link)
- A “development” version (as of December 2009: build 130) of OpenSolaris (Repository Link)
The upgrade process is pretty straightforward:
Upgrading to the most recent “release” version is done by entering:
pfexec pkg image-update
Upgrading to the most recent “development” build is done by:
pfexec pkg set-publisher -O http://pkg.opensolaris.org/dev opensolaris.org
pfexec pkg image-update
We first patched to the most recent “release” build and afterwards to the most recent “development” build. Aside from the painfully slow fetching of the packages (*really* slow… it ran for several hours for a few hundred MB), everything worked fine and we ended up with two bootable configurations:
- Release – Build 111
- Development – Build 130
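Both builds show up as separate boot environments, so switching back and forth is easy. A minimal sketch using the standard `beadm` tool (the boot environment name below is an assumption – take the real names from the list output):

```shell
# List all boot environments; the active one is flagged in the
# "Active" column ("N" = active now, "R" = active on reboot).
beadm list

# If the development build misbehaves, fall back to the release build
# (the name "opensolaris" is an assumption -- use a name from `beadm list`):
pfexec beadm activate opensolaris
```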
We continued our work with the development build because we wanted deduplication and the most recent version with the most recent fixes included.
Replacing the Adaptec controller driver
In order to use the drives attached to the Adaptec controllers we had to replace the driver shipped with OpenSolaris with the appropriate driver from Adaptec.
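The replacement followed the usual Solaris routine for third-party drivers; treat the following as a rough sketch – the driver module name (commonly `aac` for Adaptec RAID HBAs) and the package file name are assumptions here:

```shell
# Check which driver module currently claims the controllers
# ("aac" is the usual name for the bundled Adaptec RAID driver):
modinfo | grep -i aac
prtconf -D | grep -i adaptec

# Install the driver package downloaded from Adaptec
# (file name is a placeholder for the actual download):
pfexec pkgadd -d ./adaptec-driver.pkg
```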
Creating the ZFS pool
After successfully booting build 130 and replacing the Adaptec driver we created our first ZFS pool, “pool1”, with a total capacity of 20 TB:
zpool create pool1 raidz2 c10t0d0s0 c11t0d0s0 c12t0d0s0 c13t0d0s0 c14t0d0s0 c15t0d0s0 c16t0d0s0 c17t0d0s0 c18t0d0s0 c19t0d0s0 c20t0d0s0 c21t0d0s0 c22t0d0s0 c23t0d0s0 c24t0d0s0 c25t0d0s0 c26t0d0s0 c27t0d0s0 c28t0d0s0 c29t0d0s0 c30t0d0s0 c31t0d0s0
After several seconds the zpool was created successfully. So far – so cool.
I know a raidz2 vdev with 22 disks is pretty uncommon, and I would never ever use this configuration in production due to the high probability of additional disk failures during a resilver. But for the very first tests it seemed acceptable.
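The pool layout and health are easy to verify right after creation:

```shell
# Show pool capacity and health at a glance
zpool list pool1

# Show the vdev layout and per-disk state; every disk should be ONLINE
zpool status pool1
```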
Install missing packages
To use COMSTAR (and iSCSI as well) we needed several packages, which could easily be installed with the package manager (“pkg”).
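Roughly, the installation looked like this; the group package name below is recalled from the documentation of that period and may differ between builds, so treat it as an assumption:

```shell
# Install the COMSTAR storage server packages
# (group package name "storage-server" is an assumption):
pfexec pkg install storage-server

# Enable the SCSI Target Mode Framework service
pfexec svcadm enable stmf
```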
Booting after COMSTAR and ZPool creation
The first boot after installing the COMSTAR packages, replacing the qlc driver with the qlt driver (to export our storage over the SAN) and creating the large zpool caused the system to crash during boot.
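The qlc-to-qlt swap rebinds the QLogic HBA from initiator mode to COMSTAR target mode. A sketch of the usual procedure – the PCI alias below is for a 4 Gbit/s QLE246x-class card and is an assumption; take the real alias from `prtconf -vp` on your own hardware:

```shell
# Unbind the initiator-mode driver (qlc) from the HBA...
pfexec update_drv -d -i '"pciex1077,2432"' qlc

# ...and bind the COMSTAR target-mode driver (qlt) instead
pfexec update_drv -a -i '"pciex1077,2432"' qlt
```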
It took me some time to find out that, while installing the COMSTAR packages, the package manager had silently uninstalled the Adaptec driver we had installed before and replaced it with the original OpenSolaris driver (which cannot use disks attached to the Adaptec controllers). The next boot then crashed.
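Since then we check, after every package operation and before rebooting, that the Adaptec driver is still the one in place; a sketch (the grep patterns are assumptions – match them to the actual driver and package names on the system):

```shell
# Confirm the Adaptec driver package is still installed
# after any pkg operation:
pkg list | grep -i adaptec

# And confirm the expected module is still bound to the controllers:
prtconf -D | grep -i aac
```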
Testing and Crashing
Currently we are testing overall system performance. While doing so we faced problems where I/O to the Adaptec controllers would hang occasionally. The system stays responsive (most probably because the system disks are not attached via the Adaptec controllers), but I/O to the data pool is impossible. I've posted the problem to the OpenSolaris mailing list, but so far with no replies.
Conclusion: currently crashing
Although we are facing some problems we will continue this project. First of all, we separated the hard disks on each controller into their own zpools; if this is a hardware issue, the error should be confined to one zpool only. In addition, we will replace the controllers with LSI-based ones, which are also used in the Sun Thumper systems.
When the system runs, performance is good: we observed up to 1 GB/s (not Gbit!), i.e. 1000 MB/s, of sequential read/write throughput and up to 7,500 I/O operations per second. Exporting storage via the SAN works as well, with decent speed: we observed up to 320 MB/s on a QLogic 4 Gbit/s HBA under Linux SLES 10 SP3.
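For rough sequential numbers, simple dd runs are enough. A scaled-down, generic sketch (GNU dd syntax; the real tests used files far larger than RAM so the cache doesn't distort the result – the 100 MB size here is only for illustration):

```shell
# Sequential write: 100 MB in 1 MB blocks; conv=fsync forces the data
# to disk before dd reports its throughput
dd if=/dev/zero of=/tmp/ddtest bs=1M count=100 conv=fsync

# Sequential read of the file just written
dd if=/tmp/ddtest of=/dev/null bs=1M
```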