Wednesday, December 13, 2006

How to coerce OMSA 5.1 to install & run properly on CentOS 4

To monitor Dell PowerEdge servers running Linux, I downloaded OMSA (Open Manage Server Assistant) 5.1 package for Redhat Enterprise Linux 4 (RHEL4) on a bunch of Dell PowerEdge 6850 servers. The package was listed under 'system management' of the 'drivers & download' section. The actual file name is 'OM_5.1_ManNode_LIN_A00.tar.gz'.

On the PE6850 server, I unpacked the tarball, and attempted to install using the srvadmin-install.sh. It failed silently, with an error code of 2.
# supportscripts/srvadmin-install.sh
# echo $?
2
Examing srvadmin-install.sh closely, I realized that it didn't support CentOS 4. Instead, it supports & detects RHEL4 by its code node 'Nahant'. Instead of mucking the code to support CentOS, I decided to fake RHEL4 by modifying /etc/redhat-release.
# cat /etc/redhat-release
fake RHEL 4 Nahant
It worked.The installation went a-ok and all services started properly. 'omreport storage controllers' now reports various information on various storage controllers.

Thinking the faking was needed only by the installation scripts, I reverted /etc/redhat-release to its original version. To be sure, I restarted OMSA by ' srvadmin-services.sh stop|start'. Well, it didn't pan out that well.

# /usr/bin/srvadmin-services.sh start

Starting mptctl:
Waiting for mptctl driver registration to complete:
[ OK ] Starting Systems Management Device Drivers:
Starting dell_rbu: [ OK ]
Starting ipmi driver: [FAILED]
Starting Systems Management Device Drivers:
Starting dell_rbu: Already started [ OK ]
Starting ipmi driver: [FAILED]
Starting DSM SA Shared Services: [ OK ]

Starting DSM SA Connection Service: [ OK ]

Now it seemed the srvadmin-ipmi package's init script still relies on /etc/redhat-release to tell it which OS it is running on. The server doesn't have OpenIPMI installed. Without any IPMI driver loaded, no report can be run against storage or temps using OMSA.

To have it work for both world (OMSA and CentOS), I cooked up this /etc/redhat-release by appending the RHEL4 code name to the original /etc/redhat-release
# cat /etc/redhat-release.fake
CentOS release 4.4 (Final) Nahant
# /usr/bin/srvadmin-services.sh start
Starting mptctl:
Waiting for mptctl driver registration to complete: [ OK ] Starting Systems Management Device Drivers:
Starting dell_rbu: [ OK ]
Starting ipmi driver: [ OK ]
Starting Systems Management Data Engine:
Starting dsm_sa_datamgr32d: [ OK ]
Starting dsm_sa_eventmgr32d: [ OK ]
Starting DSM SA Shared Services: [ OK ]
Starting DSM SA Connection Service: [ OK ]

This cooked /etc/redhat-release had been verified to work well with 'yum -y update' and OMSA's 'srvadmin-services.sh start'

Wednesday, December 06, 2006

trick to enable non-bridged wireless access on a Netgear Wireless router (WGR614v6)

I bought a Netgear wireless router (WGR614v6) from CompUSA yesterday. It is going to a customer's site for WI-FI blackberries (BB7270). I need a wireless router that can simply
  • obtain its own TCP/IP via DHCP from a network port
  • self-contain its WLAN: NAT/SPI protect all wi-fi devices
I would be happier with a Netgear MR814v3, which we have in the lab and is known to work well with BB7270. All the bugs and glitches we ran into in the past year turned out to be on RIM's end, instead of the "poor" quality perceived by RIM's engineer of consumer-grade wireless access points.

Followed the auto-run wizard from the installation CD to the letter, I connected my own laptop to one of the LAN ports on the unit, with the unit's own Internet port plugged into the wall jacket in my office. Once I did a 'ipconfig /renew' on my laptop, the laptop got TCP/IP settings via DHCP as promised. It is a bit odd to see the DNS server is 192.168.1.1. This means the Netgear unit proxies DNS queries for devices utilizing its DHCP/NAT services. However, two things do not work right:
  • nslookup anything resolves to 192.168.1.1 ?
  • the LED on the front of the unit is not on for the WI-FI? 'Router status' says 'wireless :: off', while the checkbox to turn on 'wireless router radio' was shadowed out!
Searching on Netgear's site, I was relieved/disappointed to find the firmware was up to date. There's a link from WGR614's support page on how to enable the unit as a wireless access point. However, such enabling is to bridge the WLAN to the local LAN. I attempted that to satisfy my curiosity. It requires to use a regular LAN port as the autosense uplink instead of the separate Internet port on the unit. Once I did it, the laptop and a wifi blackberry were bridged to the real Ethernet LAN and got their TCP/IP settings from the real corporate DHCP server instead of from the unit's own DHCP server.

Lo & behold, everything went peachy all of a sudden, after this diversion. The checkbox to turn on wireless router radio option is now available, even if after I moved the cable from regular LAN port to Internet port on the unit. The wireless icon on the unit is blinking green. Power cycle the unit didn't eliminate such capability.

I am a bit at loss why wireless capability was disabled before and how it could be enabled by the steps I took. One thing I should have done at the beginning, is to reset it to factory default, just in case someone has messed it up somehow. The unit has all its seals and should be a new one. Anyhow, I am glad it worked out. However, I can't imagine any non-techie can get this far w/o picking up the phone or returning it to the retailer.

One thing to note is that WPA-PSK worked as advertised, both on the access point and on the BB7270. A lengthy 'mydoaminDS-24random' was used for SSID and the maximum allowed '63RANDOM' as the PSK (Pre-shared key). I couldn't imagine that I can type them in w/o a typo, so I emailed such lengthy secret to the email account currently assigned to my test BB7270. On the BB7270, I copy+paste these lengthy random bits from email to a new WLAN profile creation form under Options/WLAN. It worked great. With older version of BB7270 firmware, copy+paste WEP key fails each every time. I'd assume the bug was fixed by now (from v4.0.1.84 to v.4.0.1.104). On the BES (Blackberry Enterprise Solution) server, a site-specific "IT Policy" got created on the using these WLAN secrets worked great too. However, the SSID somehow took several kicks to get really sucked into the "IT policy". The BBM (Blackberry Manager) didn't complain when SSID didn't take, resulting that a BB7270 provisioned from scratch after nuking didn't carry any WLAN profile at all. Only upon reviewing the IT policy, I was shocked to find SSID field was blank while everything else under the WLAN policy section looked good.