Friday, November 03, 2006

Comcast connection problem (Motorola SF4100) for the past two weeks

I have been having problems with my Comcast connection (Motorola SF4100) for the past two weeks. The speed has not been stable at all, and at times the connection drops entirely. Why can't a broadband ISP provide a service as robust as running water or electricity? Is it because
we as customers are not that demanding?
  • we fail to scream at them for each and every outage or service degradation
  • we fail to find time to talk to our legislators
  • we fail to vote with our feet
  • we fail to penalize them even within the current framework. If only everyone would report the problem, ask for a ticket to be opened, and call back to Billing with the ticket number to get a refund! This would help on several fronts:
    • the sheer volume of calls/tickets would open some eyes when the BI or data warehouse reports come in
    • the sheer volume of refund checks would cut into the profit margin
    • the faster and the more we complain, the sooner they stop using us as a gauge of whether the service is up or not
  • At the very least, we should demand that legislation be passed so that refunds for system-wide outages are automated and go to the entire customer base, instead of the current more-hoopla-for-you scheme: a refund is only available to those who went through the trouble of reporting the problem, asking for a ticket, and calling back to Billing with the ticket number to get a teeny-weeny refund.

Here are the symptoms of the week:
  • Power-cycling the cable modem and/or wireless router may or may not help.
  • Download speed ranges from 200K to 0.4KBps to a complete stall, several times within an hour.
  • When I asked the router to renew its address, it failed to renew for over a dozen retries and timed out.
  • Speed test via http://www.speakeasy.net/speedtest/ shows the connection is not stable at all: 600KB/14KB at one point, and 200KB/104KB at another. It really should stay around 700KB/80KB.
  • The web interface of the cable modem itself shows the following errors/warnings. I have looked at this log when the modem works properly, and the entries should all be Debug/Informational.
061103155029 8-Debug F504.1 Bridge Ethernet Hook. Failed to learn CPE MAC Address.
061103155029 4-Error F507.5 MAC Filters. Add MAC Address can't add entry. Table is full.
061103155019 8-Debug M570.2 Motorola CM certificate present
061103155019 8-Debug M571.7 CM Cert Upgrade Enabled. Initiate after Registration
061103155019 8-Debug I503.0 Cable Modem is OPERATIONAL
061103155019 7-Information B401.0 Authorized
061103155018 8-Debug F502.1 Bridge Forwarding Enabled.
061103155018 8-Debug F502.3 Bridge Learning Enabled.
061103155018 7-Information B0.0 Baseline Privacy
061103155016 7-Information X518.9 Configuration - GGFMMD - Unit Update Enabled by CVC
061103155016 8-Debug I500.1 DOCSIS 1.0 Registration Completed
061103155016 7-Information I500.4 Attempting DOCSIS 1.0 Registration
061103155016 7-Information D509.0 Retrieved TFTP Config File SUCCESS
061103155013 7-Information D507.0 Retrieved Time....... SUCCESS
061103155013 7-Information D511.0 Retrieved DHCP .......... SUCCESS
061103155013 5-Warning D3.0 DHCP WARNING - Non-critical field invalid in response
061103155013 4-Error D530.8 DHCP - Invalid Log Server IP Address.
061103155013 5-Warning D520.2 DHCP Attempt# 6 BkOff: 5s Tot DSC:6 OFF:3 REQ:3 ACK:1
061103155013 3-Critical D1.0 DHCP FAILED - Discover sent, no offer received
061103154959 5-Warning D520.2 DHCP Attempt# 4 BkOff:27s Tot DSC:4 OFF:2 REQ:2 ACK:0
061103154959 3-Critical D1.0 DHCP FAILED - Discover sent, no offer received
061103154932 5-Warning D520.2 DHCP Attempt# 3 BkOff:13s Tot DSC:3 OFF:2 REQ:2 ACK:0
061103154932 3-Critical D2.0 DHCP FAILED - Request sent, No response
061103154927 5-Warning D520.2 DHCP Attempt# 2 BkOff: 4s Tot DSC:2 OFF:1 REQ:1 ACK:0
061103154927 3-Critical D1.0 DHCP FAILED - Discover sent, no offer received
061103154923 5-Warning D520.2 DHCP Attempt# 1 BkOff: 4s Tot DSC:1 OFF:1 REQ:1 ACK:0
061103154923 3-Critical D2.0 DHCP FAILED - Request sent, No response
061103154918 7-Information D0.0 DHCP CM Net Configuration download and Time of Day
061103154918 7-Information T500.0 Acquired Upstream .......... SUCCESS
061103154918 8-Debug T503.1 Acquire US with status OK, powerLevel 19, tempSid 1378
061103154918 8-Debug T505.0 Acquired Upstream with status OK
061103154916 7-Information T501.0 Acquired Downstream (687000000 Hz)........ SUCCESS
061103154916 8-Debug T509.0 Acquired DS with status OK, DS Freq 687000000, US Id 5
061103154906 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154905 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 105000000, US Id 0
061103154905 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154905 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 99000000, US Id 0
061103154905 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154905 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 93000000, US Id 0
061103154905 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154904 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 855000000, US Id 0
061103154904 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154904 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 849000000, US Id 0
061103154904 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154903 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 843000000, US Id 0
061103154903 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154903 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 837000000, US Id 0
061103154903 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154902 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 831000000, US Id 0
061103154902 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154902 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 825000000, US Id 0
061103154902 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154901 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 819000000, US Id 0
061103154901 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154901 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 813000000, US Id 0
061103154901 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154900 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 807000000, US Id 0
061103154900 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154900 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 801000000, US Id 0
061103154900 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154859 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 795000000, US Id 0
061103154859 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154859 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 789000000, US Id 0
061103154859 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154859 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 783000000, US Id 0
061103154859 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154858 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 777000000, US Id 0
061103154858 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154858 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 771000000, US Id 0
061103154858 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154857 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 765000000, US Id 0
061103154857 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154857 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 759000000, US Id 0
061103154857 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154856 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 753000000, US Id 0
061103154856 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154856 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 747000000, US Id 0
061103154856 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154855 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 741000000, US Id 0
061103154855 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154855 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 735000000, US Id 0
061103154855 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154855 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 729000000, US Id 0
061103154855 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154854 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 723000000, US Id 0
061103154854 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154854 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 717000000, US Id 0
061103154854 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154853 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 711000000, US Id 0
061103154853 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154853 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 705000000, US Id 0
061103154853 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154852 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 699000000, US Id 0
061103154852 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154852 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 693000000, US Id 0
061103154852 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154851 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 687000000, US Id 0
061103154851 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154851 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 681000000, US Id 0
061103154851 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154850 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 675000000, US Id 0
061103154850 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154850 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 669000000, US Id 0
061103154850 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154850 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 663000000, US Id 0
061103154850 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154849 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 657000000, US Id 0
061103154849 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154849 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 651000000, US Id 0
061103154849 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154848 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 645000000, US Id 0
061103154848 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154848 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 639000000, US Id 0
061103154848 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154847 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 633000000, US Id 0
061103154847 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154847 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 627000000, US Id 0
061103154847 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154847 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 621000000, US Id 0
061103154847 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154846 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 615000000, US Id 0
061103154846 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154846 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 609000000, US Id 0
061103154846 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154845 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 603000000, US Id 0
061103154845 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154845 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 597000000, US Id 0
061103154845 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154845 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 591000000, US Id 0
061103154845 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154844 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 585000000, US Id 0
061103154844 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154844 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 579000000, US Id 0
061103154844 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154843 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 573000000, US Id 0
061103154843 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154843 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 567000000, US Id 0
061103154843 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154842 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 561000000, US Id 0
061103154842 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154842 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 555000000, US Id 0
061103154842 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154842 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 549000000, US Id 0
061103154842 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154841 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 543000000, US Id 0
061103154841 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154841 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 537000000, US Id 0
061103154841 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154840 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 531000000, US Id 0
061103154840 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154840 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 525000000, US Id 0
061103154840 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154840 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 519000000, US Id 0
061103154840 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154839 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 513000000, US Id 0
061103154839 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154839 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 507000000, US Id 0
061103154839 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154838 8-Debug T509.0 Acquired DS with status NO FEC lock, DS Freq 501000000, US Id 0
061103154838 3-Critical T2.0 SYNC Timing Synchronization failure - Failed to acquire FEC framing
061103154837 7-Information D519.0 DHCP Client shutting down.
061103154837 7-Information H501.2 HFC: Shutting Downstream Down
061103154837 3-Critical I2.0 REG RSP not received
061103154837 1-Emergency I506.0
061103154828 7-Information I500.4 Attempting DOCSIS 1.0 Registration
061103154828 7-Information D509.0 Retrieved TFTP Config File SUCCESS
061103154828 7-Information D507.0 Retrieved Time....... SUCCESS
************ 7-Information D511.0 Retrieved DHCP .......... SUCCESS
************ 5-Warning D3.0 DHCP WARNING - Non-critical field invalid in response
************ 4-Error D530.8 DHCP - Invalid Log Server IP Address.
************ 5-Warning D520.2 DHCP Attempt# 1 BkOff: 5s Tot DSC:1 OFF:1 REQ:1 ACK:1
************ 7-Information D0.0 DHCP CM Net Configuration download and Time of Day
************ 7-Information T500.0 Acquired Upstream .......... SUCCESS
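
A log this long is easier to read as a tally. The following is a minimal Python sketch, not anything provided by the modem. It assumes each line follows the layout seen above (timestamp, level-Severity, event code, optional message), and the names `summarize`, `LINE_RE` and `SAMPLE` are made up for the illustration. It counts entries per severity and, for level 4 (Error) and more severe, per event code.

```python
import re
from collections import Counter

# Assumed line layout, inferred from the log above:
#   <timestamp or masked timestamp> <level>-<Severity> <code> [message]
LINE_RE = re.compile(r"^(\S+)\s+(\d)-(\w+)\s+(\S+)\s*(.*)$")


def summarize(log_text):
    """Return (count per severity, count per event code for level <= 4)."""
    by_severity = Counter()
    by_code = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines or anything not shaped like a log entry
        _timestamp, level, severity, code, _message = match.groups()
        by_severity[severity] += 1
        # Lower numbers are more severe: 1-Emergency, 3-Critical, 4-Error.
        if int(level) <= 4:
            by_code[code] += 1
    return by_severity, by_code


# A few lines copied from the log above, just to exercise the parser.
SAMPLE = """\
061103155029 4-Error F507.5 MAC Filters. Add MAC Address can't add entry. Table is full.
061103155019 8-Debug I503.0 Cable Modem is OPERATIONAL
061103155013 3-Critical D1.0 DHCP FAILED - Discover sent, no offer received
061103154959 3-Critical D1.0 DHCP FAILED - Discover sent, no offer received
061103154837 1-Emergency I506.0
"""

if __name__ == "__main__":
    severities, codes = summarize(SAMPLE)
    for name, count in severities.most_common():
        print(name, count)
    for code, count in codes.most_common():
        print(code, count)
```

Pasting the full log text in place of `SAMPLE` should make the long run of T2.0 sync failures and the D1.0/D2.0 DHCP failures stand out at a glance.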

Wednesday, November 01, 2006

PERC 4e/Di on Dell PE6850 saga continues... part A

We ended up applying a BIOS upgrade (A00->A01) and a PERC 4e/Di firmware upgrade (521A to 522A A13) for the system lockup problems we had on the production database server running on a Dell PE6850. Home-made load tests didn't cause a panic for 18 hours. The server was then rushed back into production, since the fail-over spare server couldn't stand the load.

The server (the Sybase database engines) has been up for 14 days as of today. At 09:50am, just when the server started to ramp up to its daily load peak (CPU load ~=4), some processes failed to write to the disk, and 'date > junk' from the command line just hung there. I canceled that 'date > junk'. Everything was back to normal in less than 4 minutes. There was nothing interesting (warn/error/abort) in the system log, the exportlog from the PERC controller, or the database log. Patrol Read (PR) was running at the time.
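
The 'date > junk' check can be turned into a small probe that times a tiny write. This is only a sketch under my own assumptions: `timed_write` and `probe` are hypothetical helper names, and the interval and threshold are arbitrary. Note that if the disk truly stalls, the probe blocks along with everything else and can only report the delay after the write finally completes.

```python
import os
import tempfile
import time


def timed_write(directory, payload=b"probe\n"):
    """Write and fsync a small temp file in `directory`; return elapsed seconds."""
    start = time.monotonic()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the data to the device, not just the page cache
    finally:
        os.close(fd)
        os.unlink(path)
    return time.monotonic() - start


def probe(directory, interval=5.0, threshold=1.0, rounds=3):
    """Run `rounds` timed writes and return the ones slower than `threshold`."""
    slow = []
    for _ in range(rounds):
        elapsed = timed_write(directory)
        if elapsed > threshold:
            slow.append(elapsed)
        time.sleep(interval)
    return slow


if __name__ == "__main__":
    # The directory is an assumption; point it at the volume behind the controller.
    print(probe(tempfile.gettempdir(), interval=1.0, threshold=0.5, rounds=3))
```

Run from cron or a loop with timestamps added, something like this would at least show how long each pause lasts and whether pauses line up with PR activity.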

The symptoms definitely differ, so the BIOS and firmware upgrades did make some difference for the better. In the previous two lockups, the only two in 15 months, we lost access to the disks totally, getting "reject i/o to offlined disk" without kernel panic or corruption. This time, it was merely a hiccup, a pause, or a suspension of sorts.

Older postings on similar topics on the dell-linux-poweredge forum suggested PR could be the culprit if the BIOS/firmware is up to date. On this system, I get the following output from "megapr -dispPR -a0" today. Is #Iterations the current count of how many times PR has run, or a threshold of some sort? If the former, how do I clear it? If the latter, how do I increase it? Basically I am looking into why it locked up after exactly 30 days (it could be a coincidence too, and we are now using newer BIOS and firmware). Dell diagnostics from OMSA 4.4 on 10/17/2006 suggest nothing is wrong with the controller, memory, or underlying disks. (The omreport output for the controller is appended below too.)

********PR INFO********
Mode :AUTO
#Iterations:2200
Status :PR In Progress

# omreport storage controller
Controller PERC 4e/Di (Embedded)

Controllers
ID : 0
Status : Ok
Name : PERC 4e/Di
Slot ID : Embedded
State : Ready
Firmware Version : 522A
Driver Version : Not Applicable
Minimum Required Firmware Version : Not Applicable
Minimum Required Driver Version : Not Applicable
Number of Channels : 2
Rebuild Rate : 30%
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : 7
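
One way to find out whether #Iterations is a running count is to record it periodically and see whether it moves. Below is a rough Python sketch that only parses text shaped like the two outputs above into a dictionary; `parse_report` and `SAMPLE` are names I made up, and actually running megapr or omreport and storing the result over time is left out.

```python
def parse_report(text):
    """Parse 'Key : Value' or 'Key:Value' lines into a dict.

    Lines without a colon (banners, headings, the command itself) are skipped.
    The split is on the first colon only, so values may contain colons.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        fields[key.strip()] = value.strip()
    return fields


# Lines copied from the megapr and omreport output above.
SAMPLE = """\
********PR INFO********
Mode :AUTO
#Iterations:2200
Status :PR In Progress
Firmware Version : 522A
Rebuild Rate : 30%
"""

if __name__ == "__main__":
    report = parse_report(SAMPLE)
    print(report["#Iterations"], report["Status"], report["Firmware Version"])
```

Comparing `#Iterations` between two captures taken a day or so apart would show whether the value changes; if it never does, it is more likely a setting than a counter.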

Also, we upgraded the BIOS from A00 to A01, instead of to the latest A04, since the release notes of A02 through A04 didn't seem pertinent at the time. On a second read of A03's release notes, I noticed the following two fixes that could be relevant to this system. Where can I find more detailed notes other than PE6850-BIOSA03.TXT? I don't quite understand why the developers or release managers were so sparing with words.

  • Added support for Virtualization Technology in the processor.
Should I assume this refers not to HT, but to special server virtualization assistance from Intel's VT (?) technology or the like?
  • Added support for 800MHz system configurations.
Does this mean BIOS versions prior to A03 don't support 800MHz system configurations?

Also, the megaraid* driver is dated early 2005, and the ChangeLog.megaraid in /kernel/Documentation doesn't list any interesting changes either.