After a stock installation of the Fedora Core Linux 6, I am pleasantly surprised at the first boot, with the SELinux policy management GUI at part of the 'firstboot' program. You can collapse to set granular policies for services of your interest. For each service, all you need to do is to check or to uncheck policy items .Such a nice GUI management would definitely help greatly to unleash the raw power of SELinux to system administrators or even security administrators who would otherwise have to spend much more time to study, research, and to manage SELinux policy effectively.
The same SELinux policy management tool is readily accessible under System-> Administration->Security Level and Firewall. I guess the name of the applet has not really caught up with this new integration.
Of course, a system won't be secure unless it can be kept up-to-date with all good security patches and updates. In that regard, FC6 is improved as well. The update notification applet is improved to make the update process point-and-click compliant. You will get Package Updater running all GUI to assist regular desktop users to keep their system up to date.
Tuesday, October 31, 2006
kernel panic when using a 2G ramdisk on a CentOS 4.4 serverCD installation
I was burnt recently with a minimal installation using CentOS 4.4 ServerCD. CentOS project has this one-CD flavor for server use in addition to the regular current 4-CD full distro, mirroring Redhat Enterprise Advanced Server (RHAS). I found it awfully convenient to download only one CD of 530M, instead of the full set of 2.4G. The installations I do at work are all server installations, if my own Linux desktop is not counted. It has been a god-given ever since I first noticed its existence in CentOS 4.2 /iso folder.
Kernel paniced for a new production Sybase ASE database server when Sybase ASE 12.5 database engines started to zero out ~2G tmpdb devices (data & log devices). The hardware specs: Dell PowerEdge 6850 with 16G of DDR2 RAM and quad CPUs. The box survived my own stress test on CPU and disk. However, as an hindsight, no attempt was made to exhaust the memory.
The Sybase seemed to have finished its task, when the DBA called me saying he lost connection to the machine. I attached to its serial console and opened a connection using Kermit. Input was not rejected, but no response (no echo) from the system. After a quick reboot, I found nothing interesting in the log tail around the panic ( syslog.conf is set to write kernel.* to /var/log/messages). So, I decided to log the console session and asked the DBA to start Sybase. Sure enough, it happened again, with tons of 'out of memory' messages, followed by desperate yet fatal attempts by oom-killer to kill all processes to free up memory.
oom-killer: gfp_mask=0xd0
Mem-info: DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
cpu 2 hot: low 2, high 6, batch 1
cpu 2 cold: low 0, high 2, batch 1
cpu 3 hot: low 2, high 6, batch 1
cpu 3 cold: low 0, high 2, batch 1
cpu 4 hot: low 2, high 6, batch 1
cpu 4 cold: low 0, high 2, batch 1
cpu 5 hot: low 2, high 6, batch 1
cpu 5 cold: low 0, high 2, batch 1
cpu 6 hot: low 2, high 6, batch 1
cpu 6 cold: low 0, high 2, batch 1
cpu 7 hot: low 2, high 6, batch 1
cpu 7 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
cpu 2 hot: low 32, high 96, batch 16
cpu 2 cold: low 0, high 32, batch 16
cpu 3 hot: low 32, high 96, batch 16
cpu 3 cold: low 0, high 32, batch 16
cpu 4 hot: low 32, high 96, batch 16
cpu 4 cold: low 0, high 32, batch 16
cpu 5 hot: low 32, high 96, batch 16
cpu 5 cold: low 0, high 32, batch 16
cpu 6 hot: low 32, high 96, batch 16
cpu 6 cold: low 0, high 32, batch 16
cpu 7 hot: low 32, high 96, batch 16
cpu 7 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16 cpu 1 cold: low 0, high 32, batch 16 cpu 2 hot: low 32, high 96, batch 16 cpu 2 cold: low 0, high 32, batch 16 cpu 3 hot: low 32, high 96, batch 16 cpu 3 cold: low 0, high 32, batch 16 cpu 4 hot: low 32, high 96, batch 16 cpu 4 cold: low 0, high 32, batch 16 cpu 5 hot: low 32, high 96, batch 16 cpu 5 cold: low 0, high 32, batch 16 cpu 6 hot: low 32, high 96, batch 16 cpu 6 cold: low 0, high 32, batch 16 cpu 7 hot: low 32, high 96, batch 16 cpu 7 cold: low 0, high 32, batch 16 Free pages: 12506892kB (12493888kB HighMem) Active:208204 inactive:791964 dirty:101636 writeback:26 unstable:0 free:3126723 slab:20412 mapped:563179 pagetables:DMA free:12564kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:4 all_unreclaimable? yes protections[]: 0 0 0 Normal free:440kB min:928kB low:1856kB high:2784kB active:644696kB inactive:16036kB present:901120kB pages_scanned:1069066 all_unreclaimable? yes protections[]: 0 0 0 HighMem free:12493888kB min:512kB low:1024kB high:1536kB active:180824kB inactive:3159244kB present:16908288kB pages_scanned:0 all_unreclaimable? no protections[]: 0 0 0 DMA: 3*4kB 5*8kB 4*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12564kB Normal: 0*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 440kB HighMem: 1418*4kB 535*8kB 128*16kB 85*32kB 119*64kB 38*128kB 10*256kB 4*512kB 2*1024kB 2*2048kB 3041*4096kB = 12493888kB Swap cache: add 0, delete 0, find 0/0, race 0+0 0 bounce buffer pages Free swap: 16779884kB 4456448 pages of RAM 3963834 pages of HIGHMEM 299626 reserved pages 849369 pages shared 0 pages swap cached Out of Memory: Killed process 6586 (dataserver). oom-killer: gfp_mask=0xd0 Mem-info: DMA per-cpu: cpu 0 hot: low 2, high 6, batch 1 cpu 0 cold: low 0, high 2, batch 1 cpu 1 hot: low 2, high 6, batch 1 cpu 1 cold: low 0, high 2, batch 1 cpu 2 hot: low 2, high 6, batch 1 cpu 2 cold: low 0, high 2, batch 1 cpu 3 hot: low 2, high 6, batch 1 cpu 3 cold: low 0, high 2, batch 1 cpu 4 hot: low 2, high 6, batch 1 cpu 4 cold: low 0, high 2, batch 1 cpu 5 hot: low 2, high 6, batch 1 cpu 5 cold: low 0, high 2, batch 1 cpu 6 hot: low 2, high 6, batch 1 cpu 6 cold: low 0, high 2, batch 1 cpu 7 hot: low 32, high 96, batch 16 cpu 7 cold: low 0, high 32, batch 16 Free pages: 12507020kB (12493888kB HighMem) Active:46134 inactive:953784 dirty:101636 writeback:26 unstable:0 free:3126755 slab:20379 mapped:563179 pagetables:1589 DMA free:12564kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:4 all_unreclaimable? yes protections[]: 0 0 0 Normal free:568kB min:928kB low:1856kB high:2784kB active:0kB inactive:659820kB present:901120kB pages_scanned:1845199 all_unreclaimable? no protections[]: 0 0 0 HighMem free:12493888kB min:512kB low:1024kB high:1536kB active:180824kB inactive:3159244kB present:16908288kB pages_scanned:0 all_unreclaimable? no protections[]: 0 0 0 DMA: 3*4kB 5*8kB 4*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12564kB Normal: 20*4kB 3*8kB 3*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 568kB HighMem: 1418*4kB 535*8kB 128*16kB 85*32kB 119*64kB 38*128kB 10*256kB 4*512kB 2*1024kB 2*2048kB 3041*4096kB = 12493888kB Swap cache: add 0, delete 0, find 0/0, race 0+0 0 bounce buffer pages Free swap: 16779884kB 4456448 pages of RAM 3963834 pages of HIGHMEM 299626 reserved pages 849602 pages shared 0 pages swap cached Out of Memory: Killed process 6587 (dataserver). oom-killer: gfp_mask=0xd0 Mem-info: DMA per-cpu: cpu 0 hot: low 2, high 6, batch 1 cpu 0 cold: low 0, high 2, batch 1 cpu 1 hot: low 2, high 6, batch 1 cpu 1 cold: low 0, high 2, batch 1 cpu 2 hot: low 2, high 6, batch 1
The server was rebooted fine with sybase up the day before. Today, the DBA got some message claiming the tmpdb devices are not initialized. Since these tmpdb devices are created from scratch when sybase starts, it really didn't make sense! DBA decides to drop the current set of tmpdb devices and create a new set instead. Just when the new set is finishing initialization per Sybase logs, the system stopped responding. Before I lost my console, I were able to issue 'ps -e -o rss,pid,cmd,args|sort -n', 'free', and 'top'. Only about 6G is used out of the total 16G. So, the memory is not really exhausted. So, this got to do with how Kernel is carving and using the memory. Googling ' oom-killer: gfp_mask=0xd0' yields a single post which referred to a bug of sorts by a overly-zealous OOM-killer and how to disable it, due to problem with memory addressing scheume switch between high mem and low mem area.
For that, I wonder if ramdisk was claiming HIMEM. If HIMEM were not addressable or addressed improperly, that may have the appearance of OOM (out of memory). Once I started to examine '/var/log/messages' line by line, there, it was painfully obvious that hugemem kernel was not selected by CentOS ServerCD 4.4. The regular kernel-smp was selected.
Oct 23 12:35:30 syb06 kernel: Linux version 2.6.9-42.ELsmp (buildcentos@build-i386) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3 )) #1 SMP Sat Aug 12 09:39:11 CDT 2006
Oct 23 12:35:30 syb06 kernel: BIOS-provided physical RAM map:
Oct 23 12:35:30 syb06 kernel:********************************************************
Oct 23 12:35:30 syb06 kernel: * This system has more than 16 Gigabyte of memory. *
Oct 23 12:35:30 syb06 kernel: * It is recommended that you read the release notes *
Oct 23 12:35:30 syb06 kernel: * that accompany your copy of Red Hat Enterprise Linux *
Oct 23 12:35:30 syb06 kernel: * about the recommended kernel for such configurations *
Oct 23 12:35:30 syb06 kernel: **********************************************
Once I downloaded the corresponding kernel-hugemem and rebooted the server with it, (grub.conf needed to be edited manually to to boot the new kernel by default). All is peachy again.
P.S. To answer Barry's question below, here is the 12-day memory RRD graph. The panic happened on the last Thursday or Friday, where lines were broken due to missing data. It would be nicer if Blogger actually allows image insertion inside comments. -20061030
P.P.S. Obviously the correct selection of the right kernel package and the maintenance thereof had been a problem/pain for Redhat's maintainers and customers. Per its release notes, the newly minted Fedore Core 6 (FC6) has one single kernel package per architecture. Kernel parameters optimized for different hardware specs (SMP or UNP, 4G/8G/16G/32G of RAM, etc.) would be set dynamicly upon boot, instead of being hard coded in different kernel packages: kernel, kernel-smp, and kernel-hugemem. Of course, with RHEL5 beta in the horizon, I guess we won't see such a change (in RHEL6) for a few years :(
Kernel paniced for a new production Sybase ASE database server when Sybase ASE 12.5 database engines started to zero out ~2G tmpdb devices (data & log devices). The hardware specs: Dell PowerEdge 6850 with 16G of DDR2 RAM and quad CPUs. The box survived my own stress test on CPU and disk. However, as an hindsight, no attempt was made to exhaust the memory.
The Sybase seemed to have finished its task, when the DBA called me saying he lost connection to the machine. I attached to its serial console and opened a connection using Kermit. Input was not rejected, but no response (no echo) from the system. After a quick reboot, I found nothing interesting in the log tail around the panic ( syslog.conf is set to write kernel.* to /var/log/messages). So, I decided to log the console session and asked the DBA to start Sybase. Sure enough, it happened again, with tons of 'out of memory' messages, followed by desperate yet fatal attempts by oom-killer to kill all processes to free up memory.
oom-killer: gfp_mask=0xd0
Mem-info: DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
cpu 2 hot: low 2, high 6, batch 1
cpu 2 cold: low 0, high 2, batch 1
cpu 3 hot: low 2, high 6, batch 1
cpu 3 cold: low 0, high 2, batch 1
cpu 4 hot: low 2, high 6, batch 1
cpu 4 cold: low 0, high 2, batch 1
cpu 5 hot: low 2, high 6, batch 1
cpu 5 cold: low 0, high 2, batch 1
cpu 6 hot: low 2, high 6, batch 1
cpu 6 cold: low 0, high 2, batch 1
cpu 7 hot: low 2, high 6, batch 1
cpu 7 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
cpu 2 hot: low 32, high 96, batch 16
cpu 2 cold: low 0, high 32, batch 16
cpu 3 hot: low 32, high 96, batch 16
cpu 3 cold: low 0, high 32, batch 16
cpu 4 hot: low 32, high 96, batch 16
cpu 4 cold: low 0, high 32, batch 16
cpu 5 hot: low 32, high 96, batch 16
cpu 5 cold: low 0, high 32, batch 16
cpu 6 hot: low 32, high 96, batch 16
cpu 6 cold: low 0, high 32, batch 16
cpu 7 hot: low 32, high 96, batch 16
cpu 7 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16 cpu 1 cold: low 0, high 32, batch 16 cpu 2 hot: low 32, high 96, batch 16 cpu 2 cold: low 0, high 32, batch 16 cpu 3 hot: low 32, high 96, batch 16 cpu 3 cold: low 0, high 32, batch 16 cpu 4 hot: low 32, high 96, batch 16 cpu 4 cold: low 0, high 32, batch 16 cpu 5 hot: low 32, high 96, batch 16 cpu 5 cold: low 0, high 32, batch 16 cpu 6 hot: low 32, high 96, batch 16 cpu 6 cold: low 0, high 32, batch 16 cpu 7 hot: low 32, high 96, batch 16 cpu 7 cold: low 0, high 32, batch 16 Free pages: 12506892kB (12493888kB HighMem) Active:208204 inactive:791964 dirty:101636 writeback:26 unstable:0 free:3126723 slab:20412 mapped:563179 pagetables:DMA free:12564kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:4 all_unreclaimable? yes protections[]: 0 0 0 Normal free:440kB min:928kB low:1856kB high:2784kB active:644696kB inactive:16036kB present:901120kB pages_scanned:1069066 all_unreclaimable? yes protections[]: 0 0 0 HighMem free:12493888kB min:512kB low:1024kB high:1536kB active:180824kB inactive:3159244kB present:16908288kB pages_scanned:0 all_unreclaimable? no protections[]: 0 0 0 DMA: 3*4kB 5*8kB 4*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12564kB Normal: 0*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 440kB HighMem: 1418*4kB 535*8kB 128*16kB 85*32kB 119*64kB 38*128kB 10*256kB 4*512kB 2*1024kB 2*2048kB 3041*4096kB = 12493888kB Swap cache: add 0, delete 0, find 0/0, race 0+0 0 bounce buffer pages Free swap: 16779884kB 4456448 pages of RAM 3963834 pages of HIGHMEM 299626 reserved pages 849369 pages shared 0 pages swap cached Out of Memory: Killed process 6586 (dataserver). oom-killer: gfp_mask=0xd0 Mem-info: DMA per-cpu: cpu 0 hot: low 2, high 6, batch 1 cpu 0 cold: low 0, high 2, batch 1 cpu 1 hot: low 2, high 6, batch 1 cpu 1 cold: low 0, high 2, batch 1 cpu 2 hot: low 2, high 6, batch 1 cpu 2 cold: low 0, high 2, batch 1 cpu 3 hot: low 2, high 6, batch 1 cpu 3 cold: low 0, high 2, batch 1 cpu 4 hot: low 2, high 6, batch 1 cpu 4 cold: low 0, high 2, batch 1 cpu 5 hot: low 2, high 6, batch 1 cpu 5 cold: low 0, high 2, batch 1 cpu 6 hot: low 2, high 6, batch 1 cpu 6 cold: low 0, high 2, batch 1 cpu 7 hot: low 32, high 96, batch 16 cpu 7 cold: low 0, high 32, batch 16 Free pages: 12507020kB (12493888kB HighMem) Active:46134 inactive:953784 dirty:101636 writeback:26 unstable:0 free:3126755 slab:20379 mapped:563179 pagetables:1589 DMA free:12564kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:4 all_unreclaimable? yes protections[]: 0 0 0 Normal free:568kB min:928kB low:1856kB high:2784kB active:0kB inactive:659820kB present:901120kB pages_scanned:1845199 all_unreclaimable? no protections[]: 0 0 0 HighMem free:12493888kB min:512kB low:1024kB high:1536kB active:180824kB inactive:3159244kB present:16908288kB pages_scanned:0 all_unreclaimable? no protections[]: 0 0 0 DMA: 3*4kB 5*8kB 4*16kB 3*32kB 3*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 2*4096kB = 12564kB Normal: 20*4kB 3*8kB 3*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 568kB HighMem: 1418*4kB 535*8kB 128*16kB 85*32kB 119*64kB 38*128kB 10*256kB 4*512kB 2*1024kB 2*2048kB 3041*4096kB = 12493888kB Swap cache: add 0, delete 0, find 0/0, race 0+0 0 bounce buffer pages Free swap: 16779884kB 4456448 pages of RAM 3963834 pages of HIGHMEM 299626 reserved pages 849602 pages shared 0 pages swap cached Out of Memory: Killed process 6587 (dataserver). oom-killer: gfp_mask=0xd0 Mem-info: DMA per-cpu: cpu 0 hot: low 2, high 6, batch 1 cpu 0 cold: low 0, high 2, batch 1 cpu 1 hot: low 2, high 6, batch 1 cpu 1 cold: low 0, high 2, batch 1 cpu 2 hot: low 2, high 6, batch 1
The server was rebooted fine with sybase up the day before. Today, the DBA got some message claiming the tmpdb devices are not initialized. Since these tmpdb devices are created from scratch when sybase starts, it really didn't make sense! DBA decides to drop the current set of tmpdb devices and create a new set instead. Just when the new set is finishing initialization per Sybase logs, the system stopped responding. Before I lost my console, I were able to issue 'ps -e -o rss,pid,cmd,args|sort -n', 'free', and 'top'. Only about 6G is used out of the total 16G. So, the memory is not really exhausted. So, this got to do with how Kernel is carving and using the memory. Googling ' oom-killer: gfp_mask=0xd0' yields a single post which referred to a bug of sorts by a overly-zealous OOM-killer and how to disable it, due to problem with memory addressing scheume switch between high mem and low mem area.
For that, I wonder if ramdisk was claiming HIMEM. If HIMEM were not addressable or addressed improperly, that may have the appearance of OOM (out of memory). Once I started to examine '/var/log/messages' line by line, there, it was painfully obvious that hugemem kernel was not selected by CentOS ServerCD 4.4. The regular kernel-smp was selected.
Oct 23 12:35:30 syb06 kernel: Linux version 2.6.9-42.ELsmp (buildcentos@build-i386) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3 )) #1 SMP Sat Aug 12 09:39:11 CDT 2006
Oct 23 12:35:30 syb06 kernel: BIOS-provided physical RAM map:
Oct 23 12:35:30 syb06 kernel:********************************************************
Oct 23 12:35:30 syb06 kernel: * This system has more than 16 Gigabyte of memory. *
Oct 23 12:35:30 syb06 kernel: * It is recommended that you read the release notes *
Oct 23 12:35:30 syb06 kernel: * that accompany your copy of Red Hat Enterprise Linux *
Oct 23 12:35:30 syb06 kernel: * about the recommended kernel for such configurations *
Oct 23 12:35:30 syb06 kernel: **********************************************
Once I downloaded the corresponding kernel-hugemem and rebooted the server with it, (grub.conf needed to be edited manually to to boot the new kernel by default). All is peachy again.
P.S. To answer Barry's question below, here is the 12-day memory RRD graph. The panic happened on the last Thursday or Friday, where lines were broken due to missing data. It would be nicer if Blogger actually allows image insertion inside comments. -20061030
P.P.S. Obviously the correct selection of the right kernel package and the maintenance thereof had been a problem/pain for Redhat's maintainers and customers. Per its release notes, the newly minted Fedore Core 6 (FC6) has one single kernel package per architecture. Kernel parameters optimized for different hardware specs (SMP or UNP, 4G/8G/16G/32G of RAM, etc.) would be set dynamicly upon boot, instead of being hard coded in different kernel packages: kernel, kernel-smp, and kernel-hugemem. Of course, with RHEL5 beta in the horizon, I guess we won't see such a change (in RHEL6) for a few years :(
Labels:
centos,
dell poweredge,
dell server,
FC6,
fedora core,
kernel,
linux,
ramdisk
Monday, October 23, 2006
step-by-step: how to set up a large ramdisk on CentOS 4.4/i386
On our new database server running CentOS 4.4 on a Dell PE 6850, a 2G ramdisk is needed. The option to use 'tmpfs' was considered and forfeited due to its potential of using disk swap.
Much to my surprise, the ramdisk is not mounted after the reboot! The mke2fs log showed the mke2fs was successful. When I attempted to mount it manually, I got
# mount /dev/ram1 /mnt/ram1 mount: wrong fs type, bad option, bad superblock on /dev/ram1, or too many mounted file systems
In the log, I did notice the block size was 4K was instead of the 1K block size I observed with a 512M ramdisk.
Block size=4096 (log=2) Fragment size=4096 (log=2) 262144 inodes, 524288 blocks
Since the Sybase ASE uses 2K page size, I tried mke2fs using a 2K block. No joy! Mounting attempts were met with the same errors.
1K block size is known to have worked for 512M ramdisk, so I gave it a try. Bingo!
#mke2fs -b 1024 /dev/ram1 Block size=1024 (log=0) Fragment size=1024 (log=0) 262144 inodes, 2097152 blocks
Now we have a working ramdisk as below:
/dev/ram1 2063763 3086 1955820 1% /mnt/ram1
I don't see much of a point to waste space on 5% reserved for superuser. So, I tuned it down to 1%
# tune2fs -m 1 /dev/ram1
/dev/ram1 2063763 3086 2039706 1% /mnt/ram1
From the 'free' output, I am a bit surprised to see the 2G is not counted as 'used'.
total used free shared buffers cached Mem: 16627288 231236 16396052 0 48300 89240 -/+ buffers/cache: 93696 16533592 Swap: 16779884 0 16779884
bootparam man[7] page actually says " These days ram disks use the buffer cache, and grow dynamically." If so, ramdisk truly behaves similar to tmpfs at this point, except
- To make the ramdisk 2G apiece, a kernel parameter (ramdisk_size) needs to be passed ( grub.conf or lilo.conf) and a quick reboot is required to make it effective. The ramdisk by default is 16M in size.
Here is my grub.conf entry:
kernel /vmlinuz-2.6.9-42.ELsmp ro root=LABEL=/ console=ttyS0,57600n8 console=tty1 ramdisk_size=2097152
- To make the ramdisk accessible to the database after reboot, I cooked a chkconfig-compliant init script
---cut----x------/etc/init.d/ramdisk4syb-----x-----cut---------
#!/bin/bash
#
# $Id: rc.ramdisk4syb,v 1.00 2006/10/23 13:57:14 jer Exp $
#
# rc file for set up ramdisk fs for sybase to use for tempdb
#
#
#!/bin/bash
#
# $Id: rc.ramdisk4syb,v 1.00 2006/10/23 13:57:14 jer Exp $
#
# rc file for set up ramdisk fs for sybase to use for tempdb
#
#
# For Redhat-ish systems
#
# chkconfig: 2345 30 70
# processname: ramdisk4syb
# config: none
# description: set up ramdisk for sybase to use for tempdb
case $1 in
start)
test -d /mnt/ram1 || mkdir /mnt/ram1
mke2fs -m 1 -b 1024 /dev/ram1 >/tmp/ramdisk.mke2fs.log &&
mount -t ext2 /dev/ram1 /mnt/ram1 &&
chmod 770 /mnt/ram1 && chown sybase:dba /mnt/ram1
;;
stop)
umount /mnt/ram1
;;
*)
echo "Usage: $0 start|stop"
;;
esac
#
# chkconfig: 2345 30 70
# processname: ramdisk4syb
# config: none
# description: set up ramdisk for sybase to use for tempdb
case $1 in
start)
test -d /mnt/ram1 || mkdir /mnt/ram1
mke2fs -m 1 -b 1024 /dev/ram1 >/tmp/ramdisk.mke2fs.log &&
mount -t ext2 /dev/ram1 /mnt/ram1 &&
chmod 770 /mnt/ram1 && chown sybase:dba /mnt/ram1
;;
stop)
umount /mnt/ram1
;;
*)
echo "Usage: $0 start|stop"
;;
esac
- to pin the init script to run levels
chkconfig --add ramdisk4syb
- to test the init script before reboot
/etc/rc3.d/S30ramdisk4syb start && df -k /mnt/ram1 && ls -ld /man/ram1
- to make all this effective by reboot.
init 6
- to verify the resultant ramdisk file system
df -k /mnt/ram1 && ls -ld /man/ram1
Much to my surprise, the ramdisk is not mounted after the reboot! The mke2fs log showed the mke2fs was successful. When I attempted to mount it manually, I got
# mount /dev/ram1 /mnt/ram1 mount: wrong fs type, bad option, bad superblock on /dev/ram1, or too many mounted file systems
In the log, I did notice the block size was 4K was instead of the 1K block size I observed with a 512M ramdisk.
Block size=4096 (log=2) Fragment size=4096 (log=2) 262144 inodes, 524288 blocks
Since the Sybase ASE uses 2K page size, I tried mke2fs using a 2K block. No joy! Mounting attempts were met with the same errors.
1K block size is known to have worked for 512M ramdisk, so I gave it a try. Bingo!
#mke2fs -b 1024 /dev/ram1 Block size=1024 (log=0) Fragment size=1024 (log=0) 262144 inodes, 2097152 blocks
Now we have a working ramdisk as below:
/dev/ram1 2063763 3086 1955820 1% /mnt/ram1
I don't see much of a point to waste space on 5% reserved for superuser. So, I tuned it down to 1%
# tune2fs -m 1 /dev/ram1
/dev/ram1 2063763 3086 2039706 1% /mnt/ram1
From the 'free' output, I am a bit surprised to see the 2G is not counted as 'used'.
total used free shared buffers cached Mem: 16627288 231236 16396052 0 48300 89240 -/+ buffers/cache: 93696 16533592 Swap: 16779884 0 16779884
bootparam man[7] page actually says " These days ram disks use the buffer cache, and grow dynamically." If so, ramdisk truly behaves similar to tmpfs at this point, except
- ramdisk's maximum size (the reserved?) needs a reboot to adjust, while tmpfs can be adjusted on the fly.
- ramdisk won't have swap in the picture at all. [good thing]
- Does the 2.6 kernel or other series know or set ramdisk has higher priority on its claim to thus reserved memory?!
- why ext2 file system made with block sizes other than 1K failed to be mounted ? ramdisk is the only case here?
Labels:
centos,
dell poweredge,
dell server,
ext2,
linux,
ramdisk,
Sybase ASE,
sysadmin
Sunday, October 22, 2006
which is eth0 on a PowerEdge 6850 running CentOS 4.4/i386
When setting up the refurbished Dell PowerEdge 6850 I purchased recently, I noticed it is newer and has a few oddities
Once sitting down in the much quieter guest working area, I immediately put the pieces together and realized that the embedded Broadcom Ethernet ports were enumerated as eth2 and eth3, while the two ports on the Intel-Pro PCI-e card were as eth0 and eth1. Here is my "detective work":
grep eth /etc/modprobe .conf
alias eth0 e1000 alias eth1 e1000 alias eth2 tg3 alias eth3 tg3
dmesg
e1000: 0000:17:06.0: e1000_probe: (PCI:66MHz:64-bit) 00:04:23:cd:6e:b8 divert: allocating divert_blk for eth0 e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection ACPI: PCI interrupt 0000:17:06.1[B] -> GSI 133 (level, low) -> IRQ 50 e1000: 0000:17:06.1: e1000_probe: (PCI:66MHz:64-bit) 00:04:23:cd:6e:b9 divert: allocating divert_blk for eth1 e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection tg3.c:v3.52-rh (Mar 06, 2006) ACPI: PCI interrupt 0000:14:02.0[A] -> GSI 64 (level, low) -> IRQ 201 divert: allocating divert_blk for eth2 eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:133MHz:64-bit) 10/100/1000BaseT Ethernet 00:13:72:40:e1:bd eth2: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[1] Split[0] WireSpeed[1] TSOcap[0] eth2: dma_rwctrl[769f4000] ACPI: PCI interrupt 0000:14: 02.1[B] -> GSI 65 (level, low) -> IRQ 209 divert: allocating divert_blk for eth3 eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:133MHz:64-bit) 10/100/1000BaseT Ethernet 00:13:72:40:e1:be eth3: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[1] TSOcap[1] eth3: dma_rwctrl[769f4000]
grep HWA /etc/sysconfig/network-scripts/ifconfig-eth?
ifcfg-eth1:HWADDR=00:04:23:CD:6E:B9 ifcfg-eth2:HWADDR=00:13:72:40:E1:BD ifcfg-eth3:HWADDR=00:13:72:40:E1:BE ifcfg-eth0:HWADDR=00:04:23:CD:6E:B8
With the card-to-driver mapping in dmesg and in the modprobe.conf, it is easy to see the embedded Broadcom ethernet ports are mapped to eth2 and eth3, while the add-on Intel-pro ports are mapped to eth0 and eth1. The mac addresses in the 'dmesg' and the HWADDR recorded in ifcfg-eth1 through ifcfg-eth3 files hammered down the last nail. Without the last nail, it'd rather hard to tell if all ports is of the same make and model.
mii-tool sorta told the tale as well. lspci can help too.
eth0: no link eth1: no link eth2: 10baseTx, HD, link ok eth3: no link
Once I duplicated the IP/netmask on eth0 to eth2, then ifdown eth0 and ifup eth2, all is well now per ' mii-tool'
eth2: negotiated 100baseTx-FD, link ok
It'd be an interesting exercises to force eth0 to be the first embedded Broadcom port. It makes more sense for Linux to enumerate the embeded ports first, ethernet or not. What do you think?
- no video connectors at front panel nor the back. SVGA video only on the DRAC 5 card.
- a dual-port Intel Pro copper Ethernet card
- PERC adapter is of 4e/DC instead of the 4e/Di with the other two PE6850 bought new from Dell last summer.
- The "new" port is in an aged Cisco Catalyst 2900XL switch bought off eBay in 2000. My boss told me several times one port is known to be bad on this switch, since I joined the company a year ago. I used twelve free ports so far without problem. This port is one of the last two "free" ports on this 24-port switch.
- The port was in VLAN #1 to mirror two neighboring ports and had not been in active use. I programmed the switch such that the port now participates VLAN#3 instead of VLAN#1, and no longer mirrors any port.
- I was hoping to see a steady green LED and a blinking yellow LED on the NIC, by switching the cable from port to port on the server. The results were not telling. The first embedded Broadcom port had that, but no connectivity. The two Intel-pro ports had only green LED, no blinking yellow LED nor connectivity. I later came to the realization that the fast switching from port to port won't work well, probably due to ARP cache and such on the switch.
Once sitting down in the much quieter guest working area, I immediately put the pieces together and realized that the embedded Broadcom Ethernet ports were enumerated as eth2 and eth3, while the two ports on the Intel-Pro PCI-e card were as eth0 and eth1. Here is my "detective work":
grep eth /etc/modprobe .conf
alias eth0 e1000 alias eth1 e1000 alias eth2 tg3 alias eth3 tg3
dmesg
e1000: 0000:17:06.0: e1000_probe: (PCI:66MHz:64-bit) 00:04:23:cd:6e:b8 divert: allocating divert_blk for eth0 e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection ACPI: PCI interrupt 0000:17:06.1[B] -> GSI 133 (level, low) -> IRQ 50 e1000: 0000:17:06.1: e1000_probe: (PCI:66MHz:64-bit) 00:04:23:cd:6e:b9 divert: allocating divert_blk for eth1 e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection tg3.c:v3.52-rh (Mar 06, 2006) ACPI: PCI interrupt 0000:14:02.0[A] -> GSI 64 (level, low) -> IRQ 201 divert: allocating divert_blk for eth2 eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:133MHz:64-bit) 10/100/1000BaseT Ethernet 00:13:72:40:e1:bd eth2: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[1] Split[0] WireSpeed[1] TSOcap[0] eth2: dma_rwctrl[769f4000] ACPI: PCI interrupt 0000:14: 02.1[B] -> GSI 65 (level, low) -> IRQ 209 divert: allocating divert_blk for eth3 eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:133MHz:64-bit) 10/100/1000BaseT Ethernet 00:13:72:40:e1:be eth3: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[1] TSOcap[1] eth3: dma_rwctrl[769f4000]
grep HWA /etc/sysconfig/network-scripts/ifconfig-eth?
ifcfg-eth1:HWADDR=00:04:23:CD:6E:B9 ifcfg-eth2:HWADDR=00:13:72:40:E1:BD ifcfg-eth3:HWADDR=00:13:72:40:E1:BE ifcfg-eth0:HWADDR=00:04:23:CD:6E:B8
With the card-to-driver mapping in dmesg and in the modprobe.conf, it is easy to see the embedded Broadcom ethernet ports are mapped to eth2 and eth3, while the add-on Intel-pro ports are mapped to eth0 and eth1. The mac addresses in the 'dmesg' and the HWADDR recorded in ifcfg-eth1 through ifcfg-eth3 files hammered down the last nail. Without the last nail, it'd rather hard to tell if all ports is of the same make and model.
mii-tool sorta told the tale as well. lspci can help too.
eth0: no link eth1: no link eth2: 10baseTx, HD, link ok eth3: no link
Once I duplicated the IP/netmask on eth0 to eth2, then ifdown eth0 and ifup eth2, all is well now per ' mii-tool'
eth2: negotiated 100baseTx-FD, link ok
It'd be an interesting exercises to force eth0 to be the first embedded Broadcom port. It makes more sense for Linux to enumerate the embeded ports first, ethernet or not. What do you think?
Labels:
centos,
cisco switch,
dell poweredge,
dell server,
linux,
network,
sysadmin,
system engineer
Saturday, October 21, 2006
step-by-step: how to host WordPress under Linux (FC5/RHEL4)
Why:
# set up MySQL access
cd /var/www/html/wordpress && cp wp-config-sample.php wp-config.php && vi wp-config.php
# RTFM using your favorite browser
point to your browser to http://localhost/wordpress/readme.html
# initialize/populate your new WP database. Do record the 'admin' account information generated.
point to your browser to follow http://localhost/wordpress/wp-admin/install.php
# Now, time to verify. If it worked, you'd see the default 'hello world' post using the 'default' theme.
point to your browser to http://localhost/wordpress/index.php
# to customize the blog right away, click on 'edit' link inside the blog and log on with the 'admin' account
# about them themes:
Only two themes are bundled (classic/default). To add a theme, find ones you like and download from a trustworthy source.
- on Blogger: missing certain nice features ( static page support, import/export). Or,
- on WordPress.com: don't like your own html get stripped off. Your little piggy-bank gets left empty since Google Adsense code get stripped or disabled. Or,
- want your own professional domain name and such. Or,
- simply love the experience of DIY and the freedom of customization
- Fedora Core 5 stock installation (All should be applicable to CentOS 4 or RHEL 4 as well, unless the FC5 SELinux changes get in your way.)
- apache (httpd package) installed.
- php package installed.
- mysql-server package installed.
- The stable release (latest.tar.gz) has been downloaded from wordpress.org and extracted under /var/www/html
- comfortable with MySQL and SQL commands.
- starter's knowledge of how a blog works.
# set up MySQL access
This is to set up a MySQL database dedicated for WordPress and a user to access this database from the Apache/PHP combo. The schema and data for the database itself will be populated by install.php later. The names used below assume you want user 'wp_master' to have all access to a local database named 'wordpressDB'.
- /etc/init.d/mysqld start
- /usr/bin/mysqladmin -u root password 'mysqlAdminpassword'
- /usr/bin/mysqladmin -u root -p create wordpressDB
- mysql -Uroot -p mysql
- insert into user (host, user, password) values ('localhost', 'wp_master', password('mastersecret'));
- insert into db (host, db, user) values ('localhost', 'wordpressDB', 'wp_master');
- the 'db' table entry should also grant the 'wp_master' user with most of the privileges to the 'wordpressDB' . So, I simply replaced the 'test' database entry therein, since I am lazy, plus I consider a wild-open test db a security risk.
- update db set host='localhost',db='wordpressDB',user='wp_master' where host='%' and db='test';
- commit;
- restart mysqld by '/etc/init.d/mysqld restart'
- yum -y install php-mysql
- /etc/init.d/httpd restart
cd /var/www/html/wordpress && cp wp-config-sample.php wp-config.php && vi wp-config.php
# RTFM using your favorite browser
point to your browser to http://localhost/wordpress/readme.html
# initialize/populate your new WP database. Do record the 'admin' account information generated.
point to your browser to follow http://localhost/wordpress/wp-admin/install.php
# Now, time to verify. If it worked, you'd see the default 'hello world' post using the 'default' theme.
point to your browser to http://localhost/wordpress/index.php
# to customize the blog right away, click on 'edit' link inside the blog and log on with the 'admin' account
# about them themes:
Only two themes are bundled (classic/default). To add a theme, find ones you like and download from a trustworthy source.
- wget http://downloads.wordpress.org/theme/pool.zip
- cd /var/www/html/wordpress/wp-content/themes && unzip pool.zip
- Now the new theme(s) would just show up if you go back to edit/presentation/themes
- import existing blogs and comments from Blogger and other top blog sites.
- if new, register this new blog with technorati.com, weblogs.com
- new or updated, brief your readers of the changes [advance notice would be appreciated too]
Wednesday, October 18, 2006
Blogger :: a work in progress, beta or not
I joined Technorati and I learned it is supposedly a great thing to have proper label (Blogger's or Google's terminology) or tags (Technorati's term). A few nights back. I started to retrofit my then fifty some posts with labels. I selected a post to edit, added a label or two, then published it again. Back to the main 'edit posts' screen, all the posts looked the same, with or without labels. I said to myself, wouldn't it be great the labels be showed next to the associated post?!
Today I resumed my label-retrofitting task. Lo & behold, the 'edit posts' page has all current labels displayed, just like Gmail does for each messages. Wow, progress!
Editing posts, I noticed the oldest few have hard breaks (
) which broke the lines. The new posts don't have this problem, even if all of them were posted from Gmail.
Today I resumed my label-retrofitting task. Lo & behold, the 'edit posts' page has all current labels displayed, just like Gmail does for each messages. Wow, progress!
Editing posts, I noticed the oldest few have hard breaks (
) which broke the lines. The new posts don't have this problem, even if all of them were posted from Gmail.
Labels:
blog,
blogger,
Blogger beta
PERC 4e/di went offline on a Dell PowerEdge 6850, with nothing apparently bad
I got paged 2:15 in the morning, only to find the main RAID controller on a Dell PowerEdge 6850 yanked itself away from under the running OS (CentOS 4.1/i386), again. The OS seemed to be fine, except any task involving read/write to the disk failed.
The PE6850 server was bought new from Dell last June, and has been in production since. Same day same time last month, it gave us its first outage showing the same symptoms. Questions we asked ourselves are: why starts now? And why exactly one month apart?
The PE6850 server was bought new from Dell last June, and has been in production since. Same day same time last month, it gave us its first outage showing the same symptoms. Questions we asked ourselves are: why starts now? And why exactly one month apart?
- The serial console showed the following message repeating on the screen:
scsi0 (0:0): rejecting I/O to offline device EXT3-fs error (device sda2): ext3_find_entry: reading directory #178817 offset 0
- The nightly full database backup kicked off at 2:00 and should finish by 2:45am.
- Log exported from the PERC 4e/di controller showed that PR (patrol read) is scheduled to run every four hours and one started at 1:00 and didn't get to finish either.
- Same RAID controller log also has two interesting entries that coincided with the two outages. These are the only two "ProcessHostDmaInterrupt: No requests active" for a log stretching back to last June. The DMA thingy made the BIOS update A01 interesting in that it corrects memory address assignment or claims by the raid controller ( The server has 16G DDR2 ECC RAM, 2G chips)
09/18 2:05:47: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
09/18 2:05:47: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
10/18 2:13:34: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
10/18 2:13:34: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
09/18 2:05:47: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
10/18 2:13:34: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
10/18 2:13:34: ProcessHostDmaInterrupt: No requests active! (ch=a0102be8)
- Some postings suggests turning off PR may help in case it conflicts with heavy disk I/O from application tasks. However, at time of both outages, the disk I/O and the CPU load is not the highest, as we've seen much higher ones during the day every work day, per the baselines neatly graphed using RRDTOOL by our new Hobbit Monitor.
- Dell Diag is run and nothing is reported wrong, for the memory, disks, RAID controller and such.
- LSI Logic Perc 4e/Di, v.522A, A13 [[We are at 521S, A00]] Release Date: 9/11/2006
- "Fixed an issue that could cause a blue screen, file system error or system hang when using EVPD inquiry commands."
- Anybody has any idea what is exactly "an issue" ?!
- BIOS upgrade A01 [[ we are at A00 ]]
- Added workaround for lockup resulting from the systems with 8GB RAM or more and RAID storage controller potentially claiming inappropriate addresses.
- SCSI controller firmware update JT00 for various Maxtor drives
- "Under certain circumstances a hard disk drive may go offline, hard disk drives (HDD), may report offline due to a timeout condition. If the HDD is unable to complete commands, this may result in the controller reporting the HDD off line due to a timeout condition."
- "Higher than expected failures rates have been reported on the Maxtor Atlas 10K V ULD (unleaded or lead free) SCSI hard disk drives. If the hard disk drive (HDD) is unable to complete commands, this may result in the controller reporting the HDD offline due to the timeout condition. The primary failure modes have been the HDD failing to successfully rebuild and also failing after a rebuild has completed."
- Sounds promising, but it turned out that drives in our PE6850 are all Seagate.
Labels:
centos,
dell poweredge,
dell server,
linux,
PERC,
sysadmin
Sunday, October 15, 2006
Google's add-comment Firefox extension fails to log me in to Blogger beta
I downloaded Google's add-comment FireFox extension. I thought it is neat to comment on a interesting web page and have the link preserved at the same time in a relevant blog post (This works more like a scrapbook, only for the web). However, first try at www.citrix.com the extension failed to log me in. I tried both my Google account and my blogger account.
As described by previous posts, I converted to Blogger beta last week. Maybe this is one of the glitches for the SSO (single sign-on) the new Blogger beta is using. Strange thing is course, none of those valid accounts work.
As described by previous posts, I converted to Blogger beta last week. Maybe this is one of the glitches for the SSO (single sign-on) the new Blogger beta is using. Strange thing is course, none of those valid accounts work.
Labels:
Blogger beta,
Firefox,
Google
Citrix Netscaler NS7000 : how to create a content switched load balanced farm/service (Part II)
In the previous part of this article, how to create a load balance virtual server (or 'lb vserver' using CLI) was described. Now we'll describe how to create content switching. Currently, as of Netscaler 7.0 beta, Citrix Netscaler supports only HTTP protocol for content switching.
Again, the logical, conceptual flow is clear and intuitive. However, the work flow is less so, the same as any object-oriented model applied to a more procedural mind-set or work flow.
In Citrix Netscaler's terminology, you need to create
Again, the logical, conceptual flow is clear and intuitive. However, the work flow is less so, the same as any object-oriented model applied to a more procedural mind-set or work flow.
In Citrix Netscaler's terminology, you need to create
- a Content Switching (CS) policy. The policy basics states a matching rule set that the HTTP request object must comply. Three basic
- domain, aka, URI
- URL
- any other request attributes/headers using regular expression or absolute matching.
- create a CS virtual server to direct traffic regulated by this policy to go to either individual service or a load balancing virtual server.
- a default mapping is allowed. The policy field would be 'blank'. It'd make more sense to actually call it 'Default' and displays it as such. This basically directs all traffic that fails to satisfy all the policy-vserser mapping to a default LB virtual server.
- Regular Expression can be created on the fly under the 'policy' node, or pre-created with a name from under 'System' node.
Subscribe to:
Posts (Atom)