Undrblog

SLES 10, your kernel is not safe!

October 7, 2009 gangrif

So, I recently came across a startling discovery. On a SLES 10 server, when you install a kernel update, the update process kindly DELETES your old kernel. It’s not clear to me yet whether it does this after the next successful reboot or during the update itself. In other words, the people at SuSE/Novell are _SO_ confident that you’ll never have a problem with a brand spankin’ new kernel that they perform without a net. I’m a very conservative sysadmin. I don’t like to do anything without a backup plan. When it comes to kernel updates, that backup plan is option 1 in my grub.conf (option 1 being the second entry in my boot list, generally my old kernel). From what I’m reading, there’s also no way to tell the update process NOT to delete the old kernel, so you’re sort of stuck with this behaviour.

This actually bit us a few days ago when, due to some rather odd circumstances, we ended up with a SLES box that was trying to boot a kernel that was one revision old. Because SLES had already cleaned up this kernel, the /lib/modules// directory for it was empty. This obviously caused some confusion on the kernel’s part, and it refused to boot. If the update process had left the older boot and module files alone, and left it up to a responsible sysadmin to clean up old kernels when they saw fit, this wouldn’t have happened. Granted, in this case the server had other issues, but that’s a different story.

So I’ve set out to fix this. Giving yourself some peace of mind is as simple as taking your kernel and its modules, and locking copies of them away in a safe deposit box (or at least a backup directory) during the update process. Afterwards you put them back somewhere accessible, then re-add the old kernel to grub. This is all well and good if you have one, maybe two servers to worry about; go ahead and do it manually. If you have a couple dozen, this is a considerable amount of work to do manually, and it takes up your time!
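For the manual route, the steps might look something like this in shell. This is only a sketch, not the Perl script from this post: the function names, the backup directory layout, and the assumption that the boot files are named with the kernel version at the end (vmlinuz-VERSION, initrd-VERSION and so on) are mine, so check them against your own server before trusting it.

```shell
#!/bin/sh
# Sketch only. Function names and directory layout are assumptions,
# not taken from the Perl script mentioned in this post.

# backup_kernel VERSION BACKUP_DIR [ROOT]
# Copies the boot files and module tree for one kernel version.
# ROOT defaults to empty (the real /); set it to try this on a test tree.
backup_kernel() {
    ver=$1
    dest=$2
    root=${3:-}
    [ -n "$ver" ] && [ -n "$dest" ] || return 1
    mkdir -p "$dest/boot" "$dest/modules" || return 1
    # Boot files are assumed to end in -VERSION (vmlinuz-..., initrd-...).
    for f in "$root"/boot/*-"$ver"; do
        if [ -e "$f" ]; then
            cp -p "$f" "$dest/boot/" || return 1
        fi
    done
    # Module tree for that version, if it still exists.
    if [ -d "$root/lib/modules/$ver" ]; then
        cp -Rp "$root/lib/modules/$ver" "$dest/modules/" || return 1
    fi
}

# restore_kernel VERSION BACKUP_DIR [ROOT]
# Puts the saved files back. Re-adding the grub entry is left to you.
restore_kernel() {
    ver=$1
    src=$2
    root=${3:-}
    [ -n "$ver" ] && [ -n "$src" ] || return 1
    mkdir -p "$root/boot" "$root/lib/modules" || return 1
    for f in "$src"/boot/*-"$ver"; do
        if [ -e "$f" ]; then
            cp -p "$f" "$root/boot/" || return 1
        fi
    done
    if [ -d "$src/modules/$ver" ]; then
        cp -Rp "$src/modules/$ver" "$root/lib/modules/" || return 1
    fi
}
```

Before the update you would run something like backup_kernel "$(uname -r)" /root/kernel-backup, and after it restore_kernel with the same two arguments. The grub entry still has to be added back by hand.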

So, I wrote a script to do it for you! It’s a Perl script, and it should run on a base install of SLES (or, so it has in my testing).
You can download it here.

Just download it to your SLES server and run it; it’ll do the work for you. Run it before your update, and select option 1, which backs up the kernel. Then run it again after the update, and select option 2, which restores the kernel.

Enjoy!

-War…

Zettabyte File System (ZFS)

October 6, 2009 gangrif

We’ve been doing a lot of storage research lately, and there’s been a lot of talk about ZFS. I’m going to spare you the magazine article (if you want to read more on what it is, and where it comes from, look elsewhere) and give you some guts.

ZFS is a 128-bit file system, and unfortunately isn’t likely to be built into the Linux kernel anytime soon. You can, however, use it in userspace via zfs-fuse, similar to how you might use NTFS on Linux (for those of us still dual booting). The machine I’m running on runs solely Fedora 11, and has a handsome amount of beef behind it. It’s also got 500GB of local storage, so I can play around with huge files no sweat. You can do the same things I’m doing with smaller files, if you’d like.

First of all, you’ll need to install zfs-fuse. This was simple on Fedora.

$ sudo yum install zfs-fuse

Next some blank disk images to toy with.

$ mkdir zfs
$ cd zfs
$ for i in $(seq 8); do dd if=/dev/zero of=$i bs=1024 count=2097152;done

This gives me eight 2GB blobs. Make these smaller if you’d like. I wanted enough space to throw some large files at zfs. You’ll see in a bit.

Now let’s make our first zfs pool.

$ sudo zpool create jose ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4

I named my pool jose. I like it when my blog entries have personality. 😛

zfs list will give you a list of the file systems (datasets) in your zfs pools; zpool list shows the pools themselves.

$ sudo zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
jose    72K  7.81G    18K  /jose

Creating the pool also mounts it.

$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      454G  210G  221G  49% /
/dev/sda1             190M   30M  151M  17% /boot
tmpfs                 2.0G   25M  2.0G   2% /dev/shm
jose                  7.9G   18K  7.9G   1% /jose

An interesting note: I never created a file system on this pool, I just told zfs to have at it. zpool create builds the pool, puts a file system on it, and mounts it, all in one step.

Now, let’s poke jose with a stick, and see what he does.

$ sudo dd if=/dev/zero of=/jose/testfile bs=1024 count=2097512
2097512+0 records in
2097512+0 records out
2147852288 bytes (2.1 GB) copied, 118.966 s, 18.1 MB/s

$ sudo zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
jose  2.00G  5.81G  2.00G  /jose

It’s worth noting that with zpool add jose /dev/whatever you can add space to a pool of this sort.

That’s all fun, but this is essentially just a large file system. No really cool features yet. Let’s see what we can really do with this thing.

Let’s make a raid group, instead of just a standard pool.

Goodbye Jose

$ sudo zpool destroy jose

From jose’s ashes, let’s make a new pool.

$ sudo zpool create susan raidz ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4
$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
susan  92.0K  5.84G  26.9K  /susan

Notice that susan is smaller than jose, using the same disks. This isn’t because susan has made more trips to the gym than jose; rather, it’s because of the raid set. raidz is similar to raid 5, where one disk’s worth of space goes to parity. So you lose one disk’s worth of capacity.
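The arithmetic is simple enough to sketch. Assuming every disk in the group is the same size, the usable space is roughly the disk count minus the parity disks, times the disk size. The function name below is made up for illustration, and real pools report a little less because of overhead (susan shows 5.84G rather than a full 6G).

```shell
# raidz_usable DISKS SIZE_GB [PARITY]
# Rough usable capacity of a raidz group built from equal-sized disks.
# PARITY defaults to 1 (plain raidz).
raidz_usable() {
    disks=$1
    size=$2
    parity=${3:-1}
    echo $(( (disks - parity) * size ))
}

raidz_usable 4 2      # four 2GB disks, one disk's worth lost to parity
```

For susan's four 2GB disks this prints 6.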

Let’s remedy that, by throwing more (virtual) hardware at it.

You can’t expand a raid group by adding a disk, so we’ll do it by recreating the group.

$ sudo zpool destroy susan
$ sudo zpool create susan raidz ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4 ~/zfs/5
$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
susan  98.3K  7.81G  28.8K  /susan

And there you go, about 8GB again.
Now let’s poke susan with a stick.

First, here’s her status:

$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Tue Oct  6 15:22:24 2009
config:

	NAME                    STATE     READ WRITE CKSUM
	susan                   ONLINE       0     0     0
	  raidz1                ONLINE       0     0     0
	    /home/lagern/zfs/1  ONLINE       0     0     0
	    /home/lagern/zfs/2  ONLINE       0     0     0
	    /home/lagern/zfs/3  ONLINE       0     0     0
	    /home/lagern/zfs/4  ONLINE       0     0     0
	    /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors

Now we’ll dd another file to susan, and we’ll see if we can damage the array.

$ sudo dd if=/dev/zero of=/susan/testfile bs=1024 count=2097512

Then, in another terminal…

$ sudo zpool offline susan ~/zfs/4
$ sudo zpool status
  pool: susan
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
	Sufficient replicas exist for the pool to continue functioning in a
	degraded state.
action: Online the device using 'zpool online' or replace the device with
	'zpool replace'.
 scrub: scrub completed after 0h0m with 0 errors on Tue Oct  6 15:22:24 2009
config:

	NAME                    STATE     READ WRITE CKSUM
	susan                   DEGRADED     0     0     0
	  raidz1                DEGRADED     0     0     0
	    /home/lagern/zfs/1  ONLINE       0     0     0
	    /home/lagern/zfs/2  ONLINE       0     0     0
	    /home/lagern/zfs/3  ONLINE       0     0     0
	    /home/lagern/zfs/4  OFFLINE      0     0     0
	    /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors

The dd is still running.

$ sudo zpool online susan ~/zfs/4

DD’s still going…..

DD finally finished, and it took a little longer than the first copy, but it finished, and the file appears correct.

Now, let’s try something else. With raid, you generally won’t just take a drive offline and then bring it right back, so let’s see what happens if you replace the drive.

Another dd session, and then the drive swap commands.

$ sudo dd if=/dev/zero of=/susan/testfile2 bs=1024 count=2097512

In another terminal…

$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Oct  6 15:26:06 2009
config:

	NAME                    STATE     READ WRITE CKSUM
	susan                   ONLINE       0     0     0
	  raidz1                ONLINE       0     0     0
	    /home/lagern/zfs/1  ONLINE       0     0     0
	    /home/lagern/zfs/2  ONLINE       0     0     0
	    /home/lagern/zfs/3  ONLINE       0     0     0
	    /home/lagern/zfs/4  ONLINE       0     0     0
	    /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors
$ sudo zpool offline susan ~/zfs/4
$ sudo zpool replace susan ~/zfs/4 ~/zfs/6
$ sudo zpool status
  pool: susan
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 25.87% done, 0h3m to go
config:

	NAME                      STATE     READ WRITE CKSUM
	susan                     DEGRADED     0     0     0
	  raidz1                  DEGRADED     0     0     0
	    /home/lagern/zfs/1    ONLINE       0     0     0
	    /home/lagern/zfs/2    ONLINE       0     0     0
	    /home/lagern/zfs/3    ONLINE       0     0     0
	    replacing             DEGRADED     0     0     0
	      /home/lagern/zfs/4  OFFLINE      0     0     0
	      /home/lagern/zfs/6  ONLINE       0     0     0
	    /home/lagern/zfs/5    ONLINE       0     0     0

errors: No known data errors

This procedure seriously degraded the speed of the dd. It also made my music chop, once.
After the dd finished, the status was happy again:

$ sudo dd if=/dev/zero of=/susan/testfile2 bs=1024 count=2097512
2097512+0 records in
2097512+0 records out
2147852288 bytes (2.1 GB) copied, 356.92 s, 6.0 MB/s

$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: resilver completed after 0h4m with 0 errors on Tue Oct  6 15:35:52 2009
config:

	NAME                    STATE     READ WRITE CKSUM
	susan                   ONLINE       0     0     0
	  raidz1                ONLINE       0     0     0
	    /home/lagern/zfs/1  ONLINE       0     0     0
	    /home/lagern/zfs/2  ONLINE       0     0     0
	    /home/lagern/zfs/3  ONLINE       0     0     0
	    /home/lagern/zfs/6  ONLINE       0     0     0
	    /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors

Note that 4 is now replaced with 6.

Time for some coffee………..

Now let’s look at some really neat things.

I mentioned that you couldn’t expand a raid volume by adding a disk. What you can do is replace the disks with larger ones. It’s unclear how this affects your data though (at least, it is unclear to me!) so I’m going to try it.

First let’s make some larger “disks”.

$ for i in $(seq 9 13); do dd if=/dev/zero of=$i bs=1024 count=4195024; done

Here we are at the beginning

$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: resilver completed after 0h4m with 0 errors on Tue Oct  6 15:35:52 2009
config:

	NAME                    STATE     READ WRITE CKSUM
	susan                   ONLINE       0     0     0
	  raidz1                ONLINE       0     0     0
	    /home/lagern/zfs/1  ONLINE       0     0     0
	    /home/lagern/zfs/2  ONLINE       0     0     0
	    /home/lagern/zfs/3  ONLINE       0     0     0
	    /home/lagern/zfs/6  ONLINE       0     0     0
	    /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors

$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
susan  4.00G  3.82G  4.00G  /susan

The new disks I created are 4GB, so we should be able to double the capacity in this pool using these disks.

$ sudo zpool replace susan ~/zfs/1 ~/zfs/9
$ sudo zpool replace susan ~/zfs/2 ~/zfs/10
$ sudo zpool status
  pool: susan
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 12.94% done, 0h6m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	susan                      ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/1   ONLINE       0     0     0
	      /home/lagern/zfs/9   ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/2   ONLINE       0     0     0
	      /home/lagern/zfs/10  ONLINE       0     0     0
	    /home/lagern/zfs/3     ONLINE       0     0     0
	    /home/lagern/zfs/6     ONLINE       0     0     0
	    /home/lagern/zfs/5     ONLINE       0     0     0

errors: No known data errors
$ sudo zpool replace susan ~/zfs/3 ~/zfs/11
$ sudo zpool replace susan ~/zfs/6 ~/zfs/12
$ sudo zpool replace susan ~/zfs/5 ~/zfs/13
$ sudo zpool status
  pool: susan
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 8.21% done, 0h5m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	susan                      ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/1   ONLINE       0     0     0
	      /home/lagern/zfs/9   ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/2   ONLINE       0     0     0
	      /home/lagern/zfs/10  ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/3   ONLINE       0     0     0
	      /home/lagern/zfs/11  ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/6   ONLINE       0     0     0
	      /home/lagern/zfs/12  ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      /home/lagern/zfs/5   ONLINE       0     0     0
	      /home/lagern/zfs/13  ONLINE       0     0     0

errors: No known data errors

This took a while, and really hit my system hard. I’d recommend doing this one drive at a time.
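Doing it one drive at a time is easy to script. Here’s a rough sketch that issues one zpool replace, then polls zpool status until the words "resilver in progress" (taken from the status output above) disappear before moving on to the next drive. The function names, the OLD:NEW argument format, and the ZPOOL and POLL_SECONDS variables are my own inventions for illustration, and other ZFS versions may word the status line differently.

```shell
# Sketch only: names and argument format are assumptions for illustration.
# ZPOOL can be overridden, which is handy for a dry run with a stub.
ZPOOL=${ZPOOL:-zpool}

# Succeeds while the status text for the pool reports a running resilver.
resilver_running() {
    $ZPOOL status "$1" | grep -q 'resilver in progress'
}

# replace_serially POOL OLD:NEW [OLD:NEW ...]
# Replaces one device, waits for the resilver to finish, then continues.
replace_serially() {
    pool=$1
    shift
    for pair in "$@"; do
        old=${pair%%:*}     # text before the first colon
        new=${pair#*:}      # text after the first colon
        $ZPOOL replace "$pool" "$old" "$new" || return 1
        while resilver_running "$pool"; do
            sleep "${POLL_SECONDS:-30}"
        done
    done
}
```

With the pool from this post that would be something like replace_serially susan ~/zfs/1:$HOME/zfs/9 ~/zfs/2:$HOME/zfs/10, run as root.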

$ top

top - 16:12:10 up 25 days,  5:27, 25 users,  load average: 11.36, 9.27, 6.20
Tasks: 280 total,   2 running, 278 sleeping,   0 stopped,   0 zombie
Cpu0  : 10.2%us,  1.3%sy,  0.0%ni, 61.0%id, 27.5%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  1.6%us,  2.9%sy,  0.0%ni,  5.5%id, 89.6%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu2  :  0.7%us,  0.7%sy,  0.0%ni, 92.7%id,  5.9%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  3.9%us,  2.0%sy,  0.0%ni, 94.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  1.0%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  1.3%us,  2.0%sy,  0.0%ni,  9.8%id, 86.9%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  5.4%us,  6.8%sy,  0.0%ni, 87.3%id,  0.0%wa,  0.0%hi,  0.6%si,  0.0%st
Cpu7  :  1.6%us,  1.3%sy,  0.0%ni, 97.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4121040k total,  4004956k used,   116084k free,    13756k buffers
Swap:  5406712k total,   322328k used,  5084384k free,  1441452k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                       
11021 lagern    20   0 1417m 1.1g  35m S 14.2 26.8   2393:07 VirtualBox                                                    
  313 lagern    20   0 1077m 555m  13m R 12.6 13.8   1089:52 firefox                                                       
22170 root      20   0  565m 221m 1428 S  6.6  5.5   5:57.71 zfs-fuse

I think I’ll go read some things on my laptop while this finishes.

Done! Took about 15 minutes to complete. My test files are still present in the pool:

$ ls -lh /susan
total 4.0G
-rw-r--r-- 1 root root 2.1G 2009-10-06 15:27 testfile
-rw-r--r-- 1 root root 2.1G 2009-10-06 15:35 testfile2

My pool does not yet show the new size….

$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
susan  4.00G  3.82G  4.00G  /susan

I remounted…

$ sudo zfs umount /susan
$ sudo zfs mount susan

No change….

According to harryd a reboot is necessary. I’m not in the rebooting mood at the moment. I’ll try this, and report back if it doesn’t work.

So, there you have it, zfs! Oh, another note. raidz is not the only raid option. raidz2 supports two parity drives. Like raid6. You can specify this via the zpool create command, using raidz2 where raidz was.

Enjoy!

-War…

iostat demystified

September 24, 2009 gangrif

Recently, we’ve been looking into our options for a new SAN at work. That I’ll save for a whole other post. In our search, it became apparent that we didn’t truly understand how much we were utilizing our current system. Our current product requires that we purchase a license in order to check these statistics on the SAN, so we turned to the servers for some more insight.

The majority (if not ALL) of our servers are running some flavour of Linux, most of which are RHEL 4.x and 5.x. RHEL (like most other distros) offers a package called sysstat, which includes an I/O reporting tool called iostat.

The output of iostat looks something like:

[war@somehost ~]$ iostat -x
Linux 2.6.30.5-43.fc11.i686.PAE (somehost) 	09/24/2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.87    0.01    1.52    0.22    0.00   96.38

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.08    24.70    4.79    4.36   258.97   232.48    53.74     0.21   23.45   2.27   2.07
sda1              0.00     0.00    0.00    0.00     0.00     0.00    16.26     0.00    7.58   6.91   0.00
sda2              0.08    24.70    4.79    4.36   258.97   232.48    53.74     0.21   23.45   2.27   2.07
dm-0              0.00     0.00    4.83   28.95   258.63   231.64    14.51     0.04    1.21   0.61   2.07
dm-1              0.00     0.00    0.04    0.10     0.34     0.84     8.00     0.02  120.62   1.57   0.02

This is a bit daunting. Lots of info, and no real descriptions. sda{1,2} are your partitions/mounts, dm-{0,1} are virtual devices used by LVM (if you’re using LVM). The rest is somewhat cryptic. The man page for iostat clears things up slightly, but you may not have a full understanding after just reading these descriptions.

(from the iostat man page)
rrqm/s: The number of read requests merged per second that were queued to the device.
wrqm/s: The number of write requests merged per second that were queued to the device.
r/s: The number of read requests that were issued to the device per second.
w/s: The number of write requests that were issued to the device per second.
rsec/s: The number of sectors read from the device per second.
wsec/s: The number of sectors written to the device per second.
rkB/s: The number of kilobytes read from the device per second.
wkB/s: The number of kilobytes written to the device per second.
avgrq-sz: The average size (in sectors) of the requests that were issued to the device.
avgqu-sz: The average queue length of the requests that were issued to the device.
await: The average time (in milliseconds) for I/O requests issued to the device to be served.
svctm: The average service time (in milliseconds) for I/O requests that were issued to the device.
%util: Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.

Personally, I’m working on learning this output, so I’m going to use this blog entry as my notes on what these stats mean, and how they react to disk activity. I’ll review all of the stats which I’ve been able to figure out.

rrqm/s and wrqm/s, r/s and w/s

rrqm/s and wrqm/s count the read and write requests that were merged with others while they sat in the queue, before being sent to the device. r/s and w/s count the requests that were actually issued to the device each second. You can drive these up with some simple tests.

Use dd to write a lot of data to a local disk, and you’ll see the wrqm/s and w/s counters rise.

I started iostat, and then started dd, writing a 2GB file to my home directory.
dd:

[war@somehost ~]$ dd if=/dev/zero of=foo bs=8k count=262144
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 31.0321 s, 69.2 MB/s
[war@somehost ~]$

Now, here’s the iostat command. The -x displays extended statistics, and the 1 tells it to print a new report every second.

[war@somehost ~]$ iostat -x 1
Linux 2.6.30.5-43.fc11.i686.PAE (somehost) 	09/24/2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.71    0.00    2.99    0.00    0.00   93.30

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.44    0.00    5.78   12.15    0.00   80.63

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  7811.00    3.00  395.00    40.00 29024.00    73.03    48.30   76.60   1.11  44.10
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00  7811.00    3.00  395.00    40.00 29024.00    73.03    48.30   76.60   1.11  44.10
dm-0              0.00     0.00    3.00 8354.00    40.00 66832.00     8.00   949.32   51.63   0.05  44.10
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.29    0.00    4.29   35.00    0.00   56.43

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00 13542.00    0.00  372.00     0.00 108336.00   291.23   141.91  350.92   2.69 100.00
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00 13542.00    0.00  372.00     0.00 108336.00   291.23   141.91  350.92   2.69 100.00
dm-0              0.00     0.00    0.00 13897.00     0.00 111176.00     8.00  5494.24  349.96   0.07 100.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.44    0.00    4.56   32.73    0.00   61.27

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00 18120.00    0.00  468.00     0.00 147456.00   315.08   138.46  316.45   2.14 100.00
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00 18120.00    0.00  468.00     0.00 147456.00   315.08   138.46  316.45   2.14 100.00
dm-0              0.00     0.00    0.00 18592.00     0.00 148736.00     8.00  5450.12  313.56   0.05 100.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

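/dev/sda2 is the partition underneath the Logical Volume that contains / on my system.
/dev/dm-0 must be the virtual device for that logvol (honestly, I’m guessing here; look at iostat and you’ll see what I mean, look at the w/s on dm-0!). Notice that dm-0 reports an avgrq-sz of 8.00 sectors and no merges, while sda2 reports a large wrqm/s and far fewer, much larger writes. It looks like the small requests are counted on dm-0 before they get merged into bigger ones on their way to the physical disk.

Now, let’s see if we can get the read counters to rise.

First I tried scp’ing a file from my workstation to my laptop. That didn’t really get me the dramatic rise in activity that dd did. Understandably, it’s a much slower process. Let’s see what else I can abuse.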

I connected my BlackBerry via USB 2.0. It’s got 8GB of memory. This is the closest thing to a USB mass storage device I had handy.

This was slightly better, but still not extremely fast. I suppose the best way to stress this would be a local drive to local drive copy. At any rate, I did see the r/s and rrqm/s counters rise while the copy was being performed.

Ah Ha! /dev/null is the answer. Copy your 2GB file (created by dd earlier) to /dev/null. You’ll see r/s jump. I got about 800 out of my test.

rsec/s and wsec/s

These counters are very similar to r/s and w/s, except that they deal with sectors rather than requests. Whether these are useful to you or not depends on what sort of data collection you’re looking for.

In our example from earlier, you can see the wsec/s rose as w/s and wrqm/s did.

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda2              0.00  7811.00    3.00  395.00    40.00 29024.00    73.03    48.30   76.60   1.11  44.10
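To turn those sector counts into something more familiar: iostat on Linux counts 512-byte sectors, so multiplying by 512 gives bytes per second. A small sketch (the function name is made up for illustration):

```shell
# sectors_to_mib SECTORS_PER_SECOND
# Converts an rsec/s or wsec/s value to MiB/s, assuming 512-byte sectors.
sectors_to_mib() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", s * 512 / 1048576 }'
}

sectors_to_mib 29024.00
```

For the sda2 line above, 29024 sectors per second works out to roughly 14.2 MiB/s.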

await

This is a rather important stat. This tells us how long, in milliseconds, requests sent to the drive take to be served, counting both the time they spend waiting in the queue and the time the drive spends servicing them. The higher this number gets, the more of a bottleneck we can see in our storage.
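If you want to watch for that bottleneck without staring at the whole table, a little awk can pick out the devices whose await crosses a limit. This is a sketch under a couple of assumptions: the function name is invented, and it expects the header line to start with "Device:" as in the output shown in this post (other sysstat versions may print the header differently).

```shell
# slow_devices LIMIT_MS
# Reads `iostat -x` output on stdin and prints "device await" for every
# device line whose await value is above LIMIT_MS.
slow_devices() {
    awk -v limit="$1" '
        # Header line: remember which column holds await.
        $1 == "Device:" {
            for (i = 1; i <= NF; i++) if ($i == "await") col = i
            next
        }
        # Device lines: long enough to have that column, value over the limit.
        col && NF >= col && $col + 0 > limit { print $1, $col }
    '
}
```

Usage would be along the lines of iostat -x 1 | slow_devices 100, which prints a line each time a device reports an await over 100 ms.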

I’m continuing to work with this utility, I’ll post more progress as it comes along. I’m hoping to truly get a feel for the rest of the stats.

-War…

PeerGuardian lists, imported to iptables.

September 12, 2009 gangrif

At home, I have a Smoothwall which connects my network to the internet. It’s a very robust replacement for these soho routers that everyone seems to use. It’s not quite as plug and play, but it works very well, and I have a lot more control over it.

I also run PeerGuardian, from Phoenix Labs, on my workstations to help block certain access to my machines. PeerGuardian is a great program, and most of the time it works very well. The problem is, sometimes it has issues, and to be honest, I always thought it’d be cleaner to put the firewalling on my…. Firewall! So I set out to find a way to add PeerGuardian’s lists to my Smoothie.

There’s a project called moblock, which is supposed to do this. Well, I’ve never seen it work. That’s not to say it doesn’t work, I just couldn’t get it working on my Smoothie. So for a very long time, I went on using PeerGuardian locally. Well, recently I happened to be watching PeerGuardian run its update, and realized that it’s pulling its lists from an http address. Makes sense that I might be able to do the same, right? So I pointed my web browser there, and sure enough, I’m presented with a list of rules! Rules that don’t match iptables, but look very easy to parse! So, I did just that. I started writing my own parser, and before long, I had a very long list of iptables-compatible rules. By very long, I mean long! Over 226000 lines!

I decided that the best way to make this list easy to update was to create a new chain, called PGBLOCK, and put my rules in there. I also created a chain called PGALLOW which supersedes the block list, so I can add exceptions if I’d like.

So, on my Fedora 11 test machine I added the following to /etc/sysconfig/iptables.

After the chain definitions (the :CHAINNAME [number:number] lines) I added 4 lines.
-N PGALLOW
-N PGBLOCK
-A INPUT -j PGALLOW
-A INPUT -j PGBLOCK

This adds the chains, and adds them to the iptables INPUT chain. This tells iptables to pass all inbound packets through my chains before they even touch any other rules.

At first, I tried entering all of my rules into the PGBLOCK chain. This worked, but delayed every inbound packet to the point that my network connection was almost useless.

So I made a slight change. I made a new chain for each class A (/8) block, 253 in all (I skipped 10. and 127.), and then I set up more specific rules inside of the PGBLOCK chain. PGBLOCK now contains lines similar to:

-A INPUT -s 1.0.0.0/8 -j PGBLOCK1
-A INPUT -s 2.0.0.0/8 -j PGBLOCK2
..
-A INPUT -s 254.0.0.0/8 -j PGBLOCK254
-A INPUT -s 255.0.0.0/8 -j PGBLOCK255

Now each packet gets subjected to a couple hundred (or thousand) rules instead of 226000 of them.
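To give an idea of how the parsing and the per-/8 split can fit together, here’s a small sketch. It is not the getpg script. It assumes the list is in the plain-text PeerGuardian format of one "Description:start-end" range per line, and it leans on the iptables iprange match rather than converting ranges to CIDR; the function name and the DROP target are also my own choices. A range that spans more than one /8 would only land in the chain of its first octet, so treat this as a starting point.

```shell
# p2p_to_rules
# Reads a PeerGuardian-style list on stdin ("Description:start-end" per
# line, an assumed format) and prints one iptables-restore rule per range,
# filed into a per-/8 chain named PGBLOCK<first octet>.
p2p_to_rules() {
    awk '
        # Skip comments and blank lines.
        /^[ \t]*(#|$)/ { next }
        {
            range = $0
            # Keep only what follows the last colon: "start-end".
            sub(/^.*:/, "", range)
            if (range !~ /^[0-9.]+-[0-9.]+$/) next
            split(range, octets, ".")
            first = octets[1] + 0
            # Same exclusions as in the post: 10.x and 127.x are skipped.
            if (first == 10 || first == 127) next
            printf "-A PGBLOCK%d -m iprange --src-range %s -j DROP\n", first, range
        }
    '
}
```

Feeding it a line such as "Example Org:1.2.3.0-1.2.3.255" gives a rule in PGBLOCK1; the chains themselves still need to be created with -N before the rules are loaded.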

Wondering if you can get ahold of my script?
Here it is: http://www.undrground.org/scripts/getpg.tar.gz

Making this work is pretty easy.
There are a few variables at the top of the script that point to where you’d like some things to be saved. It needs a scratch directory for the lists it downloads. You need write access, as the user you’re running as, to the directory you’re running it from and to the lists directory, of course. But just set all that up, and run the script. It’ll generate a file called pg.firewall. Use that along with iptables-restore to build the firewall.

iptables-restore --noflush < pg.firewall

Now, updating the firewall is a little more tricky: you need to flush the tables manually before re-importing. I did this with a perl script that looks something like:

#!/usr/bin/perl
# Flush every per-/8 chain (10 and 127 were never created), then reload.
foreach (1..255) {
    if ($_ == 10 || $_ == 127) { next; }
    system("/usr/sbin/iptables -F PGBLOCK$_");
}
system("/usr/sbin/iptables-restore --noflush < /root/pg.firewall");

This flushes the tables, and then imports the new list.

I hope this helps someone else out along the way.

Enjoy!

-War…

Blackberry Storm, First month or so.

August 13, 2009 gangrif

I’ve had my Storm for a little more than a month now. I’ve decided it’s time for that follow-up review.

First of all, I’ve had to have the phone replaced once already. The buttons on the phone stopped working. Of the 4 buttons below the screen (the hangup, call, menu, and back buttons), two of them stopped working on me. After a quick call to Verizon’s support, they happily replaced the phone. I still like the phone, but I’m a little wary about its durability. I’ve had the replacement phone for about 3 weeks now. It was about a month before the first one had its problems; we’ll see how this one fares.

Ironically, while I was waiting for my new phone to come in the mail, the old phone slipped out of my hand and bumped my desk (about a 2″ fall), and the buttons started working! I still didn’t trust the old phone, and sent it back to Verizon as they asked when I got the new phone in the mail.

Overall, I like the phone. The screen seems like it could use some work though. If you try to press at any of the edges, it’s difficult to get a response. I did find something of a hack online to fix this. You simply take a small piece of paper, and fold it up a few times so that it’s just a little thicker than a single sheet (in my case, 2 folds seemed to do the trick, making it 4x the thickness of just one sheet). Take that folded piece of paper, remove the back of the phone, pull out the battery, and place the piece of paper between the back of the battery and the inside panel of the phone. Then put the battery back in, and put the cover back on. This, for whatever reason, helps the issue. I do not know what long-term effects this may have on the phone, but it doesn’t seem like it’s putting pressure on anything vital, at least not enough to cause a problem.

Themes for the phone are sometimes finicky. One of the things I really wanted for this phone was an LCARS-like theme. It’s a touch phone after all. I tried a handful of them, and found one that I really liked, and used it for a while, but I eventually switched back to the standard theme. Why? Because themes don’t just change the look, some of them actually change some of the interface. Every one that I tried degraded the user interface, some of them slowed the phone down, some even caused lock-ups! I decided the look just wasn’t worth it.

Application support is nice on this phone. It’s not like an iPhone. I can’t find apps that will control every aspect of my life, but that’s ok, I don’t need that. I have found a number of cool apps, all of which thus far have been free, including a weather app, apps for Twitter, MySpace, and Facebook, and even a shoutcast streaming app called flycast.

One of my concerns was calendaring and syncing with Zimbra. I’m sad to report that I have NOT found a slick way to handle this. The Blackberry will not talk directly to Zimbra. Well, I lie, e-mail itself is great. I was able to use BIS to set up a standard IMAP connection to Zimbra, but the calendar and contacts are not there. There’s a connector to go from a BES to Zimbra, but we don’t have a BES. I played around, forever, with an open source BES-like server called Funambol, but I’ve had no luck with it. Thus far, the best compromise I’ve found is syncing with Outlook. Which is a bummer, because I don’t use Outlook regularly. So I need to boot Windows, fire up Outlook, sync, and then shut Outlook back down. Sort of a waste for all that Outlook is designed to do, you know?

I do still find myself somewhat envious of the iPhone crowd, but an iPhone is just not an option for me at the moment. I do not want to switch providers, and Verizon does not currently offer an iPhone. I wouldn’t be surprised to find myself switching to an iPhone if Verizon ever offers one.

-War…

Blackberry storm: First couple days.

June 30, 2009 gangrif

Well, I’ve had my Storm for 4 days now. I’m going to try to put some of my initial impressions and experiences down in writing now.

So far, I very much enjoy this phone. I’ve had one or two little quirks that I’ll touch on, but so far, I like it.

How’s the Sure-Press screen work? I like it. I’ve even had an iPhone user or two comment on how nice the Sure-Press screen is. None of them are rushing out to get a Storm, but they didn’t puke when they tried to use it. I really like the feedback from the screen. It’s subtle enough that it doesn’t feel gimmicky, and it feels solid. Some of the apps that I’ve used don’t work well with the touch screen. This is probably a compatibility issue with older apps, and not necessarily a flaw in this phone. I can see how that may have been a sticky subject at Blackberry’s R&D lab. Do we reject the use of older apps, and make developers re-write every app on the market? Or do we tack on clunky touch screen usage for older apps? I can see both sides of that, and I can understand why they did what they did.

I’ve run into an issue or two with free resources on the phone. The phone will slow down to a crawl, and I’ll have to reboot it in order to free that up. I think this is more due to how I’m using the phone, and not necessarily a problem with the phone. It seems like if you exit an app by using the back button, it doesn’t close, rather it goes to the background. I haven’t found a way to find out what’s running, but it seems that if I hit the BB menu and select close instead of backing out, it works well.

I’m not crazy about how they handle e-mail. I know why they do it, but I don’t really love it. When you set up your e-mail, they pass it all through their BIS. You put some very basic info into a wizard, and it somehow guesses the rest of the information and enters it into their service. Then your mail is passed through their service to your phone. Well, if you use this wizard, it apparently sets up your BIS account, but doesn’t actually give it a username! I had to call Verizon’s support to get that worked out (which I just did). I tried to add my work account, which uses IMAPS rather than clear IMAP, so it takes some advanced settings to get it working. You can’t enter these settings on the phone, you need to do it on their site. Once I got into the site, it was actually detected properly, which surprised me. It also let me take the Verizon branding out of the signature line of all of my e-mails, which is nice. I don’t want my work e-mail being sent out branded.

I’ve done some fiddling with about 50% of the stuff I see on the main menu of the phone. I’ve also added a few apps. One of them was Google Maps, another was a media player which supports SHOUTcast streams. Pretty cool. Never thought I’d be able to play The Blast on the go. I’ve also run across a few cool free apps, like games and whatnot.

I’ve also been toying with a BES-like open source package. It may, if I can get it working, get my calendar from Zimbra synced on the Storm. I got it working tonight for email, but it was buggy as hell, and kept crashing the phone. The contacts/calendar app (which is a separate sync app) seems stable, so I’m going to continue to pursue that. The app is called Funambol. I could probably post an entire entry on that later.

So far, I give this phone a big thumbs up. It’s cool enough to be a fun phone, but.. erm.. Blackberry enough to be a good business phone. Exactly what I needed.

-War

…

Blackberry Storm. Before Purchase.

June 25, 2009 gangrif

Tonight I aim to walk into a Verizon Wireless store, and upgrade to a Blackberry Storm.

I’m currently using a Samsung SCH-U740, a phone that has recently been re-branded as the “Alias”. If you were to go to Verizon today, you’d find it on the shelves/website as the Alias. A good phone, no doubt. It’s held up well to my abuse, and general use. It’s served its purpose well, with its dual-flip and full keyboard. But I’ve recently moved into a position where communication is a bit more prominent than it has been. We have enough of us working on a team that e-mail and calendaring are much preferred over the SMS messaging that I’ve been doing primarily on my current phone.

I plan to post again after I’ve purchased the Storm and used it for a day or so, to give my initial impressions. Then again after a month or so of usage, assuming that I don’t hate the phone and haven’t given it back to Verizon. I don’t expect that to happen, but who knows.

Today I’m going to talk about why I picked the Storm, and what my expectations are. Be aware that I may be incorrect on some of these assumptions, but you’ll get to see that once I post my first follow-up.

Why the storm?
I want a touch screen phone. I don’t really want to leave Verizon. The Storm was the ONLY touch screen phone offered by Verizon that does not run Windows Mobile. Windows Mobile is a horrid piece of trash, and I will not purchase a device that runs it. I’ve heard in the past few days that Verizon now offers another touch screen phone without Windows Mobile. It’s an LG, and has a full keyboard when it’s flipped open. I hear that it does not handle e-mail well, and some of its features are locked unless you subscribe to them. I’ll touch on that in a second.

Touch screen aside, I want a phone with email, calendar, and web. We run Zimbra Collaboration Suite at work, and although the Storm does not currently support Zimbra (which uses ActiveSync to sync over the air), I feel that a popular platform like the BlackBerry is more likely to have some add-on written for this functionality than something that runs a proprietary OS like the LG does. The Storm also has a larger community of folks writing apps for it. Now, the iPhone and Windows Mobile both talk to Zimbra already, but like I said, I have reasons for not going with either of those options.

Features… This phone is packed with them. Including (or so I’m told) an unlocked GPS feature, which allows me to go and download something like Google Maps, giving me a fully featured GPS right on my belt at all times. Something that most (if not all) other Verizon phones force you to pay monthly for.

Expectations:
I expect that I’ll like this phone. I think that’s obvious, otherwise I wouldn’t be buying it.

I expect that I’ll be able to tweak the storm to make it a little more personalized than is possible with some other phones.

I expect that I’ll be able to find applications to do most of what I’d like to do on this phone. Some things I’d like to do are: listen to music (including streams, like SHOUTcast if possible), get directions, and of course, email and calendaring.

What makes me think that this phone will do all of that?
Well, honestly, I don’t think it’s going to do every single thing I’d like it to. In fact, I expect that it won’t, but if it does, I’ll be VERY pleased. The bottom line is, if it does all of the things that I’ve heard it can, I’ll be happy with it.

What about this Sure-Press screen?
I’ve done a little homework. It seems that the biggest complaint regarding the Storm is the screen. If you’re not familiar with it, I’ll sum it up. The entire screen seems to be mounted on one giant button. When you touch the screen, it responds by highlighting things, or scrolling, depending on the gesture of your finger. If you press on the screen slightly, the entire screen clicks, like a button. This makes me wonder about the longevity of the phone. Will this button start to wear out? Will the screen start to get loose? What about dirt and whatnot getting in the crevices around the screen that allow it to move? All I can say to that is.. I’m going to be careful with the phone, and see how it goes.

As far as the operation of the screen goes, I stopped in at a Verizon store, and had a look at a powered up, but not activated, Storm. I brought up a new text message, and started typing. I like it. I think it’ll work out just fine. I was able to type a sentence with just as much speed and accuracy as I do on my SCH-U740.

That’s it for now, see you in a few days!

-War
…

New software…

June 24, 2009 gangrif

Well, I finally gave in and replaced my custom-written blog software with a prebuilt package: Serendipity. It’s still free/open, and I can modify it as needed. It also offers a bunch of cool plugins. It has more functionality out of the box than anything I could have written in as much time. I took…

Install SLES 10, via Autoyast through KickStart.

May 15, 2009 gangrif

I work in a predominantly Red Hat Enterprise Linux shop. We have a number of servers, all running RHEL, doing a number of different things: mail, DNS, and other such services that you’d expect to see running at a college.

We also have a number of Novell servers, which are doing, primarily, file services for employees and students. We are in the process of migrating from Novell’s proprietary NetWare platform to SLES (SUSE Linux Enterprise Server). There are Linux equivalent services which run on SLES for each of the NetWare services that we offer. This means we have the need to build a number of SLES servers. We already have a KickStart server set up in order to install RHEL in a rapid fashion, including all of our standard packages, ssh keys, and whatnot. We thought it might not be a bad idea to look into whatever SUSE offers that might be similar to KickStart. I looked into it, and found out about AutoYaST. YaST is the setup tool that SUSE uses to install with (just like RHEL’s anaconda), and just like RHEL, when you install a SUSE box, it generates a config file, for use with its installer, which summarises the options that you configured.

I started reading about how to set up a SLES boot server, similar to our KickStart server, and found a number of similarities. In fact, they’re almost identical. They both use tftp as a means of getting boot images out to the clients, they both use their own internal DHCP servers to assign addresses to PXE clients, and they both use PXE. So it got me thinking. Can I just add SLES to my KickStart boot menu? Hell, it’s worth a shot!

This is not a complete howto. It’s intended to show you how to add an autoyast setup to an existing kickstart server. My setup may differ from yours, but if you’re familiar with KickStart, you should be able to take the information and apply it to your setup.

There are a few things you’ll need.
A set up and functional KickStart server
A SLES install disc (or ISO image thereof)
An autoinst.xml file from a SLES install.

First, take your SLES install media, and copy the boot files off of it, into a directory inside of your tftp root. For me, this was:
From install media: /boot/x86_64/loader/initrd and /boot/x86_64/loader/linux
Copy to: /tftpboot/linux-install/sles/initrd.img and /tftpboot/linux-install/sles/vmlinuz
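
If you want to script that copy, here is a minimal sketch of how it could look as a shell function. The function name stage_sles_boot_files and its two arguments are made up for illustration; the only things taken from my setup are the loader paths on the media and the target file names.

```shell
# Hypothetical helper (name and arguments are illustrative, not part of any SLES tool).
# $1 = root of the SLES install media (mounted disc or loop-mounted ISO)
# $2 = target directory inside the tftp root
stage_sles_boot_files() {
    media_root=$1
    tftp_dir=$2
    loader_dir="$media_root/boot/x86_64/loader"

    # Refuse to continue if the media does not contain the expected boot files.
    for f in initrd linux; do
        if [ ! -f "$loader_dir/$f" ]; then
            echo "missing $loader_dir/$f" >&2
            return 1
        fi
    done

    mkdir -p "$tftp_dir" || return 1

    # Rename on copy so the names match what the boot loader config references.
    cp "$loader_dir/initrd" "$tftp_dir/initrd.img" || return 1
    cp "$loader_dir/linux" "$tftp_dir/vmlinuz" || return 1
}
```

You would call it with something like: stage_sles_boot_files /path/to/install/media /tftpboot/linux-install/sles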

Now, you’ll need to make the install media accessible to the PXE-booted installer.
I did this via HTTP. I happened to be working with an ISO image of the CD, so I mounted the ISO to /var/www/html/sles with something like:
mount -o loop /path/to/iso/file.iso /var/www/html/sles
This makes the install media available via http://kickstart.server.com/sles

Now you need to put your autoinst.xml file into a web accessible area. I put it in /var/www/html, so it would be available at http://kickstart.server.com/autoinst.xml
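
At this point everything the installer needs should be staged. As a rough sanity check, something like the sketch below can confirm that before you try a PXE boot. The function name check_sles_staging and its arguments are hypothetical; it only looks at local files, it does not test that the web server or tftp server is actually serving them.

```shell
# Hypothetical pre-flight check (name and arguments are illustrative).
# $1 = directory holding the staged boot files (e.g. /tftpboot/linux-install/sles)
# $2 = web server document root (e.g. /var/www/html)
check_sles_staging() {
    tftp_dir=$1
    docroot=$2
    missing=0

    # Boot files and the AutoYaST profile must exist and be readable.
    for f in "$tftp_dir/vmlinuz" "$tftp_dir/initrd.img" "$docroot/autoinst.xml"; do
        if [ ! -r "$f" ]; then
            echo "not readable: $f" >&2
            missing=1
        fi
    done

    # The install media should show up as a non-empty directory under the docroot.
    if [ ! -d "$docroot/sles" ] || [ -z "$(ls -A "$docroot/sles" 2>/dev/null)" ]; then
        echo "install media does not appear to be mounted at $docroot/sles" >&2
        missing=1
    fi

    return "$missing"
}
```

For my layout that would be: check_sles_staging /tftpboot/linux-install/sles /var/www/html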

These are all obviously sensitive files; if I were you, I’d secure them somehow.

We have our kickstart server running on a private network, and we power it down when we’re not actively installing anything.

Now, there are some config changes to make. We have things set up like so:
/tftpboot/linux-install/pxelinux.cfg/default is where we keep the boot loader config, and
/tftpboot/linux-install/msgs/boot.msg is where we keep the menu display file.
Add something like this to your equivalent of my "default" boot loader config:
label 5
kernel sles/vmlinuz
append initrd=sles/initrd.img ramdisk_size=65536 autoyast=http://kickstart.server.com/autoinst.xml install=http://kickstart.server.com/sles/

Then add something denoting "5" as the item for SLES in your boot.msg file.
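
If you end up adding more than one entry like this, the stanza can be generated instead of typed. This is only a sketch: the function name sles_pxe_entry is made up, and it assumes the same layout as above (boot files under sles/ in the tftp root, autoinst.xml and the sles/ media directory directly under the web root).

```shell
# Hypothetical helper: prints a boot loader stanza for a SLES AutoYaST install.
# $1 = menu label
# $2 = base URL of the kickstart web server, without a trailing slash
sles_pxe_entry() {
    label=$1
    base_url=$2

    printf 'label %s\n' "$label"
    printf 'kernel sles/vmlinuz\n'
    # autoyast= points at the profile, install= points at the install media.
    printf 'append initrd=sles/initrd.img ramdisk_size=65536 autoyast=%s/autoinst.xml install=%s/sles/\n' \
        "$base_url" "$base_url"
}
```

Running sles_pxe_entry 5 http://kickstart.server.com prints the same three lines shown above, which you can then append to your own boot loader config.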

That is just about it. You’ll want to have a good look at the autoinst.xml file; you can read about what’s in there, and how to modify it, here: http://www.suse.de/~ug/autoyast_doc/Profile.html

I am still in the process of tweaking my config, but that should get you up and running, and to the same point that I’m at. Which is, the kickstart portion is out of the way, and all that’s left is getting the auto install XML file perfect.

Happy kickstarting!

-War…
