Just a quick note. I’ve taken a little time to set up categories, and then took a bit more time moving all of my entries into their appropriate categories. Given the random nature of this blog, I thought this would be helpful for those of you who like reading my tech articles, but don’t really care…
SLES 10, your kernel is not safe!
So, I recently came across a startling discovery. On a SLES 10 server, when you install a kernel update, the update process kindly DELETES your old kernel. It’s not clear to me yet whether it does this after the next successful reboot, or during the update itself. In other words, the people at SuSE/Novell are _SO_ confident that you’ll never have a problem with a brand spankin’ new kernel that they perform without a net. I’m a very conservative sysadmin. I don’t like to do anything without a backup plan. When it comes to kernel updates, that backup plan is option 1 in my grub.conf (option 1 being the second entry in my boot list, generally my old kernel). From what I’m reading, there’s also no way to tell the update process NOT to delete the old kernel, so you’re sort of stuck with this behaviour. This actually bit us a few days ago when, due to some rather odd circumstances, we ended up with a SLES box trying to boot a kernel that was one revision old. Because SLES thought it had cleaned up that kernel, the /lib/modules/
So I’ve set out to fix this. Giving yourself some peace of mind is as simple as taking your kernel and its modules, locking copies of them away in a safe deposit box (or at least a backup directory) during the update process, putting them back somewhere accessible afterwards, and then re-adding the old kernel to grub. This is all well and good if you have one, maybe two servers to worry about; go ahead and do it manually. If you have a couple dozen, it’s a considerable amount of work to do by hand, and it takes up your time!
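To make the idea concrete, here’s a rough sketch of the backup half in shell. This is not the Perl script linked below; the backup location and exact file names are assumptions based on a standard SLES 10 layout.

#!/bin/bash
# Hypothetical sketch: stash a copy of the running kernel and its modules
# before applying a kernel update. Paths and backup dir are illustrative.
KVER=$(uname -r)
BACKUP=/root/kernel-backup/$KVER

mkdir -p "$BACKUP"
# Kernel image, initrd, and System.map for the running kernel
cp -a /boot/vmlinuz-"$KVER" /boot/initrd-"$KVER" /boot/System.map-"$KVER" "$BACKUP"/
# The matching module tree
cp -a /lib/modules/"$KVER" "$BACKUP"/

Restoring is the same copies in reverse, plus adding a boot entry for the old kernel back into grub’s config.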
So, I wrote a script to do it for you! It’s a Perl script, and it should run on a base install of SLES (or at least it has in my testing).
You can download it here.
Just download it to your SLES server and run it; it’ll do the work for you. Run it before your update and select option 1, which backs up the kernel. Then run it again after the update and select option 2, which restores the kernel.
Enjoy!
-War…
Zettabyte File System (ZFS)
We’ve been doing a lot of storage research lately, and there’s been a lot of talk about ZFS. I’m going to spare you the magazine article (if you want to read more on what it is, and where it comes from, look elsewhere) and give you some guts.
ZFS is a 128-bit file system, and unfortunately it isn’t likely to be built into the Linux kernel anytime soon. You can, however, use it in userspace via zfs-fuse, much the way you might use NTFS on Linux (for those of us still dual booting). The machine I’m running on runs solely Fedora Core 11 and has a handsome amount of beef behind it. It’s also got 500GB of local storage, so I can play around with huge files, no sweat. You can do the same things I’m doing with smaller files if you’d like.
First of all, you’ll need to install zfs-fuse; this was simple on Fedora.
$ sudo yum install zfs-fuse
Next some blank disk images to toy with.
$ mkdir zfs
$ cd zfs
$ for i in $(seq 8); do dd if=/dev/zero of=$i bs=1024 count=2097152; done
This gives me eight 2GB blobs. Make these smaller if you’d like; I wanted enough space to throw some large files at ZFS. You’ll see in a bit.
Now let’s make our first zfs pool.
$ sudo zpool create jose ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4
I named my pool jose. I like it when my blog entries have personality. 😛
zfs list will show your ZFS file systems; with a fresh pool, that’s just the pool’s root file system.
$ sudo zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
jose    72K  7.81G    18K  /jose
Creating the pool also mounts it.
$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      454G  210G  221G  49% /
/dev/sda1             190M   30M  151M  17% /boot
tmpfs                 2.0G   25M  2.0G   2% /dev/shm
jose                  7.9G   18K  7.9G   1% /jose
An interesting note: I never created a file system on this pool, I just told ZFS to have at it. ZFS handles both the volume management and the file system, so creating the pool also creates and mounts a file system on it.
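Because the pool is also a file system tree, you can carve out additional file systems (datasets) inside it without any partitioning. A quick hypothetical example, not part of the original run:

$ sudo zfs create jose/photos
$ sudo zfs set compression=on jose/photos

Each dataset gets its own mountpoint and properties while drawing from the pool’s shared space.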
Now, let’s poke jose with a stick, and see what he does.
$ sudo dd if=/dev/zero of=/jose/testfile bs=1024 count=2097512
2097512+0 records in
2097512+0 records out
2147852288 bytes (2.1 GB) copied, 118.966 s, 18.1 MB/s
$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
jose   2.00G  5.81G  2.00G  /jose
It’s worth noting that with a zpool add
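For reference, zpool add is how you grow a plain pool like jose on the fly. A sketch along these lines should work, though it wasn’t run as part of this session:

$ sudo zpool add jose ~/zfs/5
$ sudo zfs list

The new file shows up as another top-level device, and the pool’s available space grows by roughly its size.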
That’s all fun, but this is essentially just a large file system. No really cool features yet. Let’s see what we can really do with this thing.
Let’s make a raid group, instead of just a standard pool.
Goodbye Jose
$ sudo zpool destroy jose
From jose’s ashes, let’s make a new pool.
$ sudo zpool create susan raidz ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4
$ sudo zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
susan   92.0K  5.84G  26.9K  /susan
Notice that susan is smaller than jose, using the same disks. This isn’t because susan has made more trips to the gym than jose; it’s because of the raid set. This is similar to RAID 5, where one disk is given over to parity, so you lose one disk’s worth of capacity: four 2GB disks are roughly 8GB raw, and losing one disk to parity leaves about 6GB usable, which matches the 5.84G shown.
Let’s remedy that, by throwing more (virtual) hardware at it.
You can’t expand a raidz group by adding a disk, so we’ll do it by recreating the group.
$ sudo zpool destroy susan
$ sudo zpool create susan raidz ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4 ~/zfs/5
$ sudo zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
susan   98.3K  7.81G  28.8K  /susan
And there you go, about 8GB again.
Now let’s poke susan with a stick.
First, here’s her status:
$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Tue Oct  6 15:22:24 2009
config:

        NAME                    STATE     READ WRITE CKSUM
        susan                   ONLINE       0     0     0
          raidz1                ONLINE       0     0     0
            /home/lagern/zfs/1  ONLINE       0     0     0
            /home/lagern/zfs/2  ONLINE       0     0     0
            /home/lagern/zfs/3  ONLINE       0     0     0
            /home/lagern/zfs/4  ONLINE       0     0     0
            /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors
Now we’ll dd another file to susan, and we’ll see if we can damage the array.
$ sudo dd if=/dev/zero of=/susan/testfile bs=1024 count=2097512
Then, in another terminal…
$ sudo zpool offline susan ~/zfs/4
$ sudo zpool status
  pool: susan
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: scrub completed after 0h0m with 0 errors on Tue Oct  6 15:22:24 2009
config:

        NAME                    STATE     READ WRITE CKSUM
        susan                   DEGRADED     0     0     0
          raidz1                DEGRADED     0     0     0
            /home/lagern/zfs/1  ONLINE       0     0     0
            /home/lagern/zfs/2  ONLINE       0     0     0
            /home/lagern/zfs/3  ONLINE       0     0     0
            /home/lagern/zfs/4  OFFLINE      0     0     0
            /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors
The dd is still running.
$ sudo zpool online susan ~/zfs/4
DD’s still going…..
The dd finally finished. It took a little longer than the first copy, but it finished, and the file appears correct.
Now, let’s try something else. With RAID, you generally won’t just take a drive offline and then bring it right back, so let’s see what happens if you replace the drive.
Another dd session, and then the drive swap commands.
$ sudo dd if=/dev/zero of=/susan/testfile2 bs=1024 count=2097512
In another terminal…
$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Oct  6 15:26:06 2009
config:

        NAME                    STATE     READ WRITE CKSUM
        susan                   ONLINE       0     0     0
          raidz1                ONLINE       0     0     0
            /home/lagern/zfs/1  ONLINE       0     0     0
            /home/lagern/zfs/2  ONLINE       0     0     0
            /home/lagern/zfs/3  ONLINE       0     0     0
            /home/lagern/zfs/4  ONLINE       0     0     0
            /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors
$ sudo zpool offline susan ~/zfs/4
$ sudo zpool replace susan ~/zfs/4 ~/zfs/6
$ sudo zpool status
  pool: susan
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 25.87% done, 0h3m to go
config:

        NAME                      STATE     READ WRITE CKSUM
        susan                     DEGRADED     0     0     0
          raidz1                  DEGRADED     0     0     0
            /home/lagern/zfs/1    ONLINE       0     0     0
            /home/lagern/zfs/2    ONLINE       0     0     0
            /home/lagern/zfs/3    ONLINE       0     0     0
            replacing             DEGRADED     0     0     0
              /home/lagern/zfs/4  OFFLINE      0     0     0
              /home/lagern/zfs/6  ONLINE       0     0     0
            /home/lagern/zfs/5    ONLINE       0     0     0

errors: No known data errors
This procedure seriously degraded the speed of the dd. It also made my music chop, once.
After the dd finished, the status was happy again:
$ sudo dd if=/dev/zero of=/susan/testfile2 bs=1024 count=2097512
2097512+0 records in
2097512+0 records out
2147852288 bytes (2.1 GB) copied, 356.92 s, 6.0 MB/s
$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: resilver completed after 0h4m with 0 errors on Tue Oct  6 15:35:52 2009
config:

        NAME                    STATE     READ WRITE CKSUM
        susan                   ONLINE       0     0     0
          raidz1                ONLINE       0     0     0
            /home/lagern/zfs/1  ONLINE       0     0     0
            /home/lagern/zfs/2  ONLINE       0     0     0
            /home/lagern/zfs/3  ONLINE       0     0     0
            /home/lagern/zfs/6  ONLINE       0     0     0
            /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors
Note that 4 is now replaced with 6.
Time for some coffee………..
Now let’s look at some really neat things.
I mentioned that you can’t expand a raidz volume by adding disks. What you can do is replace the disks with larger ones. It’s unclear how this affects your data, though (at least, it’s unclear to me!), so I’m going to try it.
First let’s make some larger “disks”.
$ for i in $(seq 9 13); do dd if=/dev/zero of=$i bs=1024 count=4195024; done
Here’s where we are at the beginning:
$ sudo zpool status
  pool: susan
 state: ONLINE
 scrub: resilver completed after 0h4m with 0 errors on Tue Oct  6 15:35:52 2009
config:

        NAME                    STATE     READ WRITE CKSUM
        susan                   ONLINE       0     0     0
          raidz1                ONLINE       0     0     0
            /home/lagern/zfs/1  ONLINE       0     0     0
            /home/lagern/zfs/2  ONLINE       0     0     0
            /home/lagern/zfs/3  ONLINE       0     0     0
            /home/lagern/zfs/6  ONLINE       0     0     0
            /home/lagern/zfs/5  ONLINE       0     0     0

errors: No known data errors
$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
susan  4.00G  3.82G  4.00G  /susan
The new disks I created are 4GB, so we should be able to double the capacity of this pool by swapping them in.
$ sudo zpool replace susan ~/zfs/1 ~/zfs/9
$ sudo zpool replace susan ~/zfs/2 ~/zfs/10
$ sudo zpool status
  pool: susan
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 12.94% done, 0h6m to go
config:

        NAME                       STATE     READ WRITE CKSUM
        susan                      ONLINE       0     0     0
          raidz1                   ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/1   ONLINE       0     0     0
              /home/lagern/zfs/9   ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/2   ONLINE       0     0     0
              /home/lagern/zfs/10  ONLINE       0     0     0
            /home/lagern/zfs/3     ONLINE       0     0     0
            /home/lagern/zfs/6     ONLINE       0     0     0
            /home/lagern/zfs/5     ONLINE       0     0     0

errors: No known data errors
$ sudo zpool replace susan ~/zfs/3 ~/zfs/11
$ sudo zpool replace susan ~/zfs/6 ~/zfs/12
$ sudo zpool replace susan ~/zfs/5 ~/zfs/13
$ sudo zpool status
  pool: susan
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 8.21% done, 0h5m to go
config:

        NAME                       STATE     READ WRITE CKSUM
        susan                      ONLINE       0     0     0
          raidz1                   ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/1   ONLINE       0     0     0
              /home/lagern/zfs/9   ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/2   ONLINE       0     0     0
              /home/lagern/zfs/10  ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/3   ONLINE       0     0     0
              /home/lagern/zfs/11  ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/6   ONLINE       0     0     0
              /home/lagern/zfs/12  ONLINE       0     0     0
            replacing              ONLINE       0     0     0
              /home/lagern/zfs/5   ONLINE       0     0     0
              /home/lagern/zfs/13  ONLINE       0     0     0

errors: No known data errors
This took a while, and really hit my system hard. I’d recommend doing this one drive at a time.
$ top
top - 16:12:10 up 25 days,  5:27, 25 users,  load average: 11.36, 9.27, 6.20
Tasks: 280 total,   2 running, 278 sleeping,   0 stopped,   0 zombie
Cpu0  : 10.2%us,  1.3%sy,  0.0%ni, 61.0%id, 27.5%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  1.6%us,  2.9%sy,  0.0%ni,  5.5%id, 89.6%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu2  :  0.7%us,  0.7%sy,  0.0%ni, 92.7%id,  5.9%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  3.9%us,  2.0%sy,  0.0%ni, 94.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  1.0%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  1.3%us,  2.0%sy,  0.0%ni,  9.8%id, 86.9%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  5.4%us,  6.8%sy,  0.0%ni, 87.3%id,  0.0%wa,  0.0%hi,  0.6%si,  0.0%st
Cpu7  :  1.6%us,  1.3%sy,  0.0%ni, 97.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4121040k total,  4004956k used,   116084k free,    13756k buffers
Swap:  5406712k total,   322328k used,  5084384k free,  1441452k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11021 lagern    20   0 1417m  1.1g  35m S 14.2 26.8 2393:07  VirtualBox
  313 lagern    20   0 1077m  555m  13m R 12.6 13.8 1089:52  firefox
22170 root      20   0  565m  221m 1428 S  6.6  5.5   5:57.71 zfs-fuse
I think I’ll go read some things on my laptop while this finishes.
Done! It took about 15 minutes to complete. My test files are still present in the pool:
$ ls -lh /susan
total 4.0G
-rw-r--r-- 1 root root 2.1G 2009-10-06 15:27 testfile
-rw-r--r-- 1 root root 2.1G 2009-10-06 15:35 testfile2
My pool does not yet show the new size….
$ sudo zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
susan  4.00G  3.82G  4.00G  /susan
I remounted…
$ sudo zfs umount /susan
$ sudo zfs mount susan
No change….
According to harryd, a reboot is necessary. I’m not in the rebooting mood at the moment, so I’ll try it later and report back if it doesn’t work.
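If a reboot does turn out to be the fix, an export/import cycle of the pool may be enough on its own, since it forces ZFS to re-open the devices and re-read their sizes. Something like this should do it (untried here, so treat it as a sketch):

$ sudo zpool export susan
$ sudo zpool import -d /home/lagern/zfs susan
$ sudo zfs list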
So, there you have it: ZFS! Oh, one more note: raidz is not the only RAID option. raidz2 keeps two parity drives, like RAID 6. You can specify this via the zpool create command, using raidz2 where raidz was.
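For example, a double-parity version of susan would look roughly like this (not run as part of this session):

$ sudo zpool destroy susan
$ sudo zpool create susan raidz2 ~/zfs/1 ~/zfs/2 ~/zfs/3 ~/zfs/4 ~/zfs/5

With five disks, two go to parity, so you’d see about three disks’ worth of usable space.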
Enjoy!
-War…