Archive for the ‘Soc2007’ Category

Huntin’ them bugs

Friday, August 17th, 2007

More status updates… I’ve been fixing many small gvinum bugs the last couple of weeks:

  • The state of gvinum objects was changed after reloading. This meant that objects got the wrong state when gvinum was brought up.
  • Made gvinum always use the most recent configuration it finds when setting object states.
  • Make sure the drive treated as the newest really is the newest, and not just the first in the drive list, as was previously assumed.
  • Add “growable”-state to be used when a plex is ready to be grown.
  • Allow a plex to be rebuilt even though it’s also growable.
  • Do not change the size of the volume until the plex is completely grown.
  • Add status of growing and rebuild of a plex in the list output.
  • Prevent a rebuild from taking over the I/O system, by increasing the access count at the start and end of the rebuild.
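
The "newest drive" fix can be illustrated with a small sketch. This is not the gvinum code (which is C and lives in the kernel); the `DriveHeader` type and its `config_time` field are hypothetical names made up for the example. The point is only that the most recent configuration is found by comparing timestamps, not by taking whichever drive happens to be first in the list.

```python
from dataclasses import dataclass


@dataclass
class DriveHeader:
    """Hypothetical stand-in for the on-disk header of a gvinum drive."""
    name: str
    config_time: float  # when the configuration on this drive was last saved


def newest_drive(drives):
    """Return the drive holding the most recently saved configuration."""
    if not drives:
        raise ValueError("no drives to choose from")
    # Compare timestamps instead of assuming drives[0] is the newest.
    return max(drives, key=lambda d: d.config_time)


drives = [
    DriveHeader("a", config_time=100.0),
    DriveHeader("b", config_time=250.0),
    DriveHeader("c", config_time=180.0),
]
chosen = newest_drive(drives)
print(chosen.name)
```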

There are probably a couple of other fixes as well. Also, I’ve updated the vinum-examples page in the handbook to reflect new features and give more practical examples. I’ve posted a “call for testers” on current@, arch@ and geom@, and have received some responses from people who are willing to help me test. Thanks to them. I’ve uploaded the code sample that I’ll be delivering to Google here: http://folk.ntnu.no/lulf/gvinum_soc2007.tar.gz

Cleaning up

Monday, August 6th, 2007

The last couple of weeks I’ve tested and done bugfixing and cleanup of gvinum code. I refactored some parts to make the code belong where it seems logical. I also implemented growing for striped plexes, but that was quite easy since I could reuse most of the code for growing RAID-5 plexes. Unfortunately I was sick for a week and unable to work.

What remains now is to do more testing (can’t get enough), and to write and update documentation on gvinum. I have updated patches for gvinum at http://folk.ntnu.no/lulf/patches/freebsd/gvinum for both RELENG_6 and CURRENT. I appreciate reports from brave users who try it out, even if it works :)

Also, I created a new Perforce branch called gvinum_cache. So far I’ve implemented a read/write cache to check whether this would give much of a speed-up for gvinum. It’s not very nice for reliability, but could be an option for those who want better performance. Anyway, I’ll post more on this later.

Growing up

Tuesday, July 17th, 2007

Since the last post I haven’t really done that much to gvinum, but a few things:

  • I added a few automated test-scripts to check if a volume behaves properly
  • I went through the test plan and made sure that gvinum passes the tests.
  • I’ve been thinking a lot on how to best implement growing RAID-5 plexes.
  • I’ve implemented growing of RAID-5 plexes.

Now, the first and second points are quite boring to do, but I had to do them. The last points were trickier, since I didn’t really know where I should start. Finally I decided the best way was to let the plex overwrite itself! A more detailed explanation can be found in the TODO of my Perforce branch. I need to test the implementation a bit now. Other than that, I’ve been a bit lazy on my own work this week, and tried to help other students with reviews etc.
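
To give a rough feeling for what “overwriting itself” can mean, here is a sketch for a plain striped layout without parity. It is not taken from the Perforce branch and RAID-5 needs more care because of the parity blocks; the list-of-lists disk model and the function names are assumptions made for the example. The observation is that if blocks are moved in ascending order, the slot a block is written to either sits on the new disk or held a block that has already been moved, so nothing unread gets clobbered.

```python
def grow_striped(disks, nblocks):
    """Restripe nblocks logical blocks in place after one disk was appended.

    disks is a list of lists; the last one is the new disk, pre-sized.
    Old layout: block i lives at disks[i % n][i // n], n = len(disks) - 1.
    New layout: block i lives at disks[i % (n + 1)][i // (n + 1)].
    """
    old_n = len(disks) - 1
    new_n = len(disks)
    # Ascending order matters: the target slot of block i previously held a
    # block j <= i, so it has already been copied to its new home.
    for i in range(nblocks):
        data = disks[i % old_n][i // old_n]
        disks[i % new_n][i // new_n] = data


def read_back(disks, nblocks):
    """Read all logical blocks using the new (grown) layout."""
    n = len(disks)
    return [disks[i % n][i // n] for i in range(nblocks)]


# Two disks holding blocks 0..5, plus a new, empty third disk.
disks = [[0, 2, 4], [1, 3, 5], [None, None]]
grow_striped(disks, 6)
result = read_back(disks, 6)
print(result)
```

In this toy model the logical size only becomes larger once the whole loop has finished, which matches the idea of not changing the volume size until the plex is completely grown.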

Bugathon-week

Wednesday, July 4th, 2007

Since the last post, there have been many small bugfixes to gvinum. After some debating with myself on how I should implement concat/stripe/mirror, I think I got it pretty much right. The event system changed gvinum a lot, so I had to rewrite most of the code I already had for this.

I have done a lot of testing this week, and I made a test plan that I’m going to follow. Hopefully, I’ll also be able to create some automatic tests for this.

I’ve even been a good boy and updated the gvinum manpage! I added some examples to the manpage as well, so that it’s easier to get into gvinum for inexperienced users (not sure if we want gvinum to live even longer, but :) ).

A lot of small problems with weird states being set were also fixed, since this can be very confusing if you haven’t used gvinum much.

What I’d like to do next is create a set of test scripts that I can use to test quickly and easily. I also noticed that it would be nice to have a command similar to ‘mirror’ for RAID-5 volumes. This could be used like this: ‘gvinum raid5 <disk1> <disk2> <disk3>’. Other than that, I’ve started to think about how I’m going to implement RAID-5 resizing and the other goals in my proposal.

Bug-monster dying slowly

Thursday, June 28th, 2007

Finally, an update on what I’ve been doing since the last time. This time I have a lot of small changes that have been done:

  • Implement initialization of RAID5-Arrays. This basically writes zeros over everything and makes sure parity is correct.
  • Fix a bug with mirror code. The length of the completed requests got doubled if you have a mirror with two plexes, tripled if you have a mirror with three plexes etc.
  • When mirrored plexes are syncing, all requests after and including the first _write-request_ are delayed until syncing is finished.
  • Allow rebuilding a RAID-5 array while it is in use (e.g. mounted). Delay requests that are in conflict with the rebuild, but allow requests on the already rebuilt part to be run.
  • Allow subdisks to come up automagically after rebuild.
  • Allow stripe sizes not divisible by the subdisk size. A regression in the new gvinum code prevented this.
  • Modify the event system to contain two intmax_t fields, so we won’t have to allocate/deallocate pointers all the time when passing args to gv_post_event.
  • Add support for the rename and move commands to new gvinum. The code has been rewritten for the new gvinum.
  • Fix a bug in the code for degraded writes to a RAID5-array, where only zeros were written.
  • Other minor bug/style fixes.
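
The “rebuild while in use” item boils down to an interval check. The sketch below is not the gvinum code; the function and parameter names are made up, and letting requests beyond the rebuild window run is an assumption of mine (the post only states that requests on the already rebuilt part are allowed and conflicting ones are delayed).

```python
def classify_request(req_offset, req_length, rebuild_offset, rebuild_length):
    """Decide what to do with an I/O request while a rebuild is running.

    The rebuild is currently working on the window
    [rebuild_offset, rebuild_offset + rebuild_length); everything below
    rebuild_offset has already been rebuilt.
    """
    req_end = req_offset + req_length
    rebuild_end = rebuild_offset + rebuild_length
    # Two half-open intervals overlap if each starts before the other ends.
    if req_offset < rebuild_end and rebuild_offset < req_end:
        return "delay"  # in conflict with the part being rebuilt right now
    # Below the window: already rebuilt. Above the window: assumed to be
    # handled like any other request on a degraded plex.
    return "run"


print(classify_request(0, 10, 100, 50))
print(classify_request(90, 20, 100, 50))
```

A delayed request would be queued and reissued once the rebuild has moved past the region it touches.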

Next, I’m going to implement concat/stripe/mirror functionality. I already have some code from previous work I did, so I just need to adapt it to the new gvinum, as well as change some ugly parts. There are some small facade changes left, but I will do these after the last of the original vinum features is completed. Also, I will try to write a nice status report, and get a testable patch out by the time the reports are finished.

Even happier…

Monday, June 18th, 2007

Finally I did the initialization code for raid5 plexes, and this means I’m pretty much done with updating old gvinum to the new event system, but it will probably need some fixes here and there as it gets tested.

What remains in terms of needed functionality is the concat/mirror/stripe commands to easily create a concat/mirror/stripe volume out of three disks. I have also noted some issues that I think could use improvement. More on this next time.

Subdisks now live happy in raid5-town…

Wednesday, June 13th, 2007

So, finally the exams are over, and I’ve been able to work sort of full-time on my project the last days. What I’ve done is (a bit technical this time perhaps, but this stuff tends to become that):

Implemented attach/detach routines. This makes it possible to attach a subdisk/plex to a plex/volume, or detach a plex/subdisk from a volume/plex. The detach routine makes sure all connections between the objects are broken correctly, and only if it’s possible (unless forced, of course). The attach routine makes sure the objects are correctly connected together again, and that a plex that misses subdisks includes them in the previous size when calculating the new size (so we don’t get wrong sizes on the plexes).

Tested rebuild of degraded plexes. The detach/attach routines enabled me to check if the rebuild of a degraded raid5 plex could work. And it did! This means (and this is something that I really missed in old gvinum) that when a drive fails, you can detach the failed subdisk, create a new subdisk on the plex (it will check if the plex “misses” a subdisk), and then use ‘start <plexname>’ to rebuild the plex. The state of the plex must be degraded and the subdisk you wish to rebuild must be stale. After that, you’re good to go!

Bugfixes.

Implemented syncing of plexes. This means one can now add a mirror to a volume, and have the new plex synced from the original. After a couple of tests, it seems to work, but I did hit a bug I need to reproduce.

I also discovered some bugs regarding mirrored plexes that I will address in the near future. These probably came with the change to the new gvinum event system.

Next on my schedule is to hunt some weird state bugs where the state is not correctly set, as well as the mirrored plex problems I’ve seen. Also, I need to guarantee that a plex sync is up-to-date (that no data is written to the synced plex in the meantime).

Raid5 improvements

Saturday, May 26th, 2007

The last couple of weeks I had to practice for some exams. In other words, a great time for coding :)

This week I’ve been working on making RAID5 parity rebuild work. This includes user-initiated rebuild/check, as well as rebuilding a degraded plex during plex initialization. (This is a vital feature, since if a drive fails, one must be able to rebuild the plex with the new drive.) I have not been able to test this enough yet because I need the attach/detach routines to do it. So instead of continuing and getting the initialization/synchronization routines in, I will implement attach/detach next, which should be quick since I already have some old code for it.
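
For readers who haven’t looked at RAID-5 before, the parity check and the rebuild both rest on the same XOR trick. The sketch below shows it on small byte strings for a single stripe; it is a generic illustration and not the gvinum code, and the helper names are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equally sized byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_ok(data_blocks, parity):
    """Check that the stored parity matches the data blocks of a stripe."""
    return xor_blocks(data_blocks) == parity


# One stripe: three data blocks and their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] failed: XOR of everything that is
# left (remaining data plus parity) gives the lost block back.
rebuilt = xor_blocks([data[0], data[2], parity])

print(parity_ok(data, parity), rebuilt == data[1])
```

A parity check walks over all stripes doing the comparison in `parity_ok`, while a rebuild of a degraded plex recomputes the missing block of every stripe as in `rebuilt`.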

Getting started

Wednesday, May 9th, 2007

Well, there has been some time now to get familiar with the projects and how things are done, although I am already quite familiar with the procedures. Therefore I started a bit earlier on the actual project, since I have an exam period from now until June 8th.

I’ve committed some work on the setstate functionality, making it usable again after Lukas’ rewrite of gvinum’s event handling. Setstate now works on subdisks, plexes, volumes and drives. (Only subdisks and drives were supported before.)

I have also thought up and added error codes to be used internally in gvinum, to help me see what happens when debugging etc. Other than that, I’ve rewritten resetconfig a bit to prevent (what I think is) a race condition that was introduced with the new event system.

My plan now is to further adapt the gvinum code to the new event system. I also have more patches from my previous work that need to be integrated.

Ready, Set, GO!

Monday, April 16th, 2007

Hi, and welcome to this blog and my first post! I’m Ulf Lilleengen, and I will work on improving the Gvinum Volume Manager this summer. I’ll try as best I can to keep people updated on my work through this blog.

If you have any questions or comments regarding my work, please let me know.