<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>collation</title>
	<atom:link href="http://blogs.freebsdish.org/konrad/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.freebsdish.org/konrad</link>
	<description>Just another FreeBSD Committers Blogs weblog</description>
	<pubDate>Tue, 05 Aug 2008 11:55:39 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
	<language>en</language>
			<item>
		<title>Enter Libc</title>
		<link>http://blogs.freebsdish.org/konrad/2008/08/05/enter-libc/</link>
		<comments>http://blogs.freebsdish.org/konrad/2008/08/05/enter-libc/#comments</comments>
		<pubDate>Tue, 05 Aug 2008 11:55:39 +0000</pubDate>
		<dc:creator>konrad</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.freebsdish.org/konrad/2008/08/05/enter-libc/</guid>
		<description><![CDATA[I spent the last few weeks importing the interesting parts of Apple&#8217;s libc into FreeBSD. I ended up with this patch: http://versus.ath.cx/patches/fbsd_7.0_collation.patch, which I then applied to the CURRENT libc and committed as my p4 libc branch.
It wasn&#8217;t without accident, however - I suffered two serious laptop breakages, ending up with an essentially dead one. First the display [...]]]></description>
			<content:encoded><![CDATA[<p>I spent the last few weeks importing the interesting parts of Apple&#8217;s libc into FreeBSD. I ended up with this patch: http://versus.ath.cx/patches/fbsd_7.0_collation.patch, which I then applied to the CURRENT libc and committed as my p4 libc branch.</p>
<p>It wasn&#8217;t without accident, however - I suffered two serious laptop breakages, ending up with an essentially dead one. First the display broke down - it stopped showing anything - I think it was the backlight. I was lucky - I had an external monitor enabled and configured for Dual Head. When I connected it, however, I started to get IDE timeout problems when booting FreeBSD. So my hard drive had died as well. I managed to boot off a live CD and copy my GSoC data, and that was it.</p>
<p>After installing everything on my newly bought laptop, I got back to finishing my work on libc, and was able to get it into a - I think - fully working state. There are still some little things I want to fix or enhance, but the general functionality is there.</p>
<p>Now I&#8217;m focusing on writing regression tests; after that will come the manpages. At the end there will be a call for testers.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.freebsdish.org/konrad/2008/08/05/enter-libc/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Reinforcements from Apple</title>
		<link>http://blogs.freebsdish.org/konrad/2008/07/12/reinforcements-from-apple/</link>
		<comments>http://blogs.freebsdish.org/konrad/2008/07/12/reinforcements-from-apple/#comments</comments>
		<pubDate>Sat, 12 Jul 2008 15:45:36 +0000</pubDate>
		<dc:creator>konrad</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.freebsdish.org/konrad/2008/07/12/reinforcements-from-apple/</guid>
		<description><![CDATA[Last time I said that it would be nice to use some of Apple&#8217;s work in the area of collation. I exchanged a few emails with Jordan K. Hubbard and it seems we can use their code without problems - all the interesting parts are still under the BSD licence. That&#8217;s because this is still our [...]]]></description>
			<content:encoded><![CDATA[<p>Last time I said that it would be nice to use some of Apple&#8217;s work in the area of collation. I exchanged a few emails with Jordan K. Hubbard and it seems we can use their code without problems - all the interesting parts are still under the BSD licence. That&#8217;s because this is still our code, only extended by Apple. Even the copyright notices weren&#8217;t changed a bit (so we don&#8217;t know who did the extending).</p>
<p>Anyway, as the code is fairly mature, I decided to use it. The libc part of the code is the one I am most interested in, but to see how it works, I first had to port the userland tool - Apple&#8217;s version of colldef. While doing this I extended it a little so that it does not choke on expansions. As I don&#8217;t have the locale data that Apple is using, I made the tool work on my data - at the same time making it more POSIX-compliant. There were many little issues while porting the code - and I wanted it to work perfectly before I submitted it - so it took me more than a week to complete the port. I even made it compile with &#8220;-ansi -Wall -Wextra -pedantic&#8221;, something I always do with my code.</p>
<p>Now that the tool is complete and I have done a final cleanup, I will test it on a larger amount of data, and then proceed to port the libc part. I&#8217;m really excited to see how it works. When those two things are completed I will have to make a few more extensions to Apple&#8217;s code to make it fully compliant with UCA.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.freebsdish.org/konrad/2008/07/12/reinforcements-from-apple/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Gathering the basic elements</title>
		<link>http://blogs.freebsdish.org/konrad/2008/06/16/gathering-the-basic-elements/</link>
		<comments>http://blogs.freebsdish.org/konrad/2008/06/16/gathering-the-basic-elements/#comments</comments>
		<pubDate>Mon, 16 Jun 2008 23:34:11 +0000</pubDate>
		<dc:creator>konrad</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.freebsdish.org/konrad/2008/06/16/gathering-the-basic-elements/</guid>
		<description><![CDATA[Last time I wrote about the purpose of collation. Now it is time to write a little about how I want to deal with it.
I&#8217;ve been a little busy with my exams lately (who hasn&#8217;t), but I have the last one on June 26th. Anyway, I&#8217;ve managed to gather some basic building blocks, which will support [...]]]></description>
			<content:encoded><![CDATA[<p>Last time I wrote about the purpose of collation. Now it is time to write a little about how I want to deal with it.</p>
<p>I&#8217;ve been a little busy with my exams lately (who hasn&#8217;t), but I have the last one on June 26th. Anyway, I&#8217;ve managed to gather some basic building blocks, which will support the rest of my project:</p>
<ol>
<li>imported the Unicode &#8220;Common Locale Data Repository&#8221; (CLDR) - the source of all the locale data that you will ever need - into my p4 tree</li>
<li>written converter scripts to change the symbolic character names found in this repository into UTF-8 sequences</li>
<li>written a program, colldef.c, that takes the output of the scripts and builds the binary collation table, doing some fancy compression/reduction on the way, so that all character weights fit within one byte</li>
</ol>
<p>The next step will be writing the libc part - the one that uses the binary table and does the sorting/collation. I will have to rewrite most of string/strcoll.c and locale/collate.c.</p>
<p>I was recently contacted by Alexander Leidinger, who told me that Apple has already done a full conversion of their base system to UTF-8. I skimmed through their strcoll.c and collate.c and I can confirm this. It would be nice if we could use some part of this work.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.freebsdish.org/konrad/2008/06/16/gathering-the-basic-elements/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A few words about collation</title>
		<link>http://blogs.freebsdish.org/konrad/2008/06/12/a-few-words-about-collation/</link>
		<comments>http://blogs.freebsdish.org/konrad/2008/06/12/a-few-words-about-collation/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 07:12:08 +0000</pubDate>
		<dc:creator>konrad</dc:creator>
		
		<category><![CDATA[soc2008]]></category>

		<guid isPermaLink="false">http://blogs.freebsdish.org/konrad/2008/06/12/a-few-words-about-collation/</guid>
		<description><![CDATA[I&#8217;d like to introduce newcomers to the topic - what collation is and why we need it - e.g. why not just use strcmp.
In the simplest form - comparing English words - we don&#8217;t need collation at all (save for case differences) - the binary character encodings (called code points in Unicode) are all we [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d like to introduce newcomers to the topic - what collation is and why we need it - e.g. why not just use strcmp.</p>
<p>In the simplest form - comparing English words - we don&#8217;t need collation at all (save for case differences) - the binary character encodings (called code points in Unicode) are all we need. However, when we have to deal with, for example, accents, our task is more difficult - in most languages, differences in accents should be ignored if there are any differences in the base letters - even in base letters which come _later_ in the string. Then come the differences in case, which should be even less important than differences in accent, and last of all come differences in punctuation.</p>
<p>This way we end up with 3 or 4 comparison levels: the first one is always performed, and the others are conditional, performed only if the earlier levels showed no difference between the strings. Add to this contractions - when two characters have to be treated as one - and expansions - when one character should behave in sorting as two - and you have a basic idea of what collation is.</p>
<p>On top of this, each language of course has its own rules, so we need to tailor the collation to the current locale - we basically have to have data files for all supported languages.</p>
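<p>To make the comparison levels concrete, here is a minimal sketch in Python. It is only an illustration under some loud assumptions, not the libc code: the function name collation_key is made up for this example, and it uses Unicode decomposition from the standard unicodedata module as a stand-in for real collation tables. It handles three levels (base letters, accents, case) and ignores punctuation, contractions, expansions and locale tailoring.</p>
<pre>
```python
import unicodedata


def collation_key(word):
    """Build a three-level sort key: base letters, then accents, then case.

    Hypothetical helper for illustration only; real collation uses
    per-locale weight tables instead of Unicode decomposition.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = []     # level 1: letters with accents and case removed
    accents = []  # level 2: combining marks attached to each base letter
    case = []     # level 3: 0 for lowercase, 1 for uppercase
    for ch in decomposed:
        if unicodedata.combining(ch):
            # Attach the mark to the preceding base letter; a mark with
            # no preceding letter is simply dropped in this sketch.
            if accents:
                accents[-1] += ch
            continue
        base.append(ch.lower())
        accents.append("")
        case.append(1 if ch.isupper() else 0)
    # Tuples compare element by element, so a later level only matters
    # when every earlier level is equal for the whole string.
    return (tuple(base), tuple(accents), tuple(case))


words = ["résumé", "resume", "Resume", "resumes", "rose", "rosé"]
result = sorted(words, key=collation_key)
print(result)
```
</pre>
<p>With this key the words come out as resume, Resume, résumé, resumes, rose, rosé. The accent in résumé does not push it past resumes, because the base letters already differ at the end of the string. A plain code point sort of the same list would instead put Resume first and résumé last.</p>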
]]></content:encoded>
			<wfw:commentRss>http://blogs.freebsdish.org/konrad/2008/06/12/a-few-words-about-collation/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
