Behind INDEX-*.db
Published May 19th, 2007 in General
Seizing the opportunity of having more free time from my internship, I’ve been able to unravel a bit more behind INDEX-*.db’s format.
Based on the information dumped with db_dump185 from the MySQL 5.0.x source, I can see that the Ruby scripts generated a database in the following format:
:categories{Categories}
:db_version
[db specific version string]
:origins
{Package name
Package origin}
:pkgnames
{Package name
Package origin}
:virtual categories
{
{?Virtual Category Name
{complete Virtual Category Package Origins}
}
{Package name
Package origin}
{Origin
Package Name|Path|Prefix|Comment|Description|Maintainer|{Categories}|{Build Dependencies}|{Run Dependencies}|Website|{Extract Dependencies}|{Patch Dependencies}|{Fetch Dependencies}
a}
}
Observations:
The reason for:
{Package name
Package origin}
being injected every once in a while is probably due to the overflow facility of the BDB database format, combined with the fact that my output was from a raw database dump.
Also, it does appear as if the:
{Package name
Package origin}
set is sorted by “Package name”, which I find a) interesting and b) inefficient, depending on the algorithm used to extract the fields from the INDEX-* file.
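One way a name-sorted list of pairs could pay off is a binary search from package name to origin. The sketch below only illustrates that idea; I have not confirmed that the Ruby scripts look anything up this way. The function name `lookup_origin` and the sample pairs are made up for the example.

```python
import bisect

def lookup_origin(pairs, name):
    """Return the origin for `name`, or None if it is absent.

    `pairs` is a list of (package name, package origin) tuples that is
    already sorted by package name, as the dumped set appears to be.
    """
    # (name,) sorts just before any (name, origin) tuple, so bisect_left
    # lands on the first entry whose package name is >= name.
    i = bisect.bisect_left(pairs, (name,))
    if i < len(pairs) and pairs[i][0] == name:
        return pairs[i][1]
    return None

# Hypothetical entries, not taken from a real INDEX-*.db.
PAIRS = [
    ("alpha-1.0", "misc/alpha"),
    ("beta-2.3", "devel/beta"),
    ("gamma-0.9", "net/gamma"),
]

print(lookup_origin(PAIRS, "beta-2.3"))
print(lookup_origin(PAIRS, "delta-1.1"))
```

If the pairs were unsorted, the same lookup would need a linear scan or a separate hash, which is the trade-off the observation above is hinting at.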
Summary:
So, the Ruby ports management scripts tack additional metadata onto the existing INDEX-* file, most likely because the author considered it wise for looking up ports / packages. It may decrease the search time, but it also increases the overall raw INDEX-* data by 2.1 MB, which results in more pre-/post-processing.
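The 2.1 MB figure comes straight from the rounded sizes that `ls -lh` reported for the two files (9.9M for INDEX-7, 12M for INDEX-7.raw). A minimal sketch of that arithmetic, with the hypothetical helper `index_overhead` standing in for a by-hand subtraction:

```python
def index_overhead(plain_mb, raw_db_mb):
    """Return (extra megabytes, extra as a percentage of the plain file)."""
    extra = raw_db_mb - plain_mb
    return extra, extra / plain_mb * 100.0

# Sizes as rounded by `ls -lh`, so the result is only approximate.
extra, pct = index_overhead(9.9, 12.0)
print(f"{extra:.1f} MB extra, about {pct:.0f}% larger")
```

Since both inputs are already rounded, the result is good to roughly one decimal place at best.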
Notes:
- A raw version of dumped INDEX-7.db (~1.4 MB) is available at: http://soc2007-freebsd-project-hosting.googlecode.com/files/INDEX-7.raw.tar.bz2.
- The claim that the INDEX-* data gets bloated is based on the following listing:
gcooper@optimus ~/gcooper/Desktop
$ ls -lh INDEX-7 INDEX-7.raw
-rwx------+ 1 gcooper None 9.9M Apr 14 15:34 INDEX-7
-rwx------+ 1 gcooper None 12M May 19 00:17 INDEX-7.raw
- The
{Origin
Package Name|Path|Prefix|Comment|Description|Maintainer|{Categories}|{Build Dependencies}|{Run Dependencies}|Website|{Extract Dependencies}|{Patch Dependencies}|{Fetch Dependencies}
a}
set appears to have been taken verbatim from the INDEX-* file. See http://www.lpthe.jussieu.fr/~talon/freebsdports.html#htoc11 for more details.
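Since that record looks like a plain INDEX-* line, it can be split on “|” and the “{ }” fields split again on whitespace. Below is a minimal sketch assuming exactly the thirteen fields listed above, in that order. The key names, the function `parse_index_line`, and the sample line are made up for illustration and do not come from a real INDEX-7.

```python
# Keys follow the field order given in the post; the names are my own.
FIELDS = [
    "name", "path", "prefix", "comment", "description", "maintainer",
    "categories", "build_deps", "run_deps", "website",
    "extract_deps", "patch_deps", "fetch_deps",
]

# Fields written as "{ }" in the post: space-delimited lists.
LIST_FIELDS = {
    "categories", "build_deps", "run_deps",
    "extract_deps", "patch_deps", "fetch_deps",
}

def parse_index_line(line):
    """Split one pipe-delimited record into a dict keyed by FIELDS."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = {}
    for key, value in zip(FIELDS, parts):
        # An empty list field becomes [], because "".split() returns [].
        record[key] = value.split() if key in LIST_FIELDS else value
    return record

# Hypothetical record, made up for this example.
SAMPLE = (
    "foo-1.0|/usr/ports/misc/foo|/usr/local|A sample port|"
    "/usr/ports/misc/foo/pkg-descr|someone@example.org|misc devel|"
    "bar-2.1|bar-2.1 baz-0.3|http://example.org/foo/|||"
)

rec = parse_index_line(SAMPLE)
print(rec["name"], rec["categories"], rec["run_deps"])
```

The strict field count check is a deliberate choice: a line with a different number of fields raises an error instead of being silently misaligned.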
Notation:
- “{ }” : denotes a repeating list pattern, typically space delimited.
- Code blocks denote verbatim string segments or characters.
- Italicized text denotes package metadata fields.