Archive for July, 2007

Kernel loaded

Wednesday, July 25th, 2007

httpfs now works. It uses the pxe_http code to open keep-alive connections to the HTTP server, and it re-establishes a connection that has died because of the server’s settings (e.g. a timeout or exceeding the maximum number of requests per connection). I still need to think about how to make requests optimal. The loader uses 4K buffers per request, so maybe I’ll add some sort of caching mechanism to httpfs to reduce the number of requests to the server.

So it’s possible to use common BTX commands such as ‘more’, ‘load’ and ‘help’. I’ve played with them a lot while testing the http code, a rather boring process. But thanks to that, one problem with interrupt handling in pxe_core was located. It seems that the end of interrupt is not handled correctly, so from time to time the interrupt does not occur until some unrelated pxe_core code finishes the interrupt correctly. The problem is temporarily fixed by ignoring the __pxe_isr_occured flag and checking for available incoming frames in pxe_core_recv_packets() regardless of the flag’s value. Meanwhile the TCP code was simplified by rewriting it in the pxe_tcp module, reducing the number of code lines and functions. Testing showed it works identically to the previous code, except for one silly misprint that was fixed in the new variant.

Anyway, it’s possible to load modules, the kernel, etc. It’s interesting that the loader tries to load gzipped variants of files first, so it performs four requests to the server. E.g. if we need “loader.rc”, these files will be requested: “loader.rc.gz.split”, “loader.rc.gz”, “loader.rc.split”, “loader.rc”. This was unexpected and exceeded the limits of the filter and connection tables: connections fell into the TIME_WAIT state and no new connections could be established. (The keep-alive connections were actively closed by the client. In the near future I need to try sending a “Connection: close” header field, so that the connection is closed passively when the file is closed.) Well, I’ve added the ability to free connection structures in this state in order to establish new connections. It violates the RFC, but helps to keep the number of structures relatively small. If PXE_MAX_CONNECTIONS is set to a big enough number, the TCP code behaves as the RFC expects.
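The TIME_WAIT workaround can be pictured with a small sketch. This is not the actual pxe_tcp code: the table layout, the state names and conn_alloc() are hypothetical, and MAX_CONNECTIONS only stands in for PXE_MAX_CONNECTIONS. The idea is to prefer a free slot and to recycle a TIME_WAIT slot only when nothing else is left.

```c
#include <stddef.h>

#define MAX_CONNECTIONS	4	/* hypothetical; stands in for PXE_MAX_CONNECTIONS */

/* Reduced, hypothetical set of states; a real TCP has more of them. */
enum conn_state { CONN_FREE, CONN_ESTABLISHED, CONN_TIME_WAIT };

struct conn {
	enum conn_state	state;
};

static struct conn conn_table[MAX_CONNECTIONS];

/*
 * Return a slot for a new connection. Free slots are preferred. If there
 * are none, the first slot in TIME_WAIT is recycled, although RFC 793
 * wants it to stay in that state for 2*MSL. Returns NULL if every slot
 * is in active use.
 */
static struct conn *
conn_alloc(void)
{
	struct conn *recyclable = NULL;
	size_t i;

	for (i = 0; i < MAX_CONNECTIONS; i++) {
		if (conn_table[i].state == CONN_FREE) {
			conn_table[i].state = CONN_ESTABLISHED;
			return (&conn_table[i]);
		}
		if (recyclable == NULL &&
		    conn_table[i].state == CONN_TIME_WAIT)
			recyclable = &conn_table[i];
	}
	if (recyclable != NULL)
		recyclable->state = CONN_ESTABLISHED;
	return (recyclable);
}
```

With a table large enough that a free slot is always found, the recycling branch never runs and the TIME_WAIT slots are left alone, which matches the remark about a big PXE_MAX_CONNECTIONS.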

After changing loader.rc to something very simple, my home-made kernel was loaded and ‘lsmod’ showed a number of modules. But the ‘boot’ command failed (a hang or a panic, depending on which virtual or real machine was used). It was frustrating: the result is practically visible, yet the goal is not achieved. I’ve tried loading RAM drives, but that also failed.

In a bad mood I started the normal loader.rc. The first surprise was that on the VMware machine I saw a normal boot process up to the mounting of the root file system (which is unknown, because it is not set by environment parameters in the loader and /etc/fstab is empty in Apache’s DocumentRoot directory). The second surprise was that another virtual machine hangs while loading support.4th (according to the Apache logs it is not read completely, only about 70-80%) and my real machine just reboots while reading the same file. Rather suspicious :)
So in the next days I’ll be on a quest to find out why loading support.4th, which has nothing criminal in it, may cause such problems, and exploring the issue with the interrupt that gets lost somewhere.

Something looking like result

Saturday, July 14th, 2007

Since the last post there have been big steps forward.

The second variant of the TCP code was good enough and I even got the result ‘tes’ (3 letters of the word ‘test’) on the server side while testing. After some testing and improvements the whole buffer was transferred well, but there was no resend queue. It appeared in the third variant.

For that purpose a small and cute packet allocation manager was needed (the receive buffer still uses the common cyclic buffer routines, but the send buffer is managed by the packet allocation manager). Efficient memory allocation for packets of variable size is a rather difficult task, so I implemented a different one: allocation of fixed-size packets. It is simple enough to implement in a few days. All packets are divided into two groups, “small” packets and “big” packets. Small packets are used for system messages (SYN, FIN, ACK without data, RST), big ones are for data transmission.

How does it work? The buffer space is divided into chunks whose size equals the maximum size of a “small” packet. It’s now 64 bytes (this is enough for the IP header, TCP header, TCP options and the segment description). Every chunk contains a TCP-related segment description (resend time and so on). Sequential chunks are gathered together into blocks. A block is used for allocating data for “big” packets and its size is 512 bytes. So the memory manager just allocates blocks or chunks in place. A block may be owned exclusively by one big segment or shared by a number of small segments. Amazingly, it worked from the beginning without significant errors. The segment resend function uses the segment description, so it’s possible to send segments allocated by another manager or just statically allocated (this is used for sending RST packets when there is no connection, and thus no socket, no buffer and no space to allocate chunks).
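A minimal sketch of such a chunk/block scheme follows. Only the 64-byte and 512-byte sizes come from the description above; the pool size, the bitmap bookkeeping and the function names (alloc_small(), alloc_big(), free_packet()) are assumptions made for illustration, and the per-chunk segment description is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE		64	/* maximum size of a "small" packet */
#define BLOCK_SIZE		512	/* size of a "big" packet */
#define CHUNKS_PER_BLOCK	(BLOCK_SIZE / CHUNK_SIZE)	/* 8 */
#define NBLOCKS			4	/* hypothetical pool of 2048 bytes */

static uint8_t pool[NBLOCKS * BLOCK_SIZE];
static uint8_t used[NBLOCKS];	/* one bit per chunk of each block */
static uint8_t is_big[NBLOCKS];	/* block is owned by one big packet */

/* Allocate one chunk from the first block that is not owned by a big packet. */
static void *
alloc_small(void)
{
	int b, c;

	for (b = 0; b < NBLOCKS; b++) {
		if (is_big[b])
			continue;
		for (c = 0; c < CHUNKS_PER_BLOCK; c++) {
			if ((used[b] & (1u << c)) == 0) {
				used[b] |= (uint8_t)(1u << c);
				return (&pool[b * BLOCK_SIZE + c * CHUNK_SIZE]);
			}
		}
	}
	return (NULL);
}

/* Allocate a whole block; only a block with no chunks in use qualifies. */
static void *
alloc_big(void)
{
	int b;

	for (b = 0; b < NBLOCKS; b++) {
		if (used[b] == 0) {
			used[b] = 0xFF;
			is_big[b] = 1;
			return (&pool[b * BLOCK_SIZE]);
		}
	}
	return (NULL);
}

/* Release a packet returned by alloc_small() or alloc_big(). */
static void
free_packet(void *p)
{
	size_t off = (size_t)((uint8_t *)p - pool);
	size_t b = off / BLOCK_SIZE;
	size_t c = (off % BLOCK_SIZE) / CHUNK_SIZE;

	if (is_big[b]) {
		used[b] = 0;
		is_big[b] = 0;
	} else
		used[b] &= (uint8_t)~(1u << c);
}
```

Because both sizes are fixed, allocation is a bitmap scan and nothing ever has to be split or merged. The price is that a big packet always takes a full 512 bytes, however little data it carries.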

So the memory manager was ready and resending became pretty simple: just check the resend time of a segment and resend it. I started simple testing of the code by sending blocks of generated sequences with identical initial generators on both sides of the connection. It worked perfectly as long as the total amount of data transferred to one side was less than the buffer size. Bigger amounts of data made the code go crazy; the guard checks of the memory allocator became very talkative and easily fell into panic. A few days of testing located the problem in the buffer read/write code, in some cases. After writing a standalone testing module for the buffers everything became correct and data started transferring well. The first test was transferring 8 MB of data (which exceeds the default buffer size of 32K). The good news was that the data arrives correctly: a byte-by-byte check showed no errors. The bad news was that the speed was about 5-6 KB/s on Fast Ethernet. Really annoying to wait for all of those 8 MB to be transferred.
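The test idea, identical generators on both ends, can be sketched like this. The generator here is a plain linear congruential one with the well-known Numerical Recipes constants. The post does not say which generator was really used, so the generator and all names below are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical test generator; both peers seed it with the same value. */
struct seqgen {
	uint32_t state;
};

static void
seqgen_init(struct seqgen *g, uint32_t seed)
{
	g->state = seed;
}

/* Next byte of the sequence (LCG step, top 8 bits of the state). */
static uint8_t
seqgen_next(struct seqgen *g)
{
	g->state = g->state * 1664525u + 1013904223u;
	return ((uint8_t)(g->state >> 24));
}

/* Sender side: fill a block with the next 'len' bytes of the sequence. */
static void
seq_fill(struct seqgen *g, uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] = seqgen_next(g);
}

/*
 * Receiver side: compare received bytes with the locally generated ones.
 * Returns the index of the first mismatch, or 'len' if all bytes match.
 */
static size_t
seq_check(struct seqgen *g, const uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] != seqgen_next(g))
			return (i);
	return (len);
}
```

Since the receiver regenerates the stream itself, no reference copy of the data has to be stored anywhere, and a lost, duplicated or reordered byte shows up as a mismatch at a known offset.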

The reason was simple: after filling the receive buffer and getting a small window size (less than the default MSS), the server side waited a second or two before asking the client side whether any space was free now. So one fix in pxe_recv(), which now checks the connection by sending an ACK every 100 ms if the receive buffer has plenty of free space, saved the Earth, pardon, the project. Speed became 500K-1000K. Well, it’s still not the best, but better than it was; the next speed-up may come from sending/receiving bigger blocks (in the test it was done byte by byte in order to check that the sequences are correct). Now there is another problem :) The client code is very nasty: if the server is not sending anything, the client bombs the server with ACKs announcing that a big window is ready to work. I need to think about how to make the client behave more correctly. At this point 80 MB of data had been transferred correctly and relatively fast, so the test was over with the verdict: passed.
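The periodic window announcement could look roughly like the sketch below. It is not the real pxe_recv() code. The structure, the function name and the threshold are assumptions; in particular, “plenty of free space” is taken here to mean at least one MSS.

```c
#include <stdint.h>

#define WINDOW_UPDATE_INTERVAL_MS	100	/* interval named in the post */

/* Hypothetical receive-side bookkeeping for one connection. */
struct recv_state {
	uint32_t free_space;		/* free bytes in the receive buffer */
	uint32_t mss;			/* maximum segment size */
	uint32_t last_update_ms;	/* time of the last window ACK */
};

/*
 * Decide whether a bare ACK announcing the window should be sent now.
 * Returns 1 and records the time if it is due, 0 otherwise.
 * Assumption: enough free space means room for at least one full segment.
 */
static int
window_update_due(struct recv_state *rs, uint32_t now_ms)
{
	if (rs->free_space < rs->mss)
		return (0);
	/* Unsigned subtraction stays correct when the ms counter wraps. */
	if ((uint32_t)(now_ms - rs->last_update_ms) < WINDOW_UPDATE_INTERVAL_MS)
		return (0);
	rs->last_update_ms = now_ms;
	return (1);
}
```

This sketch has the same weakness as described above: as long as the buffer is empty it keeps firing every 100 ms, even when the server has nothing to send.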

It was so good that it even alarmed me a little bit. Time showed I was right, not everything was so good. To be more accurate, connection termination was handled incorrectly when the server initiated it (passive closing from the client’s point of view). Some updates in tcp_disconnect() helped to solve the problem, and LAST_ACK and CLOSE_WAIT are now also handled in the right way. At least I think so, and tests confirm it. This change in the disconnect function also gave a speed-up bonus for active connection closing.

But anyway, the next version of the TCP implementation is already in my thoughts, to simplify the pxe_tcp module and reduce the function count. Maybe some other changes will be made. E.g. I’m trying to reduce the number of ACK packets sent by using incremental ACKs (now every incoming segment is acknowledged). This is already done when checking for “same” packets: every packet in the queue that is just a copy of the current one, but with an older sequence number to ACK, is removed from the resend queue. It helps a little bit, but the solution is still not good enough.

Meanwhile I was writing an http client, the only function of which was to fetch a chosen file (I was testing on an html page) from a web server. There was nothing special, except that I was sent again to read style(9) and implemented a simple snprintf() in libstand by modifying the PCHAR() macro. The http receiving code was working, so the next function implemented partial requests for data. It’s rather simple, just adding another field to the header (assuming the server supports this feature). The main task here for the future is not to establish a connection for every read of a part of a file. Now it is established at the start of a read and closed after it.
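The extra header field is the standard HTTP/1.1 “Range” field, which takes an inclusive first and last byte position. A hedged sketch of building such a request with snprintf() is below; the function name and its parameters are hypothetical and are not taken from pxe_http.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a GET request for 'size' bytes of 'path' starting at 'offset'.
 * Returns the request length, or -1 if 'size' is 0 or 'buf' is too small.
 * Hypothetical helper; the real pxe_http interface may look different.
 */
static int
http_build_range_request(char *buf, size_t buflen, const char *host,
    const char *path, unsigned long offset, unsigned long size)
{
	int n;

	if (size == 0)
		return (-1);
	n = snprintf(buf, buflen,
	    "GET %s HTTP/1.1\r\n"
	    "Host: %s\r\n"
	    "Range: bytes=%lu-%lu\r\n"	/* both positions are inclusive */
	    "Connection: keep-alive\r\n"
	    "\r\n",
	    path, host, offset, offset + size - 1);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);
	return (n);
}
```

For a 4K read at offset 4096 this produces “Range: bytes=4096-8191”. A server that supports ranges answers with status 206 (Partial Content); one that does not may answer 200 with the whole file, so the client has to check the status code.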

The next step, which I had forgotten about, is the filesystem. Well, for reasons I don’t know, I was thinking that pxe.c is all that needs to change and files would start fetching themselves from the web server (by the way, the www-server option was added to the DHCP client to get its IP). But it’s rather logical to separate the filesystem and the device. So I needed to make stub filesystem code. Using the already available code it was simple: just call the pxe_http module functions at the needed time.

So in the next days I’ll continue checking and correcting the existing code (also, once more, style updates…), but the main task will be to tie together the filesystem, the pxenet device and the http code and see what results from this mix.