Archive for May, 2007

Some thoughts about sockets

Saturday, May 26th, 2007

I’m deciding which is more important: syntax compatibility with the usual sockets implementation (mimicking function declarations, and the order in which functions appear in user code, as closely as possible), or whether basic compatibility (the module behaves similarly, but some functions are named differently and the sequence of calls to establish a connection may differ) is more than enough. Maybe it’s one task viewed from two sides. Today I think the first variant is more appropriate, but it limits some project functionality that is available with my current understanding of pxe_filter.
All incoming packets flow from the pxe_core_recv_packets() function, which is hidden from the user (or maybe not hidden; user-based networking needs a receive cycle anyway), to the callback function registered by the IP-based protocol (e.g. for UDP: pxe_udp_callback(), installed by the pxe_udp_init() call at pxe_core startup). The callback function must dispatch packet data to one of the sockets, if any are initialized.

For that purpose I’ve created a module named pxe_filter. Every socket, at connection establishing time (or at the time of the equivalent of a listen() call, or just at socket creation time), installs a filter into the filter table. A filter contains destination/source IP/port with masks for every parameter, and a reference to the socket for which the filter was installed. So the protocol handler function calls pxe_filter_check() with the ports, IPs and protocol type extracted from the packet. The filter check function sequentially checks filter entries from the first to the last entry, and returns when (if) the first satisfactory entry is found (well, similar to firewall rules). If a filter entry is found, the socket is extracted from the pxe_filter_entry structure, and if the socket member of the structure is valid, the data is dispatched to the socket buffer, if there is free space in it.
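The first-match lookup described above can be sketched roughly as follows. This is only an illustration: the structure layout, the field names and the filter_check() signature are assumptions, not the actual pxe_filter_entry or pxe_filter_check() definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter entry; the real pxe_filter_entry may look different. */
struct filter_entry {
    uint32_t src_ip, src_ip_mask;
    uint32_t dst_ip, dst_ip_mask;
    uint16_t src_port, src_port_mask;
    uint16_t dst_port, dst_port_mask;
    uint8_t  protocol;
    void    *socket;   /* socket that installed the filter, NULL if unused */
};

/* Walk the table from the first to the last entry and return the socket
 * of the first entry whose masked fields match, or NULL if none does. */
static void *
filter_check(const struct filter_entry *table, size_t count,
             uint32_t src_ip, uint16_t src_port,
             uint32_t dst_ip, uint16_t dst_port, uint8_t protocol)
{
    size_t i;

    for (i = 0; i < count; ++i) {
        const struct filter_entry *e = &table[i];

        if (e->socket == NULL || e->protocol != protocol)
            continue;
        if ((src_ip & e->src_ip_mask) != (e->src_ip & e->src_ip_mask))
            continue;
        if ((dst_ip & e->dst_ip_mask) != (e->dst_ip & e->dst_ip_mask))
            continue;
        if ((src_port & e->src_port_mask) != (e->src_port & e->src_port_mask))
            continue;
        if ((dst_port & e->dst_port_mask) != (e->dst_port & e->dst_port_mask))
            continue;
        return e->socket;
    }
    return NULL;
}
```

A mask of zero makes the corresponding field a wildcard, so an entry with all masks zero behaves like a catch-all rule for its protocol.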

What benefits does it give:

  • It’s simple. And it adds some functionality of a primitive firewall (which may not be interesting in this project).
  • Code that is not connection oriented may listen only to the IP in which it is interested. Usually the check of which IP a datagram was received from is performed after recvfrom() or recvmsg(), using one of the returned values. But with filters there would be no need for this. A filter may limit which IP to accept packets from. I think it’s useful, e.g. for a DNS resolver or any other non-broadcast UDP packet transmission, when it’s known from which IP data is expected. If the sockets function syntax is mimicked exactly, this functionality may be lost.
  • A listen() call may use filters to “fork” sockets, creating a socket with a stricter filter placed higher (earlier in checking order) in the filter table. So all accepted connections may be treated as children of the base socket. It’ll help to check whether there are more connections to accept and, after the base socket is closed, which connections also to kill, if needed. Well, my project is more client oriented than server oriented, but it’s a good opportunity anyway.
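The “fork” idea from the last point could look something like the sketch below: the child filter is inserted directly in front of the base socket’s filter, so it is checked first. All names here (struct filter, struct filter_table, filter_insert_before(), FILTER_MAX) are made up for illustration and the table is simplified to a single remote IP and mask per entry.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILTER_MAX 8            /* hypothetical table capacity */

struct filter {
    uint32_t remote_ip;
    uint32_t remote_mask;       /* 0 = accept any remote IP */
    int      socket_id;
};

struct filter_table {
    struct filter entries[FILTER_MAX];
    size_t        count;
};

/* Insert `child` directly before the first entry owned by `parent_socket`.
 * Returns the index of the new entry, or -1 if the table is full or the
 * parent has no filter installed. */
static int
filter_insert_before(struct filter_table *t, int parent_socket,
                     const struct filter *child)
{
    size_t i;

    if (t->count >= FILTER_MAX)
        return -1;

    for (i = 0; i < t->count; ++i) {
        if (t->entries[i].socket_id != parent_socket)
            continue;
        /* Shift the parent and everything after it one slot down. */
        memmove(&t->entries[i + 1], &t->entries[i],
                (t->count - i) * sizeof(t->entries[0]));
        t->entries[i] = *child;
        t->count++;
        return (int)i;
    }
    return -1;
}
```

Because the lookup stops at the first match, packets from the accepted peer reach the child socket, while everything else still falls through to the listening socket’s wider filter.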

To me it seems not a bad idea at all, so I’ve started doing it this way.


Can someone tell me how to fix SpamKarma 2, which is installed incorrectly? All links on the manage page link to… but not to…; it’s a little bit annoying to enter them manually.



sql_skip_rows=0&sql_score_threshold=-100 to check whether some comment was mistaken for cialis ads.

Ping works

Wednesday, May 23rd, 2007

Somehow the PXE code works, and in the near future only C and TCP/IP related stuff is expected, no assembler or PXE related problems. The last issue (the code worked in only one of the UNDI API implementations available to me) was solved by modifying the most used function, pxe_core_call(). For some reason it returns an incorrect value in the ax register, although the code works correctly. So the ax register is now ignored until I understand why its contents are corrupted.

The IP implementation was made really fast, but the checksums of the IP and ICMP headers provided some nervous hours: ICMP reply packets (from the PXE client) were visible in tcpdump, but ping itself ignored these packets. The problem was an incorrect ICMP checksum. (At first tcpdump didn’t tell me there was a problem with the ICMP checksum, but after some changes in the checksum calculation it began to warn. I wonder why it wasn’t warning earlier, when the checksum was calculated only over the ICMP header without the transmitted data. Maybe it was a miracle and those bytes were filled with zeroes, so the checksum was independent of the transmitted data.)
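For reference, here is a minimal sketch of the standard Internet checksum (RFC 1071 style one’s complement sum), which for ICMP has to cover the header and the data that follows it. The function name inet_checksum() is an assumption for this example and is not taken from the project.

```c
#include <stddef.h>
#include <stdint.h>

/* One's complement sum over `len` bytes, treated as big-endian 16-bit
 * words. The checksum field inside the buffer must be zero while
 * computing. The result is the 16-bit value to store in the packet,
 * high byte first. */
static uint16_t
inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)                  /* odd trailing byte, padded with zero */
        sum += (uint32_t)p[0] << 8;

    while (sum >> 16)              /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For an ICMP echo packet the length passed in should be the 8 byte header plus the payload. Summing only the header gives the same result as the full sum only when the payload sums to zero (for example, when it is all zero bytes), which matches the guess above. A receiver can verify a packet by running the same function over it with the checksum in place; a valid packet yields 0.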

The second problem, which was expected, was routing packets to gateways. For this purpose some routing related functions were implemented, and now it’s possible to perform a ping of death from PXE booted workstations :) The routing functions are too primitive in my project; I need to do something with them. By default, at start (pxe_ip_route_init()), two routes are added: default (it should be taken from the PXE cached packets, but there are only zeroes in place of the gip member, so I set my home gateway as the default gateway; after some DHCP/BOOTP functions are implemented, it’ll be possible to get the gateway automatically) and pxenet0 (this route is for hosts in the local network; in fact it’s not a gateway. It’s named after the interface in the original pxeboot). The IP related code uses the first route whose network is equal to the destination network of the IP packet.
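A rough sketch of that kind of lookup is shown below. It assumes a small table of (network, mask, gateway) entries plus a separate default gateway; the names struct route and route_next_hop() are hypothetical and the real routing code may be organized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry, addresses in host byte order. */
struct route {
    uint32_t net;    /* network address */
    uint32_t mask;   /* network mask */
    uint32_t gw;     /* gateway, or 0 for a directly connected network */
};

/* Return the IP the frame should be sent to: the destination itself for
 * a directly connected network, the gateway of the first matching route,
 * or the default gateway when nothing matches. */
static uint32_t
route_next_hop(const struct route *routes, size_t count,
               uint32_t default_gw, uint32_t dst)
{
    size_t i;

    for (i = 0; i < count; ++i) {
        if ((dst & routes[i].mask) == routes[i].net)
            return routes[i].gw != 0 ? routes[i].gw : dst;
    }
    return default_gw;
}
```

The returned address is the one to resolve with ARP before transmitting; for a host on the local network that is the host itself.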

For testing purposes I’ve added a pxe command to pxeboot, which allows testing the project subsystems:

1. pxe arp ip.addr – sends ARP requests for the provided IP. Returns the MAC address of the desired IP, if the host is up.

2. pxe ping ip.addr – sends 5 pings of 32 bytes to the provided IP.

3. pxe route – changes the route table, e.g. ‘pxe route add default′ will set as default gateway. As you see there is no mask; it’s calculated from the IP address class (no CIDR, so…). ‘pxe route print’ shows the current routing table.

4. pxe await – starts an infinite cycle of checking received packets. May be used for pinging the client, or for having the client act as a server. There is no exit from this test module.
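Since ‘pxe route add’ derives the mask from the address class, a classful mask calculation could look like this sketch. The function name classful_mask() is an assumption, and the handling of class D/E addresses is just one possible choice.

```c
#include <stdint.h>

/* Classful network mask for an IPv4 address in host byte order. */
static uint32_t
classful_mask(uint32_t ip)
{
    if ((ip & 0x80000000u) == 0x00000000u)   /* class A: 0xxxxxxx */
        return 0xFF000000u;
    if ((ip & 0xC0000000u) == 0x80000000u)   /* class B: 10xxxxxx */
        return 0xFFFF0000u;
    if ((ip & 0xE0000000u) == 0xC0000000u)   /* class C: 110xxxxx */
        return 0xFFFFFF00u;
    /* Class D/E have no network/host split; treat as a host address. */
    return 0xFFFFFFFFu;
}
```

With this rule an address like 10.0.0.1 gets an 8 bit mask even if the real network is a /24, which is the limitation hinted at by “no CIDR”.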

Well, in fact I need somebody to test the modified pxeboot in a PXE environment, to be sure the code is compatible with most existing PXE implementations (I’ve tested all the implementations available to me and it worked, but who knows…). If somebody can find the time to test it, the help will be greatly appreciated.

Here is a tar.bz2 of the modified version of pxeboot. You need to bring up a DHCP server with appropriate settings, and a TFTP service.

First result

Sunday, May 20th, 2007

Well, first of all, about pxe.h. It has an error in a structure name, t_PXENV_UNDI_INITALIZE; it took some time to find which letter was missing. Other issues with this file: the header has no include guard (which bothered me while compiling, because it was included twice), and the PXENV_UNDI_GET_STATE related structure and definitions are missing. I’ve updated it only locally, but maybe it’ll be better to make a perforce mapping later.

After successful compilation of libpxe_http.a, I thought all was good; however, the linker thought differently. There was an “R_386_16 relocation truncated” error. Now I know it was a gas ‘feature’: it’s rather problematic to write mixed 16 and 32 bit code with it, but at first I was confused. Now pxe_isr.S uses manually rewritten segment:offset addresses, based on calculating the segment and offset to data_start. I thought about using the code segment as the data segment (in fact the data used is placed in the code segment), but the first try gave me bad results, so for now it will be somewhat ugly, but working, code.
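The segment:offset arithmetic involved is simple enough to show in C. This is only an illustration of the real mode address formula (linear = segment * 16 + offset), not the actual code from pxe_isr.S; struct farptr and both helper names are made up.

```c
#include <stdint.h>

/* A real mode far pointer. */
struct farptr {
    uint16_t segment;
    uint16_t offset;
};

/* Split a linear address below 1 MiB into a normalized segment:offset
 * pair, where the offset is always in the range 0..15. */
static struct farptr
linear_to_far(uint32_t linear)
{
    struct farptr p;

    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xF);
    return p;
}

/* Inverse operation: segment * 16 + offset. */
static uint32_t
far_to_linear(struct farptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}
```

Many segment:offset pairs map to the same linear address; the normalized form above is just one convenient choice when the address of a symbol is known only at link time.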

The ARP related code was already close to a working state, so the main problem was how to test it. I modified pxe.c (which implements the pxenet device and NFS (via UDP) loading of the kernel), just to start the ARP test (this test sends a broadcast who-is request for the provided IP and receives replies, updating the internal arp_table structure). After some tries I understood that the pxe_core_receive() code was doing nothing, because UNDI returns a pointer to a buffer that is not accessible from userspace (which starts at 0xa000; the buffer is at lower addresses). So __mem_copy() in pxe_isr.S was born, destined to copy data in vm86 mode, and the data_buffer array finally came in handy as the place to copy to. There was also a bug in the receiving cycle, so I made a debug macro and started printing everything useful and useless to the screen. And I found that the receive buffer includes the media header, while transmit (at least for protocols known to UNDI) doesn’t include the ethernet header.

Meanwhile, the command ‘arp’ was added to the loader (just to check whether I understand correctly how to add commands there). It performs the ARP test (but to tell the truth, it doesn’t work, because the pxe related code is started earlier by autoboot, and with it the other test, from the pxe.c function pxe_open(), which shuts down UNDI after all).

After fixing some bugs and rebuilding, the ARP test began working in a virtual machine (I am omitting tonnes of ‘unknown hardware address format’ messages in dmesg :) and some stupid errors with the ethernet broadcast address). It sends a broadcast message for my home computer in the local net, receives the reply, caches it and finishes. Well, I was so happy that I even rebooted my desktop to try the test on one of its NICs (internal Realtek 8168; there is also a 3Com 3C905C-TX, but I haven’t tested on it yet). Well, as always, reality was cruel. All pxe_core_transmit() calls were unsuccessful. I tried another virtual machine (with a different PXE implementation) and got the same result.

I think the problem is an incorrect sequence of UNDI service initialization in pxe_core_init(). The PXE spec says that at the remote.0 stage (started pxeldr) it’s possible to stop everything (STOP_BASE, UNLOAD_STACK, STOP_UNDI) and start remote.2 (the same executable in my case), initializing UNDI again, but after that even the first virtual machine rejected my code (mmm… in fact I also did UNDI_CLOSE, UNDI_CLEANUP, UNDI_SHUTDOWN). There are also two mysteries: the GET_STATE function always returns a ‘failed’ result, but the returned UNDI state value doesn’t seem that incorrect. And some transmit calls fail with Status = 0 (which is something like a ‘no error’ status). It’s strange (if I understand correctly, a call has failed if the ax register != 0).

So the main problem for the next few days is to find the correct initialization sequence of calls, if that is where the problem is. And when the problem is solved, to start the first testing and implementation of the IP protocol.

Three days of silliness

Friday, May 11th, 2007

Well, I’ve started implementing both variants of interrupt handling: the first in real mode, the second in supervisor mode.

I’m such an idiot :) I tried to use a call gate to change the privilege level from ring 0 to ring 3. I dug through memory dumps trying to understand why the code causes #GP. I fixed stack switching issues (that was the first reason for the fault), made a correct call gate installation… but the problems were still with me every day and a huge bit of the night. The reason is simple: an lcall via a call gate that decreases the privilege level is impossible. Hmm. Next time I’ll read the Intel specs more times per day, even if it consumes the time I spend reading Ray Bradbury’s stories. But anyway, this stupidity of mine helped me get along with memory dumps in a virtual machine and improved my knowledge of AT&T syntax.

The other way of handling, which I’m also implementing (you may call it the “main branch” of the project), is handling in real mode by catching interrupts reflected to vm86 mode from the vm86 monitor. It’s adapted from the Intel PXE SDK with slight differences.

The main thing to think about is the interface for packet receiving.

The real mode code now assumes that network operations are performed in a cycle from the start to the end of an exchange operation, without interruption for other needs. Something like this pseudocode (not formatted well, but it’s the best I could get…):

pxe_poll() {

    /* nothing to do until the ISR has flagged an interrupt */
    if (0 == __pxe_isr_occured)
        { return 0; }

    if (pxe_packet_recv())
        { return 1; }

    return 0;
}

while (1) {

    if (pxe_poll())
        { do_receive(); }

    if (received_all)
        { break; }
}


What problems are there here? If we use a TCP connection and received_all is triggered, it doesn’t mean the sender has received our ACK for the last needed packet, or that he received the FIN. After received_all is true, we begin the next stage of working with this file, and meanwhile the sender may continue sending packets (he lost the ACK and FIN for some reason…). These packets will spam the NIC’s receive queue for some time and will cause a queue overflow (the same happens if we are not checking whether we received a packet within a reasonable time interval). Well, maybe it’s not such a big thing for http, but anyway it’s not a very clean and shiny solution.

After understanding the problem with privilege levels and call gates, I thought about a slightly different way.

The ISR is executed in ring 0 (CPL0) and performs all receive/send/resend operations, using user data selectors and code in userspace. If it works, of course. After the call gate issue, I’m rather cautious in my thoughts.

So this code will handle the interrupt and get all packets that match the installed packet filters. Packet filters are installed via pxe_socket() calls. If a packet fits a filter, then it’ll be stored in the packet queue (currently handled in pxe_core); if not, it is dropped. So the TCP related code must also run in ring 0 (thank the gods, UDP doesn’t need this). User code will query ring 0 via calls, or maybe via direct access to structures in userspace. The first case makes it simpler to synchronize adding/removing packets in the pxe_core packet queue.

Well, it’s a big enough post; it’s time to finish it and start implementing more of ICMP (the main goal to achieve before the beginning of summer of code).