CS 536 Fall 2025

Lab 3: Lightweight and Reliable File Transport, Traffic Monitoring [250 pts]

Due: 10/15/2025 (Wed.), 11:59 PM

Objective

The objectives of this lab are to monitor Ethernet LAN traffic by capturing and analyzing Ethernet frames, and to implement a fast and reliable file transfer app.


Reading

Read chapter 3 from Peterson & Davie (textbook).


Problem 1 [70 pts]

1.1 System set-up, traffic generation, and capture

In this problem, you will sniff Ethernet frames on an Ethernet interface, veth0, on one of the amber machines in our HAAS G050 lab. To sniff Ethernet frames, operating systems including Linux require the interface to operate in promiscuous mode, which requires superuser privilege. On an amber machine, run

% sudo /usr/local/etc/tcpdumpwrap-veth0 -c 32 -w - > etherlogfile

which will capture 32 Ethernet frames and save them into etherlogfile. Enter your password when prompted. Create v1/ under lab3/ and place etherlogfile in v1/. tcpdumpwrap-veth0 is a wrapper of tcpdump that allows sudo execution. Check the man page of tcpdump for available options. To generate traffic arriving on veth0, use the ping app from Problem 3, lab2, with the server running on an amber machine bound to IP address 192.168.1.1. The client is executed from the same machine using

% veth 'udppingc 192.168.1.1 portnum secret portnum2 pcount'

where veth executes udppingc at a machine with IP address 192.168.1.2. Thus the ping client transmits/receives packets on interface 192.168.1.2, and the ping server transmits/receives traffic through interface 192.168.1.1. 192.168.1.1, which has been configured for veth0, is a private IP address that is not routable on the global IP Internet. 192.168.1.2 is the IP address at the opposite end of veth0, as if the two interfaces were connected by a point-to-point Ethernet link. Running

% veth 'ifconfig veth0'

will show the configuration of veth0 at the opposite/remote end whose IPv4 address is 192.168.1.2 as noted above.

For security reasons, we cannot perform sniffing on eth0 which is the interface of the lab machines, a shared resource. Therefore we use virtual/dummy interfaces in Linux that allow veth0 to be configured as a separate Ethernet interface -- albeit virtual, not physical -- with private IP address 192.168.1.1 that can reach 192.168.1.2, and vice versa. Thus performing veth at 192.168.1.2 on an amber machine does not execute udppingc on a different physical machine equipped with an Ethernet interface veth0 with IP address 192.168.1.2. Instead, both server and client run on the same physical machine, and packet forwarding is handled virtually by Linux as if 192.168.1.2 were a physical Ethernet interface on a separate machine. For our Ethernet frame sniffing and inspection exercise, this will suffice.

1.2 Traffic analysis

Use a test file and payload size so that at least 32 Ethernet frames are generated by the file transfer client/server app which will be captured by tcpdumpwrap-veth0. After doing so, analyze etherlogfile using wireshark or tcpdump (tcpdump is also an analysis tool). Wireshark (/usr/bin/wireshark), the successor of ethereal, is a popular graphical tool for analyzing (as well as capturing) traffic logs in pcap format. Use wireshark or tcpdump to inspect the 32 captured Ethernet frames. Using the MAC address associated with 192.168.1.1 (perform ifconfig -a on an amber machine) and the MAC address associated with 192.168.1.2 (perform veth 'ifconfig -a' on the same amber machine), identify the relevant Ethernet frames whose payloads are IPv4 packets that, in turn, contain as payload the UDP packets generated by the client/server ping app. Inspect the type field of the captured Ethernet frames to determine if they are DIX (i.e., Ethernet II) frames.

The first 20 bytes of the Ethernet payload comprise the IP header (assuming no IP options are present) and the next 8 bytes the UDP header. The last 8 bytes of the IP header specify the source IP address and the destination IP address. Check their values against the IP addresses used by the file transfer app. The first four bytes of the UDP header specify the source and destination ports. Check that they match the port numbers used by the client/server. Inspect the remaining bytes of the captured Ethernet frames which comprise the application layer payload communicated between client and server by calling sendto(). Wireshark/tcpdump will provide output where Ethernet header fields are already decoded. Inspect the captured raw data in hexadecimal form to match the IPv4 addresses and UDP port numbers. Use wireshark/tcpdump as a confirmation tool. Do the same when analyzing the application layer payload carried by the Ethernet frames. Discuss your findings in lab3.pdf.
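The byte offsets described above can be checked by hand with a small decoder. The following is only a sketch, assuming an Ethernet II frame with a 14-byte header (no VLAN tag) and an IPv4 header without options; the names frame_fields, be16, and decode_frame are hypothetical and not part of the lab.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields inspected in Problem 1.2. */
struct frame_fields {
    uint16_t ethertype;   /* 0x0800 means IPv4 in an Ethernet II frame */
    uint8_t  src_ip[4];
    uint8_t  dst_ip[4];
    uint16_t src_port;
    uint16_t dst_port;
};

/* Read a 16-bit value stored in network (big-endian) byte order. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the frame is too short, is not IPv4/UDP,
 * or carries IP options (header length other than 20 bytes). */
int decode_frame(const uint8_t *f, size_t len, struct frame_fields *out)
{
    assert(f != NULL && out != NULL);
    if (len < 14 + 20 + 8)
        return -1;
    out->ethertype = be16(f + 12);           /* type field follows the two MACs */
    if (out->ethertype != 0x0800)
        return -1;
    const uint8_t *ip = f + 14;              /* IP header starts the payload */
    if ((ip[0] >> 4) != 4 || (ip[0] & 0x0f) != 5)
        return -1;                           /* assumption: no IP options */
    if (ip[9] != 17)
        return -1;                           /* protocol 17 is UDP */
    for (int i = 0; i < 4; i++) {
        out->src_ip[i] = ip[12 + i];         /* last 8 bytes of the IP header */
        out->dst_ip[i] = ip[16 + i];
    }
    const uint8_t *udp = ip + 20;
    out->src_port = be16(udp);               /* first 4 bytes of the UDP header */
    out->dst_port = be16(udp + 2);
    return 0;
}
```

Comparing the values obtained this way against the decoded output of wireshark/tcpdump is one way to confirm that the hexadecimal dump was read correctly.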

Note: To run tcpdump, please use the command, % tcpdump -r - < etherlogfile, instead of, % tcpdump -r etherlogfile, which will trigger an "access denied" error. You may also run the command-line version of wireshark, /usr/bin/tshark, instead of tcpdump. To run wireshark, you will have to be physically at a lab machine. If you prefer, you may install wireshark on a Windows, MacOS, or Linux machine, copy etherlogfile to the machine and run wireshark to inspect captured frames.


Problem 2 [180 pts]

Implement a UDP-based reliable file transfer protocol, suft (small UDP-based file transport), in v2/ that is suited for small files transported in low loss network environments. The sender, suft, is executed at a host with arguments

% suft <rcvip> <rcvport> <filename>

where rcvip and rcvport specify the receiver's IPv4 address (dotted decimal) and port number. The third argument specifies the name of the file to be transported where, for simplicity, we impose the restriction that the name must be exactly 6 bytes long (excluding '\0' when treated as a string) and consist of lower-case alphabet characters only. The receiver, suftd, is executed as

% suftd <rcvport>

where rcvport specifies the port number that it binds to in order to process incoming UDP packets. The receiver performs file I/O to save the file's content only after the last byte has been received. This reduces the influence of file system on network protocol performance.

2.1 Sender behavior

Create a subdirectory v2/sender/ where the sender is coded. Upon startup suft reads a text file, sender.param, where sender-side parameters are stored. The first parameter, call it maxfilesize, a positive integer, specifies the maximum file size allowed (in unit of byte). When suft opens a file to transfer and determines that its size exceeds maxfilesize, it outputs a suitable error message to stdout before terminating. The second parameter, call it micropace, a nonnegative integer, denotes a time interval (in unit of microsecond) which is used by the sender to regulate how fast it transmits data packets to the receiver. The third parameter, payloadsize, a positive integer, specifies the fixed size of the part of the UDP payload which carries the bytes of the file. Unless file size is an integer multiple of payloadsize, the last UDP data packet will have a payload smaller than payloadsize. We will limit payloadsize to not exceed 1400 to allow part of UDP's payload to be utilized for bookkeeping/management purposes.

When evaluating suft in our HAAS G050 lab environment, we need to consider that user directories are remote mounted on the amber machines through the networked file system NFS which introduces its own management and network communication overhead. Although the focus of Problem 2 is on correctness, a secondary concern is performance where our aim is to remove the influence of NFS to the extent meaningful. Since suft is targeted at transporting small files, we mitigate the overhead of NFS by first reading the entire file into main memory before UDP data transmission is initiated. In a similar vein, the receiver suftd stores received UDP data in main memory, writing the memory content to a file when reliable transport of file data has completed. By taking a timestamp before the first data packet is transmitted and recording a second timestamp after acknowledgement from the receiver that all segments of a file have been received, their difference can be used to gauge the contribution of suft to file transfer completion time. Following are the main steps carried out by the sender.

Read file into memory. Allocate heap memory to hold all bytes of an input file and read its content into main memory. Abide by the dictum of modularity by performing such operations in separate functions coded in separate files. For example, if main memory to hold file content is organized as a 1-D char array, different parts of a file can be passed to the sendto() system calls by reference along with its length.
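One possible shape for this step is sketched below. It is an illustration only: the function name read_file and its interface are hypothetical, and the size is found with fseek()/ftell(), which is one of several reasonable choices (fstat() is another).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole file at path into a heap buffer.
 * On success returns the buffer (caller frees it) and stores the byte count
 * in *size. Returns NULL if the file cannot be read or is larger than
 * maxfilesize. */
char *read_file(const char *path, long maxfilesize, long *size)
{
    assert(path != NULL && size != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buf = NULL;
    if (fseek(fp, 0, SEEK_END) == 0) {
        long n = ftell(fp);
        if (n >= 0 && n <= maxfilesize && fseek(fp, 0, SEEK_SET) == 0) {
            buf = malloc(n > 0 ? (size_t)n : 1);   /* avoid malloc(0) */
            if (buf != NULL && fread(buf, 1, (size_t)n, fp) != (size_t)n) {
                free(buf);                          /* short read: give up */
                buf = NULL;
            }
            if (buf != NULL)
                *size = n;
        }
    }
    fclose(fp);
    return buf;
}
```

With the content held in a single char array, the k'th segment can be handed to sendto() as a pointer into the array plus a length, without further copying of the file.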

Data structure to track successful data segment transmission. Using the ceiling value of file size divided by payloadsize, call it numpackets, as a bound, use a data structure to track which data packets have been successfully transmitted. For example, it may be a simple 1-D integer array where successfully transmitted data segments are marked as 1, otherwise 0. Successful transmission means receiving positive acknowledgement from the receiver suftd that a data segment has been received.
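The arithmetic behind numpackets and the size of the last segment can be sketched as follows. The helper names num_packets and segment_len are hypothetical; only integer arithmetic is used so that no floating-point ceiling is needed.

```c
#include <assert.h>

/* Ceiling of filesize / payloadsize using integer arithmetic only. */
unsigned int num_packets(unsigned int filesize, unsigned int payloadsize)
{
    assert(payloadsize > 0);
    return filesize / payloadsize + (filesize % payloadsize != 0);
}

/* Number of file bytes carried by the k'th segment: payloadsize for every
 * segment except possibly the last one, which carries the remainder. */
unsigned int segment_len(unsigned int k, unsigned int filesize,
                         unsigned int payloadsize)
{
    assert(k < num_packets(filesize, payloadsize));
    unsigned int left = filesize - k * payloadsize;
    return left < payloadsize ? left : payloadsize;
}
```

For example, a 20001-byte file with payloadsize 1000 yields 21 packets, the last carrying a single byte. A tracking array of numpackets entries could then be obtained with calloc() so that every entry starts at 0.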

Communicate parameters to receiver. suft sends a UDP packet to the receiver that contains the following information: 6B filename, size of file (in unit of byte) as 4B unsigned int, payloadsize as 4B unsigned int. Before transmitting the packet, suft sets a timer to expire after 250 msec. If a response from suftd is not received before the timer expires, use a SIGALRM handler to retransmit a packet containing the parameter payload. suft gives up after 5 attempts, outputting a suitable error message to stdout before terminating.
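A possible encoding of the 14-byte parameter payload is sketched below. Transmitting the two integers in network byte order via htonl() is an assumption of this sketch, as are the names FNAME_LEN, PARAM_PKT_LEN, pack_params, and unpack_params; the lab leaves the byte ordering decision to you.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FNAME_LEN     6
#define PARAM_PKT_LEN (FNAME_LEN + 4 + 4)   /* filename, file size, payloadsize */

/* Sender side: lay out the parameter payload.
 * Assumption: integers travel in network byte order. */
void pack_params(unsigned char buf[PARAM_PKT_LEN], const char *fname,
                 uint32_t filesize, uint32_t payloadsize)
{
    assert(strlen(fname) == FNAME_LEN);
    uint32_t fs = htonl(filesize);
    uint32_t ps = htonl(payloadsize);
    memcpy(buf, fname, FNAME_LEN);            /* no '\0' is transmitted */
    memcpy(buf + FNAME_LEN, &fs, 4);
    memcpy(buf + FNAME_LEN + 4, &ps, 4);
}

/* Receiver side: recover the three fields; fname gets a terminating '\0'. */
void unpack_params(const unsigned char buf[PARAM_PKT_LEN],
                   char fname[FNAME_LEN + 1],
                   uint32_t *filesize, uint32_t *payloadsize)
{
    uint32_t fs, ps;
    memcpy(fname, buf, FNAME_LEN);
    fname[FNAME_LEN] = '\0';
    memcpy(&fs, buf + FNAME_LEN, 4);
    memcpy(&ps, buf + FNAME_LEN + 4, 4);
    *filesize = ntohl(fs);
    *payloadsize = ntohl(ps);
}
```

Using memcpy() into a byte buffer, rather than casting the buffer to a struct pointer, sidesteps struct padding and alignment concerns.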

Synchronous actions. The overall structure of the sender protocol is comprised of a synchronous part and an asynchronous part. The synchronous logic comprises an infinite loop wherein a for-loop bounded by numpackets checks if each data segment of payloadsize (except for the last segment, which may be smaller) has been successfully transmitted, and, if not, retransmits it. The data segments of a file are marked by a 4-byte sequence number (type unsigned int), starting at 0 and going up to numpackets - 1. Hence the k'th data segment of a file is transmitted by calling sendto() where the first four bytes of the UDP payload are the integer k, followed by the bytes of the k'th data segment. As in lab2, please consider if byte ordering concerns exist that require action. The for-loop is repeated until all data segments are acknowledged as having been received. gettimeofday() is called before the first data segment is transmitted and called a second time when the for-loop detects that all data segments have been received. Their difference, in unit of millisecond, is output to stdout before the sender terminates. In general, use exit(0) to indicate normal termination and exit(1) to indicate termination due to error conditions. For any data segment, at the 11'th retransmission attempt the sender gives up, printing a suitable error message to stdout before terminating. Successive data segment transmissions are paced by calling usleep() with argument micropace which specifies the number of microseconds to sleep before calling sendto().
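The layout of a data packet described above can be sketched as follows. The function name build_data_packet and the constant SEQ_LEN are hypothetical, and sending the sequence number in network byte order via htonl() is an assumption of this sketch, not a requirement stated by the lab.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEQ_LEN 4   /* bytes used for the sequence number */

/* Fills pkt with [sequence number k][bytes of the k'th segment] and returns
 * the total number of bytes to pass to sendto().
 * pkt must have room for SEQ_LEN + payloadsize bytes.
 * Assumption: the sequence number travels in network byte order. */
size_t build_data_packet(unsigned char *pkt, const char *filebuf,
                         uint32_t filesize, uint32_t payloadsize, uint32_t k)
{
    assert((uint64_t)k * payloadsize < filesize);
    uint32_t offset = k * payloadsize;
    uint32_t len = filesize - offset;        /* bytes left from this segment on */
    if (len > payloadsize)
        len = payloadsize;                   /* only the last one may be shorter */
    uint32_t seq = htonl(k);
    memcpy(pkt, &seq, SEQ_LEN);
    memcpy(pkt + SEQ_LEN, filebuf + offset, len);
    return SEQ_LEN + len;
}
```

In the for-loop, each unacknowledged k would be preceded by usleep(micropace) and followed by a sendto() of the returned length.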

Asynchronous actions. The asynchronous part of the sender protocol is responsible for marking data segments as having been successfully transmitted when feedback UDP packets are received from suftd. Packet receive events generate interrupts that are exported by Linux as SIGIO (or SIGPOLL) that can be caught by the sender by registering a SIGIO signal handler. The receiver will send a cumulative positive ack where sequence number s indicates that all data segments with sequence number r (strictly) less than s have been received. Hence data segment of sequence number s is the first data segment that the receiver is missing (and expecting). If s equals numpackets then the response indicates that all data segments have been received. When asynchrony is introduced care needs to be exercised to avoid corruption of shared data structures by the synchronous and asynchronous parts of the code. In some cases, techniques such as temporary masking of signals, use of mutex, in addition to consideration of reentrance may be needed to achieve correct operation. In other cases, if there are clear dividing lines between readers and writers of shared data structures then complications and overhead stemming from concurrency may be avoided. Please describe in lab3.pdf how you approached this issue.
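One way to keep a clear dividing line between writer and reader is sketched below: the SIGIO handler is the only code that writes the tracking array (and only ever changes an entry from 0 to 1), while the synchronous loop only reads it. This is a sketch under that assumption, with hypothetical names apply_cumulative_ack and first_unacked; it is not the only acceptable design.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Called from the SIGIO handler after an ACK carrying sequence number s has
 * been read from the socket. Marks segments 0 .. s-1 as acknowledged.
 * Returns 1 if s equals numpackets (transfer complete), 0 otherwise.
 * An out-of-range s is ignored. Only simple stores to sig_atomic_t entries
 * are performed, so no library call is made from signal context here. */
int apply_cumulative_ack(volatile sig_atomic_t *acked,
                         unsigned int numpackets, unsigned int s)
{
    if (s > numpackets)
        return 0;
    for (unsigned int r = 0; r < s; r++)
        acked[r] = 1;
    return s == numpackets;
}

/* Called from the synchronous loop: index of the first segment not yet
 * acknowledged, or numpackets if every segment has been acknowledged. */
unsigned int first_unacked(const volatile sig_atomic_t *acked,
                           unsigned int numpackets)
{
    assert(acked != NULL);
    unsigned int k = 0;
    while (k < numpackets && acked[k])
        k++;
    return k;
}
```

If the synchronous part also needed to modify shared state (for example, per-segment retransmission counters updated from both sides), temporarily blocking SIGIO with sigprocmask() around that code would be one option.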

2.2 Receiver behavior

Create a subdirectory v2/receiver/ where the receiver is implemented. Following are the main steps carried out by the receiver.

Process new file transfer initiation. Upon receiving a UDP packet on the port specified in the command-line argument, suftd extracts the 6-byte filename, 4B file size, and 4B payload size from its payload. The filename is used to create a new file whose name consists of the 6-byte filename appended with "_n". From the file size and payload size the receiver calculates numpackets which enables determining the sequence number of the last data segment to be transmitted by the sender. The receiver transmits to the sender a UDP packet whose payload contains the 6-byte filename which serves as a response that acknowledges acceptance of a new file transfer session. Before transmitting the response a timer is set for 250 msec. If no packet is received before the timer expires, a response packet containing the same payload is retransmitted. Note that a duplicate file transfer initiation packet may arrive from the same sender before the timer expires. Describe in lab3.pdf how your implementation handles this situation. A file transfer session commences when the first UDP packet containing the first data segment with sequence number 0 arrives. For simplicity we will ignore scenarios where UDP packets from other senders (i.e., different IPv4 address and/or port number) arrive after the first file transfer initiation packet is received. Ignoring means discarding UDP packets not from the same IPv4 address and port number. suftd gives up after 5 attempts, ready to accept new file transfer initiation requests.
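Discarding packets that do not come from the sender of the initiation packet amounts to comparing the address filled in by recvfrom() against the one saved when the session started. A minimal sketch, with the hypothetical helper name same_sender:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Returns 1 if a and b denote the same IPv4 address and UDP port, else 0.
 * Both fields are compared as stored (network byte order), so no conversion
 * is needed. */
int same_sender(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    assert(a != NULL && b != NULL);
    return a->sin_family == b->sin_family
        && a->sin_addr.s_addr == b->sin_addr.s_addr
        && a->sin_port == b->sin_port;
}
```

Comparing individual fields avoids relying on memcmp() over the whole struct, whose padding bytes (sin_zero) are not guaranteed to match.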

Allocate memory to store file data. suftd allocates heap memory to store data segments containing file data. For example, a 1-D char array may be used where the sequence number, multiplied by the payload size, gives the offset into the array at which file content from the UDP payload is copied. The content of the heap memory is written to the new file (ending in "_n") after all data segments have been received.

Data structure to track successful data segment receipt. Similar to the sender, devise a data structure to keep track of which data segments have been received. A variable, nextsegmentexpected, contains the sequence number of the first data segment that is missing. It is updated as UDP data packets are received. When a data packet is received, an ACK packet is transmitted to the sender whose 4B payload consists of nextsegmentexpected. Thus a cumulative positive ACK is returned to the sender upon receiving any UDP data packet from the sender. For example, if the receiver's state is

0 1 2 _ 4 5 _ 7

where segments with sequence numbers 3 and 6 are missing, when segment 6 arrives leading to

0 1 2 _ 4 5 6 7

an ACK packet is emitted containing sequence number 3. By assumption of "small" file size, future segments have guaranteed space to store even though one or more past segments may still be missing and await retransmission/receipt.
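The update of nextsegmentexpected in the example above can be sketched as follows. The function name record_segment and the use of an unsigned char array as the receipt tracker are hypothetical choices made for illustration.

```c
#include <assert.h>

/* Records receipt of segment seq, then advances nextsegmentexpected past
 * every contiguous segment that has already been received.
 * Returns the updated value, which is what the cumulative ACK carries.
 * A seq outside 0 .. numpackets-1 leaves the state unchanged. */
unsigned int record_segment(unsigned char *received, unsigned int numpackets,
                            unsigned int nextsegmentexpected, unsigned int seq)
{
    assert(nextsegmentexpected <= numpackets);
    if (seq < numpackets)
        received[seq] = 1;
    while (nextsegmentexpected < numpackets && received[nextsegmentexpected])
        nextsegmentexpected++;
    return nextsegmentexpected;
}
```

With the state from the example, the arrival of segment 6 still returns 3, whereas a later arrival of segment 3 moves the value directly to 8 because segments 4 through 7 are already present.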

Synchronous actions. The design of the receiver follows a synchronous structure where, upon entering the data segment transmission phase, suftd blocks on recvfrom(). When recvfrom() returns, an ACK packet containing nextsegmentexpected is sent to the sender after updating the data structure for tracking data segment receipts. In this sense, data ACK generation is passive, a response to data segment arrivals. There is no timer set to actively retransmit ACK packets that may be lost.

Session termination. File transfer session termination is initiated by the receiver by sending an ACK packet containing sequence number nextsegmentexpected. A complication arises since the ACK packet may get lost which leaves the sender in the dark about the fate of the file transfer. The last ACK packet containing nextsegmentexpected is transmitted 10 times in succession to increase the likelihood that one of the ACKs will be received by the sender. Note that the sender will eventually give up if the last ACK is not received after 10 retransmission attempts.

2.3 Controlled packet loss

Under "normal" operating conditions our lab machines connected by Ethernet switches may not drop any packets during a small file transfer session. To help determine correctness and gauge performance, we will introduce controlled packet losses at receiver side (i.e., a loss model). At start-up suftd will read from a text file, receiver.lossmodel, containing up to 10 nonnegative integers (sorted in increasing order) which specify the sequence numbers of data segments to be ignored/discarded when they arrive at the receiver. For example, if the integers are 20, 35, 51 then upon returning from recvfrom() in the synchronous part of the receiver code the payload's sequence number is compared against 20, 35, 51 and, on a match, the packet is discarded, but only once per listed sequence number. That is, a retransmit of 20 is not discarded a second time. A simple brute-force method is to loop over a 1-D array containing the sequence numbers whose packets are to be dropped. Since the sequence numbers are monotonically increasing there are more efficient ways to determine if a UDP data packet needs to be dropped. For up to 10 sequence numbers overhead will not be an issue.

Do the same for the sender loss model where sender.lossmodel is a text file containing up to 10 sequence numbers contained in ACK packets to be discarded at the sender. A difference at sender side is that its asynchronous part checks if an arriving ACK packet needs to be ignored.
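The brute-force, drop-once check can be sketched as below. The struct lossmodel and the functions load_lossmodel and should_drop are hypothetical names, and the loader assumes the integers in the file are separated by whitespace (a file using commas would need different parsing).

```c
#include <assert.h>
#include <stdio.h>

#define MAX_LOSS 10

/* Hypothetical in-memory form of receiver.lossmodel or sender.lossmodel. */
struct lossmodel {
    unsigned int seq[MAX_LOSS];   /* sequence numbers to drop */
    int pending[MAX_LOSS];        /* 1 until that entry has caused one drop */
    int count;                    /* number of valid entries */
};

/* Reads up to MAX_LOSS whitespace-separated nonnegative integers from fp.
 * Returns the number of entries loaded. */
int load_lossmodel(FILE *fp, struct lossmodel *lm)
{
    lm->count = 0;
    while (lm->count < MAX_LOSS &&
           fscanf(fp, "%u", &lm->seq[lm->count]) == 1) {
        lm->pending[lm->count] = 1;
        lm->count++;
    }
    return lm->count;
}

/* Returns 1 if a packet carrying seq should be discarded now. Each listed
 * sequence number triggers at most one drop, so a retransmission passes. */
int should_drop(struct lossmodel *lm, unsigned int seq)
{
    assert(lm->count >= 0 && lm->count <= MAX_LOSS);
    for (int i = 0; i < lm->count; i++) {
        if (lm->pending[i] && lm->seq[i] == seq) {
            lm->pending[i] = 0;
            return 1;
        }
    }
    return 0;
}
```

The same check could serve the sender, where it would be applied to the sequence number carried by an arriving ACK; if it is called from the SIGIO handler, note that it modifies the pending flags.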

2.4 Testing

Correctness. The focus of the lab assignment is on correct operation of the file transport protocol. When conducting correctness tests, it is recommended to proceed in a step-by-step modular fashion where simpler versions of the final system are tested first before incrementally adding components/functionality. Coding the entire functionality as noted in the description followed by testing is, in many instances, more difficult and significantly more time consuming. For example, test the system without introducing packet losses (outside of real-world environmental losses that can occur for different reasons), for a smallish file size (e.g., 20 KB), data payload size 1000 B, and slow packet pacing (e.g., micropace set to 50000, i.e., 50 msec) which keeps a lid on potential complications. Then incrementally add complexity, at each turn performing sanity testing. Provide separate README and Makefile for v2/sender and v2/receiver.

Performance. Although application performance with respect to reducing completion time is a secondary concern for lab3, after establishing correctness tune the parameters of the system to gauge how they contribute to protocol performance with the aim of identifying a reasonably effective parameter set. For example, all else being equal, increasing data payload size, up to a point, will reduce completion time since more file data are packed per UDP payload. The same goes for reducing the packet spacing micropace. If optimizing performance were monotonic where reducing (or increasing) a parameter results in continued performance improvement followed by a plateau (saturation), life would be simple. In most real-world systems, monotonicity breaks down so that neither "too little" nor "too much" yields desirable performance but there is a point or region of "just enough" that we seek to identify, along with an understanding of why. For file sizes around 100 KB (e.g., 100 packets if payload is 1 KB) try to gauge what performance can be attained in our lab environment where latency is small, bandwidth high, and losses (for the most part) low. Discuss your results and findings in lab3.pdf. Up to 10 implementations deemed by the TAs to be especially noteworthy with respect to performance (assuming correctness) will receive an additional 25 bonus points.


Bonus problem [30 pts]

The Bonus Problem may be approached in the default way by utilizing the 802.11 traffic traces provided, or by capturing your own WLAN traffic. Use one of the two approaches, not both. In the first approach, you will find three 802.11 frame capture files, m2*.pcap, in the course directory. Use Wireshark to provide a coarse analysis of the captured traffic. Each file contains roughly a 30-second walltime 802.11 traffic trace captured in HAAS. Since two of the files are not small (about 10000-11000 frames), you may choose a subinterval of 0.5-second duration to analyze the data. If so, please specify for each file which interval you chose. The smaller file contains only about 500 frames. For each file, describe basic features of captured traffic such as which 802.11 network (e.g., 802.11g or 802.11a, frequency band) the frames belong to, basic service sets, types of frames (e.g., beacon, RTS/CTS), data rates (e.g., 6 Mbps indicates that more error correction is applied to protect against noise vs. 54 Mbps), among other factors that may be relevant. The aim is to provide a high level characterization of the observed WLAN networks that gives a synopsis of their structure and activities. To earn full credit, it would be useful to describe signal levels (e.g., SNR) of high traffic basic service sets and the MAC addresses of their access points. In the last part of your analysis, compare your results across the three data sets, highlighting any differences.

The second approach entails capturing your own WLAN trace and providing an analysis similar to the above. Capturing 802.11 traffic is not straightforward and is system dependent. For example, WLAN drivers in specific Linux, Windows, MacOS operating systems for specific 802.11 interfaces may not support monitor mode (analogous to promiscuous mode for WLANs). Even when supported, the driver may not export captured 802.11 frames but instead 802.3 frames that carry 802.11's payload while discarding 802.11 specific frame information. The second approach is meaningful only if you have access to a device where you have a root (superuser, administrator) account and the system allows capture of 802.11 frames that may be inspected using Wireshark to analyze traffic. If you are following the second approach, please specify the system environment (OS, 802.11 interface/card, driver, when/where traffic was captured) in addition to providing your analysis. Please keep in mind that you may -- after spending time to research -- determine that your system cannot capture raw 802.11 frames, or that doing so entails significant time investment to configure your environment (possibly involving coding) to enable 802.11 frame capture. If you are interested in Wi-Fi and wireless systems in general, the time spent may yield productive insight, but, otherwise, following the first approach is straightforward and recommended.

The Bonus Problem is completely optional. It serves to provide additional exercises to understand the material. Bonus problems help you more readily reach the 40% contributed by the lab component to the course grade.


Turn-in instructions

Electronic turn-in instructions:

i) For problems that require answering/explaining questions, submit a write-up as a pdf file called lab3.pdf. Place lab3.pdf in your directory lab3/. You can use your favorite editor provided that it is able to export pdf files, which several freeware editors do. Files submitted in any other format will not be graded.

ii) We will use turnin to manage lab assignment submissions. Please check that the relevant source code, including Makefiles, is included in the relevant subdirectories of lab3. In the parent directory of lab3, run the command

turnin -c cs536 -p lab3 lab3

You can check/list the submitted files using

turnin -c cs536 -p lab3 -v

This lab is individual effort. Please note the assignment submission policy specified on the course home page.

