In network simulators, it is easy to create a topology, assign tasks to nodes, and monitor every single packet. A basic testbed, without software support that mirrors some of these simulator capabilities, is extremely limited in its usefulness, since it requires the experimenters to be experts in system-level programming. Achieving simulator-like control on physical testbed machines is a significant undertaking that requires extensive utility development. Emulation testbeds such as Emulab and DETER provide topology creation capabilities, but the experimenter only acquires bare machines that form the desired topology, without any tools running on them.
A natural way to describe the tasks that must be performed on the testbed nodes is to use event scripts, much like events in an event-driven simulator. The Emulab software implements certain event types, such as link failures; however, most interaction with the nodes must be performed via a secure shell (SSH) session. Because operating each computer by hand is impractical, especially when timed events are involved, we have designed a flexible mechanism to control all test machines from a central location. We have developed a utility, which we refer to as the Scriptable Event System (SES), that parses a script of timed events and executes it on the test machines. The utility can also receive callbacks, so that events can be synchronized.
Instrumentation and measurement on a testbed also pose a significant challenge. The capability to log and correlate different types of activities and events in the test network is essential. Packet traces are important, but system statistics must be measured as well. We have developed a set of tools to log events on the test nodes on a per-second basis. Statistics such as CPU utilization, packets per second, and memory utilization are logged to the local disk for later inspection.
The SES is composed of a master server and zombies/clients.
MASTER
The master server must be started before the zombies can attach to it. Currently, the only configurable option of the master is the port on which it opens its socket (20666 is the default).
The master can reside either on users.emulab.net or on some experiment node as long as the control network is used to make sure that messages always reach the participating zombies.
Master control
exit - Exit the server and cleanly destroy all the threads.
list - Show all the available zombies.
t "pause" - Block processing of new commands for the given time period.
t name1..nameN "cmd" - Have zombies name1 through nameN perform the command cmd "t" seconds from now.
t name1..nameN "cmd!" - Same as the above, except that the input of new commands is blocked until "cmd" finishes on all of the specified zombies.
t name1..nameN "stop" - Stop all the tasks that are running on the specified zombies. Kills everything in the zombie process group except the zombie (i.e., any processes started from shell scripts will die as well).
^C - Stop all tasks for all the zombies and also purge all tasks from the input stream. Sometimes ^C has to be hit a few times to ensure that all tasks are purged and callbacks are terminated.
run cmd - Run a command and use its output as the input of the server. Thus, a PERL script can contain loops and conditionals and simply print master server commands, which the master server then executes.
Note 1: "t" is always an offset relative to the current moment in
time; however, events that pause input (pause, !) cause the current
time to advance.
Example 1:
5 "pause"
1 node1 "ls > ls.out"
node1 will execute the command "ls > ls.out" 6 seconds after the script
starts (5 seconds of pause plus the 1-second offset).
Example 2:
0 node2 "run_some_test!"
0 node3 "copy_measurements"
node3 will execute the command after the command on node2 finishes. If
the command takes 500 seconds, then node3 will execute its command at the
500 second mark.
Note 2: A node can have more than one task scheduled at a time. The tasks
will be executed in the order they are encountered in the script.
0 node1 "ls > ls.out"
0 node1 "ls -la > la.out"
"ls > ls.out" will be executed before "ls -la > la.out". The tasks will
be executed back to back in the same scheduling round.
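The timing rules in Notes 1 and 2 can be checked with a small Python sketch. It is an illustration of the rules as stated above, not the master server's parser: the function name schedule and the blocking_durations argument are assumptions introduced here, and the run time of a blocking ("!") command, which the real server only learns through callbacks, has to be supplied by hand.

```python
import shlex

def schedule(script, blocking_durations=None):
    """Return (start_time, zombie_names, command) tuples for an SES-style script.

    blocking_durations maps a blocking command (without the trailing "!")
    to an assumed run time in seconds; this stands in for the callback
    that tells the real master when the command has finished.
    """
    blocking_durations = blocking_durations or {}
    now = 0.0  # the "current moment" that every offset is relative to
    events = []
    for line in script.strip().splitlines():
        tokens = shlex.split(line)  # keeps the quoted command as one token
        if not tokens:
            continue
        offset, names, cmd = float(tokens[0]), tokens[1:-1], tokens[-1]
        start = now + offset
        if cmd == "pause":
            now = start  # input is blocked, so the current time advances
            continue
        events.append((start, names, cmd))
        if cmd.endswith("!"):
            # Input resumes only after the blocking command has finished.
            now = start + blocking_durations.get(cmd[:-1], 0.0)
    return events

if __name__ == "__main__":
    example1 = '5 "pause"\n1 node1 "ls > ls.out"'
    example2 = '0 node2 "run_some_test!"\n0 node3 "copy_measurements"'
    print(schedule(example1))
    print(schedule(example2, {"run_some_test": 500}))
```

Under these assumptions, Example 1 yields a start time of 6 seconds for node1, and Example 2 yields a start time of 500 seconds for node3.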
CLIENT
The client is similar to a regular shell, except that it does not support pipes or more advanced features such as pattern matching. A client can start multiple tasks. It then periodically polls the tasks to determine which ones are still active, in order to maintain its task list. The client has to be provided with the name of the machine that runs the master server. Optionally, the port number can be specified if the master is not running on the default port.
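The task bookkeeping described above can be sketched in Python. This is an illustration only, not the SES client code: the names start_task and prune_finished and the 0.5-second polling interval are assumptions, and input/output redirection of the kind used in the example commands is not handled.

```python
import subprocess
import sys
import time

def start_task(argv):
    """Launch a command directly, without a shell (so no pipes or globbing)."""
    return subprocess.Popen(argv)

def prune_finished(tasks):
    """Keep only tasks that are still running.

    Popen.poll() returns None while the process is alive and its exit
    status once it has terminated.
    """
    return [task for task in tasks if task.poll() is None]

if __name__ == "__main__":
    # Two stand-in tasks that sleep for 1 and 2 seconds.
    tasks = [
        start_task([sys.executable, "-c", "import time; time.sleep(%d)" % secs])
        for secs in (1, 2)
    ]
    while tasks:
        time.sleep(0.5)  # polling interval chosen for illustration
        tasks = prune_finished(tasks)
        print("active tasks:", len(tasks))
```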
For convenience, the user can use the "-nt" command argument to reduce the length of the zombie name reported to the server (e.g., node1.proj.group would be reported as node1 to the master server).
tmeas - This tool records a number of system-level statistics. The tool measures
data on all of the interfaces that have a 10.x.x.x address. If a different
address range is used, then MATCH_ADDR in tmeas.c can be modified. The tool
expects the user to provide the name of the file to log to.
(Note: always log to the local disk and do not use NFS while the
experiment is running.) The duration of the run can be specified in
seconds, or the tool can simply be killed (i.e., by sending the stop command in the SES).
The fields in the file are as follows:
timestamp,
bytes_per_sec,
pack_per_sec,
bytes_per_sec_up,
pack_per_sec_up,
memtotal,
memused,
uptime,
idletime,
established TCP connections,
half_open TCP connections,
TCPSlowStartRetrans count,
TCPAbortOnTimeout count,
errs on the device drivers,
drops on the device drivers
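For later inspection, a tmeas record can be mapped onto the field names listed above. The Python sketch below assumes one record per line, numeric values, whitespace-separated columns, and exactly the field order given in the list; none of these layout details are confirmed here, so the separator is left as a parameter. The shortened key names (tcp_established, dev_errs, and so on) and the function parse_tmeas_line are hypothetical.

```python
# Field order as listed above; key names are shortened for illustration.
TMEAS_FIELDS = [
    "timestamp", "bytes_per_sec", "pack_per_sec", "bytes_per_sec_up",
    "pack_per_sec_up", "memtotal", "memused", "uptime", "idletime",
    "tcp_established", "tcp_half_open", "tcp_slow_start_retrans",
    "tcp_abort_on_timeout", "dev_errs", "dev_drops",
]

def parse_tmeas_line(line, sep=None):
    """Turn one log line into a dict of floats keyed by field name.

    sep=None splits on any run of whitespace (an assumption about the
    file layout); pass sep="," if the log turns out to be comma-separated.
    """
    values = [v.strip() for v in line.strip().split(sep)]
    if len(values) != len(TMEAS_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(TMEAS_FIELDS), len(values))
        )
    return {name: float(value) for name, value in zip(TMEAS_FIELDS, values)}

if __name__ == "__main__":
    # Synthetic record, not real tmeas output.
    sample = " ".join(str(n) for n in range(1, 16))
    record = parse_tmeas_line(sample)
    print("memory used fraction:", record["memused"] / record["memtotal"])
```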
cwnd_track - This tool is loosely based on tmeas and is limited in scope
in its current form. Its main goal is to poll TCP congestion window (Cwnd) values
for a given IP address. If there is no connection to the
provided IP address, the tool waits and logs nothing. Once the
connection appears, the tool logs the value along with the
timestamp.
We have written a set of scripts that are helpful in analyzing data from BGP
log files and the "tmeas" tool. Since "tmeas" collects a lot of
system-dependent data per node in a single file, it is essential to be able to
merge similar statistics for several nodes. The resulting merged file
can be easily fed into gnuplot or other plotting tools.
The scripts are short and can be easily examined. A short
overview is given below.
dataPlot.sh is the top-level script that needs to be executed. The user
has to specify the directory that contains the measurement files. tmeas
files need to be named tmeas.nodeX. The script dataPlot.sh can be
modified to specify which nodes the user wants to merge and plot. If
there are any BGP log files (dataPlot.sh assumes that *.log is a BGP log file), then
those files will be aggregated and the total number of BGP update messages will be plotted
as time progresses. The outputs of the scripts are gnuplot-generated
PNG or PS files.
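The merge step described above, which combines the same statistic from several per-node files into one table, can be sketched in Python. This is not the logic of dataPlot.sh; it is a minimal illustration. It assumes whitespace-separated columns with the timestamp first, and timestamps that match exactly across nodes. The function name merge_column is hypothetical.

```python
def merge_column(node_lines, column, sep=None):
    """Merge one column from several per-node logs into rows keyed by timestamp.

    node_lines maps a node name to an iterable of log lines (an open file
    works); column is a zero-based index. Only timestamps present for every
    node are kept. Returns (node_names, rows) with rows shaped as
    [timestamp, value_for_first_node, value_for_second_node, ...].
    """
    per_node = {}
    for node, lines in node_lines.items():
        series = {}
        for line in lines:
            parts = line.split(sep)
            if len(parts) > column:  # skips blank or truncated lines
                series[parts[0]] = parts[column]
        per_node[node] = series
    nodes = sorted(per_node)
    if not nodes:
        return nodes, []
    common = set(per_node[nodes[0]])
    for node in nodes[1:]:
        common &= set(per_node[node])
    rows = [[ts] + [per_node[n][ts] for n in nodes]
            for ts in sorted(common, key=float)]
    return nodes, rows

if __name__ == "__main__":
    # Synthetic two-column data (timestamp, value), not real measurements.
    demo = {"node0": ["1 100", "2 150"], "node2": ["1 90", "2 120"]}
    nodes, rows = merge_column(demo, 1)
    print("# timestamp " + " ".join(nodes))
    for row in rows:
        print(" ".join(row))
```

The printed table has one column per node, a layout that gnuplot can read with its "using" column selectors.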
The example script below demonstrates how to automate
experiments with the SES. The user must first launch the master server
while the experiment is swapping in, so that the clients can establish
the connections. After all of the connections are established, the script
can be executed. The following are the steps that the script below executes:
1. Start taking system measurements (node0, node2, node3, r1, r2)
2. Create a TCP sink at node2
3. Run tcpdump on node0 and node2
4. Create a TCP-targeted (square wave) attack in the direction of the receiver
5. Copy a 10 MB file to the TCP sink from node0 and log the output of the
file transfer.
6. Wait until the file transfer is complete
7. Stop the attack and logging
8. Copy the dump files, file transfer results, and system measurements
to a central location
The high level tasks above can be expressed as the following script.
If one wishes to repeat the experiment many times or vary the attack
parameters, a PERL script can be written to print the statements
below and perform string substitution depending on the iteration. You can use
the run command in SES for this.
0 node0 node2 node3 r1 r2 "./tmeas -f /usr/local/tmeas.out"
0 node2 "/usr/bin/ttcp -r > /dev/null"
0 node0 node2 "rm /usr/local/dump.dmp"
0 node0 node2 "sh /proj/DDoSImpact/exp/bell/scripts/dump.sh"
5 node1 "./flood node3 -U -s100 -S10.1.4.1 -W160-4500 -D80000"
9 node0 "/usr/bin/ttcp -v -t node2 < /usr/local/f10m >/usr/local/ttcp.out!"
0 node0 node1 node2 node3 r1 r2 "stop"
0 node0 node2 r1 r2 "killall tcpdump"
1 "pause"
0 node0 "cp /usr/local/dump.dmp /proj/DDoSImpact/exp/bell/data/dump.node0"
0 node2 "cp /usr/local/dump.dmp /proj/DDoSImpact/exp/bell/data/dump.node2"
0 node0 "cp /usr/local/ttcp.out /proj/DDoSImpact/exp/bell/data"
0 node0 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node0"
0 node3 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node3"
0 node1 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node1"
0 node2 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node2"
0 r1 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.r1"
0 r2 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.r2"
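A generator of the kind suggested above, one that prints master server commands and substitutes strings per iteration, might look like the following. The text suggests PERL; Python is used here purely for illustration. The sketch is abridged (the tcpdump steps and most of the copy steps are omitted), the per-run file suffix ".runN" and the function name experiment are assumptions, and the attack command line is copied verbatim from the script above rather than varied, since the meaning of its flags is not described here.

```python
DATA_DIR = "/proj/DDoSImpact/exp/bell/data"  # destination used in the script above
ATTACK = "./flood node3 -U -s100 -S10.1.4.1 -W160-4500 -D80000"  # copied verbatim

def experiment(run, attack=ATTACK):
    """Return the master server commands for one iteration (abridged)."""
    return [
        '0 node0 node2 node3 r1 r2 "./tmeas -f /usr/local/tmeas.out"',
        '0 node2 "/usr/bin/ttcp -r > /dev/null"',
        '5 node1 "%s"' % attack,
        '9 node0 "/usr/bin/ttcp -v -t node2 < /usr/local/f10m >/usr/local/ttcp.out!"',
        '0 node0 node1 node2 node3 r1 r2 "stop"',
        '1 "pause"',
        # The trailing "!" blocks input until the copy finishes, so the next
        # iteration does not start while results are still being saved.
        '0 node0 "cp /usr/local/ttcp.out %s/ttcp.out.run%d!"' % (DATA_DIR, run),
    ]

if __name__ == "__main__":
    RUNS = 3  # number of repetitions, chosen for illustration
    for run in range(1, RUNS + 1):
        print("\n".join(experiment(run)))
```

Saved under a hypothetical name such as gen_runs.py, the generator could be handed to the master with the run command, so that its printed output becomes the master's input.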