Networking Performance on Intel architectures
Yair Amir & Jonathan Stanton
Department of Computer Science
Johns Hopkins University
 
 

All the machines used for this test were connected by a 100BaseT Ethernet network using two 3Com LinkBuilder FMS hubs. The table below lists each machine used in the tests.

Hardware and OS
 
 

Machine  CPU             RAM (MB)  NIC        OS
com5     PPro 200        64        SMC 9332   BSDI 3.0
com3     Pentium 133     64        SMC 9332   BSDI 3.0
ice2     PPro 200        64        SMC 9332   Linux 2.0.31
ice1     Pentium II 266  64        DEC DE500  Linux 2.0.31
comwin   PPro 200        64        SMC 9332   WinNT 4.0
win3     PPro 200        64        SMC 9332   Win95

The tests consisted of sending a series of UDP packets from one machine to another. The only processing done upon receipt was checking for corruption and counting packets received and missed. The sender would send a burst of 'b' packets of size 's' and then delay for 't' milliseconds, repeating until 'n' packets had been sent. This test does not measure total TCP bandwidth, or the bandwidth under other reliable protocols; rather, it measures the rate at which the NIC and operating system can send or receive packets without dropping any (or while dropping only a minimal number).
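
The burst-and-delay procedure described above can be sketched roughly as follows. This is a minimal illustration and not the authors' test program: the packet layout (a sequence number plus a CRC32 of the payload), the function names, and the ReceiveStats class are all assumptions made for the example.

```python
import struct
import time
import zlib

# Assumed packet layout: 4-byte sequence number, 4-byte CRC32 of the payload.
HEADER = struct.Struct("!II")


def make_packet(seq: int, size: int) -> bytes:
    """Build a datagram of exactly `size` bytes carrying `seq` and a checksum."""
    if size < HEADER.size:
        raise ValueError("packet size must be at least %d bytes" % HEADER.size)
    payload = bytes([seq % 256]) * (size - HEADER.size)
    return HEADER.pack(seq, zlib.crc32(payload)) + payload


def check_packet(data: bytes):
    """Return the sequence number, or None if the datagram is corrupted."""
    if len(data) < HEADER.size:
        return None
    seq, crc = HEADER.unpack_from(data)
    return seq if zlib.crc32(data[HEADER.size:]) == crc else None


def send_bursts(sock, addr, b: int, t: float, n: int, s: int) -> int:
    """Send n packets of s bytes in bursts of b, sleeping t ms between bursts."""
    sent = 0
    while sent < n:
        for _ in range(min(b, n - sent)):
            sock.sendto(make_packet(sent, s), addr)
            sent += 1
        if t > 0 and sent < n:
            time.sleep(t / 1000.0)
    return sent


class ReceiveStats:
    """Counts packets received, corrupted, and missed on the receiving side."""

    def __init__(self):
        self.received = 0
        self.corrupted = 0
        self.highest = -1

    def record(self, data: bytes) -> None:
        seq = check_packet(data)
        if seq is None:
            self.corrupted += 1
            return
        self.received += 1
        self.highest = max(self.highest, seq)

    def missed(self) -> int:
        # Assumes sequence numbers start at 0 and no duplicates arrive.
        # Corrupted datagrams count as missed; losses after the highest
        # sequence number seen are not detected.
        return (self.highest + 1) - self.received
```

With a UDP socket created by socket.socket(socket.AF_INET, socket.SOCK_DGRAM), a call such as send_bursts(sock, (host, port), 100, 0, 100000, 1400) would correspond to the setting 100-0-100000-1400, where host and port are placeholders. An interpreted sender like this one is unlikely to reach the rates reported here; the sketch only shows the structure of the test.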
 
 

The first observation is that both versions of Unix (BSDI and Linux) were able to achieve throughput very close to the maximum possible on Fast Ethernet: 95.40 Mbits/second of data, or approximately 99% of the bandwidth once headers are counted. On the faster machines the limiting factor was not CPU power, since the Pentium II was not able to improve on the Pentium Pro's performance. The CPU was the limiting factor only for com3, whose Pentium 133 proved especially limiting when receiving packets.
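
As a rough check on how close 95.40 Mbits/second is to the ceiling, the per-packet overhead on Fast Ethernet can be worked out. The sketch below assumes standard framing that is not spelled out in the text: 8 bytes of preamble, a 14-byte Ethernet header, a 4-byte frame check sequence, a 12-byte inter-frame gap, a 20-byte IPv4 header without options, and an 8-byte UDP header. The function name is illustrative.

```python
# Per-packet overhead in bytes (assumed standard values, not from the tests).
ETH_PREAMBLE = 8     # preamble plus start-of-frame delimiter
ETH_HEADER = 14      # destination MAC, source MAC, EtherType
ETH_FCS = 4          # frame check sequence
ETH_GAP = 12         # minimum inter-frame gap
IPV4_HEADER = 20     # no IP options
UDP_HEADER = 8


def max_udp_data_rate(payload_bytes: int, link_mbits: float = 100.0) -> float:
    """Upper bound on UDP payload throughput in Mbits/second."""
    overhead = (ETH_PREAMBLE + ETH_HEADER + ETH_FCS + ETH_GAP
                + IPV4_HEADER + UDP_HEADER)
    return link_mbits * payload_bytes / (payload_bytes + overhead)


if __name__ == "__main__":
    print(round(max_udp_data_rate(1400), 2))
```

Under these assumptions a 1400-byte payload tops out at about 95.5 Mbits/second of data, so the measured 95.40 is within roughly 0.1 Mbits/second of the limit.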
 
 

Second, neither Windows machine was able to achieve either the sending or receiving rates of the Unix machines. Windows NT could only achieve a rate of 76 Mbits/second when sending point to point, and Windows 95 could only achieve a rate of 66 Mbits/second, compared to the 95+ Mbits/second for Unix. The difference in receiving speed was even more significant. Windows NT could only receive 50-55 Mbits/second before beginning to drop significant numbers of packets, and Windows 95 was only able to do 3.69 Mbits/second in our tests. Even the highest of these receiving rates still shows a 40 Mbits/second gap between Windows and Unix.
 
 

The third major observation is the existence of a number of anomalies in the results. First, neither Windows 95 nor Windows NT could send as fast to an IP broadcast address (of the form 128.220.221.255) as it could to a direct point-to-point address (like 128.220.221.56). The gap was about 20 Mbits/second for both operating systems. Neither of the Unix systems exhibited this difference; both were able to send broadcast packets as fast as unicast packets.
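
For readers who want to reproduce the broadcast case, a sender normally has to enable broadcasting on the socket before sending to an address such as 128.220.221.255. The sketch below shows one way to set that up with the standard sockets API; the helper name is hypothetical, and whether the original test programs were structured this way is an assumption.

```python
import socket


def make_udp_sender(broadcast: bool) -> socket.socket:
    """Create a UDP socket, optionally permitted to send to broadcast addresses."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if broadcast:
        # Most systems refuse sendto() on a broadcast address without this option.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock
```

The same sending loop can then be pointed at either the broadcast or the point-to-point address, so that only the destination differs between the two measurements.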
 
 

Second, Windows NT exhibited significant performance differences depending on the size of the packet. When sending packets larger than 1024 bytes, even as few as 1030 bytes, it was not able to do more than 44 Mbits/second on unicast packets, which is more than 30 Mbits/second slower than the rate for 1024-byte packets.

The performance differences also appeared in receiving rates. However, the direction of the slow-down was reversed: larger packets could be received faster and with fewer drops than smaller packets. For example, with the settings 85-1-10000-1450 (b,t,n,s)* we saw between 0 and a few hundred drops. With a packet size of 1024 we saw around 2000 drops, and with a packet size of 1200 around 1000 drops; from 1200 to 1450 the drop count scales down to the 0 to a few hundred seen at 1450. This seems rather inexplicable, since in the most common case one would expect smaller packets to be handled faster.
 
 

Finally, as a point of comparison, one should note that a Pentium 133 running BSDI is able to perform equivalently to or better than Windows NT on a Pentium Pro 200 at sending, and significantly better than Windows 95 on a Pentium Pro 200 at both sending and receiving.
 
 

Main Results
 
 

Machine  Test Type  Settings (b,t,n,s)*  Time (s)  Lost    Speed (Mbits/s)  Notes
com5     sending    100-0-100000-1400    11.72             95.40
com5     receiving  100-0-100000-1400    11.72             95.40
com3     sending    100-0-100000-1400    15.84             70.70
com3     receiving  45-1-10000-1400      4.44              25.28
ice2     sending    100-0-100000-1400    11.70             95.40
ice2     receiving  100-0-100000-1400    11.70             95.40
ice1     sending    100-0-100000-1400    11.70             95.40
ice1     receiving  100-0-100000-1400    11.70             95.40
comwin   sending    100-0-100000-1024    10.776            76.020           Point to Point
comwin   sending    100-0-100000-1024    14.682            55.79            Broadcast
comwin   sending    100-0-100000-1400    25.286            44.29            For any size >1024
comwin   receiving  100-1-20000-1450     4.12      0->500  56.58            # drops varied dramatically
comwin   receiving  85-1-20000-1450      4.70              49.25            Stable at 0 drops
comwin   receiving  20-1-10000-1024      10.00     20-50   8.19**           Much slower receiving smaller packets
win3     sending    100-0-100000-1400    16.15             66.272           Point to Point
win3     sending    100-0-100000-1400    24.6              45.528           Broadcast
win3     receiving  9-1-10000-1024       22.22             3.69**

* b = # packets in burst, t = ms of delay between bursts, n = number of packets sent in test, s = size of data packet
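
The Speed column can be cross-checked from the settings and the elapsed time. The sketch below is illustrative only: it assumes that Time is in seconds, that Speed is in Mbits/second with 1 Mbit = 10^6 bits, and that only the data bytes ('s' per packet) are counted. The names Settings, parse_settings, and data_rate_mbits are hypothetical.

```python
from typing import NamedTuple


class Settings(NamedTuple):
    b: int  # packets per burst
    t: int  # delay between bursts, in milliseconds
    n: int  # total packets sent in the test
    s: int  # data bytes per packet


def parse_settings(text: str) -> Settings:
    """Parse a 'b-t-n-s' string; repeated dashes count as one separator."""
    parts = [p for p in text.split("-") if p]
    if len(parts) != 4:
        raise ValueError("expected four fields in %r" % text)
    return Settings(*(int(p) for p in parts))


def data_rate_mbits(settings: Settings, seconds: float) -> float:
    """Data throughput in Mbits/second for a run that took `seconds`."""
    return settings.n * settings.s * 8 / seconds / 1e6


if __name__ == "__main__":
    row = parse_settings("100-0-100000-1024")
    print(round(data_rate_mbits(row, 10.776), 2))
```

For the comwin point-to-point row (100-0-100000-1024 in 10.776 seconds) this gives about 76.02, which agrees with the table. Some other rows differ from the computed value in the first decimal place.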
 
 

** This is not the maximum bandwidth possible. We were able to achieve much more than that using our own reliable multicast protocols, which take care of flow control and reliability. The speed in the table indicates the maximum bandwidth that the machine can sustain without substantial packet losses (losses are always below 0.2%), making this test protocol-independent.
 
 

The source code for the test programs, along with binaries for IRIX 6.3, Solaris, SunOS, Linux 2.0, BSDI 3.0, Windows NT, and Windows 95, is available at

ftp://ftp.cnds.jhu.edu/pub/benchmarks/udp_bench.tar.gz

If you would like to compile the programs yourself, please contact us at jonathan@cs.jhu.edu or yairamir@cs.jhu.edu.
 
 
