On the Effectiveness of Simultaneous Multithreading on Network Server Workloads

Report ID: TR-793-07
Author: Pai, Vivek / Ruan, Yaoping / Nahum, Erich / Tracey, John
Date: 2007-08-00
Pages: 12
Abstract:

This paper experimentally investigates the effectiveness of simultaneous multithreading (SMT) for network server workloads. We study how much SMT improves performance on two very different shipping platforms that support it: IBM's POWER5 and the Intel Xeon. We use the architectural performance counters available on each processor to compare their architectural behavior, examining events such as cache misses and pipeline stalls. By observing how these events change when SMT is enabled, we can determine whether and how SMT stresses the system.

We find that the POWER5 makes more effective use of SMT than the Xeon. In general, the POWER5 achieves a 40-50% performance increase with SMT, whereas the Xeon exhibits only a 10-30% gain. Examination of the performance counters reveals that cache size and memory bandwidth are the key requirements for fully exploiting SMT. A secondary but still noticeable factor is minimizing pipeline stalls due to branch mispredictions and interrupts. We present suggestions for improving SMT performance on each platform based on our observations.
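
As a rough illustration of the arithmetic behind figures such as "a 40-50% increase", the sketch below computes the percent change of a metric between an SMT-disabled run and an SMT-enabled run. The same calculation can be applied to throughput or to a performance-counter event count. The function name and all input numbers are hypothetical and chosen only for illustration; they are not measurements from the report.

```python
def relative_change(baseline: float, with_smt: float) -> float:
    """Percent change of a metric with SMT enabled, relative to SMT disabled."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (with_smt - baseline) / baseline * 100.0


# Hypothetical throughput (requests/second) without and with SMT.
throughput_gain = relative_change(1000.0, 1450.0)

# Hypothetical cache-miss counts for the same two runs.
cache_miss_change = relative_change(2.0e6, 2.5e6)

print(f"throughput change: {throughput_gain:.1f}%")
print(f"cache miss change: {cache_miss_change:.1f}%")
```

A positive throughput change paired with a large rise in an event such as cache misses would suggest, under these assumptions, that the added thread is putting pressure on that resource.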