On the Effectiveness of Simultaneous Multithreading on Network Server Workloads
Report ID: TR-793-07
Author: Pai, Vivek / Ruan, Yaoping / Nahum, Erich / Tracey, John
Date: 2007-08-00
Pages: 12
Abstract:
This paper experimentally investigates the effectiveness of simultaneous
multithreading (SMT) for network server workloads. We study how well SMT
improves performance on two very different shipping platforms that support
SMT: IBM's POWER5 and the Intel Xeon. We use the architectural performance
counters available on each processor to compare their architectural
behavior, examining events such as cache misses and pipeline stalls. By
observing how these events change in response to the introduction of
threading, we can determine whether and how SMT stresses the system.
We find that POWER5 makes more effective use of SMT for improving
performance than the Xeon. In general, POWER5 achieves a 40-50% performance
increase, whereas the Xeon exhibits only a 10-30% gain. Examination using the
performance counters reveals that cache size and memory bandwidth are the
key requirements for fully exploiting SMT. A secondary but still
noticeable component is minimizing pipeline stalls due to branch
mispredicts and interrupts. We present suggestions for improving SMT
performance for each platform based on our observations.