Netgauge - Operating System Noise Measurement

Netgauge OS Noise Measurement

Description: The noise pattern in Netgauge allows the precise measurement of OS
noise. The current version supports three different benchmark methods:
  Fixed Work Quantum
  Fixed Time Quantum
  Selfish Detour
Netgauge (including all OS noise benchmarks) can be downloaded from the
main Netgauge page.

General

The benchmarks use Netgauge's high-performance timers for different
architectures. Users should make sure that the configure script detected the
timer correctly and that it works reliably (no frequency scaling, etc.).
Netgauge should be run on all cores (CPUs) of a system to ensure
realistic benchmarks. For example, if the machine has two sockets with four
cores each, then Netgauge should be run with 8 processes on this machine.
General help: mpirun -n 1 ./netgauge -x noise --help
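
As a small illustration of the process-count rule above, the following Python sketch builds the mpirun command line for one process per core. The helper name noise_command and its parameters are hypothetical; it only uses the mpirun -n, -x noise, and -e options that appear in the example runs on this page.

```python
def noise_command(sockets, cores_per_socket, benchmark=None):
    """Build an mpirun argument list that starts one process per core.

    Hypothetical helper: the socket and core counts are supplied by the
    caller, they are not detected automatically.
    """
    processes = sockets * cores_per_socket
    cmd = ["mpirun", "-n", str(processes), "./netgauge", "-x", "noise"]
    if benchmark is not None:
        # "-e fwq" and "-e ftq" select the non-default benchmarks
        cmd += ["-e", benchmark]
    return cmd


# Two sockets with four cores each: 8 processes.
print(" ".join(noise_command(sockets=2, cores_per_socket=4)))
# prints: mpirun -n 8 ./netgauge -x noise
```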

Selfish Detour (selfish)

This is the default benchmark. It is a modified version of the selfish
detour benchmark proposed in [5]. The benchmark runs in a tight loop and
measures the time for each iteration. If an iteration takes longer than the
minimum iteration time multiplied by a given threshold, then its timestamp
(a detour) is recorded. The benchmark runs until it has recorded a predefined
number of detours (so it will never halt on a noise-free BG/P system!).
Example run (on a very noisy laptop):
mpirun -n 2 ./netgauge -x noise 
# Info:   (0): Netgauge v2.2 MPI enabled (P=2) (./netgauge -x noise )
# initializing x86-64 timer (takes some seconds)
# Info:   (0): writing data to ng.out
# Info:   (0): performing Selfish benchmark
# min clock cycles per rank: 91 91 
# Minimal cycle length [ns]: 42.055092 
# Number of iterations (recorded+unrecorded): 606344901 
# Threshold: [% minimal cycle length]: 900 
# CPU overhead due to noise: 7.80%
# Measurement period: 41.26 s
The file "ng.out" contains the time of each detour. The
data can be plotted with the gnuplot command: plot "ng.out"
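
The loop described in this section can be sketched in Python as follows. This is an illustration of the idea, not Netgauge's implementation: the function name, the warm-up phase used to estimate the minimal iteration time, the max_iterations safety limit, and the choice to store each detour as a (start timestamp, duration) pair are all assumptions. The default threshold of 9.0 is one reading of the "900" percent value in the example output. Interpreter overhead makes Python unsuitable for real measurements; the sketch only shows the control flow.

```python
import time


def selfish_detour(clock=time.perf_counter_ns, threshold=9.0,
                   max_detours=100, max_iterations=1_000_000, warmup=1000):
    """Return (min_dt, detours); detours holds (start_timestamp, duration).

    All times are in the units of `clock` (nanoseconds by default).
    """
    # Assumed calibration step: estimate the minimal iteration time.
    min_dt = None
    prev = clock()
    for _ in range(warmup):
        now = clock()
        dt = now - prev
        prev = now
        if dt > 0 and (min_dt is None or dt < min_dt):
            min_dt = dt
    if min_dt is None:
        raise ValueError("clock resolution too coarse to time one iteration")

    # Tight loop: record every iteration that exceeds threshold * min_dt.
    detours = []
    for _ in range(max_iterations):  # safety limit, unlike the real benchmark
        now = clock()
        dt = now - prev
        if dt > threshold * min_dt:
            detours.append((prev, dt))
            if len(detours) >= max_detours:
                break
        prev = now
    return min_dt, detours


# Deterministic demo with a scripted clock (arbitrary time units).
ticks = iter([0, 10, 20, 30, 40, 50, 60, 70, 170, 180, 190, 1190])
demo_min, demo_detours = selfish_detour(clock=lambda: next(ticks),
                                        warmup=5, max_detours=2)
print(demo_min, demo_detours)
# prints: 10 [(70, 100), (190, 1000)]
```

Passing the clock as a parameter keeps the loop testable with scripted timestamps; calling selfish_detour() with no arguments uses the real nanosecond timer.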

Fixed Work Quantum (FWQ)

The fixed work quantum benchmark performs a fixed amount of work
multiple times and records the time it takes for each run.
Example run:
mpirun -n 2 ./netgauge -x noise -e fwq
# Info:   (0): Netgauge v2.2 MPI enabled (P=2) (./netgauge -x noise -e fwq )
# initializing x86-64 timer (takes some seconds)
# Info:   (0): writing data to ng.out
# Info:   (0): performing Fixed Work Quantum benchmark
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): -931598522
# random output (to prevent compiler optimizations): -931598522
The file "ng.out" contains the detailed time of each work quantum. The
data can be plotted with the gnuplot command: plot "ng.out"

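
The fixed work quantum idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the workload function work, the checksum used to keep the work from being skipped, and the helper noise_excess are hypothetical and are not taken from Netgauge.

```python
import time


def work(n):
    """A fixed amount of integer work; the result is returned so it is used."""
    acc = 0
    for i in range(n):
        acc = (acc * 31 + i) % 1_000_003
    return acc


def fixed_work_quantum(samples, work_size,
                       clock=time.perf_counter_ns, workload=work):
    """Run the same workload `samples` times and time every run."""
    durations = []
    checksum = 0
    for _ in range(samples):
        start = clock()
        checksum ^= workload(work_size)
        durations.append(clock() - start)
    return durations, checksum


def noise_excess(durations):
    """Time above the fastest run, used here as a simple noise estimate."""
    fastest = min(durations)
    return [d - fastest for d in durations]


# Example (real timer):
# durations, _ = fixed_work_quantum(samples=1000, work_size=10_000)
# print(max(noise_excess(durations)))
```

Since every run does identical work, any run that takes longer than the fastest one was delayed by something outside the workload.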
Fixed Time Quantum (FTQ)

Netgauge supports the Fixed Time Quantum (FTQ) benchmark described in [2]:
A very small work quantum is performed repeatedly until a fixed time quantum
has elapsed, and for each time quantum the number of completed workload
iterations is recorded. In the absence of noise this number should be equal
for every sample. When there is noise this number varies. Because the start
and end time of every sample are defined (every sample takes an equal amount
of time), periodicity in the occurrence of noise can be analyzed with this
method.
The workload should be portable (written in C) and not modified by compiler
optimizations. For this purpose we use the workload described in [4]. A good
starting point for the length of the time quantum is one millisecond, as
suggested by [3].
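
The sampling scheme described above can be sketched in Python as follows. This is an illustration only: the function names and the tiny_work workload are hypothetical stand-ins (Netgauge uses a C workload based on [4]), and the slice boundaries are computed from the start time so that every sample covers a fixed, known interval.

```python
import time


def tiny_work(x=12345):
    """A very small work quantum (one step of a linear congruential update)."""
    return (x * 1103515245 + 12345) & 0x7FFFFFFF


def fixed_time_quantum(samples, quantum_ns,
                       clock=time.perf_counter_ns, workload=tiny_work):
    """Count how many workload iterations fit into each fixed time slice."""
    counts = []
    start = clock()
    for k in range(samples):
        end = start + (k + 1) * quantum_ns  # fixed boundary of slice k
        count = 0
        while clock() < end:
            workload()
            count += 1
        counts.append(count)
    return counts


# Example (real timer, 1 ms quantum as suggested in the text):
# counts = fixed_time_quantum(samples=1000, quantum_ns=1_000_000)
```

A slice that was interrupted shows up as a lower count than its neighbours, and because slice k always covers the interval from k to k+1 quanta after the start, the position of a low count gives the time at which the interruption occurred.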
Example run:
mpirun -n 2 ./netgauge -x noise -e ftq
# Info:   (0): Netgauge v2.2 MPI enabled (P=2) (./netgauge -x noise -e ftq )
# initializing x86-64 timer (takes some seconds)
# Info:   (0): writing data to ng.out
# Info:   (0): performing Fixed Time Quantum benchmark
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): 1497980154
# random output (to prevent compiler optimizations): -931598522
# random output (to prevent compiler optimizations): -931598522
The file "ng.out" contains a detailed number of iterations for each time
slice. The data can be plotted with the gnuplot command: plot "ng.out"
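
Because FTQ samples are equally spaced in time, periodic noise appears as a peak in the autocorrelation of the per-slice counts. The following Python sketch illustrates this on an in-memory list of counts. Reading "ng.out" is deliberately left out, since its exact column layout is not described here, and the function names are hypothetical.

```python
def autocorrelation(counts, max_lag):
    """Normalized autocorrelation of `counts` for lags 1..max_lag."""
    n = len(counts)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and len(counts) - 1")
    mean = sum(counts) / n
    dev = [c - mean for c in counts]
    var = sum(d * d for d in dev)
    if var == 0:
        return [0.0] * max_lag  # constant series: no noise visible
    return [sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
            for lag in range(1, max_lag + 1)]


def dominant_period(counts, max_lag):
    """Lag (in time quanta) with the strongest autocorrelation."""
    ac = autocorrelation(counts, max_lag)
    best = max(range(len(ac)), key=lambda i: ac[i])
    return best + 1


# Synthetic example: every fourth slice loses iterations to noise.
synthetic = [9, 9, 9, 4] * 8
print(dominant_period(synthetic, max_lag=8))
# prints: 4
```

With a time quantum of one millisecond, a dominant period of 4 would correspond to a disturbance that recurs every 4 ms.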

Acknowledgments

The noise pattern was funded by the FastOS II (LAB 07-23) project.

References

[1] Torsten Hoefler, Torsten Mehlan, Andrew Lumsdaine and Wolfgang Rehm: Netgauge: A Network Performance Measurement Framework. In Proceedings of High Performance Computing and Communications, HPCC'07, presented in Houston, USA, Vol. 4782, pages 659-671, Springer, ISBN: 978-3-540-75443-5, Sep. 2007.
[2] Matthew Sottile and Ronald Minnich: Analysis of microbenchmarks for performance tuning of clusters. IEEE International Conference on Cluster Computing, 2004, pages 371-377, ISSN: 1552-5244, ISBN: 0-7803-8694-9.
[3] Fabrizio Petrini, Darren J. Kerbyson, Scott Pakin: The Case of the Missing Supercomputer Performance. SC '03: Proceedings of the 2003 ACM/IEEE conference on Supercomputing, ISBN 1-58113-695-1.
[4] Carl Staelin and Larry McVoy: mhz: Anatomy of a micro-benchmark. USENIX Annual Technical Conference (NO 98), 1998.
[5] P. Beckman, K. Iskra, K. Yoshii, S. Coghlan, and A. Nataraj: Benchmarking the Effects of Operating System Interference on Extreme-Scale Parallel Machines. Cluster Computing, vol. 11, no. 1, pp. 3-16, 2008.