I've been confused about the scaling of the Pencil Code on Kraken. Just to remind myself, here are timing data from a 64 x 256^2 run:

[jsoishi@krakenpf7 B100Re1600Pm4_64]$ grep microsec *
R1600P4_64_128proc.o175745: Wall clock time/timestep/meshpoint [microsec] = 0.398E-01
R1600P4_64_256proc.o175750: Wall clock time/timestep/meshpoint [microsec] = 0.216E-01
R1600P4_64_64proc.o175743: Wall clock time/timestep/meshpoint [microsec] = 0.755E-01

In table form, where efficiency is the measured speedup over the 64-processor run divided by the speedup expected from perfect scaling (i.e., 2 for 128, 4 for 256):

procs    usec/step/point    efficiency
64       0.755E-01          1
128      0.398E-01          0.95
256      0.216E-01          0.87
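To make the arithmetic behind the last column of the table above explicit, here is a minimal Python sketch. It assumes the 64-processor run is the baseline for "perfect scaling"; the names `timings`, `base_procs`, and `scaling_table` are only illustrative and are not part of the Pencil Code.

```python
# Timings copied from the grep output above:
# number of processors -> wall clock time/timestep/meshpoint [microsec].
timings = {64: 0.755e-1, 128: 0.398e-1, 256: 0.216e-1}

# Assumption: perfect scaling is measured relative to the 64-processor run.
base_procs = 64


def scaling_table(timings, base_procs):
    """Return rows of (procs, usec, speedup, efficiency) relative to base_procs."""
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]  # measured speedup over the baseline
        ideal = procs / base_procs            # speedup under perfect scaling
        rows.append((procs, timings[procs], speedup, speedup / ideal))
    return rows


rows = scaling_table(timings, base_procs)
for procs, usec, speedup, eff in rows:
    print(f"{procs:5d}  {usec:.3e}  {speedup:5.2f}  {eff:5.2f}")
```

The measured speedups come out a little under 2 and 4, which is where the 0.95 and 0.87 come from.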

Actual Performance

On the actual problem, the first run took 12 hours on 256 processors and went 289000 steps, reaching 34.4 orbits, or roughly 1/3 of the way through. This means the whole run should take about 3*256*12 = 9216 CPU hours. The second run, 11.9 hours long, went 354100 steps and reached 74.6 orbits, or 3/4 (rather than 2/3) of the way through. Performance appears quite variable: this second run didn't exceed the wall clock limit and so reported a usec/step/point of 0.290E-01, compared with 0.216E-01 on 256 processors in the scaling test above.
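As a rough cross-check of the numbers in the paragraph above, the sketch below redoes the CPU-hour estimate and backs out an implied usec/step/point for each run. It rests on several assumptions: the mesh has 64 x 256 x 256 points, the reported figure is wall time divided by steps and by total mesh points, all of the wall time went into time stepping (no startup or I/O), and the 354100 steps are the steps taken during the second run alone. The function names are hypothetical.

```python
def cpu_hours(procs, wall_hours, fraction_done):
    """Extrapolate total CPU hours from one run that covered fraction_done of the problem."""
    return procs * wall_hours / fraction_done


def usec_per_step_per_point(wall_hours, steps, mesh_points):
    """Wall time per timestep per mesh point, in microseconds (assumed definition)."""
    return wall_hours * 3600.0 / steps / mesh_points * 1e6


mesh_points = 64 * 256**2  # assumption: the "64 x 256^2" grid

total = cpu_hours(256, 12.0, 1.0 / 3.0)
first = usec_per_step_per_point(12.0, 289000, mesh_points)
second = usec_per_step_per_point(11.9, 354100, mesh_points)

print(f"estimated total: {total:.0f} CPU hours")
print(f"first run:  {first:.4f} usec/step/point")
print(f"second run: {second:.4f} usec/step/point")
```

Under these assumptions the second run comes out close to the reported 0.290E-01, and the first run comes out noticeably slower, which is consistent with the variability noted above.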

MriDynamo/Scaling (last edited 2009-07-10 15:53:48 by JsOishi)