Are there performance advantages to be gained by linking GapMiner against locally-compiled static GMP and MPFR libraries which have been optimised for the host CPU, as opposed to using the standard distro packages which are only configured for the host OS?
This test compares the reported performance of standard GapMiner with that of one compiled with optimised GMP/MPFR (where “optimised” means “compiled with GCC flags `-static -march=native -mtune=native`”).
Column labels map directly to miner output: `pps` is average primes per second, `tps` is average tests per second, `gps` is average gaps per second, `glst` is the size of the gaplist and `l/s` is the time in seconds to scan the gaplist at the reported rate of gaps per second.
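The last of those labels is a derived quantity: the gaplist size divided by the gap rate. A minimal sketch of that arithmetic follows; the function name and the example gaplist size are hypothetical and are not taken from the miner.

```python
def gaplist_scan_seconds(glst: float, gps: float) -> float:
    """Seconds needed to scan a gaplist of size `glst` at `gps` gaps per second."""
    if gps <= 0:
        raise ValueError("gps must be positive")
    return glst / gps


# Hypothetical gaplist size, for illustration only.
# 17836 is the mean gps reported for the standard build.
print(gaplist_scan_seconds(glst=1_000_000, gps=17836))
```

A higher `gps` shortens the scan time for the same gaplist, so an improvement in `gps` shows up as a proportionally smaller `l/s`.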
standard gapminer
measure | pps | tps | gps |
---|---|---|---|
mean | 381417 | 121373 | 17836 |
max | 450402 | 143195 | 21061 |
min | 371074 | 118095 | 17352 |
std | 15163 | 4804 | 709 |
optimised gapminer
measure | pps | tps | gps |
---|---|---|---|
mean | 409228 | 130275 | 19137 |
max | 479724 | 152951 | 22436 |
min | 398576 | 126829 | 18638 |
std | 15814 | 5077 | 740 |
per cent improvement
measure | pps | tps | gps |
---|---|---|---|
mean | 7.29 | 7.33 | 7.29 |
max | 6.51 | 6.81 | 6.53 |
min | 7.41 | 7.40 | 7.41 |
std | 4.29 | 5.68 | 4.37 |
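The percentages can be recomputed from the two tables as (optimised - standard) / standard × 100. The sketch below does this with plain dictionaries; the variable and function names are illustrative, and only the numbers come from the tables.

```python
# Values copied from the "standard gapminer" and "optimised gapminer" tables.
standard = {
    "mean": {"pps": 381417, "tps": 121373, "gps": 17836},
    "max": {"pps": 450402, "tps": 143195, "gps": 21061},
    "min": {"pps": 371074, "tps": 118095, "gps": 17352},
    "std": {"pps": 15163, "tps": 4804, "gps": 709},
}
optimised = {
    "mean": {"pps": 409228, "tps": 130275, "gps": 19137},
    "max": {"pps": 479724, "tps": 152951, "gps": 22436},
    "min": {"pps": 398576, "tps": 126829, "gps": 18638},
    "std": {"pps": 15814, "tps": 5077, "gps": 740},
}


def percent_improvement(base: float, new: float) -> float:
    """Relative change from `base` to `new`, in per cent, rounded to 2 places."""
    return round((new - base) / base * 100, 2)


improvement = {
    measure: {
        col: percent_improvement(value, optimised[measure][col])
        for col, value in cols.items()
    }
    for measure, cols in standard.items()
}

for measure, cols in improvement.items():
    print(measure, " | ".join(f"{cols[c]:.2f}" for c in ("pps", "tps", "gps")))
```

Because the inputs are the rounded table values rather than the raw log data, the last digit may differ slightly from figures computed on the full dataset.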
The charts included below are for completeness. The figures given on them are means, and they differ from the pandas-produced means because the first fifth of the data is ignored when charting, to aid visual comparison. The chartline uses the full dataset.
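To see why dropping the first fifth of the data shifts the mean, consider the sketch below. The sample readings are invented for illustration (they are not taken from the miner logs) and assume, hypothetically, that the earliest readings are the highest; `trimmed_mean` is likewise an illustrative name.

```python
from statistics import mean


def trimmed_mean(samples):
    """Mean of `samples` after discarding the first fifth of them."""
    skip = len(samples) // 5
    return mean(samples[skip:])


# Hypothetical pps readings with a high start-up value, for illustration only.
pps_samples = [
    450000, 400000, 390000, 380000, 385000,
    382000, 379000, 381000, 383000, 380000,
]

full_mean = mean(pps_samples)           # all readings
chart_mean = trimmed_mean(pps_samples)  # first 2 of 10 readings dropped

print(full_mean, chart_mean)
```

With data shaped like this, the trimmed mean comes out lower than the full-dataset mean, which is the kind of gap to expect between the chart figures and the table figures.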
Replication:
./bin/gapminer-standard -o localhost -p 31397 -u $(RPCUSER) -x $(RPCPASSWORD) -e -t 4 -f 32 > standard-miner.log 2>&1
./bin/gapminer-optimised -o localhost -p 31397 -u $(RPCUSER) -x $(RPCPASSWORD) -e -t 4 -f 32 > optimised-miner.log 2>&1