SGI Performance Comparisons

RC5-64 and DES Encryption-Cracking

Last Change: 29/Oct/2010

There is an ongoing project to develop a distributed computer system for cracking the RC5 and DES encryption algorithms. The purpose of the project is to prove that brute-force computation can break the encryption, so the project organisers have enlisted the help of anyone on the Internet who has some compute cycles to spare. The combined computational power of the entire collection of machines is enormous; as a result, DES can be cracked in a matter of hours.

Rather usefully for benchmarking purposes, the software can be downloaded for any platform and run in a 'benchmark' mode, in which 10 million RC5-64 tests are executed along with 20 million DES tests (or 'keys') - the numbers of keys processed per second for RC5-64 and DES are the benchmark results (NB: RC5-64 is a more complex problem and so gives lower numbers). Running the benchmark test is simple: download the pre-compiled software for the target system, unpack the archive and run the main program in benchmark mode. For example, on an Indy, the command is:

   rc5des-mips3-32bit -benchmark

A typical output, for example from a 200MHz R4400SC Indy, looks like this:

   Benchmarking RC5 with 10000000 tests:
   .....10%.....20%.....30%.....40%.....50%.....60%.....70%.....80%.....90%....
   Completed in 0.00:01:34.63 [105674.20 keys/sec]
   Benchmarking DES with 20000000 tests:
   .....10%.....20%.....30%.....40%.....50%.....60%.....70%.....80%.....90%....
   Completed in 0.00:00:31.07 [674831.50 keys/sec]

To keep the results intuitively meaningful, one often divides the results by 1000, to give kilokeys/sec. Note: at present, I am not entirely sure that these tests perform integer-only calculations. I shall check on this, but in the meantime I'll include RC5/DES as an integer test - more details below.
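As a rough illustration of that conversion, here is a minimal Python sketch that pulls the elapsed time and the keys/sec figure out of a 'Completed in' line like those in the sample output and reports kilokeys/sec. The function name and the regular expression are my own, not part of the client software, and the leading "0." in the time field is assumed to be a day count.

```python
import re

# Matches lines such as: Completed in 0.00:01:34.63 [105674.20 keys/sec]
# Assumption: the time field is days.hours:minutes:seconds.
LINE = re.compile(
    r"Completed in (\d+)\.(\d+):(\d+):(\d+\.\d+) \[(\d+\.\d+) keys/sec\]"
)

def parse_completed(line):
    """Return (elapsed_seconds, kkeys_per_sec) for one 'Completed in' line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError("not a 'Completed in' line: %r" % line)
    days, hours, minutes = int(m.group(1)), int(m.group(2)), int(m.group(3))
    seconds = float(m.group(4))
    elapsed = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    # Divide keys/sec by 1000 to get kilokeys/sec.
    return elapsed, float(m.group(5)) / 1000.0

elapsed, kkeys = parse_completed("Completed in 0.00:01:34.63 [105674.20 keys/sec]")
print("%.2f s, %.2f Kkeys/sec" % (elapsed, kkeys))
```

For the RC5 line of the sample output this gives 94.63 seconds and 105.67 Kkeys/sec, which is the figure listed for the 200MHz R4400SC Indy.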

Table 34 shows the results (note carefully how you form your initial judgements based on the numbers shown in this table). These results are not multi-threaded, ie. a multi-CPU system only uses one CPU for this test (any multi-CPU system in the table includes a note in curly brackets {} showing the number of CPUs in the system, even though only one is used, just for reference).

                         Clock              RC5-64         DES
System           CPU     (MHz)   L2/L3   (Kkeys/sec)   (Kkeys/sec)

Tezro          R16000    1000    16MB      1628.11      15677.72    I.M. {4}        [hinv]
Fuel           R16000     900     8MB      1459.87      14056.47    I.M.            [hinv]
Fuel           R16000     800     4MB      1296.36      12478.54    I.M.            [hinv]
Origin350      R16000     700     4MB      1139.46      10981.30    ramq            [hinv]
Tezro          R16000     700     4MB      1138.83      10967.92    I.M. {2}        [hinv]
Fuel           R16000     700     4MB      1131.50      10921.14    I.M.            [hinv]
Fuel           R14000     600     4MB       972.76       9371.29    I.M.            [hinv]
Origin300      R14000     500     2MB       813.92       7833.54    I.M. {4}        [hinv]
Onyx2          R14000     500     8MB       812.20       7820.33    I.M. {4}        [hinv]
O2             R7000      600   256K/1MB    799.73       3634.95    I.M.            [hinv]
Fuel           R14000     500     2MB       782.20       7482.29    I.M.            [hinv]
Octane         R12000     400     2MB       645.91       6227.00    I.M.            [hinv]
O2             R12000     400     4MB       581.94       2973.82    Stefan E.       [hinv]
Octane         R12000     350     1MB       568.58       5475.53    I.M. {2}        [hinv]
Octane         R12000     350     2MB       567.39       5464.29    I.M.            [hinv]  (Cache divisor 1.5)
Octane         R12000     350     2MB       567.07       5461.03    I.M.            [hinv]  (Cache divisor 2.0)
O2             R12000     400     2MB       557.70       2929.73    Colin Anderson  [hinv]
O2             R12000     380     2MB       554.58       2792.09    I.M.            [hinv]
Octane         R12000     300     2MB       467.59       4489.21    I.M.            [hinv]
Origin200      R12000     270     4MB       438.92       4229.23    I.M. {2}        [hinv]
O2             R12000     300     1MB       436.52       2200.38    I.M.            [hinv]  (CPU mod, core from an Octane single-300)
O2             R12000     300     1MB       434.37       2213.72    I.M.            [hinv]
O2             R7000      300     256K      395.80       1787.09    Stefan E.       [hinv]
Onyx2          R10000     250     4MB       395.76       3827.72    I.M. {14}       [hinv]
Octane         R10000     250     1MB       384.17       3720.62    {2}
Octane         R10000     250     2MB       394.78       3816.29    I.M.            [hinv]
O2             R10000     250     1MB       354.83       1844.99    I.M.            [hinv]
Onyx           R10000     195     1MB       312.95       2971.11    [Credit 2] {4}  [hinv]
Onyx2          R10000     195     4MB       307.63       3006.16    {12}            [hinv]
Indigo2        R10000     195     1MB       306.93       2979.51    [Credit 2]
Octane         R10000     195     1MB       288.26       2803.58    I.M.
Origin200QC    R10000     180     2MB       286.59       2787.29    [Credit 2] {2}  [hinv]
O2             R10000     195     1MB       276.56       1443.21    I.M.            [hinv]
Indigo2        R10000     175     1MB       271.45       2652.31    I.M.            [hinv]
Origin200      R10000     180     1MB       257.73       1250.00    [Credit 2] {2}  [hinv]
O2             R10000     175     1MB       244.32       1209.19    [Credit 2]      [hinv]
Octane         R10000     175     1MB       234.47       1160.77    [Credit 2]      [hinv]
O2             R5200      300     1MB       227.93       1056.41    I.M.            [hinv]
O2             R5000SC    250     1MB       188.64        869.36    Stefan E.       [hinv]
O2             R5000SC    200     1MB       141.82        652.33    I.M.
Onyx           R4400SC    250     4MB       138.88       1609.56    [Credit 2] {4}  [hinv]
Onyx           R4400SC    250     4MB       136.89       1706.84    I.M.            [hinv]
Indy           R5000SC    180     512K      134.40        617.85    [Credit 1]
Indigo2        R4400SC    250     2MB       133.03        851.04    I.M.
O2             R5000SC    180     512K      131.43        605.61    I.M.            [hinv]
POWER Onyx     R8000SC     90     4MB       129.79       1101.54    [Credit 2] {2}  [hinv]
Indy           R5000SC    150     512K      113.07        518.02    I.M.            [hinv]
Onyx           R4400SC    200     4MB       110.87       1289.13    [Credit 2] {4}  [hinv]
Indy           R5000PC    150      -        110.32        506.13    I.M.
Indigo2        R4400SC    200     1MB       106.45        672.71    I.M.
POWER Onyx     R8000       75     4MB       107.96        906.43    [Credit 2] {2}  [hinv]
Indy           R4400SC    200     1MB       105.67        674.83    I.M.
Indy           R4600SC    133     512K       97.66        442.76    [Credit 2]
Indy           R4600PC    133      -         92.71        408.47    I.M.
Onyx           R4400SC    150     1MB        83.12        968.46    [Credit 2]
Crimson        R4400SC    150     1MB        81.66        522.75    [Credit 2]      [hinv]
Indigo         R4400SC    150     1MB        79.17        503.75    I.M.            [hinv]
Indigo2        R4400SC    150     1MB        78.46        501.14    I.M.            [hinv]
Indy           R4600PC    100      -         69.17        305.48    I.M.
POWER Series   R4400SC    100     1MB        54.05        345.92    [Credit 2]      [hinv]
POWER Series   R4000SC    100     1MB        32.67        282.19    [Credit 2]      [hinv]
Indigo         R4000SC    100     1MB        31.10        267.65    I.M.            [hinv]
Indigo2        R4000SC    100     1MB        30.89        265.06    I.M.            [hinv]
POWER Series   R3000SC     33     256K       27.67        118.10    [Credit 2] {6}  [hinv]
POWER Series   R3000SC     25     256K       20.50         89.13    [Credit 2] {2}  [hinv]

Table 34: RC5-64/DES v2.7100.415 Benchmark Test Results (Kkeys/sec)

Here is a typical curious observation:

Why this odd difference? Well, firstly, the DES test is actually not that complicated. Even the slow Indy completes the test in not much more than a minute. As a result, the differences between some systems in terms of actual elapsed time can be very small (just 1 second in the case of O2 R5K/200 vs. Indy R4400/200), so the DES numbers themselves are not very useful. Thus, in terms of a meaningful benchmark, the RC5-64 test is the more useful of the two, though I shall list both results as DES can be used to compare older SGIs such as Crimson and Indigo (this is like the uselessness of some graphics tests which achieve many-tens or hundreds of frames per second on modern systems: useless for comparing newer systems, but handy for comparing older ones).

Larger L2 cache sizes seem to aid the DES test, eg. the R4400SC/250 Indigo2 DES result. Another example: Onyx2 R10K/195 (4MB) is a bit faster than Octane R10K/195 (1MB).

Either way, here is the same information as Table 34, but this time the results are shown as elapsed times for each system, rounded to the nearest second, in order of best RC5-64 scores:

                                    RC5-64         DES         TOTAL
                                    (mm:ss)      (mm:ss)      (mm:ss)

Tezro R16000 1GHz 16MB L2:           00:06        00:01        00:07   [hinv]
Fuel R16000 900MHz 8MB L2:           00:07        00:02        00:09   [hinv]
Fuel R16000 800MHz 4MB L2:           00:08        00:02        00:10   [hinv]
Origin350 R16000 700MHz 4MB L2:      00:09        00:02        00:11   [hinv]
Tezro R16000 700MHz 4MB L2:          00:09        00:02        00:11   [hinv]
Fuel R16000 700MHz 4MB L2:           00:09        00:02        00:11   [hinv]
Fuel R14000 600MHz 4MB L2:           00:10        00:02        00:12   [hinv]
Origin300 R14000 500MHz 2MB L2:      00:12        00:03        00:15   [hinv]
Onyx2 R14000 500MHz 8MB L2:          00:12        00:03        00:15   [hinv]
Fuel R14000 500MHz 2MB L2:           00:13        00:03        00:16   [hinv]
O2 R7000 600MHz 256K/1MB L2:         00:13        00:06        00:19   [hinv]
Octane R12000 400MHz 2MB L2:         00:16        00:03        00:19   [hinv]
O2 R12000 400MHz 4MB L2:             00:17        00:07        00:24   [hinv]
O2 R12000 400MHz 2MB L2:             00:17        00:07        00:24   [hinv]
Octane Dual-R12000 350MHz 1MB L2:    00:18        00:04        00:22   [hinv]
Octane R12000 350MHz 2MB L2:         00:18        00:04        00:22   [hinv]
O2 R12000 380MHz 2MB L2:             00:18        00:08        00:26   [hinv]
Octane R12000 300MHz 2MB L2:         00:21        00:05        00:26   [hinv]
Origin200 R12000 270MHz 4MB L2:      00:23        00:05        00:28   [hinv]
O2 R12000 300MHz 1MB L2:             00:23        00:09        00:32   [hinv]
O2 R12000 300MHz 1MB L2 (mod):       00:23        00:10        00:33   [hinv]
Onyx2 R10000 250MHz 4MB L2:          00:25        00:05        00:30   [hinv]
Octane R10000 250MHz 2MB L2:         00:25        00:05        00:30   [hinv]
O2 R7000 300MHz 256K L2:             00:25        00:12        00:37   [hinv]
Octane R10Kx2 250MHz 1MB L2:         00:26        00:06        00:32
O2 R10000 250MHz 1MB L2:             00:28        00:11        00:39   [hinv]
Onyx R10000 195MHz 1MB L2:           00:32        00:07        00:39   [hinv]
Indigo2 R10000 195MHz 1MB L2:        00:33        00:07        00:40
Octane R10000 195MHz 1MB L2:         00:35        00:07        00:42
Origin200QC R10000 180MHz 2MB L2:    00:35        00:08        00:43   [hinv]
O2 R10000 195MHz 1MB L2:             00:36        00:15        00:51   [hinv]
Indigo2 R10000 175MHz 1MB L2:        00:37        00:08        00:45   [hinv]
Origin200 R10000 180MHz 1MB L2:      00:39        00:16        00:55   [hinv]
O2 R10000 175MHz 1MB L2:             00:41        00:17        00:58   [hinv]
Octane R10000 175MHz 1MB L2:         00:43        00:17        01:00   [hinv]
O2 R5200 300MHz 1MB L2:              00:44        00:20        01:04   [hinv]
O2 R5000SC 250MHz 1MB L2:            00:53        00:24        01:17   [hinv]
O2 R5000SC 200MHz 1MB L2:            01:11        00:32        01:43
Onyx R4400SC 250MHz 4MB L2:          01:12        00:13        01:25   [hinv]
Indigo2 R4400SC 250MHz 2MB L2:       01:15        00:25        01:40
Indy R5000SC 180MHz 512K L2:         01:15        00:34        01:49
O2 R5000SC 180MHz 512K L2:           01:16        00:35        01:51   [hinv]
POWER Onyx R8000SC 90MHz 4MB L2:     01:17        00:19        01:36   [hinv]
Indy R5000SC 150MHz 512K L2:         01:28        00:40        02:08   [hinv]
Onyx R4400SC 200MHz 4MB L2:          01:30        00:16        01:46   [hinv]
Indy R5000PC 150MHz:                 01:31        00:41        02:12
Indigo2 R4400SC 200MHz 1MB L2:       01:34        00:31        02:05
Indy R4400SC 200MHz 1MB L2:          01:34        00:31        02:05
Indy R4600SC 133MHz 512K L2:         01:43        00:47        02:30
Indy R4600PC 133MHz:                 01:48        00:51        02:39
Crimson R4400SC 150MHz 1MB L2:       02:02        00:40        02:42   [hinv]
Indigo R4400SC 150MHz 1MB L2:        02:06        00:42        02:48   [hinv]
Indigo2 R4400SC 150MHz 1MB L2:       02:07        00:42        02:49   [hinv]
Indy R4600PC 100MHz:                 02:25        01:09        03:34
POWER Series R4400SC 100MHz 1MB L2:  03:05        01:00        04:05   [hinv]
POWER Series R4000SC 100MHz 1MB L2:  05:07        01:14        06:21   [hinv]
Indigo R4000SC 100MHz 1MB L2:        05:22        01:18        06:40   [hinv]
Indigo2 R4000SC 100MHz 1MB L2:       05:24        01:19        06:43   [hinv]
POWER Series R3000SC 33MHz 256K:     06:01        02:58        08:59   [hinv]
POWER Series R3000SC 25MHz 256K:     08:08        03:55        12:03   [hinv]

     Table 35: RC5-64/DES Benchmark Test Results (Times)
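The relationship between the two tables can be sketched in a few lines of Python: the benchmark runs a fixed number of tests (10 million for RC5-64, 20 million for DES), so elapsed time is simply tests divided by keys/sec. The function names here are illustrative only. Note that this derives times purely from the Kkeys/sec figures, so for some systems it may differ by a second or more from the entries in Table 35, which were presumably taken from the program's own reported times.

```python
RC5_TESTS = 10_000_000   # tests in the RC5-64 benchmark run
DES_TESTS = 20_000_000   # tests in the DES benchmark run

def mmss(seconds):
    """Format a duration as mm:ss, rounded to the nearest second."""
    total = int(seconds + 0.5)
    return "%02d:%02d" % divmod(total, 60)

def elapsed_row(rc5_kkeys, des_kkeys):
    """Return (rc5, des, total) as mm:ss strings from Kkeys/sec figures."""
    rc5 = int(RC5_TESTS / (rc5_kkeys * 1000.0) + 0.5)
    des = int(DES_TESTS / (des_kkeys * 1000.0) + 0.5)
    # The total is taken as the sum of the two rounded times.
    return mmss(rc5), mmss(des), mmss(rc5 + des)

# Tezro R16000 1GHz figures from Table 34.
print(elapsed_row(1628.11, 15677.72))
```

For the 1GHz Tezro this yields 00:06, 00:01 and 00:07, in agreement with the first row of Table 35.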

To an extent, these tests really do need to last much longer in order to be genuinely statistically useful - running the benchmark 10 times in a row is not an answer though, for reasons described below. Unlike Table 34, Table 35 clearly shows that the differences for DES between O2 R5K/200 and Indy R4400/200 are meaningless - just 1 second (remember that the program may easily take a second or more to load and initiate the test). Given this fact, I recommend quoting elapsed times when comparing systems, not the Kkeys/sec results.

Another two items to consider, points which in a way contradict each other:

A final warning about these tests. I said before that the program may take a moment or two to initialise. This extra time can hide the real differences between systems. Sometimes, the differences can be quite stark. For example, using the 'timex' command, try extracting the actual elapsed user-time from the overall running time for Indigo2 R4400SC/250 vs. O2 R5000SC/200 (note at this point that Table 35 shows the Indigo2 to apparently have taken less time overall). Using this command for O2:

   timex ./rc5des-mips4-32bit -benchmark

and this command for Indigo2:

   timex ./rc5des-mips3-32bit -benchmark


Here are the results, plus data for other systems, in order of real (total) time:

                                       real        user        sys
                                    (mm:ss.ss)  (mm:ss.ss)    (s.ss)

Tezro R16000 1GHz 16MB L2:           00:07.57    00:07.48      0.01   [hinv]
Fuel R16000 900MHz 8MB L2:           00:08.39    00:08.32      0.01   [hinv]
Fuel R16000 800MHz 4MB L2:           00:09.52    00:09.36      0.01   [hinv]
Origin350 R16000 700MHz 4MB L2:      00:10.69    00:10.68      0.01   [hinv]
Tezro R16000 700MHz 4MB L2:          00:10.70    00:10.69      0.01   [hinv]   
Fuel R16000 700MHz 4MB L2:           00:10.95    00:10.70      0.02   [hinv]
Fuel R14000 600MHz 4MB L2:           00:12.65    00:12.48      0.01   [hinv]
Origin300 R14000 500MHz 2MB L2:      00:15.06    00:14.96      0.01   [hinv]
Onyx2 R14000 500MHz 8MB L2:          00:15.17    00:14.96      0.02   [hinv]
Fuel R14000 500MHz 2MB L2:           00:15.63    00:15.00      0.09   [hinv]
O2 R7000 600MHz 256K/1MB L2:         00:18.31    00:18.09      0.02   [hinv]
Octane R12000 400MHz 2MB L2:         00:18.92    00:18.67      0.02   [hinv]
Octane R12000 350MHz 2MB L2:         00:21.47    00:21.39      0.02   [hinv] (Cache divisor 1.5)
Octane Dual-R12000 350MHz 1MB L2:    00:21.52    00:21.39      0.02   [hinv]
Octane R12000 350MHz 2MB L2:         00:21.69    00:21.39      0.02   [hinv] (Cache divisor 2.0)
O2 R12000 400MHz 2MB L2:             00:24.50    00:23.73      0.03   [hinv]
O2 R12000 380MHz 2MB L2:             00:25.58    00:25.26      0.02   [hinv]
Octane R12000 300MHz 2MB L2:         00:26.09    00:25.04      0.17   [hinv]
Origin200 R12000 270MHz 4MB L2:      00:27.90    00:27.69      0.02   [hinv]
Onyx2 R10000 250MHz 4MB L2:          00:30.76    00:30.63      0.03   [hinv]
Octane R10000 250MHz 2MB L2:         00:30.84    00:30.63      0.04   [hinv]
Octane R10000 250MHz 1MB L2:         00:32.22    00:30.76      0.14
O2 R12000 300MHz 1MB L2 (mod):       00:32.48    00:31.89      0.03   [hinv]
O2 R12000 300MHz 1MB L2:             00:32.53    00:31.83      0.08   [hinv]
Onyx R10000 195MHz 1MB L2:           00:39.06    00:38.93      0.03   [hinv]
O2 R10000 250MHz 1MB L2:             00:39.59    00:38.82      0.04   [hinv]
Indigo2 R10000 195MHz 1MB L2:        00:39.68    00:39.16      0.06
Octane R10000 195MHz 1MB L2:         00:42.36    00:39.38      0.70
Origin200QC R10000 180MHz 2MB L2:    00:42.44    00:42.32      0.03   [hinv]
Indigo2 R10000 175MHz 1MB L2:        00:45.65    00:43.69      0.08   [hinv]
O2 R10000 195MHz 1MB L2:             00:50.74    00:49.58      0.14   [hinv]
Origin200 R10000 180MHz 1MB L2:      00:55.64    00:53.87      0.15   [hinv]
O2 R10000 175MHz 1MB L2:             00:57.56    00:55.58      0.14   [hinv]
Octane R10000 175MHz 1MB L2:         00:59.90    00:55.57      0.58   [hinv]
O2 R5200 300MHz 1MB L2:              01:03.78    01:02.65      0.06   [hinv]
Onyx R4400SC 250MHz 4MB L2:          01:25.11    01:24.73      0.05   [hinv]
Indigo2 R4400SC 250MHz 2MB L2:       01:39.87    01:37.77      0.17
O2 R5000SC 200MHz 1MB L2:            01:42.77    01:34.65      0.28
POWER Onyx R8000SC 90MHz 4MB L2:     01:36.25    01:35.49      0.10   [hinv]
Onyx R4400SC 200MHz 4MB L2:          01:46.63    01:45.94      0.08   [hinv]
Indy R5000SC 180MHz 512K L2:         01:48.83    01:45.30      0.30
O2 R5000SC 180MHz 512K L2:           01:50.75    01:45.20      0.31   [hinv]
POWER Onyx R8000SC 75MHz 4MB L2:     01:55.84    01:54.66      0.14   [hinv]
Indigo2 R4400SC 200MHz 1MB L2:       02:05.16    02:02.41      0.26
Indy R4400SC 200MHz 1MB L2:          02:06.35    02:02.97      0.41
Indy R5000SC 150MHz 512K L2:         02:09.06    02:06.16      0.25   [hinv]
Indy R5000PC 150MHz:                 02:07.52    02:12.15      0.62
Indy R4600SC 133MHz 512K L2:         02:30.44    02:26.03      0.50
Indy R4600PC 133MHz:                 02:32.36    02:24.83      0.62
Indigo R4400SC 150MHz 1MB L2:        02:48.09    02:43.43      0.44   [hinv]
Indigo2 R4400SC 150MHz 1MB L2:       02:49.41    02:43.65      0.42   [hinv]
Crimson R4400SC 150MHz 1MB L2:       02:52.66    02:51.54      0.19   [hinv]
Indy R4600PC 100MHz:                 03:33.14    03:20.84      1.53
POWER Series R4400SC 100MHz 1MB L2:  04:05.81    04:02.42      0.46   [hinv]
POWER Series R4000SC 100MHz 1MB L2:  06:20.84    06:15.53      0.68   [hinv]
Indigo R4000SC 100MHz 1MB L2:        06:40.09    06:23.73      1.56   [hinv]
Indigo2 R4000SC 100MHz 1MB L2:       06:42.95    06:23.68      1.74   [hinv]
POWER Series R3000 33MHz 256K L2:    09:10.20    09:04.15      2.93   [hinv]
POWER Series R3000 25MHz 256K L2:    12:03.47    11:49.01      4.98   [hinv]

             Table 36: Timex output for running RC5-64/DES.

The results are enlightening. They show that the O2 and slower Indys spent much longer starting up the program and initialising, while the Indigo2 and fast Indy did not. Look at the 'user' times: the O2 is faster, even though the overall elapsed time shows the opposite compared to Indigo2. Thus, when actually running the proper processing software, I expect the R5K/200 O2 would be quicker than R4400/250 Indigo2 (ie. the application startup time will quickly become unimportant).
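To make the comparison concrete, here is a small Python sketch that takes the real, user and sys figures for the two systems from Table 36 and works out how much of the wall-clock time was not charged as CPU time. The helper names are my own. Treating that remainder as startup and initialisation is the interpretation used above; in principle it could also include time spent waiting on other processes.

```python
def to_seconds(mmss):
    """Convert a 'mm:ss.ss' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + float(seconds)

def non_cpu_time(real, user, sys_seconds):
    """Wall-clock time not accounted for by user or sys CPU time."""
    return to_seconds(real) - to_seconds(user) - sys_seconds

# (real, user, sys) as listed in Table 36.
rows = {
    "Indigo2 R4400SC 250MHz": ("01:39.87", "01:37.77", 0.17),
    "O2 R5000SC 200MHz": ("01:42.77", "01:34.65", 0.28),
}

for name, (real, user, sys_s) in rows.items():
    print("%-24s user %6.2f s, unaccounted %5.2f s"
          % (name, to_seconds(user), non_cpu_time(real, user, sys_s)))
```

The O2 comes out with roughly 7.8 seconds unaccounted for against roughly 1.9 seconds for the Indigo2, while its user time is about 3 seconds shorter.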

Although I said earlier that the benchmark can be run on any system, the above observations do present a problem: how can one tell what proportion of a run time for a non-SGI system is the equivalent of the 'user' times shown above? I don't know is the short answer. Certainly, one shouldn't draw conclusions between different systems when the run times are small. In that sense, from an SGI perspective, the 'user' times in Table 36 are actually more useful even than the times given in Table 35. What this may mean for how one should analyse non-SGI systems is anyone's guess. It certainly makes things more complicated. Personally, I would only make inter-vendor judgements when the overall run times are very different (eg. comparing a Pentium200 to an R4400SC/200), ie. more than 10% apart, bearing in mind that other differences such as motherboard-type may be important.

Actually, at this point I have a small confession to make: this page includes the RC5/DES benchmark results precisely because there is more to these tests than meets the eye. Let this serve as a warning not to take apparent 'benchmark' results at face value - it's one reason why the SPEC95 tests are designed to take a very long time to run. Thinking along the same lines, this is also why I'm looking for much more complicated 3D graphics tests (complex 3D models).

In my opinion, a proper benchmarking mode for the RC5-64/DES program would offer the option of running much more than just 10 or 20 million tests; ten times that many would be much better, say 100 million. Better still, include a command-line option so that one can specify the number of keys to be test-processed. This would allow one to mask-out the effects of program initialisation and thus focus on the actual key-cracking processing power of the CPU.
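The effect of a longer run can be sketched with a simple model in Python: if a stopwatch also counts a fixed startup cost, the apparent rate is the number of keys divided by the processing time plus that cost. The true rate and the startup time below are hypothetical values chosen only for illustration, not measurements, and the model assumes the startup cost does not grow with the number of keys.

```python
def apparent_rate(n_keys, true_rate, startup_seconds):
    """Keys/sec as seen when a fixed startup cost is included in the timing."""
    return n_keys / (n_keys / true_rate + startup_seconds)

TRUE_RATE = 100_000.0   # hypothetical keys/sec of the CPU itself
STARTUP = 5.0           # hypothetical fixed startup cost in seconds

for n in (10_000_000, 100_000_000, 1_000_000_000):
    rate = apparent_rate(n, TRUE_RATE, STARTUP)
    loss = 100.0 * (1.0 - rate / TRUE_RATE)
    print("%13d keys: %9.1f keys/sec (%.2f%% below true rate)" % (n, rate, loss))
```

Under these assumptions the shortfall drops from about 4.8% at 10 million keys to about 0.5% at 100 million, which is the kind of masking a user-specified key count would allow.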

Credits:

[1] - Martin Doll, Ph.D. (mkhd@into.ch), Institute of Organic Chemistry, University of Zurich, Winterthurerstr. 157, CH-8057 Zurich.

[2] - Simon Pigot (simon@dpiwe.tas.gov.au), Parks and Wildlife GIS Unit, Tasmania. Supplied hinv of the POWER Series, Octane, O2, Origin200, Origin200QC and Onyxs: