Changes between Version 15 and Version 16 of LimulusBenchmarks
Timestamp: 02/03/16 08:45:56
LimulusBenchmarks
v15 v16
  3   3   === HPL Performance ===
  4   4
  5     - ==== i5-2400S ====
      5 + ==== i5-2400S Sandy Bridge ====
  6   6    * '''200.3 GFLOPS''' N=40220 ([wiki:i5-2400S-Raw-HPL Raw HPL Results])
  7   7    * 58% of Peak (3.3GHz * 4 cores * 8 DP FLOPS/cycle) + (2.5Ghz * 12 cores * 8 FLOPS/cycle) = 345.6 GFLOPS Peak
  8     -  * Three i5-2400S and one i5-2500K each with 4 MB RAM, GbE, Intel MKL and compilers
      8 +  * Three i5-2400S and one i5-2500K each with 4 MB DDR3 RAM, GbE, Intel MKL and compilers
  9   9
 10     - ==== i5-3470S ====
     10 + ==== i5-3470S Ivybridge ====
 11  11    * '''256.4 GFLOPS''' N=58800 ([wiki:i5-3740S-Raw-HPL Raw HPL Results])
 12  12    * 69% of Peak (2.9GHz * 16 cores * 8 DP FLOPS/cycle) = 371.2 GFLOPS Peak
 13     -  * Four i5-3470S each with 8 MB RAM, GbE, Intel MKL and compilers
     13 +  * Four i5-3470S each with 8 MB DDR3 RAM, GbE, Intel MKL and compilers
 14  14
 15     - ==== i5-4570S ====
     15 + ==== i5-4570S Haswell ====
 16  16
 17  17    * '''385.5 GFLOPS''' N=60000 ([wiki:i5-4570S-Raw-HPL Raw HPL Results])
 18  18    * 52% of Peak (2.9GHz * 16 cores * 16 DP FLOPS/cycle) = 742.4 GFLOPS Peak (Note: Haswell is now 16 FLOPS/cycle)
 19     -  * Four i5-4570S each with 8 MB RAM, GbE, Intel MKL and compilers
     19 +  * Four i5-4570S each with 8 MB DDR3 RAM, GbE, Intel MKL and compilers
 20  20
 21     - ==== i7-4770S ====
     21 +  * '''567.4 GFLOPS''' N=60000 ([wiki:i5-4570S-Raw-HPL Raw HPL Results])
     22 +  * 76% of Peak (2.9GHz * 16 cores * 16 DP FLOPS/cycle) = 742.4 GFLOPS Peak (Note: Haswell is now 16 FLOPS/cycle)
     23 +  * Four i5-4570S each with 8 MB DDR3 RAM, '''10-GbE''', Intel MKL and compilers
     24 +
     25 + ==== i7-4770S Haswell ====
 22  26
 23  27    * '''444.8 GFLOPS''' N=86000 ([wiki:i7-4570S-Raw-64G-HPL Raw HPL Results])
 24  28    * 56% of Peak (3.1GHz * 16 cores * 16 DP FLOPS/cycle) = 742.4 GFLOPS Peak (Note: Haswell is now 16 FLOPS/cycle)
 25     -  * Four i7-4770S each with 16 MB RAM, GbE, Intel MKL and compilers
     29 +  * Four i7-4770S each with 16 MB DDR3 RAM, GbE, Intel MKL and compilers
 26  30
 27  31    * '''498.3 GFLOPS''' N=126000 ([wiki:i5-4570S-Raw-128G-HPL Raw HPL Results])
 28  32    * 62% of Peak (3.1 GHz * 16 cores * 16 DP FLOPS/cycle) = 742.4 GFLOPS Peak (Note: Haswell is now 16 FLOPS/cycle)
 29     -  * Four i7-4770S each with 32 MB RAM, GbE, Intel MKL and compilers
     33 +  * Four i7-4770S each with 32 MB DDR3 RAM, GbE, Intel MKL and compilers
 30  34
 31  35
     36 + ==== i5-6500 Skylake ====
 32  37
     38 +  * '''480.2 GFLOPS''' N=86000 ([wiki:i5-6500-Raw-64G-HPL Raw HPL Results])
     39 +  * 59% of Peak (3.2GHz * 16 cores * 16 DP FLOPS/cycle) = 819.2 GFLOPS Peak (Note: Skylake is 16 FLOPS/cycle)
     40 +  * Four i5-6500 each with 16 MB DDR4 RAM, GbE, Intel MKL and compilers
 33  41
 34  42
  …   …
 87  95
 88  96
 89     - '''These tests are very preliminary using old hardware and untuned software.'''
     97 + '''These tests are preliminary using old hardware and untuned software. This approach has been abandoned. The results remain for reference.'''
 90  98
 91  99   The first issue is how to add 10GigE (GbE) without a switch using low cost dual port GigE cards. Create a four node loop using [http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge#Bridge_priority Ethernet Bridge]. A simple loop will require a Spanning Tree Protocol, but that cuts a link and introduces one 2-hop route, two 1-hop routes, and three 0-hop routes. If one of the bridges is replaced by a [http://www.linuxfoundation.org/collaborate/workgroups/networking/bonding bonded link] in "mode 0" (round robin) then there are ''effectively'' four 1-hop routes and two 0-hop routes, but no route ''effectively'' takes more than "1-hop" in terms of latency and bandwidth.
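
The "% of Peak" figures in the HPL entries above follow from clock rate × core count × DP FLOPS per cycle, summed over each group of identical cores. The sketch below recomputes them from the numbers on this page; the function names `peak_gflops` and `percent_of_peak` are hypothetical helpers, not part of HPL or of the wiki. One thing it surfaces: for the i7-4770S entries, 3.1 × 16 × 16 works out to 793.6 GFLOPS rather than the 742.4 listed, and the quoted 56% and 62% are consistent with 793.6.

```python
def peak_gflops(groups):
    """Theoretical peak in GFLOPS.

    groups: iterable of (clock_ghz, cores, dp_flops_per_cycle) tuples,
    one tuple per set of identical cores in the cluster.
    """
    return sum(ghz * cores * flops for ghz, cores, flops in groups)


def percent_of_peak(measured_gflops, groups):
    """Measured HPL result as a percentage of theoretical peak."""
    return 100.0 * measured_gflops / peak_gflops(groups)


# (measured GFLOPS, core groups) copied from the entries on this page.
systems = {
    "i5-2400S Sandy Bridge": (200.3, [(3.3, 4, 8), (2.5, 12, 8)]),
    "i5-3470S Ivybridge": (256.4, [(2.9, 16, 8)]),
    "i5-4570S Haswell, GbE": (385.5, [(2.9, 16, 16)]),
    "i5-4570S Haswell, 10-GbE": (567.4, [(2.9, 16, 16)]),
    "i7-4770S Haswell, N=86000": (444.8, [(3.1, 16, 16)]),
    "i7-4770S Haswell, N=126000": (498.3, [(3.1, 16, 16)]),
    "i5-6500 Skylake": (480.2, [(3.2, 16, 16)]),
}

for name, (measured, groups) in systems.items():
    peak = peak_gflops(groups)
    pct = percent_of_peak(measured, groups)
    print(f"{name}: peak {peak:.1f} GFLOPS, measured {measured} GFLOPS, {pct:.1f}% of peak")
```

Keeping each core group as its own tuple lets the mixed i5-2400S/i5-2500K cluster be expressed the same way as the homogeneous ones.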