########################################################################
This is the DARPA/DOE HPC Challenge Benchmark version 1.5.0 October 2012
Produced by Jack Dongarra and Piotr Luszczek
Innovative Computing Laboratory
University of Tennessee Knoxville and Oak Ridge National Laboratory

See the source files for authors of specific codes.
Compiled on Jul 18 2019 at 13:13:54
Current time (1565705605) is Tue Aug 13 16:13:25 2019

Hostname: 'phd-sid'
########################################################################
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 2560
NB     : 80
PMAP   : Column-major process mapping
P      : 1
Q      : 1
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

Begin of MPIRandomAccess section.
Running on 1 processors (PowerofTwo)
Total Main table size = 2^22 = 4194304 words
PE Main table size = 2^22 = 4194304 words/PE
Default number of updates (RECOMMENDED) = 16777216
Number of updates EXECUTED = 16777216 (for a TIME BOUND of 60.00 secs)
CPU time used = 2.043656 seconds
Real time used = 2.043702 seconds
0.008209228 Billion(10^9) Updates per second [GUP/s]
0.008209228 Billion(10^9) Updates/PE per second [GUP/s]
Verification: CPU time used = 0.441646 seconds
Verification: Real time used = 0.441731 seconds
Found 0 errors in 4194304 locations (passed).
Current time (1565705607) is Tue Aug 13 16:13:27 2019
End of MPIRandomAccess section.

Begin of StarRandomAccess section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.381476 seconds
Real time used = 0.381513 seconds
0.043975493 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Minimum GUP/s 0.043975
Average GUP/s 0.043975
Maximum GUP/s 0.043975
Current time (1565705608) is Tue Aug 13 16:13:28 2019
End of StarRandomAccess section.

Begin of SingleRandomAccess section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.376862 seconds
Real time used = 0.376865 seconds
0.044517894 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Node selected 0
Single GUP/s 0.044518
Current time (1565705609) is Tue Aug 13 16:13:29 2019
End of SingleRandomAccess section.

Begin of MPIRandomAccess_LCG section.
Running on 1 processors (PowerofTwo)
Total Main table size = 2^22 = 4194304 words
PE Main table size = 2^22 = 4194304 words/PE
Default number of updates (RECOMMENDED) = 16777216
Number of updates EXECUTED = 16777216 (for a TIME BOUND of 60.00 secs)
CPU time used = 2.601211 seconds
Real time used = 2.601313 seconds
0.006449518 Billion(10^9) Updates per second [GUP/s]
0.006449518 Billion(10^9) Updates/PE per second [GUP/s]
Verification: CPU time used = 0.434888 seconds
Verification: Real time used = 0.434899 seconds
Found 0 errors in 4194304 locations (passed).
Current time (1565705612) is Tue Aug 13 16:13:32 2019
End of MPIRandomAccess_LCG section.

Begin of StarRandomAccess_LCG section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.355066 seconds
Real time used = 0.355123 seconds
0.047243387 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Minimum GUP/s 0.047243
Average GUP/s 0.047243
Maximum GUP/s 0.047243
Current time (1565705613) is Tue Aug 13 16:13:33 2019
End of StarRandomAccess_LCG section.

Begin of SingleRandomAccess_LCG section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.354582 seconds
Real time used = 0.354614 seconds
0.047311271 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Node selected 0
Single GUP/s 0.047311
Current time (1565705613) is Tue Aug 13 16:13:33 2019
End of SingleRandomAccess_LCG section.

Begin of PTRANS section.
M: 1280
N: 1280
MB: 80
NB: 80
P: 1
Q: 1
TIME     M     N  MB  NB   P   Q     TIME  CHECK     GB/s RESID
---- ----- ----- --- --- --- --- -------- ------ -------- -----
WALL  1280  1280  80  80   1   1     0.02 PASSED    0.843  0.00
CPU   1280  1280  80  80   1   1     0.02 PASSED    0.844  0.00
WALL  1280  1280  80  80   1   1     0.02 PASSED    0.843  0.00
CPU   1280  1280  80  80   1   1     0.02 PASSED    0.849  0.00
WALL  1280  1280  80  80   1   1     0.02 PASSED    0.843  0.00
CPU   1280  1280  80  80   1   1     0.02 PASSED    0.846  0.00
WALL  1280  1280  80  80   1   1     0.02 PASSED    0.843  0.00
CPU   1280  1280  80  80   1   1     0.02 PASSED    0.848  0.00
WALL  1280  1280  80  80   1   1     0.02 PASSED    0.842  0.00
CPU   1280  1280  80  80   1   1     0.02 PASSED    0.843  0.00

Finished 5 tests, with the following results:
5 tests completed and passed residual checks.
0 tests completed and failed residual checks.
0 tests skipped because of illegal input values.

END OF TESTS.
Current time (1565705614) is Tue Aug 13 16:13:34 2019
End of PTRANS section.

Begin of StarDGEMM section.
Scaled residual: 0.00913512
Node(s) with error 0
Minimum Gflop/s 24.890206
Average Gflop/s 24.890206
Maximum Gflop/s 24.890206
Current time (1565705615) is Tue Aug 13 16:13:35 2019
End of StarDGEMM section.

Begin of SingleDGEMM section.
Scaled residual: 0.0056868
Node(s) with error 0
Node selected 0
Single DGEMM Gflop/s 38.765038
Current time (1565705615) is Tue Aug 13 16:13:35 2019
End of SingleDGEMM section.

Begin of StarSTREAM section.
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2184533, Offset = 0
Total memory required = 0.0488 GiB.
Each test is run 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
The SCALAR value used for this run is 0.420000
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 5830 microseconds.
   (= 5830 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
VERBOSE: total setup time for rank 0 = 0.030306 seconds
-------------------------------------------------------------
Function   Rate (GB/s)   Avg time   Min time   Max time
Copy:           4.8433     0.0078     0.0072     0.0088
Scale:          4.7400     0.0077     0.0074     0.0088
Add:            4.9909     0.0111     0.0105     0.0131
Triad:          4.9897     0.0112     0.0105     0.0134
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Node(s) with error 0
Minimum Copy GB/s 4.843338
Average Copy GB/s 4.843338
Maximum Copy GB/s 4.843338
Minimum Scale GB/s 4.740036
Average Scale GB/s 4.740036
Maximum Scale GB/s 4.740036
Minimum Add GB/s 4.990861
Average Add GB/s 4.990861
Maximum Add GB/s 4.990861
Minimum Triad GB/s 4.989700
Average Triad GB/s 4.989700
Maximum Triad GB/s 4.989700
Current time (1565705616) is Tue Aug 13 16:13:36 2019
End of StarSTREAM section.

Begin of SingleSTREAM section.
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2184533, Offset = 0
Total memory required = 0.0488 GiB.
Each test is run 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
The SCALAR value used for this run is 0.420000
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 4714 microseconds.
   (= 4714 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
VERBOSE: total setup time for rank 0 = 0.030273 seconds
-------------------------------------------------------------
Function   Rate (GB/s)   Avg time   Min time   Max time
Copy:           5.0908     0.0081     0.0069     0.0088
Scale:          4.8398     0.0085     0.0072     0.0088
Add:            5.2874     0.0118     0.0099     0.0131
Triad:          5.4024     0.0115     0.0097     0.0134
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Node(s) with error 0
Node selected 0
Single STREAM Copy GB/s 5.090752
Single STREAM Scale GB/s 4.839811
Single STREAM Add GB/s 5.287395
Single STREAM Triad GB/s 5.402445
Current time (1565705616) is Tue Aug 13 16:13:36 2019
End of SingleSTREAM section.

Begin of MPIFFT section.
Number of nodes: 1
Vector size: 524288
Generation time: 0.033
Tuning: 0.023
Computing: 0.065
Inverse FFT: 0.068
max(|x-x0|): 1.299e-15
Gflop/s: 0.763
Current time (1565705616) is Tue Aug 13 16:13:36 2019
End of MPIFFT section.

Begin of StarFFT section.
Vector size: 1048576
Generation time: 0.061
Tuning: 0.000
Computing: 0.072
Inverse FFT: 0.079
max(|x-x0|): 1.698e-15
Node(s) with error 0
Minimum Gflop/s 1.459255
Average Gflop/s 1.459255
Maximum Gflop/s 1.459255
Current time (1565705617) is Tue Aug 13 16:13:37 2019
End of StarFFT section.

Begin of SingleFFT section.
Vector size: 1048576
Generation time: 0.064
Tuning: 0.000
Computing: 0.076
Inverse FFT: 0.079
max(|x-x0|): 1.698e-15
Node(s) with error 0
Node selected 0
Single FFT Gflop/s 1.383714
Current time (1565705617) is Tue Aug 13 16:13:37 2019
End of SingleFFT section.

Begin of LatencyBandwidth section.
Current time (1565705617) is Tue Aug 13 16:13:37 2019
End of LatencyBandwidth section.

Begin of HPL section.
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 2560
NB     : 80
PMAP   : Column-major process mapping
P      : 1
Q      : 1
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11C2R4        2560    80     1     1               0.53              2.116e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0034721 ...... PASSED
================================================================================

Finished 1 tests with the following results:
1 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
Current time (1565705618) is Tue Aug 13 16:13:38 2019
End of HPL section.

Begin of Summary section.
VersionMajor=1
VersionMinor=5
VersionMicro=0
VersionRelease=f
LANG=C
Success=1
sizeof_char=1
sizeof_short=2
sizeof_int=4
sizeof_long=8
sizeof_void_ptr=8
sizeof_size_t=8
sizeof_float=4
sizeof_double=8
sizeof_s64Int=8
sizeof_u64Int=8
sizeof_struct_double_double=16
CommWorldProcs=1
MPI_Wtick=1.000000e-09
HPL_Tflops=0.0211623
HPL_time=0.52899
HPL_eps=1.11022e-16
HPL_RnormI=1.81171e-12
HPL_Anorm1=666.101
HPL_AnormI=663.835
HPL_Xnorm1=1494.04
HPL_XnormI=2.76479
HPL_BnormI=0.499975
HPL_N=2560
HPL_NB=80
HPL_nprow=1
HPL_npcol=1
HPL_depth=1
HPL_nbdiv=2
HPL_nbmin=4
HPL_cpfact=R
HPL_crfact=C
HPL_ctop=1
HPL_order=C
HPL_dMACH_EPS=1.110223e-16
HPL_dMACH_SFMIN=2.225074e-308
HPL_dMACH_BASE=2.000000e+00
HPL_dMACH_PREC=2.220446e-16
HPL_dMACH_MLEN=5.300000e+01
HPL_dMACH_RND=1.000000e+00
HPL_dMACH_EMIN=-1.021000e+03
HPL_dMACH_RMIN=2.225074e-308
HPL_dMACH_EMAX=1.024000e+03
HPL_dMACH_RMAX=1.797693e+308
HPL_sMACH_EPS=5.960464e-08
HPL_sMACH_SFMIN=1.175494e-38
HPL_sMACH_BASE=2.000000e+00
HPL_sMACH_PREC=1.192093e-07
HPL_sMACH_MLEN=2.400000e+01
HPL_sMACH_RND=1.000000e+00
HPL_sMACH_EMIN=-1.250000e+02
HPL_sMACH_RMIN=1.175494e-38
HPL_sMACH_EMAX=1.280000e+02
HPL_sMACH_RMAX=3.402823e+38
dweps=1.110223e-16
sweps=5.960464e-08
HPLMaxProcs=1
HPLMinProcs=1
DGEMM_N=1477
StarDGEMM_Gflops=24.8902
SingleDGEMM_Gflops=38.765
PTRANS_GBs=0.841939
PTRANS_time=0.0155679
PTRANS_residual=0
PTRANS_n=1280
PTRANS_nb=80
PTRANS_nprow=1
PTRANS_npcol=1
MPIRandomAccess_LCG_N=4194304
MPIRandomAccess_LCG_time=2.60131
MPIRandomAccess_LCG_CheckTime=0.434899
MPIRandomAccess_LCG_Errors=0
MPIRandomAccess_LCG_ErrorsFraction=0
MPIRandomAccess_LCG_ExeUpdates=16777216
MPIRandomAccess_LCG_GUPs=0.00644952
MPIRandomAccess_LCG_TimeBound=60
MPIRandomAccess_LCG_Algorithm=0
MPIRandomAccess_N=4194304
MPIRandomAccess_time=2.0437
MPIRandomAccess_CheckTime=0.441731
MPIRandomAccess_Errors=0
MPIRandomAccess_ErrorsFraction=0
MPIRandomAccess_ExeUpdates=16777216
MPIRandomAccess_GUPs=0.00820923
MPIRandomAccess_TimeBound=60
MPIRandomAccess_Algorithm=0
RandomAccess_LCG_N=4194304
StarRandomAccess_LCG_GUPs=0.0472434
SingleRandomAccess_LCG_GUPs=0.0473113
RandomAccess_N=4194304
StarRandomAccess_GUPs=0.0439755
SingleRandomAccess_GUPs=0.0445179
STREAM_VectorSize=2184533
STREAM_Threads=1
StarSTREAM_Copy=4.84334
StarSTREAM_Scale=4.74004
StarSTREAM_Add=4.99086
StarSTREAM_Triad=4.9897
SingleSTREAM_Copy=5.09075
SingleSTREAM_Scale=4.83981
SingleSTREAM_Add=5.28739
SingleSTREAM_Triad=5.40244
FFT_N=1048576
StarFFT_Gflops=1.45925
SingleFFT_Gflops=1.38371
MPIFFT_N=524288
MPIFFT_Gflops=0.763105
MPIFFT_maxErr=1.29948e-15
MPIFFT_Procs=1
MaxPingPongLatency_usec=-1
RandomlyOrderedRingLatency_usec=-1
MinPingPongBandwidth_GBytes=-1
NaturallyOrderedRingBandwidth_GBytes=-1
RandomlyOrderedRingBandwidth_GBytes=-1
MinPingPongLatency_usec=-1
AvgPingPongLatency_usec=-1
MaxPingPongBandwidth_GBytes=-1
AvgPingPongBandwidth_GBytes=-1
NaturallyOrderedRingLatency_usec=-1
FFTEnblk=16
FFTEnp=8
FFTEl2size=1048576
M_OPENMP=-1
omp_get_num_threads=0
omp_get_max_threads=0
omp_get_num_procs=0
MemProc=64
MemSpec=-1
MemVal=-1
MPIFFT_time0=9.72301e-07
MPIFFT_time1=0.00967972
MPIFFT_time2=0.010832
MPIFFT_time3=0.00386491
MPIFFT_time4=0.0288869
MPIFFT_time5=0.00850896
MPIFFT_time6=6.25849e-07
CPS_HPCC_FFT_235=0
CPS_HPCC_FFTW_ESTIMATE=0
CPS_HPCC_MEMALLCTR=0
CPS_HPL_USE_GETPROCESSTIMES=0
CPS_RA_SANDIA_NOPT=0
CPS_RA_SANDIA_OPT2=0
CPS_USING_FFTW=0
End of Summary section.
########################################################################
End of HPC Challenge tests.
Current time (1565705618) is Tue Aug 13 16:13:38 2019

########################################################################
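
The Summary section of the log above is a flat list of key=value pairs, and several of its rates can be recomputed from the sizes and times reported next to them. The following is a minimal sketch of such a cross-check, not part of the benchmark itself. The helper names (parse_summary, hpl_tflops, gups, ptrans_gbs) are hypothetical, and the excerpt holds only a few values copied from this log. The HPL operation count 2/3 N^3 + 3/2 N^2 is the usual HPL convention; the PTRANS formula (N^2 words of 8 bytes divided by the time) is an assumption that happens to reproduce the reported figure here.

```python
import math

# A few key=value lines copied from the Summary section of the log above.
SUMMARY_EXCERPT = """
Begin of Summary section.
HPL_Tflops=0.0211623
HPL_time=0.52899
HPL_N=2560
PTRANS_GBs=0.841939
PTRANS_time=0.0155679
PTRANS_n=1280
MPIRandomAccess_time=2.0437
MPIRandomAccess_ExeUpdates=16777216
MPIRandomAccess_GUPs=0.00820923
End of Summary section.
"""


def parse_summary(text):
    """Collect key=value pairs found between the Summary section markers."""
    values = {}
    in_summary = False
    for line in text.splitlines():
        line = line.strip()
        if line == "Begin of Summary section.":
            in_summary = True
        elif line == "End of Summary section.":
            break
        elif in_summary and "=" in line:
            key, _, raw = line.partition("=")
            try:
                values[key] = float(raw)
            except ValueError:
                values[key] = raw  # non-numeric entries such as LANG=C
    return values


def hpl_tflops(n, seconds):
    """HPL rate in Tflop/s from the 2/3 N^3 + 3/2 N^2 operation count."""
    flops = (2.0 / 3.0) * n ** 3 + 1.5 * n ** 2
    return flops / seconds / 1e12


def gups(updates, seconds):
    """RandomAccess rate in billions (10^9) of updates per second."""
    return updates / seconds / 1e9


def ptrans_gbs(n, seconds, word_bytes=8):
    """Assumed PTRANS rate: an n-by-n matrix of 8-byte words per elapsed time."""
    return n * n * word_bytes / seconds / 1e9


if __name__ == "__main__":
    s = parse_summary(SUMMARY_EXCERPT)
    checks = [
        ("HPL_Tflops", hpl_tflops(s["HPL_N"], s["HPL_time"])),
        ("MPIRandomAccess_GUPs",
         gups(s["MPIRandomAccess_ExeUpdates"], s["MPIRandomAccess_time"])),
        ("PTRANS_GBs", ptrans_gbs(s["PTRANS_n"], s["PTRANS_time"])),
    ]
    for key, recomputed in checks:
        # The log rounds its values, so compare with a relative tolerance.
        ok = math.isclose(recomputed, s[key], rel_tol=1e-4)
        print(f"{key}: reported={s[key]} recomputed={recomputed:.6g} match={ok}")
```

To check a full run, the same parse_summary function could be given the complete text of the output file instead of the excerpt; the relative tolerance only absorbs the rounding of the printed values.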