Metrics Oct 2021

Pei-Hung Lin edited this page Oct 13, 2021 · 7 revisions

Evaluation platform:

Intel Inspector: LLNL Quartz, Intel(R) Xeon(R) CPU E5-2695

Other tools: Docker container on OSX, 2.9 GHz Quad-Core Intel Core i7

Metrics report:

File-level checking:

| Tool | Language | TP | FP | TN | FN | Recall | Specificity | Precision | Accuracy | TSR | Adjusted F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Intel Inspector | C/C++ | 76 | 41 | 46 | 10 | 0.837 | 0.5287 | 0.6495 | 0.7052 | 0.9558 | 0.7487 |
| ROMP | C/C++ | 62 | 13 | 65 | 18 | 0.775 | 0.8333 | 0.8266 | 0.8037 | 0.8729 | 0.8000 |
| ThreadSanitizer | C/C++ | 69 | 1 | 89 | 20 | 0.7752 | 0.9888 | 0.9857 | 0.8826 | 0.9889 | 0.8679 |
| Coderrect | C/C++ | 74 | 2 | 87 | 14 | 0.8409 | 0.9775 | 0.9736 | 0.9096 | 0.9779 | 0.9024 |
| LLOV | C/C++ | 58 | 9 | 78 | 29 | 0.6666 | 0.8965 | 0.8656 | 0.7816 | 0.9613 | 0.7532 |
| Intel Inspector | Fortran | 66 | 11 | 65 | 17 | 0.7951 | 0.8552 | 0.8571 | 0.8238 | 0.9464 | 0.825 |
| ROMP | Fortran | 57 | 10 | 54 | 20 | 0.7402 | 0.8437 | 0.8507 | 0.7872 | 0.8392 | 0.7916 |
| ThreadSanitizer | Fortran | 52 | 0 | 65 | 15 | 0.7761 | 1 | 1 | 0.8863 | 0.7857 | 0.8739 |
| Coderrect | Fortran | 52 | 0 | 66 | 15 | 0.8148 | 0.8533 | 0.8571 | 0.8333 | 0.9398 | 0.8354 |
| LLOV | Fortran | 40 | 11 | 70 | 36 | 0.5263 | 0.8641 | 0.7843 | 0.7006 | 0.9457 | 0.6299 |

Fine-grain checking with source line information:

| Tool | Language | TP | FP | TN | FN | Recall | Specificity | Precision | Accuracy | TSR | Adjusted F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Intel Inspector | C/C++ | 77 | 154 | 112 | 25 | 0.755 | 0.421 | 0.333 | 0.514 | 0.9558 | 0.442 |
| ROMP | C/C++ | 59 | 37 | 194 | 43 | 0.578 | 0.840 | 0.615 | 0.721 | 0.8729 | 0.520 |
| ThreadSanitizer | C/C++ | 67 | 19 | 209 | 35 | 0.657 | 0.917 | 0.779 | 0.812 | 0.9889 | 0.705 |
| Coderrect | C/C++ | 64 | 14 | 214 | 38 | 0.627 | 0.939 | 0.821 | 0.810 | 0.9779 | |
| LLOV | C/C++ | 19 | 11 | 216 | 83 | 0.186 | 0.952 | 0.633 | 0.607 | 0.9613 | 0.277 |
| Intel Inspector | Fortran | 64 | 51 | 175 | 30 | 0.681 | 0.774 | 0.556 | 0.747 | 0.946 | 0.579 |
| ROMP | Fortran | 49 | 57 | 173 | 45 | 0.521 | 0.752 | 0.462 | 0.655 | 0.839 | 0.411 |
| ThreadSanitizer | Fortran | 47 | 29 | 196 | 47 | 0.5 | 0.871 | 0.618 | 0.723 | 0.7857 | 0.434 |
| Coderrect | Fortran | 44 | 28 | 197 | 50 | 0.468 | 0.876 | 0.611 | 0.711 | 0.9398 | 0.498 |
| LLOV | Fortran | 26 | 25 | 200 | 68 | 0.277 | 0.889 | 0.510 | 0.633 | 0.9457 | 0.339 |

Fine-grain checking with source line information & variable read/write access type:

| Tool | Language | TP | FP | TN | FN | Recall | Specificity | Precision | Accuracy | TSR | Adjusted F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Intel Inspector | C/C++ | 83 | 192 | 100 | 28 | 0.748 | 0.342 | 0.302 | 0.454 | 0.9558 | 0.411 |
| ThreadSanitizer | C/C++ | 67 | 20 | 229 | 44 | 0.604 | 0.920 | 0.770 | 0.787 | 0.9889 | 0.669 |
| LLOV | C/C++ | 19 | 11 | 237 | 90 | 0.174 | 0.956 | 0.633 | 0.608 | 0.9613 | 0.263 |
| Intel Inspector | Fortran | 66 | 81 | 158 | 34 | 0.66 | 0.661 | 0.449 | 0.661 | 0.9464 | 0.506 |
| ThreadSanitizer | Fortran | 35 | 46 | 187 | 65 | 0.35 | 0.803 | 0.432 | 0.610 | 0.7857 | 0.304 |
| LLOV | Fortran | 26 | 25 | 208 | 73 | 0.263 | 0.893 | 0.510 | 0.629 | 0.9457 | 0.328 |

Metrics formula:

precision (P) = TP / (TP + FP)

recall (R) = TP / (TP + FN)

accuracy (A) = (TP + TN) / (TP + FP + TN + FN)

specificity (S) = TN / (TN + FP)

F1 score (F1) = 2 * (P * R) / (P + R)

test support rate (TSR) = (TP + FP + TN + FN) / (total number of tests)

adjusted F1 (AF1) = TSR * F1
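
The formulas above can be sketched in a few lines of Python to recompute a table row from its raw counts. This is only an illustrative sketch, not the script used to produce this report: `Counts`, `ratio`, and `compute_metrics` are hypothetical names, and the total test count of 181 used in the example is an assumption, chosen because 158 / 181 matches the TSR reported for ROMP on C/C++.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one tool and language (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int


def ratio(numerator: float, denominator: float) -> float:
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def compute_metrics(c: Counts, total_tests: int) -> dict:
    """Apply the formulas above; total_tests is the denominator of TSR."""
    supported = c.tp + c.fp + c.tn + c.fn  # tests the tool produced a verdict for
    precision = ratio(c.tp, c.tp + c.fp)
    recall = ratio(c.tp, c.tp + c.fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    tsr = ratio(supported, total_tests)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": ratio(c.tp + c.tn, supported),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "f1": f1,
        "tsr": tsr,
        "adjusted_f1": tsr * f1,
    }


# ROMP, C/C++, file-level counts from the first table.
# total_tests=181 is an assumption (not stated on this page).
romp_c = compute_metrics(Counts(tp=62, fp=13, tn=65, fn=18), total_tests=181)
for name, value in romp_c.items():
    print(f"{name}: {value:.4f}")
```

The script prints values rounded to four decimals, so the last digit may differ from the tables where those values were cut off rather than rounded.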

Tool & compiler version:

| Tool | Version | Compiler |
|---|---|---|
| ThreadSanitizer | 10.0.0 | Clang/LLVM 10.0.0 |
| TSan(GPU) | 12.0 | Clang/LLVM 12.0.0, gfortran 10.3.0 |
| Intel Inspector | 2021.1 (build 604894) | Intel Compiler 2021.3.0 |
| ROMP | 20ac93c | GCC/gfortran 7.4.0 |
| Coderrect | 0.8.0 | Clang/LLVM 9.0.0 |
| LLOV | N/A | Clang/LLVM 6.0.1 |

Note that Coderrect does not require a specific compiler version, as long as the user's code can be built on their machine. However, for the static analysis Coderrect generates LLVM IR using Clang-9.0.0, which is packaged into the tool.