From c0c77bd58936d70d4c615f27835df0f9ee249c29 Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Thu, 17 Dec 2020 11:35:30 +0000
Subject: [PATCH 1/7] update ParallelDesign.md wiki page

---
 wiki/ParallelDesign.md | 102 +++++++++++++++++++++--------------------
 1 file changed, 53 insertions(+), 49 deletions(-)
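
Note: the Split-Process-Merge pattern documented by this patch can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's C++ API: `VolumeProcessor`, `run` and the box-shaped parts are hypothetical names for illustration; only the three stages mirror `preprocess()`, `processItem(index)` and `postprocess()` from the wiki text.

```python
from concurrent.futures import ThreadPoolExecutor


class VolumeProcessor:
    """Hypothetical processor: computes the volume of each box-shaped part."""

    def __init__(self, parts):
        self.parts = parts   # input items, read-only during processing
        self.results = []    # one result slot per item
        self.total = None

    def preprocess(self):
        # Serial prepare/split stage: allocate all result slots up front,
        # so worker threads never resize the shared list.
        self.results = [None] * len(self.parts)

    def process_item(self, index):
        # Parallel stage: touches only the slot that belongs to this item.
        dx, dy, dz = self.parts[index]
        self.results[index] = dx * dy * dz

    def postprocess(self):
        # Serial merge stage, the counterpart of Reduce in MapReduce.
        self.total = sum(self.results)


def run(processor, workers=4):
    processor.preprocess()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every item and re-raises any worker exception.
        list(pool.map(processor.process_item, range(len(processor.parts))))
    processor.postprocess()
    return processor


demo = run(VolumeProcessor([(1, 2, 3), (2, 2, 2), (1, 1, 1)]))
print(demo.results, demo.total)
```

Allocating the result list in the serial stage is the same idea as resizing the `std::vector<>` before the parallel run: each worker writes only to its own index, so no lock is needed.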

diff --git a/wiki/ParallelDesign.md b/wiki/ParallelDesign.md
index ef040fee..00679904 100644
--- a/wiki/ParallelDesign.md
+++ b/wiki/ParallelDesign.md
@@ -1,63 +1,42 @@
 # Parallel Design
 
-## Workflow Topology
+## Split-Process-Merge pattern
 
-Currenlty only linear pipeline (single-way linear) is supported, more complicated graph dataflow as in Apache NIFI,  Tensorflow, has not been implemented. 
+The design of **parallel-preprocessor** aims to let researchers focus on the algorithms in their own domain, no matter whether the code runs on a GPU or a supercomputer. This Split-Process-Merge pattern shares its idea with the **MapReduce** programming model developed by Google.
 
-https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#User_Interface
+**MapReduce** is a simplified distributed programming model and an efficient task scheduling model for parallel operations on large-scale data sets (greater than 1TB). The idea of MapReduce is to decompose the problem into a Map (mapping) stage and a Reduce (aggregation) stage. First, the data is cut into independent blocks by the `Map` program and distributed (scheduled) to a large number of computers/threads for parallel computation. Then the results are aggregated and output by the `Reduce` program.
 
-`Cpp-Taskflow` library is by far a faster, more expressive, and easier for drop-in integration for the single-way linear task programming.
+The design of **parallel-preprocessor** is based on the fact that a large data set, such as the assembly of a whole fusion reactor CAD model, can be viewed as a cluster of parts (part: a single self-contained 3D shape that can be designed and saved). Each part is an item on which a computation can run independently in its own thread. For example, calculating geometrical metadata such as volume and center of mass can be run in parallel without modifying the nearby parts. The developer only needs to write a function `processItem(index)` for each part. Multi-thread parallelization is conducted at the part level.
 
-## Executor
-Currently, only CPU device is supported, althoug GPU acceleraton will be considered in the future.  It is expected the Processor should has a characteristics of  ` enum DevicePreference {CPU, GPU}`
+A computation on one item may alter the state of other items; this is called a coupled operation, e.g. imprinting a part alters the data structure of the nearby parts in contact. Parallel computation is still possible, given that a part is only in contact with a very small portion of the parts in the assembly. The item pairs `(i, j)` and `(x, y)` can be processed at the same time if the two pairs do not affect each other, see also [Benchmarking.md](Benchmarking.md).
 
-+ multiple threading (shared memory), TBB task_group is used, but can be switched to a single-header thread pool implementation.
-  synchronous multi-threading (wait until all threads complete tasks) and asynchronous dispatcher (now default) have been implemented.
-  ![asynchronous dispatcher can fully utilise multi-core CPU](./PPP_asyn_multiple_threading.png)
 
-+ MPI (distributive, NUMA)
-  This is useful for distributive meshing, since memory capacity on a single multi-core node may be not sufficient to mesh large assemblies.
-  [HPX](https://github.com/STEllAR-GROUP/hpx) is a library that unitify the API for multi-threading and distributve parallelization.
-  libraries such as `OpenMP Tasking` and `Intel TBB FlowGraph` in handling complex parallel workloads. TensorFlow has the concept of graph simplication. 
+Some tasks, such as memory allocation, are not allowed to run in parallel. Two functions, `preprocess()` and `postprocess()`, do the prepare/split and cleanup/merge tasks in serial mode; the Merge stage corresponds to the Reduce concept in the MapReduce programming model.
 
-### Thread Pool Executor
-+ currently, intel TBB `task_group`, internally it could be a threadpool, is used
-+ OpenCASCADE has `ThreadPool` class, can be a choice
-  <https://www.opencascade.com/doc/occt-7.4.0/refman/html/class_o_s_d___thread_pool.html>
-+ A simple C++11 Thread Pool implementation, license: Zlib
-  <https://github.com/progschj/ThreadPool>
 
-The class name should be identical to Python's `concurrent.futures` module like `ThreadPool`  `ProcessPoolExecutor`
-https://docs.python.org/3/library/concurrent.futures.html
+## Workflow Topology
 
-### Process Pool Executor
+Currently only a linear pipeline (single-way linear) is supported; more complicated graph dataflow, as in `Apache NiFi` and `TensorFlow`, has not been implemented.
 
-Python's `ProcessPoolExecutor` implements similar API as  `ProcessPoolExecutor`, but all processes are still on the localhost (not distributive). 
-MPI distributive parallel is also on Process level, does not share the memory address.
+https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#User_Interface
 
-### Parallel computation on GPU
+The `Cpp-Taskflow` library is a fast and expressive option that is easy to drop in for single-way linear task programming.
 
-GPU parallel is under investigation, it is expected single-source, which can fall back to CPU parallel.
-GPU offloading may also be considered.
+Libraries such as `OpenMP Tasking` and `Intel TBB FlowGraph` can handle complex parallel workloads. TensorFlow has the concept of graph simplification.
 
-For some simple data type like image and text, it is expected computation can be done on GPU
-1. GPU may be achieved by third-party library implicitly, in that case, only one thread per GPU is used
-  For example, OpenMP may be offloaded to GPU, Nvidia has stdc++ lib to offload some operaton to GPU.
 
-2. write the GPU kernel source and the `GpuExecutor` will schedule the work.
+## Parallel Executor
+Currently, only the CPU device is supported, although GPU acceleration will be considered in the future. It is expected that the Processor should have a characteristic of `enum DevicePreference {CPU, GPU}`.
+
++ multi-threading (shared memory)
+  TBB `task_group` is used as the thread pool, but it can be switched to a single-header thread pool implementation. Both synchronous multi-threading (wait until all threads complete their tasks) and an asynchronous dispatcher (now the default) have been implemented.
+  ![asynchronous dispatcher can fully utilise multi-core CPU](./PPP_asyn_multiple_threading.png)
+
++ MPI (distributive, NUMA)
+  This is useful for distributive meshing, since the memory capacity of a single multi-core node may not be sufficient to mesh large assemblies.
 
-```cpp
-    enum class DataType
-    {
-        Any = 0,  /// unknown data type
-        Text,
-        Image,
-        Audio,
-        Video,
-        Geometry
-    };
-```
-SYCL is the single source OpenCL, to write new processor in C++.
++ GPU (heterogeneous)
+  This is the preferred device for simple data types, such as images in AI image processing.
 
 ## Concurrent data access
 
@@ -71,17 +50,42 @@ The main thread allocate the `std::vector<>` with enough size by `resize(capacit
  currently a template alias is used, for potential replacement for distributive  MPI parallel 
   `template <class T> using VectorType = std::vector<T>;`
 
-### Concurrent data structure with lock underneath
+### Concurrent data structure
+There are high-level concurrent data structures with lock/synchronization underneath:
+
 <https://github.com/FEniCS/dolfin/tree/master/dolfin/la>
+
 TBB has some concurrent data structure, C++17 standard c++ parallel library seems based on TBB.
 
+[HPX](https://github.com/STEllAR-GROUP/hpx) is a library that unifies the API for multi-threading and distributive parallelization.
+
 
 ## Distributive parallel design
 
-The master node do the split, feeding each worker node with a collection of shapes and meta info (json) for each shape. The master will build the topology (graph, boundbox tree), which will be shared by all nodes.
-Each worker works on the split dataset indepedently like shape checking and meshing  while it sends some meta information back to the master to build global information data structure.  
+The master node does the split (the best strategy for geometry decomposition is worth researching), feeding each worker node with a collection of shapes and meta info (json) for each shape. The master will build the global topology (graph, bounding box tree), which will be shared by all nodes. Each worker works on its split dataset independently, e.g. shape checking and meshing, while it sends some meta information back to the master to build the global information data structure.
+
+
+## GPU parallel design
+
+GPU parallelization is under investigation. GPU offloading from OpenMP may also be considered, which would reuse the current ThreadingExecutor.
+
+Device and technology selection:
++ OpenCL/SYCL
++ NVIDIA's CUDA/NVCC
++ AMD's ROCm/HIP
++ Intel's oneAPI/DPC++
+
+A single-source model such as SYCL is preferred.
+Fallback to CPU is supported by all of the above except CUDA.
+To support as many hardware platforms as possible, SYCL is the candidate.
+For multi-GPU support, CUDA is leading in this field.
+
+
+### Coding
+
+1. GPU computation may be achieved implicitly by a third-party library; in that case, only one thread per GPU is used.
+    For example, OpenMP may be offloaded to GPU, and NVIDIA has a C++ standard library that offloads some operations to GPU.
+
+2. Write the GPU kernel source, and the `GpuExecutor` will schedule the work.
 
-PPP may also write the case input files, some part recongination, boundary condition setup,  OpenMC/FEM/CFD solver are considered in this pattern. 
-Natural part shared face during assembly splitting is used for `interprocessor` boundary conditions. The master node is responsible to doe boundary updating.  PreCICE may be used for boundary update. 
 
-http://www.libgeodecomp.org/

From d05f46aa70ab9eb7d0750d1f7a30457a4c7e211c Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Thu, 17 Dec 2020 11:36:01 +0000
Subject: [PATCH 2/7] add Geom/Readme.md for impl notes for Geom module

---
 src/Geom/Readme.md | 74 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 73 insertions(+), 1 deletion(-)
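
Note: the "delayed scaling until writing" idea in this Readme reduces to one ratio computed from the input and output units. The sketch below shows that calculation together with the proposed `--output-unit` argument. It assumes a small table of unit names; `UNIT_IN_METERS`, `scale_ratio`, `needs_scaling` and `build_parser` are hypothetical helpers, not functions of the existing Python CLI.

```python
import argparse

# Length of one unit expressed in meters; the set of unit names is an assumption.
UNIT_IN_METERS = {"MM": 1e-3, "CM": 1e-2, "M": 1.0}


def scale_ratio(input_unit, output_unit):
    """Factor that converts coordinates from input_unit to output_unit."""
    try:
        return UNIT_IN_METERS[input_unit.upper()] / UNIT_IN_METERS[output_unit.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported length unit: {err.args[0]}") from None


def needs_scaling(input_unit, output_unit):
    # Skip the transform when the units match, because a transform
    # invalidates the triangulation attached to a shape.
    return input_unit.upper() != output_unit.upper()


def build_parser():
    # Hypothetical parser that only carries the proposed option.
    parser = argparse.ArgumentParser(prog="geomPipeline")
    parser.add_argument("--output-unit", choices=sorted(UNIT_IN_METERS), default=None,
                        help="length unit of the output file; default keeps the input unit")
    return parser


args = build_parser().parse_args(["--output-unit", "M"])
ratio = scale_ratio("MM", args.output_unit)
```

The ratio would then be passed as the `scale` argument of `gp_Trsf::SetScale` once, at write time, instead of transforming shapes during processing.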

diff --git a/src/Geom/Readme.md b/src/Geom/Readme.md
index 0f70019b..e9768a74 100644
--- a/src/Geom/Readme.md
+++ b/src/Geom/Readme.md
@@ -1 +1,73 @@
-## Design of Geom module
\ No newline at end of file
+# Design of Geom module
+
+The feature overview is located in the wiki folder.
+
+## CLI design
+
+A subcommand-style CLI will be enforced in the future.
+
+https://dzone.com/articles/multi-level-argparse-python
+https://planetask.medium.com/command-line-subcommands-with-pythons-argparse-4dbac80f7110
+
+FIXED:
+`working-dir` caused an error: the input file path was not translated into an absolute path before the cwd was changed to `working-dir`, so the input file could not be found. It has been fixed.
+
+TODO:
+`dataStorage` in config.json, `"workingDir": args.workingDir,` 
+a better name `data-dir` should be used instead of `working-dir`
+
+## Imprint
+
+Currently, merge is done by a single thread.
+
+### merge is delayed until writing result file
+
+https://github.com/ukaea/parallel-preprocessor/issues/23
+
+merge will invalidate `Item()`
+
+## Scaling 
+
+### CAD length units
+In the CAD world, the default length unit is "MM", while in the CAE world, the SI unit meter is the default.
+
+The input geometry unit should be saved in the GeometryData class by GeometryReader, and checked when writing the output.
+
+1. OpenCASCADE brep file has no meta data for unit
+
+2. unit or scale for SAT file format
+
+> open your sat file with a text editor, change the first parameter of line 3 to "1" from "1000". This change will change the file unit from Meter to Millimeter.
+
+3. Parasolid-XT: meters and radians by convention
+https://docs.cadexchanger.com/sdk/sdk_datamodel_units_page.html
+
+### change output unit will cause scaling
+Add a new argument `--output-unit M` to the Python CLI.
+
+### delayed scaling until writing
+calc scale ratio according to inputUnit and outputUnit
+
+Transform will invalidate the mesh triangulation attached to the shape!
+
+Transform may be done in parallel before merge, but it is not clear whether that is worth doing.
+
+
+1. OpenCASCADE STEP writer
+`Interface_Static::SetCVal("write.step.unit", "MM")`
+
+`outputUnit` in GeometryWriter config can scale, based on the `inputUnit`
+
+
+2. OpenCASCADE brep write
+
+`TopoDS_Shape` is a unitless data structure; to scale it, use `gp_Trsf`:
+
+```c++
+    gp_Trsf ts;
+    ts.SetScale(origin, scale);
+    return BRepBuilderAPI_Transform(from, ts, true);
+```
+
+## To switch between CAD kernel
+Make a new class derived from GeometryData and turn all OccUtils functions into virtual member functions of that class, but there is a performance penalty (virtual functions).
\ No newline at end of file

From 61d31167f34d628e8e1edc26fcc02bb18ac95ea7 Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Thu, 17 Dec 2020 11:38:08 +0000
Subject: [PATCH 3/7] rename variables and add ctor() for ProcessorTemplate

---
 src/PPP/ProcessorTemplate.h | 43 +++++++++++++++++++++++++------------
 wiki/BuildOnConda.md        | 20 +++++++++++++++++
 2 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/src/PPP/ProcessorTemplate.h b/src/PPP/ProcessorTemplate.h
index f0fa6929..47922ebd 100644
--- a/src/PPP/ProcessorTemplate.h
+++ b/src/PPP/ProcessorTemplate.h
@@ -4,15 +4,22 @@
 
 namespace PPP
 {
+
     /// \ingroup PPP
     /**
      * \brief template + lambda to build a new processor inplace without declare a new processor type.
      *  this class will not be wrapped for Python, only used in C++, see exmaple in ParallelAccessorTest.cpp
      * `myResultData` is the place to save result data
      *
-     * @param ProcessorType a concrete process class it inherits from
+     * @param ProcessorType a concrete process class it inherits from, in order to share data
      * @param ResultItemType  processed result of VectorType<ResultItemType>, default to bool,
      *                    will be saved to `myOutputData` a vector<ResultItemType>
+     *
+     * TODO: add more template parameter to make a constructor() without parameter, not working
+     *       lambda can capture the processor instance pointer or reference `[&p] () {}`
+     *
+     *       This template must derive from a concrete type Processor,
+     *        not from a template parameter type "ProcessorType"
      */
     template <typename ProcessorType = Processor, typename ResultItemType = bool>
     class ProcessorTemplate : public Processor
@@ -21,23 +28,31 @@ namespace PPP
         // TYPESYSTEM_HEADER();
 
     public:
-        typedef std::function<void(ProcessorType&)> FuncType;
+        /// function type for pre and post processors
+        typedef std::function<void()> PreFuncType;
+        /// function type for the per-item process, returns a ResultItemType
         typedef std::function<ResultItemType(const ItemIndexType)> ItemFuncType;
-        typedef std::function<ResultItemType(const ItemIndexType, const ItemIndexType)> CoupledItemPairFuncType;
+        /// function type for the coupled item-pair process, returns a ResultItemType
+        typedef std::function<ResultItemType(const ItemIndexType, const ItemIndexType)> ItemPairFuncType;
 
     private:
         VectorType<ResultItemType> myResultData;
         std::string myResultName;
 
         ItemFuncType myItemProcessor;
-        CoupledItemPairFuncType myCoupledItemProcessor;
-        FuncType myPreprocessor;
-        FuncType myPostprocessor;
+        ItemPairFuncType myCoupledItemProcessor;
+        PreFuncType myPreprocessor;
+        PreFuncType myPostprocessor;
 
     public:
-        ProcessorTemplate(ItemFuncType&& op, const std::string name = "myProcessResult")
+        ProcessorTemplate(ItemFuncType&& op, const std::string name = "myProcessResult",
+                          ItemPairFuncType coupledItemPairFunc = nullptr, PreFuncType preFunc = nullptr,
+                          PreFuncType postFunc = nullptr)
                 : myItemProcessor(std::move(op))
                 , myResultName(name)
+                , myCoupledItemProcessor(coupledItemPairFunc)
+                , myPreprocessor(preFunc)
+                , myPostprocessor(postFunc)
         {
         }
         ~ProcessorTemplate() = default;
@@ -49,11 +64,11 @@ namespace PPP
         }
 
 
-        void setPreprocessor(FuncType&& f)
+        void setPreprocessor(PreFuncType&& f)
         {
             myPreprocessor = std::move(f);
         }
-        void setPostprocessor(FuncType&& f)
+        void setPostprocessor(PreFuncType&& f)
         {
             myPostprocessor = std::move(f);
         }
@@ -65,7 +80,7 @@ namespace PPP
         }
 
         /// data operation is coupled, if setting this functor
-        void setCoupledItemProcessor(CoupledItemPairFuncType&& f)
+        void setCoupledItemProcessor(ItemPairFuncType&& f)
         {
             myCharacteristics["coupled"] = true;
             myCoupledItemProcessor = std::move(f);
@@ -81,13 +96,13 @@ namespace PPP
                 throw std::runtime_error("function pointer to itemProcessor must NOT be nullptr or empty");
             }
 
-            if (myPreprocessor != nullptr)
-                myPreprocessor(*this);
             /// prepare private properties like `VectorType<T>.resize(myInputData->itemCount());`
             /// therefore accessing item will not cause memory reallocation and items copying
             /// However, it is possible inputData are not set by setInputData() in non-pipeline mode
             if (myInputData)
                 myResultData.resize(myInputData->itemCount());
+            if (myPreprocessor != nullptr)
+                myPreprocessor();
         }
 
         /**
@@ -96,13 +111,13 @@ namespace PPP
         virtual void prepareOutput() override final
         {
             if (myPostprocessor != nullptr)
-                myPostprocessor(*this);
+                myPostprocessor();
             myOutputData->emplace(myResultName, std::move(myResultData));
         }
 
         /**
          * \brief process data item in parallel without affecting other data items
-         * @param index: index to get/set iteam by `item(index)/setItem(index, newDataItem)`
+         * @param index: index to get/set item by `item(index)/setItem(index, newDataItem)`
          */
         virtual void processItem(const ItemIndexType index) override final
         {
diff --git a/wiki/BuildOnConda.md b/wiki/BuildOnConda.md
index d0115d64..49ab3acd 100644
--- a/wiki/BuildOnConda.md
+++ b/wiki/BuildOnConda.md
@@ -99,6 +99,12 @@ The default conda macos conda-build using MacOS SDK 10.09, XCode 11.6, but MacOS
 > /src/PropertyContainer/PropertyContainerTest.cpp:46:22: error: 'any_cast<std::__1::shared_ptr<A> >' is unavailable: introduced in macOS 10.13
 auto data = std::any_cast<std::shared_ptr<myType>>(a);
 
+
+
+libGL is needed to build OCCT and parallel-preprocessor. 
+
+
+
 ### Upload to conda forge (yet done)
 
 Here is guide from conda-forge website:
@@ -115,9 +121,23 @@ https://github.com/qingfengxia/staged-recipes
 
 https://github.com/conda-forge/staged-recipes/pull/12901#issuecomment-709514486
 
+get attention from gitter
+https://gitter.im/conda-forge/conda-forge.github.io
 
 3. later the maintainer (pusher) will have the write control on that repo. 
 
 
 
 
+
+### Conda recipe feedback
+
+> #### **[chrisburr](https://github.com/chrisburr)**  Contributor
+>
+> I guess `occt` is being used as a shared library? If so it should use pin_compatible to make sure an ABI compatible version is used.
+>
+> > Author:
+> >
+> > Yes, occt is built by one FreeCAD developer. Like the numpy version, it must be matched. Shall I use occt == 7.4? I probably need to follow the updates of the occt version; as far as I know, occt 7.4 is the only version on conda-forge.
+>
+> tbb also has the API compatibility issue
\ No newline at end of file

From 1a3d68e4290623b43775235379fdaab1f792d4cf Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Thu, 14 Jan 2021 23:19:45 +0000
Subject: [PATCH 4/7] report BOP check error msg into json file

---
 src/Geom/GeometryShapeChecker.h | 126 ++++++++++++++++++++++----------
 src/PPP/Processor.h             |  29 +++++---
 src/python/geomPipeline.py      |   2 +-
 3 files changed, 106 insertions(+), 51 deletions(-)
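
Note: the reporting flow introduced by this patch (per-item messages, keyed by item index, written to a json file only when a `report` parameter is configured and at least one item has a message) can be sketched as below. This is an illustration under stated assumptions rather than the C++ implementation: `build_report` and `write_report` are hypothetical names, and plain strings stand in for the `std::stringstream` item reports.

```python
import json


def build_report(item_reports):
    """Collect non-empty per-item messages, keyed by the item index as a string."""
    return {str(i): msg for i, msg in enumerate(item_reports) if msg}


def write_report(item_reports, parameters):
    # No "report" parameter configured: skip reporting entirely.
    path = parameters.get("report")
    if path is None:
        return None
    report = build_report(item_reports)
    if report:  # write the file only when at least one item has a message
        with open(path, "w") as f:
            json.dump(report, f, indent=2)
    return report
```

For example, `build_report([None, ";SelfIntersect", ""])` keeps only the entry for item 1, which matches the `;`-separated status names that `BOPSingleCheck()` appends to the item's stream.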

diff --git a/src/Geom/GeometryShapeChecker.h b/src/Geom/GeometryShapeChecker.h
index e0261a4b..2f141178 100644
--- a/src/Geom/GeometryShapeChecker.h
+++ b/src/Geom/GeometryShapeChecker.h
@@ -33,22 +33,21 @@ namespace Geom
     using namespace PPP;
     /// \ingroup Geom
     /**
-     * check error and try to fix it within tolerance, otherwise report error
-     * FreeCAD Part workbench, ShapeCheck, BOP check
+     * check shape error and report error
+     * FreeCAD Part workbench, has a GUI feature for ShapeCheck with tree view
      * todo: ShapeHealing toolkit of OCCT has a ShapeAnalysis package
+     * The base `class Processor` now has report infrastructure, save msg into `myItemReports`
      */
     class GeometryShapeChecker : public GeometryProcessor
     {
         TYPESYSTEM_HEADER();
 
     protected:
+        /// configurable parameters
         bool checkingBooleanOperation = false;
-        // need a bool parameter to control:
         bool suppressBOPCheckFailed = false;
 
     private:
-        // VectorType<std::shared_ptr<std::string>> myShapeCheckResults;
-
     public:
         // using GeometryProcessor::GeometryProcessor;
 
@@ -61,29 +60,37 @@ namespace Geom
 
         virtual void processItem(const ItemIndexType i) override final
         {
+            if (itemSuppressed(i))
+                return;
             const TopoDS_Shape& aShape = item(i);
             std::string err = checkShape(aShape);
-            writeItemReport(i, err);
-            if (err.size() == 0 &&
+            auto ssp = std::make_shared<std::stringstream>();
+            *ssp << err;
+            if (err.size() > 0) // report basic check errors even when the BOP check is disabled
+                setItemReport(i, ssp);
+            // if basic checkShape has error, then BOPCheck must have fault?
+            if (                          //  err.size() == 0 &&   skip BOP check if preliminary check has failed?
                 checkingBooleanOperation) // Two common not serious errors are skipped in this BOP check
             {
-                bool hasFault = false;
+                bool hasBOPFault = false;
                 try
                 {
-                    hasFault = BOPSingleCheck(aShape); // will not report error message
+                    hasBOPFault = BOPSingleCheck(aShape, *ssp);
                 }
                 catch (const std::exception& e)
                 {
-                    hasFault = true;
+                    hasBOPFault = true;
                     LOG_F(ERROR, "BOP check has exception %s for item %lu ", e.what(), i);
                 }
                 catch (...)
                 {
-                    hasFault = true;
+                    hasBOPFault = true;
                     LOG_F(ERROR, "BOP check has exception for item %lu ", i);
                 }
 
-                if (hasFault)
+                if (ssp->str().size() > 0)
+                    setItemReport(i, ssp);
+                if (hasBOPFault)
                 {
                     auto df = generateDumpName("dump_BOPCheckFailed", {i}) + ".brep";
                     OccUtils::saveShape({item(i)}, df);
@@ -100,15 +107,17 @@ namespace Geom
             }
         }
 
-        /// report, save and display erroneous shape
+        /// `Processor::report()` saves the report of erroneous shapes,
+        /// if this processor's config has a file path in the "report" parameter
         virtual void prepareOutput() override final
         {
-            GeometryProcessor::prepareOutput(); // report()
+            GeometryProcessor::prepareOutput();
         }
 
     protected:
-        /** basic check, geometry reader may has already done this check
-         * adapted from FreeCAD project:  BOPcheck
+        /** basic check, the geometry reader may have already done some of the checks below
+         * adapted from FreeCAD project:
+         * https://github.com/FreeCAD/FreeCAD/blob/master/src/Mod/Part/Gui/TaskCheckGeometry.cpp
          * */
         std::string checkShape(const TopoDS_Shape& _cTopoShape) const
         {
@@ -131,7 +140,7 @@ namespace Geom
                             switch (val)
                             {
                             case BRepCheck_NoError:
-                                // error_msg << ";No error";  // exmty string as a sign of no error
+                                // error_msg << ";No error";  // empty string as a sign of no error
                                 break;
                             case BRepCheck_InvalidPointOnCurve:
                                 error_msg << ";Invalid point on curve";
@@ -250,34 +259,59 @@ namespace Geom
             return error_msg.str();
         }
 
+        /// see impl in BOPAlgo_ArgumentAnalyzer.cpp
+        std::array<const char*, 12> BOPAlgo_StatusNames = {
+            "CheckUnknown",            //
+            "BadType",                 // either input shape IsNull()
+            "SelfIntersect",           // self intersection, BOPAlgo_CheckerSI
+            "TooSmallEdge",            // only for BOPAlgo_SECTION
+            "NonRecoverableFace",      // TestRebuildFace()
+            "IncompatibilityOfVertex", /// TestMergeSubShapes()
+            "IncompatibilityOfEdge",
+            "IncompatibilityOfFace",
+            "OperationAborted",      // when error happens in TestSelfInterferences()
+            "GeomAbs_C0",            // continuity
+            "InvalidCurveOnSurface", //
+            "NotValid"               //
+        };
 
-        /// check single TopoDS_Shape, adapted from FreeCAD and some other online forum
-        // two common, notorious errors should be turn off curveOnSurfaceMode, continuityMode
-        // didn't use BRepAlgoAPI_Check because it calls BRepCheck_Analyzer itself and
-        // it doesn't give us access to it. so I didn't want to run BRepCheck_Analyzer twice to get invalid results.
-        // BOPAlgo_ArgumentAnalyzer can check 2 objects with respect to a boolean op.
-        // this is left for another time.
-        bool BOPSingleCheck(const TopoDS_Shape& shapeIn)
+        const char* getBOPCheckStatusName(const BOPAlgo_CheckStatus& status)
         {
-            bool runSingleThreaded = true; //
-            // bool logErrors = true;
+            size_t index = static_cast<size_t>(status);
+            assert(index < BOPAlgo_StatusNames.size()); // index is unsigned, only the upper bound needs a check
+            return BOPAlgo_StatusNames[index];
+        }
+
+        /** Use `BOPAlgo_ArgumentAnalyzer` (it can check 2 shapes w.r.t. a boolean operation) on a single shape.
+            Two common, non-serious checks are switched off: `curveOnSurfaceMode` and `continuityMode`.
+            It is not clear which one causes BOP failure: `selfInterMode` or `curveOnSurfaceMode`.
+            Note: do NOT use BRepAlgoAPI_Check, since it runs BRepCheck_Analyzer internally.
+        */
+        bool BOPSingleCheck(const TopoDS_Shape& shapeIn, std::stringstream& ss)
+        {
+            bool runSingleThreaded = true; // parallel is done on solid shape level for PPP, not on subshape level
+
             bool argumentTypeMode = true;
             bool selfInterMode = true; // self interference, for single solid should be no such error
-            bool smallEdgeMode = true;
+            bool smallEdgeMode = true; // only needed for de-feature?
+
+            bool continuityMode = false; // BOPAlgo_GeomAbs_C0, it should not cause BOP failure?
+            bool tangentMode = false;    // not implemented in OpenCASCADE
+
             bool rebuildFaceMode = true;
-            bool continuityMode = false; // BOPAlgo_GeomAbs_C0
-            bool tangentMode = true;     // not implemented in OpenCASCADE
-            bool mergeVertexMode = true;
+            bool mergeVertexMode = true; // leads to `BOPAlgo_IncompatibilityOfEdge`
             bool mergeEdgeMode = true;
             bool curveOnSurfaceMode = false; // BOPAlgo_InvalidCurveOnSurface: tolerance compatability check
 
-            // I don't why we need to make a copy, but it doesn't work without it.
-            // BRepAlgoAPI_Check also makes a copy of the shape.
+            // FreeCAD developer's note: I don't know why we need to make a copy, but it doesn't work without it.
+            /// maybe mergeVertexMode() will modify the shape being checked
             TopoDS_Shape BOPCopy = BRepBuilderAPI_Copy(shapeIn).Shape();
             BOPAlgo_ArgumentAnalyzer BOPCheck;
 
-            //   BOPCheck.StopOnFirstFaulty() = true; //this doesn't run any faster but gives us less results.
+            // BOPCheck.StopOnFirstFaulty() = true; // this doesn't run any faster but gives us fewer results.
+            // BOPCheck.SetFuzzyValue();
             BOPCheck.SetShape1(BOPCopy);
+            // BOPCheck.SetOperation();  // by default, set to BOPAlgo_UNKNOWN
             // all settings are false by default. so only turn on what we want.
             BOPCheck.ArgumentTypeMode() = argumentTypeMode;
             BOPCheck.SelfInterMode() = selfInterMode;
@@ -287,19 +321,31 @@ namespace Geom
             BOPCheck.ContinuityMode() = continuityMode;
 #endif
 #if OCC_VERSION_HEX >= 0x060900
-            BOPCheck.SetParallelMode(!runSingleThreaded); // this doesn't help for speed right now(occt 6.9.1).
-            BOPCheck.SetRunParallel(!runSingleThreaded);  // performance boost, use all available cores
+            BOPCheck.SetParallelMode(!runSingleThreaded);
+            BOPCheck.SetRunParallel(!runSingleThreaded);
+
             BOPCheck.TangentMode() = tangentMode;         // these 4 new tests add about 5% processing time.
-            BOPCheck.MergeVertexMode() = mergeVertexMode;
+            BOPCheck.MergeVertexMode() = mergeVertexMode; // will it modify the shape to check?
             BOPCheck.MergeEdgeMode() = mergeEdgeMode;
             BOPCheck.CurveOnSurfaceMode() = curveOnSurfaceMode;
 #endif
 
-            BOPCheck.Perform();
-            if (!BOPCheck.HasFaulty())
-                return false;
-            return true;
+            BOPCheck.Perform(); // this perform() has internal try-catch block
+            if (BOPCheck.HasFaulty())
+            {
+                const BOPAlgo_ListOfCheckResult& BOPResults = BOPCheck.GetCheckResult();
+                BOPAlgo_ListIteratorOfListOfCheckResult BOPResultsIt(BOPResults);
+                for (; BOPResultsIt.More(); BOPResultsIt.Next())
+                {
+                    const BOPAlgo_CheckResult& current = BOPResultsIt.Value();
+                    if (current.GetCheckStatus() != BOPAlgo_CheckUnknown)
+                        ss << ";" << getBOPCheckStatusName(current.GetCheckStatus());
+                }
+                return true; // has failure
+            }
+            return false; // no fault
         }
+
     }; // end of class
 
 } // namespace Geom
diff --git a/src/PPP/Processor.h b/src/PPP/Processor.h
index 6681a77e..1a821cd3 100644
--- a/src/PPP/Processor.h
+++ b/src/PPP/Processor.h
@@ -130,8 +130,11 @@ namespace PPP
             myItemReports.resize(myInputData->itemCount());
         };
 
-        /// default imp: do nothing
-        virtual void prepareOutput(){};
+        /// default imp: call report()
+        virtual void prepareOutput()
+        {
+            report();
+        };
 
 
         /**  use it only if parallel process is not supported, such as reader and writer
@@ -174,25 +177,26 @@ namespace PPP
 
         /**
          * report after successfully processsing all items, according to verbosity level
-         * write/append into output Information, may also reportor to operator
-         * myItemReports
+         * write/append into output Information, may also report to operator interface
          */
         virtual void report()
         {
+            if (!hasParameter("report"))
+                return;
             Information report;
-            const std::size_t NShapes = myItemReports.size();
-            size_t errorCount = 0;
-            for (std::size_t i = 0; i < NShapes; i++)
+            const std::size_t NItems = myItemReports.size();
+            size_t reportItemCount = 0;
+            for (std::size_t i = 0; i < NItems; i++)
             {
                 if (hasItemReport(i))
                 {
                     const std::string& s = itemReport(i).str();
                     // report[itemName(i)] = std::string(s);
                     report[std::to_string(i)] = std::string(s);
-                    errorCount++;
+                    reportItemCount++;
                 }
             }
-            if (errorCount)
+            if (reportItemCount)
             {
                 auto outFile = dataStoragePath(parameter<std::string>("report"));
                 std::ofstream o(outFile);
@@ -206,7 +210,12 @@ namespace PPP
         {
             myItemReports[i] = std::make_shared<std::stringstream>(std::move(msg));
         }
-        void writeItemReport(const ItemIndexType i, const std::string msg)
+        void setItemReport(const ItemIndexType i, std::shared_ptr<std::stringstream> ssp)
+        {
+            myItemReports[i] = ssp;
+        }
+        /// itemReport(i) must NOT be nullptr (it must be set before using this function)
+        void appendItemReport(const ItemIndexType i, const std::string msg)
         {
             itemReport(i) << msg;
         }
diff --git a/src/python/geomPipeline.py b/src/python/geomPipeline.py
index fbf2db41..11338a97 100644
--- a/src/python/geomPipeline.py
+++ b/src/python/geomPipeline.py
@@ -222,7 +222,7 @@ def geom_add_argument(parser):
 GeometryShapeChecker = {  # usually GeometryRead has done check after reading
     "className": "Geom::GeometryShapeChecker",
     "doc": "some config entry in processor is mappable to PPP::Processor::Attribute class",
-    "output": {  # this corresponding to Parameter class, this map to App::Parameter<> class
+    "report": {  # this corresponds to the Parameter class and maps to the App::Parameter<> class
         "type": "filename",  # renamed to `path`  type, or support both types
         "value": "shape_check_result.json",
         "doc": "save errors message as json file",

From 64844a06fcb69428af170dcf6ff8ff2f7781381f Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Fri, 15 Jan 2021 10:23:04 +0000
Subject: [PATCH 5/7] make background progressMonitor working on Windows

---
 src/PPP/Processor.h              | 11 ++++++++---
 src/python/pppMonitorProgress.py | 18 +++++++++++-------
 2 files changed, 19 insertions(+), 10 deletions(-)
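
Note (illustration only, not part of the patch): the monitor reads the most recent progress value from the log using the `\s([0-9,.]+) percent` pattern set in the diff below. The following is a minimal Python sketch of that parsing step, assuming log lines shaped like the `completed {p} percent` line written by `generate_log_continuously`. The helper name `latest_progress` is hypothetical and does not exist in `pppMonitorProgress.py`.

```python
import re

# Same pattern as the default in the patched script.
PERCENTAGE_PATTERN = r"\s([0-9,.]+) percent"


def latest_progress(lines, current=0.0):
    """Return the newest percentage found in `lines`, or `current` if none.

    Hypothetical helper: scans from the last line backwards, as the
    monitor's parse() does, and stops at the first usable match.
    """
    for line in reversed(lines):
        match = re.search(PERCENTAGE_PATTERN, line)
        if match is None:
            continue
        try:
            # group(0) is the whole match; group(1) is the captured number
            return float(match.group(1))
        except ValueError:
            # the character class also admits commas, which float() rejects
            continue
    return current


if __name__ == "__main__":
    log = ["completed 10.0 percent", "completed 20.0 percent", "other output"]
    print(latest_progress(log))
```

Unlike the script's own `parse`, this sketch skips captured text that `float()` cannot convert instead of checking the length of the match.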

diff --git a/src/PPP/Processor.h b/src/PPP/Processor.h
index 1a821cd3..a1c01f0b 100644
--- a/src/PPP/Processor.h
+++ b/src/PPP/Processor.h
@@ -505,9 +505,14 @@ namespace PPP
                 }
                 else
                 {
-                    // by appending `&`, then the command is nonblocking /detached from the main process
-                    Utilities::runCommand("python3 " + py_monitor_path.string() + " " + log_filename + " " + title +
-                                          " &");
+#ifdef WIN32
+                    // `start /B` runs the command in the background; without /B, a new console window would open
+                    std::string cmd = "start /B python " + py_monitor_path.string() + " " + log_filename + " " + title;
+#else
+                    // On POSIX systems, appending `&` makes the command non-blocking (detached from the main process)
+                    std::string cmd = "python3 " + py_monitor_path.string() + " " + log_filename + " " + title + " &";
+#endif
+                    Utilities::runCommand(cmd);
                 }
             }
             catch (...)
diff --git a/src/python/pppMonitorProgress.py b/src/python/pppMonitorProgress.py
index 54e53587..460dabc4 100644
--- a/src/python/pppMonitorProgress.py
+++ b/src/python/pppMonitorProgress.py
@@ -128,14 +128,16 @@ def monitor(self):
         self.elapsed_time_label.setText(
             "elapsed time: " + str(self.elapsed_time) + " seconds"
         )
-        self.progress_bar.setValue(self.progress)
+        self.progress_bar.setValue(int(self.progress))
 
     def parse(self, new_lines):
         p = self.progress
         for l in reversed(new_lines):
-            result = re.search(self.percentage_pattern, l)
+            result = re.search(self.percentage_pattern, l)  # return a match object, or None
             if result:
-                match_float = result.group(1)  # not a list
+                #print("matched regex string: ", result.group(0), " for the line: ", l)  # debug
+                #print(self.percentage_pattern, result)
+                match_float = result.group(1)  # group(0) is the entire match
                 if len(match_float) > 1:
                     p = float(match_float)
                     # print(l, p)
@@ -188,7 +190,7 @@ def generate_log_continuously(log_file):
             except IOError:
                 print("failed to lock file")
         p = (i + 1) * 10.0
-        logf.write(f"complated {p} percent\n")  # do not forget newline
+        logf.write(f"completed {p} percent\n")  # do not forget newline
         logf.flush()  # do not forget this, otherwise file is empty
         if using_lock:
             try:
@@ -203,7 +205,7 @@ def generate_log_continuously(log_file):
 
 if __name__ == "__main__":
     windows_title = "progress"
-    percentage_pattern = r" ([0-9,.]+) percent"
+    percentage_pattern = r"\s([0-9,.]+) percent"
     if len(sys.argv) < 2:
         print(USAGE)
         # sys.exit(1)
@@ -219,8 +221,10 @@ def generate_log_continuously(log_file):
         log_file = sys.argv[1]
         if len(sys.argv) > 2:
             windows_title = sys.argv[2]
-        if len(sys.argv) > 3:
-            percentage_pattern = sys.argv[3]
+        if len(sys.argv) > 3:  # the third arg may be just "&"
+            if "percent" in sys.argv[3]:
+                percentage_pattern = sys.argv[3]
+        print("DEBUG: using the regex pattern to parse percentage: ", percentage_pattern)
 
     app = QApplication(sys.argv)
     GUI = ProgressMonitor(log_file, windows_title, percentage_pattern)

From fc6343740a8eb7158705e92136fdb80072410f5f Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Mon, 18 Jan 2021 14:30:42 +0000
Subject: [PATCH 6/7] dump subshape with specific BOP fail error

---
 src/Geom/GeometryShapeChecker.h | 39 ++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/src/Geom/GeometryShapeChecker.h b/src/Geom/GeometryShapeChecker.h
index 2f141178..4db0c9af 100644
--- a/src/Geom/GeometryShapeChecker.h
+++ b/src/Geom/GeometryShapeChecker.h
@@ -75,22 +75,20 @@ namespace Geom
                 bool hasBOPFault = false;
                 try
                 {
-                    hasBOPFault = BOPSingleCheck(aShape, *ssp);
+                    hasBOPFault = BOPSingleCheck(i, *ssp); // TODO: why do most shapes return true even without an error message?
                 }
                 catch (const std::exception& e)
                 {
                     hasBOPFault = true;
-                    LOG_F(ERROR, "BOP check has exception %s for item %lu ", e.what(), i);
+                    LOG_F(ERROR, "BOP check has std::exception %s for item %lu ", e.what(), i);
                 }
                 catch (...)
                 {
                     hasBOPFault = true;
-                    LOG_F(ERROR, "BOP check has exception for item %lu ", i);
+                    LOG_F(ERROR, "BOP check has other exception for item %lu ", i);
                 }
 
-                if (ssp->str().size() > 0)
-                    setItemReport(i, ssp);
-                if (hasBOPFault)
+                if (hasBOPFault && ssp->str().size() > 0)
                 {
                     auto df = generateDumpName("dump_BOPCheckFailed", {i}) + ".brep";
                     OccUtils::saveShape({item(i)}, df);
@@ -101,9 +99,11 @@ namespace Geom
                     }
                     else
                     {
-                        LOG_F(ERROR, "BOP check found fault for item %lu, ignore", i);
+                        LOG_F(ERROR, "BOP check found fault for item %lu, ignore this error", i);
                     }
                 }
+                if (ssp->str().size() > 0)
+                    setItemReport(i, ssp);
             }
         }
 
@@ -118,6 +118,8 @@ namespace Geom
         /** basic check, geometry reader may has already done some of the checks below
          * adapted from FreeCAD project:
          * https://github.com/FreeCAD/FreeCAD/blob/master/src/Mod/Part/Gui/TaskCheckGeometry.cpp
+         * passing BRepCheck_Analyzer is a must for boolean operation
+         * https://dev.opencascade.org/doc/overview/html/specification__boolean_operations.html#specification__boolean_10_1
          * */
         std::string checkShape(const TopoDS_Shape& _cTopoShape) const
         {
@@ -287,8 +289,9 @@ namespace Geom
             it is not clear which one cause BoP failure: `selfInterference`, `curveOnSurfaceMode`
             Note: do NOT use BRepAlgoAPI_Check ( BRepCheck_Analyzer )
         */
-        bool BOPSingleCheck(const TopoDS_Shape& shapeIn, std::stringstream& ss)
+        bool BOPSingleCheck(const ItemIndexType i, std::stringstream& ss)
         {
+            const TopoDS_Shape& aShape = item(i);
             bool runSingleThreaded = true; // parallel is done on solid shape level for PPP, not on subshape level
 
             bool argumentTypeMode = true;
@@ -305,7 +308,7 @@ namespace Geom
 
             // FreeCAD develper's note: I don't why we need to make a copy, but it doesn't work without it.
             /// maybe, mergeVertexMode() will modify the shape to check
-            TopoDS_Shape BOPCopy = BRepBuilderAPI_Copy(shapeIn).Shape();
+            TopoDS_Shape BOPCopy = BRepBuilderAPI_Copy(aShape).Shape();
             BOPAlgo_ArgumentAnalyzer BOPCheck;
 
             // BOPCheck.StopOnFirstFaulty() = true; //this doesn't run any faster but gives us less results.
@@ -335,17 +338,33 @@ namespace Geom
             {
                 const BOPAlgo_ListOfCheckResult& BOPResults = BOPCheck.GetCheckResult();
                 BOPAlgo_ListIteratorOfListOfCheckResult BOPResultsIt(BOPResults);
-                for (; BOPResultsIt.More(); BOPResultsIt.Next())
+                for (size_t j = 0; BOPResultsIt.More(); BOPResultsIt.Next(), j++)
                 {
                     const BOPAlgo_CheckResult& current = BOPResultsIt.Value();
                     if (current.GetCheckStatus() != BOPAlgo_CheckUnknown)
+                    {
                         ss << ";" << getBOPCheckStatusName(current.GetCheckStatus());
+                        dumpSubshape(i, j, current);
+                    }
                 }
                 return true; // has failure
             }
             return false; // no fault
         }
 
+        void dumpSubshape(const ItemIndexType i, const ItemIndexType subId, const BOPAlgo_CheckResult& result) const
+        {
+            const auto& faultyShapes1 = result.GetFaultyShapes1();
+            TopTools_ListIteratorOfListOfShape faultyShapes1It(faultyShapes1);
+
+            for (size_t k = 0; faultyShapes1It.More(); faultyShapes1It.Next(), k++)
+            {
+                auto df = generateDumpName("dump_subshape_BOPCheckFailed", {i, subId, k}) + ".brep";
+                const auto& subshape = faultyShapes1It.Value();
+                OccUtils::saveShape(subshape, df);
+            }
+        }
+
     }; // end of class
 
 } // namespace Geom

From 8a1bbf06997d23a2f019ad31decb6cadd4c68570 Mon Sep 17 00:00:00 2001
From: qingfengxia <qingfeng.xia@gmail.com>
Date: Mon, 18 Jan 2021 14:33:54 +0000
Subject: [PATCH 7/7] add dumped subshape stat in post-process.py

---
 src/python/analyzeProcessedResult.py | 31 ++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)
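
Note (illustration only, not part of the patch): `BOPSingleCheck` writes each status as `";" << name`, so every report string starts with a semicolon and a plain `split(";")` yields an empty first element. The following is a minimal Python sketch of reading such a report, assuming the JSON maps an item index to one semicolon-joined string. The function name `summarize_shape_check` and the status names `StatusA` and `StatusB` are placeholders, not real output of `getBOPCheckStatusName`.

```python
import json


def summarize_shape_check(report_text):
    """Map each item index to its list of non-empty status names.

    Hypothetical helper: `report_text` is the JSON content of the shape
    check report, e.g. {"3": ";StatusA;StatusB"}.
    """
    report = json.loads(report_text)
    return {
        item_id: [name for name in message.split(";") if name]
        for item_id, message in report.items()
    }


if __name__ == "__main__":
    sample = '{"3": ";StatusA;StatusB", "7": ";StatusA"}'
    for item_id, names in summarize_shape_check(sample).items():
        print(f"Item #{item_id} has error:", names)
```

Filtering out empty strings keeps the per-item error list free of the artifact caused by the leading separator.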

diff --git a/src/python/analyzeProcessedResult.py b/src/python/analyzeProcessedResult.py
index 88882135..d9835e5c 100644
--- a/src/python/analyzeProcessedResult.py
+++ b/src/python/analyzeProcessedResult.py
@@ -8,19 +8,21 @@
 import scipy.io
 
 """
-USAGE: this_script.py  path_to_processed_result_folder
+
 """
 
+USAGE = "this_script.py  path_to_processed_result_folder"
+
 if len(sys.argv) > 1:
     case_folder = sys.argv[1]
     if not os.path.exists(case_folder):
         print("input argument: result folder does not exist!", case_folder)
         sys.exit(-1)
 else:
-    print("no input argument for the result folder, use the default")
+    print("no result folder given as input argument, using the default")
     case_folder = "../build/ppptest/test/"
     case_folder = "/home/qxia/OneDrive/UKAEA_work/iter_clite_analysis/"
-    case_folder = "/home/qxia/Documents/StepMultiphysics/parallel-preprocessor/result/mastu_processed/"
+    case_folder = "/mnt/windata/MyData/StepMultiphysics/ppp_validation_geomtry/mastu_processed/"
 
 # in that case folder, ppp will generate those 2 files by GeometryPropertyBuilder
 matched_files = glob.glob(case_folder + os.path.sep + "*metadata.json")
@@ -30,7 +32,7 @@
 
 
 matrix_filename = case_folder + os.path.sep + "myFilteredMatrix.mm"
-# "myCouplingMatrix.mm" is the final result exluding  NoCollision type
+# "myCouplingMatrix.mm" is the final result excluding  NoCollision type
 collisionInfo_filename = case_folder + os.path.sep + "myCollisionInfos.json"
 clearance_threshold = 1
 
@@ -38,6 +40,9 @@
 # fixing weak_interference has some log entry
 log_filename = case_folder + os.path.sep + "debug_info.log"
 
+shape_check_filename = case_folder + os.path.sep + "shape_check_result.json"
+dumped_subshapes = glob.glob(case_folder + os.path.sep + "dump_subshape_BOPCheckFailed*.*")
+dumped_shapes = glob.glob(case_folder + os.path.sep + "dump_BOPCheckFailed*.*")
 
 def load_data(file_name):
     with open(file_name) as json_file:
@@ -45,6 +50,18 @@ def load_data(file_name):
         return data
 
 
+def shape_check_stat(filename):
+    if not os.path.exists(filename):
+        print(filename, "does not exist!")
+        return
+    cinfo = load_data(filename)
+    N = 0
+    for i, info in cinfo.items():
+        errlist = [e for e in info.split(";") if e]  # report strings start with ";", drop the empty entry
+        print(f"Item #{i} has error: ", errlist)
+        N += 1
+    print(f" {N} shapes have failed the BOP check")
+
 def metadata_stat(file_name):
     if metadata_filename:
         nb_suppressed = 0
@@ -113,7 +130,7 @@ def print_stat(mystat, N):
         print(key, ": number", mystat[key], ", ratio", mystat[key] / total_op)
 
 
-def mm_stat(matrix_filename):
+def matrix_stat(matrix_filename):
     # collisionInfo.json is sufficient to analysis
     if os.path.exists(matrix_filename):
         mat = scipy.io.mmread(matrix_filename)
@@ -140,6 +157,8 @@ def log_stat(log_filename):
 ###############################################
 if __name__ == "__main__":
     metadata_stat(metadata_filename)
+    shape_check_stat(shape_check_filename)
     collision_stat(collisionInfo_filename)
     log_stat(log_filename)
-    mm_stat(matrix_filename)
+    matrix_stat(matrix_filename)
+