Support for Spark Dataset added (#70)
* Spark 2.1 support
* Sample codegen on top of DS
* working sample
* v2
* Stable jcuda codegen
* Removed println
* Removed the hardcoded size for partitions
* code cleanup
* Pinned Memory
* Pinned Memory for GPUOUT
* Array support
* Array support 1
* Array Support
* support for const array
* made logical plan to produce & consume objects
* incorporated review comments (#46)
* incorporated review comments
* include GpuDSArrayMult.scala sample file
* GPU Caching Feature Added for GpuEnabler Dataset APIs (#49)
* incorporated review comments
* enable cache prelim
* push cache setting to codegen part prelims
* cached gpuPtrs handled in codegen part prelims
* uncacheGPU for dataset included prelims
* GPU dimensions support with provision for multiple stage execution prelims
* handle constants - prelims
* code package movement
* handle constants - prelims2
* add comments - prelims2
* APIs simplified
* reorder gpuoutput in the args list
* variable can be both GPUINPUT & GPUOUTPUT
* variable can be both GPUINPUT & GPUOUTPUT bug fixes
* started with testcase addition for GPU operations on Dataset
* new testcases addition
* cache bug fixes & new testcases added
* bugs related to multithreading fixed
* code comments added
* code caching added for autogenerated code
* cleaned up examples
* cleaned up examples
* Alloc host memory if the cached data is part of output
* perf issue patched
* minor mistake
* include compare perf example
* Performance & other misc patches (#50)
* guava dependency wrong version fixed
* prelim commit 1
* bug fixes
* performance bug fixes
* patch for perfDebug sample prog (#53)
* API to load data into GPU (#56)
* patch for perfDebug sample prog
* Added loadGpu API
* resolve conflicts
* Added loadGpu API
* resolve conflicts
* samples invoke loadGpu
* Performance optimization on speed and memory (#59)
* auto caching added between GPU calls
* toggle autocache gpu
* logging added
* testcase added for gpuonly cache
* optimize mem alloc
* reduce mem requirement
* handle buffer underflow
* optimize code flow
* Support for Multi Dimension as input GPU Grid dimension (#60)
* Additional GPU kernels added (#62)
* auto caching added between GPU calls
* toggle autocache gpu
* logging added
* testcase added for gpuonly cache
* optimize mem alloc
* reduce mem requirement
* optimize code flow
* sample program for logistic reg added
* kmeans partial code drop
* stabilize kmeans example
* gpu memory limit check added
* stabilize kmeans example
* GpuKMeans example read from file
* GpuKMeans example dump cluster index
* minor patches
* introduce gpuptr meta info & bug fixes
* child gpuptrs cleanup bug fixed
* GpuKMeans variants added
* optimize loadGpu
* optimize loadGpu
* optimize loadGpu
* optimize loadGpu
* modify perfDebug to remove sleep
* code cleanup
* remove duplicate testcase
* #Issue63: Support for CUDA8.0 and related jcuda libraries (#64)
* changes for jcuda 8.0
* Address performance issue in map GPU functions (#67)
* Heterogeneous Environment Support (#69)
* Heterogeneous Environment Support
* changes for Heterogeneous env and null dataset assertion
* GPUEnabler version changed to 2.0.0
* GPUEnabler version changed to 2.0.0
* Update README.md
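The features listed above can be sketched as a minimal usage example. This is a non-authoritative sketch: it assumes the API names that appear in the commit messages (`mapExtFunc`, `DSCUDAFunction`, `cacheGpu`, `unCacheGpu` from #49, `loadGpu` from #56), and the kernel name, PTX path, and session setup are hypothetical illustrations rather than confirmed details of this release.

```scala
// Sketch only: assumes a CUDA-capable node and a PTX file compiled from a
// kernel named "multiplyBy2" -- both are illustrative, not part of this commit.
import org.apache.spark.sql.SparkSession
import com.ibm.gpuenabler.CUDADSImplicits._
import com.ibm.gpuenabler.DSCUDAFunction

object GpuEnablerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("GpuEnablerSketch").master("local[*]").getOrCreate()
    import spark.implicits._

    val ptxURL = "/GpuEnablerExamples.ptx" // hypothetical resource path
    // Describe the kernel to the codegen layer: kernel name,
    // input column(s), output column(s), and the PTX location.
    val dsFunc = DSCUDAFunction(
      "multiplyBy2", Array("value"), Array("value"), ptxURL)

    val ds = spark.range(1, 1000, 1, 10).map(_ * 1L)
    ds.cacheGpu() // keep results resident on the GPU between calls (#49)
    ds.loadGpu()  // eagerly move partition data onto the GPU (#56)

    // First argument is the CPU fallback lambda; the second tells
    // GpuEnabler which GPU kernel to run instead when a GPU is present.
    val doubled = ds.mapExtFunc(_ * 2, dsFunc)
    println(doubled.reduce(_ + _))

    ds.unCacheGpu() // release the cached GPU pointers
    spark.stop()
  }
}
```

The `cacheGpu`/`loadGpu` pair reflects the auto-caching and eager-load work in #56 and #59: keeping intermediate buffers on the device avoids a host round-trip between successive GPU calls.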