I used the GpuKMeans example to write a KMeans application over the MNIST dataset, which contains 60,000 images. For comparison, I also ran the org.apache.spark.ml.clustering API over the same dataset to measure the performance difference.
These are the results I got for 20 iterations:
CPU: 16.54 s
GPU: 41.1 s
That is a 2.48× slowdown on the GPU!
So how can I make the KMeans application run faster on the GPU?
PS: these results were obtained in spark-shell running with 40 cores and an M60 GPU.
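For reference, the CPU baseline was run with the standard spark.ml API, roughly like the sketch below; the data path, k = 10, and the "features" column name are assumptions, not my exact code:

```scala
import org.apache.spark.ml.clustering.KMeans

// Hypothetical sketch of the spark.ml baseline.
// The MNIST file path and format are placeholders.
val mnist = spark.read.format("libsvm").load("mnist.libsvm")

val kmeans = new KMeans()
  .setK(10)                     // 10 digit classes (assumed)
  .setMaxIter(20)               // same 20 iterations as the GPU run
  .setFeaturesCol("features")

val model = kmeans.fit(mnist)
```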
Best regards
aguerzaa
Are you using mini-batches during training?
GPUEnabler works best when the entire dataset fits in GPU memory and cacheGpu is used between transformations, so that data movement between the GPU and CPU is kept to a minimum.
Data movement is quite costly and can easily outweigh the computational gain from running on the GPU.
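As a rough illustration, with the RDD implicits the caching pattern looks something like the sketch below; the kernel name, PTX resource, element type, and the per-iteration logic are placeholders and not the actual GpuKMeans code:

```scala
import com.ibm.gpuenabler.CUDARDDImplicits._
import com.ibm.gpuenabler.CUDAFunction

// Hypothetical kernel wrapper; "assignCentroids" and the PTX path are placeholders.
val ptxURL = getClass.getResource("/GpuKMeans.ptx")
val assignFunc = new CUDAFunction("assignCentroids", Array("this"), Array("this"), ptxURL)

val points = sc.objectFile[Array[Double]]("hdfs:///mnist/points")
points.cacheGpu()                 // pin the dataset in GPU memory once, up front

for (iter <- 1 to 20) {
  // Each iteration reuses the GPU-resident copy instead of re-copying from host memory.
  val assigned = points.mapExtFunc((p: Array[Double]) => p, assignFunc)
  // ... gather partial sums and update centroids on the driver ...
}

points.unCacheGpu()               // release the GPU copy when training is done
```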
No, all the data is cached on the GPU first, since the whole dataset fits in the 7 GB of GPU memory, and cacheGpu is used between transformations just like in the *GpuKMeans.scala* example.
Best regards
aguerzaa