Maybe a better implementation of ruby binding for Apache Spark #33

Open
chyh1990 opened this issue Apr 12, 2016 · 2 comments

Comments

@chyh1990

Hi,

I have written a new prototype of a Ruby binding for Spark:

https://github.com/chyh1990/jruby-spark

Although this implementation only works on JRuby, I think this approach is more promising:

  • Real closure/lambda serialization, with elegant syntax:

https://github.com/chyh1990/jruby-spark/blob/master/examples/pagerank.rb

  • It uses the JVM infrastructure, so jobs run on YARN with the standard job submission workflow
  • It reuses the Java/Scala API, so we get Streaming/SQL/GraphX support nearly for free:

https://github.com/chyh1990/jruby-spark/blob/master/examples/sqltest.rb

  • It is easier to maintain, even without merging into mainline Spark
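
For context on the first point, here is a minimal sketch (plain Ruby, not taken from jruby-spark) of why closure serialization is the hard part of any Ruby binding: the standard Marshal module refuses to dump a Proc or lambda, so a closure cannot simply be shipped to a worker. The helper name closure_serializable? is hypothetical and exists only for this illustration.

```ruby
# Returns true if Marshal can serialize the object, false otherwise.
# Marshal.dump raises TypeError for objects it cannot dump, such as Procs.
def closure_serializable?(obj)
  Marshal.dump(obj)
  true
rescue TypeError
  false
end

add_one = ->(x) { x + 1 }

puts closure_serializable?(add_one)    # a lambda (closure)
puts closure_serializable?([1, 2, 3])  # plain data
```

Plain data round-trips through Marshal, while the lambda does not, which is the gap a binding has to close before user-defined blocks can run on remote executors.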

The prototype is preliminary, but it proves the concept. I think Ruby would be a
more elegant binding language for Spark than Python. I'm looking forward to more
participants!

@ondra-m
Owner

ondra-m commented Apr 15, 2016

Do you have an install guide?

I tried rake package but got:

jruby-spark/src/main/scala/org/apache/spark/jruby/JRubyIteratableAdaptor.scala:6: error: object jruby is not a member of package org
...

@gnilrets

It's unfortunate that @chyh1990 is not very responsive right now, but I was able to get a Spark session going after a few extra steps; see chyh1990/jruby-spark#1
