concatenated jobs save state #110

Open
jakebiesinger opened this issue Dec 18, 2013 · 4 comments

@jakebiesinger
Contributor

@anbangx @Elmira88 @JavierJia @Nan-Zhang

Elmira recently found a pretty major problem when running without -saveIntermediateResults: the vertex value is unchanged between iterations. I had thought that the entire dataset would be scrubbed by going through the Input/OutputFormat adapters between jobs (Node -> P4VertexValue -> Node -> RayVertexValue -> Node -> ...).

It turns out that's false. The only reason we hadn't noticed before is a bug in pregelix that makes concatenated jobs all use the generic base class VertexValue. I've submitted the bug report to @sigmod, and there's a temporary solution: we manually scrub the data on the first iteration.

In the meantime, I recommend using -saveIntermediateResults to make sure that the state gets wiped between jobs. Specifying this flag will force an HDFS write and therefore a trip through the Input/OutputFormat adapters.
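To make the difference concrete, here is a minimal sketch of the two hand-off paths. Everything in it is hypothetical: `NodeSketch`, `JobValueSketch`, and the `visited` flag are stand-ins invented for illustration, not the real Node or VertexValue classes, and no actual HDFS write is modeled. It only assumes that the adapters convert to and from a plain node that carries no job-local state.

```java
// Hypothetical stand-in for the persistent Node record (graph data only).
class NodeSketch {
    final String kmer;

    NodeSketch(String kmer) {
        this.kmer = kmer;
    }
}

// Hypothetical stand-in for a per-job vertex value: Node data plus job-local state.
class JobValueSketch {
    final NodeSketch node;
    boolean visited; // job-local state that should not leak into the next job

    JobValueSketch(NodeSketch node) {
        this.node = node;
    }

    // Assumed role of the OutputFormat adapter: keep only the Node part.
    NodeSketch toNode() {
        return new NodeSketch(node.kmer);
    }

    // Assumed role of the InputFormat adapter: start from a clean value.
    static JobValueSketch fromNode(NodeSketch node) {
        return new JobValueSketch(node);
    }
}

class HandOffDemo {
    // With -saveIntermediateResults: the value takes a trip through the adapters.
    static JobValueSketch withSave(JobValueSketch previous) {
        return JobValueSketch.fromNode(previous.toNode());
    }

    // Without the flag: the next job sees the previous job's value as-is.
    static JobValueSketch withoutSave(JobValueSketch previous) {
        return previous;
    }
}
```

In this sketch, a value whose `visited` flag was set by the previous job comes out of `withSave` with the flag cleared, while `withoutSave` hands the stale flag straight to the next job.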

@jakebiesinger
Contributor Author

This affects #101. When the REVERSE job is run, the seed vertex is considered already "visited" since it was touched in the FORWARD job.

@JavierJia
Collaborator

I didn't quite get the problem. Having concatenated jobs save state seems to make sense, since they are connected together.
Maybe we could add an explicit clear-state iteration before the FORWARD job?

@jakebiesinger
Contributor Author

> I didn't quite get the problem. Having concatenated jobs save state seems to make sense, since they are connected together.

Ah, but the values are different classes. How can we expect to call RayVertexValue.readFields() on a stream that was written by P4PathMergeVertexValue.write()? It's kind of amazing that this has worked at all so far... We were lucky (or unlucky, depending on how you look at it) to have all these Value classes inherit from the same base class, and because of a pregelix bug, that base class was being used (but only in the first iteration? not sure).
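A rough sketch of why the mismatch is dangerous, and why reading through the shared base class happens to work. The classes below are hypothetical stand-ins with invented fields (`state`, `mergeCount`, `visited`). They use plain `java.io.DataInput`/`DataOutput` in the same `write`/`readFields` style, but they are not the real Value classes and do not reflect their actual layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical shared base class: its field is always serialized first.
class BaseValueSketch {
    int state;

    void write(DataOutput out) throws IOException {
        out.writeInt(state);
    }

    void readFields(DataInput in) throws IOException {
        state = in.readInt();
    }
}

// Hypothetical first-job value with an invented extra field.
class P4ValueSketch extends BaseValueSketch {
    long mergeCount;

    @Override
    void write(DataOutput out) throws IOException {
        super.write(out);
        out.writeLong(mergeCount);
    }

    @Override
    void readFields(DataInput in) throws IOException {
        super.readFields(in);
        mergeCount = in.readLong();
    }
}

// Hypothetical second-job value with a different invented extra field.
class RayValueSketch extends BaseValueSketch {
    boolean visited;

    @Override
    void write(DataOutput out) throws IOException {
        super.write(out);
        out.writeBoolean(visited);
    }

    @Override
    void readFields(DataInput in) throws IOException {
        super.readFields(in);
        visited = in.readBoolean();
    }
}

class WritableMismatchDemo {
    // Serialize a P4-style value to bytes, as the first job would.
    static byte[] serializeP4(int state, long mergeCount) {
        P4ValueSketch value = new P4ValueSketch();
        value.state = state;
        value.mergeCount = mergeCount;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            value.write(new DataOutputStream(buffer));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Fill any value class from the bytes, whether or not it matches the writer.
    static <T extends BaseValueSketch> T readInto(T target, byte[] bytes) {
        try {
            target.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

Reading the P4-style bytes into a `BaseValueSketch` recovers `state` and silently ignores the rest. Reading them into a `RayValueSketch` also recovers `state`, but `visited` is then taken from the first byte of `mergeCount`, so its value depends on whatever the previous job happened to write there.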

> Maybe we could add an explicit clear-state iteration before the FORWARD job?

Yes, I think on the first iteration, we'll have to add a "soft-reset" which will just clear the state.
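Roughly what I have in mind for the soft-reset, as a sketch only. `TraversalValueSketch`, `softReset()`, and `FIRST_SUPERSTEP` are hypothetical names, the `visited` flag stands in for whatever per-job state we keep, and the assumption that the first superstep is numbered 1 should be checked against pregelix before relying on it.

```java
// Hypothetical vertex value carrying traversal state left over from a prior job.
class TraversalValueSketch {
    boolean visited;

    // Clear only the job-local state; graph data (not modeled here) is untouched.
    void softReset() {
        visited = false;
    }
}

class SoftResetDemo {
    // Assumption: supersteps are numbered starting at 1.
    static final long FIRST_SUPERSTEP = 1;

    // Returns true if the seed vertex starts a traversal in this superstep.
    static boolean compute(TraversalValueSketch value, long superstep, boolean isSeed) {
        if (superstep == FIRST_SUPERSTEP) {
            // Drop whatever the previous concatenated job left behind.
            value.softReset();
        }
        if (isSeed && !value.visited) {
            value.visited = true;
            return true;
        }
        return false;
    }
}
```

With the reset in place, a seed vertex that still has `visited` set from the FORWARD job would start the REVERSE traversal normally on the first superstep instead of being skipped.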

The pregelix bug was just fixed, so our jobs may not work now. I'll investigate today.

@JavierJia
Collaborator

Got it.
