[profile:matplotlib/jbmc ERROR] hit command failed #112
Comments
What sort of filesystem is /mnt/home? Have you tried running the script from an IPython debugger session? |
I think I know what the issue is --- the |
IPython debugging session gives:
|
And what's in: |
It ends like this:
There don't seem to be any other errors previously. |
That's a file system error. You might try disabling symbolic links in
On Mon, Sep 30, 2013 at 3:26 PM, Ondřej Čertík wrote:
|
If you can point me to that, that would be awesome. What sort of filesystem error is it? |
Almost certainly: running ['hit', u'create-links', u'/tmp/hashdist-run-job-Vc2DXI/1_in0.json'] is raising an IOError. We should probably catch that exception (or provide
You can try disabling the links with the following modification:
9 builder/recipes.py
On Mon, Sep 30, 2013 at 3:29 PM, Ondřej Čertík wrote:
|
Bleh, that didn't format well, it should look like this:
|
Weird, I think I may be seeing something similar if the destination already
On Mon, Sep 30, 2013 at 3:35 PM, Aron Ahmadia wrote:
|
Ok, on my work computer I am now getting the same error:
and
So it has nothing to do with changing the |
Ok, it's now rebuilding everything again with your patch. We'll see if it fixes it. It only happens with matplotlib, not with other things... |
Ok, so the patch does not fix it:
with:
So we still need to figure out a proper patch. Until then it's a high priority issue, since I can't work with hashdist anymore until I figure out a workaround. Something must have happened in the last patches, since things have been working for me perfectly before. |
I will take a look.
A
On Wednesday, October 2, 2013, Ondřej Čertík wrote:
|
Just to clarify, this is breaking when 'create-links' gets called for either png or matplotlib? To what level can you reproduce this? What commits of hashdist and hashstick are you using? I'll try to reproduce this on my local OS X box. |
This is 100% reproducible on my machine. I know it fails for matplotlib. It seems to work for some other packages. Do you have any ideas how to debug it? I can do the debugging. |
Have you tried running:
You could even run that in gdb or IPython for a better trace. |
I think I know. This patch:
diff --git a/hashdist/core/links.py b/hashdist/core/links.py
index c7cc77b..6e751a5 100644
--- a/hashdist/core/links.py
+++ b/hashdist/core/links.py
@@ -254,19 +254,26 @@ def execute_links_dsl(rules, env={}, launcher_program=None
     logger : Logger
     """
+    print "I am here"
     actions = dry_run_links_dsl(rules, env)
     for action in actions:
         action_desc = "%s%r" % (action[0].__name__, action[1:])
         try:
+            print "1"
             if action[0] is make_launcher:
                 make_launcher(*action[1:], launcher_program=launcher_program)
             else:
                 action[0](*action[1:])
+            print "2"
             logger.debug(action_desc)
+            print "3"
         except OSError, e:
             # improve error message to include operation attempted
+            print "exception 1"
             msg = str(e) + " in " + action_desc
             logger.error(msg)
             exc_type, exc_val, exc_tb = sys.exc_info()
+            print "exception 2"
             raise OSError, OSError(e.errno, msg), exc_tb
+    print "OK"
produces:
So the problem is in the lines:
I'll keep digging. |
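For anyone reading this on Python 3: the `raise OSError, OSError(e.errno, msg), exc_tb` statement in `execute_links_dsl` is Python 2 syntax. Below is a minimal sketch of the same idea (add the attempted operation to the message, keep the errno and the traceback). `run_action` is a hypothetical helper made up for illustration, not part of hashdist.

```python
import os
import sys


def run_action(func, *args):
    """Call func(*args); on OSError, re-raise with the attempted operation
    appended to the message. Hypothetical helper, not hashdist API."""
    action_desc = "%s%r" % (func.__name__, args)
    try:
        return func(*args)
    except OSError as e:
        msg = "%s in %s" % (e.strerror or e, action_desc)
        # Keep the original errno and traceback, and chain the original error.
        raise OSError(e.errno, msg).with_traceback(sys.exc_info()[2]) from e
```

Calling `run_action(os.rmdir, "/some/missing/dir")` should then fail with a message that names `rmdir` and its argument instead of a bare errno string.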
If you |
On Thu, Oct 3, 2013 at 12:37 PM, ahmadia wrote:
|
I think I've nailed it. This patch:
diff --git a/hashdist/core/links.py b/hashdist/core/links.py
index c7cc77b..0b970ce 100644
--- a/hashdist/core/links.py
+++ b/hashdist/core/links.py
@@ -254,19 +254,29 @@ def execute_links_dsl(rules, env={}, launcher_program=None
     logger : Logger
     """
+    print "I am here"
     actions = dry_run_links_dsl(rules, env)
     for action in actions:
         action_desc = "%s%r" % (action[0].__name__, action[1:])
         try:
+            print "1"
+            print action
             if action[0] is make_launcher:
+                print "1a"
                 make_launcher(*action[1:], launcher_program=launcher_program)
             else:
+                print "1b"
                 action[0](*action[1:])
+            print "2"
             logger.debug(action_desc)
+            print "3"
         except OSError, e:
             # improve error message to include operation attempted
+            print "exception 1"
             msg = str(e) + " in " + action_desc
             logger.error(msg)
             exc_type, exc_val, exc_tb = sys.exc_info()
+            print "exception 2"
             raise OSError, OSError(e.errno, msg), exc_tb
+    print "OK"
produces
|
Because easy-install.pth already exists and hit is trying to link in?
On Thu, Oct 3, 2013 at 2:40 PM, Ondřej Čertík wrote:
|
That would be a bug :) Also, why is /tmp getting wiped out? We should definitely have a way to make sure that nothing hashdist produces is ever deleted. Perhaps we should raise separate issues now that we've identified the problems? |
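A quick way to check the "destination already exists" theory outside of hashdist is sketched below. It only shows how plain `os.symlink` behaves when the link name is already taken; the artifact paths and the `try_symlink` helper are made up for illustration, and nothing here calls `hit` or hashdist code.

```python
import errno
import os
import tempfile


def try_symlink(source, dest):
    """Return None on success, or the symbolic errno name if os.symlink fails."""
    try:
        os.symlink(source, dest)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    return None


with tempfile.TemporaryDirectory() as profile:
    dest = os.path.join(profile, "easy-install.pth")
    # Hypothetical artifact paths; a dangling link target is fine for this demo.
    first = try_symlink("/artifact-a/easy-install.pth", dest)
    # A second package tries to link a file with the same name into the profile.
    second = try_symlink("/artifact-b/easy-install.pth", dest)
```

On a POSIX system the second call is expected to fail because the name is already in use, which is the kind of failure two packages that both ship an `easy-install.pth` would trigger.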
I think the egg for
diff --git a/packages.yml.linux b/packages.yml.linux
index 5672fd0..fe859ca 100644
--- a/packages.yml.linux
+++ b/packages.yml.linux
@@ -20,6 +20,7 @@
   recipe: distutils
   url: https://pypi.python.org/packages/source/r/readline/readline-6.2.4.1.tar.
   key: tar.gz:4ahynyb57zjopukqftwfyzahbmzgehef
+  unpack_egg: true
   deps: [python, distribute]
 - package: pyzmq
Then it works!!! |
Ok. So the
After the fix:
|
Why would the readline installer try to hijack easy-install.pth or site.py? Anyway, thanks for chasing this one down @certik and sorry I wasn't more
On Thu, Oct 3, 2013 at 2:49 PM, Ondřej Čertík wrote:
|
That's done by setuptools (or distribute), so that the egg can be imported by Python automagically.
Just import hooks for this specific package. So each setuptools package has a specific hook in it --- and so it only works if you install things into an existing profile using
No, it is very intentional. But when I remove the "copy" hack, i.e. use symlinks, then it still fails with:
|
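For background on why an egg install writes `easy-install.pth` at all: at startup, Python's `site` module reads `*.pth` files in site-packages and appends the directories they list to `sys.path`. The sketch below reproduces that step by hand with `site.addsitedir` on a throwaway directory. The egg name `demo-1.0.egg` is made up, and a real `easy-install.pth` written by setuptools may contain more than a single path line.

```python
import os
import site
import sys
import tempfile

# Throwaway stand-in for a site-packages directory (left on disk after the demo).
site_dir = tempfile.mkdtemp()
egg_dir = os.path.join(site_dir, "demo-1.0.egg")  # hypothetical egg directory
os.mkdir(egg_dir)

# A .pth file lists one path per line, relative to the directory it lives in.
with open(os.path.join(site_dir, "easy-install.pth"), "w") as f:
    f.write("demo-1.0.egg\n")

# Process site_dir the way site-packages is processed at interpreter startup.
site.addsitedir(site_dir)

egg_on_path = any(
    os.path.normcase(os.path.abspath(p)) == os.path.normcase(os.path.abspath(egg_dir))
    for p in sys.path
)
```

Since every egg-installed package wants an entry in the same `easy-install.pth`, per-package copies of that file cannot simply be linked side by side into one profile.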
So for the last problem, we need to apply:
--- a/hashdist/core/links.py
+++ b/hashdist/core/links.py
@@ -219,16 +219,24 @@ def dry_run_links_dsl(rules, env={}):
     where `func` is one of `os.symlink`, :func:`silent_makedirs`,
     `shutil.copyfile`.
     """
+    print "X1"
     assert os.path.sep == '/'
+    print "X2"
     actions = []
     excluded = set()
     makedirs_cache = set()
+    print "X3"
     for rule in rules:
+        print "X4"
         if 'select' in rule:
+            print "X5a"
             _glob_actions(rule, excluded, makedirs_cache, env, actions)
         else:
+            print "X5b"
             _single_action(rule, excluded, makedirs_cache, env, actions)
+        print "X6"
+    print "X7"
     return actions
and we get
So this line fails: _glob_actions(rule, excluded, makedirs_cache, env, actions) |
Python Eggs were such a terrible idea... I assume you are doing a similar egg-install with matplotlib?
On Thu, Oct 3, 2013 at 2:55 PM, Ondřej Čertík wrote:
|
We unpack all eggs. That way one can install things like mayavi and so on. Matplotlib does not use eggs. |
I tend to only install from source. I don't understand why Mayavi would be
On Thu, Oct 3, 2013 at 3:03 PM, Ondřej Čertík wrote:
|
Mayavi and related packages use eggs. You can't install eggs with hashdist. That is unless you use the |
Ok, how do I enable exception printing in "hit"? It's a pain to debug it... Now it fails on this line:
and |
It raises:
|
We can add egg support, but I don't consider it essential right now. I'm
On Thu, Oct 3, 2013 at 3:06 PM, Ondřej Čertík wrote:
|
You want |
Yeah, I just realized:
diff --git a/builder/recipes.py b/builder/recipes.py
index 3140720..121abdf 100644
--- a/builder/recipes.py
+++ b/builder/recipes.py
@@ -26,11 +26,11 @@ def add_profile_install(ctx, pkg_attrs, build_spec):
rules += [
{"action": "relative_symlink",
- "select": "$ARTIFACT/lib/python*/site-packages/mpl_toolkits/**",
+ "select": "$ARTIFACT/lib/python*/site-packages/mpl_toolkits/**/*",
"prefix": "$ARTIFACT",
"target": "$PROFILE"},
{"action": "exclude",
- "select": "$ARTIFACT/lib/python*/site-packages/mpl_toolkits/**"},
+ "select": "$ARTIFACT/lib/python*/site-packages/mpl_toolkits/**/*"},
{"action": "relative_symlink",
"select": "$ARTIFACT/lib/python*/site-packages/*",
"prefix": "$ARTIFACT",
@dagss --- how do we enable proper exception printing? At least to a log file. This debugging is madness. |
In some places there's more printing with
In general, adding patches to do more printing is fair game. There are many places where such printing could go, and I don't know that it makes sense for me to try to anticipate them all; it's much easier if you add the printing where you need it. |
When an exception occurs, it should be written to a log file, and that log file should stay around. Currently the exception gets swallowed. |
Agreed on logging swallowed exceptions.
On Thu, Oct 3, 2013 at 3:29 PM, Ondřej Čertík wrote:
|
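One possible shape for "log the exception to a file that stays around" is sketched below with the standard `logging` module. It is only an illustration under assumptions: `run_logged`, the logger name `build-demo`, the `build.log` file name, and the simulated failure are all hypothetical, and this is not how hashdist currently wires its logging.

```python
import logging
import os
import tempfile


def run_logged(func, log_path):
    """Run func(); on failure, write the full traceback to log_path, then re-raise."""
    logger = logging.getLogger("build-demo")  # hypothetical logger name
    handler = logging.FileHandler(log_path)
    logger.addHandler(handler)
    try:
        return func()
    except Exception:
        # logger.exception records the message plus the active traceback.
        logger.exception("command failed")
        raise
    finally:
        logger.removeHandler(handler)
        handler.close()


def failing_step():
    # Stand-in for a build step that hits a file system error.
    raise IOError("simulated link failure")


# The log lives outside any directory that gets cleaned up after the build.
log_path = os.path.join(tempfile.mkdtemp(), "build.log")
try:
    run_logged(failing_step, log_path)
except IOError:
    pass

with open(log_path) as f:
    log_text = f.read()
```

The exception still propagates to the caller, but the traceback also lands in a file that outlives the temporary job directory, so there is something to read after the fact.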
I just hit this error at cloud.sagemath.com:
It used to work previously. I set the priority to high, because this bug prevents usage of hashstack.
Part of the bug is that the error message is not helpful --- there should be some obvious way to debug this.