Jobs failing #12

Closed
MikeeRead opened this issue Sep 25, 2020 · 5 comments

Comments

@MikeeRead

Hi Dan & Rick

I'm still trying to download large chunks of MAST's PanStarrs_DR2.

I'm running two scripts: one using casjobs.py with the batch queue, and the other using mastcasjobs with the quick queue.

I seem to be getting more and more errors of this type:

casjobs:
SubmitJob failed with status: 500
System.Data.SqlClient.SqlException: Could not find prepared statement with handle -1.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

mastcasjobs
ExecuteQuickJob failed with status: 500
System.Data.SqlClient.SqlException: Could not find prepared statement with handle -1.

Bernie Shiao keeps giving things a kick and then things work for a bit (at the start stuff would work for several days, but recently it's failing pretty much every day).

Within my script I'm basically running a query, downloading the results, and looping through large tables. My "understanding" is that my python calls to *casjobs just result in HTTP requests, and that there's nothing I need to do in terms of closing out / finalizing before issuing the next iteration. The error may suggest that something is running out of resources.

Again, this isn't really an issue with the *casjobs.py packages; I'm just hoping you might have some insight into how I can get things running more stably.

thanks
Mike

@dfm
Owner

dfm commented Sep 26, 2020

Pinging @rlwastro :D

@MikeeRead
Author

In case either of you can spot anything in my batch queue script, it's at:

http://www-wfau.roe.ac.uk/www-data/mar/q.txt

Currently it's not crashing, just going really slowly: it's taking some 40 minutes to download each 300 MB output file. My network seems OK elsewhere, and I suspect the slow transfer is a separate issue anyway. In the web UI the download file is generated quickly, and a direct wget of the file confirmed that it's the transfer itself that's slow.
Cheers
Mike

@MikeeRead
Author

It stayed working for a few hours, then both scripts (long and short) started throwing the same sort of errors.

I can submit queries via the web UI. Do the python packages call different endpoints than the web UI?
Cheers
Mike

SubmitJob failed with status: 500
System.Data.SqlClient.SqlException: Could not find prepared statement with handle -1.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

@rlwastro
Collaborator

rlwastro commented Sep 28, 2020

MAST is generally having a lot of issues with multiple services (not just Casjobs), which may be contributing to the instability you are seeing. Something is hammering our databases and is slowing down all services. I'll let Bernie know about the issue (if he does not already know about it). But until the general problem is fixed, you may continue to see these issues. Sorry for the trouble!
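
In the meantime, if the 500 errors are transient (an assumption, the traceback alone doesn't confirm it), wrapping each submission in a retry with backoff may let the scripts ride out the bad patches. A minimal sketch: `retry_with_backoff` and `submit` are hypothetical names, not part of casjobs.py or mastcasjobs, and the delays are placeholders.

```python
import random
import time


def retry_with_backoff(submit, max_attempts=5, base_delay=30.0,
                       max_delay=900.0, sleep=time.sleep):
    """Call submit() until it succeeds or max_attempts is reached.

    `submit` is a hypothetical zero-argument callable wrapping one query
    submission. It is assumed to raise an exception when the server
    answers with status 500.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except Exception as exc:  # assumption: failures surface as exceptions
            if attempt == max_attempts:
                raise  # give up and let the caller see the last error
            # Exponential backoff, capped at max_delay, with jitter so the
            # batch and quick-queue scripts do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay *= random.uniform(0.5, 1.0)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
```

In the loop this would look like `result = retry_with_backoff(lambda: run_query(q))`, where `run_query` stands in for whatever call the script already makes. Long, capped delays are deliberate here: retrying quickly would only add load to databases that are already struggling.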

@MikeeRead
Author

Hi Rick
Thanks, hopefully it's that and not something I'm doing wrong. I've sent a help desk email about the slow downloads in case that is a separate issue.
Cheers
Mike
