knowledge-collaboratory: meta_knowledge_graph/ endpoint in production does not respond (504 Gateway Time-out) #593
Comments
@CaseyTa is this something your team can take a look at? |
@vemonet Could you please check deployments to Test and Prod? I see CI is responsive for /meta_knowledge_graph, /query, and /health, but both Test and Prod are failing with 504 gateway timeouts. |
Hi @CaseyTa -- I notice that the /query endpoint in production has returned a 504 error for about 25% of the times that ARAX has called it in the past month or so -- maybe a related issue? (the above is taken from: https://arax.ncats.io/devLM/kptest/ with "production" selected) |
Hi, I recently fixed this time-out issue in dev and ITRB CI, as @CaseyTa mentioned, but I don't think it was pushed to dev and prod yet. I am making the request right now. |
Fixed in test and prod now. Thanks, all! |
Hi @CaseyTa . Just following up on this in anticipation of next week's Relay, as this issue is still present in CI (we've logged over 700 failures in the past 3 days). Let me know if you need any info from our team. Thanks! |
Hi @isbluis, it seems that https://collaboratory-api.ci.transltr.io is permanently returning 502 Bad Gateway. I am not sure why, though: the latest commit deployed on our development server (https://api.collaboratory.semanticscience.org/docs) doesn't face the same error. It would be nice if we could check for ourselves which commit is deployed on CI, and trigger deployment of the latest commit ourselves (it seems that CI does not always automatically redeploy the latest commit). Ideally, we should also be able to see the logs of the container. |
Hi @vemonet. Yeah, we have also had some issues in ARAX trying to get a view into what is happening in those CI instances. I am not in ARS, nor do I configure servers, so I am of limited help in this regard. Maybe request a bounce and see if that fixes it? (Yeah, the typical IT "solution" :) ) Thanks. |
This appears to be resolved. Closing. |
Hello again, |
Hi @CaseyTa . Just following up on this in anticipation of this week's closing of the code window, as this issue is still present in TEST (we've logged almost 3500 failures since Feb 16). |
@isbluis Thanks for the notice. ITRB were able to resolve the issue for us. |
Hi @CaseyTa . It appears that the issue is still present, as ARAX keeps being unable to retrieve a meta_knowledge_graph from TEST, CI, and even sometimes PRODUCTION instances. It seems that the issue is that it takes a very long time to get a response, so we just time out and move on without expanding. As an example, you can try this on the command-line: I don't know the source of the latency, but this appears specific to infores:knowledge-collaboratory . Hope this helps! |
@vemonet Could we add a cache to the meta_knowledgegraph endpoint? |
Hi, every time I check the metaKG it consistently takes 3 to 4 s to answer. But since it queries the Nanopub network endpoint (a public endpoint), the time might change depending on the load. Yes, @CaseyTa, you can add a cache for the meta_kg endpoint. |
@vemonet Thanks, I did test the meta_kg endpoints a few weeks ago and observed response times consistently around 30 sec, but I am seeing 3-5 sec response times right now. Will see if we can get caching added in this or next sprint cycle. |
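For reference, the caching idea discussed above could look roughly like the sketch below. It is only an illustration, not the code in the knowledge-collaboratory repo: `TTLCache`, `loader`, and the one-hour default are hypothetical names and values, and `loader` stands in for whatever function currently builds the meta knowledge graph from the Nanopub network endpoint. The sketch also serves the last good result if a refresh fails, on the assumption that a slightly stale metaKG is preferable to a time-out.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one expensive result and refresh it after ttl_seconds.

    Hypothetical helper: `loader` stands in for the function that builds
    the meta knowledge graph from the upstream endpoint.
    """

    def __init__(
        self,
        loader: Callable[[], Any],
        ttl_seconds: float = 3600.0,  # assumed value, tune as needed
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None  # None means nothing cached yet

    def get(self) -> Any:
        now = self._clock()
        # Fresh entry: answer without touching the upstream endpoint.
        if self._expires_at is not None and now < self._expires_at:
            return self._value
        try:
            value = self._loader()
        except Exception:
            if self._expires_at is None:
                raise  # nothing cached yet, so the error must surface
            return self._value  # refresh failed: serve the stale copy
        self._value = value
        self._expires_at = now + self._ttl
        return value
```

In a web app, the handler for the meta_knowledge_graph route would call `cache.get()` instead of querying upstream on every request, so only the first request after expiry pays the upstream latency.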
We will track this issue in the Knowledge Collab repo so that we can take it off the hands of the TAQA group. |
This has been an issue again in ci and dev since about May 15. Both other endpoints (prod, test) appear to be fine. |
This is currently happening in PRODUCTION as of this weekend. Perhaps related to TRAPI 1.5 vs. 1.4? |
MaastrichtU-IDS/knowledge-collaboratory#17 addresses this issue. Tested working in CI. |
Fixes deployed and working in ITRB-Test now. |
ARAX is currently unable to expand to knowledge-collaboratory in production, since the published endpoint (https://collaboratory-api.transltr.io/meta_knowledge_graph) times out and does not return a valid meta_knowledge_graph.
To repro:
wget https://collaboratory-api.transltr.io/meta_knowledge_graph
Or try via the OpenAPI page:
https://smart-api.info/ui/89054eff6ee6d91641d278d9ffdb3993#/trapi/Get_the_meta_knowledge_graph_of_the_Nanopublication_network_meta_knowledge_graph_get
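To tell a slow response apart from an outright gateway error, the repro can also be timed. The sketch below is only an illustration using the Python standard library: `time_request` is a hypothetical helper name and the 30-second timeout is an assumed value, not the timeout ARAX actually uses.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple


def time_request(
    url: str,
    timeout_s: float = 30.0,  # assumed client timeout
    opener: Callable = urllib.request.urlopen,
) -> Tuple[Optional[int], float]:
    """Return (HTTP status, elapsed seconds); status is None if no HTTP response arrived."""
    start = time.monotonic()
    status: Optional[int]
    try:
        with opener(url, timeout=timeout_s) as resp:
            resp.read()  # include the body download in the measurement
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # e.g. 502 or 504 returned by the gateway
    except OSError:
        status = None  # client-side timeout, DNS failure, refused connection
    return status, time.monotonic() - start
```

Calling `time_request("https://collaboratory-api.transltr.io/meta_knowledge_graph")` returns a status of 504 when the gateway gives up, and a status of `None` when the client-side timeout expires first.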