How to stop the crawler from running? #16
Comments
Working on it. |
After setting up GOPA, it's in the Start stage and there is no way to stop it. I'm also not sure whether it's indexing anything. |
Hi Medcl, I am so excited to find GOPA, as it seems promising for the internal site search that I am trying to build. However, I can't get it to work. Can you please help? |
Are you building from source, or did you download the latest release? @Jasmi77 |
@Jasmi77 the master branch is under heavy development. I suggest you download the v0.10 release package (https://github.com/infinitbyte/gopa/releases/tag/v0.10.0) and read the README (https://github.com/infinitbyte/gopa/tree/v0.10.0). Note that this version only supports SQLite as the persistent database (for tasks). |
Hi - I'm new to ElasticSearch and have been experimenting with Gopa. I'm also having trouble understanding how to 'stop' the crawler. I've pointed it at a dev version of our site, and it seems to find just over 100 documents, but continues to generate a lot of tasks. It seems to be continuously crawling the site over and over. The site is fairly static, so what I would like to do is have Gopa crawl the site once, and then we can re-index as content is updated. Is it possible to configure Gopa to do that? Or to know when it has finished its initial crawl? |
Hi, @daveX99
|
@medcl : I'm sure you are busy, so I appreciate your quick response. I played a bit with the parameters to limit the URLs and that fixed my problem. There are some oddities in the links on the site I am indexing, and that was causing a weird recursion in gopa. Once I set the parameters under I will probably need to play with the configuration in gopa.yml some more to fine tune the indexing. Is there any documentation on how those keys/values work? Thanks again, |
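For anyone hitting the same "crawls over and over" behavior: the sketch below is not Gopa's code or configuration, just a minimal Python illustration of why limiting and de-duplicating URLs makes a crawl terminate. It assumes a hypothetical `get_links` callback and a made-up `dev.example.com` site, and it drops query strings and fragments during normalization, which is an assumption that will not suit every site.

```python
from collections import deque
from urllib.parse import urldefrag, urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Drop the fragment, query string and trailing slash so that
    variants of the same page compare equal (an assumed policy)."""
    url, _fragment = urldefrag(url)
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))


def crawl_once(start, get_links, max_depth=3, max_pages=1000):
    """Breadth-first crawl that visits each normalized URL at most once.

    get_links(url) is a hypothetical callback returning the links on a page.
    The crawl is finished when the queue is empty or max_pages is reached.
    """
    start = normalize(start)
    seen = {start}
    queue = deque([(start, 0)])
    fetched = []
    while queue and len(fetched) < max_pages:
        url, depth = queue.popleft()
        fetched.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            link = normalize(link)
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return fetched


# Toy site whose links loop back on themselves through URL variants.
SITE = {
    "http://dev.example.com/": [
        "http://dev.example.com/a",
        "http://dev.example.com/?page=1#top",
    ],
    "http://dev.example.com/a": [
        "http://dev.example.com/",
        "http://dev.example.com/a/",
    ],
}

pages = crawl_once("http://dev.example.com/", lambda url: SITE.get(url, []))
print(pages)
```

Without the `seen` set and the normalization step, the two pages above would keep re-queuing each other under slightly different URLs, which is one way a small static site can generate an endless stream of tasks.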
The lack of documentation is a known issue. Here are a few tips on the configuration:
|
@medcl : I will keep that in mind. I am still learning the basics of how all this fits together. At this point, I am able to index the site with gopa and get the data into elasticsearch. Indexing does not take more than a few minutes now. If I have further questions, I will open a new question in the issue queue so that this one is not filled with off-topic issues. Thanks again for your quick replies! |