Authentication mechanism on the REST API of scrapyrt #68
Comments
hey @aleroot
no, nothing built in. You can do it in different ways. One way is to put scrapyrt behind some other webserver, for example nginx, and configure rate limiting and auth in nginx. The other option is to write some Python code and override scrapyrt's default resource. There is an option to create your own "resources", i.e. your own request handlers. You can do it by subclassing CrawlResource and overriding some methods, e.g. render_GET, then calling super(). Adding resources is described here: http://scrapyrt.readthedocs.io/en/latest/api.html#resources

For example you can write a resource like this:

    class AleRootCrawlResource(CrawlResource):

        def render_GET(self, request, **kwargs):
            # your code goes here e.g. fetch basic auth header etc
            ...
            return super(AleRootCrawlResource, self).render_GET(
                request, **kwargs)

I'll think about adding some more extensive examples to the docs with a basic auth header, it could be useful for others.
Hi, I know this thread is a bit old, but bear with me. I had explored this solution and created my own resource, but when I tried to add it according to the documentation, it only worked for me by pointing to a specific settings.py file on the command line like this:

and it worked in my local environment; now the main CrawlResource is the one I coded. But I tried to do the same on Heroku in the Procfile as follows:

and the scrapyrt part still uses the default resource. I do not know whether I cannot start scrapyrt with variables on Heroku or whether there is another way to override the resources safely. Git here: https://github.com/oscarcontrerasnavas/nist-webbook-scrapyrt-spider
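For reference, a minimal sketch of how this override is usually wired up; the module and class names are placeholders, and the -S flag and Procfile line are assumptions that should be checked against scrapyrt --help and your own deployment rather than taken from that repository:

    # scrapyrt_settings.py -- hypothetical module holding custom scrapyrt settings.
    # Point the crawl.json endpoint at the custom resource class instead of the default.
    RESOURCES = {
        'crawl.json': 'myproject.resources.AuthCrawlResource',
    }

    # Locally, scrapyrt would then be started with something like:
    #   scrapyrt -S scrapyrt_settings
    # and on Heroku the Procfile web entry would typically look like:
    #   web: scrapyrt -S scrapyrt_settings -i 0.0.0.0 -p $PORT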
Basically I want to prevent unauthorized clients from accessing the scrapyrt API.
I would like to secure a scrapyrt API; is there anything built in that handles an authorization mechanism?
What kind of approach do you suggest?
In addition, I would like to understand whether there is some mechanism to limit the maximum number of requests per single client.