- Scrapy slots now use the proxy host name and port.
- Fixed a Python 3.9 build issue.
- mean_backoff_time stats are always returned as float, to make saving stats in databases easier.
- Fixed incorrect "proxies/good" stats values.
Proxy information is added to scrapy stats:
- proxies/unchecked
- proxies/reanimated
- proxies/dead
- proxies/good
- proxies/mean_backoff
- ROTATING_PROXY_LIST_PATH option allows passing a file name with a proxy list.
- ROTATING_PROXY_BACKOFF_CAP option allows changing the max backoff time from the default 1 hour.
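As a rough illustration, the two options above could be set in a project's settings.py along these lines. The file path and the 30-minute cap are made-up values, and the cap is assumed to be expressed in seconds (matching the 1 hour default).

```python
# settings.py (sketch; all values are illustrative)

# Path to a text file with the proxy list (hypothetical path;
# assumed to contain one proxy per line).
ROTATING_PROXY_LIST_PATH = "/my/path/proxies.txt"

# Maximum backoff time, assumed to be in seconds:
# 30 minutes instead of the default 1 hour.
ROTATING_PROXY_BACKOFF_CAP = 30 * 60
```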
- fixed proxy authentication issue.
- fixed OverflowError during backoff computation.
- redirects with empty bodies are no longer considered bans (thanks Diga Widyaprana).
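The OverflowError entry above concerns backoff computation. A minimal sketch of how a capped exponential backoff can avoid computing very large powers is shown below; it is an illustration under assumptions, not necessarily the project's actual code. The function names and the 300-second base are hypothetical.

```python
import math
import random


def capped_backoff(attempt, base=300.0, cap=3600.0):
    """Return min(cap, base * 2 ** attempt) without computing huge powers."""
    # Largest attempt for which base * 2 ** attempt is still below the cap.
    max_attempt = math.log2(cap / base)
    if attempt <= max_attempt:
        return base * 2 ** attempt
    # For larger attempts the power is never evaluated, so a very
    # large attempt count cannot overflow.
    return cap


def backoff_with_jitter(attempt, base=300.0, cap=3600.0):
    """Pick a random delay between 0 and the capped backoff ("full jitter")."""
    return random.uniform(0, capped_backoff(attempt, base, cap))
```

Evaluating `base * 2 ** attempt` directly with a float base raises OverflowError once `attempt` is large enough, which is why the comparison against `max_attempt` comes first.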
- ROTATING_PROXY_BAN_POLICY option allows customizing ban detection for all spiders.
- max_proxies_to_try request.meta key allows overriding the ROTATING_PROXY_PAGE_RETRY_TIMES option per request.
- Update default ban detection rules: scrapy.exceptions.IgnoreRequest is not a ban.
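A custom ban policy for ROTATING_PROXY_BAN_POLICY might look roughly like the sketch below. It assumes the policy is a class exposing `response_is_ban` and `exception_is_ban` methods that return True (ban), False (not a ban) or None (unknown); the class name, the status codes and the module path are hypothetical, and the fake response object only stands in for a real Scrapy response.

```python
from types import SimpleNamespace


class StatusBanPolicy:
    """Hypothetical policy: treat selected HTTP statuses as bans."""

    BAN_STATUSES = {403, 429}

    def response_is_ban(self, request, response):
        # True means "ban detected", False means "not a ban".
        return response.status in self.BAN_STATUSES

    def exception_is_ban(self, request, exception):
        # None means "unknown": this sketch makes no decision on exceptions.
        return None


# Stand-in for a response object; only the .status attribute is used here.
fake_response = SimpleNamespace(status=429)
policy = StatusBanPolicy()
is_banned = policy.response_is_ban(None, fake_response)

# settings.py (module path is hypothetical):
# ROTATING_PROXY_BAN_POLICY = "myproject.policy.StatusBanPolicy"
```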
- changed ROTATING_PROXY_PAGE_RETRY_TIMES default value - it is now 5.
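Since the max_proxies_to_try request.meta key mentioned in this changelog overrides ROTATING_PROXY_PAGE_RETRY_TIMES per request, a small helper can build the meta dict. This is only a sketch: the helper name and the sample values are hypothetical, and the dict is plain Python so it runs without Scrapy installed.

```python
def with_proxy_retry_limit(meta, max_proxies):
    """Return a copy of a request meta dict with a per-request proxy limit."""
    updated = dict(meta or {})
    # Key name taken from the changelog entry; the value is illustrative.
    updated["max_proxies_to_try"] = max_proxies
    return updated


meta = with_proxy_retry_limit({"page": 3}, max_proxies=10)
# In a spider this dict would be passed along as scrapy.Request(url, meta=meta).
```

Copying the incoming dict keeps the caller's meta untouched, which matters when the same dict is reused for several requests.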
- improved default ban detection rules;
- log ban stats.
Initial release