Frequent-access error when crawling Xiaohongshu #447

Machoman6 · 2024-10-02T05:56:31Z

Traceback (most recent call last):
  File "D:\pythonProject.venv\MediaCrawler-main\Lib\site-packages\tenacity\_asyncio.py", line 50, in __call__
    result = await fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\client.py", line 99, in request
    raise DataFetchError(data.get("msg", None))
media_platform.xhs.exception.DataFetchError: 访问频次异常，请勿频繁操作或重启试试

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\clpq\MediaCrawler-main\main.py", line 55, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "C:\Users\zxnb\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "D:\clpq\MediaCrawler-main\main.py", line 45, in main
    await crawler.start()
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\core.py", line 78, in start
    await self.search()
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\core.py", line 138, in search
    await self.batch_get_note_comments(note_id_list)
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\core.py", line 252, in batch_get_note_comments
    await asyncio.gather(*task_list)
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\core.py", line 258, in get_comments
    await self.xhs_client.get_note_all_comments(
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\client.py", line 288, in get_note_all_comments
    comments_res = await self.get_note_comments(note_id, comments_cursor)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\client.py", line 249, in get_note_comments
    return await self.get(uri, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\clpq\MediaCrawler-main\media_platform\xhs\client.py", line 116, in get
    return await self.request(method="GET", url=f"{self.host}{final_uri}", headers=headers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\pythonProject.venv\MediaCrawler-main\Lib\site-packages\tenacity\_asyncio.py", line 88, in async_wrapped
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\pythonProject.venv\MediaCrawler-main\Lib\site-packages\tenacity\_asyncio.py", line 47, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\pythonProject.venv\MediaCrawler-main\Lib\site-packages\tenacity\__init__.py", line 326, in iter
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x24f2e98f8c0 state=finished raised DataFetchError>]

luyixiao31 · 2024-10-04T02:49:45Z

How many items had you crawled when this error appeared?

xiaou61 · 2024-10-05T13:31:31Z

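I ran into this too, while crawling Xiaohongshu comments, after roughly 2,000 of them. There is no real fix for it; all you can do is switch to a different IP.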

xukaizhao · 2024-10-09T07:53:39Z

Mine stopped working after only thirty-odd items.

97wgl · 2024-10-12T03:25:55Z

I ran a quick test, and it looks like you get restricted after about 20 search keywords.
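
For reference, the server message in the traceback translates roughly to "abnormal access frequency; please do not operate so often, or try restarting", and the stack shows the comment fetches being launched together through `asyncio.gather(*task_list)`. Nobody in this thread confirmed a fix other than switching IPs, so the following is only a minimal sketch of one general way to slow requests down: bounding concurrency with `asyncio.Semaphore` and sleeping a random interval before each call. The names `throttled_gather` and `fake_fetch_comments` are hypothetical and are not part of MediaCrawler; whether this avoids the platform's limit is an untested assumption.

```python
import asyncio
import random


async def throttled_gather(factories, max_concurrency=1, min_delay=1.0, max_delay=3.0):
    """Run zero-argument coroutine factories with bounded concurrency.

    Hypothetical helper (not MediaCrawler code): at most `max_concurrency`
    calls are in flight at once, and each call waits a random delay first.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(factory):
        async with semaphore:
            # Random pause so requests are not sent in a tight burst.
            await asyncio.sleep(random.uniform(min_delay, max_delay))
            return await factory()

    # gather returns results in the same order as the input factories.
    return await asyncio.gather(*(run_one(f) for f in factories))


async def fake_fetch_comments(note_id):
    # Hypothetical stand-in for a real network request.
    await asyncio.sleep(0)
    return f"comments for {note_id}"


def demo():
    note_ids = ["note-1", "note-2", "note-3"]
    # Bind each note_id as a default argument so every lambda keeps its own value.
    factories = [lambda nid=nid: fake_fetch_comments(nid) for nid in note_ids]
    return asyncio.run(
        throttled_gather(factories, max_concurrency=1, min_delay=0.01, max_delay=0.02)
    )


if __name__ == "__main__":
    print(demo())
```

In a real crawler the delays would presumably need to be much longer than in `demo()`, and the limits reported above (anywhere from about 30 items to about 2,000 comments) suggest the threshold varies per account or IP.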
