HTTPX Connection Error #195

Open
ErSauravAdhikari opened this issue May 28, 2024 · 5 comments
@ErSauravAdhikari

I have been consistently receiving an HTTPX connection error while trying to scrape a long timeline.

2024-05-27T18:29:19.901628010Z [2024-05-27 18:29:19,900: ERROR/ForkPoolWorker-9] Task apps.twa.tasks.start_custom_query_processing[847bd934-5425-4c18-919f-951cf67d00ac] raised unexpected: ConnectError('')
2024-05-27T18:29:19.901662837Z Traceback (most recent call last):
2024-05-27T18:29:19.901670714Z   File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
2024-05-27T18:29:19.901680490Z     yield
2024-05-27T18:29:19.901689082Z   File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
2024-05-27T18:29:19.901697917Z     resp = await self._pool.handle_async_request(req)
2024-05-27T18:29:19.901703327Z   File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
2024-05-27T18:29:19.901722343Z     raise exc from None
2024-05-27T18:29:19.901727898Z   File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
2024-05-27T18:29:19.901733636Z     response = await connection.handle_async_request(
2024-05-27T18:29:19.901738949Z   File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http_proxy.py", line 317, in handle_async_request
2024-05-27T18:29:19.901744617Z     stream = await stream.start_tls(**kwargs)
2024-05-27T18:29:19.901749692Z   File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http11.py", line 383, in start_tls
2024-05-27T18:29:19.901756681Z     return await self._stream.start_tls(ssl_context, server_hostname, timeout)
2024-05-27T18:29:19.901766166Z   File "/usr/local/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 68, in start_tls
2024-05-27T18:29:19.901775624Z     with map_exceptions(exc_map):
2024-05-27T18:29:19.901780842Z   File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
2024-05-27T18:29:19.901786414Z     self.gen.throw(typ, value, traceback)
2024-05-27T18:29:19.901791619Z   File "/usr/local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
2024-05-27T18:29:19.901797197Z     raise to_exc(exc) from exc
2024-05-27T18:29:19.901802405Z httpcore.ConnectError
2024-05-27T18:29:19.901807636Z 
2024-05-27T18:29:19.901812645Z The above exception was the direct cause of the following exception:
2024-05-27T18:29:19.901818152Z 
2024-05-27T18:29:19.901823044Z Traceback (most recent call last):
2024-05-27T18:29:19.901828172Z   File "/usr/local/lib/python3.10/site-packages/celery/app/trace.py", line 453, in trace_task
2024-05-27T18:29:19.901833767Z     R = retval = fun(*args, **kwargs)
2024-05-27T18:29:19.901842271Z   File "/usr/local/lib/python3.10/site-packages/celery/app/trace.py", line 736, in __protected_call__
2024-05-27T18:29:19.901852625Z     return self.run(*args, **kwargs)
2024-05-27T18:29:19.901858432Z   File "/app/apps/twa/tasks.py", line 120, in start_custom_query_processing
2024-05-27T18:29:19.901863986Z     raise e
2024-05-27T18:29:19.901870726Z   File "/app/apps/twa/tasks.py", line 115, in start_custom_query_processing
2024-05-27T18:29:19.901876443Z     loop.run_until_complete(run_scraper())
2024-05-27T18:29:19.901881694Z   File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
2024-05-27T18:29:19.901887351Z     return future.result()
2024-05-27T18:29:19.901892534Z   File "/app/apps/twa/tasks.py", line 108, in run_scraper
2024-05-27T18:29:19.901897954Z     await scraper.save_tweets_to_db()
2024-05-27T18:29:19.901903209Z   File "/app/logic/scraper/base_tweet_scraper.py", line 91, in save_tweets_to_db
2024-05-27T18:29:19.901908722Z     async for tweet in tweets_gen:
2024-05-27T18:29:19.901922674Z   File "/app/logic/scraper/base_tweet_scraper.py", line 77, in fetch_ticker_tweets
2024-05-27T18:29:19.901933195Z     async for tweet in self.api.search(query):
2024-05-27T18:29:19.901938668Z   File "/usr/local/lib/python3.10/site-packages/twscrape/api.py", line 156, in search
2024-05-27T18:29:19.901944460Z     async for rep in gen:
2024-05-27T18:29:19.901950020Z   File "/usr/local/lib/python3.10/site-packages/twscrape/api.py", line 151, in search_raw
2024-05-27T18:29:19.901955616Z     async for x in gen:
2024-05-27T18:29:19.901960841Z   File "/usr/local/lib/python3.10/site-packages/twscrape/api.py", line 117, in _gql_items
2024-05-27T18:29:19.901966346Z     rep = await client.get(f"{GQL_URL}/{op}", params=encode_params(params))
2024-05-27T18:29:19.901971886Z   File "/usr/local/lib/python3.10/site-packages/twscrape/queue_client.py", line 202, in get
2024-05-27T18:29:19.901977416Z     return await self.req("GET", url, params=params)
2024-05-27T18:29:19.901982946Z   File "/usr/local/lib/python3.10/site-packages/twscrape/queue_client.py", line 233, in req
2024-05-27T18:29:19.901988552Z     raise e
2024-05-27T18:29:19.901993529Z   File "/usr/local/lib/python3.10/site-packages/twscrape/queue_client.py", line 213, in req
2024-05-27T18:29:19.902002833Z     rep = await ctx.clt.request(method, url, params=params)
2024-05-27T18:29:19.902012607Z   File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1574, in request
2024-05-27T18:29:19.902018228Z     return await self.send(request, auth=auth, follow_redirects=follow_redirects)
2024-05-27T18:29:19.902023604Z   File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1661, in send
2024-05-27T18:29:19.902029053Z     response = await self._send_handling_auth(
2024-05-27T18:29:19.902034250Z   File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1689, in _send_handling_auth
2024-05-27T18:29:19.902039840Z     response = await self._send_handling_redirects(
2024-05-27T18:29:19.902045094Z   File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1726, in _send_handling_redirects
2024-05-27T18:29:19.902050753Z     response = await self._send_single_request(request)
2024-05-27T18:29:19.902056844Z   File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1763, in _send_single_request
2024-05-27T18:29:19.902082264Z     response = await transport.handle_async_request(request)
2024-05-27T18:29:19.902095556Z   File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 372, in handle_async_request
2024-05-27T18:29:19.902101973Z     with map_httpcore_exceptions():
2024-05-27T18:29:19.902107170Z   File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
2024-05-27T18:29:19.902112650Z     self.gen.throw(typ, value, traceback)
2024-05-27T18:29:19.902117825Z   File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
2024-05-27T18:29:19.902128738Z     raise mapped_exc(message) from exc
2024-05-27T18:29:19.902134145Z httpx.ConnectError
@ErSauravAdhikari

[Screenshot: 2024-05-28, 9:31:27 AM]

Could we make it so that, instead of crashing, the same request is retried with a different account?

@ErSauravAdhikari

What might have happened: in a pool of accounts, each account is linked to a proxy, and some proxies may temporarily stop working.

The proxy will work again later, so we can ignore that account for a while and try a different account.
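
The behaviour I'm asking for could look roughly like the sketch below. It is not twscrape's code: `AccountPool`, `bench`, `request_with_failover` and the 15-minute cooldown are all made-up names and values for illustration, and `ProxyConnectError` is a local stand-in for `httpx.ConnectError` so the snippet runs without httpx installed.

```python
import asyncio
import time


class ProxyConnectError(Exception):
    """Stand-in for httpx.ConnectError so the sketch runs without httpx."""


class AccountPool:
    """Hypothetical pool that remembers until when each account should be skipped."""

    def __init__(self, accounts, cooldown_seconds=900.0, clock=time.monotonic):
        # 0.0 means "never benched", so every account starts out available.
        self._blocked_until = {name: 0.0 for name in accounts}
        self._cooldown = cooldown_seconds
        self._clock = clock

    def available(self):
        """Return the accounts whose cooldown has expired, in insertion order."""
        now = self._clock()
        return [name for name, until in self._blocked_until.items() if until <= now]

    def bench(self, name):
        """Skip this account until the cooldown has passed."""
        self._blocked_until[name] = self._clock() + self._cooldown


async def request_with_failover(pool, send):
    """Call send(account) on each available account until one connects."""
    last_error = None
    for account in pool.available():
        try:
            return await send(account)
        except ProxyConnectError as exc:
            # The proxy behind this account is probably down: bench it, move on.
            pool.bench(account)
            last_error = exc
    raise RuntimeError("no account with a working proxy is available") from last_error
```

Here `send` would be whatever coroutine performs the request with a given account and its proxy. The request only fails once every available account has been tried, and benched accounts come back automatically after the cooldown.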

@ErSauravAdhikari

I can add a retry mechanism on the client side, but I have no way of knowing which account was being used.

@ErSauravAdhikari

If I did, I could add retry logic that changes the proxy by assigning a failover proxy to that account.
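
For reference, the client-side retry I can do today would be something like this sketch. `retrying_items`, `make_gen`, `key` and the backoff values are hypothetical names I picked, and `ConnectError` is a local stand-in for `httpx.ConnectError` so the snippet runs on its own. It restarts the whole generator after a connection error and skips items it has already yielded. It cannot choose or change the account or proxy, which is exactly the limitation described above.

```python
import asyncio


class ConnectError(Exception):
    """Stand-in for httpx.ConnectError so the sketch runs without httpx."""


async def retrying_items(make_gen, key, max_retries=3, base_delay=1.0):
    """Yield items from make_gen(), restarting it after a connection error.

    make_gen: zero-argument callable returning a fresh async generator.
    key: function returning a unique, hashable id for an item.
    """
    seen = set()
    failures = 0
    while True:
        try:
            async for item in make_gen():
                item_key = key(item)
                if item_key in seen:
                    # Already yielded before the restart, so skip it.
                    continue
                seen.add(item_key)
                yield item
            return  # The generator finished without a connection error.
        except ConnectError:
            failures += 1
            if failures > max_retries:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (failures - 1))
```

In my task this would wrap the search call from the traceback, for example `retrying_items(lambda: self.api.search(query), key=lambda t: t.id)`, assuming each tweet object exposes an `id`. Restarting re-issues the query from the beginning, so it wastes requests on long timelines, and I don't know whether the retry would land on a different account.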

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Sep 26, 2024