Can't fetch anymore collections (Opensea cloudflare update) #31

Closed
SKreutz opened this issue Jan 23, 2022 · 14 comments

Comments

@SKreutz

SKreutz commented Jan 23, 2022

Hey everybody,

I noticed that OpenSea now runs its browser check more often when I open the site.
I then noticed that the script can't fetch any collections anymore.

✅ === OpenseaScraper.rankings() ===
=== OpenseaScraper.rankings() ===
...fetching 1 pages (= top 100 collections)
...opening url: https://opensea.io/rankings?sortBy=one_day_volume
...🚧 waiting for cloudflare to resolve
...exposing helper functions through script tag
...scrolling to bottom and fetching collections.
...🥳 DONE. Total Collections fetched: 0
scraped 0 collections:

Is this the same for you guys? Does anyone have an idea on how to fix this?

Thanks

@khalilsiu

Same here. Trying to find a solution.

@dcts
Owner

dcts commented Jan 24, 2022

I can confirm the issue. Cloudflare is taking a very long time to resolve. I tested and could observe the following behavior:

  • it eventually resolves, but it can take several minutes.
  • the Cloudflare page reloads multiple times. Increasing the overall timeout for all methods works, but I'm not sure how useful the scraping is when it takes ~2 mins to resolve.

Deploying a quickfix soon that works with extended execution time of up to 2 mins (not optimal though). If anyone has ideas how to resolve cloudflare please share.
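
For reference, the global timeout extension described above could look roughly like the sketch below. It assumes a Puppeteer `page` object already exists; the helper name `extendTimeouts` and the constant are illustrative and not taken from the scraper's code. `setDefaultTimeout` and `setDefaultNavigationTimeout` are standard Puppeteer page methods.

```javascript
// Sketch only: raise Puppeteer's default timeouts so a slow Cloudflare
// check does not abort the scrape. `extendTimeouts` is a hypothetical helper.
const CLOUDFLARE_TIMEOUT_MS = 2 * 60 * 1000; // the ~2 minutes mentioned above

function extendTimeouts(page, timeoutMs = CLOUDFLARE_TIMEOUT_MS) {
  // Default for waits such as waitForSelector and waitForFunction.
  page.setDefaultTimeout(timeoutMs);
  // Default for navigation calls such as goto, reload, and waitForNavigation.
  page.setDefaultNavigationTimeout(timeoutMs);
  return timeoutMs;
}

// Usage (inside an async function, after creating the page):
// const page = await browser.newPage();
// extendTimeouts(page);
```

Raising the defaults in one place avoids having to pass a `timeout` option to every individual wait.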

I found this repo, will test and report soon:
https://github.com/JimmyLaurent/cloudflare-scraper

@dcts
Owner

dcts commented Jan 24, 2022

OK, the cloudflare-scraper repo I tested does not work. It broke when Cloudflare started reloading the page (which we also experience):
JimmyLaurent/cloudflare-scraper#39 (comment)

Unfortunately, waiting for 2 mins also does not work consistently. I can't figure out why for now. Not sure what else to do.

I have to say, sometimes Cloudflare gets stricter temporarily, so chances are OpenSea is experiencing a ton of traffic and that's why Cloudflare is stricter, but we'll have to see if this issue persists long term.

Sorry, there's no fix for now :( If anyone has ideas, please share!

@khalilsiu

Would disabling the timeout on waitForSelector work?
The error I'm getting is:
waiting for selector .cf-browser-verification to be hidden failed: timeout 30000ms exceeded
I guess they extended the check by a variable amount of time.
Also, perhaps we should wait for the #__next element to show?

@dcts
Owner

dcts commented Jan 24, 2022

waitForSelector has a default timeout of 30000 ms (30 secs). I tried extending it to 2 mins, but for me it still did not work. I also tried waiting for the OpenSea page to appear, but with no luck. You can experiment yourself:

// REPLACE
await page.waitForSelector('.cf-browser-verification', {hidden: true});
// WITH
await page.waitForSelector('#__next', {timeout: 120000});

hidden: true means puppeteer will wait for the selector to disappear. This was the logic before: wait until cloudflare class disappears, but we could also just wait for the opensea selector to appear. But as said above, for me it did not work consistently.
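
One way to experiment with the "wait for the OpenSea selector" idea is to retry the wait a few times instead of relying on a single long timeout. The sketch below is an assumption-laden illustration, not the scraper's actual logic: the helper name `waitForOpenSea`, the retry count, and the per-attempt timeout are made up for this example, and only the `#__next` selector comes from the thread. It also assumes Puppeteer's timeout errors carry the name `TimeoutError`.

```javascript
// Sketch only: wait for the OpenSea app root ('#__next') with a few retries.
// `waitForOpenSea`, `attempts`, and `timeoutMs` are hypothetical names/values.
async function waitForOpenSea(page, { attempts = 3, timeoutMs = 40000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Resolves once an element matching '#__next' is present in the DOM.
      await page.waitForSelector("#__next", { timeout: timeoutMs });
      return true;
    } catch (err) {
      // Assumption: Puppeteer timeout errors have name 'TimeoutError'.
      // Anything else is unexpected and is rethrown.
      if (err.name !== "TimeoutError") throw err;
      console.log(`...attempt ${attempt}/${attempts} timed out`);
    }
  }
  return false; // still stuck on the Cloudflare page after all attempts
}

// Usage (inside an async function):
// const ready = await waitForOpenSea(page);
// if (!ready) console.log("...cloudflare did not resolve, giving up");
```

Passing `timeout: 0` to `waitForSelector` disables the timeout entirely, but then the script hangs indefinitely if the Cloudflare check never resolves, so a bounded retry is easier to debug.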

@SKreutz
Author

SKreutz commented Jan 24, 2022

Thank you for your quick responses. It seems that it works again without changing anything. I think you were right in assuming that Cloudflare changed something because OpenSea had a lot of traffic. I'll keep this issue open for a few more days and report back if something changes again.

@dcts
Owner

dcts commented Jan 24, 2022

Ah great, thanks for the report. It's good to know that this issue happens. I know there is a way to bypass Cloudflare, but it's not that simple to do. They also change things a lot, so it's always a cat-and-mouse game to keep up with the changes.

It's good that it works again, and please do report back after you test over the next few days!

@khalilsiu

khalilsiu commented Jan 26, 2022

The problem is still there when I use a cloud service to run Puppeteer.
I see that we are already using the stealth plugin :(
Cloudflare keeps waiting, seemingly forever.

@SKreutz
Author

SKreutz commented Jan 27, 2022

Did you try using a proxy? If you ran this on a cloud service it's probably just blacklisted by opensea.
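
As a rough illustration of the proxy suggestion, Puppeteer can route traffic through a proxy by passing Chromium's `--proxy-server` flag at launch. The sketch below only builds the launch options; the helper name `buildLaunchOptions` and the proxy address are placeholders, and how these options would be handed to the scraper is not shown in this thread.

```javascript
// Sketch only: build Puppeteer launch options that route traffic through a proxy.
// `buildLaunchOptions` is a hypothetical helper; the address below is a placeholder.
function buildLaunchOptions(proxyServer) {
  const options = { headless: true, args: [] };
  if (proxyServer) {
    // Chromium command-line flag that sends browser traffic through the proxy.
    options.args.push(`--proxy-server=${proxyServer}`);
  }
  return options;
}

// Usage (requires the `puppeteer` package, inside an async function):
// const puppeteer = require("puppeteer");
// const browser = await puppeteer.launch(buildLaunchOptions("http://127.0.0.1:8080"));
// const page = await browser.newPage();
// // Only needed if the proxy requires credentials:
// await page.authenticate({ username: "user", password: "pass" });
```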

I didn't run into any more problems on Mac. If I try to run the script on Linux (a computer on the same network), it stops right here without any errors, as if it's frozen.

✅ === OpenseaScraper.rankings() ===
=== OpenseaScraper.rankings() ===
...fetching 1 pages (= top 100 collections)
...opening url: https://opensea.io/rankings?sortBy=one_day_volume
...🚧 waiting for cloudflare to resolve
...exposing helper functions through script tag
...scrolling to bottom and fetching collections.

I use the same code 1:1 on Mac, and it worked on Linux a week ago. I'm clueless.

@dcts
Owner

dcts commented Jan 28, 2022

@khalilsiu

Which cloud service did you use? I am using Firebase Functions (which is basically Google Cloud) and it works for me. If you want, I can share my setup.

@dcts
Owner

dcts commented Jan 28, 2022

@SKreutz can you open another issue for the Linux bug?

@khalilsiu

I'm using Compute Engine for that; perhaps I should try Firebase in that case.
I haven't tried using a proxy yet.
It would be great if you could share your setup with me, @dcts.

@khalilsiu

Exactly. I took screenshots during the freeze, and it turns out the browser is redirected back to the waiting page every time, so it is stuck in that waiting room forever.
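
For anyone who wants to reproduce this kind of debugging, a minimal sketch of taking periodic screenshots while the script appears frozen is shown below. The helper name `captureWhileWaiting`, the file prefix, and the interval are illustrative assumptions; `page.screenshot` and `page.url` are standard Puppeteer page methods.

```javascript
// Sketch only: take a screenshot every few seconds while waiting, and log
// the current URL, to see whether the browser is stuck on the Cloudflare page.
// `captureWhileWaiting` and its option names are hypothetical.
async function captureWhileWaiting(
  page,
  { count = 5, intervalMs = 5000, prefix = "cloudflare-debug" } = {}
) {
  const paths = [];
  for (let i = 0; i < count; i++) {
    const path = `${prefix}-${i}.png`;
    await page.screenshot({ path }); // writes the PNG to the working directory
    console.log(`...saved ${path} (current url: ${page.url()})`);
    paths.push(path);
    if (i < count - 1) {
      // Pause between screenshots, but not after the last one.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return paths;
}

// Usage (inside an async function, right after page.goto):
// await captureWhileWaiting(page, { count: 10, intervalMs: 3000 });
```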

@SKreutz
Author

SKreutz commented Jan 28, 2022

I'm closing this issue since the Cloudflare problem is solved, and I'm opening a new one.

@SKreutz SKreutz closed this as completed Jan 28, 2022