
Why not use the m.weibo.cn API? #58

Open
Germey opened this issue Jul 7, 2017 · 6 comments

Comments

@Germey

Germey commented Jul 7, 2017

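For example: https://m.weibo.cn/api/container/getIndex?type=uid&value=2145291155&containerid=1076032145291155&page=14
This kind of endpoint does not require login, and it returns structured data, so no extra extraction is needed. The available endpoints and fields are also fairly complete.

Is there a reason you use weibo.cn rather than m.weibo.cn?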
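
For reference, a minimal sketch of how the URL above could be assembled and fetched. It rests on two assumptions not confirmed in this thread: that the containerid is the prefix `107603` followed by the uid (this is only what the single example URL shows), and that the response body is JSON. The names `build_index_url` and `fetch_page` are hypothetical helpers, not part of any project mentioned here.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://m.weibo.cn/api/container/getIndex"

# Assumption: in the example URL the containerid is "107603" + uid.
# Whether this holds for every account is not confirmed in this thread.
CONTAINER_PREFIX = "107603"


def build_index_url(uid, page):
    """Build the getIndex URL for one page of a user's posts."""
    params = {
        "type": "uid",
        "value": str(uid),
        "containerid": CONTAINER_PREFIX + str(uid),
        "page": str(page),
    }
    return BASE_URL + "?" + urlencode(params)


def fetch_page(uid, page, timeout=10):
    """Fetch one page and parse it as JSON (performs a real network request)."""
    with urllib.request.urlopen(build_index_url(uid, page), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `build_index_url(2145291155, 14)` reproduces the example URL; `fetch_page` is left uncalled here because it depends on the live service.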

@ws0zzg4569

I tried it and it does seem to return data. Are there any restrictions, such as IP-based access limits?

@yz21606948

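It works. The restrictions are probably about the same either way. I based my work on the author's project and then scraped m.weibo.cn instead: https://github.com/yz21606948/sinaSpider. I have not tested it at large scale, but I scraped for several hours without problems.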

@Germey
Author

Germey commented Aug 11, 2017

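There are still restrictions. If you send requests without logging in, you start getting 403 responses after a short time; adding Cookies resolves the 403 problem. Small-scale crawling with a single job on a single machine works fine. If you increase the number of jobs and reach roughly 10 requests per second, you start getting 414 responses, and at that point you need to switch proxies.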
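
The limits described above could be handled roughly as follows. This is only a sketch: the `Throttle` class and `action_for_status` function are hypothetical names, the action strings are placeholders, and the rate ceiling of about 10 requests per second is the figure reported in this comment rather than a documented limit.

```python
import time


class Throttle:
    """Keep the request rate below a ceiling by sleeping between calls."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now


def action_for_status(status):
    """Map an HTTP status to the remedy reported in this thread."""
    if status == 200:
        return "ok"
    if status == 403:
        return "add_cookies"   # reported fix: send Cookies with the request
    if status == 414:
        return "switch_proxy"  # reported fix: switch to another proxy
    return "retry_later"       # assumption: anything else is retried later
```

Calling `Throttle(5).wait()` before each request would keep a single job at no more than 5 requests per second, comfortably under the reported threshold.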

@ws0zzg4569

What is this containerID for?

@sixs

sixs commented Sep 13, 2017

The weibo.cn interface can only fetch 250 pages, i.e. 5,000 followers. How can this be solved?

@reconnecting

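Indeed, only 250 pages of data are available; anything beyond that is inaccessible. If there were no limit, wouldn't you be able to scrape 100 million+ followers' data from a single big V (a verified high-profile account)?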


5 participants