Is waf_verify_bot case-sensitive, or does it support regular expressions? #147

Open
xyz5s opened this issue Nov 19, 2024 · 4 comments
Comments

xyz5s commented Nov 19, 2024

ngx_waf: https://hub.docker.com/layers/addsp/ngx_waf-prebuild/ngx-1.25.4-module-current-glibc
nginx version: nginx/1.25.4
conf: waf_verify_bot strict GoogleBot googlebot BingBot BaiduSpider YandexBot ;

error:
curl -I 127.0.0.1 -H 'User-Agent: BaiduSpider'
HTTP/1.1 200 OK
Server: nginx/1.25.4
Date: Tue, 19 Nov 2024 06:25:13 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 19 Nov 2024 05:54:38 GMT
Connection: keep-alive
ETag: "673c281e-267"
Accept-Ranges: bytes

curl -I 127.0.0.1 -H 'User-Agent: Baiduspider'
HTTP/1.1 403 Forbidden
Server: nginx/1.25.4
Date: Tue, 19 Nov 2024 06:25:10 GMT
Content-Type: text/html
Content-Length: 153
Connection: keep-alive

curl -I 127.0.0.1 -H 'User-Agent: googlebot'
HTTP/1.1 200 OK
Server: nginx/1.25.4
Date: Tue, 19 Nov 2024 06:25:44 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 19 Nov 2024 05:54:38 GMT
Connection: keep-alive
ETag: "673c281e-267"
Accept-Ranges: bytes

curl -I 127.0.0.1 -H 'User-Agent: Googlebot'
HTTP/1.1 403 Forbidden
Server: nginx/1.25.4
Date: Tue, 19 Nov 2024 06:25:48 GMT
Content-Type: text/html
Content-Length: 153
Connection: keep-alive

log:
2024/11/19 06:25:10 [alert] 81602#81602: *12 ngx_waf: [FAKE-BOT][Baiduspider] while logging request, client: 127.0.0.1, server: localhost, request: "HEAD / HTTP/1.1", host: "127.0.0.1"
2024/11/19 06:25:48 [alert] 81602#81602: *15 ngx_waf: [FAKE-BOT][GoogleBot] while logging request, client: 127.0.0.1, server: localhost, request: "HEAD / HTTP/1.1", host: "127.0.0.1"

xyz5s (Author) commented Nov 19, 2024

waf_verify_bot strict GoogleBot googlebot BingBot BaiduSpider YandexBot ;

curl -I 127.0.0.1 -H 'User-Agent: googlebot'
HTTP/1.1 200 OK
Server: nginx/1.25.4
Date: Tue, 19 Nov 2024 07:19:37 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 19 Nov 2024 05:54:38 GMT
Connection: keep-alive
ETag: "673c281e-267"
Accept-Ranges: bytes

ADD-SP (Owner) commented Nov 20, 2024

char* ngx_http_waf_verify_bot_conf(ngx_conf_t* cf, ngx_command_t* cmd, void* conf) {
    static char* s_google_bot_ua_regexp[] = {
        "Googlebot",
        "Google Favicon",
        "Googlebot-News",
        "Googlebot-Image",
        "Googlebot-Video",
        "Google-Read-Aloud",
        "AdsBot-Google",
        "AdsBot-Google-Mobile",
        "AdsBot-Google-Mobile-Apps",
        "APIs-Google",
        "googleweblight",
        "Storebot-Google",
        "DuplexWeb-Google",
        // "FeedFetcher-Google",
        "Mediapartners-Google",
        NULL
    };

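You would probably need to edit the source code by hand to get this behavior; the regular expressions currently hard-coded in the module are case-sensitive.

That said, this requirement sounds a bit unusual. Is there a specific use case? Search engine UAs generally do not change arbitrarily.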
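
To illustrate the case sensitivity described above, here is a minimal sketch using POSIX `regex.h`. It is not the module's actual matching code (ngx_waf's own regex handling is not shown in this thread); the helper name `ua_matches_any` and the `ignore_case` switch are hypothetical, and the pattern list is a subset of the array quoted above. It may help explain the reported results: a UA of `googlebot` does not match the pattern `Googlebot`, so it would presumably not be treated as claiming to be a Google bot at all, whereas `Googlebot` matches and then fails verification.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Subset of the hard-coded patterns quoted in the comment above. */
const char *const google_bot_ua[] = {
    "Googlebot",
    "AdsBot-Google",
    "Mediapartners-Google",
    NULL
};

/* Hypothetical helper, not part of ngx_waf.
 * Returns 1 if `ua` matches any pattern in the NULL-terminated array,
 * 0 otherwise. Matching is case-sensitive unless ignore_case is non-zero,
 * in which case REG_ICASE is added to the compile flags. */
int ua_matches_any(const char *ua, const char *const *patterns, int ignore_case) {
    assert(ua != NULL && patterns != NULL);

    int flags = REG_EXTENDED | REG_NOSUB | (ignore_case ? REG_ICASE : 0);

    for (size_t i = 0; patterns[i] != NULL; i++) {
        regex_t re;
        if (regcomp(&re, patterns[i], flags) != 0) {
            continue; /* skip a pattern that fails to compile */
        }
        int rc = regexec(&re, ua, 0, NULL, 0);
        regfree(&re);
        if (rc == 0) {
            return 1; /* the UA contains this pattern */
        }
    }
    return 0;
}
```

With these definitions, `ua_matches_any("googlebot", google_bot_ua, 0)` finds no match, while passing `1` as the last argument makes the same UA match. A hand-edited build of the module would need the equivalent case-insensitive option of whatever regex library it actually uses.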

xyz5s (Author) commented Nov 21, 2024

Isn't part of a WAF's job to block crawlers? If this doesn't count as a bug, couldn't I just spoof the UA to bypass the WAF?

ADD-SP (Owner) commented Nov 24, 2024

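> Isn't part of a WAF's job to block crawlers? If this doesn't count as a bug, couldn't I just spoof the UA to bypass the WAF?

@xyz5s Please see strict mode, which can verify a bot's identity via reverse DNS.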

https://add-sp.github.io/ngx_waf-docs/zh-cn/advance/directive.html#waf-verify-bot
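
For readers unfamiliar with reverse DNS verification, the general technique (often called forward-confirmed reverse DNS) can be sketched as follows. This is an illustration under stated assumptions, not ngx_waf's implementation: the function names `host_has_suffix` and `verify_bot_ip` are hypothetical, and the expected domain suffix (for example `.googlebot.com`) must be supplied by the caller. The idea is to look up the PTR name for the client IP, check that the name belongs to the crawler's domain, then resolve that name forward and confirm it maps back to the same IP.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* request getaddrinfo/getnameinfo declarations */
#endif
#include <ctype.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 if `host` ends with `suffix`, compared case-insensitively
 * (DNS names are case-insensitive). Pass the suffix with a leading dot so
 * that "evilgooglebot.com" does not pass for ".googlebot.com". */
int host_has_suffix(const char *host, const char *suffix) {
    size_t hl = strlen(host), sl = strlen(suffix);
    if (sl == 0 || hl < sl) return 0;
    const char *tail = host + (hl - sl);
    for (size_t i = 0; i < sl; i++) {
        if (tolower((unsigned char)tail[i]) != tolower((unsigned char)suffix[i]))
            return 0;
    }
    return 1;
}

/* Compares only the address bytes of two IPv4 or IPv6 socket addresses. */
static int same_addr(const struct sockaddr *a, const struct sockaddr *b) {
    if (a->sa_family != b->sa_family) return 0;
    if (a->sa_family == AF_INET)
        return memcmp(&((const struct sockaddr_in *)a)->sin_addr,
                      &((const struct sockaddr_in *)b)->sin_addr,
                      sizeof(struct in_addr)) == 0;
    if (a->sa_family == AF_INET6)
        return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
                      &((const struct sockaddr_in6 *)b)->sin6_addr,
                      sizeof(struct in6_addr)) == 0;
    return 0;
}

/* Hypothetical helper: returns 1 if `ip` passes forward-confirmed reverse
 * DNS for the given domain suffix, 0 on any failure. */
int verify_bot_ip(const char *ip, const char *suffix) {
    struct addrinfo hints, *addr = NULL, *fwd = NULL;
    char host[1025];
    int ok = 0;

    /* Parse the numeric IP without touching the network. */
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;
    if (getaddrinfo(ip, NULL, &hints, &addr) != 0) return 0;

    /* Step 1: reverse (PTR) lookup; NI_NAMEREQD fails if there is no name.
     * Step 2: the name must belong to the expected crawler domain. */
    if (getnameinfo(addr->ai_addr, addr->ai_addrlen, host, sizeof host,
                    NULL, 0, NI_NAMEREQD) == 0 &&
        host_has_suffix(host, suffix)) {
        /* Step 3: forward lookup must include the original IP. */
        hints.ai_flags = 0;
        hints.ai_family = addr->ai_family;
        if (getaddrinfo(host, NULL, &hints, &fwd) == 0) {
            for (struct addrinfo *p = fwd; p != NULL; p = p->ai_next) {
                if (same_addr(p->ai_addr, addr->ai_addr)) { ok = 1; break; }
            }
            freeaddrinfo(fwd);
        }
    }
    freeaddrinfo(addr);
    return ok;
}
```

Because the check depends on the client IP rather than the User-Agent string, a request from 127.0.0.1 that merely sets `User-Agent: Googlebot` fails it, which is consistent with the `[FAKE-BOT]` log entries earlier in this thread.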
