[Nodriver] CDP DownloadWillBegin event not working #2060
Comments
You need handlers for both cdp.page. |
Thanks, this worked:

import asyncio
import nodriver as uc
from nodriver import cdp

binded_tabs = []

async def bind_handlers(browser):
    global binded_tabs
    # Poll for tabs that do not have a download handler yet.
    while True:
        await asyncio.sleep(0.01)
        for tab in browser.tabs:
            if tab not in binded_tabs:
                tab.add_handler(cdp.page.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))
                binded_tabs.append(tab)

async def crawl():
    browser = await uc.start(headless=False)
    asyncio.create_task(bind_handlers(browser))
    await browser.get("https://www.python.org/ftp/python/3.13.0/python-3.13.0-amd64.exe")
    await browser.get("https://code.visualstudio.com/sha/download?build=stable&os=win32-x64-user", new_tab=True)
    while True:
        await asyncio.sleep(0.2)  # Keep the event loop alive

if __name__ == '__main__':
    uc.loop().run_until_complete(crawl())

However, sometimes clicking a download button opens an entirely new tab, and the download starts from there. In that case the download is not detected. I want to add handlers to every tab, current or future. How can I do this? Is there a CDP event for this as well? I checked cdp.browser and it has a DownloadWillBegin event class, but when I use cdp.browser.DownloadWillBegin in the code above, the handler (here a basic lambda that logs the event) is never called. My aim is to detect downloads at the browser level, across every tab that is open now or opened later.
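The polling loop in the code above can be factored into a small helper that binds each tab exactly once. This is only a sketch: `bind_new_tabs` and `watch_tabs` are hypothetical names, and the only thing assumed about a tab is that it has the `add_handler(event_type, handler)` method already used in this thread. Tabs are tracked by identity in a dict, which also keeps a reference to each tab so an `id()` value is never reused.

```python
import asyncio


def bind_new_tabs(tabs, bound, event_type, handler):
    """Attach `handler` to every tab not bound yet and return those tabs.

    `bound` maps id(tab) -> tab. Holding the tab object keeps it alive,
    so its id cannot be recycled for a different tab.
    """
    newly_bound = []
    for tab in tabs:
        if id(tab) in bound:
            continue
        tab.add_handler(event_type, handler)
        bound[id(tab)] = tab
        newly_bound.append(tab)
    return newly_bound


async def watch_tabs(browser, event_type, handler, interval=0.05):
    """Poll browser.tabs forever and bind the handler to tabs as they appear."""
    bound = {}
    while True:
        bind_new_tabs(browser.tabs, bound, event_type, handler)
        await asyncio.sleep(interval)
```

Assuming the rest of the script stays as posted, `asyncio.create_task(watch_tabs(browser, cdp.page.DownloadWillBegin, handler))` would take the place of `bind_handlers`. Polling can still miss a tab that opens and closes between two polls.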
Yes, there are two "domains" where these download events can be set. As far as I know they are redundant, but you could try setting it on the browser (for nodriver, you then set it with browser.connection.add_handler).
I haven't tried the method you suggested (browser.connection.add_handler with cdp.browser.DownloadWillBegin), but tab.add_handler with cdp.page.DownloadWillBegin seems to work as I expected, as confirmed by the test below:

import asyncio
import nodriver as uc
from nodriver import cdp

binded_tabs = []

async def bind_handlers(browser):
    global binded_tabs
    # Poll for tabs that do not have a download handler yet.
    while True:
        await asyncio.sleep(0.01)
        for tab in browser.tabs:
            if tab not in binded_tabs:
                tab.add_handler(cdp.page.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))
                binded_tabs.append(tab)

async def crawl():
    browser = await uc.start(headless=False)
    asyncio.create_task(bind_handlers(browser))
    tab1 = await browser.get("https://code.visualstudio.com/sha/download?build=stable&os=win32-x64-user")
    tab2 = await browser.get("https://journals.lww.com/anesthesia-analgesia/fulltext/2024/05000/special_communication__response_to__ensuring_a.2.aspx", new_tab=True)
    await asyncio.sleep(5)  # Wait for the page without blocking the event loop
    pdf_button = await tab2.find("//button[contains(., 'PDF')]")
    await pdf_button.click()
    while True:
        await asyncio.sleep(0.2)  # Keep the event loop alive

if __name__ == '__main__':
    uc.loop().run_until_complete(crawl())
@abhash-rai I tried this and it didn't work. Oddly, when I start the browser with uc.start, change the download path with set_download_path and cdp.browser.set_download_behavior, and then click the download elements by hand, the file is saved to the path I set. But when I do the same through automation, I get the modal asking where to save. Any idea, @ultrafunkamsterdam?
Although tab.add_handler(cdp.page.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid)) worked in most cases, it fails to detect download start events in cases where clicking an anchor element opens a new tab that starts the download and then quickly closes. @ultrafunkamsterdam I tried to follow your suggestion of setting it on the browser (for nodriver, via browser.connection.add_handler). I tried these:

browser.connection.add_handler(cdp.page.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))
browser.connection.add_handler(cdp.browser.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))

but neither works. Here's my full code:

import asyncio
import nodriver as uc
from nodriver import cdp

async def crawl():
    browser = await uc.start(headless=False)
    await asyncio.sleep(2)  # Wait without blocking the event loop
    browser.connection.add_handler(cdp.page.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))
    browser.connection.add_handler(cdp.browser.DownloadWillBegin, lambda event: print('Download event => %s' % event.guid))
    await browser.get("https://code.visualstudio.com/sha/download?build=stable&os=win32-x64-user")
    while True:
        await asyncio.sleep(0.2)  # Keep the event loop alive

if __name__ == '__main__':
    uc.loop().run_until_complete(crawl())

I have been stuck on this problem for weeks. I want to detect download starts across all tabs, whether open now or opened in the future. I would really appreciate help with this.
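One possible reason the cdp.browser.DownloadWillBegin handler never fires: in the Chrome DevTools Protocol, Browser.setDownloadBehavior takes an eventsEnabled flag that defaults to false, and the Browser-domain download events are only emitted when it is true. The sketch below has not been tested against a live browser. It assumes nodriver exposes the command as cdp.browser.set_download_behavior with an events_enabled keyword, and that browser.connection.send accepts it. The function name enable_browser_download_events is hypothetical, and the connection and the cdp.browser module are passed in as parameters so the logic can be checked with stand-ins.

```python
async def enable_browser_download_events(connection, cdp_browser, download_dir, on_begin):
    """Ask the browser to emit Browser.* download events, then subscribe.

    Assumptions: `connection` is nodriver's browser.connection and
    `cdp_browser` is the nodriver.cdp.browser module.
    """
    # CDP emits Browser.downloadWillBegin only when eventsEnabled is true;
    # the flag defaults to false.
    await connection.send(
        cdp_browser.set_download_behavior(
            behavior="allow",
            download_path=str(download_dir),
            events_enabled=True,
        )
    )
    connection.add_handler(cdp_browser.DownloadWillBegin, on_begin)


# Assumed wiring inside crawl(), after `browser = await uc.start(...)`.
# The directory is a placeholder:
#     await enable_browser_download_events(
#         browser.connection, cdp.browser, "/tmp/downloads",
#         lambda event: print("Download event => %s" % event.guid),
#     )
```

Whether nodriver routes Browser-domain events to handlers registered on browser.connection is not confirmed in this thread, so treat this as something to try, not a verified fix.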
@JoaoSobhie I think you should disable ' |
Hello. I wanted to make a nodriver script that can detect downloads and their status. I tried to log "Download Started..." whenever a download begins in a tab, but the message is never printed, which leads me to believe the event in my script is not set up correctly. I would also be grateful if anyone could add functionality to detect download completion once a download has started. Thanks in advance!
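For tracking completion as well as start, one option is a small tracker fed by two handlers: one for the "will begin" event and one for the progress event. The sketch below is a minimal illustration, not nodriver's own API. DownloadTracker and DownloadRecord are hypothetical names. It assumes the begin event carries guid, url and suggested_filename, and the progress event carries guid, received_bytes, total_bytes and a state of "inProgress", "completed" or "canceled", as in the CDP Page domain.

```python
from dataclasses import dataclass


@dataclass
class DownloadRecord:
    guid: str
    url: str = ""
    filename: str = ""
    received_bytes: float = 0
    total_bytes: float = 0
    state: str = "inProgress"


class DownloadTracker:
    """Collects download state from begin and progress events, keyed by guid."""

    def __init__(self):
        self.downloads = {}

    def on_begin(self, event):
        # Called for each "download will begin" event.
        self.downloads[event.guid] = DownloadRecord(
            guid=event.guid, url=event.url, filename=event.suggested_filename
        )
        print("Download started: %s" % event.suggested_filename)

    def on_progress(self, event):
        # A progress event may arrive for a guid we never saw begin.
        record = self.downloads.setdefault(event.guid, DownloadRecord(guid=event.guid))
        record.received_bytes = event.received_bytes
        record.total_bytes = event.total_bytes
        record.state = str(event.state)
        if record.state == "completed":
            print("Download completed: %s" % (record.filename or record.guid))

    def is_complete(self, guid):
        record = self.downloads.get(guid)
        return record is not None and record.state == "completed"
```

Assumed wiring, following the tab.add_handler pattern used elsewhere in this thread: `tab.add_handler(cdp.page.DownloadWillBegin, tracker.on_begin)` and `tab.add_handler(cdp.page.DownloadProgress, tracker.on_progress)`.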