Crawler does not seem to work on websites that use shadowDOM #552
Comments
Hi, it indeed doesn't seem possible to access the shadow DOM via query selectors. As long as you can run a query selector for something from the console, our scraper will be able to get it, so you will be able to use DocSearch!
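To see why this matters for the crawler, here is a minimal sketch (assuming an open shadow root on the `gem-book` element mentioned in this thread) of how document-level selectors stop at the shadow boundary:

```js
// Elements inside a shadow root are invisible to document-level queries.
document.querySelector('gem-book');          // found: the host element is in the light DOM
document.querySelector('gem-book a[href]');  // null: selectors do not pierce the shadow boundary
document.querySelector('gem-book')           // open shadow roots can still be queried directly
  ?.shadowRoot?.querySelector('a[href]');
```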
The content of a shadow DOM cannot be selected through a CSS selector or XPath. To select shadow DOM content with something like a CSS selector, the selector syntax needs to be extended, for example by using `>>` to cross shadow boundaries:

```js
'body gem-book >> gem-book-sidebar >> gem-active-link >> a[href]'.split('>>').reduce(
  (p, c, index, arr) => {
    const isLastSelector = index === arr.length - 1;
    return p.map((e) => [...e.querySelectorAll(c)].map((ce) => (isLastSelector ? ce : ce.shadowRoot))).flat();
  },
  [document],
);
```

This is also an example that can be run directly in the browser.
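To make the `>>` pattern above reusable, here is a hedged sketch of a helper (the name `deepQuerySelectorAll` is hypothetical, not part of DocSearch) that resolves each segment against the previous matches' open shadow roots:

```js
function deepQuerySelectorAll(selector, root = document) {
  return selector.split('>>').reduce(
    (scopes, segment, index, segments) => {
      const isLast = index === segments.length - 1;
      return scopes
        .filter(Boolean) // skip hosts whose shadow root is closed or missing
        .flatMap((scope) => [...scope.querySelectorAll(segment.trim())]
          .map((el) => (isLast ? el : el.shadowRoot)));
    },
    [root],
  );
}

// Usage with the selector from the comment above:
const links = deepQuerySelectorAll(
  'body gem-book >> gem-book-sidebar >> gem-active-link >> a[href]',
);
```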
Hi, I viewed the source code today and found that only a small update is needed to support shadow DOM. A custom downloader can be used to pull the whole DOM:

```python
# pseudocode (Selenium)
driver.execute_script("return document.documentElement.getInnerHTML();")
```

This will return a result like:

```html
<head>...</head>
<body>
  <gem-book>
    <template shadowroot="open">
      ... content
    </template>
  </gem-book>
</body>
```
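For context on the snippet above: `getInnerHTML()` was an experimental Chromium API, and serializing open shadow roots into declarative `<template shadowroot>` markup requires passing an option. A minimal browser-side sketch, assuming a Chromium build that still ships the API:

```js
// Serialize the page including open shadow roots as declarative templates.
const html = document.documentElement.getInnerHTML({ includeShadowRoots: true });
// `html` now contains <template shadowroot="open"> blocks like the output above,
// so a crawler can parse the shadow content with an ordinary HTML parser.
```

Newer Chromium versions replace this with `getHTML({ serializableShadowRoots: true })`.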
Hello Algolia Devs,
I tried to add a search function to my website, but I got the reply that no content could be crawled. Is it because my website uses shadow DOM?
Thank you.