Replies: 2 comments 3 replies
-
You can try converting the tables to JSON: https://microsoft.github.io/genaiscript/reference/scripts/html/#converttablestojson

Or even use the Playwright integration to load the page in a browser and "surgically" extract the tables you need: https://microsoft.github.io/genaiscript/reference/scripts/browser/#_top

The default max token limit is 4000 because it is so easy for a tool to return a massive result and overflow the context window. It could be bumped up so that, if you use a long-context model, you get the full output.
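To illustrate why extracting tables as JSON helps here, the sketch below keeps only the table content of a page and drops everything else (navigation, styling, scripts), so the field/column relationships survive while the payload shrinks. It is a stand-in written with Python's standard `html.parser`, not GenAIScript's own implementation; the names `TableExtractor` and `tables_to_json` are hypothetical, and it assumes the first row of each table is the header and that tables are not nested.

```python
import json
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []    # finished tables
        self._rows = None   # rows of the table being read (None outside a table)
        self._row = None    # cells of the row being read
        self._cell = None   # text fragments of the cell being read

    def _close_cell(self):
        # Store the current cell with whitespace collapsed to single spaces.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None and self._rows is not None:
            self._rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            if self._rows is not None:
                self._row = []
        elif tag in ("td", "th"):
            self._close_cell()  # tolerate a missing </td> or </th>
            if self._row is not None:
                self._cell = []

    def handle_data(self, data):
        # Text outside of table cells is ignored entirely.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()
        elif tag == "table":
            self._close_row()
            if self._rows is not None:
                self.tables.append(self._rows)
            self._rows = None


def tables_to_json(html: str) -> str:
    """Return compact JSON: one list per table, one object per data row keyed by header."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    result = []
    for rows in parser.tables:
        if not rows:
            continue
        header, *body = rows  # assumption: the first row holds the column names
        result.append([dict(zip(header, row)) for row in body])
    return json.dumps(result, separators=(",", ":"))
```

Calling `tables_to_json(page_html)` on a fetched documentation page would yield a compact string in which every row is an object keyed by column name, which tends to be both much smaller than the raw HTML and easier for the LLM to read than a flattened text or Markdown conversion.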
-
in 1.81.0
-
Every time I want to fetch a page and send it to the LLM, it blows past the token limit.
For example, I would like to pass this documentation page to the LLM to generate a JSON schema. Converting HTML to text or HTML to Markdown breaks the table structure, so the LLM doesn't properly understand the documentation of the fields.
What strategy do you suggest to minify the HTML without compromising its structure?