Large file lookups are slow #228
Comments
Try Fuzi.
As ridiculous as it sounds, we're switching to Rust FFI using UniFFI and the scraper crate (built on html5ever). The reason: in preliminary tests on large pages with lots of parsing, this approach outperforms SwiftSoup by about 15 times, and that is without any sort of concurrency (we had relied heavily on concurrency to mitigate SwiftSoup's speed). The jury is still out on small pages, and the conversion is very much incremental (800 ms parsing was very much an emergency).
Amazing! If you wrap that into an SPM package, please do share.
Setup was convoluted and poorly documented; I've written a tutorial on how we set up UniFFI here. Note that instead of a Swift wrapper for the scraper crate, all the business logic lives in Rust, so unfortunately it's not generalizable to a package. This is because Rust is cool and FFI has some overhead. I can say that FFI has been a joy to use. It's a miracle how well it works once configured; there's absolutely no indication that what you're calling is a Rust function. A couple of limitations to be aware of, though:
Overall I do not regret it. Edit: Just to substantiate these claims: this is largely verbatim copy-and-paste code, with the exception of SwiftSoup running many operations async. The result is a 14.34x difference.
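For anyone wondering what "all the business logic lives in Rust" looks like in practice, here is a minimal sketch of the shape of such a boundary, not the code from the comment above. One call takes the whole HTML string and returns plain data, so the FFI overhead is paid once per page instead of once per node. The names `PageSummary` and `summarize_page` are hypothetical, and the naive string scanning is only a standard-library stand-in for the scraper-based parsing so the snippet runs without dependencies. In a real UniFFI setup the struct and function would additionally be exported through UniFFI.

```rust
/// Plain data handed back to the caller in a single call.
/// (Hypothetical type; with UniFFI it would be exported as a record.)
#[derive(Debug, PartialEq)]
pub struct PageSummary {
    pub title: Option<String>,
    pub links: Vec<String>,
}

/// Text between the first `<title>` and `</title>`, if both are present.
/// Stand-in for a real selector query; not a robust HTML parser.
fn extract_title(html: &str) -> Option<String> {
    let start = html.find("<title>")? + "<title>".len();
    let end = html[start..].find("</title>")? + start;
    Some(html[start..end].trim().to_string())
}

/// Every double-quoted `href="..."` value, in document order.
/// Again a stand-in: a real implementation would walk the parsed DOM.
fn extract_links(html: &str) -> Vec<String> {
    let marker = "href=\"";
    let mut links = Vec::new();
    let mut rest = html;
    while let Some(pos) = rest.find(marker) {
        let after = &rest[pos + marker.len()..];
        match after.find('"') {
            Some(end) => {
                links.push(after[..end].to_string());
                rest = &after[end + 1..];
            }
            // Unterminated attribute value: stop scanning.
            None => break,
        }
    }
    links
}

/// The single coarse-grained entry point: HTML in, finished result out.
/// No DOM nodes or iterators ever cross the language boundary.
pub fn summarize_page(html: &str) -> PageSummary {
    PageSummary {
        title: extract_title(html),
        links: extract_links(html),
    }
}
```

The design choice is the point here: the Swift side would call `summarize_page` once and get back a value type, rather than calling into Rust for every element lookup.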
Thanks for sharing the writeup. I also came across cargo-swift (https://github.com/antoniusnaumann/cargo-swift), which looks promising. lol-html is also interesting, not just for parsing but also for transforming. https://shadowfacts.net/2022/swift-rust/
I have a large HTML file, about 13 MB, and it takes far too long to find the modifications. Is there any way to find changes quickly?