Remove Webdriver Dependency / Linux Support / Update Node modules #1

Open
cycle-five opened this issue Jun 7, 2022 · 2 comments

@cycle-five

Hey Aiden, I've been looking for something like this and it looks like you've done a great job. I'd be interested in running this API on a Linux server, though, and I'd like to remove the dependency on the webdriver and just use HTTP requests. Were there any fundamental issues you ran into that caused you to go the webdriver route? A lot of the Node modules are deprecated or outdated, so I'd like to at least update those and get this running as is.

@aidenvalentine
Owner

When I first started working on this project I had to learn everything about automating forms from square one. I tried iMacros, Selenium for KNIME, PhantomJS, etc. to get it done. So this would have been from a time before I got into some more advanced techniques. I did use Selenium Grid to have multiple browser instances available for parallel processing, and it didn't tie up the main NodeJS thread.

With WebDriver the user can intercept the request and fill in any missing data on the form. Recently, the fallout from the Girls Do Porn case and the New York Times exposé about PornHub caused a dramatic change in performer identification industry-wide. Most sites now require the uploader to upload all performer IDs and model releases along with the form.

So having a browser window pop up allows the user to upload any files and documents like IDs/2257, verify that the description and title don't get flagged for banned words (which would result in the upload being rejected), etc. It was built as a workflow sidekick for producers, and a starting API for automation.

It should definitely be possible to use something like NPM Request and send raw POST requests. You'd have to log in to the site, which may require cracking a captcha. You can see an example of how to do that here: https://github.com/aidenvalentine/xvideos-user-stats-tracker/blob/f80c80251443f9167c9fb09e453667b24974774c/xvideos-captcha-bypass-get-earnings.js#L86
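For illustration, here is a minimal sketch of what a raw login POST might look like without a browser. Since the `request` package has been deprecated since 2020, the sketch assumes the `fetch` built into Node 18+ instead. The `/login` path, the `username`/`password` field names, and both helper functions are hypothetical; a real site will have its own endpoint, field names, and possibly CSRF tokens.

```javascript
// Hypothetical sketch: the "/login" path and the form field names are
// assumptions, not taken from any real site or from this project.
function buildLoginRequest(baseUrl, username, password) {
  // URLSearchParams handles form-urlencoding of special characters.
  const body = new URLSearchParams({ username, password });
  return {
    url: new URL("/login", baseUrl).toString(),
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: body.toString(),
      // Do not follow redirects, so the login response's cookies stay visible.
      redirect: "manual",
    },
  };
}

// Reduce Set-Cookie header values to the "name=value" pairs
// that go into the Cookie header of later requests.
function cookieHeaderFrom(setCookieValues) {
  return setCookieValues.map((c) => c.split(";")[0].trim()).join("; ");
}
```

Usage would be roughly `const { url, options } = buildLoginRequest(base, user, pass); const res = await fetch(url, options);`, then passing `cookieHeaderFrom(...)` of the response's `Set-Cookie` values as the `Cookie` header on the upload request. Keeping the request construction separate from the network call makes it easy to test without hitting a site.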

@cycle-five
Author

Awesome, thanks for the insight, that all makes a lot of sense. Cracking captchas definitely isn't the plan. Given the way HTTP operates, though, it should certainly be possible to detect such captchas or required verification steps and redirect them asynchronously to the end user without having to run a browser instance on the backend. I imagine that with a static IP per account that should only happen once at the start, and rarely afterwards, so the uploading on the API side should be able to operate completely idempotently.
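As a rough sketch of that idea: the backend could classify each HTTP response and, when it looks like a challenge, park the upload until the user has dealt with it. The status codes, the body markers in the regular expression, and the names `classifyResponse` and `PendingActions` are all assumptions for illustration; each site signals captchas and verification differently.

```javascript
// Hypothetical heuristics: which status codes and body markers indicate a
// challenge is an assumption and would need tuning per site.
function classifyResponse(status, body) {
  if (/captcha|verify you are human/i.test(body)) return "needs_user_action";
  if (status === 401 || status === 403) return "needs_login";
  if (status >= 200 && status < 300) return "ok";
  return "error";
}

// In-memory registry of uploads waiting on the end user.
class PendingActions {
  constructor() {
    this.items = new Map();
  }
  // Keyed by upload id, so recording the same upload twice
  // overwrites the entry instead of duplicating it.
  add(uploadId, challengeUrl) {
    this.items.set(uploadId, { challengeUrl, createdAt: Date.now() });
  }
  // Returns true if the upload was pending and is now cleared.
  resolve(uploadId) {
    return this.items.delete(uploadId);
  }
  list() {
    return [...this.items.keys()];
  }
}
```

The API could expose `list()` to a front end so the user sees which uploads need attention, then retry an upload once `resolve()` has been called for it.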

I'm going to try to rip out the webdriver, update the modules, and get ManyVids working in the next few days. That should be a good proof of concept, and it's one of the more important features I'm looking for.
