[Feature request] Publish as a python package on PyPI #14

Open

aceberle opened this issue Jun 10, 2021 · 1 comment

Comments

@aceberle (Contributor)

I'm still catching up on how the Python world works, but with the recent changes made to deadseeker, I think it would be relatively easy to publish it as a Python package on PyPI so that it can be installed as a standalone command-line utility.

I think most of the work would involve creating the recommended project files and possibly pulling in some sort of command-line argument parsing utility so that the inputs can be provided via the command line instead of environment variables (although as a first pass, environment variables would work fine too).

But I'm thinking something like this:

```
>pip install deadseeker
>python -m deadseeker --includeprefix="https://a.com/" --always-get-onsite https://a.com/
```
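
For the parsing itself, a rough argparse sketch could look like the following. The flag names are taken from the example above; how they map onto the crawler's existing environment-variable inputs is just a guess on my part:

```python
# cli.py -- hypothetical argparse front end; flag names come from the
# example above, and the mapping onto the existing environment-variable
# inputs is an assumption
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="deadseeker")
    parser.add_argument("url", help="root URL to start crawling from")
    parser.add_argument("--includeprefix",
                        help="only crawl links that start with this prefix")
    parser.add_argument("--always-get-onsite", action="store_true",
                        help="always re-fetch on-site pages with GET")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.url, args.includeprefix, args.always_get_onsite)
```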

Use case

I am using Cloudflare in front of my static site to help with edge caching and speed up the site's response time. When I publish changes to the site, I have a workflow set up that uses another action to send a cache-purge request to Cloudflare, and then I use broken-links-crawler-action to crawl the site to repopulate the cache. As far as I can tell, this works perfectly now.

My client-side caching headers specify that pages should only be cached for an hour, so I was considering setting up a periodic process (cron job) on my webserver to re-crawl the site every hour so that the cache-refresh hit is less likely to impact an actual user, and I figured that the logic in broken-links-crawler does this quite well. If the logic were also provided as a Python package, I could install it on the webserver and run it via cron.
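
For illustration, the hourly re-crawl could then be a single crontab entry along these lines (using the hypothetical flags from the example above):

```
# re-crawl the site at the top of every hour
0 * * * * python -m deadseeker --includeprefix="https://a.com/" --always-get-onsite https://a.com/
```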

I know that this can easily be done by just cloning the project onto my webserver and running the code directly from there (and I might just do that today to solve my use case for now), but I thought it might make things easier for other people if it were a published package. I may just be looking for an excuse to publish a package on PyPI to learn how it's done, so feel free to ignore this request if it falls too far outside the purpose of the project.

@ScholliYT (Owner)

Good idea @aceberle. I have already thought about separating the core logic from the GitHub action itself. This would also make it possible to publish the broken links crawler as a standalone Docker image with clean environment-variable input (instead of the INPUT_ prefix required by GitHub Actions). This Docker image could also solve your use case (although it is a bit overkill compared to a cron job) and many others, like using the crawler in a GitLab pipeline.
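
As a sketch of what such an image might look like (the install step assumes the packaging files discussed in this issue exist, and the environment-variable name is made up for illustration):

```dockerfile
# minimal sketch of a standalone image; package layout and the
# environment-variable name are assumptions
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install .
# plain environment variables instead of the INPUT_ prefix, e.g.:
#   docker run -e INCLUDE_PREFIX="https://a.com/" deadseeker https://a.com/
ENTRYPOINT ["python", "-m", "deadseeker"]
```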

I will have a look at how to build a Python package and publish it to PyPI, since I haven't done that before either.
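
For reference, a minimal setup.py could be as small as this (all metadata values are placeholders, and the console-script entry point assumes a main() function exists somewhere in the package):

```python
# setup.py -- minimal packaging sketch; all metadata values are
# placeholders, and the entry point assumes a main() function exists
from setuptools import setup, find_packages

setup(
    name="deadseeker",
    version="0.1.0",
    description="Broken links crawler",
    packages=find_packages(),
    python_requires=">=3.6",
    entry_points={
        "console_scripts": [
            "deadseeker=deadseeker.__main__:main",
        ],
    },
)
```

Building and uploading would then be roughly: `python -m pip install build twine`, `python -m build`, and `python -m twine upload dist/*`.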
