-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce pa11y runtimes #3829
Reduce pa11y runtimes #3829
Conversation
This work attempts to solve the problem of overly long CI runs, in which single PRs were taking 25+ minutes, because each PR scans every page with pa11y-ci. See #3752. This commit adds a Jekyll::Hook which keeps track of which destination files (built HTML files) should be checked with pa11y-ci for each PR. The files to check are based on which files changed in the last commit. The design of the code lets you swap out different "differs", which are the objects responsible for getting the list of files. This design makes the code easier to test, and makes it more adaptable. One problem I'm seeing so far is that we only check what files were changed *in the last commit*. This means that, if you're ever running CI locally, you'd have to commit your changes to have them picked up by the "differ". We should probably change this in a future commit, and have a "differ" for local development and a "differ" for CI. To summarize, the differs should differ. We also have yet to implement the case where a stylesheet or something site-wide changes.
This commit adds site-wide pa11y checking. If a "global" folder (like _includes or _sass) sees a file change, we add a few files from several folders across the site, and include them in the list of files for pa11y to check. This also switches file "differs" based on the environment. If CI=true, then we diff the files from the last commit. Otherwise, we assume we're developing locally and get the files currently changed. This feels like draft code, the design feels off. There's constants set at file load/runtime, and with the new class involved, the file is a lot harder to read.
Fixes design issues in the previous commit. This commit: - Abstracts and simplifies the page-identification strategy, reducing code about special cases - Makes it easier to test, now that file paths can be injected - Adds minimal tests
54b7a2c
to
1e8d18a
Compare
I don't think it's necessary to do the `./serve-pa11y-ci` step anymore since we're scanning a list of files locally
- Ignore PDF documents, the SiteSampler was picking one up from the _principles directory - Extract the file checker logic into a function since another hook was needed that ran the same code - Add a :pages hook to pickup when files in the pages directory change, :documents wasn't doing that - Add rubular example of the RegEx being used to check for changes to files in the SITEWIDE_FOLDERS array
Tagging @beechnut - I know you're not on TLC crew quite yet but you did a lot of great work on this and wanted to make sure you had a chance to look at it. |
I also wonder if it's detrimental to never run a full pa11y scan. Should we consider adding a scheduled GitHub action to run a full scan maybe every two weeks? |
@caleywoods Yes, part of the scope of work is to add a nightly full scan. We shouldn't remove full scans from the pipeline without replacing them beforehand or in the same PR. |
- Allow the pa11y target file and differ to be injected into the class when instantiated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I've opened #3832 to track this and I plan on working on it this week. |
Pull request summary
Attempts to address #3752.
We're currently running pa11y scans using the sitemap which scans all of the URLs on the site which results in slowness. These changes use a two pronged approach to reducing the pa11y runtimes.
The Jekyll plugin
The first part is a Jekyll plugin that uses Jekyll hooks (
:documents
,:pages
, and:site
currently) to output a file namedpa11y_targets
to the filesystem during the build process. The plugin has the following behavior::documents
Jekyll hook and be included in the list of files to be scanned by pa11y. This takes care of the majority of file changespages/
directory, the:pages
Jekyll hook captures those to be included in the list of files to be scanned by pa11y (I'm not sure why:documents
doesn't get these)assets
,_includes
, or_sass
, the plugin does a sampling of pages across the site. These pages are: the homepage, three files that use each layout from_config.yml
'sdefaults
, as well as the first, middle, and last blog archive pages (/blog/page/:num
)pa11y_targets
file and each path is separated by a newlineThere's RSpec tests to go along with this plugin.
The CircleCi config
The second part is a tweak to
.circleci/config.yml
when the Jekyll build step runs. This change does two things:pa11y_targets
file be base64 encoded and exported into an environment variable calledPA11Y_TARGETS
in CircleCI's$BASH_ENV
area which is stateful between job stepsPA11Y_TARGETS
environment variableResults
Without these changes, the full build takes around 25 minutes in CircleCI with the pa11y scan representing the bulk of that time taking around 20-21 minutes. Testing these changes showed CircleCI full builds in under 4 minutes with the pa11y scan portion being reduced to < 10 seconds in most cases. These are big gains for the common scenario of single file changes when someone is creating or editing a blog post.
_layouts/styled-container.html
was changed, 3m 49s runtime (Note: this scan was prior to the:pages
hook being added,_layouts/styled-container.html
applies to three pages in thepages/
directory but adding three more files to the pa11y scan would not have caused a significant increase in the pa11y scan time)This could still have some long running build times/pa11y scans depending on what files are touched. An example would be
_layouts/post.html
, when I checked locally, changing that particular layout resulted in pa11y scanning a little over 500 files.