Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Really optimise activation speed #121

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

vitch
Copy link
Contributor

@vitch vitch commented Jul 4, 2022

What Changed & Why

Building on #119 and trying to fix #120 this PR introduces a couple of new config options for this plugin:

  • fetchOnlyRelevantRevisions - If set then this changes the behaviour of fetchInitialRevisions so that it only continues hitting ListObjects until it has found the currently active revision. This is done on the assumption that the main use-case for fetchInitialRevisions is to be able to build other plugins which record the previous and current version when you activate
  • disableFetchRevisions - if set then this short-circuits fetchRevisions within this plugin. This is because fetchRevisions is called after activate and I'm not sure what the list of revisions is needed for (see related conversation in Activate task becomes very slow when you have a lot of deployed items #120). Not loading it can speed things up considerably.

If I use this branch and set both flags for one of our apps and environments (which has ~60,000 deployments in S3) then the activate command completes in 139 seconds on my computer (down from 590s on #119 and 835s prior to that - an overall improvement of over 80%).

The way to get quicker than this is to track the currently active version more explicitly somewhere so we don't have to loop over S3 at all - at that point it feels like maybe it is a different plugin though...

Related issues

#120

PR Checklist

I'm happy to do the below stuff but thought it was worth submitting this for discussion at this point... How do people feel about the changes? Do they make sense? Would they be useful to anyone other than us? Any better names for the config options?

  • Add tests
  • Add documentation
  • Prefix documentation-only commits with [DOC]

People

All maintainers? :)

Kelvin Luck added 4 commits July 4, 2022 21:53
There are cases where we might want to page through the list of objects available
on S3 but we don't need to keep going once we've found what we are looking for.
This commit updates `listAllObjects` to accept an `until` argument - if this
is provided then when a new page of results is loaded we check if `some` of them
match the `until` - if so we don't bother loading another page.

A concrete use-case for this is in ember-cli-deploy#120 - sometimes we just want to
find the currently active revision so we only need to loop through the "pages"
on S3 until we do.
This loads the head object and then calls `listAllObjects` with the new
`until` parameter and stops as soon as it has found an object which
matches the `ETag` of the active revision.

It returns the active revision in the same format that `fetchRevisions`
would - for backward compatibility with use-cases which were depending
on that.
If you set this option in your S3 config then we will only load enough revisions from
S3 to find the currently active version and then we will stop.
The assumption here is that the main use-case for `fetchInitialRevisions` is to be
able to do some sort of changelog or audit log from another plugin where we would
want to know the details of the previously active revision.
Since looping over revisions from S3 can be slow when you have too many of them
stored (see ember-cli-deploy#120) we provide an option to short circuit that.
If you pass this flag then `fetchRevisions` doesn't attempt to fetch _any_ revisions
from S3. This is useful because fetching the revisions from S3 can take a long time
(see ember-cli-deploy#120) and if you don't have any reason to use them then this time is wasted.

In our use case I have hooked this flag up to an environment variable. e.g. - in `deploy.js`:

```
disableFetchRevisions: process.env.DISABLE_FETCH_REVISIONS
```

Then we can set that environment variable when we are calling the `activate`
command and leave it off at any other time (so e.g. `ember deploy:list` works as
expected - if you have the time to wait)
@vitch vitch force-pushed the really-optimise-activation-speed branch from f8c3208 to cc321ef Compare July 4, 2022 21:29
It's possible you are deploying or activating an app which hasn't
previously been activated. In these cases we won't be able to
find a currently active revision and we shouldn't error
@lukemelia
Copy link
Contributor

Assuming 1) these new flags default to false and 2) the new code paths have decent test coverage to prevent regressions, I'm supportive of these changes.

@vitch
Copy link
Contributor Author

vitch commented Jul 5, 2022

Cool - I'll see if I can put together some test coverage...

I'm wondering about another flag (maybe disableFetchInitialRevisions although I'm open to input on naming for all of them) which completely short circuits fetchInitialRevisions. In our usage we can potentially get the information from the last active revision from elsewhere (our audit log) so we don't need to do any ListObject at all... Thoughts?

@lukemelia
Copy link
Contributor

Here's another out-of-the-box idea: instead of these flags, we allow configuration of a fetchInitialRevisionsFunc that is used instead of the included implementation. Then you could customize this to use your "cache" or make it a no-op, and other people could potentially adapt it as they see fit. WDYT?

@vitch
Copy link
Contributor Author

vitch commented Jul 7, 2022

Interesting idea... I guess we'd want a fetchRevisionsFunc as well. And we would want the base one to accept until like on this PR (otherwise people would need to recreate that logic themselves).

At that point I guess rather than the config funcs I could just make an ember-cli-deploy-s3-index-optimised which extended this one and overwrote those functions with the behaviour I wanted. Which would be completely opt-in... And the majority of the code would still live in this package...

Thoughts?

@lukemelia
Copy link
Contributor

At that point I guess rather than the config funcs I could just make an ember-cli-deploy-s3-index-optimised which extended this one and overwrote those functions with the behaviour I wanted. Which would be completely opt-in... And the majority of the code would still live in this package...

That's certainly an option that is open to you. Personally, I think the cost of maintenance would be markedly lower if you just needed to ensure that your configured functions continue to work as desired rather than deal with general dependency updates and tracking potential superclass changes.

@vitch
Copy link
Contributor Author

vitch commented Jul 8, 2022

I was thinking of it more from the point of view of purity of this addon... It felt a bit like the start of a slippery slope to "too many options" adding config for the choice of overriding two of the core functions...

I'm happy to add it here and see how it looks?

@lukemelia
Copy link
Contributor

Go ahead and try it. I would describe ember-cli-deploy's general philosophy in this area as "purity in core, pragmatism in plugins"

@vitch
Copy link
Contributor Author

vitch commented Jul 8, 2022

Linked a small optimisation and a new approach above.

One thing I've just realised is that there is no way from the context to find out which command was run (e.g. ember deploy:list or ember deploy:activate) - unless I'm missing something? With this limitation, it means I can't configure anything such that ember deploy:list still works while still optimising for the upload / activation use cases?

Ignore that - I just remembered that I did it via an environment variable which is read in the config for the previous implementation...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Activate task becomes very slow when you have a lot of deployed items
2 participants