How to make a performance test tolerant to the device where it is run? #16

optimistex opened this issue Mar 21, 2024 · 1 comment

@optimistex

I did not find a way to make it tolerant, so I came up with my own solution: measuring performance relative to the performance of the built-in JSON.parse.

Any other idea/recommendation?

const measurementBase = await benchmark.record(
  () => JSON.parse('{"config":[{"key":"email","value":"email"},{"key":"mqiPassword","value":"mqiPassword"}]}'),
  { iterations: 1000 }
);

const measurement = await benchmark.record(
  () => cloneAndSanitize(test),
  { iterations: 1000, minUnder: measurementBase.min * 50 }
);

expect(measurement.totalDuration).toBeLessThan(measurementBase.totalDuration * 20);
@mtkennerly
Owner

I think this would be a good feature to add :) I can think of some different ways to incorporate this into MeasureOptions, but I'm not sure which is the best trade-off. I'm open to others' thoughts on this.

Option 1: Compare directly to baseline function

This is similar to your example (the test function must take less than 50x the baseline's time):

const options = {
  iterations: 1000,
  baseline: await measure(
    () => JSON.parse('{"config":[{"key":"email","value":"email"},{"key":"mqiPassword","value":"mqiPassword"}]}'),
    { iterations: 1000 },
  ),
};

const measurement = await benchmark.record(
  () => cloneAndSanitize(test),
  { ...options, baselineMultiplier: 50 },
);

Let's say that baseline has a mean of 10 ms and measurement has a mean of 200 ms. We'll verify that 200 < 10 * 50.
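
For illustration, a minimal sketch of the check that `baselineMultiplier` implies (the function name and error message below are hypothetical, not part of the current API):

// Hypothetical internal check for Option 1 (names are illustrative only).
// Passes when the measured mean is below `multiplier` times the baseline mean.
function assertUnderBaseline(measuredMean: number, baselineMean: number, multiplier: number): void {
  if (measuredMean >= baselineMean * multiplier) {
    throw new Error(
      `Mean of ${measuredMean} ms is not under ${multiplier}x the baseline mean of ${baselineMean} ms`
    );
  }
}

// With the numbers above: assertUnderBaseline(200, 10, 50) passes, since 200 < 500.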

  • Pro: This is simple to implement.
  • Con: Thinking in terms of "this function is faster than 50x some other function" is not as straightforward as "this function is faster than X ms".
  • Con: Since the comparisons are implicit, you can't explicitly use meanUnder/etc, or at least I haven't thought of a nice way to make them work.

Option 2: Compare baseline function with itself on different systems

const options = {
  iterations: 1000,
  baseline: {
    reference: new Measurement([10, 11, 20, ...]), // snapshot from `measure` on some test system
    current: await measure(
      () => JSON.parse('{"config":[{"key":"email","value":"email"},{"key":"mqiPassword","value":"mqiPassword"}]}'),
      { iterations: 1000 },
    ),
  },
};

const measurement = await benchmark.record(
  () => cloneAndSanitize(test),
  { ...options, meanUnder: 100 },
);

Let's say baseline.reference has a mean of 20 ms, baseline.current has a mean of 30 ms, and measurement has a mean of 140 ms. 20 / 30 = 0.66..., so we'll verify that 140 * 0.66... < 100.
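
For illustration, a sketch of the normalization this implies (the helper below is hypothetical, not part of the current API):

// Hypothetical normalization for Option 2 (names are illustrative only).
// Scales the measured mean by reference/current, i.e. by how much faster the
// reference machine is than the machine running the test right now.
function normalizedMean(measuredMean: number, referenceMean: number, currentMean: number): number {
  return measuredMean * (referenceMean / currentMean);
}

// With the numbers above: normalizedMean(140, 20, 30) ≈ 93.3, which satisfies meanUnder: 100.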

  • Pro: You can write the expected times in terms of your main development system, which is more straightforward than measuring relative to some other function.
  • Con: Having to record/update the reference snapshot might be annoying.
  • Con: This works best when a single person is setting the thresholds based on their system. If multiple developers set thresholds based on their own systems, then the thresholds won't be consistent with the baseline. You'd probably want to run it without thresholds first on a standard system (e.g., a GitHub workflow), then set the thresholds based on those measurements (one way to record such a snapshot is sketched below).
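
As a rough sketch of one way to record and reuse the reference snapshot (imports for the benchmarking library are omitted as in the snippets above; the file name and the assumption that a Measurement exposes the durations it was constructed with are illustrative):

import * as fs from "fs";

// On the standard system (e.g., in a GitHub workflow step): record the baseline
// and save its raw durations to a file committed to the repository.
// (Assumes the returned Measurement exposes a `durations` array, as suggested by
// the `new Measurement([10, 11, 20, ...])` constructor call above.)
const reference = await measure(
  () => JSON.parse('{"config":[{"key":"email","value":"email"},{"key":"mqiPassword","value":"mqiPassword"}]}'),
  { iterations: 1000 },
);
fs.writeFileSync("baseline-reference.json", JSON.stringify(reference.durations));

// In the test suite: rebuild the reference Measurement from the snapshot.
const referenceSnapshot = new Measurement(
  JSON.parse(fs.readFileSync("baseline-reference.json", "utf8")),
);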
