Adds zlib v1.3.1 into the base docker image #26

Merged 3 commits on Nov 11, 2024


vladimir-harrison
Contributor

@vladimir-harrison vladimir-harrison commented Nov 6, 2024

As per this Slack thread: https://harrison-ai.slack.com/archives/C03S32UBFL2/p1730757854360519

Benchmark of per-file gzip read time (in seconds) on randomly chosen files from the CTC cache, by zlib version:

1.2.11
np.mean(times)=1.9286178301761645, np.std(times)=0.06485639280564721, np.min(times)=1.8074659667909145, np.max(times)=2.2263533268123865, len(times)=101

1.2.13
np.mean(times)=1.664339867079317, np.std(times)=0.08087228980916963, np.min(times)=1.5270744245499372, np.max(times)=2.0380705669522285, len(times)=101

1.3.1
np.mean(times)=1.6679777112780232, np.std(times)=0.05479970442945612, np.min(times)=1.5598770808428526, np.max(times)=1.8912039399147034, len(times)=101
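
To put the three means side by side, here is a minimal sketch that computes how much less time each newer zlib version takes relative to 1.2.11. The means are copied from the output above; the helper name `relative_reduction` is made up for illustration. On these numbers both 1.2.13 and 1.3.1 come out roughly 13 to 14 percent faster, and the two differ from each other by well under one standard deviation.

```python
# Mean per-file read times in seconds, copied from the benchmark output above.
mean_seconds = {
    "1.2.11": 1.9286178301761645,
    "1.2.13": 1.664339867079317,
    "1.3.1": 1.6679777112780232,
}


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fraction of time saved versus the baseline (0.1 means 10% less time)."""
    return 1.0 - candidate / baseline


baseline = mean_seconds["1.2.11"]
reductions = {
    version: relative_reduction(baseline, mean)
    for version, mean in mean_seconds.items()
    if version != "1.2.11"
}

for version, reduction in reductions.items():
    print(f"zlib {version}: {reduction:.1%} less time than 1.2.11")
```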

@vladimir-harrison vladimir-harrison requested a review from a team as a code owner November 6, 2024 02:03
@vladimir-harrison
Contributor Author

I have built and pushed the image here: annalise-registry.hpc.harrisonai.io/tmp_zlib_fix

Test results with this image:

np.mean(times)=1.6538614263887157, np.std(times)=0.04069674772039104, np.min(times)=1.5238033179193735, np.max(times)=1.7589446865022182, len(times)=101

Test results with the current latest image:

np.mean(times)=1.950177812303352, np.std(times)=0.03374697578064131, np.min(times)=1.8691252954304218, np.max(times)=2.103273082524538, len(times)=101
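
As a rough sanity check that the gap between the two images is larger than run-to-run noise, the summary statistics above can be fed into Welch's t statistic. This is only a sketch: it treats the two runs as independent samples (the same files were presumably read in both, so a paired test on the raw timings would be more appropriate), and the function name `welch_t` is made up for illustration. `np.std` defaults to the population standard deviation (`ddof=0`), so the variances are rescaled to sample variances first.

```python
import math


def welch_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Welch's t statistic from summary statistics.

    std_a and std_b are population standard deviations (np.std with ddof=0),
    so they are rescaled to unbiased sample variances before use.
    """
    var_a = std_a**2 * n_a / (n_a - 1)
    var_b = std_b**2 * n_b / (n_b - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


# Current latest image versus the image with zlib 1.3.1, values from the output above.
t_stat = welch_t(
    1.950177812303352, 0.03374697578064131, 101,
    1.6538614263887157, 0.04069674772039104, 101,
)
print(f"Welch t statistic: {t_stat:.1f}")
```

A t statistic this far from zero means the difference in means is many standard errors wide, which is consistent with the improvement being real rather than noise.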

Test code:

from pathlib import Path
import gzip
import time
import zlib

import numpy as np
from tqdm.auto import tqdm

# Cached transform files used as the benchmark corpus.
paths = sorted(Path("/home/coder/repos/ctc-ai-dev/cache-tmp").glob("*.transform.gz"))

# Report the zlib version actually loaded at runtime.
print(zlib.ZLIB_RUNTIME_VERSION)

times = []
for p in tqdm(paths):
    # Time opening and fully decompressing each file, in seconds.
    start = time.monotonic()
    with gzip.open(p, "rb") as f:
        f.read()
    times.append(time.monotonic() - start)

print(f"{np.mean(times)=}, {np.std(times)=}, {np.min(times)=}, {np.max(times)=}, {len(times)=}")
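
The test code above times file open, disk read, and decompression together. A possible variant, sketched here under assumptions, loads the compressed bytes into memory first and times only `gzip.decompress`, which isolates the zlib cost from disk I/O. The synthetic payloads and the helper name `time_decompress` are stand-ins for illustration and not part of the original benchmark; with real data the blobs would be the bytes of the cached `*.transform.gz` files.

```python
import gzip
import os
import time
import zlib


def time_decompress(blobs):
    """Return per-blob decompression times in seconds, excluding any disk I/O."""
    times = []
    for blob in blobs:
        start = time.perf_counter()
        gzip.decompress(blob)
        times.append(time.perf_counter() - start)
    return times


# Hypothetical stand-in payloads: 256 KiB each, built from a repeated random block.
payloads = [os.urandom(1024) * 256 for _ in range(5)]
blobs = [gzip.compress(p) for p in payloads]

# Report the zlib version actually loaded at runtime.
print(zlib.ZLIB_RUNTIME_VERSION)

times = time_decompress(blobs)
print(f"{min(times)=}, {max(times)=}, {len(times)=}")
```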

@vladimir-harrison vladimir-harrison changed the title from "[WIP] Adds zlib v1.3.1 into the base docker image" to "Adds zlib v1.3.1 into the base docker image" on Nov 11, 2024
Review thread on base/Dockerfile (outdated, resolved)
@vladimir-harrison vladimir-harrison merged commit 9ea8944 into main Nov 11, 2024
1 check passed
3 participants