
Memory retention causing page to crash #437

Open
DJSab opened this issue Jun 29, 2019 · 39 comments

@DJSab

DJSab commented Jun 29, 2019

I'm having a serious issue with memory retention on Chrome: the page dies at some point. The plan is to upload large files that could exceed 150GB, but even a 10GB file is not possible (the exact limit may vary with computer specs). Please help me out! Thank you.

      signerUrl: {},
      aws_key: {},
      bucket: {},
      cloudfront: false,
      computeContentMd5: true,
      progressIntervalMS: 1000,
      partSize: 100 * 1024 * 1024,
      sendCanonicalRequestToSignerUrl: true,
      maxConcurrentParts: 10000,
      cryptoMd5Method: function (data) { return AWS.util.crypto.md5(data, 'base64'); },
      cryptoHexEncodedHash256: function (data) { return AWS.util.crypto.sha256(data, 'hex'); },
      s3FileCacheHoursAgo: 1,
      allowS3ExistenceOptimization: true
@thomaslange24

I am facing the same issue, and according to the following, this should be related to the md5 method or the sha256 hash:
https://github.com/TTLabs/EvaporateJS/issues/358
https://github.com/asmcrypto/asmcrypto.js/issues/69
The following recommends using incremental md5 for processing large amounts of data:
https://github.com/satazor/js-spark-md5
But I don't know how to use that, or whether Evaporate even supports it.
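
For reference, spark-md5 exposes an incremental ArrayBuffer hasher. A minimal sketch (method names per the js-spark-md5 README; the chunk loop and the incrementalMd5 helper name are hypothetical illustrations, since Evaporate hands cryptoMd5Method each part as a single ArrayBuffer):

// Hash one part in fixed-size slices rather than in a single call;
// end(true) returns the raw binary digest, which btoa() encodes to
// the base64 form S3 expects in Content-MD5.
function incrementalMd5(arrayBuffer, chunkSize) {
  var hasher = new SparkMD5.ArrayBuffer();
  for (var offset = 0; offset < arrayBuffer.byteLength; offset += chunkSize) {
    hasher.append(arrayBuffer.slice(offset, offset + chunkSize));
  }
  return btoa(hasher.end(true));
}

Note that since Evaporate already holds the whole part in memory when it calls the hash function, slicing inside the callback may not by itself address the retention problem.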

@jakubzitny
Collaborator

jakubzitny commented Jul 1, 2019

Well, you're providing the md5 function yourself. I can tell you what we're doing, since the spark and aws lib methods are problematic.

I've shared code that uses Node's crypto here. I think webpack is able to translate it into browser code. Can you please try it?
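
The linked code isn't reproduced here, but a sketch of the idea might look like this (an assumption on my part: webpack's Node shims provide "crypto" and Buffer in the browser):

import crypto from "crypto";

// Evaporate calls cryptoMd5Method with a part's data; S3 expects the
// base64-encoded MD5 digest in the Content-MD5 header.
function cryptoMd5Method(data) {
  return crypto.createHash("md5").update(Buffer.from(data)).digest("base64");
}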

@thomaslange24

Well, I've never used webpack before and I don't really understand how to do this. Can you give me an example?

@dtelad11

dtelad11 commented Aug 7, 2019

I am observing a similar problem when uploading an 8.3GB file. I tried the SparkMD5 and CryptoJS hash functions and different partSize values (going as low as 1MB). I tried blueimpMD5, but that crashes even before the upload begins.

In all cases, memory use keeps increasing, so it's probably a memory leak. With Chrome, the browser crashes after approximately one gigabyte is uploaded. Firefox manages to upload the file, but the browser interface slows down dramatically after 5GB or so, becoming completely unresponsive around the 8GB mark (thankfully the upload continues).

/* Different MD5 methods. */
cryptoMd5Method = (data) => btoa(SparkMD5.ArrayBuffer.hash(data, true));
cryptoMd5Method = (data) => CryptoJS.MD5(CryptoJS.lib.WordArray.create(data)).toString(CryptoJS.enc.Base64);
cryptoMd5Method = (data) => btoa(blueimpMD5(String.fromCharCode.apply(null, new Uint8Array(data)), null, true));
      evaporateOptions: {
        aws_key: ...,
        awsRegion: ...,
        bucket: ...,
        computeContentMd5: true,
        cryptoMd5Method,
        cryptoHexEncodedHash256: sha256,
        partSize: 100 * 1024 * 1024,
        logging: true,
      },

@Jrubensson

@dtelad11: Same here, tried the same hash functions with the same results. Did you find a solution to this?

@dtelad11

Use Firefox ...

@jakubzitny
Collaborator

Thanks for the debugging, @dtelad11; this might be a bit harder to debug. I'll try to allocate some time and have a look at it. Any chance you tried debugging it in Chrome DevTools?

There is a possible spot in the code where the reference to md5d stuff is not deleted. The code is pretty old and ugly though :P I'll see what I can do.

Any help is welcome 🙂

@dtelad11

Thanks @jakubzitny for looking into this! I did not try to debug this in Chrome DevTools, I'm afraid. Full disclosure, I'm not very adept at JS (I'm a data scientist), so that's as far as I could go with debugging. Sorry I couldn't be of more use!

@mtltechtemp

@jakubzitny Hi, any news?
I can't find where the excess references are...
I tried removing some, but I still have the same problem.

@VincentCharpentier

VincentCharpentier commented Nov 7, 2019

Same issue here.
This is causing serious business concerns. Do you know if there is a previous version of the lib that is not impacted by this issue? That could be a temporary workaround (sadly, I don't think I can tell my users to use Firefox).

I'm willing to help fix the issue, but I need a bit more insight.
@jakubzitny can you elaborate on the part where you mentioned "md5d" references?

@jakubzitny
Collaborator

Thanks for the comments. Regarding the md5 part, I am not really sure; I did not write the code and it is messy. And I have not managed to find proper time for this.

Could we start by creating a minimal repro where this always happens, @VincentCharpentier @mtltechtemp?

@VincentCharpentier

I'll try to do that this week

@VincentCharpentier

VincentCharpentier commented Nov 12, 2019

Ok, so I made this sandbox:
https://codesandbox.io/s/adoring-keller-qi93e

This is only a form to upload with any AWS and signer config.
I cannot put my own private AWS credentials in the code; that's why I made it that way.
Sadly, I cannot test it properly since my signer URL is protected by a strict CORS policy.

I would appreciate it if someone could test it and let me know whether memory usage increases indefinitely during the upload.

Sorry I cannot provide more for now 😞

@VincentCharpentier

Do you know if there's a way to do the signing process locally? I could also add a field for the private key and sign chunks from the client.

@jakubzitny
Collaborator

Yes, it is possible; you should use customAuthMethod.

At roughly what file size are you seeing the leaks?
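
For context, a rough sketch of wiring customAuthMethod into the config (option names per the Evaporate README; the placeholder values and the add() call are illustrative):

Evaporate.create({
  aws_key: AWS_ACCESS_KEY_ID,       // placeholder
  bucket: BUCKET_NAME,              // placeholder
  awsRegion: awsRegion,
  awsSignatureVersion: '4',
  computeContentMd5: true,
  cryptoMd5Method: cryptoMd5Method,
  cryptoHexEncodedHash256: sha256,
  // called instead of hitting a signerUrl; must resolve the signature
  customAuthMethod: customAuthMethod
}).then(function (evaporate) {
  evaporate.add({ name: file.name, file: file });
});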

@VincentCharpentier

With Google Chrome I can see memory usage increasing via the "Google Chrome Helper (Renderer)" process on a Mac, but I got similar reports from customers on Windows.
How long it takes for the leak to crash your browser then depends on your computer specs.
With a 4GB file you should be able to see it, I think.

@VincentCharpentier

Were you able to reproduce the issue with the setup?

@VincentCharpentier

I tried to update the demo code to use a local customAuthMethod, but I get a strange error when I try to upload:

initiate error: [...] TypeError: Failed to execute 'setRequestHeader' on 'XMLHttpRequest': Value is not a valid ByteString.
at eval (evaporate.js? [sm]:1136)
at new Promise ()
at InitiateMultipartUpload.SignedS3AWSRequest.sendRequestToAWS (evaporate.js? [sm]:1104)
at eval (evaporate.js? [sm]:1194)

@jakubzitny can you help?
you can check the demo here: https://codesandbox.io/s/adoring-keller-qi93e

@jakubzitny
Collaborator

Thanks for the effort @VincentCharpentier, I didn't really get to this yet.

I've never seen this error, it seems to be failing on

xhr.setRequestHeader('Authorization', self.signer.authorizationString());

which would mean the signer is set up incorrectly, but it is quite weird.

@VincentCharpentier

I'm only using the private key to generate an HMAC in customAuthMethod.
I don't see what kind of Authorization header Evaporate is trying to compute there. Or is it something computed from the public key? That doesn't sound like something AWS would allow, security-wise.

Do you see anything wrong in the customAuthMethod below? (priv_key and awsRegion are in scope.)

import crypto from "crypto";

function hmac(key, value) {
  return crypto
    .createHmac("sha256", key)
    .update(value)
    .digest();
}

function customAuthMethod(
    signParams,
    signHeaders,
    stringToSign,
    signatureDateTime,
    canonicalRequest
  ) {
    return new Promise(resolve => {
      const timestamp = signatureDateTime.substr(0, 8);
      const date = hmac(`AWS4${priv_key}`, timestamp);
      const region = hmac(date, awsRegion);
      const service = hmac(region, "s3");
      const signing = hmac(service, "aws4_request");

      resolve(hmac(signing, stringToSign));
    });
}

@jakubzitny
Collaborator

Okay, that doesn't look very good. The key in your hmac() function is supposed to be the AWS secret, not the date, region, and service that you're providing to it.

You should provide the secret to all the things you're hashing, right? Do you see it @VincentCharpentier?

@VincentCharpentier

I'm not really providing the date, region, etc. as the key; I'm providing the result of the previous hmac calls as the key. This code is already used in production on the API to sign requests and is working.
Basically I'm doing this:

function customAuthMethod(
  signParams,
  signHeaders,
  stringToSign,
  signatureDateTime,
  canonicalRequest
) {
  return hmac(
    hmac(
      hmac(
        hmac(
          hmac(
            `AWS4${priv_key}`,
            signatureDateTime.substr(0, 8)
          ),
          awsRegion),
        's3'
      ),
      'aws4_request',
    ),
    stringToSign,
  );
}
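
For reference, the nested calls above match the AWS Signature Version 4 key-derivation chain described in AWS's SigV4 signing documentation, so the derivation itself looks right. One hedged observation: hmac() resolves a raw Buffer, while Evaporate interpolates the resolved value into the Authorization header as a string, which could plausibly be the source of the ByteString error. A sketch with only the final digest changed to hex, reusing the hmac() helper and crypto import above:

function customAuthMethod(
  signParams,
  signHeaders,
  stringToSign,
  signatureDateTime,
  canonicalRequest
) {
  const signingKey = hmac(
    hmac(
      hmac(
        hmac(`AWS4${priv_key}`, signatureDateTime.substr(0, 8)),
        awsRegion
      ),
      "s3"
    ),
    "aws4_request"
  );
  // SigV4 expects the final signature as a lowercase hex string,
  // not a raw Buffer.
  return Promise.resolve(
    crypto
      .createHmac("sha256", signingKey)
      .update(stringToSign)
      .digest("hex")
  );
}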

@VincentCharpentier

But honestly, I don't know where that signing method comes from or where to find the relevant documentation used to write it.

@jakubzitny
Collaborator

You can remove the email.

And yeah, the signing is incorrect; you want to sign the content just once. So just create it and sign it. The signing method is the thing that you have, the hmac. And the private key is the key from Amazon.

@Jrubensson

Jrubensson commented Jan 7, 2020

Has anyone found a solution to this, or perhaps an alternative? We are experiencing this as well. I tried digging through the code but could not find a solution, though I also think this relates to previous hashes not being cleaned up.

@VincentCharpentier

I'm sorry, I don't have any more time for this issue atm; I think I will just migrate to another implementation at some point, but that's going to be a very long process.

@VincentCharpentier

Just a note here: Chrome 79 (the latest right now) seems to handle uploads well, without memory leaks.
Can anyone confirm this?

@Jrubensson

Tested it here on my end. I still experience the problem on Windows with Chrome, even Chrome 79. The issue does not seem to occur on macOS, though.

@mattmoreira
Contributor

Hi. Just confirmed this issue in Brave 1.3, equivalent to Chromium 80.

I'm uploading a profile of the memory so we can properly debug what is causing this. Testing with a 100MB file, we can clearly see that memory ramps up until the end of the upload. If we extrapolate this behavior to a bigger file, in theory, we should eventually see a crash.

To check this performance profile, rename the .txt to .json, please.
Profile-20200310T001454.txt

Tomorrow I'll try to take a closer look at it.

@mattmoreira
Contributor

Here are my findings after some investigation:

[Screenshots attached: Screen Shot 2020-03-10 at 22 05 08, Screen Shot 2020-03-10 at 22 11 33]

On macOS Catalina, on both Brave 1.3 (Chromium 80) and Chromium 78, there is no memory leak. My hypothesis from yesterday was debunked when I profiled the application again: once the upload of a part finishes, memory goes back to normal, except in cases where the request takes longer than expected, which leads to an increase in memory, as one of the screenshots shows.

I'll try to execute this on Windows now.

@mattmoreira
Contributor

mattmoreira commented Mar 10, 2020

On Windows 8.1 (ugh) and Chrome 72, the dev tools reported more than 90 samplings missing, and because of that, the performance graph ends during the initial uploads:

[Screenshot attached: Screen Shot 2020-03-11 at 00 49 30]

What I witnessed was that when I tried to upload a 400MB file, once I finished the profiling after 12 minutes of recording, the devtools window crashed and I lost my data. So if I had to bet, I'd say that a memory leak really is playing a role here.

So I'm not able to reach a conclusion right now, as I need more data. Tomorrow I'll investigate more; today I spent almost an hour trying to get localstack and the port forwarding of my Windows VM to work together.

@VincentCharpentier

Thanks for the awesome effort put into this, @mattmoreira!
I'm not comfortable using the profiler, but let us know if we can help.

@mattmoreira
Contributor

mattmoreira commented Mar 11, 2020

@VincentCharpentier,

I just tried it on Windows 8.1 with Chromium 72, took profilings lasting 60, 60, and 110 seconds, and apparently there are no issues with a memory leak. Possibly my crash yesterday happened due to the long duration of my profiling, plus the fact that I ran it in a 2GB VM.

It would help a lot if you could do small profilings of your uploads so we can better understand what's going on, because as it currently stands, I'm unable to reproduce the issue. Once you profile it, please save it and post it in this thread.

Also, please use this link, as I've fixed a problem in the previous sandbox: it was trying to upload the fakepath string, not the actual file content.

I'd also recommend using localstack to test this: you won't have to pay for Amazon usage and the test is easier to run, since authentication isn't necessary.

Installing localstack:
pip install localstack

Starting localstack:
localstack start

Creating a bucket with localstack:
AWS_ACCESS_KEY_ID='123' AWS_SECRET_ACCESS_KEY='xyz' aws --endpoint-url=http://localhost:4572 s3 mb s3://test

Data to use in the form fields:
AWS_ACCESS_KEY_ID: 123
AWS_SECRET_ACCESS_KEY: xyz
BUCKET_NAME: test
AWS_URL: http://localhost:4572

@VincentCharpentier

Hi,

I just asked our support team, and it seems that it's been a long time since a customer complained about this.

Without any code changes on our side, I would assume this was fixed by browser updates.
Not sure what we should do about this issue for now.

@mattmoreira
Contributor

I'd say that we should give it another try.

@VincentCharpentier, do you know what OS the affected users were on? Also, could you please download Chromium 72 and try it on Windows 7?

Calling @mtltechtemp, @dtelad11, and @Jrubensson, as you all experienced the same issue. Are you able to reproduce this with the link that I provided? If so, what are your computer specs, browser version, and OS version?

@Jrubensson

Jrubensson commented Mar 13, 2020

@mattmoreira: First of all, thank you for working on this. Very much appreciated.

I will try to give it a test later today. The setup where I saw the issue was Windows 10 with Chrome 79, tested on a computer with 16 GB of RAM and a Ryzen 2600k CPU.

@bikeath1337
Collaborator

All, I wrote the second rewrite and went through the same convulsions you are going through regarding memory usage. The fact of the matter is that EvaporateJS relies on the browser's implementation for the heavy lifting. When we looked at potential memory leaks about 2-3 years ago, I concluded that the EvaporateJS code was correctly releasing memory and that digging into memory issues for each browser was a fool's errand. I am not claiming that much can be improved, but in the end, to make the multi-part transfers work, we have to ask the browser to allocate a bunch of memory and trust that when we release it, the browser does its part. I concluded that some browser implementations hadn't catered to the demands of multi-part uploads and were perhaps a little lax in their own testing strategies.

And thank you all for your support of the project.

@mattmoreira
Contributor

Given the outlined scenario, I suggest that we close this issue, as we're unable to advance any further.

To avoid this kind of situation in the future, we could perhaps require that memory-related issues come with a performance profile and a memory snapshot attached.

This way, even if we're unable to reproduce the proper failure, we'll be able to see where it is happening.

@dtelad11

As an EvaporateJS customer, I want to thank everyone on this thread for the substantial amount of work invested. I did not expect the rabbit hole to go so deep. I wholeheartedly agree that the package cannot be held responsible for browser-specific memory issues.

Personally, I am still seeing memory leaks with Chrome. However, we have informed our customers that this issue lies with Google and that they should just use Firefox for large files. This is enough of a solution for us.

Once again, my deepest gratitude for your investigation!
