
Some files fail to unzip regardless of approach #245

Open

pgbce opened this issue Sep 27, 2021 · 1 comment

pgbce commented Sep 27, 2021

Hello @ZJONSSON, I'm definitely looking for some support with this library. I've scoured all the issues and have seen all sorts of variations in approach for unzipping files from S3.

The input is a single zipped file that contains many individual files and folders.

- I do want to mention that this is not running within a Lambda.
- I've tested with multiple zipped files; any zip under 3 GB seems to unzip without issues.
- With zips of 90 GB and above, some small files fail while other large files are extracted perfectly fine.

So far, I've worked with approach 1

s3.getObject({ Bucket: myBucket, Key: myKey })
  .createReadStream()
  .pipe(unzipper.Parse({ forceStream: true, verbose: true }))

and with approach 2

await unzipper.Open.s3(s3, { Bucket: myBucket, Key: myKey })
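
As I understand the README, Open.s3 reads just the central directory up front (via ranged requests) and only streams each entry's bytes on demand, which is why listing works even on the huge archives:

// A minimal sketch, same client/bucket/key as the full script below:
const directory = await unzipper.Open.s3(s3, { Bucket: myBucket, Key: myKey })
console.log('entries: ', directory.files.length)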

There seem to be two issues that I face whichever way I go.

  1. I receive Z_BUF_ERROR with approach 1.
  2. Catch error: Error: invalid stored block lengths with approach 2:

     at Zlib.zlibOnError [as onerror] (node:zlib:190:17)
       errno: -3,
       code: 'Z_DATA_ERROR'
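
To localize which entries trip the zlib errors, I've been probing them one at a time with approach 2. A sketch (file.buffer() is from the unzipper README; buffering in memory is only sensible here because it's the small files that fail):

// Probe each entry individually so one bad entry doesn't abort the whole run.
const directory = await unzipper.Open.s3(s3, { Bucket: myBucket, Key: myKey })
for (const file of directory.files) {
  if (file.type !== 'File') continue
  try {
    await file.buffer() // fully decompresses this one entry
    console.log('OK: ', file.path, file.uncompressedSize)
  } catch (err) {
    console.error('FAILED: ', file.path, err.code || err.message)
  }
}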

Be warned, my JS is a little rusty. Here's the full script:

async function runner() {
  try {
    // Destination prefix: the object key minus its extension, plus '/'
    let filePath = myKey.split('.').slice(0, -1).join('.')
    filePath += '/'
    console.log('Working with: ', myKey)
    console.log('File Destination: ', filePath)

    // const stream = s3
    //   .getObject({
    //     Bucket: myBucket,
    //     Key: myKey,
    //   })
    //   .createReadStream()
    //   .pipe(unzipper.Parse({ forceStream: true, verbose: true }))
    let s3Stream = await unzipper.Open.s3(s3, {
      Bucket: myBucket,
      Key: myKey,
    }).catch((error) => {
      console.log('catching:: ', error)
      // Re-throw so s3Stream isn't left undefined below
      throw error
    })

    console.log('counter:: ', s3Stream.files.length)

    for await (let file of s3Stream.files) {
      let fileName = file.path
      let type = file.type
      let fileSize = file.uncompressedSize

      console.log('FileName: ', fileName)
      console.log('FileType: ', type)
      console.log('FileSize: ', fileSize)

      // Skip macOS metadata entries (e.g. the __MACOSX folder)
      let match = fileName.match('MACOS')
      let res = match !== null ? match.length : null
      console.log('Match: ', res)

      if (type === 'File' && res != 1) {
        let smallerFileSizeCalc = 100 * 1024 * 1024 //100MB

        let params = {
          Bucket: myBucket,
          Key: filePath + fileName,
          Body: file.stream(),
        }

        params.ContentLength = fileSize
        
        if (fileSize != 0) {
          if (fileSize <= smallerFileSizeCalc) {
            // I've utilized s3.upload as well.
            // Note: these putObject calls are not awaited, so several
            // entry streams can be uploading at once.
            s3.putObject(params)
              .promise()
              .then((res) => {
                console.log('putObject res: ', res)
              })
              .catch((err) => {
                console.log('putObject err: ', err)
              })
          } else {
            const partSize = 5 * 1024 * 1024 * 1024 // 5 GB, the S3 maximum part size

            console.log('About to upload')
            console.log('fileName: ', fileName)
            console.log('partSize: ', partSize)

            const upload = new aws.S3.ManagedUpload({
              partSize: partSize, // must be a number, not a string
              // queueSize: 1,
              leavePartsOnError: true,
              params: params,
              service: s3,
            })

            upload.on('httpUploadProgress', function (progress) {
              console.log(progress)
            })

            upload.send(function (err, data) {
              if (err) {
                console.log('Sending err: ')
                console.error(err)
              }
              console.log('Data after upload: ', data)
            })
          }
        }
      }
    }
  } catch (error) {
    console.log('Catch error: ', error)
  }
}

runner()
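
One variant I've also been testing, in case the overlapping uploads above are part of the problem: awaiting each upload before opening the next entry stream. A sketch (s3Stream and filePath as in the script above; s3.upload handles multipart on its own):

// Sequential uploads: only one entry stream open at a time.
for (const file of s3Stream.files) {
  if (file.type !== 'File') continue
  await s3
    .upload({ Bucket: myBucket, Key: filePath + file.path, Body: file.stream() })
    .promise()
    .then((res) => console.log('uploaded: ', res.Key))
    .catch((err) => console.error('upload failed for: ', file.path, err))
}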
pgbce commented Feb 7, 2022

@ZJONSSON - any chance of providing some insight?
Still experiencing issues.
