(3.8.0 - 3.9.3) ParallelCluster Build Image Failing during Installation of Minitar Ruby Gem Dependency #6405

Open
himani2411 opened this issue Aug 19, 2024 · 3 comments

Comments

@himani2411
Contributor

himani2411 commented Aug 19, 2024

The issue

During pcluster build-image, ParallelCluster installs Berkshelf gems for dependency management. Minitar is a dependency of Berkshelf. On Tuesday, August 6, Minitar published two releases: v1.0.0, and v0.12, the last version of minitar released together with the deprecated archive-tar-minitar v0.12.

The latest minitar, v1.0.0, removed the Archive::Tar namespace and the archive/tar path, both of which have been deprecated since 2017, and moved the executable into a separate minitar-cli package. These changes in minitar v1.0.0+ cause the Berkshelf installation to fail, and hence the ParallelCluster image build fails.

Reference to Berkshelf Installation Failure chef/berkshelf#26
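
To make the failure mode concrete, here is a minimal sketch that probes whether a require path can be loaded by the current Ruby interpreter. The helper name `loadable?` is hypothetical. The path `archive/tar/minitar` is the one from the stack trace in this issue; the plain `minitar` path is an assumption about the newer layout and is not confirmed by this issue.

```ruby
# Hypothetical helper: reports whether a require path can be loaded
# by the Ruby interpreter running this script.
def loadable?(path)
  require path
  true
rescue LoadError
  false
end

# "archive/tar/minitar" is the legacy path that Berkshelf 8.0.7 requires
# (see the stack trace in this issue). "minitar" is an assumed newer path.
%w[archive/tar/minitar minitar].each do |path|
  puts "#{path}: #{loadable?(path) ? 'loadable' : 'missing'}"
end
```

If the legacy path is reported as missing while a minitar gem is installed, that would be consistent with the v1.0.0 change described above.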

Error details

ParallelCluster fails to build images. The error surfaced by ParallelCluster Build Image is:

pcluster describe-image --image-id $IMAGE_ID | jq ".imagebuilderImageStatusReason"

Error occurred during operation 'Workflow Execution ID: 'wf-xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxxxx' failed with reason: Document arn:<PARTITION>:imagebuilder:<REGION>:<ACCOUNT_ID>:component/parallelclusterimage-*/<parallelcluster-version>/1 failed!'

Below is the failure snippet from the ParallelCluster build image logs, which you can capture using pcluster export-image-logs:

 LANG=en_US.UTF-8 sudo /opt/cinc/embedded/bin/berks vendor /etc/chef/cookbooks --delete || (echo 'Vendoring cookbook failed.' && exit 1)
done;
<internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- archive/tar/minitar (LoadError)
 from <internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/packager.rb:1:in `<top (required)>'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/berksfile.rb:1:in `require_relative'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/berksfile.rb:1:in `<top (required)>'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf.rb:222:in `require_relative'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf.rb:222:in `<top (required)>'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/cli.rb:1:in `require_relative'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/cli.rb:1:in `<top (required)>'
 from <internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
 from <internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
 from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/bin/berks:3:in `<top (required)>'
 from /opt/cinc/embedded/bin/berks:25:in `load'
 from /opt/cinc/embedded/bin/berks:25:in `<main>'
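
As a rough way to confirm that an exported build log shows this specific failure, the log text can be scanned for the LoadError line. This is only a sketch: the helper name `minitar_failure?` is hypothetical, and reading the exported log file into a string is left to the reader.

```ruby
# Signature of the failure shown in the log excerpt in this issue.
MINITAR_LOAD_ERROR = %r{cannot load such file -- archive/tar/minitar \(LoadError\)}

# Hypothetical helper: true if any line of the given log text
# contains the minitar LoadError.
def minitar_failure?(log_text)
  log_text.each_line.any? { |line| line.match?(MINITAR_LOAD_ERROR) }
end

# Sample line copied from the log excerpt in this issue.
sample = <<~'LOG'
  <internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- archive/tar/minitar (LoadError)
LOG

puts minitar_failure?(sample)
```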

Affected versions

This issue impacts ParallelCluster versions 3.8.0 through 3.9.3, across all OSes and schedulers.
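
For scripting purposes, the affected range stated above can be checked with RubyGems' version comparison. This is an illustrative sketch; the constant and helper names are hypothetical, and the range simply restates the one given in this issue.

```ruby
require "rubygems" # provides Gem::Version

# Affected ParallelCluster versions as stated in this issue: 3.8.0 to 3.9.3.
AFFECTED_RANGE = (Gem::Version.new("3.8.0")..Gem::Version.new("3.9.3"))

# Hypothetical helper: true if the given ParallelCluster version string
# falls inside the affected range.
def affected?(pcluster_version)
  AFFECTED_RANGE.cover?(Gem::Version.new(pcluster_version))
end

puts affected?("3.9.2")
puts affected?("3.10.0")
```

Gem::Version compares segment by segment, so 3.10.0 sorts after 3.9.3, which a plain string comparison would get wrong.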

Mitigation

A detailed explanation of the problem and its mitigation is available in the project wiki.

@judouk

judouk commented Aug 21, 2024

I also got bitten by this yesterday.
I have a requirement to deploy CentOS 7 images, so using the latest ParallelCluster is not possible.

I followed the wiki page and was unable to complete the workarounds listed in Option 2.

First of all,

/opt/cinc/embedded/bin/gem install --local --no-document berkshelf:{{ BerkshelfVersion }}

is not correct. In all versions, the line is

/opt/cinc/embedded/bin/gem install --no-document berkshelf:{{ BerkshelfVersion }}

When building the image, I get an error

The 'minitar' executable is no longer bundled with 'minitar'. If you are expecting this executable, make sure you also install 'minitar-cli'

At this point, the image build fails

I tried modifying the lines to be

/opt/cinc/embedded/bin/gem install --no-document minitar-cli:0.12
/opt/cinc/embedded/bin/gem install --no-document minitar:0.12
/opt/cinc/embedded/bin/gem install --local --no-document berkshelf:{{ BerkshelfVersion }}

and a number of different combinations of the above, but to no avail; I still get the same error.

How did you manage to get this to build?
(I'm trying with v3.9.3)
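
When experimenting with pinned versions like the ones above, it may help to list which versions of the relevant gems actually ended up installed. The following is a diagnostic sketch only, with a hypothetical helper name; it inspects the gems visible to whichever Ruby runs it, so on a build instance it would presumably need to be run with the Ruby under /opt/cinc/embedded (an assumption, based on the paths in the logs).

```ruby
require "rubygems" # provides Gem::Specification

# Hypothetical helper: installed versions of a gem, oldest first,
# as seen by the Ruby interpreter running this script.
def installed_versions(gem_name)
  Gem::Specification.find_all_by_name(gem_name).map(&:version).sort.map(&:to_s)
end

# Gem names mentioned in this issue.
%w[minitar minitar-cli archive-tar-minitar berkshelf].each do |name|
  versions = installed_versions(name)
  puts "#{name}: #{versions.empty? ? 'not installed' : versions.join(', ')}"
end
```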

@alfred-stokespace

I just tried to build a new AMI recently (I'm on PCluster 3.8) and I hit the same error.

<internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- archive/tar/minitar (LoadError)
	from <internal:/opt/cinc/embedded/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/packager.rb:1:in `<top (required)>'
	from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/berksfile.rb:1:in `require_relative'
	from /opt/cinc/embedded/lib/ruby/gems/3.1.0/gems/berkshelf-8.0.7/lib/berkshelf/berksfile.rb:1:in `<top (required)>'

I took a look at the suggested mitigation options.

Option 1 (Upgrade). I'd prefer not to upgrade to 3.10: I have several clusters managed by 3.8, and I don't feel comfortable jeopardizing my otherwise working installation of PCluster3 (3.8). I typically have my users create clusters for discrete purposes rather than have a single cluster that everyone uses, so keeping PCluster UI working is desirable.

Option 2 (hack the yaml). I don't understand how this option relates to my installation. When I installed PCluster3, I opted for the PCluster UI via CloudFormation Template (https://parallelcluster-ui-release-artifacts-us-east-1.s3.us-east-1.amazonaws.com/parallelcluster-ui.yaml). So, I have no idea what relationship the CFT has to the parallelcluster.yaml, and since I didn't install with pip, the installation steps described don't make sense.

Help?

I'm presently seeing if I can hack the image build steps to patch the file in a manner similar to the mitigation step.
Or maybe I install a new PCluster3 UI of the 3.10 variety?

@salemgolemugoo
Contributor

Having the same issue on 3.9.2 + rhel8
