Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] vmpublish.rb: publish public images #154

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -2,3 +2,4 @@
/productization
kickstarts/generated/*.ks
scripts/Gemfile.dev.rb
config/creds/*.json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just ignore the entire directory instead of only json files?

7 changes: 6 additions & 1 deletion bin/release-build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -12,5 +12,10 @@ if [[ $# != 1 ]]; then
fi

log_file="/build/logs/${1}.log"
nohup time ruby ${BUILD_DIR}/scripts/vmbuild.rb --type release --upload --reference $1 --copy-dir ${1%%-*} > $log_file 2>&1 &
copy_dir="${1%%-*}"

nohup sh -c \
"echo time ruby ${BUILD_DIR}/scripts/vmbuild.rb --type release --upload --reference $1 --copy-dir $copy_dir \
&& time ruby ${BUILD_DIR}/scripts/vmpublish.rb --copy-dir $copy_dir" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is publishing a step in the build process? It feels like it is. That's why it feels odd to me that we need to do this multiline shell command. Can we instead change vmbuild.rb to accept a --publish option with any other details we need?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.. just like how --upload is done.

Copy link
Contributor Author

@geertj geertj Sep 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to a separate segment on purpose. Happy to reconsider that but let me explain my reasoning here first to see if it make sense.

These were my thoughts:

  1. There can be multiple locations to which you upload a single image. Once we support AWS we will get into this situation, where we have to upload the same image to multiple regions. We'd have to have a nested loop in the build loop and I though that would be more complicated and make things really slow.
  2. It allows some sanity tests before publishing. I In my view "publish" is a more impactful operation than "upload". The upload puts stuff on our file server from which people can download it. But "publish" will immediate make the image the live default choice for users of those services. For example, if you use "vagrant init manageiq/darga" that would immediately take the latest version by default. Same for the cloud launcher on GCE (we are working on this now). So doing some quick sanity testing on the build before a publish would not be a bad thing. (Note that this means we shouldn't call the publish script unconditionally from the "release-build.sh" script. That was an oversight.)
  3. Separation makes things more reliable. It allows the build phase to fail, without having some publish locations updated and others not. And it also allows the publish phase to be idempotent. If the publish fails, simply re-run the command. It won't build anything, but re-upload the most recently built set of images. The way the script works, it will overwrite anything with the specified version that is already there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Don't we have the same problem even with the current approach? It seems each target has a hard-coded path to a .json file and we publish once per target.
  2. As you said, that's a valid point only if publish isn't a part of the release-build.sh!
  3. I think this can go either way... It might make sense that we can publish some not all. But since we're trying to publish 'release' builds only, if a release build failed for some, we might want to not publish any until we know what the problem is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes, although in vmpublish.rb I could loop over the publish destination and not the image type.
  2. I propose to remove it from there :)
  3. Agreed that we don't want to publish if parts of the build is broken. I was referring to the fact that the easiest way to do the upload is to add it in the inner build loop in vmbuild.rb before we rename the image. That would mean we interleave things: build X - upload X - build Y, upload Y. We could put the publish loop of course completely at the end of vmbuild.rb.

The strongest argument for separating the build and publish phases is 2 - so that you can do some sanity testing before you publish something to a potentially large number of users. What do you think about that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see it both ways. My main concern is that there is some complexity because the build code isn't handing off to the publish code so we have to find the right images, and do some things that the build code would just know.

So, how about this... can we make the publish code separate but still offer the option to run it for all successful builds? I think that's what this shell code change here is saying. I'd rather the build code call publish code based on the results of each build. We can add publish CLI code later once the interface for publish is done.

> $log_file 2>&1 &
echo "${1} release build kicked off, see log @ $log_file ..."
21 changes: 21 additions & 0 deletions config/creds/README.creds
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
This directory will hold credentials files on the build host.

All JSON files in this directory are ignored via gitignore(5) to reduce the
risk that credentials accentally get commit.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get committed


The following files are currently used:

* gce.json

A service account JSON file for the Google Cloud Platform. This file can be
generated as describe here:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

described

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call it gce.json but say Google Cloud Platform. Which one should it be for consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImageFactory calls it GCE so probably that's the name to use.

https://cloud.google.com/storage/docs/authentication#generating-a-private-key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something we need to install on the build environment to talk to google here? Maybe this needs to be added to the base README.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Google API client is needed for GCE, which is a transitive dependency of the new ImageFactory and documented here:

https://github.com/redhat-imaging/imagefactory/blob/master/imagefactory_plugins/GCE/README.gce

I think our README can be cleaned up. Some of the dependencies that are mentioned are not needed anymore in recent imagefactory. I got some PRs accepted that turn some hard dependencies soft dependencies (which we don't need/use).

What if we agree that I'll simplify the README in a follow up PR (shortly, ~few weeks), would that be OK?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with a very short addition to the README indicating these new features require ... that we can later cleanup and add more details.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need these details to update our build vms anyway, but yes, that can wait as long as we have the basics.


* atlas.json

A JSON file with an Atlas token, as follows:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, what do we need to install and can we update the README with that information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above this is all done in imagefactory. I'll get the README updates shortly.


{
"username": "<username>",
"token": "<token"
}
15 changes: 15 additions & 0 deletions scripts/config.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
require 'pathname'

BUILD_BASE = Pathname.new("/build")
GPG_DIR = Pathname.new("/root/.gnupg")
CFG_DIR = BUILD_BASE.join("config")
CREDS_DIR = CFG_DIR.join("creds")
FILESHARE_DIR = BUILD_BASE.join("fileshare")
REFS_DIR = BUILD_BASE.join("references")
IMGFAC_DIR = BUILD_BASE.join("imagefactory")
IMGFAC_CONF = CFG_DIR.join("imagefactory.conf")
STORAGE_DIR = BUILD_BASE.join("storage")

FILE_SERVER = ENV["BUILD_FILE_SERVER"] # SSH Server to host files
FILE_SERVER_ACCOUNT = ENV["BUILD_FILE_SERVER_ACCOUNT"] # Account to SSH as
FILE_SERVER_BASE = Pathname.new(ENV["BUILD_FILE_SERVER_BASE"] || ".") # Subdirectory of Account where to store builds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we need to put these constants within the Build namespace. Meaning, they should reside in module Build. We can do this later but we really should have them in that namespace and all code that uses it, if it's in that namespace, can access it without needing to specify the namespace, meaning we can use BUILD_BASE within the namespace instead of Build::BUILD_BASE. When we put it in the Build namespace, it needs to live in a build directory, so scripts/build/config.rb...

Because it's relevant here but not required... I created #158 for the following comment:

Future PR: Note, we really need to move files to the right places and re-namespace things...

vmbuild.rb             -> build.rb
cli.rb                 -> build/cli.rb
git_checkout.rb        -> build/git_checkout.rb
kickstart_generator.rb -> build/kickstart_generator.rb
productization.rb      -> build/productization.rb
target.rb              -> build/target.rb
productization.rb      -> build/productization.rb
uploader.rb            -> build/uploader.rb

Also, rename scripts to lib.

19 changes: 3 additions & 16 deletions scripts/vmbuild.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
require_relative 'cli'
require_relative 'uploader'
require_relative 'target'
require_relative 'config'

$log = Logger.new(STDOUT)

Expand All @@ -30,19 +31,6 @@
build_label = "#{cli_options[:type]}-#{cli_options[:manageiq_ref]}"
end

BUILD_BASE = Pathname.new("/build")
GPG_DIR = Pathname.new("/root/.gnupg")
CFG_DIR = BUILD_BASE.join("config")
FILESHARE_DIR = BUILD_BASE.join("fileshare")
REFS_DIR = BUILD_BASE.join("references")
IMGFAC_DIR = BUILD_BASE.join("imagefactory")
IMGFAC_CONF = CFG_DIR.join("imagefactory.conf")
STORAGE_DIR = BUILD_BASE.join("storage")

FILE_SERVER = ENV["BUILD_FILE_SERVER"] # SSH Server to host files
FILE_SERVER_ACCOUNT = ENV["BUILD_FILE_SERVER_ACCOUNT"] # Account to SSH as
FILE_SERVER_BASE = Pathname.new(ENV["BUILD_FILE_SERVER_BASE"] || ".") # Subdirectory of Account where to store builds

if !cli_options[:local] && cli_options[:build_url]
build_repo = cli_options[:build_url]
cfg_base = REFS_DIR.join(cli_options[:build_ref])
Expand Down Expand Up @@ -101,8 +89,7 @@ def verify_run(output)
file_rdu_dir_base = FILE_SERVER_BASE.join(directory)
file_rdu_dir = file_rdu_dir_base.join(directory_name)

fileshare_dir = BUILD_BASE.join("fileshare")
stream_directory = fileshare_dir.join(directory)
stream_directory = FILESHARE_DIR.join(directory)
destination_directory = stream_directory.join(build_label == "test" ? "test" : directory_name)

$log.info "Creating Fileshare Directory: #{destination_directory}"
Expand Down Expand Up @@ -152,7 +139,7 @@ def verify_run(output)
FileUtils.mkdir_p(destination_directory)
file_name = "#{name}-#{target}-#{build_label}-#{timestamp}-#{manageiq_checkout.commit_sha}.#{target.file_extension}"
destination = destination_directory.join(file_name)
$log.info `mv #{source} #{destination}`
$log.info `ln #{source} #{destination}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this change for?

Copy link
Contributor Author

@geertj geertj Sep 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need the imagefactory image to continue to exist in /build/storage, so that I can imagefactory to publish the image in a separate step (becomes moot if we collapse the steps). I'm using a hard link instead of a copy so that I can more efficiently find back the image UUID from the image in /build/fileshare.

Thought: I could use a soft link, that may even be more efficient actually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Ok.


if !File.exist?(destination)
$log.warn "Cannot find the target file #{destination}"
Expand Down
80 changes: 80 additions & 0 deletions scripts/vmpublish.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
require 'logger'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name this publish.rb? We really need to rename vmbuild.rb to build.rb since that file adds the Build module.

require 'pathname'
require 'trollop'
require 'json'

require_relative 'config'


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra whitespace here indicates in my mind that the methods below belong in a namespace, such as Build if publishing is a step in the build process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this separate, that's fine, but this should be in a Publish module.

def find_imgfac_image(image)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this method if the build process "hands" over the image to the publishing step directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed if we publish in the inner build loop. See the comments on why I separated the steps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The files in /build/storage get cleaned up daily (if you follow our instruction to setup build machine). If the steps are separate and publish can happen later, *.body files might be gone.

f = File.new(image)
Dir.chdir(STORAGE_DIR) do
Dir.glob('*.body') do |filename|
return filename.split('.').first if File.new(filename).stat.ino == f.stat.ino
end
end
end


def upload_imgfac_image(uuid, target, reference, version)
if target == 'vagrant'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional might we need to be refactored in the future. It's fine for now though.

creds_file = CREDS_DIR.join('atlas.json')
params = "atlas @ignored #{creds_file} --id #{uuid}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, we'll need instructions on installing atlas in the README, same for gce below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

params += " --parameter atlas_box_name #{reference} --parameter atlas_box_version #{version}"
elsif target == 'gce'
creds_file = CREDS_DIR.join('gce.json')
params = "gce @manageiq #{creds_file} --id #{uuid}"
params += " --parameter gce_object_name manageiq-#{reference}-#{version}.tar.gz"
params += " --parameter gce_image_name manageiq-#{reference}-#{version} --parameter gce_image_family centos-7"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to use AwesomeSpawn to build these comandline.

See run and run! here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, I don't think we use AwesomeSpawn yet in this repo so maybe we shouldn't. cc @carbonin @simaishi thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not gain a whole lot by using AwesomeSpawn here. Because of how the parameters are listed, we will likely just have something like :params => %w(all the params here) which doesn't do a lot for readability. Also we don't seem to be doing much escaping here (the other reason to use it) so I feel like we could keep it as is.

end

if !File.exist?(creds_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check happen earlier, before atlas and gce commands? Otherwise, if the file doesn't exist, we'll try to run those commands with a missing file.

$log.info "credentials file #{creds_file} does not exist, skipping #{target} publish"
return
end

Dir.chdir(IMGFAC_DIR) do
# Show progress on STDOUT
command = "./imagefactory provider_image #{params}"
$log.info "running command: #{command}"
if !system(command)
$log.error "imagefactory exited with status #{$?.exitstatus}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, we should use AwesomeSpawn run, and look at the standard out and standard error for logging. Here, we don't log any errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

system runs the command with stdout/err connected to the terminal. I chose that on purpose because imagefactory provides progress information on stdout (these uploads can take a while) and I wanted to know who things were going during development. Also stderr would show on the terminal (or in nohup.out). And on a nonzero exit status, I exit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might ok because I don't know that this feature should introduce AwesomeSpawn as a dependency.

exit 1
end
end
end


$log = Logger.new(STDOUT)

dir_desc = "Directory builds were copied to"
type_desc = "Build type (stable or latest)"

options = Trollop.options(ARGV) do
banner "Usage: vmpublish.rb [options]"
opt :copy_dir, dir_desc, :type => :string, :short => 'd', :default => "master"
opt :type, type_desc, :type => :string, :short => 't', :default => "stable"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above, does publish belong as a step in the build process, if so, we should move the ARGV options there.

end

copy_dir = FILESHARE_DIR.join(options[:copy_dir], options[:type])

Dir.foreach(copy_dir) do |filename|
next if ['.', '..'].include? filename

parts = filename.split('.').first.split('-')
next unless parts.length >= 6 # only release builds, which are build from a tag (name-version)
next unless parts[0] == 'manageiq'
target = parts[1]
next unless ['vagrant', 'gce'].include? target
reference = parts[2]
version = parts[3..parts.length-3].join('-')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It parses back the image file name "manageiq-{ref}-{data}-{commit}" back to its components, so that I can use these are parameters to the upload.


image = copy_dir.join(filename)
uuid = find_imgfac_image(image)

$log.info "processing image: #{image}"
$log.info "target = #{target}, reference = #{reference}, version = #{version}"
$log.info "imagefactory image = #{uuid}"

upload_imgfac_image(uuid, target, reference, version)
end