- Masterclass Type: Core
- Maximum Expected Time To Complete: 2 hours
- Table of Contents
- Introduction
- Hardware and Software Requirements
- Exercises
- 1. Introduction to Docker
- 2. Installing Docker
- 3. The
docker
Command Line Interface - 4. Introduction to Dockerfiles
- 5. Building a Docker Image
- 6. Docker Containers
- 7. Tags and Caching in Docker Builds
- 8. Running Containers in Detached, Command and Interactive Modes
- 9. Bind Mounts and Volumes
- 10. Container Privileges and Permissions
- 11. Docker Registries and Repositories
- Conclusion
Docker is a set of services that use OS level virtualization to run self-contained instances of software. These instances are known as containers, and include everything that the software running within them needs to function apart from the Linux kernel.
This Masterclass is an aside to the rest of the balena Masterclasses, in that it focuses specifically on Docker, from the basics to some 'deep dives' into how containers (and other components) function and are constructed.
It is anticipated that after reading the basic sections of this Masterclass, the reading of the Services Masterclass will allow anybody to develop balena applications and services.
It is assumed that the reader has access to the following:
- A locally cloned copy of this repository,
Docker Masterclass.
Either:
git clone https://github.com/balena-io/docker-masterclass.git
- Download ZIP file (from 'Clone or download'->'Download ZIP') and then unzip it to a suitable directory
- A suitable text editor for developing code on your development platform (eg. Visual Code)
- A suitable shell environment for command execution (such as
bash
)
All of the following exercises assume that you are running the Docker CLI from
a suitable Unix based shell. The exercises include commands which can be run
in such a shell, and are represented by a line prefixed with $
. Information
returned from execution of a command may be appended under the line to show
what might be returned. For example:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
Docker allows the execution of software in self-enclosed packages called
'containers'. Containers are essentially small execution environments, and
can range from being tiny environments where only a few binaries are included,
to fully blown Linux distributions such as Debian or Ubuntu where many packages
can be included for the container environment to use (eg. via the use of
apt-get
in the case of Debian or Ubuntu).
As such, Docker features some very useful benefits:
- The ability to run applications/binaries in a sandboxed execution environment where malicious code will not have access to the host OS.
- Easy and rapid deployment to a machine that Docker is installed on, with the assurance that the containers will run in the same way as on any other system with Docker installed.
- Rapid development due to the quick building and recreation of containers, as opposed to having to reinstall an OS and dependencies on a bare-metal or Viritualised system.
Containerization is an alternative form of providing host OS separation to that of an execution environment, where other forms include Virtualization, where machine resources (such as CPU cores, memory and disk space) are partitioned off from the host OS and used to run a guest OS instead. Unlike Virtualization, Containerization uses host OS resources as it needs them at runtime. As such, there is no need to pre-allocate resources.
Docker runs as a service, how this occurs depends on the host OS:
- Linux - Runs as an OS daemon service (such as a
systemd
unit file), which is then sent commands via thedocker
anddocker-compose
clients. - macOS - A Xen Hypervisor Virtual Machine runs the Docker service, and a UI and CLI client connects to it.
- Windows - A Virtual Machine (called MobyVM) is run, with the UI and CLI client connecting to it, or via WSL.
At this point, it's worth introducing some terminology, so that we can easily refer to concepts in the future. It's worth noting that some of the concepts described in the terminology will be described later in the exercises.
- Layer - A layer is part of a Docker Image or Container. It represents a file system layout, comprised of both directories and files. Layers are created during the building of a Docker image, or whilst a container is currently running.
- Image - A Docker image is a self-contained object representing a file system,
comprised of Docker layers. The number of layers in an image vary from a
minimum of 1. Images are created from Dockerfile 'recipes', which contain a
number of directives that allow the image to be created by carrying out
operations such as copying files into the image, executing commands to
configure the image (such as
apt-get
from a Debian base image, or running bash scripts to generate configuration files) and determining which binary will be executed when the image is run as a container. - Container - A Docker container is a running instance of an image. When an image is selected to be run, a new container is created that uses the layers from the Docker image as the base filesystem (the image acts as a 'template' for the container and the container never alters the image), and then runs any command that has been given. Should the container create any new files, or update any files that existed in the Docker image that is not located in a bound volume, a new layer that is owned specifically by that instance of the container is created. Container layers persist for the duration of the container's life, but are removed on container removal.
- Volume - A volume is persistent storage that exists as a directory structure
on the host OS, which can be bound to a container on startup. This can work
in several different ways, such as binding a specific directory or file from
the host OS into a specific location in the container (for example binding
/etc/hosts
from the host OS to/etc/hosts
in the container) or binding a named volume (a Docker managed directory on the host OS) into a specific location in the container (for example using a volume calledshared-volume
into/usr/share
in the container). - Network - A Docker network is functionality to connect running containers to each other and/or other network interfaces running on the host OS. These can vary from simply using the host OS's networking interfaces, to user-defined networking bridges which are only used by particular containers. All containers that require incoming/outgoing network traffic require a Docker network of some description.
- Repository - A Docker repository is storage for Docker images. When a Docker image is built, it is stored in the local Docker repository. However, usually you'll want to store the image in a location where other users can retrieve it. Docker repositories can be private (so that only users with login credentials can retrieve images from them), public (so that any user may retrieve an image from them) or a mixture of both. The Docker CLI includes commands that allow you to pull/push images from/to repositories.
How you install Docker onto your local host OS is dependent on the OS itself and the distribution that you're running. Docker has instructions available that allow the user to easily carry this out here.
An important note here. When installing Docker under Linux, the Docker daemon
(usually, unless changed) runs as root
(under Windows and macOS, this is not
an issue). As such, you'll need to follow the instructions to add any user that
also wants to run the docker
command to the docker
group. If you don't do
this, only the root
user will be able to use Docker under Linux. For
convenience, the command to do this is:
$ sudo usermod -aG docker <your-user>
Where <your-user>
is the username for the user to run as non-root
.
However ensure that this is the latest command under the Docker installation
instructions before doing so. You'll need to log out of the shell and then
back in for the changes to take effect.
Under macOS and Windows this is slightly different, and the installing user will be asked to enter their credentials so that the Docker Hypervisor/VM has permission to run in the background.
Once you've installed Docker, you should be able to open a new shell terminal and verify that Docker is running correctly by running:
$ docker version
Client: Docker Engine - Community
Version: 19.03.5
API version: 1.40
Go version: go1.12.12
Git commit: 633a0ea
Built: Wed Nov 13 07:22:34 2019
OS/Arch: darwin/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.5
API version: 1.40 (minimum version 1.12)
Go version: go1.12.12
Git commit: 633a0ea
Built: Wed Nov 13 07:29:19 2019
OS/Arch: linux/amd64
Experimental: true
containerd:
Version: v1.2.10
GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339
runc:
Version: 1.0.0-rc8+dev
GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
docker-init:
Version: 0.18.0
GitCommit: fec3683
Whilst the Docker daemon runs continuously on your host OS, it can be controlled
via the docker
CLI command. This contains all the functionality to build
images, run containers, create volumes and networks, push and pull images to
remote repositories, etc.
We'll be using docker
for almost all of the exercises, so it's worth having
a quick look at all the available commands:
$ docker help
Usage: docker [OPTIONS] COMMAND
A self-sufficient runtime for containers
Options:
--config string Location of client config files (default "/Users/heds/.docker")
-c, --context string Name of the context to use to connect to the daemon (overrides DOCKER_HOST env var and default context set with "docker context use")
-D, --debug Enable debug mode
-H, --host list Daemon socket(s) to connect to
-l, --log-level string Set the logging level ("debug"|"info"|"warn"|"error"|"fatal") (default "info")
--tls Use TLS; implied by --tlsverify
--tlscacert string Trust certs signed only by this CA (default "/Users/heds/.docker/ca.pem")
--tlscert string Path to TLS certificate file (default "/Users/heds/.docker/cert.pem")
--tlskey string Path to TLS key file (default "/Users/heds/.docker/key.pem")
--tlsverify Use TLS and verify the remote
-v, --version Print version information and quit
Management Commands:
builder Manage builds
checkpoint Manage checkpoints
config Manage Docker configs
container Manage containers
context Manage contexts
image Manage images
network Manage networks
node Manage Swarm nodes
plugin Manage plugins
secret Manage Docker secrets
service Manage services
stack Manage Docker stacks
swarm Manage Swarm
system Manage Docker
trust Manage trust on Docker images
volume Manage volumes
Commands:
attach Attach local standard input, output, and error streams to a running container
build Build an image from a Dockerfile
commit Create a new image from a container's changes
cp Copy files/folders between a container and the local filesystem
create Create a new container
deploy Deploy a new stack or update an existing stack
diff Inspect changes to files or directories on a container's filesystem
events Get real time events from the server
exec Run a command in a running container
export Export a container's filesystem as a tar archive
history Show the history of an image
images List images
import Import the contents from a tarball to create a filesystem image
info Display system-wide information
inspect Return low-level information on Docker objects
kill Kill one or more running containers
load Load an image from a tar archive or STDIN
login Log in to a Docker registry
logout Log out from a Docker registry
logs Fetch the logs of a container
pause Pause all processes within one or more containers
port List port mappings or a specific mapping for the container
ps List containers
pull Pull an image or a repository from a registry
push Push an image or a repository to a registry
rename Rename a container
restart Restart one or more containers
rm Remove one or more containers
rmi Remove one or more images
run Run a command in a new container
save Save one or more images to a tar archive (streamed to STDOUT by default)
search Search the Docker Hub for images
start Start one or more stopped containers
stats Display a live stream of container(s) resource usage statistics
stop Stop one or more running containers
tag Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
top Display the running processes of a container
unpause Unpause all processes within one or more containers
update Update configuration of one or more containers
version Show the Docker version information
wait Block until one or more containers stop, then print their exit codes
Run 'docker COMMAND --help' for more information on a command.
The basis of any Docker image is the Dockerfile. This is a list of instructions that form a recipe for an image. When an image is built, Docker steps through the instructions in the Dockerfile and carries out the directive at each step. Directives are keywords which (usually) form the start of a new line in the Dockerfile.
As mentioned, Dockerfiles are laid out as a series of directives, which are executed in series. Here's a very basic example:
FROM debian:buster
COPY helloworld /usr/src/helloworld
CMD "/usr/src/helloworld"
This Dockerfile:
- Uses the latest Debian Buster base image (containing all the files in the Buster release) as the base for the image's file system.
- Copies a local binary (
helloworld
) from the same directory that the Dockerfile is in into the image's/usr/src
directory. - Sets the command that should be run when the image is run as a conatiner
(in this case, it will run the
/usr/src/helloworld
) binary.
We'll go through some of the most common Dockerfile directives, and what they do, followed by an exercise to build a very simple Dockerfile, but we'll introduce some more keywords later in the Masterclass when we've pinned down the basics.
4.1 FROM
Directive
The FROM
directive usually starts a Dockerfile. It informs Docker of the image
that should be inherited from as a basis from with which to run the rest of the
directives in the Dockerfile.
Because images are formed of a set of file system layers, an image can actually inherit another image as its base, which therefore gives it immediate access to the filesystem layout that exists in the base image.
Here's a small example. Consider an image, let's call it 'simple-base', which has the following directories and files in its layers:
.
├── directory-1
│ ├── directory-1-1
│ │ └── file2.txt
│ └── file1.txt
└── directory-2
└── file3.txt
We can use the FROM
directive at the start of a Dockerfile to inherit from
this base image:
# Will use simple-base as a base images, so all the files/directories from
# simple-base are now available to work on.
FROM simple-base
You'll note that comments can be used in Dockerfiles, by simply using #
at
the start of the line, very much like Linux/Unix shell operations.
Further directives in the file can carry out operations on the inherited filesystem, as well as creating new layers which include new or modified directories and files. Every time the filesystem is modified via the Dockerfile, a new layer is created. This usually occurs after each directive.
As well as inheriting from base images, it's also possible to start a completely
new Docker image which does not include any layers. This can be achieved by
using the special image name scratch
. This tells Docker it shouldn't try to
inherit from any other image, but start a brand new set of layers. In fact, this
is how a large number of 'standard' base images start, such as the Docker
versions of Linux distributions, by starting from scratch
and copying in all
of the files that build up that distribution of Linux.
4.2 COPY
Directive
Now we know how to use a base image to start with a filesystem in our own image, but how do we get new files into it?
This is where the COPY
directive comes in. This allows the copying of files
from the host's local filesystem into the Docker image being built. It's fairly
similar to the standard copying commands under most OS variants in
that it takes the location of a source file or directory on the host and a
destination where the contents of the source should be copied inside the image.
COPY
preserves all flags on the file when copied. For example, if you copy
a shell script that has the execution flag set this will be preserved when
copied into the image.
NOTE: The directory location of the Dockerfile being used to build the image is known as the build context. Any files in the build context, or any of its sub-directory hierarchy, is also part of the build context. This can be changed by other commands, but this is the default and we'll be using it for the moment.
The following will recursively copy the contents of the directory-3
directory in the current build context into the image at the location
/directory-3
:
COPY directory-3 /directory-3/
Note that you need to add the trailing /
to the destination directory. If
the destination directory doesn't already exist, it'll be dynamically created
(which means you can specify a nested destination directory without having to
run mkdir
in the image).
Wildcards also work, so that the following:
COPY * /all-the-files/
Will recursively copy everything (including the Dockerfile) in the current
build context to the /all-the-files/
directory in the built image.
It should be noted that the source of files are always relative to the build context. Therefore files must be available to that context (you can't, for example, reference files that are in the directory above where the Dockerfile is, because it's not within the same context).
COPY
has a few switches that include abilities to set ownership and
on files as they're copied. Refer to the
Dockerfile documentation
for more details.
If you want to use files that are outside of the build context there are a few tricks we can use, which we'll discuss in an advanced Docker Masterclass.
4.3 RUN
Directive
The RUN
directive runs commands whilst building a Docker image. This is
sometimes confused, when starting out with Docker, with the CMD
directive
which runs commands when the image is created as a container instance.
RUN
is used to carry out processing on your image to make it more useful.
Because the RUN
command has access to all of the filesystem layers currently
available at the point when the directive is executed inside the image, it's
used to do things like install useful libraries and packages. For example,
suppose we're using a Debian Buster base image, and we want to install the
Node.js package, we can use RUN
to execute the apt-get
binary to do this:
RUN apt-get update \
&& apt-get install -y nodejs
Another example might be running a command on some files in the image, to
create new files or modify them. The following will create a new file with
some initial text in and then print that file out, before then modifying that
file using sed
to change the text and printing out the next contents:
RUN echo 'Hello there!' > /my-file.txt \
&& cat /my-file.txt
RUN sed -i 's/Hello/Goodbye/' /my-file.txt \
&& cat /my-file.txt
As you can see, any binary that's part of the distribution is available, and you
can even use files you copy in yourself. Suppose we had a script called
list-files.sh
in the current build context that was simply:
#!/bin/bash
ls -l /
We could copy this into the Docker image during the build and then run it:
COPY list-files.sh /list-files.sh
RUN /list-files.sh > /files.txt \
&& cat /files.txt
This would copy the script into the image at location /list-files.sh
, and
then run that script to get a list of all the entries in the root directory to
a file and then outputing the contents to the build.
4.4 CMD
Directive
The CMD
directive is usually the last directive in a Dockerfile. This informs
Docker what should be run when a container instance is created from an image.
This tends to be a call to a single binary or script. Once the call has
finished, the container will exit. Because most Docker images are designed
around services or daemons (for example a web server or similar), it's unusual
for a CMD
to not start up a daemon or script that restarts should that service
exit. But this doesn't mean it has to be like that.
Taking the example from the RUN
directive explanation, the following will set
the command to run when the container is instantiated to list the entries in the
root of the file system:
CMD /list-files.sh
After the command has run, it will return and the container running it will exit. Parameters can be passed to it by simply adding a space between the executable and the parameters:
CMD echo "Hello there!"
We've actually used what is known as the 'shell' form of the CMD
directive
here. In this case, before the /list-files.sh
script is run, a shell is
created inside the container to run the command (it is essentially wrapped
inside a /bin/sh -c <command>
).
There's also an alternative form of the command, called the 'exec' form, which uses a JSON array to call an executable and pass parameters to it without a shell. We'd call our script in the same way in this form as such:
CMD ["/bin/sh", "-c", "/list-files.sh"]
Here, the command being executed is the shell itself (/bin/sh
) and the
parameters passed signify to run a command in it (-c
) and the command to
run (/list-files.sh
). Of course, if the command to execute was a binary,
we could just use CMD ["/path/to/binary", "param1", "param2"]
instead, and
in fact because list-files.sh
includes the environment to use in its header
(#!/bin/bash
) we could run the script without invoking bash
first.
At this point, most users would say "I think I'll use the shell form, it's much
quicker to write". But beware! Whilst the second form of CMD
does indeed
require you to use a JSON structure, it also ensures that signals
passed to the container are propagated to it. That means that when the
container is stopped using docker stop
, the SIGTERM
signal is correctly
passed to the script or executable that allows it to clean up correctly. Using
the 'shell' form will cause the process to be effectively be SIGKILL
ed.
As such, you should always use the 'exec' form unless there are very good reasons for not doing so.
As the CMD
directive informs Docker of what to run when a container is
started, there should only ever be one CMD
directive in a Dockerfile. If
more than one exist, only the last CMD
directive seen will be run. Sometimes
this can be useful, if you want to use a base image which usually runs as a
service, but wish to override the CMD
directive from that base image (perhaps
you want to run with different parameters, or use the base image because of the
filesystem layers it includes).
The CMD
directive can also be used in conjunction with the ENTRYPOINT
directive, which we'll describe in a more advanced Docker Masterclass.
Now that we know some of the basic Dockerfile directives, it's worth learning how to put some of these together to build a simple Docker image.
For convenience, export a variable to point to the root of this masterclass repository, as we'll use this for the rest of the exercises, e.g.:
$ export BALENA_DOCKER_MASTERCLASS=~/docker-masterclass
Now change directory to the simple-docker-app directory in the root of this masterclass repository, e.g.:
$ cd $BALENA_DOCKER_MASTERCLASS/simple-docker-app
Note that for the rest of these exercises, we're assuming that you're using an x64 based machine. Should you be using an alternative architecture, such as Arm, you'll need to change the architecture of the base image that you're using in the Dockerfiles.
We're going to use balenalib
based images in the remainder of the exercises in
this Masterclass, as they include useful images which are 'ready to go' with
various languages, libraries, etc.
Load a code editor of your choice, and edit the Dockerfile
file in the
simple-docker-app
directory. Add the following:
FROM balenalib/intel-nuc-debian-node:latest
COPY * /usr/src/app/
RUN cd /usr/src/app \
&& npm install
CMD ["/usr/local/bin/node", "/usr/src/app/index.js"]
Ensure you have a newline at the end of the Dockerfile. Let's quickly run through exactly what this Dockerfile does:
FROM balenalib/intel-nuc-debian-node:latest
This declares that the Docker image built should inherit from the
balenalib/intel-nuc-debian-node:latest
image, which in this case is the latest
Debian Node.js image built and tagged by balena. This image includes all
standard Debian packages, binaries and libraries, as well as a copy of the
Node.js language and runtime, and npm. This allows us to easily run the Node.js
source code located in index.js
.
COPY * /usr/src/app/
This copies all of the files in the simple-docker-app
directory (the current
Docker build context) into the image being built in the /usr/src/app
directory
which is dynamically created first.
RUN cd /usr/src/app \
&& npm install
After changing directory into the /usr/src/app
directory, the npm install
command is run, which uses the package.json
file previously copied to this
location to install all of the required npm packages specified in it.
This creates a /usr/src/app/node_modules
directory with all the dependencies
in.
CMD ["/usr/local/bin/node", "/usr/src/app/index.js"]
As previously described, the CMD
directive informs Docker what to run when
the Docker image is instantiated as a container. In this case, because the
directive is using the exec
format, full paths to the command to run need to
be given (/usr/local/bin/node
for the Node.js executable), as well as the
source file (/usr/src/app/index.js
).
Once you've edited the Dockerfile, we can build a Docker image from it using
the following docker
CLI command:
$ docker build --tag simple-docker-app .
Sending build context to Docker daemon 18.94kB
Step 1/4 : FROM balenalib/intel-nuc-debian-node:latest
latest: Pulling from balenalib/intel-nuc-debian-node
68ced04f60ab: Pulling fs layer
1418738fb4eb: Pulling fs layer
28ddf06c756e: Pull complete
6544792768e8: Pull complete
6e7ea75f4464: Pull complete
38aca9ee3123: Pull complete
5e30c86f7b84: Pull complete
12febcfb1db1: Pull complete
4acc7449c8c2: Pull complete
45be4c06aae8: Pull complete
Digest: sha256:0a613216d7a23e2ebbb54c0d42f420e8ba65b7d5507149a6690141e4ed700531
Status: Downloaded newer image for balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/4 : COPY * /usr/src/app/
---> d9fad658bc39
Step 3/4 : RUN cd /usr/src/app && npm install
---> Running in 5f4a3fa70ea5
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 4.182s
found 0 vulnerabilities
Removing intermediate container 5f4a3fa70ea5
---> 526ffdc6d52e
Step 4/4 : CMD ["/usr/local/bin/node", "/usr/src/app/index.js"]
---> Running in 9f9c787b7a1d
Removing intermediate container 9f9c787b7a1d
---> 630b3919abb5
Successfully built 630b3919abb5
Successfully tagged simple-docker-app:latest
The docker build
command itself instructs the Docker daemon to build a new
image, the --tag
(shortened form -t
) switch informs it how to name the image
(naming is in the form of repository:tag
, where the tag
is optional), and
the .
is the path to the build context, in this case the directory we're
running the command inside.
As you can see, quite a lot happens here. You'll see that there are several
steps which get output (eg.
Step 1/4 : FROM balenalib/intel-nuc-debian-node:latest
). As you may have
noticed, each of these steps corresponds to a Dockerfile directive. Therefore
you can quickly see which directive is being run by the step that's being
carried out. In this case, the Docker daemon pulls the base image required (and
during this you can see each individual image layer that makes up the base image
being pulled), before copying the files into the image, carrying out the npm
install and then setting the command to run on container instantiation.
Another really important point to note here is that when a Dockerfile is
processed, new layers for that image are created by particular directives.
Generally, new layers are created by each new Dockerfile directive (this is not
always true, but at this point, thinking about it in this way is useful). As
such, consolidating command into statements such as RUN
can be useful, as it
limits the number of layers produced in an image. We'll go more into these
techniques later in the Masterclass.
We can now list the images by running:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest 630b3919abb5 14 minutes ago 272MB
balenalib/intel-nuc-debian-node latest 4498fe12b99e 11 hours ago 270MB
Note that each image has a repository name, tag and image ID. Most Docker
commands working with images can take the name of the container or the ID.
If using the container name, note that a Docker image repository name can be
used with or without a tag, however if there is more than one tag for an image's
Docker repository name, the latest
tag will be assumed if not given.
More information about repository naming and tags will be discussed in a later exercise.
As you can see, we have a total of two images. The first is expected, because
this is the image we just built (simple-docker-app
), but the base image that
our Docker image has inherited from is also present
(balenalib/intel-nuc-debian-node
). This is because Docker stores each image,
including base images that others inherit, separately inside the system. When
a container is created from an image, all the layers that make up the image are
referenced and used by the container. This happens regardless of the number of
times an image is used, or how many parents it has. Because layers are
referenced and not directly copied, this allows many different containers to
run without the resource penalty of copying the same files and directory layouts
for each separate instance, thus saving a large amount of space.
5.1 WORKDIR
Directive
We now have a built image. However, even though we've a few steps, there's a
few things we could so to improve the way it's built. The first is to specify
a working directory. The WORKDIR
directive tell the Docker daemon that this is
our default working directory. Think of it a bit like running cd
to change to
the directory specified before running any commands.
With that in mind, add a new line to our Dockerfile after the FROM
directive:
WORKDIR /usr/src/app
We'll also now change a few of the other lines to remove specific references to
/usr/src/app
.
Change:
RUN cd /usr/src/app \
&& npm install
to:
RUN npm install
And finally change:
CMD ["/usr/local/bin/node", "/usr/src/app/index.js"]
to:
CMD ["/usr/local/bin/node", "index.js"]
Let's rebuild the image again:
$ docker build -t simple-docker-app .
Sending build context to Docker daemon 18.94kB
Step 1/5 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/5 : WORKDIR /usr/src/app
---> ebec93079c63
Step 3/5 : COPY * /usr/src/app/
---> aa53bfc5582c
Step 4/5 : RUN npm install
---> Running in 7a726fbf79ad
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 2.249s
found 0 vulnerabilities
Removing intermediate container 7a726fbf79ad
---> 6c7e0341586d
Step 5/5 : CMD ["/usr/local/bin/node", "index.js"]
---> Running in c83a11a07b0f
Removing intermediate container c83a11a07b0f
---> ca72102d9256
Successfully built ca72102d9256
Successfully tagged simple-docker-app:latest
The first thing you'll notice is that the FROM
line didn't show the Debian
base image being pulled. That's because it's already downloaded and stored it
in the local registry.
The next thing you'll notice is that we no longer have to specify the cd
to change directory in the RUN
command. If we hadn't specified the WORKDIR
to use, it would be looking at the root /
of the filesystem to run
npm install
and would fail as the package.json
would not exist there.
Finally, we no longer have to specify the full path of the index.js
source
file, as this exists in the /usr/src/app
directory (which is our working
directory).
As we've now specified the working directory as /usr/src/app
, we could also
change the command used to actually run the Node.js app to npm, as it would find
the package.json
in the same directory and use the start
section from it.
Change the final CMD
line to:
CMD ["/usr/local/bin/npm", "start"]
Finally rebuild the image again so this final line is used, with:
$ docker build -t simple-docker-app .
We now have a built Docker image, let's run it! First it's worth noting a little bit about what happens when you run a Docker image and use it to create a new container.
Docker containers run as processes spawned by the Docker daemon. Linux-based, they use the underlying host kernel to do so. Under host OSs such as macOS and Windows, these kernels are supplied by minimal Linux VMs that are closely bound to the underlying host OS so that contexts such as device nodes, networking, etc. are available to the VM kernel and therefore the container running on it.
This explains why OS resources are acquired by containers dynamically (and also released when no longer required), as each running process only consumes what it needs, as it essentially is running on the host OS, albeit in a file system and resource sandbox.
To run an image as a container, we can use the
docker run
command. We can either pass the repository and tag name or use the ID. We'll
just simply ask docker run
to use the simple-docker-app
image, without a tag
(which will assume it should use the latest
tag):
$ docker run --tty simple-docker-app
> [email protected] start /usr/src/app
> node index.js
Echo server is up and listening...
We'll explain the significance of --tty
later, but for now it just means that
we can stop the container using Ctrl+C
when we're done with it.
The act of running creates a new container instance based on that
image, which is now running under the Docker daemon. Open a new terminal (which
from this point on we'll call T2 with the terminal you ran the Docker image in
as T1), and enter:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
61578a11c16e simple-docker-app "/usr/bin/entry.sh /…" About a minute ago Up About a minute tender_gagarin
docker ps
shows
us all the currently running container instances. In this case there's just the
one we've started. You can see the command that it's running, as well as the
image it's derived from. It also has a container ID (just like an image, each
container also has a unique ID) and a name (which because we didn't specifically
pick one when running is autogenerated). However, there's also another field you
might notice which doesn't have any value associated with it. This is the
PORTS
field.
If you look at the source code in index.js
, you'll see that we've a little
endpoint service listening for REST POST requests to the /echo
endpoint.
You'll also see that the service is running on port 9854. So, let's use curl
in T2 to send a POST request:
$ curl -XPOST http://localhost:9854/echo -H 'Content-type: text/plain' -d 'Hello there!'
curl: (7) Failed to connect to localhost port 9854: Connection refused
Well, that's odd! The source for the image shows that the Express server is
listening on port 9854, but curl
can't get a connection to it. The reason is
that whilst the Express server in the container is trying to listen to the
port, we haven't actually told the Docker daemon that the container should open
a connection to the same port. This actually demonstrates a fundamental
principal behind containerization, in that unless specifically instructed,
Docker sandboxes all running containers securely away from the host OS.
First terminate the container using either Ctrl+C
in T1 where it's running,
or stop it by running
docker stop
in
T2, passing either its name or ID as a final parameter. In the case of the
container we've created here, that would be:
$ docker stop tender_gagarin
or:
$ docker stop 61578a11c16e
Once stopped, the container no longer shows up as running in docker ps
:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
But it will show up as a stopped container in the list of all containers if we
pass --all
(shortened form -a
) to docker ps
:
$ docker ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
61578a11c16e simple-docker-app "/usr/bin/entry.sh /…" About a minute ago Exited (0) a minute ago tender_gagarin
The --all
(which can also be written -a
) switch to docker ps
tells it to
list all the container instances that are currently available, not just those
that are currently running.
We can restart the container easily with
docker start
,
which will allow you to restart any previously created and stopped container
instance. Simply pass the name of the container or its ID and Docker will start
the container again using the same parameters that it was originally run or
created with.
$ docker start 61578a11c16e
61578a11c16e
Finally stop the container again before continuing with:
$ docker stop 61578a11c16e
61578a11c16e
Instead of using docker run
, we could also use
docker create
to create a new container from an image. However, this command only creates a
container and does not run it. docker create
takes many of the same parameters
as docker run
. To start running a created container, you then need to use
docker start
on the container name or ID.
We'll get into Docker networking later and how traffic is routed, but for now
all we need to know is that to ensure that traffic can get in and out of the
container so that Express can receive traffic on port 9854, we need to run the
container with the --publish
command to signal to the Docker daemon that we'd
like to 'publish' traffic into it from the host OS.
We're going to re-run the image again, but this time specifying that the right port should be opened up into the container. In T1, run the following:
$ docker run --tty --publish 9854:9854 simple-docker-app
> [email protected] start /usr/src/app
> node index.js
Echo server is up and listening...
Note that we specified 9854 twice as the arguments to the --publish
switch
(this can be shortened to -p
).
This isn't an error, but it lets us map a specific port inside the container
to a port on the host. In this case, we want both port numbers to be the same,
but you can change the host port by altering the first argument to be the port
on the host you want to use, ie.
--publish <hostPort>:<containerPort>
.
As a final note, --publish
also allows ranges of ports from the host to be
published (for example mapping all traffic from ports 1234 to 1238 on the host
to a single port inside the container would be --publish 1234-1238:1234-1238
).
In T2, have a look at the output of docker ps
again:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0579294093b6 simple-docker-app "/usr/bin/entry.sh /…" 8 seconds ago Up 8 seconds 0.0.0.0:9854->9854/tcp jolly_banzai
Aha! Now we can see that the PORTS
section has a value in it,
0.0.0.0:9854->9854/tcp
. This states that port 9854
on IP address 0.0.0.0
on the host OS is mapped to port 9854/tcp
inside the container. A couple of
things to note are that by default, unless otherwise specified, the Docker
daemon will allow traffic coming into the host on any interface through the
specified port into the container (hence 0.0.0.0
which states any IP assigned
to a valid host network interface). We could have specified a set IP address for
the host, for example if we wanted only to let traffic coming through localhost
into the container, we could use --publish 127.0.0.1:9854/9854
.
Additionally you can see that the traffic entering the container from the host
on port 9854 is going to port 9854/tcp
in the container. TCP is always the
default if not otherwise stated. To use UDP instead, simply prefix the published
ports list with /udp
, eg. --publish 9854/udp:9854/udp
would map any traffic
from UDP port 9854 on the host to UDP port 9854 in the container.
Finally let's try the curl
command again in T2:
$ curl -XPOST http://localhost:9854/echo -H 'Content-type: text/plain' -d 'Hello there!'
Hello there!
Success! We've posted an HTTP request from the host to the Docker container instance, and received a response.
Finally, stop the container currently running either with Ctrl+C
in T1, or via
docker stop
in T2.
6.5 EXPOSE Directive
Whilst we're on the topic of ports, this is a good time to quickly discuss the
EXPOSE
directive. This can be used in Dockerfiles to state that the container
would like to open a particular port when run as a container. Let's try this
by adding:
EXPOSE 9854
to our Dockerfile (it doesn't matter particularly where for this example, but
good practice is near the start of the Dockerfile, after the FROM
line).
(If you wanted to expose a UDP port, you'd simply specify EXPOSE 9854/udp
.)
Now let's rebuild our Docker image in T1:
$ docker build -t simple-docker-app .
Sending build context to Docker daemon 18.94kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Running in a864220d9450
Removing intermediate container a864220d9450
---> 6c956e6eab8a
Step 3/6 : WORKDIR /usr/src/app
---> Running in 51b79d5d5e92
Removing intermediate container 51b79d5d5e92
---> e871df45a8ae
Step 4/6 : COPY * /usr/src/app/
---> 4cc2425acfa2
Step 5/6 : RUN npm install
---> Running in 4dad7c0549f3
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 2.332s
found 0 vulnerabilities
Removing intermediate container 4dad7c0549f3
---> f8d369f729ae
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Running in 1c027dde602e
Removing intermediate container 1c027dde602e
---> 8d1460ff22b2
Successfully built 8d1460ff22b2
Successfully tagged simple-docker-app:latest
Now we'll run the new version of the image, but we'll remove the use of
--publish
and instead use the --publish-all
(shortened form -P
) switch
instead, which will publish all ports specifed in EXPOSE
directives in the
image to the host:
$ docker run --tty --publish-all simple-docker-app
> [email protected] start /usr/src/app
> node index.js
Echo server is up and listening...
In T2 run the curl
command again:
$ curl -XPOST http://localhost:9854/echo -H 'Content-type: text/plain' -d 'Hello there!'
curl: (7) Failed to connect to localhost port 9854: Connection refused
It looks like curl
can't actually connect to the host on port 9854. Let's use
T2 again to see what docker ps
says:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
636c8f11fee2 simple-docker-app "/usr/bin/entry.sh /…" 3 minutes ago Up 3 minutes 0.0.0.0:32769->9854/tcp epic_montalcini
Interestingly, whilst port 9854 in the container is mapped a port on the host,
it's not mapped to port 9854. This is because --publish-all
maps ports
specified by the EXPOSE
directive to
ephemeral ports on the host,
usually the next free one. This initially doesn't sound particularly useful,
but becomes more so in the context of Docker deployments to Cloud hosts where
services are usually run.
If we now try curl
again in T2 using port 32769 (this will most likely change
on your own machine, ensure you use the right mapped port), we'll see that all
works as expected:
$ curl -XPOST http://localhost:32769/echo -H 'Content-type: text/plain' -d 'Hello there!'
Hello there!
Any Docker image which has been built with the EXPOSE
directive can still be
forced to map a specific host port to a specific container port just as before
by using the --publish
switch to docker run
.
Whilst the use of EXPOSE
is not mandatory when writing Dockerfiles, its use
is encouraged as useful documentation and metadata to those who want to run your
Docker images. This stops extra documentation being required to state explicitly
which ports your image uses, and also means when you update your image, all a
user has to do is see if extra EXPOSE
directives have been used and then map
the new ports appropriately to the host.
Finally, stop the container running in T1 and then close T2. We'll refer to T1 from this point as 'the terminal' again.
One thing you may have spotted whilst we were stopping and rerunning our Docker image was that the name and the ID of the container kept changing.
In the terminal, run the following:
$ docker ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
636c8f11fee2 simple-docker-app "/usr/bin/entry.sh /…" 20 minutes ago Exited (0) 3 minutes ago epic_montalcini
0579294093b6 f50ceee6da24 "/usr/bin/entry.sh /…" 51 minutes ago Exited (0) 26 minutes ago jolly_banzai
61578a11c16e d00c59983bd8 "/usr/bin/entry.sh /…" About an hour ago Exited (0) About an hour ago tender_gagarin
However, looking at the list, this seems strange at first. We we just kept
running our simple-docker-app
, so why are there several instances of it? Well,
every time we used docker run simple-docker-app
, we were asking the Docker
daemon to run a new instance of the image, not to run the same container
image that we had before. If we want to re-start an already existing container,
we'd simply use
docker start
.
We've previously built an image using docker build
, but currently if we make
any changes this rebuilds the entire image. In this section, we'll learn how
to tag images to give them versioning information, as well as how to use
caching to ensure quicker rebuilding of images.
You can also see from the container list that only the last container ran using
the simple-docker-app
image, but why is this? If you remember earlier we
discussed that each image name had a repository and tag section. Because we
have rebuilt the image several times with changes to the Dockerfile, we've
actually changed the resulting image's file system layers and unique hash.
This has meant that the latest
tag on the simple-docker-app
repository has
moved each time we've made changes to the image to the newly built version of
it. In the above output, both images d00c59983bd8
and f50ceee6da24
were
previously called simple-docker-app
, as they were the last version of the
image to be built. However, as we've built new versions, their name has been
changed to that of the image's ID instead, to differentiate them from the other
versions.
This again is consistent with how Docker denotes images and containers. If we hadn't made any changes to the image but simply rebuilt it, it would have checked the layers against those that already exist for the image, have seen nothing had changed, and the image would not have been given a new ID or name.
We can easily check this by running the following in the terminal:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest 8d1460ff22b2 29 minutes ago 272MB
...
The latest image ID for simple-docker-app:latest
is 8d1460ff22b2
. Let's
try rebuilding the image:
$ docker build -t simple-docker-app .
Sending build context to Docker daemon 18.94kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Using cache
---> 6c956e6eab8a
Step 3/6 : WORKDIR /usr/src/app
---> Using cache
---> e871df45a8ae
Step 4/6 : COPY * /usr/src/app/
---> Using cache
---> 4cc2425acfa2
Step 5/6 : RUN npm install
---> Using cache
---> f8d369f729ae
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Using cache
---> 8d1460ff22b2
Successfully built 8d1460ff22b2
Successfully tagged simple-docker-app:latest
The build will finish almost at once; because there were no changes, the image
ID build is still 8d1460ff22b2
, and the name remains the same:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest 8d1460ff22b2 32 minutes ago 272MB
...
If we were to make any changes to the Dockerfile or any of the files built into
the image, the ID and the name would change again, to denote that the latest
version of the image is now different to that previously.
To make this a bit clearer, we'll remove all the current containers and images and start again with a new naming scheme.
In the terminal run:
$ docker rm $(docker ps --all --quiet) && docker rmi $(docker images --quiet)
92b04db835b2
c7747c038ccb
646ee43d28ec
Untagged: simple-docker-app:latest
Deleted: sha256:d3916aacd72ef1e141885e7726dd8b1ca0388e6e10b2d8faf79f3cde60cf803d
Deleted: sha256:46d5233518e31e66f3447c830bb70372787d1873d012e9b8ad57fd108f4c4b02
Deleted: sha256:4187fd02a41805db6d6dbef6aab2b257aff00e166b8db2ede3060015247c9632
Deleted: sha256:bb46a70a9ea96fe011d030f518aabd7fa5619c42e0cc5adab5cb371d22bb1342
Deleted: sha256:419dff4dd6df34bba7cbab86dae5845c189b6c7dd6e08a9a7564de7d6df87af6
Deleted: sha256:4e71ef43d45dc3f7758b01c9b722331a22cf8a7c895ccdf4638c6dc62cdf4802
Deleted: sha256:bb249dbcb57e2548dbf63b0b3f148edd1380ac956ed5fca994d990331b3f00c3
Deleted: sha256:2183028d46ba9f26d13177acc9cfc9065f480673118360ff23ef81ac7915898e
Deleted: sha256:b9bb00a5e2249823a25a926a24c4f28292629fe42db30ce6e266eae32bff5b8d
Deleted: sha256:dbb5f06727dc12ecd7db5ea5ede261d4197e1cdbe009c7decc8edbb69e0a21ea
Deleted: sha256:0e9cc530a9220017f6687e024834f9220ec5e7e8acdc28dd88b40cc1a884b385
Deleted: sha256:063849044d81abc6006d5b7a5facfd6181ba0d1d4b648f4dddffed81d0d5b556
Deleted: sha256:a849ba8f23c4f8eeef20ac2ff3dcb587b6bc49874492aaaab313001337a2ead0
Deleted: sha256:01753aa7f35f274cc19cda7a05cf52f9453b4e8d50f949d2d1c1d240dbfc9ced
Deleted: sha256:b9c4eebbf47dd5650591125038aecb2d42c705720abd6f0dec5251b5addae859
Deleted: sha256:9737e0a4b16054e6a1477c2d3727f0f80e41b063894f878734a7b5f16c741568
Deleted: sha256:42f2cd768ae7bf3e8d41211db4d305ff25a7e90a952e643420fd48fa90885803
Deleted: sha256:222e19d39a010af250e7acc204ff5aeca1dff11991435d3bf4539ada9bea10f3
Deleted: sha256:e3fccddc6e65c39402683215147e79ee63fe6cd79ee38b8deacf01da0c8c63d6
Deleted: sha256:a4651a5f72e536808b761f364233aff6943adf92272ca0d63d27bbfa18a3da09
Deleted: sha256:4d15f60da9f321cdeefecfd0c5699cf0bddf6fcc371782a7f29105abf11f361f
Deleted: sha256:45783d47385fda0accce26aa731d61be719ffc9cc4889d43f64621cbf44e999c
Deleted: sha256:78ff73474fe0a280269358279dfde63d4f0d81c2fdf37b42dde76c3ef1583f6d
Deleted: sha256:f70452e65f0456c42f67affc22ef1f6c6530211fa9138c063a984e4a28315ead
Deleted: sha256:53c350e072b435330016235e6081255151c104c2a73664a39436f59c81b8d030
Untagged: balenalib/intel-nuc-debian-node:latest
Untagged: balenalib/intel-nuc-debian-node@sha256:0a613216d7a23e2ebbb54c0d42f420e8ba65b7d5507149a6690141e4ed700531
Deleted: sha256:4498fe12b99e2e5b0ed343ac14f53ceefa18b8d7a8ad458da05ffb094989bfe5
Deleted: sha256:6774853e36677eac96d376174993694fc24daa455724ba99b498d30eaaff581c
Deleted: sha256:191098869ebd8201a2e909969fb26db0a13d5ed20dc807bdb639c52fa0cd3aa5
Deleted: sha256:255db71d63d0f14a6babfed360cc8317c6f060046902c2492d25edbdd6efd74e
Deleted: sha256:54b05cf3c2b53f3aa35b63d6e517dcc66ad4e5c04cf3e10f8a77685b964b3b11
Deleted: sha256:5a8535c5253022eb99d04334c697d356cd6d28876965cc55f9c8cc8976ba44c5
Deleted: sha256:f77bc377ce89acd6568b3640f2d17ab69fba419f3c45ea5036842188fe610baa
Deleted: sha256:52638c47b14625df47fc7ece74f6508c056181e965e2acbc7e4e1ae269f7a0d0
Deleted: sha256:bd032d966a12ff731c1a46bf62b0fe079d4c412d3e290d4f6321c483ea9eb2eb
Deleted: sha256:6d378e46857b5177133d61d45333dae1f29e6171914bc5ee7bcc7e2f0fc75c17
Deleted: sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da
This will wipe all images and containers from the local repository. Let's quickly break down what that command did:
docker rm $(docker ps --all --quiet)
The docker ps --all --quiet
command lists all instances of containers, running
or not, and output their IDs and nothing else (the --quiet
switch, which can
also be shortened to -q
). The $()
wrapper executes this for the docker rm
command, which then removes each of these container IDs from the local
repository. You can individually remove a container instance either by using its
name or ID, eg. docker rm <IDorName>
.
After this, the next command is run:
docker rmi $(docker images --quiet)
Similarly to the docker ps
command, the --quiet
switch to docker images
will only output the IDs of the images. This is then exectured for docker rmi
which will remove each image by its ID. Again, you can manually remove a
Docker image from the local repository with docker rmi <IDorRepositoryName>
.
A few things to note here, had we just tried to remove the images without first
remove the containers, Docker would have complained that the images were in use
by instantiated containers. A quick way round this is to use --force
(shortened form -f
) when running docker rmi
, but bear in mind it will never
remove the image from a running container, which must be manually stopped first.
Another important note is that whilst the images will be removed doing this,
it will not remove containers that have been created from these images.
We're going to now rebuild our image from the Dockerfile, but this time we're explicitly going to give it version tags:
$ docker build -t simple-docker-app:v0.0.1 .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
latest: Pulling from balenalib/intel-nuc-debian-node
68ced04f60ab: Pull complete
1418738fb4eb: Pull complete
28ddf06c756e: Pull complete
6544792768e8: Pull complete
6e7ea75f4464: Pull complete
38aca9ee3123: Pull complete
5e30c86f7b84: Pull complete
12febcfb1db1: Pull complete
4acc7449c8c2: Pull complete
45be4c06aae8: Pull complete
Digest: sha256:0a613216d7a23e2ebbb54c0d42f420e8ba65b7d5507149a6690141e4ed700531
Status: Downloaded newer image for balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Running in d946211dd5c2
Removing intermediate container d946211dd5c2
---> ff63da014498
Step 3/6 : WORKDIR /usr/src/app
---> Running in 0dfe40907d63
Removing intermediate container 0dfe40907d63
---> 568a5e43a29e
Step 4/6 : COPY * /usr/src/app/
---> 14385fc25cae
Step 5/6 : RUN npm install
---> Running in 77c4a4af57d7
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 3.933s
found 0 vulnerabilities
Removing intermediate container 77c4a4af57d7
---> bdb6468729c5
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Running in a628f762ef4b
Removing intermediate container a628f762ef4b
---> 25d3625211c2
Successfully built 25d3625211c2
Successfully tagged simple-docker-app:v0.0.1
Now run:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app v0.0.1 25d3625211c2 24 seconds ago 273MB
balenalib/intel-nuc-debian-node latest 4498fe12b99e 38 hours ago 270MB
Unlike when we built the image before, we have now set a specific tag for the
image repository. Also note how the latest
tag is not present. Let's try
and run the image in the same way we did before, using just the repository name:
$ docker run --tty simple-docker-app
Unable to find image 'simple-docker-app:latest' locally
docker: Error response from daemon: pull access denied for simple-docker-app, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
Docker can't find the latest
tag, because it's not been set, so it is not
able to run the image. We could run it by specifying
docker run simple-docker-app:v0.0.1
, but there's also a couple of ways we
can generate the latest tag too. The first would be to rebuild the image with
the latest
tag as well:
$ docker build -t simple-docker-app:v0.0.1 -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/6 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/6 : COPY * /usr/src/app/
---> Using cache
---> 14385fc25cae
Step 5/6 : RUN npm install
---> Using cache
---> bdb6468729c5
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Using cache
---> 25d3625211c2
Successfully built 25d3625211c2
Successfully tagged simple-docker-app:v0.0.1
Successfully tagged simple-docker-app:latest
Again, this rebuild was very fast because nothing had actually changed, and all that the Docker daemon actually had to do was add a new tag. Tags show up as separate images. If we list the images again:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest 25d3625211c2 About a minute ago 273MB
simple-docker-app v0.0.1 25d3625211c2 About a minute ago 273MB
balenalib/intel-nuc-debian-node latest 4498fe12b99e 38 hours ago 270MB
You can see that the same image ID is actually related to two different tags.
We will now be able to run the image without the specific tag (defaulting to
latest
) again.
However, if we now change something that alters the image that we build, this
will change he situation. In your code editor, change index.js
and add a
line at the very top that says:
// This comment changes the image
We'll now rebuild the image with:
$ docker build -t simple-docker-app .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/6 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/6 : COPY * /usr/src/app/
---> 984238d40dc5
Step 5/6 : RUN npm install
---> Running in 6daa75bedeb7
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 4.965s
found 0 vulnerabilities
Removing intermediate container 6daa75bedeb7
---> 2ffc21acaf16
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Running in 97806c309a25
Removing intermediate container 97806c309a25
---> 4817e03f7def
Successfully built 4817e03f7def
Successfully tagged simple-docker-app:latest
Finally run:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest 4817e03f7def 27 seconds ago 273MB
simple-docker-app v0.0.1 25d3625211c2 2 minutes ago 273MB
balenalib/intel-nuc-debian-node latest 4498fe12b99e 38 hours ago 270MB
As you can see, because when we rebuilt we only specified the repository name
to -t
and not an actual tag, latest
was assumed and the latest
tag now
points to the newer image.
It's always a really good idea to specifically tag the images you're building with a suffix. This stops older built images from losing their repository and tag name, which allows you to easily identify older versions of the image.
To reinforce that point, we'll add one final comment line to index.js
, just
below the other:
// Another comment changes the image
Finally, rebuild the image with a v0.0.2
tag and the latest
tag:
$ docker build -t simple-docker-app:v0.0.2 -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/6 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/6 : COPY * /usr/src/app/
---> ba616c670298
Step 5/6 : RUN npm install
---> Running in 0201cb715e8a
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 2.677s
found 0 vulnerabilities
Removing intermediate container 0201cb715e8a
---> b20eabc15eb0
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Running in dceef18cc322
Removing intermediate container dceef18cc322
---> d7fefc78916d
Successfully built d7fefc78916d
Successfully tagged simple-docker-app:v0.0.2
Successfully tagged simple-docker-app:latest
And run:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest d7fefc78916d 22 seconds ago 273MB
simple-docker-app v0.0.2 d7fefc78916d 22 seconds ago 273MB
<none> <none> 4817e03f7def About a minute ago 273MB
simple-docker-app v0.0.1 25d3625211c2 3 minutes ago 273MB
balenalib/intel-nuc-debian-node latest 4498fe12b99e 38 hours ago 270MB
We now have a new version, v0.0.2
, which is also the latest
version. We also
still have our original version, v0.0.1
. But note that the version we built
without a tag has now been stripped of its repository and latest
tag. Because
Docker was not given a full repository:tag
name for the image, and because
we've moved the latest
tag on, it's had to remove the name and tag from that
image, and it is now only know by it's original ID (4817e03f7def
).
Usually, especially during development of a Docker service, you'll want to change things such as the packages being installed into the image, the source code for the application itself, configuration files, etc.
Whilst our small example Docker image builds quite quickly, if we were to start building very large images we wouldn't want the entire image to be rebuilt from scratch every time we changed a source file that had nothing to do with OS packages being installed.
As we've previously mentioned, when building an image each Dockerfile directive (known as a step) is processed in serial, and these usually result in new layers. However, when building an image, the Docker daemon actually examines the directive and the layers already in the local repository, and determines if it actually needs to create a new layer for that directive or just re-use what already exists.
Let's visualize an example here to make it a bit clearer. Taking the contents
of the simple-docker-app
Dockerfile you currently have, it'll look something
like this:
FROM balenalib/intel-nuc-debian-node:latest
EXPOSE 9854
WORKDIR /usr/src/app
COPY * /usr/src/app/
RUN npm install
CMD ["/usr/local/bin/npm", "start"]
If we rebuild the app, without making any changes, we'll see that Docker finishes it very quickly:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/6 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/6 : COPY * /usr/src/app/
---> Using cache
---> ba616c670298
Step 5/6 : RUN npm install
---> Using cache
---> b20eabc15eb0
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Using cache
---> d7fefc78916d
Successfully built d7fefc78916d
Successfully tagged simple-docker-app:latest
Note that for each step that gets carried out, Docker reports Using cache
.
The reason for this is because Docker has noticed that there is already an image
that exists based on balenalib/intel-nuc-debian-node:latest
, and that for
each step in the Dockerfile, there's already a layer that corresponds to
this step from a previous build. This only occurs because the steps are in
the same order and carry out the same operations. Because of this, it can reuse
the layers that already exist. This is called Docker caching, and it's another
powerful feature in building Docker images.
However, as soon as something changes either in the Dockerfile directives or any of the files that are used in building the image, this causes Docker to have to build a new layer as it no longer pertains to previously created layers for that step. This is called 'cache busting'.
So now we know a bit more about this, let's change one of the last lines in the
Dockerfile so that change the way the output image is built. Just before the
final line in the Dockerfile (the CMD
line), add the following:
RUN echo "I'm a new layer!"
From what we now know, when we now rebuild the Docker image it'll use the cache
for all the layers up to the new step we've added with the new RUN
directive:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 4.096kB
Step 1/7 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/7 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/7 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/7 : COPY * /usr/src/app/
---> 1170dda6a85a
Step 5/7 : RUN npm install
---> Running in f0f10ee5cb1c
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 4.573s
found 0 vulnerabilities
Removing intermediate container f0f10ee5cb1c
---> a59c235d0394
Step 6/7 : RUN echo "I'm a new layer!"
---> Running in 2ffe6005e09e
I'm a new layer!
Removing intermediate container 2ffe6005e09e
---> c9d98338032c
Step 7/7 : CMD ["/usr/local/bin/npm", "start"]
---> Running in 5b954db009d2
Removing intermediate container 5b954db009d2
---> 8c3c0d4e9665
Successfully built 8c3c0d4e9665
Successfully tagged simple-docker-app:latest
We can see from the build output though, that it has not used the cache for
all the steps up to the new directive. In fact, although it used the cache for
the FROM
, EXPOSE
and WORKDIR
steps, it created a new layer for the COPY
command onwards. But we didn't change anything else apart from the Dockerfile!
Well, that actually explains exactly the reason why it created a new layer
from that point onwards. Let's look at the COPY
command:
COPY * /usr/src/app/
This says to copy everything from the current build context into the
/usr/src/app
directory inside the image. But this includes the Dockerfile
itself, which we've changed. Because of this, the layer output would not be
the same as the ones from previous builds, as the contents of the Dockerfile
are now different. There's a pretty obvious way round this, which is to not
copy the Dockerfile as part of the build context. There's a couple of different
ways to do this.
The first is via the use of a
.dockerignore
file. This allows us to specify files and directories that shouldn't be included
as part of the context sent via COPY
(or ADD
which we'll describe later in
the Masterclass). In the simple-docker-app
directory, execute the following:
cat <<EOF >> .dockerignore
Dockerfile
.dockerignore
EOF
This will create a .dockerignore
file that says not to copy the Dockerfile
and .dockerignore
file to the new layer when the COPY
command is run. Note
that if we hadn't included the .dockerignore
file itself, then when a build
is next run, cache busting will still occur on the COPY
line because the
.dockerignore
file itself would be copied into the image!
And now let's add yet another line to the Dockerfile just before the CMD
:
RUN echo "I'm yet another new layer!"
Run the build again:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/8 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/8 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/8 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/8 : COPY * /usr/src/app/
---> 9dba319eb545
Step 5/8 : RUN npm install
---> e592d47811be
Step 6/8 : RUN echo "I'm a new layer!"
---> 50b9b858560a
Step 7/8 : RUN echo "I'm yet another new layer!"
---> Running in d2f1941af3f7
I'm yet another new layer!
Removing intermediate container d2f1941af3f7
---> 088cb20dc04e
Step 8/8 : CMD ["/usr/local/bin/npm", "start"]
---> Running in c73ee349d0c8
Removing intermediate container c73ee349d0c8
---> 8da0e5e54752
Successfully built 8da0e5e54752
Successfully tagged simple-docker-app:latest
So again, this didn't work, why not? Well, this time, the COPY
instruction
has changed again because now we're not copying the Dockerfile in. Finally,
add another new line before the CMD
directive:
RUN echo "I'm a final layer!"
And now rerun the build:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/9 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/9 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/9 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/9 : COPY * /usr/src/app/
---> Using cache
---> 9dba319eb545
Step 5/9 : RUN npm install
---> Using cache
---> e592d47811be
Step 6/9 : RUN echo "I'm a new layer!"
---> Using cache
---> 50b9b858560a
Step 7/9 : RUN echo "I'm yet another new layer!"
---> Using cache
---> 088cb20dc04e
Step 8/9 : RUN echo "I'm a final layer!"
---> Running in 6ac0e5fd891e
I'm a final layer!
Removing intermediate container 6ac0e5fd891e
---> 4bf14cfdce0b
Step 9/9 : CMD ["/usr/local/bin/npm", "start"]
---> Running in 8730c2627796
Removing intermediate container 8730c2627796
---> 14984b3f702d
Successfully built 14984b3f702d
Successfully tagged simple-docker-app:latest
This time the cache was used for the build right up until the new step we added
to the Dockerfile because it's no longer being copied. Finally, remove the
RUN
lines that we just added.
There's another way to use COPY
so that the cache is used efficiently, and
that's to only copy files required to build your image. Change your Dockerfile
so that the line:
COPY * /usr/src/app/
becomes:
COPY package.json index.js /usr/src/app/
If we hadn't have already created a .dockerignore
file, this would have much
the same effect in that the Dockerfile would not be copied, but only the
npm package manifest and the source code.
Only copying what's required for a Docker image build is a very good habit to
get into, and is usually used in conjunction with .dockerignore
files to
ensure that the cache is always used as much as possible.
The second way to ensure that the cache is used as much as possible is to consider how files and directives may affect those that come after it in the Dockerfile. By now, your Dockerfile should look something like this:
FROM balenalib/intel-nuc-debian-node:latest
EXPOSE 9854
WORKDIR /usr/src/app
COPY package.json index.js /usr/src/app/
RUN npm install
CMD ["/usr/local/bin/npm", "start"]
Ensure that's up-to-date by building it with
docker build -t simple-docker-app:latest .
.
Now remove the two lines in the index.js
source file we added earlier,
ie.:
// This comment changes the image
// Another comment changes the image
Let's rebuild the image again:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/6 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/6 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/6 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/6 : COPY package.json index.js /usr/src/app/
---> 0b2e6fcc9fb6
Step 5/6 : RUN npm install
---> Running in bbec4d8050c0
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 3.593s
found 0 vulnerabilities
Removing intermediate container bbec4d8050c0
---> 1529009594ee
Step 6/6 : CMD ["/usr/local/bin/npm", "start"]
---> Running in 674741c25eaa
Removing intermediate container 674741c25eaa
---> 4f6086286849
Successfully built 4f6086286849
Successfully tagged simple-docker-app:latest
As expected, because we've removed the two lines from the source file,
everything from the COPY
directive onwards has forced a cache bust and new
layers. However, there's actually something we can do here to ensure that
fewer new layers are required and that the majority of the build is used from
the cache. Note that we need the npm manifest (package.json
) file to carry
out the npm installation of required modules. But this isn't actually something
that altering the source file changes. As we're not changing the manifest,
we could actually split up the COPY
commands so that if only the source file
changes, the previous npm installation layer is used from cache. Change the
following lines in the Dockerfile:
COPY package.json index.js /usr/src/app/
RUN npm install
to:
COPY package.json /usr/src/app/
RUN npm install
COPY index.js /usr/src/app/
And rebuild the image again:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/7 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/7 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/7 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/7 : COPY package.json /usr/src/app/
---> 2c51ccf2a7b2
Step 5/7 : RUN npm install
---> Running in 02c04604c1f0
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
added 50 packages from 37 contributors and audited 126 packages in 2.638s
found 0 vulnerabilities
Removing intermediate container 02c04604c1f0
---> 1705ac26c046
Step 6/7 : COPY index.js /usr/src/app/
---> eb4a5841b1f5
Step 7/7 : CMD ["/usr/local/bin/npm", "start"]
---> Running in faf3d745a235
Removing intermediate container faf3d745a235
---> 3a8c0a432bc2
Successfully built 3a8c0a432bc2
Successfully tagged simple-docker-app:latest
Now change the index.js
source file again and re-add those two comment lines:
// This comment changes the image
// Another comment changes the image
at the start of the file. Finally let's rebuild again:
$ docker build -t simple-docker-app:latest .
Sending build context to Docker daemon 5.12kB
Step 1/7 : FROM balenalib/intel-nuc-debian-node:latest
---> 4498fe12b99e
Step 2/7 : EXPOSE 9854
---> Using cache
---> ff63da014498
Step 3/7 : WORKDIR /usr/src/app
---> Using cache
---> 568a5e43a29e
Step 4/7 : COPY package.json /usr/src/app/
---> Using cache
---> 2c51ccf2a7b2
Step 5/7 : RUN npm install
---> Using cache
---> 1705ac26c046
Step 6/7 : COPY index.js /usr/src/app/
---> 5233ceadf248
Step 7/7 : CMD ["/usr/local/bin/npm", "start"]
---> Running in 0c15ac7cfc32
Removing intermediate container 0c15ac7cfc32
---> 81a2fcf7791c
Successfully built 81a2fcf7791c
Successfully tagged simple-docker-app:latest
The build was far quicker as npm didn't have to reinstall the used modules again, and the filesystem layer already built was reused:
Step 5/7 : RUN npm install
---> Using cache
Considering how your Dockerfile is constructed, and how the cache may be utilized, is an extremely important part of using Docker. Whilst the example we've used is very simplistic, for builds where a Dockerfile is many lines long this can be critical in cutting down build time from hours to minutes.
So far we've docker run
to create a new container, which outputs all stdout
to the console.
However, there are multiple ways in which a container can be instantiated.
Detached mode is pretty much what it sounds like. A container can be run, and then detached from the current terminal to be run in the background by the Docker daemon. In this mode, once the container is started it is detached from the terminal, with no output being logged to it.
This can be achieved by passing the --detach
(shortened form -d
) to the
docker run
command. Let's give this a go now:
$ docker run --detach --publish "9854:9854" simple-docker-app
f0960149a42f7f01c219c325e18da7beac0ca9ff7072344642f56baaeb9cd641
Notice how this time, we haven't passed --tty
. That's because we're inherently
detaching the terminal from the Docker container, and the --detach
would
essentially be in conflict.
The output from this command is the ID of the now running docker container:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f0960149a42f simple-docker-app "/usr/bin/entry.sh /…" 11 seconds ago Up 11 seconds 0.0.0.0:9854->9854/tcp compassionate_williams
We can still use the container in detached mode, of course:
$ curl -XPOST http://localhost:9854/echo -H 'Content-type: text/plain' -d 'Hello!'
Hello!
The container will now run until explicitly stopped. However, we may want to see
what it's doing. We can easily do this by using the
docker attach
command specifying the name or ID of the container:
docker attach f0960149a42f
Start a new terminal and in it run the curl
command to our container again:
$ curl -XPOST http://localhost:9854/echo -H 'Content-type: text/plain' -d 'Attached echo!'
Attached echo!
Note that in the terminal where we attached to the container, we see:
Echoing Attached echo!...
Ctrl+C
in the terminal where we attached to the container will stop that
container, because we reattached to it after starting it in detached mode.
However, it's very important to note that if you run a container in detached
mode (or use docker start
with its default parameters on a stopped container),
you need to use docker stop
explicitly to stop that container if you do not
reattach a terminal to it.
Finally (if you haven't already), stop the running container.
A container can also be run by overriding the default command that it uses
(the CMD
directive from the Dockerfile). Here, the command that you want to
run instead is passed as the last parameters to the docker run
command. Let's
try this now by running the following:
$ docker run --publish "9854:9854" simple-docker-app ls -l /usr/src/app
total 28
-rw-r--r-- 1 root root 488 Mar 5 00:09 index.js
drwxr-xr-x 52 root root 4096 Mar 5 00:09 node_modules
-rw-r--r-- 1 root root 14293 Mar 5 00:09 package-lock.json
-rw-r--r-- 1 root root 321 Mar 3 18:50 package.json
Note that once the command exits, the container itself stops running. This is because the command that we've used to override the default container command is not a long running command, which brings us to another important point. Docker containers only run for as long as the command they've been asked to execute runs. As soon as that command exits, the container stops running.
We also don't use the --tty
switch, because we're running a command we expect
to exit either successfully or with an error code, rather than a long-running
process that might need to be manually terminated.
Docker containers can also run in interactive mode. The stdin
stream is passed
to the container from the host, which allows input directly into the container.
The --interactive
(shortened form -i
) switch passed to docker run
allows
this, and it's usually used in conjunction with --tty
(shortened form -t
) to
allocate a psuedo terminal to the container. This allows us to perform many
operations as well as send signals (which is why Ctrl+C
stops the container
running), but is also a very good way of quickly running interactive commands
in the container. Because we're using Node.js inside our container to run the
listening HTTP server, we can use interactive mode to get a psuedo terminal into
the Node.js interpreter. Run the following:
$ docker run --interactive --tty --publish "9854:9854" simple-docker-app /usr/local/bin/node
Welcome to Node.js v13.9.0.
Type ".help" for more information.
>
As you can see, we've now been able to gain a TTY into the Node.js process.
Feel free to run some JavaScript statements in it, and when you've finished
hit Ctrl+C
twice to exit the interpreter. This will finish the interactive
session within the container, and the container (because we overrode the default
command by running /usr/local/bin/node
) will stop.
Most non-lightweight Docker images are built based on top of a Linux
distribution of one kind or another. Our particular image was built using Debian
Buster as its base. Because of this, we can very easily use the inbuilt bash
shell to get a TTY connection into the container and run it as if we were just
using another installation of Debian:
$ docker run --interactive --tty --publish "9854:9854" simple-docker-app /bin/bash
root@30e19e9a868b:/usr/src/app#
Again, because we overrode the default command, the HTTP server isn't listening, but because we've started an interactive container session using the shell, we can actually easily run it:
root@30e19e9a868b:/usr/src/app# npm start
> [email protected] start /usr/src/app
> node index.js
Echo server is up and listening...
If you now sent a curl
command to the HTTP server, it would respond
appropriately.
Finally, exit and stop the container by using Ctrl+C
to stop the HTTP server
and then exit
to leave the interactive bash shell in the container.
8.3.2 Via docker exec
Whilst docker run
in interactive mode is very useful for examining the
container and its contents, because we're overriding the default command from
the Dockerfile, it means we have to manually run the command which would have
otherwise run on startup if we want to see the container running 'as normal'.
There is another command though that allows us to run a command in an already
running container. This is docker exec
, and it behaves in the same way as
sending a command parameter to docker run
, except that it does this on already
running containers.
To demonstrate, run a container in detached mode to start the HTTP server:
$ docker run --detach --publish "9854:9854" simple-docker-app
6c4e438ffa43683d7cb8a9f52ba72c4ec874acd11cd5842ae34cba8243c3c527
As we're running in detached mode, we can now send a curl
command to prove
the HTTP server is listening:
$ curl -XPOST http://localhost:9854/echo -H 'Content-type: text/plain' -d 'Running echo!'
Running echo!
As we have the ID of the running container, we can now execute a command using
docker exec
(obviously change the container ID appropriately):
$ docker exec --interactive --tty 6c4e438ffa43683d7cb8a9f52ba72c4ec874acd11cd5842ae34cba8243c3c527 /bin/bash
root@6c4e438ffa43:/usr/src/app# # ps -ae
PID TTY TIME CMD
1 ? 00:00:00 npm
30 ? 00:00:00 sh
31 ? 00:00:00 node
42 pts/0 00:00:00 bash
48 pts/0 00:00:00 ps
We now have a bash shell into the container and as we can see, the npm start
from the CMD
in the Dockerfile is running correctly as PID 1.
Finally stop the container:
$ docker stop 6c4e438ffa43683d7cb8a9f52ba72c4ec874acd11cd5842ae34cba8243c3c527
As we've discussed previously, when an image is run and becomes a container instance it uses the image's layers as the base file system for that container. Whenever a container makes changes to the file system, a new layer is created for that specific container instance with the relevant changes.
Because of this, any changes that the container makes to the file system is only persistent for the life of that particular instance. Other instances will use different layers, and none of these changes will occur in the image.
Deletion or recreation of the container will see these container instance layers discarded. Whilst this is useful for transient files and directories, there are times when files need to persist.
For example, suppose you've written an application where one Docker image deals with storing and retrieving data from a database. A container instanced from the image could store all of the data in a new layer, but should you want to change the underlying code, create a new version of the image and then use it, you can't update to this new version without either throwing all of the stored data away or carrying out a very painful migration of the data from the old container instance to the new.
This is where bind mounts and volumes become useful. When a container is created or run, you can specify a bind mount or a volume to attach to that container. This allows the container to then use the bind mount or the volume as another filesystem path within the container, which allows for persistent storage outside of the container.
Bind mounts are directories or files that pre-exist on the Docker host machine that, when specified on running, or creating a new container, get bound into the container at a given destination path within that container.
This essentially allows the host machine's filesystem structure to be shared with a container. Any changes made to the bound directory/file inside the container will therefore change in the host, and vice versa.
To bind a directory or filepath to a container, simply use the
--mount
switch passed to
docker run
or docker create
. --mount
takes several options (separated by
commas), which allows the source filepath, destination filepath, type of mount,
etc. to be specified.
Let's carry out a quick example of this. We'll use the
balenalib/intel-nuc-debian-node
image to create a new container which binds
to an already existing directory structure. Run the following in the terminal
to change to the bind-mounts-and-volumes and start running a new container which
we'll connect a psuedo terminal to:
$ cd $BALENA_DOCKER_MASTERCLASS/bind-mounts-and-volumes
$ docker run --interactive --tty --mount type=bind,source="${BALENA_DOCKER_MASTERCLASS}"/bind-mounts-and-volumes/bind-directory,destination=/bound-mount simple-docker-app /bin/bash
root@ce0c2aa62101:/usr/src/app# cat /bound-mount/testfile.txt
This is a bound file!
root@ce0c2aa62101:/usr/src/app# echo "This was added from the container!" >> /bound-mount/testfile.txt
root@ce0c2aa62101:/usr/src/app# exit
exit
bind-mounts-and-volumes heds$ cat bind-directory/testfile.txt
This is a bound file!
This was added from the container!
Here we ran a new container instance, bound a local directory
(/bind-mounts-and-volumes/bind-directory
) which contained a file into it,
started an interactive psuedo terminal, printed the contents of the file to
STDOUT, then added a new line. Finally we exited the container, and then printed
the contents of the newly changed file out. As you can see, the changes made
in-container are echoed in the host filing system. Changes at the host level
would also have been echoed in the container.
A quick note on the --volume
option for bind mounts: Traditionally, bind
mounts were specified with the --volume
option. This diverged when volumes
became first class Docker citizens, which is when --mount
was introduced.
Whilst you can attach a bind mount to a container using --volume
, it is
not recommended for those starting Docker, as it reinforces legacy behavior.
Try to only use --mount
for bind and volume mounts (and just --volume
for
Docker volumes if really required). Another important note is that a directory
or file used for bind mounting must exist prior to the mount occuring.
Volumes, unlike bind mounts, are Docker managed storage areas which can also
be mounted into a container. Also unlike bind mounts, a directory path is not
passed to the container, but the name of the volume itself. This has several
immediately obvious benefit that only the name of the volume is required to
work with it instead of a filepath. Another big advantage is that volumes can
be of many forms, and not just of local filesystem hierarchies. There are
volume drivers for NFS, tmpfs
and others for example, which allows them to
be far more flexible than bind mounts.
For these exercises, we're just going to use a local filesystem volume.
Volumes, if not pre-existing, are created when a new container that it is bound is created or started. However, we'll go into a little more detail and create one ourselves:
$ docker volume create my-volume
my-volume
We've created a volume with the name my-volume
. Now we can attach it to a
new container:
$ docker run --interactive --tty --mount type=volume,source=my-volume,destination=/data-volume simple-docker-app /bin/bash
root@a3d4665bc8b1:/usr/src/app# echo "Data in my-volume!" > /data-volume/newfile.txt
root@a3d4665bc8b1:/usr/src/app# exit
Note that we've again used the --mount
parameter to mount the volume into the
container, but this time we've change the type
to volume
. The container ran
and a new file was created with some text in the volume.
We'll now run a new container, using a different image, and mount the same volume that we created:
$ docker run --interactive --tty --mount type=volume,source=my-volume,destination=/modified-data-volume balenalib/intel-nuc-debian-node /bin/bash
root@ab2ad1db1bce:/# cat /modified-data-volume/newfile.txt
Data in my-volume!
root@ab2ad1db1bce:/# exit
exit
As we can see, the data in the volume has persisted.
Because the volume is just another Docker object, we're actually able to inspect
it via the command line with
docker volume inspect
:
$ docker volume inspect my-volume
[
{
"CreatedAt": "2020-03-12T00:07:30Z",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/my-volume/_data",
"Name": "my-volume",
"Options": {},
"Scope": "local"
}
]
The Mountpoint
actually shows that the volume is created in the Docker storage
heirarchy, and if we were to inspect the
/var/lib/docker/volumes/my-volume/_data
directory, we'd find the newfile.txt
file.
You can list the current volumes known by Docker with
docker volume ls
, which we'll try now:
$ docker volume ls
DRIVER VOLUME NAME
local my-volume
Finally, let's remove the volume to tidy up, we use
docker volume rm
to do this:
$ docker volume rm my-volume
Error response from daemon: remove my-volume: volume is in use - [a3d4665bc8b16aa7df75faefd1edfef0517015cb506a200f9bb29a00132167f0, ab2ad1db1bce61e632b6bea51d65d5febc1a814d07e1ebd4350d8dcca5cecf5c]
But as you can see, we're first warned that the volume we're using is already in use! This is because we've previously attached it to a container that's only been stopped and not removed.
Because we actually attached the same volume to two different containers, we need to remove both of those containers before we can remove the volume. The removal command actually told us which containers are using the volume, so we can simply remove them first:
$ docker rm a3d4665bc8b16aa7df75faefd1edfef0517015cb506a200f9bb29a00132167f0 ab2ad1db1bce61e632b6bea51d65d5febc1a814d07e1ebd4350d8dcca5cecf5c
a3d4665bc8b16aa7df75faefd1edfef0517015cb506a200f9bb29a00132167f0
ab2ad1db1bce61e632b6bea51d65d5febc1a814d07e1ebd4350d8dcca5cecf5c
$ docker volume rm my-volume
my-volume
Containers, as described previously, share the kernel of the host OS (or the underlying Hypervisor or VM). As such, processes run against this kernel. However, this doesn't mean that processes in a container can carry out any operations that they want. To explain this, we need to dig a little into both the user model and the way containers are controlled.
Processes in containers run as a child process spawned by the Docker daemon. But
this actually raises some questions. The Docker daemon (under Linux) runs as
root
, and under macOS and Windows runs as a Hypervisor/VM that also
essentially has root
privileges. So doesn't this mean that all our containers
run processes as root
?
Well, let's have a look at what we see on a Linux system when we run a Docker container in it (Note: this will not work under Docker for Windows or Docker for macOS, because the processes running will actually occur under the VM and not the host OS). The following command will run the Docker container for 60 seconds (whilst it sleeps) before exiting, and whilst it sleeps we'll have a look at the process running:
$ whoami
heds
$ docker run --detach simple-docker-app sleep 60
1d9ad9d967c020ce1766a9080d05150e5e67f88c2d9980f7e3bd3cc3ef7751f4
$ ps aux | grep "[s]leep"
root 9384 18.0 0.0 4052 748 ? Ss 23:26 0:00 /bin/sleep 60
The first line carries out a whoami
to print the current user of the host OS,
the second starts our detached running container and the third searches the
host's process list for a command running sleep
. This final command finds the
running container process (which is run as a child of the Docker daemon), but...
it's running as the root
user, not as heds
!
How can this be? Well, the Docker daemon is actually running (if you've
installed it using the default method) under the root
user, and because of
this any process running in a container is also run by the root
user.
Even though the default is to run as root
, it's usually better to ensure that
the processes in a container run as a suitable user. There's a very easy way to
do this, which is to specify the
USER
directive in a Dockerfile. We'll go into
this in a later Masterclass.
However, the question a lot of new users have when they see the process running
as root
is usually, 'Is this not incredibly dangerous?'. The short answer is
'No' and we'll explain why in the next section.
Let's have a look at a quick example of running a command in a container. Pretty
much any Linux distribution allows you (as the root
user) to mount a
filesystem, and as we've already discovered, processes inside our container are
running as root. So let's mount a tmpfs
filesystem into /mnt/tmp
:
$ docker run --interactive --tty simple-docker-app /bin/bash -c 'mkdir /mnt/tmp \
&& mount -t tmpfs none /mnt/tmp \
&& echo "Temp file!" > /mnt/tmp/newfile.txt \
&& cat /mnt/tmp/newfile.txt'
mount: /mnt/tmp: permission denied.
So even though the root
user is trying to run the command to mount a tmpfs
filesystem inside the container, it's been denied permission. This answers
our previous question of why running container processes as root
is not
inherently dangerous (although still not recommended).
This is due to the way in which containers operate. Whilst all containers share the same Linux kernel that runs on the host OS (or Hypervisor/VM for Docker for Mac or Windows), they run in a separate namespace. This means that any processes that run inside these containers are not allowed to communicate or share resources with any other namespace, including processes run on the the host OS or other containers. Wikipedia has a good article on Linux namespaces that goes into this a little more.
As well as namespace separation, containers also adhere to the Linux capability
model, which allows very fine grained control over what container processes are
allowed to carry out. For example, should a process need to make changes to a
file's UIDs and GIDs, it would require theCAP_CHOWN
capability to allow it
to make such changes. Without this capability, any attempt to change file
ownership in a volume or bind mount would fail, regardless of underlying process
user. You can read more about Linux capabilities and the functionality that each
capability has access to
here.
By default, Docker drops almost all capabilities for a container that are not
directly required for processes to run in them. Other capabilities are added
via a 'whitelist' to open up more functionality.
Containers also run their own network stacks. As we've already seen, just
because a container has a process listening on a defined port, it doesn't mean
that any traffic will actually make it into the container. There's essentially a
firewall between the container and the host's networking interfaces, and the act
of publishing ports on a container allows traffic through that firewall (note
that this is not true should host
networking be used). Users can also
create their own Docker networks, and then run containers specifying these
networks so that containers can communicate with each other, and the host,
where required. These networks essentially contain their own namespacing, so
that they can only interact if given permission to do so. The
Docker networking overview
is a good place to start to read up on networking features.
Docker containers are all subject to the use of control groups (also known as
cgroups
). Control groups allocate host resources that can be used by a
container, and as such can limit the amount of CPU, memory or I/O usage
for it. This ensures that a rogue container cannot disrupt a host OS to the
point of bringing it down or making it unusable. Control groups for a container
are configurable, so a user can alter the resources available to it when
required. By default, Docker doesn't limit resources to containers, so in theory
you could max out the host OS by running an extremely CPU intensive process in a
container, or by using all its memory (though usually the OOM killer will kick
in). However, Docker allows you to specify limits when you run a container, and
it will then create appropriate cgroup
rules on the host to limit the
container's resource use. More information on cgroups can be found on the
Wikipedia page for cgroups
here.
Containers can also use other security features in the Linux kernel, such as SELinux and AppArmor to further ensure that containers are only given access to what they need.
Now that we know that a container has its own networking, namespace and
(potentially) limited cgroups
rules, we can relax a bit knowing that there's
little chance of a container doing anything nasty to the host OS without the
knowledge of the user who's running it.
In fact, we've seen this already when creating our first container, where although the HTTP server was listening inside the container, Docker did not allow traffic from the host to the container unless the appropriate ports were published. This is because the container was running its own network stack and had not been given any rules to allow incoming traffic from the host into it.
However, for now, we want to try and still mount a tmpfs
mount inside our
container, so how do we do it? The answer in this case lies in Linux
capabilities. It just so happens that to allow a filesystem mount to work inside
a container we need the CAP_SYS_ADMIN
capability feature (which also enables a
large number of other capabilites for container processes). To enable this,
we'll use the --cap-add
(for capability add) switch when executing
docker run
:
$ docker run --interactive --tty --cap-add SYS_ADMIN simple-docker-app /bin/bash -c 'mkdir /mnt/tmp \
&& mount -t tmpfs none /mnt/tmp \
&& echo "Temp file!" > /mnt/tmp/newfile.txt \
&& cat /mnt/tmp/newfile.txt'
Temp file!
This time it worked correctly, because we eased the restriction on the container
to allow the CAP_SYS_ADMIN
capability. Note that when passing the capability,
the CAP_
prefix is dropped.
It's important to note that should your container require elevated permissions or capabilities above that of the default controls, you should find the minimal number of changes required for the functionality required. This way your container will continue to be secured as much as possible from malicious code.
Once you've built a working docker image that contains all the functionality you need, it will be stored in the local Docker registry on the machine that you've carried out development on.
We've already seen the local registry, it's the store for images, the list of which can be seen by running the following command:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest b8a722866d8c 3 days ago 273MB
simple-docker-app v0.0.2 5e2c8c1cb9d9 5 days ago 273MB
simple-docker-app v0.0.1 8a65ba22f11f 6 days ago 273MB
balenalib/intel-nuc-debian-node latest ca37795ac1cc 2 weeks ago 270MB
This local repository stores all of the images that have been built or used when building the images. But how do we now share these images, so that others can use them? Well, think back to our Dockerfile, specifically the first line where we determined which base image to use:
FROM balenalib/intel-nuc-debian-node:latest
As we know, this tells Docker which image to base the rest of the image we're building on. As you may also remember, on the first build, this downloaded the image. This download came from a remote Docker registry, which Docker itself hosts, called Docker Hub. Any user with an account on Docker Hub can upload images they've built so that other users can download them and run them, or use them as base images. Docker has both public and private registries, depending on the type of account (unpaid accounts only allow the maximum of one private repository, with all others being public, whereas paid accounts allow more private repositories). Docker Hub is the default Docker registry that is used when installing Docker, but other registries can also be used including private ones that you can run yourself. We'll discuss this in a future Masterclass.
We discussed tags in a previous section, where we noted that tags that we give
images are of the format repository:tag
. A repository name denotes a specific
location in a registry for a group of images. On Docker Hub, repositories are
usually denoted by the name of a user or organization using the registry and
then the repository name itself, these being split by the use of a forward slash
(/
). For example, the base image we were using was a balena image, in the
repository where it stores all of its base images. The base images are owned by
the balenalib
user, and the repository name is the device type they use as
well as the functionality they include, hence balenalib/intel-nuc-debian-node
is a base image owned by balenalib
which runs on an intel-nuc
device based
on a debian
Linux distribution and includes node
(Node.js). Simply, a
repository is made up of user/repository-name
.
Versions of a particular repository image are then given by a tag, and as we
also learnt earlier in the Masterclass, if no tag is given then latest
is
assumed, which is why the image that was downloaded to the local registry
is of tag latest
.
We'll quickly ensure that our built Docker image is pushed to Docker Hub so that we can use it anywhere else, or let others use it.
First create a new account on Docker Hub if you don't
already have one, using the Sign Up
panel (if you already have a Docker Hub)
account. If you do have one, sign into it. For the rest of the exercise, we'll
assume the user is called dockerhub
so remember to replace this name with the
actual username (also called a Docker ID) you signed up with.
Next ensure that you login to the default Docker registry (Docker Hub), using:
$ docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: dockerhub
Password:
Login Succeeded
First of all, there's the matter of naming our Docker image. When we built it,
we called it simple-docker-app
. This isn't suitable for pushing to Docker Hub
however, because that name isn't prefixed by the name of the user who will own
the repository on Docker Hub. We'll quickly demonstrate this by using the
docker push
command, which is used to push repository images to a registry:
$ docker push simple-docker-app:latest
The push refers to repository [docker.io/library/simple-docker-app]
d1584e117f97: Preparing
e5630e29b3ed: Preparing
7e1009fcbba1: Preparing
77b61c32e3ac: Preparing
88d4980a47f6: Preparing
56db180aad32: Waiting
711c9715952e: Waiting
f8cad89c2502: Waiting
381d6d079e1d: Waiting
8a4c2b7f231f: Waiting
3a2f92b02f7b: Waiting
04bd31396a7a: Waiting
034332202128: Waiting
f2cb0ecef392: Waiting
denied: requested access to the resource is denied
As you can see, we don't have access to the toplevel repository because we don't own it.
Instead, we're going to ensure that Docker Hub knows that the image is owned
by us by creating a new tag for the repository to include our username. We'll be
pushing the latest
version of the image we built, so we'll use
docker tag
to tag that:
$ docker tag simple-docker-app:latest dockerhub/simple-docker-app:latest
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
simple-docker-app latest b8a722866d8c 3 days ago 273MB
dockerhub/simple-docker-app latest b8a722866d8c 3 days ago 273MB
simple-docker-app v0.0.2 5e2c8c1cb9d9 6 days ago 273MB
simple-docker-app v0.0.1 8a65ba22f11f 6 days ago 273MB
balenalib/intel-nuc-debian-node latest ca37795ac1cc 2 weeks ago 270MB
Note that when tagging you can use the name of the repository to create a new tag for, or use the image ID.
We now have a repository called dockerhub/simple-docker-app
which has a single
version tag of latest
. We'll try pushing this again:
$ docker push dockerhub/simple-docker-app:latest
The push refers to repository [docker.io/dockerhub/simple-docker-app]
d1584e117f97: Pushed
e5630e29b3ed: Pushed
7e1009fcbba1: Pushed
77b61c32e3ac: Pushed
88d4980a47f6: Pushed
56db180aad32: Pushed
711c9715952e: Pushed
f8cad89c2502: Pushed
381d6d079e1d: Pushed
8a4c2b7f231f: Pushed
3a2f92b02f7b: Pushed
04bd31396a7a: Pushed
034332202128: Pushed
f2cb0ecef392: Pushed
latest: digest: sha256:eaf4f3d25964c6d56e4dec4fac93212d2fd315474a8040c00189bae46eded30d size: 3246
Success! If you now look at the Docker Hub site in your web browser and go to
the My Profile
section of your user account, you'll notice that the image has
been stored on Docker Hub in the simple-docker-app
repository owned by you.
Because we've carried out versioning with the tags, we can also add a new tag to a previous version and then push that:
$ docker tag simple-docker-app:v0.0.2 whaleway/simple-docker-app:v0.0.2
$ docker push dockerhub/simple-docker-app:v0.0.2
The push refers to repository [docker.io/dockerhub/simple-docker-app]
d217d415dd8d: Pushed
715dcd4f6062: Pushed
77b61c32e3ac: Layer already exists
88d4980a47f6: Layer already exists
56db180aad32: Layer already exists
711c9715952e: Layer already exists
f8cad89c2502: Layer already exists
381d6d079e1d: Layer already exists
8a4c2b7f231f: Layer already exists
3a2f92b02f7b: Layer already exists
04bd31396a7a: Layer already exists
034332202128: Layer already exists
f2cb0ecef392: Layer already exists
v0.0.2: digest: sha256:c82f1c64bbdd86fdeba44de5393f0af6e9271ac9cbad8a626c320e61b29dbe64 size: 3039
Note the message Layer already exists
. Because the repository we pushed to
already has an image (dockerhub/simple-docker-app:latest
), Docker noted that
our latest push used several of the same layers for its image. Because of this,
it didn't need to push them separately to the registry. This is another good
example of how the use of layers saves space.
We've created new tags for our images here to push them to Docker Hub, but usually you'll build an image with the user (owner name) directly so that tagging doesn't have to take place before pushing, for example:
$ docker build -t dockerhub/simple-docker-app:latest .
If you again look at the profile for your user account and select the
dockerhub/simple-docker-app
repository, this will take you to a page for that
specific repository (you can manage your repository here using
'Manage Repository'). If you select the 'Tags' panel, you'll see that there are
now two versions of the image stored in the registry, latest
and v0.0.2
.
This will also show you the compressed size of the images stored on Docker Hub,
as well as a copy and paste line that will allow you to pull the image
(for example docker pull dockerhub/simple-docker-app:latest
).
docker pull
is the command used to pull images from a remote Docker registry,
allowing any image that has been stored in an accessible registry to be pulled
to a local machine's local registry.
So now you know how to store a built image into a remote registry that anyone
else with access to that registry can either use to pull and run the image, or
use it as a base image for further images (and of course there is no need to
docker pull
a base image used in a FROM
directive instruction in a
Dockerfile as it will occur as part of the build).
This Docker Masterclass covered the basics of Docker, and you should now be able to:
- Install Docker for Linux, Windows or Mac.
- Use the
docker
command line interface. - Create a Dockerfile and build an image.
- Run, create, stop and start a container.
- Understand how to publish ports to enable network traffic in a container.
- Tag Docker images and write Dockerfiles in a way that allows use of caching when building.
- Run a container in detached, command mode and interactive mode.
- Use bind mounts to bind a host directory or file into a container, or create a new Docker volume for storing persistent data.
- Understand the restrictions on processes running in a container, including the users running the processes and how to enable access to restricted functionality.
- Be able to store built images on Docker Hub for use by anyone.