Docker 101

Hello everyone and welcome to a new chapter in my learning journey.

Today I present you with a new chapter on Docker. This has been on the back burner for a while and I am excited to finally be tackling it.

Let's start with the problem that gave birth to Docker.

"It works on my computer"

In the world of software development there are three major sources of headaches: Development, Distribution and Execution. From the Docker website: "Developing apps today requires so much more than writing code. Multiple languages, frameworks, architectures, and discontinuous interfaces between tools for each lifecycle stage creates enormous complexity".  

As applications and work teams grow in size and complexity, it becomes very difficult to keep track of dependencies and maintain consistency. Likewise, when we send our final product to clients or teammates, they might find that the app doesn't run on their computer, despite running on ours. Why? Again, dependencies, versioning problems, etc.

The concept of containers was born to solve these issues, similarly to how Python environments can be shared as .yml files so that our teammates get all the libraries (and versions) that we used. With containers we can solve:

  1. Dependency Management: by packaging applications and their dependencies into containers, we ensure that they run consistently across different environments without conflicts.
  2. Consistency Across Environments: development, testing, staging, and production. With a consistent environment for running applications, regardless of the underlying infrastructure, operating system, or dependencies, we ensure that applications behave the same way in every environment.
  3. Isolation and Security: containers provide isolation for running applications, ensuring that they do not interfere with each other or with the host system, minimising the impact of potential security vulnerabilities.
  4. Portability: containers are portable, allowing developers to build applications once and deploy them anywhere, whether it's on a developer's laptop, in a data center, or in the cloud.

Dockerfiles, Images and Containers


According to their website, Docker Engine is an "open source containerization technology for building and containerizing your applications", which is very well put but can be summarised as: "Docker is a way to package software so it can run on any hardware", and we package that software using containers.

Dockerfiles

A Dockerfile (DF) is a text file that contains instructions for building a Docker image, including specifying the base image, copying files into the image, setting environment variables, and running commands. Dockerfiles use a simple syntax to describe the image's configuration, making it easy to automate the build process and ensure consistency across different environments.

The command docker build tells Docker to build the image by following the instructions inside the DF. This way, any programmer who has access to the DF can use it to rebuild the environment.

Docker Images

A Docker Image (DI) is a read-only (immutable), lightweight, standalone, executable template that contains everything needed to run a container, including the application code, runtime, libraries, dependencies, and other files. Images can be thought of as snapshots of a Dockerfile at a specific point in time. Docker images are portable and can be shared and distributed via container registries, such as Docker Hub or private registries, so that any developer can pull them onto their machine and create a container to run the software.
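As a quick illustration of how images move around, this is roughly what pulling a public image from Docker Hub and checking it locally looks like (node:18-alpine is just an example tag, not something this post depends on):

    # download an image from Docker Hub (the default registry)
    $docker pull node:18-alpine

    # list the images stored on this machine, with their tags and IDs
    $docker images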

Docker Container

A Docker Container (DC) is a running instance of the image used to create it. We can have multiple containers made from the same image, running simultaneously on the same or different machines. A DC is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.

Containers provide a consistent and isolated environment for running applications, allowing them to run reliably across different computing environments, from development to production. They are based on operating system-level virtualization, which means they share the kernel of the host operating system but have their own isolated filesystem, processes, and networking. This isolation ensures that containers do not interfere with each other and provides security and consistency for running applications.

Summary

  • A Dockerfile is a blueprint for building a Docker Image
  • A Docker Image is a template for running Docker Containers
  • A Docker Container is a running instance of an image


Getting started

We first download Docker Desktop and sign in, and install the Docker extension in Visual Studio Code:



Once Docker Desktop (DD) is installed, we can start using Docker commands in our computer's command-line interface (terminal):

  1. $docker help: shows the complete list of Docker commands
  2. $docker ps: gives a list of all running containers on your machine. It is worth noting that each container has a unique ID and is linked to a Docker Image (see the example below)
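For example, with one container running, the output of $docker ps might look roughly like this (the ID, image tag and name below are made up for illustration):

    $docker ps
    CONTAINER ID   IMAGE         COMMAND                  CREATED         STATUS         PORTS                    NAMES
    3f2a1b4c5d6e   example_tag   "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes   0.0.0.0:5000->8080/tcp   eager_morse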


First Dockerfile

Some of the most common instructions in a Dockerfile include (Check out Writing a Dockerfile and Dockerfile reference for more):
  • FROM <image> - this specifies the base image that the build will extend.
    • A base image in Docker is the starting point for building a Docker image. It is the foundational layer upon which you build your custom image. It can be a minimal OS image, like Ubuntu or Alpine, or it can be an image that contains a specific runtime environment, like Node.js, Python, or Java.
  • WORKDIR <path> - this instruction specifies the "working directory" or the path in the image where files will be copied and commands will be executed.
  • COPY <host-path> <image-path> - this instruction tells the builder to copy files from the host and put them into the container image.
    • We can use a .dockerignore file to specify folder or files that we want Docker to ignore in this step (similar to .gitignore)
  • RUN <command> - this instruction tells the builder to run the specified command. The command has the same syntax as if we were opening a terminal session and typing a command.
    • The commands can be written in shell form or exec form
  • ENV <name> <value> - this instruction sets an environment variable that a running container will use.
  • EXPOSE <port-number> - this instruction sets configuration on the image that indicates a port the image would like to expose.
  • USER <user-or-uid> - this instruction sets the default user for all subsequent instructions.
  • CMD ["<command>", "<arg1>"] - this instruction sets the default command a container using this image will run.
    • Note: there can only be one CMD instruction per Dockerfile (if several are listed, only the last one takes effect)
    • The command is written in exec form (the preferred form in Docker, because it doesn't start a shell session)

We don't have to use every instruction above every time. A DF typically follows these steps:
  1. Determine your base image
  2. Install application dependencies (*)
  3. Copy in any relevant source code and/or binaries
  4. Configure the final image
(*) Please note that in the DF, every instruction creates its own step or layer. To be efficient, Docker caches layers whose inputs haven't changed. This means it is beneficial to install dependencies first (so that layer can be cached) and copy in the source code afterwards. This way, if we change the source code (and the dependencies don't change), we won't have to reinstall the dependencies, because that layer comes from the cache.

To learn, I followed Fireship's tutorial, so I will be using his sample application as the app to put in a container.

The Node.js app is fairly simple: once the server is running on localhost, an API endpoint returns the message "Docker is easy 🐳". To "dockerize" this app, we create a Dockerfile in the root folder of the project:
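I won't reproduce the exact file from the tutorial here, but a Dockerfile for a small Node.js app like this one typically looks something like the sketch below (the Node version, the port and the plain npm install are my assumptions, not necessarily what the tutorial uses):

    # 1. base image: an official Node.js image (version is an example)
    FROM node:18-alpine

    # set the working directory inside the image
    WORKDIR /app

    # 2. copy only the dependency manifests first, so this layer can be cached
    COPY package*.json ./
    RUN npm install

    # 3. copy the rest of the source code (changes here won't invalidate the dependency layer)
    COPY . .

    # 4. configure the final image: environment, exposed port and default command (exec form)
    ENV PORT=8080
    EXPOSE 8080
    CMD ["npm", "start"]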



Build and Run a Docker Image

Now that we have our Dockerfile ready, we build an image from it using the terminal command $docker build:
  • $docker build <options> <path to the build context (the folder containing the Dockerfile)>
    • With the option -t <tag_name> we can define an alias for the image so that we can identify it later
Once it is built, an image can be used as the base image for other images (in which case we would need to push it to a container registry such as Docker Hub, Amazon ECR, or Google Artifact Registry with the command $docker push) or to run containers.

Importantly, at the end of the image building process, we will be given the Image ID that uniquely identifies that DI (don't confuse the Image ID with the tag)

To run the image, we type the terminal command $docker run <image_id or tag_name>. This command creates a running process called a container.
  • Note: to implement port forwarding from a Docker container to our local machine we need to use the option -p <port_number on local machine>:<port_number in container>
    • Example: $docker run -p 5000:8080 example_tag
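Putting build and run together, a typical session for an app like the one above might look like this (example_tag and the port numbers are just illustrative):

    # build an image from the Dockerfile in the current directory and tag it
    $docker build -t example_tag .

    # run a container from that image, forwarding local port 5000 to port 8080 inside the container
    $docker run -p 5000:8080 example_tag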

Closing Containers

Docker containers will continue to run in the background (even if we close the terminal window) unless we explicitly stop them. We can stop them (one or more at a time) either through the GUI or by running the terminal command $docker stop [OPTIONS] CONTAINER [CONTAINER...].
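For example, to stop a specific container we can look up its ID first (the ID below is made up):

    # find the ID or name of the running container
    $docker ps

    # stop it gracefully (Docker sends SIGTERM, then SIGKILL after a grace period)
    $docker stop 3f2a1b4c5d6e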

Be careful! Containers are ephemeral: once a container is removed, any state or data created inside it is permanently lost (unless it was written to a volume, as discussed next).

Docker Volumes

While this ephemeral nature of containers is great, it poses a challenge when you want to persist the data. For example, if you restart a database container, you might not want to start with an empty database. So, how do you persist files?

This is what Docker Volumes are for. A volume is a dedicated folder on the host machine that holds the persistent files we want to keep, so they can be remounted into future containers or accessed by other containers running in parallel.
  • To create a volume: $docker volume create <volume_name>
When mounting/attaching a volume to a Docker container, you can use either the --mount or --volume (-v) option in the $docker run command. Both options achieve the same result, but they have different syntax and offer different features.
  • --volume option: this is the older and more established way to mount volumes in Docker. It has a simpler syntax but fewer features compared to --mount.
    • syntax: $docker run -v [host-path:]container-path[:options] image
    • example: $docker run -v /my/host/path:/container/path:ro my-image
  • --mount option: it is the newer, more flexible way to mount volumes. It provides a more expressive and clear syntax and supports more advanced features. Note that without an explicit type, the source is interpreted as a volume name; to mount a host directory you add type=bind.
    • syntax: $docker run --mount source=<volume-name>,target=<container-path> my-image
    • example: $docker run --mount type=bind,source=/my/host/path,target=/container/path my-image
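As a concrete sketch, here is how creating a named volume and attaching it with each option might look (my-volume, my-image and the container path are placeholder names):

    # create a named volume managed by Docker
    $docker volume create my-volume

    # attach it with the -v shorthand: <volume-name>:<container-path>
    $docker run -v my-volume:/container/path my-image

    # attach it with --mount; without an explicit type, source is treated as a volume name
    $docker run --mount source=my-volume,target=/container/path my-image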

Managing Volumes

Volumes have their own lifecycle beyond that of containers and can grow quite large depending on the type of data and applications you’re using. The following commands will be helpful to manage volumes:
  • $docker volume ls : list all volumes
  • $docker volume rm <volume-name-or-id> : remove a volume (only works when the volume is not attached to any containers)
  • $docker volume prune : remove all unused (unattached) volumes
See more here: Docker docs

Debugging containers

You can access the logs of containers with: $docker logs [OPTIONS] CONTAINER

You can run commands in running containers with: $docker exec [OPTIONS] CONTAINER COMMAND [ARG...]
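For instance, two common debugging moves are following a container's logs and opening a shell inside it (my-container is a placeholder name):

    # stream the logs of a container (-f follows new output as it arrives)
    $docker logs -f my-container

    # open an interactive shell inside a running container
    $docker exec -it my-container sh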

Docker Compose

Everything seems very clear and straightforward now, but as we build and build, things can get out of hand. A tip to keep things healthy is to build containers that only run one process. One best practice for containers is that each container should do one thing and do it well. While there are exceptions to this rule, avoid the tendency to have one container do multiple things.

Docker Compose defines your entire multi-container application in a single YAML file called compose.yaml. This file specifies configurations for all the containers that you need, their dependencies, environment variables, and even volumes and networks.

This way, we no longer need to type a $docker run command for each container; we just need to launch everything from the compose.yaml file:
  1. This centralizes configuration and simplifies management. It's easy to set and manage environment variables. 
  2. You can run containers in a specific order and manage network connections easily.
  3. You can simply scale individual services up or down within the multi-container setup. This allows for efficient allocation based on real-time needs.
  4. You can implement persistent volumes with ease.
See more in Docker docs

The Compose File

The default path for a Compose file is compose.yaml (preferred) or compose.yml that is placed in the working directory.

To learn how to write compose.yaml files go to: Docker Docs. However, let's go over the basics. A compose file can contain:
  • Version (obsolete) and name top-level element
  • Services top-level element:
    • A Compose file must declare a services top-level element as a map whose keys are string representations of service names. Each service is a container. A service definition contains the configuration that is applied to each service container.
    • Each service may also include a build section, which defines how to create the Docker image for the service.
  • Networks top-level element (optional)
    • Networks are the layer that allow services to communicate with each other.
  • Volumes top-level element (optional)
    • The top-level volumes declaration lets you configure named volumes that can be reused across multiple services. To use a volume across multiple services, you must explicitly grant each service access by using the volumes attribute within the services top-level element.
  • Configs top-level element (optional)
  • Secrets top-level element (optional)
Example:
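The screenshot from my notes isn't reproduced here, but a minimal compose.yaml for a web app plus a database might look roughly like this (the service names, image tag, credentials and paths are all placeholders):

    services:
      web:
        build: .                      # build the image from the Dockerfile in this folder
        ports:
          - "8080:8080"               # forward local port 8080 to the container
      db:
        image: mysql:8                # use an official database image from Docker Hub
        environment:
          MYSQL_ROOT_PASSWORD: password
        volumes:
          - db-data:/var/lib/mysql    # persist database files in a named volume

    volumes:
      db-data:                        # named volume declared at the top level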




Running the Docker Compose

From the working directory, run $docker-compose up to start all the containers with a single command.

Use $docker-compose down to shut everything down.
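A typical session from the directory containing compose.yaml might look like this (-d runs the services in the background):

    # build (if needed) and start all services defined in compose.yaml
    $docker-compose up -d

    # stop and remove the containers and networks created by up
    $docker-compose down

On newer Docker installations the same commands are also available as docker compose up and docker compose down (with a space, via the Compose plugin).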


Related concepts:

  • Kubernetes
  • Swarm
  • Docker versus Virtual Machines

Thank you to:



