Docker-Datacamp-chapter3
Docker images
INTRODUCTION TO DOCKER
Tim Sangster
Software Engineer @ DataCamp
Creating images with Dockerfiles
Starting a Dockerfile
A Dockerfile always starts from another image, specified using the FROM instruction. An optional tag after the colon pins a specific version.
FROM postgres
FROM ubuntu
FROM hello-world
FROM my-custom-data-pipeline
FROM postgres:15.0
FROM ubuntu:22.04
FROM hello-world:latest
FROM my-custom-data-pipeline:v1
Building a Dockerfile
Building a Dockerfile creates an image.
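As a minimal sketch of the build step, the Dockerfile below could be saved as `Dockerfile` in an otherwise empty folder. The pinned tag is taken from the FROM examples above; the build command in the comments assumes the Docker CLI is installed and is run from the folder that contains the file.

```dockerfile
# Minimal Dockerfile: start from a pinned base image.
FROM ubuntu:22.04

# Build from the folder containing this file
# (the trailing dot is the build context):
#   docker build .
```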
Naming our image
In practice we almost always give our images a name using the -t flag:
docker build -t first_image .
...
=> => writing image sha256:a67f41b1d127160a7647b6709b3789b1e954710d96df39ccaa21..
=> => naming to docker.io/library/first_image
Customizing images
RUN <valid-shell-command>
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python3
Without the -y flag, apt-get stops and waits for input that cannot be given during a build:
...
After this operation, 22.8 MB of additional disk space will be used.
Do you want to continue? [Y/n]
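The instructions above can be combined into one Dockerfile. The sketch below is one common arrangement rather than the only one: chaining `apt-get update` and `apt-get install` in a single RUN is an assumption added here, so that the package index and the install always come from the same step.

```dockerfile
FROM ubuntu:22.04

# Refresh the package index and install Python in one step.
# -y answers "yes" to the "Do you want to continue? [Y/n]" prompt,
# which nobody can answer while the image is being built.
RUN apt-get update && apt-get install -y python3
```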
Building a non-trivial Dockerfile
When building an image, Docker actually runs the commands that follow RUN.
Docker running RUN apt-get update takes the same amount of time as running it ourselves!
Summary
Usage | Dockerfile Instruction
Start a Dockerfile from an image | FROM <image-name>
Add a shell command to image | RUN <valid-shell-command>
Make sure no user input is needed for the shell-command. | RUN apt-get install -y python3
Managing files in your image
COPYing files into an image
The COPY instruction copies files from our local machine into the image we're building:
If the destination path does not have a filename, the original filename is used:
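Both forms of the destination path can be sketched as follows. The source path `pipeline_v3/pipeline.py` is assumed to exist relative to the folder the build is run from, and `/app/` is an arbitrary destination chosen for illustration.

```dockerfile
FROM ubuntu:22.04

# Destination names the file explicitly.
COPY pipeline_v3/pipeline.py /app/pipeline.py

# Destination is a folder (note the trailing slash): the file keeps
# its original name, so it also ends up at /app/pipeline.py.
COPY pipeline_v3/pipeline.py /app/
```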
COPYing folders
If the src-path names a folder rather than a file, all of the folder's contents are copied.
/projects/
    pipeline_v3/
        pipeline.py
        requirements.txt
        tests/
            test_pipeline.py
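For the folder layout above, a folder copy might look like the sketch below. It assumes the build is run from `/projects/`, and `/app/` is an arbitrary destination chosen for illustration.

```dockerfile
FROM ubuntu:22.04

# Copies everything inside pipeline_v3/ (pipeline.py, requirements.txt, tests/)
# into /app/ in the image. The pipeline_v3 folder itself is not recreated,
# only its contents are copied.
COPY pipeline_v3/ /app/
```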
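Copy files from a parent directory
We can't copy files from a parent of the directory where we build the Dockerfile. When building in /projects/, init.py cannot be copied into the image:
/init.py
/projects/
    Dockerfile
    pipeline_v3/
        pipeline.py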
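The restriction, and one possible way around it, can be sketched as below. The workaround (building with the parent directory as the build context and selecting the Dockerfile with `-f`) is an assumption added here, not something the slides prescribe.

```dockerfile
FROM ubuntu:22.04

# Built from /projects/ with "docker build .", the following instruction
# would fail, because ../init.py lies outside the build context:
#   COPY ../init.py /

# Possible workaround: run the build from the parent directory instead,
# pointing at this Dockerfile with -f:
#   docker build -f projects/Dockerfile .
# Source paths are then relative to that parent directory.
COPY init.py /
COPY projects/pipeline_v3/pipeline.py /app/pipeline.py
```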
Downloading files
Instead of copying files from a local directory, files are often downloaded during the image build.
Typical steps: download a file, unzip it, then remove the original zip file:
RUN rm <copy_directory>/<filename>.zip
Downloading files efficiently
Each instruction that downloads files adds to the total size of the image, even if the files are deleted by a later instruction.
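A minimal sketch of downloading without growing the image is shown below. The URL is a placeholder, `/tmp` and `/data` are arbitrary paths, and installing `curl` and `unzip` first is an assumption, since the base image is not expected to ship them.

```dockerfile
FROM ubuntu:22.04

# curl and unzip are not assumed to be present in the base image.
RUN apt-get update && apt-get install -y curl unzip

# Download, unzip, and delete the archive in ONE instruction, so the
# zip file is already gone when the instruction's changes are saved.
# https://example.com/example_folder.zip is a placeholder URL.
RUN curl -L https://example.com/example_folder.zip -o /tmp/example_folder.zip \
    && unzip /tmp/example_folder.zip -d /data \
    && rm /tmp/example_folder.zip
```

Splitting the three commands over three RUN instructions would produce the same final file system, but the downloaded archive would still count toward the image size.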
Summary
Usage | Dockerfile Instruction
Copy files from host to the image | COPY <src-path-on-host> <dest-path-on-image>
Copy a folder from host to the image | COPY <src-folder> <dest-folder>
We can't copy from a parent directory where we build a Dockerfile | COPY ../<file-in-parent-directory> /
Keep images small by downloading, unzipping, and cleaning up in a single RUN instruction.
Choosing a start command for your Docker image
What is a start command?
The hello-world image prints text and then stops.
What is a start command?
An image with Python installed could start Python on startup.
....
>>> exit()
repl@host:/#
Running a shell command at startup
CMD <shell-command>
Typical usage
Starting an application that runs a workflow or accepts outside connections.
CMD postgres
CMD start.sh
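As a sketch of the workflow case, the Dockerfile below starts a script whenever a container is launched. `pipeline.py` is a hypothetical script assumed to sit next to the Dockerfile, and `/app/` is an arbitrary destination.

```dockerfile
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3

# pipeline.py is a hypothetical script next to this Dockerfile.
COPY pipeline.py /app/pipeline.py

# Runs each time a container is started from the image, not during the build.
# If a Dockerfile contains several CMD instructions, only the last one takes effect.
CMD python3 /app/pipeline.py
```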
When will it stop?
A container stops when the process started by its start command exits.
Overriding the default start command
Starting an image
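Overriding can be sketched with a small image whose default start command is a Python prompt. The image name `python_sandbox` in the comments is made up for illustration; the commands assume the image was built under that name.

```dockerfile
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3

# Default start command: open a Python prompt.
CMD python3

# Assuming "docker build -t python_sandbox ." was used to build the image:
#   docker run -it python_sandbox         starts python3, the default
#   docker run -it python_sandbox bash    starts bash instead of the default
```

Anything written after the image name in `docker run` replaces the CMD for that one container; the image itself is unchanged.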
Summary
Usage | Dockerfile Instruction
Add a shell command run when a container is started from the image. | CMD <shell-command>
Introduction to Docker layers and caching
Docker build
Downloading and unzipping a file using Dockerfile instructions changes the image file system, adding:
/example_folder.zip
/example_folder/
    example_file1
    example_file2
Docker instructions are linked to file system changes
Each instruction in the Dockerfile is linked to the changes it made in the image file system.
FROM docker.io/library/ubuntu
=> Gives us a file system to start from with all files needed to run Ubuntu
Docker layers
Docker layer: All changes caused by a single Dockerfile instruction.
Docker image: All layers created during a build.
--> Docker image: All changes to the file system by all Dockerfile instructions.
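The definitions above can be sketched by annotating a small Dockerfile with the file system changes each instruction contributes. The image name `layers_demo` in the comments is made up for illustration.

```dockerfile
# Layer: the base file system with everything needed to run Ubuntu.
FROM ubuntu:22.04

# Layer: the refreshed package index files.
RUN apt-get update

# Layer: the files installed for python3 and its dependencies.
RUN apt-get install -y python3

# After building with a name, e.g. "docker build -t layers_demo .",
# the instruction-by-instruction history of the image can be listed with:
#   docker history layers_demo
```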
Docker caching
Consecutive builds are much faster because Docker re-uses layers that haven't changed.
Re-running a build:
Understanding Docker caching
Knowing when layers are cached helps us understand why images sometimes don't change after a rebuild.
Docker will use cached layers because the instructions are identical to previous builds.
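A sketch of the situation: the instructions below read the same on every build, so a rebuild may reuse their cached layers even when newer packages have been published in the meantime. The `--no-cache` option mentioned in the comments is one way to force a full rebuild.

```dockerfile
FROM ubuntu:22.04

# On a rebuild the text of these instructions is identical, so Docker may
# reuse the cached layers instead of contacting the package servers again.
RUN apt-get update
RUN apt-get install -y python3

# To ignore the cache and re-run every instruction:
#   docker build --no-cache .
```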
Understanding Docker caching
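Understanding caching helps us write Dockerfiles that build faster, because not all layers need to be rebuilt.
In the following Dockerfile, every instruction from COPY onward needs to be re-run if the pipeline.py file changes, because a changed layer invalidates the cache for all instructions after it: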
FROM ubuntu
COPY /app/pipeline.py /app/pipeline.py
RUN apt-get update
RUN apt-get install -y python3
Understanding Docker caching
Understanding caching helps us write Dockerfiles that build faster, because not all layers need to be rebuilt.
In the following Dockerfile, only the COPY instruction needs to be re-run if the pipeline.py file changes:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python3
COPY /app/pipeline.py /app/pipeline.py
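The same ordering idea can be extended to more than one file: instructions whose inputs change rarely go first, and the most frequently edited files are copied last. The sketch below assumes hypothetical `requirements.txt` and `pipeline.py` files next to the Dockerfile, and adds `python3-pip` as an assumption so that the requirements can be installed.

```dockerfile
FROM ubuntu:22.04

# Rarely changes: stays cached across most rebuilds.
RUN apt-get update && apt-get install -y python3 python3-pip

# Changes occasionally: these two steps are re-run only when
# requirements.txt changes.
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r /app/requirements.txt

# Changes often: kept last, so editing pipeline.py invalidates
# none of the layers above.
COPY pipeline.py /app/pipeline.py
```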