Understanding Docker and Docker Compose

Yuva Loganathan
May 9

In order to understand Docker, we have to go back in time and study the evolution of containers and how we got to where we are!

What is a container?

From the Docker site:

“Containers are a way to package software in a format that can run isolated on a shared operating system.
Unlike VMs, containers do not bundle a full operating system — only libraries and settings required to make the software work are needed.
This makes for efficient, lightweight, self-contained systems and guarantees that software will always run the same, regardless of where it’s deployed.
”

Let’s unpack it a bit

Back in the late nineties, VMware introduced the concept of running multiple operating systems on the same hardware.
In the late 2000s, kernel-level namespaces were introduced, allowing shared global resources such as the network and disk to be isolated per namespace.
In the early 2010s, containerization was born: it took virtualization to the OS level and added shared libs/bins as well.
Because containers share the host’s kernel, we cannot run two containers that depend on different operating systems on the same host unless we are using a VM.
Namespaces are the true magic behind containers.
The principles come from Linux containers, and Docker implemented its own OCI runtime called runc.

Virtual machines are virtualization at the hardware level.
Containers are virtualization at the OS/software level.

Advantages of using containers

Speed

Execution speed: because containers use the underlying host OS, we get speeds close to those of a process running natively on the host OS.
Startup speed: containers can be started in less than a second. They are very modular and can share the underlying libs/bins when needed, along with the host OS.
Operational speed: containers enable faster iterations of an application. There is less overhead in creating a container with new code changes and moving it through the pipeline to production.

Consistency

Build an image once and use it anywhere.
The same image that is used to run the tests is used in production.
This avoids the “works on my machine” problem.
And not just in production: containers help in running tests consistently.
Ever had a scenario where all tests passed on your machine but CI failed them?

Scalability

We can specify exactly how many resources (CPU and memory) a single container can consume.
By understanding the available resources, containers can be packed quite densely to minimize wastage of CPU and memory.
Scale containers within one instance before scaling the instance.
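
To make the packing idea concrete, here is a minimal sketch of first-fit-decreasing placement. It is an illustration only, not how Docker or any particular scheduler places containers: the Container and Host classes, the pack function, and the sample limits are all hypothetical. The limits stand in for the values you would pass to docker run via --cpus and --memory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str
    cpus: float   # CPU limit in cores (hypothetical sample values)
    mem_mb: int   # memory limit in MiB


@dataclass
class Host:
    cpus: float
    mem_mb: int
    placed: List[Container] = field(default_factory=list)

    def can_fit(self, c: Container) -> bool:
        """True if the container's limits fit in what is still free on this host."""
        used_cpu = sum(p.cpus for p in self.placed)
        used_mem = sum(p.mem_mb for p in self.placed)
        return used_cpu + c.cpus <= self.cpus and used_mem + c.mem_mb <= self.mem_mb


def pack(containers: List[Container], host_cpus: float, host_mem_mb: int) -> List[Host]:
    """First-fit decreasing: place the largest containers first and
    open a new host only when no existing host has room."""
    hosts: List[Host] = []
    ordered = sorted(containers, key=lambda c: (c.mem_mb, c.cpus), reverse=True)
    for c in ordered:
        if c.cpus > host_cpus or c.mem_mb > host_mem_mb:
            raise ValueError(f"{c.name} does not fit on an empty host")
        target = next((h for h in hosts if h.can_fit(c)), None)
        if target is None:
            target = Host(host_cpus, host_mem_mb)
            hosts.append(target)
        target.placed.append(c)
    return hosts
```

Calling pack with a list of containers and the size of one host returns the hosts that are needed, which is the "fill one instance before adding another" rule in code form.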

Flexibility

Containers are portable, at least to an extent (as long as the host is running some form of Linux or a Linux VM).
You can move a container from one machine to another very quickly.
If something goes wrong while patching a security hole in the host OS, we simply move the container to a different host and resume service very quickly.

Enter Docker

Docker as a company

In 2013, Docker created the first container platform.
In 2015, Docker helped launch the Open Container Initiative (OCI), a governance structure around container image and runtime specifications.
Docker also donated the first runtime, runc, to the OCI.

Docker runtime/daemon/engine

Docker Engine is built for Linux.
Docker for Mac uses HyperKit to run a lightweight Alpine Linux virtual machine.
Docker teamed up with Microsoft to create a Windows OCI runtime, available in Windows 10 and Windows Server 2016.

Docker CLI

Docker CLI commands look very similar to git commands, and many of them share the same meaning as well.
git pull will get source from origin to local.
docker pull <image> will get the Docker image from a remote registry to local.
Docker follows a client-server model, so the CLI can connect to a local Docker server or a remote one.

Docker Images

An image is a read-only template with instructions for creating a Docker container.
Often, an image is based on another image, with some additional customization.
For example, we may build an image which is based on the ubuntu image, but installs the Apache web server and your application, as well as the configuration details needed to make your application run.
We need a Dockerfile to create an image.
Let’s look at an example of a Python Flask application run using gunicorn.

    … /code
    COPY Pipfile Pipfile.lock /code/
    RUN apt-get update
    RUN apt-get install postgresql postgresql-client --yes && \
        apt-get -qy install netcat && \
        pip install --upgrade pip setuptools wheel && \
        pip install --upgrade pipenv && \
        pipenv install --dev --system --ignore-pipfile
    CMD ["/usr/local/bin/gunicorn", "--config", "wsgi.…

Images are a collection of immutable layers.
Each instruction in the Dockerfile above creates a layer in the image.
When we change the Dockerfile and rebuild the image, only the layer for the changed instruction and the layers after it are rebuilt; the layers before it are reused from the cache.
This is part of what makes images so lightweight, small, and fast, when compared to other virtualization technologies.
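
The rebuild behaviour can be sketched with a toy model in which each layer's ID depends on its instruction and on its parent layer. This is a simplification and not Docker's actual implementation (for COPY and ADD, the real build cache also checksums the copied files); the function names and sample instructions are hypothetical.

```python
import hashlib
from typing import List


def layer_ids(instructions: List[str]) -> List[str]:
    """Derive one ID per instruction from its text plus the parent layer's ID."""
    ids: List[str] = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()[:12]
        ids.append(digest)
        parent = digest  # the next layer is chained to this one
    return ids


def layers_to_rebuild(old: List[str], new: List[str]) -> List[str]:
    """Return the instructions in `new` whose layers cannot be reused from `old`."""
    cached = set(layer_ids(old))
    return [ins for ins, lid in zip(new, layer_ids(new)) if lid not in cached]
```

Because every ID is chained to its parent, editing one instruction changes the IDs of all later layers too. That is also why slow, rarely changing steps (such as installing dependencies) are usually placed before steps that change often.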
Images can also be built on top of other images.
The first line in the Dockerfile is FROM which specifies the image that the current image is being built from.
Let’s look at the Dockerfile that is used to create the python image.
This image is built from buildpack-deps:stretch which provides all the basic tools to support any language.

    FROM buildpack-deps:stretch

    # ensure local python is preferred over distribution python
    ENV PATH /usr/local/bin:$PATH

    # http://bugs.…org/issue19846
    # > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
    ENV LANG C.UTF-8

    # extra dependencies (over what buildpack-deps already includes)
    RUN apt-get update && apt-get install -y --no-install-recommends \
            tk-dev \
            uuid-dev \
        && rm -rf /var/lib/apt/lists/*

    ENV GPG_KEY 0D96DF4D4110E5C43FBFB17F2D347EA6AA65421D
    ENV PYTHON_VERSION 3.…3
    …

buildpack-deps:stretch is built from buildpack-deps:stretch-scm, which is built from buildpack-deps:stretch-curl, which is built from debian:stretch, which is built from scratch.
If I had 1000 Dockerfiles that are all built from python:3.…3-stretch, the shared layers are downloaded only once, not 1000 times.
The same goes for containers: when we run a Python container 1000 times, Python is installed only once, in the image, and reused by every container.
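
A minimal sketch of why shared layers are fetched only once: the local store is keyed by layer, so a pull only downloads what is missing. The pull function and the layer labels are hypothetical stand-ins (real layers are identified by content digests), so treat this as a model rather than Docker's code.

```python
from typing import List, Set


def pull(image_layers: List[str], store: Set[str]) -> List[str]:
    """Add an image's layers to the local store and return
    only the layers that actually had to be downloaded."""
    downloaded = [layer for layer in image_layers if layer not in store]
    store.update(downloaded)
    return downloaded


# Hypothetical labels standing in for layer digests.
BASE_LAYERS = [
    "debian:stretch",
    "buildpack-deps:stretch-curl",
    "buildpack-deps:stretch-scm",
    "buildpack-deps:stretch",
    "python",
]
```

Pulling a first application image that adds one layer on top of BASE_LAYERS downloads six layers; pulling a second application built on the same base downloads only its own top layer.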

Docker registry

A registry is a place to store images.
When we install Docker, we get a local image store (in effect a local registry) where all the images we create are stored.
Try docker images to list all the images currently in your local registry.
Docker Hub is a public registry that has over 100,000 images.
That would be the first place to look for pre-built images that we could use directly or use as a base image to build on.
We can move the images between local and remote registries using docker push and docker pull commands.
The default remote registry is docker hub unless we specify explicitly.
At Peaksware, we use Amazon ECR to store our production docker images.
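
How an image name maps to a registry can be sketched as below. This is a simplified model of the name expansion the Docker CLI performs, not its actual code: normalize_image_name is a hypothetical helper, and digest references are handled only to the extent that no tag is appended to them.

```python
def normalize_image_name(name: str) -> str:
    """Expand a short image name into registry/repository:tag form (simplified)."""
    first, sep, rest = name.partition("/")
    # The first component is a registry host only if it looks like one:
    # it contains a dot or a port, or it is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name  # Docker Hub is the default
    # Official Docker Hub images live under the "library" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path
    # Append the default tag when neither a tag nor a digest is given.
    last = path.rsplit("/", 1)[-1]
    if ":" not in last and "@" not in last:
        path += ":latest"
    return f"{registry}/{path}"
```

So docker pull postgres:10.1 goes to Docker Hub, while a name that starts with a registry host (for example a private registry address) is pulled from that host instead.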

Docker Compose

Compose was introduced so we do not have to build and start every container manually.
Compose is a tool for defining and running multi-container Docker applications.
Compose was initially created for development and testing purposes.
With recent releases, Docker also allows a Compose YAML file to be used to deploy to a Docker swarm.
A microservice that our team works on needs the right Postgres database with all the migrations, plus Python and all the related libraries, set up in order to run the service locally on a developer machine.
The service was fairly new, the backend kept evolving, and we needed a quick way to spin up everything required to get the service running for the front-end developers who depend on it.
Docker Compose came in handy. The compose file below spins up two containers:

A Python container that installs all dependencies and starts the web server
A Postgres database

We can specify the dependencies using links.
The web service container is started only after the db container has been started, and it then executes the entry command bash -c "flask migrate && flask run -p 5000 -h 0.…0", which runs the migration and starts the server.
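
Start order is not the same as readiness: the db container can be running while Postgres is still initializing and not yet accepting connections. One common workaround is to poll the database port before running the migration. The helper below is a minimal sketch under assumptions: wait_for_port is a hypothetical name, and the host db and port 5432 (the Postgres default) are assumed rather than taken from the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds,
    or False if it still fails when `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Connection refused and DNS failures are both OSError subclasses.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A startup script could call wait_for_port("db", 5432) and only run the migration if it returns True. An open port is only an approximation of "the database is ready", so a retry around the first real query is still worthwhile.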
There is no need to install Python, Flask, or Postgres to run the service locally.
Instead, the developer runs docker-compose -f docker-compose.yml up and waits for the API to be available at localhost:5000.

    version: '3.7'
    services:
      web:
        build:
          context: .
          …Dockerfile
        environment:
          - DB_URI=postgres://postgres:postgres@db/idea_box
        command: bash -c "flask migrate && flask run -p 5000 -h 0.…0"
        ports:
          - 5000:5000
        links:
          - db
        volumes:
          - ./:/code
      db:
        image: postgres:10.1
        environment:
          POSTGRES_DB: idea_box

This works great when your service is that simple.
In reality, we had to add a queue and a lambda function that processes the queue and sends messages to a different service.
Unfortunately, I have not figured out how to run AWS services in Docker.
I found localstack, which emulates AWS services; for now we only needed the queue service.
localstack worked great and spun up an SQS instance locally. We had to create the queue using a shell script and call that script in the entry point of the localstack container.
This still does not represent the complete service.
Now, I need a local lambda function that would read from the queue and push the message to another service.
This is where I found that the benefits of Docker Compose outweigh the effort it takes to set it up, at least for my team’s situation.

    version: '3.7'
    services:
      web:
        build:
          context: .
          …Dockerfile
        environment:
          - DB_URI=postgres://postgres:postgres@db/idea_box
          - AWS_ACCESS_KEY_ID=foo
          - AWS_SECRET_ACCESS_KEY=bar
          - AWS_ENDPOINT=http://aws:4576
        command: bash -c "flask migrate && flask run -p 5000 -h 0.…0"
        ports:
          - 5000:5000
        links:
          - db
          - aws
        volumes:
          - ./:/code
      db:
        image: postgres:10.1
        environment:
          POSTGRES_DB: idea_box
      aws:
        image: localstack/localstack
        ports:
          - 4576:4576
          - 8080:8080
        environment:
          - SERVICES=sqs
          - DEBUG=True
        volumes:
          - .…d

Even with some of the complexities involved in using Docker Compose for a real service, I would recommend experimenting with it to see if it works for your team.
I would love to hear how you/your team use docker for development and testing.

Developing with Docker

No painful developer machine setup: with Compose, anyone can spin up a service pretty quickly without having to install dependencies that they will never otherwise use.
Consistent outcome: builds are reproducible, reliable, and tested to function exactly as expected in production.
Speed up testing: we have tests that need a test database, and we have a tear-down after each test or group of tests to clean up the database. I am working on ways to run tests in parallel, with the database running in multiple containers, for our project.
Code reviews can be painless: each dev can attach an image to their code review, which the reviewer can quickly spin up to test a different version of the code without interrupting what they are doing.
Quick fixes can be quick: when developers find bugs, they can fix them in the development environment and redeploy them to the test environment for testing and validation. When testing is complete, getting the fix to the customer is as simple as pushing the updated image to the production environment.