Docker for Python Development?
Chinmay Shah
Feb 17

Part 1 covered what Docker is.
In this article, we'll talk about how to start using Docker for Python development.
A standard Python installation involves setting up environment variables, and if you're dealing with different versions of Python, there are tons of environment variables to deal with, be it on Windows or Linux.
And if you're dealing with production, where you're very particular about the Python version, you need to make sure your program is tested on that version and that there are no additional or missing dependencies which you might have taken for granted on your local system.
Dealing with Python 2 and 3 simultaneously is still manageable, but imagine you have Python 3.6 installed and need to try out a different Python 3 release.
Docker comes to the rescue!
Changing the Python version is as simple as editing the FROM line of your Dockerfile: replace FROM python:3.7 with the tag of the version you need.
Docker not only allows quick experimentation but also helps you pin the particular package versions required to maintain stability.
Think of it as a virtual environment in Anaconda.
But when you have Anaconda configured, most packages are already pre-installed on the system, and when you try to deploy your code, you might just take them for granted.
Additionally, Anaconda for Python 3.7 takes up more than 3GB of disk space.
A Docker image, on the other hand, hardly takes a gigabyte.
Sizes of Python Full and Alpine Images

Prepping the Dockerfile

Decide which Python version you need. If you're just looking for the latest version, write python:latest.
Images are usually named image_name:tag, which you provide when you build an image using docker build -t image_name:tag . (the trailing dot is the build context, here the current directory).
If you don't provide a tag during the build, Docker automatically assigns latest.
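The naming rule above can be sketched in a few lines of Python. This is only an illustration of the image_name:tag convention, not Docker's own parser; the function name is made up for this example, and digests (name@sha256:...) are deliberately left out.

```python
def split_image_reference(reference):
    """Split 'name:tag' into (name, tag), defaulting the tag to 'latest'.

    Simplified on purpose: digests such as name@sha256:... are not handled.
    """
    last_slash = reference.rfind("/")
    last_colon = reference.rfind(":")
    # A colon before the last slash belongs to a registry port
    # (for example localhost:5000/app), not to a tag.
    if last_colon > last_slash:
        return reference[:last_colon], reference[last_colon + 1:]
    return reference, "latest"


for ref in ("python:3.7", "python", "localhost:5000/app"):
    print(ref, "->", split_image_reference(ref))
```

The second and third references carry no tag, so both fall back to latest, which is what happens when you leave the tag out of docker build -t.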
On Docker Hub, Python offers different images to choose from, ranging from an Alpine version, which is a very minimal image, to a full version.

    # Python 3.6
    # author of file
    LABEL maintainer="Chinmay Shah <chinmayshah3899@gmail.com>"

RUN: This instruction executes any commands in a new layer on top of the current image and commits the results.
The resulting committed image is used for the next step in the Dockerfile. Commands are usually written in shell form, unless specified otherwise.

    # Packages that we need
    COPY requirement.txt /app/
    WORKDIR /app
    # instruction to be run during image build
    RUN pip install -r requirement.txt
    # Copy all the files from current source directory (from your system) to
    # Docker container in /app directory
    COPY . /app

You might notice that we're copying requirement.txt separately, before the rest of the files.
It's for build optimization.
Each instruction in a Dockerfile is executed as a separate step and creates a new layer, built in an intermediate container.
Rebuilding every step from scratch on every build command would be highly inefficient.
To deal with that, Docker uses something known as layer caching, which is a fancy way of saying that rather than re-creating a layer each time, it stores the result of each step and re-uses it in the next build.
So unless you change a step, or anything before it, the build process can re-use the old result.
The idea is that the requirements will hardly change, but the code will change often, so relying on the cache is better than downloading the required packages each time the image is built.
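To make the caching argument concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not Docker's actual implementation: each step gets a key derived from the previous step's key, the instruction text, and the contents of any files it copies. The file contents and the package name flask are placeholders chosen for the example.

```python
import hashlib


def layer_ids(steps, file_contents):
    """Return one cache key per Dockerfile step.

    steps: list of (instruction, [files the step copies into the image]).
    file_contents: dict mapping file name to its current content.
    """
    ids = []
    parent = ""
    for instruction, copied_files in steps:
        digest = hashlib.sha256()
        digest.update(parent.encode())       # depends on every earlier step
        digest.update(instruction.encode())  # and on the instruction itself
        for name in sorted(copied_files):    # and on the files it copies
            digest.update(name.encode())
            digest.update(file_contents[name].encode())
        parent = digest.hexdigest()[:12]
        ids.append(parent)
    return ids


STEPS = [
    ("COPY requirement.txt /app/", ["requirement.txt"]),
    ("WORKDIR /app", []),
    ("RUN pip install -r requirement.txt", []),
    ("COPY . /app", ["requirement.txt", "app.py"]),
]

# Two builds where only app.py changed (placeholder contents).
first_build = layer_ids(STEPS, {"requirement.txt": "flask", "app.py": "print('v1')"})
second_build = layer_ids(STEPS, {"requirement.txt": "flask", "app.py": "print('v2')"})

for (instruction, _), old, new in zip(STEPS, first_build, second_build):
    status = "cached" if old == new else "rebuilt"
    print(f"{status:8} {instruction}")
```

In this model only the final COPY step gets a new key when app.py changes, so the pip install step is re-used. If COPY . /app came first, every later step, including the install, would be rebuilt on each code change.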
The build process

Note that each line in the Dockerfile is a separate step, and each step has a unique alphanumeric ID attached to it.
ENTRYPOINT: Specifies a command that will always be executed when the container starts. It is used when you want to run a container as an executable.
CMD: Specifies arguments that will be fed to the ENTRYPOINT.
Without an ENTRYPOINT, CMD is itself the command that is executed.
With an ENTRYPOINT, CMD is passed to the ENTRYPOINT as its arguments.
More about it here.
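The interplay between the two instructions can be sketched as a small Python function. This models the exec form (the JSON-array syntax) only, and the function name and the file name other.py are made up for the illustration; it is not how Docker is implemented internally.

```python
def container_command(entrypoint, cmd, run_args=None):
    """Combine exec-form ENTRYPOINT and CMD the way `docker run` does.

    run_args stands for anything typed after the image name in `docker run`;
    when present it replaces CMD.
    """
    arguments = list(run_args) if run_args else list(cmd)
    return list(entrypoint) + arguments


# ENTRYPOINT ["python"] with CMD ["app.py"]
print(container_command(["python"], ["app.py"]))
# Same image, started with an extra argument after the image name
print(container_command(["python"], ["app.py"], ["other.py"]))
# No ENTRYPOINT: CMD is the whole command
print(container_command([], ["python", "app.py"]))
```

The second call shows why this split is convenient: the interpreter stays fixed while the script can be swapped at run time.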

    # Specifies a command that will always be executed when the
    # container starts.
    # In this case we want to start the python interpreter
    ENTRYPOINT ["python"]
    # We want to start app.py (change it with your file name)
    # Argument to python command
    CMD ["app.py"]

Running the python file

Building the file:

    docker build -t python:test .

Starting the container using cmd or bash:

    docker run --rm -it -v /c/Users/chinm/Desktop/python_run:/teste pythontest:alpine

The image name passed to docker run must match the image_name:tag you used in docker build.
-v or --volume is used to attach a volume.

What is a volume?

Volumes are the preferred mechanism for persisting data generated by and used by Docker containers.
They are completely managed by Docker.
To connect a directory to a Docker container:

    -v source_on_local_machine:destination_on_docker_container

If you want a volume that can be accessed across different Docker containers, creating a Docker volume is the way to go.
More about it here.
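If you start containers from a script, the same -v mapping can be assembled in Python. This is a minimal sketch that only builds and prints the command; the helper name is invented for the example, the paths and image name are the ones used earlier in this article, and shlex.join needs Python 3.8 or newer.

```python
import shlex


def docker_run_command(image, host_dir, container_dir):
    """Build the argument list for a `docker run` with one directory attached."""
    return [
        "docker", "run",
        "--rm",   # remove the container when it exits
        "-it",    # interactive terminal
        "-v", f"{host_dir}:{container_dir}",
        image,
    ]


command = docker_run_command(
    "pythontest:alpine",
    "/c/Users/chinm/Desktop/python_run",
    "/teste",
)
print(shlex.join(command))
```

Keeping the command as a list means it can be handed to subprocess.run unchanged, without worrying about shell quoting.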
If you're using Python for data science, you'll probably use a volume so that you can share the raw data with the container, do some operations on it, and write the cleaned data back to the host machine.
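As a rough sketch of that workflow, the script below reads a raw CSV from the mounted directory and writes a cleaned copy next to it, so the result shows up on the host. The file names raw.csv and cleaned.csv and the cleaning rule (drop rows with empty fields) are assumptions made for the example; /teste is the mount point from the docker run command earlier.

```python
import csv
from pathlib import Path


def clean_csv(raw_path, cleaned_path):
    """Copy raw_path to cleaned_path, dropping rows with empty fields.

    Returns the number of data rows kept (the header is not counted).
    """
    kept = 0
    with open(raw_path, newline="") as src, open(cleaned_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # header row
        for row in reader:
            if row and all(field.strip() for field in row):
                writer.writerow(row)
                kept += 1
    return kept


if __name__ == "__main__":
    data_dir = Path("/teste")        # mount point inside the container
    raw_file = data_dir / "raw.csv"  # hypothetical file name
    if raw_file.exists():
        print(clean_csv(raw_file, data_dir / "cleaned.csv"), "rows kept")
```

Because /teste is backed by a directory on the host, cleaned.csv remains available after the container exits, even when it was started with --rm.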
Note: If you're running Docker through Docker Toolbox, -v maps your container to the VM, which has your local machine's C drive mapped, making it difficult to use your other logical drives as volumes in Docker.

Conclusion

I hope you're now equipped with the basic knowledge to get started with Python on Docker.
Happy Hacking!

Feel free to reach out on Twitter, LinkedIn or e-mail.