You're actually injecting your source code using volumes:, not during the image build, and this doesn't honor .dockerignore.

Running a Docker application like this happens in two phases:

  1. You build a reusable image that contains the application runtime, any OS and language-specific library dependencies, and the application code; then
  2. You run a container based on that image.

The .dockerignore file is only considered during the first phase, the image build. In your setup, you don't actually COPY anything into the image beyond the requirements.txt file. Instead, you use volumes: to inject parts of the host system into the container. This happens during the second phase, and ignores .dockerignore.

The approach I'd recommend for this is to skip the volumes:, and instead COPY the required source code in the Dockerfile. You should also generally indicate the default CMD the container will run in the Dockerfile, rather than requiring it in the docker-compose.yml or docker run command.

FROM python:3.9-slim-buster

# Do the OS-level setup _first_ so that it's not repeated
# if Python dependencies change
RUN apt-get update && apt-get install -y ...

WORKDIR /django-app

# Then install Python dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# Then copy in the rest of the application
# NOTE: this _does_ honor .dockerignore
COPY . .

# And explain how to run it
ENV PYTHONUNBUFFERED=1
EXPOSE 8000
# NOTE: assumes the userapp user was created in an earlier RUN step
USER userapp
# consider splitting this into an ENTRYPOINT that waits for the
# database, runs migrations, and then `exec "$@"` to run the CMD
CMD sleep 7; python manage.py migrate; python manage.py runserver 0.0.0.0:8000
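
The fixed `sleep 7` in the CMD above could be replaced by an explicit wait, as the comment suggests. Here is a minimal Python sketch of such a wait, assuming the database is reachable over TCP; the function name `wait_for_port` and its parameters are hypothetical, and a successful TCP connection only shows that the port is open, not that PostgreSQL is ready to serve queries.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if the port is
    still unreachable after roughly `timeout` seconds.
    (Hypothetical helper; not part of Docker or Django.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection to probe the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An entrypoint script could call something like `wait_for_port("db", 5432)` before running the migrations, and exit with an error if it returns False.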

This means, in the docker-compose.yml setup, you don't need volumes:; the application code is already inside the image you built.

version: "3.8"
services:
  app: 
    build: .
    ports: 
      - 8000:8000
    depends_on: 
      - db
    # environment: [PGHOST=db]
    # no volumes: or container_name:

  db:
    image: postgres
    volumes: # do keep for persistent database data
      - ./data:/var/lib/postgresql/data
    environment: 
      - POSTGRES_DB=${DB_NAME}
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    # ports: ['5433:5432']

This approach also means you need to docker-compose build a new image when your application changes. This is normal in Docker.

For day-to-day development, a useful approach here can be to run all of the non-application dependencies in Docker, but the application itself outside a container.

# Start the database but not the application
docker-compose up -d db

# Create a virtual environment and set it up
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt

# Set environment variables to point at the Docker database
export PGHOST=localhost PGPORT=5433

# Run the application locally
./manage.py runserver

Doing this requires making the database visible from outside Docker (via ports:), and making the database location configurable (probably via environment variables, set in Compose with environment:).
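
As a rough sketch of what "configurable via environment variables" could look like on the Django side: PGHOST and PGPORT are the variables used above, and DB_NAME, DB_USER and DB_PASSWORD are borrowed from the Compose file. The helper name `database_settings` and the fallback values are assumptions for illustration, not something from the original project.

```python
import os


def database_settings(env=os.environ):
    """Build a Django DATABASES dict from environment variables.

    Hypothetical helper: the variable names follow the Compose file and
    shell snippet above, and the defaults are illustrative assumptions.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "postgres"),
            "USER": env.get("DB_USER", "postgres"),
            "PASSWORD": env.get("DB_PASSWORD", ""),
            # "db" inside Compose, "localhost" when running on the host
            "HOST": env.get("PGHOST", "localhost"),
            "PORT": env.get("PGPORT", "5432"),
        }
    }


# In settings.py this would be assigned at module level:
DATABASES = database_settings()
```

With this in place, the same settings file works both in the container (PGHOST=db) and on the host (PGHOST=localhost PGPORT=5433).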

Answer from David Maze on Stack Overflow
🌐
GitHub
github.com › themattrix › python-pypi-template › blob › master › .dockerignore
python-pypi-template/.dockerignore at master · themattrix/python-pypi-template
Template for quickly creating a new Python project and publishing it to PyPI. - python-pypi-template/.dockerignore at master · themattrix/python-pypi-template
Author   themattrix
🌐
GitHub
gist.github.com › KernelA › 04b4d7691f28e264f72e76cfd724d448
.dockerignore example for Python projects · GitHub
.dockerignore example for Python projects. GitHub Gist: instantly share code, notes, and snippets.
🌐
.dockerignore
dockerignore.com › dockerignores › languages-python
Python .dockerignore
Ready-to-use .dockerignore template for Python languages projects. Copy and paste directly into your project.
🌐
GitHub
gist.github.com › LondonAppDev › 8d89fcbc43fb0a4c593263ce80b0cbe3
Template .dockerignore. · GitHub
Template .dockerignore. GitHub Gist: instantly share code, notes, and snippets.
2 of 2
14

This isn't your case, but another common cause of ".dockerignore not ignoring" is that patterns are matched against whole paths relative to the context directory, not just basenames. So these patterns:

__pycache__
*.pyc

apply only to the Docker context's root directory, not to any of its subdirectories.

To make them recursive, change them to:

**/__pycache__
**/*.pyc
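
To illustrate the difference between the two pattern sets, here is a simplified Python approximation of the matching rule. It is not Docker's actual implementation (which follows Go's filepath.Match rules and also supports `!` exceptions); the names `pattern_to_regex` and `is_ignored` are hypothetical, and only `*`, `?` and `**` are handled.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified .dockerignore pattern into a regex.

    Assumption: only `*`, `?` and `**` are supported here.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # any number of directories, including zero
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # `*` does not cross a path separator
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_ignored(path, patterns):
    """Check a context-relative path (and its parent directories) against patterns."""
    parts = path.split("/")
    prefixes = ["/".join(parts[:n]) for n in range(1, len(parts) + 1)]
    return any(
        pattern_to_regex(p).fullmatch(prefix)
        for p in patterns
        for prefix in prefixes
    )


root_only = ["__pycache__", "*.pyc"]
recursive = ["**/__pycache__", "**/*.pyc"]

print(is_ignored("settings.pyc", root_only))   # file in the context root
print(is_ignored("app/views.pyc", root_only))  # file in a subdirectory
print(is_ignored("app/views.pyc", recursive))
```

In this sketch the root-only patterns catch `settings.pyc` but miss `app/views.pyc`, while the `**/` versions catch both.
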
🌐
GitHub
github.com › unit9 › boilerplate-python › blob › master › .dockerignore
boilerplate-python/.dockerignore at master · unit9/boilerplate-python
# Usually these files are written by a python script from a template · # before PyInstaller builds the exe, so as to inject date/other infos into it. *.manifest · *.spec · · # Installer logs · pip-log.txt · pip-delete-this-directory.txt · · # Unit test / coverage reports ·
Author   unit9
🌐
JiHu GitLab
git.pinlandata.com › xingci.xu › django product template › repository
.dockerignore · main · xingci.xu / Django Product Template · JiHu GitLab
🌐
TestDriven.io
testdriven.io › tips › 6850ab62-9323-4dca-8ddf-8db1d479accc
Tips and Tricks - Use a .dockerignore File | TestDriven.io
A properly structured .dockerignore file can help:

  - Decrease the size of the Docker image
  - Speed up the build process
  - Prevent unnecessary cache invalidation
  - Prevent leaking secrets

Example:

**/.git
**/.gitignore
**/.vscode
**/coverage
**/.env
**/.aws
**/.ssh
Dockerfile
README.md
docker-compose.yml
**/.DS_Store
**/venv
**/env
🌐
GitHub
github.com › BrianPugh › python-template › blob › main › .dockerignore
python-template/.dockerignore at main · BrianPugh/python-template
Python project and library template for clean, reliable, open-source projects. - python-template/.dockerignore at main · BrianPugh/python-template
Author   BrianPugh
🌐
PyPI
pypi.org › project › dockerignore-generate
dockerignore-generate
August 14, 2018
🌐
Reddit
reddit.com › r/docker › a collection of .dockerignore templates
A collection of .dockerignore templates : r/docker
November 4, 2020 - So if you are building on a remote machine with the remote Docker API, dockerignore can drastically reduce the amount of data you have to send it over the network. When building Python projects, I don't want my entire local virtualenv folder to get copied to the server.
🌐
GitHub
github.com › topics › dockerignore
dockerignore · GitHub Topics · GitHub
api docker django-rest-framework generic dockerignore ... A Python module for filtering a list of files according to patterns.
🌐
GitLab
git.astron.nl › astron templates › python package › repository
.dockerignore · main · ASTRON Templates / Python Package · GitLab
A cookiecutter template ready to create a new Python project that will automatically contain a CI/CD pipeline for building, testing and deploying a python package
Top answer
1 of 1
22

Yes, it's a recommended practice. There are several reasons:

Reduce the size of the resulting image

In .dockerignore you list files that should not end up in the resulting image, which can matter a lot when you're trying to build the smallest possible image. Roughly speaking, bytecode files are about the same size as the source files they were compiled from. Bytecode files aren't intended for distribution, which is why we usually put them into .gitignore as well.


Cache related problems

In versions of Python before 3.2 there were several cache-related issues:

Python’s scheme for caching bytecode in .pyc files did not work well in environments with multiple Python interpreters. If one interpreter encountered a cached file created by another interpreter, it would recompile the source and overwrite the cached file, thus losing the benefits of caching.

Since Python 3.2, cached files carry an interpreter tag in their names, as in mymodule.cpython-32.pyc, and are stored under a __pycache__ directory. Starting with Python 3.8 you can also control the directory where the cache is stored (via the PYTHONPYCACHEPREFIX environment variable or the -X pycache_prefix option). This can be useful when you restrict write access to the source directory but still want the benefits of the cache.
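
As a small illustration of both behaviors, the sketch below asks importlib where the cache file for a source file would go, with and without a cache prefix. It assumes CPython 3.8+ on a POSIX system; the helper name `cache_path` and the example paths are hypothetical.

```python
import importlib.util
import sys


def cache_path(source, prefix=None):
    """Return where the .pyc for `source` would be written.

    Temporarily sets sys.pycache_prefix (available since Python 3.8),
    the in-process equivalent of PYTHONPYCACHEPREFIX.
    (Hypothetical helper for illustration.)
    """
    old = getattr(sys, "pycache_prefix", None)
    sys.pycache_prefix = prefix
    try:
        return importlib.util.cache_from_source(source)
    finally:
        sys.pycache_prefix = old


# Default layout: a __pycache__ directory next to the source file,
# with the interpreter tag (sys.implementation.cache_tag) in the name.
print(cache_path("/app/mymodule.py"))

# With a prefix, the cache mirrors the source tree under that directory.
print(cache_path("/app/mymodule.py", prefix="/tmp/pycache"))
```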

Usually the cache system works perfectly, but occasionally something goes wrong. Note that a cached .pyc file that lives in the same directory as the source will be used in place of the .py file if the .py file is missing. In practice this is uncommon, but if stale code keeps "being there", removing the cache files is worth trying. It may matter when you're experimenting with Python's cache system or running scripts in different environments.


Security reasons

You most likely don't need to think about it, but cache files can contain sensitive information. In the current implementation, .pyc files record the absolute path of the source files they were compiled from. There are situations where you don't want to share such information.
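
The recorded path can be inspected directly. The sketch below compiles a throwaway module and reads the source path back out of the resulting .pyc file. It assumes the CPython 3.7+ file layout (a 16-byte header followed by a marshalled code object); the helper names and the file name are hypothetical.

```python
import marshal
import py_compile
import tempfile
from pathlib import Path

# Assumption: CPython 3.7+ .pyc files start with a 16-byte header.
PYC_HEADER_SIZE = 16


def recorded_source_path(pyc_path):
    """Return the source path stored inside a .pyc file."""
    data = Path(pyc_path).read_bytes()
    code = marshal.loads(data[PYC_HEADER_SIZE:])
    return code.co_filename


def demo():
    """Compile a temporary module and return (source path, recorded path)."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "secret_project_module.py"
        source.write_text("VALUE = 1\n")
        pyc = py_compile.compile(str(source), cfile=str(Path(tmp) / "out.pyc"))
        return str(source), recorded_source_path(pyc)


source_path, recorded = demo()
print(recorded)  # the build machine's path to the source file
```

Anyone who receives the image can read this path, which may reveal user names or directory layouts from the build machine.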


Working with bytecode files seems to be a fairly common need; for example, django-extensions provides the compile_pyc and clean_pyc commands for this.

🌐
Hugging Face
huggingface.co › spaces › jbilcke-hf › template-node-python-express › blob › main › .dockerignore
.dockerignore · jbilcke-hf/template-node-python-express at main
June 21, 2023 - template-node-python-express / .dockerignore