Dockerfile
The Dockerfile is a text document that contains all the commands, in order, needed to build a container image. It is the blueprint for your application's environment. How you write this file has a significant impact on your image's size, build speed, and security.
The Build Context and .dockerignore
When you run docker build, the first thing the Docker client does is send the "build context" to the daemon. The build context is the set of files at the specified path (e.g., . for the current directory).
- This entire directory is sent to the daemon, not just the Dockerfile.
This is why the .dockerignore file is critical. It is a file in the root of your build context that lists files and directories to exclude from the build context, similar to a .gitignore file.
Why .dockerignore is essential:
- Build Speed: It prevents large, unnecessary files (like node_modules, build/, .git/, .venv/) from being sent to the daemon, speeding up the build.
- Cache Invalidation: It prevents changes to non-essential files (like a README.md or test logs) from breaking the build cache.
- Security: It prevents sensitive files (like .env, aws_credentials, .ssh/) from being accidentally copied into the image.
Example .dockerignore:
# Exclude node.js dependencies
node_modules
# Exclude python virtual env
.venv
# Exclude build artifacts
build/
dist/
# Exclude git and OS files
.git
.DS_Store
# Exclude local secrets and logs
.env
*.log
Core Instructions
FROM: Define the Base
The FROM instruction must be the first instruction in a Dockerfile; only ARG instructions, comments, and parser directives may come before it. It specifies the parent image your image is "based on."
- Best Practice: Always use a specific tag (e.g., FROM node:18-alpine instead of FROM node). Using latest can lead to unpredictable builds as the base image may change.
- Best Practice: Use minimal base images (like alpine, slim, or distroless) for production. They are smaller and have a reduced attack surface.
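As a rough sketch of the tag-pinning advice, the base image tag can be held in a build argument so that it stays explicit and is easy to bump. The argument name NODE_VERSION is an illustrative choice for this sketch, not something the guide prescribes.

```dockerfile
# ARG is the only instruction that may appear before FROM.
# NODE_VERSION is a hypothetical name chosen for this sketch.
ARG NODE_VERSION=18-alpine

# Pinned, minimal base image instead of the moving "latest" tag
FROM node:${NODE_VERSION}
```

Under these assumptions, building with docker build --build-arg NODE_VERSION=20-alpine . would switch the base image without editing the file.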
WORKDIR: Set a Working Directory
The WORKDIR instruction sets the working directory for any subsequent RUN, CMD, ENTRYPOINT, COPY, and ADD instructions.
- Best Practice: Always use WORKDIR instead of RUN cd /app. Each RUN executes in its own shell, so a cd in one instruction does not carry over to the next. WORKDIR automatically creates the directory if it doesn't exist and applies to all subsequent instructions, making your Dockerfile cleaner and more reliable.
# Good
WORKDIR /app
COPY . .
# Bad: the cd only affects its own RUN step, so COPY still targets the previous working directory
RUN mkdir /app
RUN cd /app
COPY . .
COPY vs. ADD: Moving Files
Both instructions copy files from the build context into the image.
- COPY: This instruction is simple and explicit. It copies files and directories from the context to the image: COPY <src> <dest>
- ADD: This instruction has "magic" features:
  - It can copy and auto-extract local tarballs (.tar.gz).
  - It can download files from a remote URL.
Best Practice: Always prefer COPY. Its behavior is explicit and transparent. The "magic" of ADD can be dangerous (e.g., downloading a malicious file, "zip bomb" extraction). Only use ADD if you specifically need to auto-extract a local tarball.
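The difference can be sketched side by side. The paths config/ and vendor.tar.gz are hypothetical names assumed to exist in the build context; they are not taken from any real project.

```dockerfile
FROM alpine:3.19
WORKDIR /app

# Preferred: COPY does exactly one thing, copying from the build context.
# config/ is a hypothetical directory in the build context.
COPY config/ ./config/

# ADD only where auto-extraction of a local tarball is the goal.
# vendor.tar.gz is a hypothetical archive; its contents are unpacked into /opt/vendor/.
ADD vendor.tar.gz /opt/vendor/
```

If the archive should land in the image unextracted, COPY is the right choice, since it never unpacks anything.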
RUN: Executing Commands
The RUN instruction executes any command in a new layer on top of the current image. This is used for installing packages, compiling code, etc.
- Best Practice (Layer Reduction): Chain related commands together using && and backslashes (\) to reduce the number of image layers. Each RUN instruction creates a new layer, and more layers can mean a larger image.
# Good: One layer
RUN apt-get update && apt-get install -y \
    curl \
    git \
    vim \
    && rm -rf /var/lib/apt/lists/*
# Bad: Four layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN apt-get install -y vim
Notice the rm -rf /var/lib/apt/lists/* in the good example. This cleans up the package cache in the same layer it was created, reducing the final image size.
EXPOSE: Documenting Ports
The EXPOSE instruction informs Docker that the container listens on the specified network ports at runtime.
- What it does: This is purely metadata. It serves as documentation for the user and can be used by automation tools.
- What it does NOT do: It does not actually publish the port. You still must use the -p or -P flag (docker run -p 8080:80 ...) to map the port from the host to the container.
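A minimal sketch of the instruction in context, assuming an nginx-based image whose server listens on port 80:

```dockerfile
FROM nginx:1.25-alpine

# Metadata only: documents that the server listens on TCP port 80.
EXPOSE 80

# The protocol can be stated explicitly; TCP is the default.
# EXPOSE 53/udp
```

With this image, docker run -p 8080:80 would publish the port on host port 8080, while docker run -P would publish every exposed port on a host port chosen by Docker.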
ARG vs. ENV: Setting Variables
- ARG (Build-Time): ARG defines a variable that can be passed at build-time (e.g., docker build --build-arg VERSION=1.2). This variable exists only during the build; it is not available to the running container.
- ENV (Runtime): ENV sets a permanent environment variable in the image. This variable is available both during the build (after it's defined) and when the container is running.
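The two can be combined when a build-time value also needs to be visible at runtime. The following is a sketch only; APP_VERSION is a hypothetical variable name introduced for illustration.

```dockerfile
FROM alpine:3.19

# Build-time only: can be overridden with --build-arg VERSION=...
ARG VERSION=1.0

# Persist the build-time value into the image as a runtime variable.
# APP_VERSION is a hypothetical name chosen for this sketch.
ENV APP_VERSION=${VERSION}

# VERSION is visible here because this RUN executes during the build.
RUN echo "building version ${VERSION}"

# At runtime only APP_VERSION is set; VERSION itself no longer exists.
CMD ["sh", "-c", "echo running ${APP_VERSION}"]
```

Built with docker build --build-arg VERSION=1.2 -t demo . and started with docker run --rm demo, this should print "running 1.2". Build arguments can show up in the image history, so they are not a suitable place for secrets.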
CMD vs. ENTRYPOINT: Defining the Runtime
This is one of the most confusing but important distinctions.
- CMD: Sets the default command and/or parameters to be executed when the container starts.
  - CMD can be easily overridden by the user at runtime (e.g., docker run my-image /bin/bash).
  - If a Dockerfile has multiple CMD instructions, only the last one takes effect.
- ENTRYPOINT: Configures the container to run as a specific executable.
  - ENTRYPOINT is not easily overridden; replacing it requires the explicit --entrypoint flag on docker run. If you provide arguments at runtime, they are appended to the ENTRYPOINT.
  - This is ideal for creating images that are "for" a specific application (e.g., an image that is the redis-server).
Exec Form (Preferred) vs. Shell Form (Avoid)
Both CMD and ENTRYPOINT can be written in two forms:
- Exec Form (JSON Array): CMD ["executable", "param1", "param2"]
  - This is the preferred form.
  - It executes the command directly, without a shell.
  - Crucially, this allows process signals (like SIGTERM from docker stop) to be sent directly to your application, allowing for graceful shutdowns.
- Shell Form: CMD executable param1 param2
  - This form runs your command inside /bin/sh -c "...".
  - This can cause problems with signal handling and is less explicit. Your application becomes a child process of the shell, which may not forward signals correctly.
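The two forms can be compared in a small sketch. It assumes a hypothetical server.js in the build context; the shell form is left commented out because only the last CMD in a Dockerfile takes effect.

```dockerfile
FROM node:18-alpine
WORKDIR /app

# server.js is a hypothetical entry file assumed to exist in the build context.
COPY server.js ./

# Shell form (avoid): /bin/sh -c is started first and node usually runs as its child.
# CMD node server.js

# Exec form (preferred): node itself is PID 1 and receives SIGTERM directly.
CMD ["node", "server.js"]
```

When shell features such as variable expansion are genuinely needed, one common workaround is to call the shell explicitly and hand over the process with exec, for example CMD ["sh", "-c", "exec node server.js"], so that the application replaces the shell and still receives signals.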
The Best-Practice Combination: Use ENTRYPOINT in exec form to set the main executable and CMD in exec form to set the default parameters.
ENTRYPOINT ["/usr/bin/my-app"]
CMD ["--mode", "production"]
- docker run my-image -> Runs /usr/bin/my-app --mode production
- docker run my-image --mode staging -> Runs /usr/bin/my-app --mode staging (The CMD is overridden)
- docker run my-image -h -> Runs /usr/bin/my-app -h (The CMD is overridden)
Build Cache Optimization
Docker builds images in layers. If nothing has changed in a layer, Docker reuses it from the cache. This is the key to fast builds.
The Golden Rule: Order your Dockerfile instructions from least-frequently changed to most-frequently changed.
A change in one layer invalidates the cache for all subsequent layers.
Bad Example (Node.js):
WORKDIR /app
# Any file change breaks the cache from this point on
COPY . .
# npm install therefore runs on EVERY build
RUN npm install
CMD ["node", "src/index.js"]
Good Example (Node.js):
WORKDIR /app
# 1. Copy only package.json and package-lock.json
COPY package*.json ./
# 2. Install dependencies (this layer is only rebuilt if those files change)
RUN npm install
# 3. Copy source code (rebuilt on code changes, but npm install stays cached)
COPY . .
CMD ["node", "src/index.js"]
Annotated Example: A Multistage Node.js Build
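This example combines the practices above: cache optimization, WORKDIR, USER, and the exec form of CMD, plus a multistage build. A multistage build uses more than one FROM instruction; only the final stage ends up in the image, so build tools and intermediate files stay out of production.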
# --- Stage 1: The "Builder" ---
# Use an image that includes Node.js and npm for building the app
FROM node:18-alpine AS builder
# Set the working directory in the container
WORKDIR /app
# Copy package.json and package-lock.json first to leverage build cache
COPY package*.json ./
# Install dependencies
RUN npm install
# Copy the rest of the application source code
COPY . .
# Run the build script (e.g., for a React, Vue, or TypeScript app)
RUN npm run build
# --- Stage 2: The "Production" Image ---
# Use a minimal base image for the final, lean image
FROM node:18-alpine
# Set the working directory
WORKDIR /app
# Create a non-root user and group for security
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
# Copy only the necessary build artifacts from the "builder" stage
# This copies the 'dist' folder from the 'builder' stage
COPY --from=builder /app/dist ./dist
# Note: this stage does not contain node_modules.
# If the app needs runtime dependencies, install them here (before USER) or copy them from the builder.
# Expose the port the app runs on (metadata)
EXPOSE 3000
# Set the default command to run the app
# Use 'exec' form for correct signal handling
CMD ["node", "dist/main.js"]
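The annotated example leaves runtime dependencies out of the final stage. One possible variant that installs production-only dependencies is sketched below. It assumes that a package-lock.json exists (npm ci requires one), that the build writes to dist/main.js as above, and that the built-in node user shipped with the official Node.js images is acceptable as the non-root user.

```dockerfile
# --- Stage 1: build ---
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies, including dev tools needed for the build
RUN npm ci
COPY . .
RUN npm run build

# --- Stage 2: production ---
FROM node:18-alpine
ENV NODE_ENV=production
WORKDIR /app
COPY package*.json ./
# Install runtime dependencies only, as root, before dropping privileges
RUN npm ci --omit=dev
# Bring over only the compiled output from the builder stage
COPY --from=builder /app/dist ./dist
# Switch to the non-root user provided by the official image
USER node
EXPOSE 3000
CMD ["node", "dist/main.js"]
```

Installing dependencies before the USER instruction matters here: /app is owned by root, so an unprivileged user could not write node_modules into it.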