Introduction and Core Concepts

The Promise of Docker

Docker was introduced on March 15, 2013, by Solomon Hykes, founder and CEO of dotCloud, during a five-minute lightning talk at the Python Developers Conference. The source code was quickly released on GitHub as a public and fully open source project.

Docker is a platform for developing, shipping, and running applications in isolated environments called containers. Its fundamental promise is to encapsulate an application and its dependencies into a single, distributable artifact. This artifact enables application deployment at scale into any environment, from a developer's laptop to a production cluster.

Docker's value extends beyond technology to organizational workflow:

  • Reduces Complexity: It helps build a layer of isolation between software and infrastructure, reducing the burden of communication and conflict between development and operations teams (e.g., developers no longer need to request specific library versions on host machines).

  • Standardizes Workflow: It provides a standardized toolset and packaging format (the OCI standard) that simplifies the build, testing, and deployment processes, regardless of the application's programming language.

  • Enforces Robust Design: The Docker philosophy, centered on atomic or throwaway containers, encourages the creation of robust, scalable, and reliable applications by favoring immutability and stateless design.

The Core Problem: "Dependency Hell"

Before containers, a common challenge was the "works on my machine" problem. An application would run perfectly on a developer's laptop but fail in staging or production. This was typically due to subtle differences in the environment:

  • Different operating system patch levels.

  • Varying versions of shared libraries (e.g., SSL, image processing).

  • Inconsistent language runtime versions (e.g., Python 3.8 vs. 3.9).

This "dependency hell" made deployments fragile and difficult to reproduce. Docker solves this by packaging the application, its configurations, and all its dependencies (libraries, runtimes, etc.) into one self-contained unit. This unit, the container image, is immutable and runs the same way everywhere, ensuring consistency from development to production.
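A minimal Dockerfile makes this packaging concrete. The sketch below is illustrative only — the application file, runtime version, and pinned dependencies are hypothetical:

```dockerfile
# Pin the exact runtime so development, staging, and production all agree
FROM python:3.9-slim

WORKDIR /app

# Pin library versions in requirements.txt and install them into the image,
# so the host machine's shared libraries no longer matter
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application itself into the image
COPY app.py .

# The image now carries the app plus every dependency it needs
CMD ["python", "app.py"]
```

Because every dependency is resolved at build time and frozen into the image, the same artifact behaves identically on a laptop and in production.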

What Docker Is (and Isn't)

Docker is a tool that simplifies the management of Linux containers. However, it is often confused with other technologies. It is important to understand what Docker is not:

  • Not a Cloud Platform: Docker does not provision host machines, block storage, or other cloud resources. It runs on hosts that are managed by cloud platforms (like AWS, Azure) or on-premises servers. Docker only handles deploying, running, and managing containers on preexisting Docker hosts.

  • Not a Configuration Management (CM) Tool: Dockerfiles define an image at build time. Docker does not manage the ongoing state of a running container or the host system itself, which is the traditional role of tools like Puppet or Chef. However, by shipping complex requirements inside an image, Docker can significantly lessen the need for complex CM code.

  • Not a Workload Management Tool (Orchestrator): The core Docker Engine manages containers on a single host. To coordinate and schedule containers across a cluster of hosts, a dedicated orchestration layer like Kubernetes or Docker's built-in Swarm mode is required.

  • Not an Enterprise Virtualization Platform: This is the most common confusion. Containers and Virtual Machines (VMs) are fundamentally different.

The Core Distinction: Containers vs. Virtual Machines

The primary difference between a container and a VM is in their approach to virtualization. A VM virtualizes the hardware, while a container virtualizes the operating system.

  • Virtual Machines (VMs): A hypervisor (e.g., VMware, KVM) runs on a host OS and emulates a complete set of physical hardware (CPU, RAM, disk). This allows you to run multiple guest operating systems on a single host. Each VM contains a full OS, its own kernel, and the application.

  • Linux Containers: A container engine (like Docker) runs on a host OS and leverages the host's kernel. All containers on that host share that single kernel. A container is simply an isolated process (or group of processes) running in its own namespace, consuming fewer resources.
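You can observe the shared-kernel model directly on a Linux host. The session below is an illustrative sketch (the alpine image and sleep command are arbitrary choices):

```shell
# Start a container that simply sleeps
$ docker run -d --name demo alpine sleep 300

# From the host, the container's process is visible as an ordinary
# process -- there is no guest OS in between
$ ps -ef | grep "sleep 300"

# Host and container report the same kernel version,
# because the container has no kernel of its own
$ uname -r
$ docker exec demo uname -r

# Clean up
$ docker rm -f demo
```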

Architectural Comparison

| Feature | Virtual Machines (VMs) | Linux Containers |
| --- | --- | --- |
| Kernel Usage | Each VM contains a complete operating system running its own kernel. | All containers share a single kernel (the host's kernel). |
| Isolation | Provided by a hypervisor that fully virtualizes the hardware, giving very strong isolation. | Implemented entirely within the shared kernel using mechanisms like namespaces and cgroups. |
| Resource Allocation | Resources (CPU, memory) are pre-allocated and tightly controlled by the hypervisor. | Resources are shared by default, behaving like colocated processes unless constrained by cgroups. |
| Performance | Significant startup time (booting a full OS) and resource overhead (RAM for each guest OS). | Near-instant startup (starting a process) and minimal overhead. |
| Size | Large; a typical VM image is measured in gigabytes (GiB). | Small; a container image is measured in megabytes (MiB) and can be 100x smaller than a VM. |
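The startup-time difference is easy to observe for yourself (illustrative; exact timings vary by machine and whether the image is already cached locally):

```shell
# Starting a container is roughly as fast as starting a process;
# a VM would first have to boot an entire guest operating system.
# With the image cached, this typically completes in under a second.
$ time docker run --rm alpine true
```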

Size, Efficiency, and the Layered Filesystem

Containers are "lightweight" because they are significantly smaller and faster than VMs. This efficiency is achieved through the layered filesystem.

A container image is not a single, monolithic file; it is a composition of read-only layers. Each instruction in a Dockerfile (e.g., RUN, COPY, ADD) creates a new layer. When a container is created, these layers are stacked, and a new, thin writable layer is added on top.

This design provides:

  • Efficiency: Layers are shared and reused. If multiple containers are based on the same base image (e.g., ubuntu:22.04), that base layer is only stored once on the host.

  • Speed: Pulling an updated image only requires downloading the layers that have changed.

  • Minimal Footprint: A newly created container takes up very little disk space (as little as 12 KB), as it only consists of metadata and its writable layer.
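The layers behind any image can be inspected directly (illustrative session; the image names are just examples):

```shell
# Each line corresponds to a layer created by one Dockerfile instruction
$ docker image history ubuntu:22.04

# Show disk usage for images, with shared vs. unique space broken out;
# images built FROM the same base store that base layer only once
$ docker system df -v
```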

Operating System Compatibility

The shared kernel model has one critical implication: containers are OS-specific.

  • A Linux host can only run Linux containers (as they all share the Linux kernel).

  • A Windows host can only run Windows containers.

  • A Windows binary cannot run natively inside a Linux container, and vice-versa.

When Docker is run on a non-Linux system (like macOS or Windows), it uses a lightweight Linux virtual machine (provided by Docker Desktop) to host the Linux-based Docker server and run Linux containers.
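This indirection is visible from the command line: the client reports the local operating system, while the server inside Docker Desktop's Linux VM reports linux (illustrative session):

```shell
# The client runs natively; the daemon runs inside a lightweight Linux VM
$ docker version --format 'client: {{.Client.Os}}  server: {{.Server.Os}}'
# On a Mac this reports client: darwin  server: linux
```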

Conceptual Summary

  • A VM provides full hardware virtualization, enabling it to run a completely separate guest operating system with strong isolation, but at the cost of significant resource overhead.
  • A container provides operating-system-level virtualization, sharing the host kernel to achieve a much lighter, faster, and more efficient process-level isolation.

Core Terminology

The following terms are the foundational vocabulary for working with Docker.

  • Docker Client (docker): The command-line tool used to interact with the Docker server. It sends commands to the daemon and is used to control most of the Docker workflow.

  • Docker Server / Daemon (dockerd): The background process that listens for Docker API requests and manages Docker objects, including building, running, and managing containers.

  • Docker / OCI Image: A read-only template used to create containers. An image consists of one or more stacked filesystem layers and metadata (e.g., the default command to run). An image typically has a repository address, a name, and a tag (e.g., docker.io/superorbital/wordchain:v1.0.1).

    • OCI (Open Container Initiative) Standard: An open industry standard for container formats and runtimes. An OCI-compliant image is guaranteed to work with any OCI-compliant tool (including Docker).

    • Tag: A label applied to an image to differentiate versions (e.g., :v1.0.1 or :latest).

  • Registry: A stateless, server-side application that stores and distributes Docker/OCI images. It is the central hand-off point between the build and deployment stages. A common public registry is Docker Hub.

  • Linux Container: A runnable instance that has been instantiated from a Docker or OCI image. A container is a live, running process (or set of processes) on the host's kernel, but it is isolated from the host and other containers. A specific container exists only once, although you can easily create multiple containers from the same image.

  • Atomic or Immutable Host: A small, finely tuned operating system (e.g., Fedora CoreOS, Bottlerocket OS) designed specifically to host container workloads. These hosts are designed to be immutable and support atomic upgrades, enhancing system consistency and reliability. By minimizing configuration divergence, they introduce less unexpected behavior than traditional servers that are patched and updated in place.
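The parts of an image reference can be seen by splitting the example from above. This is a quick shell sketch using plain string handling (no Docker required); note it assumes a fully qualified reference with an explicit tag:

```shell
#!/bin/sh
# Split an OCI image reference into registry, repository, and tag
ref="docker.io/superorbital/wordchain:v1.0.1"

tag="${ref##*:}"          # text after the last ':'   -> v1.0.1
rest="${ref%:*}"          # everything before it      -> docker.io/superorbital/wordchain
registry="${rest%%/*}"    # text before the first '/' -> docker.io
repository="${rest#*/}"   # the remainder             -> superorbital/wordchain

echo "registry=$registry repository=$repository tag=$tag"
```

Real-world references can omit the registry or tag (defaulting to docker.io and :latest), so production tooling parses these more carefully than this sketch does.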