MOBI BOOT CAMP CORP. Learning Buddy

Virtualization: VMs and Containers

Modern software deployment relies heavily on virtualization: the process of creating a virtual version of a computing resource. Virtualization allows efficient use of hardware and gives applications consistent, reproducible environments. There are two primary approaches: Virtual Machines and Containers.

1. Hardware Virtualization: The Virtual Machine (VM)

A Virtual Machine (VM) is an emulation of a complete physical computer. It abstracts the hardware, allowing you to run a full "guest" operating system on top of a "host" operating system.

  • How it works: A piece of software called a Hypervisor runs on the host, either directly on the hardware (a Type 1, or bare-metal, hypervisor) or on top of a host operating system (a Type 2, or hosted, hypervisor). The hypervisor creates and manages the VMs, allocating a share of the physical CPU, memory, and storage to each one.
  • Key Characteristic: Each VM runs its own complete operating system, including its own kernel, isolated from the other VMs. You can run a Linux VM on a Windows host, or vice versa.
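
The allocation step described above can be sketched in a few lines of Python. This is a toy model, not how any real hypervisor is implemented: the `ToyHypervisor` class, its method names, and the capacity numbers are all hypothetical, and real hypervisors can also overcommit resources, which this sketch does not model.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHypervisor:
    """Toy model of a hypervisor handing out slices of one physical host."""
    cpus: int                                # physical CPU cores (hypothetical)
    memory_gb: int                           # physical memory (hypothetical)
    vms: dict = field(default_factory=dict)  # VM name -> reserved resources

    def free_cpus(self) -> int:
        return self.cpus - sum(vm["cpus"] for vm in self.vms.values())

    def free_memory_gb(self) -> int:
        return self.memory_gb - sum(vm["memory_gb"] for vm in self.vms.values())

    def create_vm(self, name: str, cpus: int, memory_gb: int, guest_os: str) -> bool:
        """Reserve resources for a new VM; refuse if the request does not fit."""
        if name in self.vms:
            return False
        if cpus > self.free_cpus() or memory_gb > self.free_memory_gb():
            return False
        # Each VM records its own guest OS: every VM boots a separate kernel.
        self.vms[name] = {"cpus": cpus, "memory_gb": memory_gb, "guest_os": guest_os}
        return True


host = ToyHypervisor(cpus=16, memory_gb=64)
print(host.create_vm("web", cpus=4, memory_gb=16, guest_os="Linux"))    # True
print(host.create_vm("db", cpus=8, memory_gb=32, guest_os="Windows"))   # True
print(host.create_vm("batch", cpus=8, memory_gb=8, guest_os="Linux"))   # False: only 4 CPUs left
print(host.free_cpus(), host.free_memory_gb())                          # 4 16
```

Note that the guest OS is a per-VM property here, which mirrors the key characteristic above: a Linux VM and a Windows VM can sit side by side on the same host.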

[Figure: A Virtual Machine abstracts the hardware]

2. OS Virtualization: The Container

A Container is a lighter-weight form of virtualization that abstracts the operating system. Instead of virtualizing the hardware, containers share the host machine's OS kernel.

  • How it works: A Container Engine (like Docker) runs on the host OS. It creates isolated user-space environments (the containers) for applications to run in. On Linux, this isolation is built on kernel features such as namespaces and control groups (cgroups).
  • Key Characteristic: All containers on a host share the same OS kernel. They package only their application code and its specific libraries and dependencies, not a full guest operating system. This makes them much smaller and faster to start than VMs.
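
One way to observe the shared kernel is to ask the operating system which kernel is running. The short script below uses only Python's standard library; the `kernel_info` helper name is made up for this example. On a Linux host, running it on the host and inside a container typically reports the same kernel release, while running it inside a VM reports the guest's own kernel. One caveat: with Docker Desktop on macOS or Windows, containers run inside a Linux VM, so they report that VM's kernel rather than the desktop OS.

```python
import platform


def kernel_info() -> dict:
    """Return the OS name and kernel release seen by this process."""
    info = platform.uname()
    return {"system": info.system, "kernel_release": info.release}


print(kernel_info())
```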

The most popular technology for creating and managing containers is Docker.

[Figure: The Docker Workflow: Build, Ship, and Run]
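
As a rough sketch, the build, ship, and run steps map onto three Docker CLI commands: `docker build`, `docker push`, and `docker run`. The Python below only assembles and prints those commands by default (a dry run), so it is safe to execute without Docker installed. The image name and registry host are hypothetical placeholders, and the helper names `docker_workflow` and `execute` are invented for this example.

```python
import shlex
import subprocess


def docker_workflow(image: str, context_dir: str = ".") -> list:
    """Return the CLI commands for the build, ship, and run steps."""
    return [
        ["docker", "build", "-t", image, context_dir],  # build: create an image from a Dockerfile
        ["docker", "push", image],                       # ship: upload the image to a registry
        ["docker", "run", "--rm", image],                # run: start a container from the image
    ]


def execute(commands: list, dry_run: bool = True) -> None:
    """Print each command; run it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


IMAGE = "registry.example.com/demo/web:1.0"  # hypothetical image name
execute(docker_workflow(IMAGE))
```

Passing `dry_run=False` would run the commands for real, which assumes a working Docker installation, a Dockerfile in the build context, and push access to the registry.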

VM vs. Container: A Comparison

The key difference is the layer of abstraction. A VM virtualizes the hardware, while a container virtualizes the operating system.

[Figure: Hypervisor-based (VM) vs. Container-based Virtualization]

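Feature      | Virtual Machine (VM)                         | Container
-------------|----------------------------------------------|----------------------------------------------
Abstraction  | Hardware                                     | Operating system
Size         | Large (gigabytes)                            | Small (megabytes)
Startup time | Slow (minutes)                               | Fast (seconds)
Isolation    | Full isolation (separate guest OS and kernel) | Process-level isolation (shared host kernel)
Overhead     | High                                         | Low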

3. Managing Containers at Scale: Orchestration

While you can manage a few containers manually, running complex, multi-container applications in production requires container orchestration. An orchestrator automates the deployment, scaling, networking, and management of containerized applications.
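
A common way to think about this automation is as a control loop: the orchestrator compares the desired state (for example, "three copies of the web container") with what is actually running, then starts or stops containers to close the gap. The sketch below is a toy illustration of that idea only. It is not how any particular orchestrator is implemented, and the `reconcile` function and `web-N` names are hypothetical.

```python
import itertools

# Module-level counter used to give each new container a unique name.
_ids = itertools.count(1)


def reconcile(desired: int, running: set) -> set:
    """One pass of a control loop: make the running set match the desired count."""
    running = set(running)  # work on a copy
    while len(running) < desired:
        running.add(f"web-{next(_ids)}")       # start a replacement or extra container
    while len(running) > desired:
        running.remove(sorted(running)[-1])    # stop a surplus container
    return running


running = reconcile(3, set())
print(sorted(running))            # ['web-1', 'web-2', 'web-3']

running.discard("web-2")          # simulate a crashed container
running = reconcile(3, running)
print(sorted(running))            # ['web-1', 'web-3', 'web-4']

running = reconcile(2, running)   # scale down
print(sorted(running))            # ['web-1', 'web-3']
```

The same function covers initial deployment, recovery from a failure, and scaling down, because each case is just a difference between desired and actual state.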

The dominant open-source container orchestration platform is Kubernetes (K8s).

Key Kubernetes Concepts

  • Image: A snapshot or blueprint of a container. It's a lightweight, standalone, executable package that includes everything needed to run an application: code, runtime, system tools, and libraries. You build an image, and then you run it to create a container.
  • Pod: The smallest deployable unit in Kubernetes. A Pod is a group of one or more containers that are deployed together and share storage and network resources. For example, a web application pod might contain a container for the web server and another container for a logging sidecar.
  • Orchestration: Kubernetes automatically handles tasks like scheduling pods onto nodes in a cluster, restarting failed containers, scaling the number of pods up or down based on load, and managing network load balancing.
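
The web server plus logging sidecar Pod from the example above could look roughly like the manifest below, written as a Python dictionary and printed as JSON (`kubectl apply -f` accepts JSON as well as YAML). The field names follow the Kubernetes `v1` Pod schema. The Pod name, labels, mount paths, and the sidecar image are hypothetical placeholders chosen for illustration.

```python
import json

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-app", "labels": {"app": "web"}},  # hypothetical name and label
    "spec": {
        # A volume declared at the Pod level can be mounted by every container in the Pod.
        "volumes": [{"name": "logs", "emptyDir": {}}],
        "containers": [
            {
                "name": "web-server",
                "image": "nginx:stable",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [{"name": "logs", "mountPath": "/var/log/nginx"}],
            },
            {
                "name": "log-sidecar",
                "image": "registry.example.com/demo/log-shipper:1.0",  # hypothetical image
                "volumeMounts": [{"name": "logs", "mountPath": "/logs"}],
            },
        ],
    },
}

print(json.dumps(pod_manifest, indent=2))
```

Both containers mount the same `logs` volume, which is what "share storage" means in practice: the web server writes log files and the sidecar reads them. The two containers also share the Pod's network namespace, so they can reach each other on `localhost`.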