How the Diego Auction Allocates Jobs

This topic provides an overview of the structure and components of Diego, the new container management system for Cloud Foundry.

Architecture Diagram

Cloud Foundry uses the Diego architecture to manage application containers. Diego components assume application scheduling and management responsibility from the Cloud Controller.

Refer to the following diagram and descriptions for information about the way Diego handles application requests.

[Diagram: Diego flow]

  1. The Cloud Controller passes requests to stage and run applications to several components on the Diego Brain.

  2. The Diego Brain components translate staging and running requests into Tasks and Long Running Processes (LRPs), then submit these to the Bulletin Board System (BBS) through an API over HTTP.

  3. The BBS submits the Tasks and LRPs to the Auctioneer, part of the Diego Brain.

  4. The Auctioneer distributes these Tasks and LRPs to Cells through an Auction. The Diego Brain communicates with Diego Cells using the SSL/TLS protocol.

  5. Once the Auctioneer assigns a Task or LRP to a Cell, an in-process Executor creates a Garden container in the Cell. The Task or LRP runs in the container.

  6. The BBS tracks desired LRPs, running LRP instances, and in-flight Tasks. It also periodically analyzes this information and corrects discrepancies to ensure consistency between ActualLRP and DesiredLRP counts.

  7. The Metron Agent, part of the Cell, forwards application logs, errors, and metrics to the Cloud Foundry Loggregator. For more information, see the Application Logging in Cloud Foundry topic.
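
As a rough illustration of steps 1 through 3, the sketch below translates staging and running requests into Tasks and LRPs and records them in an in-memory stand-in for the BBS, which queues them for auction. The request dictionary shape, the `InMemoryBBS` class, and the `translate` function are all hypothetical names chosen for this example; they are not Diego APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    guid: str


@dataclass
class DesiredLRP:
    guid: str
    instances: int


@dataclass
class InMemoryBBS:
    """Hypothetical stand-in for the BBS: records work and queues it for auction."""
    tasks: dict = field(default_factory=dict)
    desired_lrps: dict = field(default_factory=dict)
    auction_queue: list = field(default_factory=list)

    def submit(self, work):
        if isinstance(work, Task):
            self.tasks[work.guid] = work
            self.auction_queue.append(("task", work.guid))
        else:
            self.desired_lrps[work.guid] = work
            # One auction entry per requested instance index.
            for index in range(work.instances):
                self.auction_queue.append(("lrp", work.guid, index))


def translate(request):
    """Turn a request (a plain dict in this sketch) into a Task or a DesiredLRP."""
    if request["type"] == "stage":
        return Task(guid="staging-" + request["app"])
    if request["type"] == "run":
        return DesiredLRP(guid=request["app"], instances=request.get("instances", 1))
    raise ValueError("unknown request type: " + str(request["type"]))
```

Submitting one staging request and one run request for two instances would leave three entries in `auction_queue`: one Task and two LRP instances.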

Diego Components

Diego components run and monitor Tasks and LRPs.

Diego Brain

Diego Brain components distribute Tasks and LRPs to Diego Cells, and correct discrepancies between ActualLRP and DesiredLRP counts to ensure fault-tolerance and long-term consistency.

The Diego Brain consists of the following:

Auctioneer

  • Uses the auction package to run Diego Auctions for Tasks and LRPs

  • Communicates with Cell Reps over SSL/TLS

  • Maintains a lock in the BBS that restricts auctions to one Auctioneer at a time

Refer to the Auctioneer repository on GitHub for more information.
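
To make the idea of an auction concrete, here is a minimal sketch in which each Cell reports its free memory and each piece of work goes to the Cell with the most free memory that can fit it. This placement rule is an assumption made for illustration; this topic does not describe how the real auction package scores Cells, and `CellState`, `Work`, and `run_auction` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CellState:
    cell_id: str
    free_memory_mb: int
    running: int = 0


@dataclass
class Work:
    guid: str
    memory_mb: int


def run_auction(cells, batch):
    """Place each work item on the fitting Cell with the most free memory.

    Returns (placements, unplaced), where placements maps work guid to cell id.
    Ties between Cells are broken by cell id so the result is deterministic.
    """
    placements = {}
    unplaced = []
    # Place the largest items first so they are not starved by small ones.
    for work in sorted(batch, key=lambda w: w.memory_mb, reverse=True):
        candidates = [c for c in cells if c.free_memory_mb >= work.memory_mb]
        if not candidates:
            unplaced.append(work.guid)
            continue
        winner = min(candidates, key=lambda c: (-c.free_memory_mb, c.cell_id))
        winner.free_memory_mb -= work.memory_mb
        winner.running += 1
        placements[work.guid] = winner.cell_id
    return placements, unplaced
```

Items that fit on no Cell are returned in `unplaced` rather than dropped, so a caller could retry them in a later auction.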

CC-Uploader

  • Mediates uploads from the Executor to the Cloud Controller

  • Translates simple HTTP POST requests from the Executor into complex multipart-form uploads for the Cloud Controller

Refer to the CC-Uploader repository on GitHub for more information.
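
The translation from a simple POST body into a multipart-form upload can be sketched as follows. The function builds a single-file `multipart/form-data` body by hand using only the standard library. The field name, file name, and function name are hypothetical and are not taken from the CC-Uploader source.

```python
import uuid


def to_multipart(field_name, filename, payload, boundary=None):
    """Wrap raw bytes in a single-part multipart/form-data body.

    Returns (headers, body). The boundary is random unless one is supplied.
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        "--" + boundary + "\r\n"
        'Content-Disposition: form-data; name="' + field_name + '"; '
        'filename="' + filename + '"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("ascii")
    tail = ("\r\n--" + boundary + "--\r\n").encode("ascii")
    body = head + payload + tail
    headers = {
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "Content-Length": str(len(body)),
    }
    return headers, body
```

A caller would pass the bytes received from the simple POST as `payload` and send the returned headers and body on to the upstream endpoint.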

File Server

  • This “blobstore” serves static assets that can include general-purpose App Lifecycle binaries and application-specific droplets and build artifacts.

Refer to the File Server repository on GitHub for more information.

SSH Proxy

  • Brokers connections between SSH clients and SSH servers running inside instance containers

Refer to Understanding Application SSH, Application SSH Overview, or the Diego SSH GitHub repository for more information.

TPS Watcher

  • Provides the Cloud Controller with information about currently running LRPs to respond to cf apps and cf app APP_NAME requests

  • Monitors ActualLRP activity for crashes and reports them to the Cloud Controller

Refer to the TPS repository on GitHub for more information.

TCP Route-emitter

  • Monitors DesiredLRP and ActualLRP states, emitting TCP route registration and unregistration messages to the Cloud Foundry routing API when it detects changes

  • Periodically emits TCP routes to the Cloud Foundry routing API

Nsync

  • Listens for app requests to update the DesiredLRPs count and updates DesiredLRPs through the BBS

  • Periodically polls the Cloud Controller for each app to ensure that Diego maintains accurate DesiredLRPs counts

Refer to the Nsync repository on GitHub for more information.

Stager

  • Translates staging requests from the Cloud Controller into generic Tasks and LRPs

  • Sends a response to the Cloud Controller when a Task completes

Refer to the Stager repository on GitHub for more information.

Diego Cell

Diego Cell components manage and maintain Tasks and LRPs.

The Diego Cell consists of the following:

Rep

  • Represents a Cell in Diego Auctions for Tasks and LRPs

  • Mediates all communication between the Cell and the BBS

  • Ensures synchronization between the set of Tasks and LRPs in the BBS and the containers present on the Cell

  • Maintains the presence of the Cell in the BBS

  • Runs Tasks and LRPs by asking the in-process Executor to create a container and run RunAction recipes in it

Refer to the Rep repository on GitHub for more information.

Executor

  • Runs as a logical process inside the Rep

  • Implements the generic Executor actions detailed in the API documentation

  • Streams STDOUT and STDERR to the Metron agent running on the Cell

Refer to the Executor repository on GitHub for more information.

Garden

  • Provides a platform-independent server and clients to manage Garden containers

  • Defines the Garden-runC interface for container implementation

See the Garden topic or the Garden repository on GitHub for more information.

Metron Agent

  • Forwards application logs, errors, and application and Diego metrics to the Loggregator Doppler component

Refer to the Metron repository on GitHub for more information.

Route-emitter

  • Monitors DesiredLRP and ActualLRP states, emitting route registration and unregistration messages to the Cloud Foundry Gorouter when it detects changes

  • Periodically emits the entire routing table to the Cloud Foundry Gorouter

Refer to the Route-Emitter repository on GitHub for more information.
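
The two behaviors above, emitting changes and periodically emitting the whole table, can both be expressed as a diff between routing tables. The sketch below is an illustration only: the data shapes, message tuples, and function names are assumptions and do not reflect the Route-emitter's actual message format.

```python
def routing_table(desired_routes, actual_instances):
    """Build {hostname: set of backend addresses}.

    desired_routes: {process_guid: [hostname, ...]}
    actual_instances: [(process_guid, "ip:port"), ...]
    """
    table = {}
    for guid, hostnames in desired_routes.items():
        backends = {addr for (g, addr) in actual_instances if g == guid}
        for host in hostnames:
            table.setdefault(host, set()).update(backends)
    return table


def diff_messages(old, new):
    """Return register/unregister messages that turn table `old` into `new`."""
    messages = []
    for host in sorted(set(old) | set(new)):
        before = old.get(host, set())
        after = new.get(host, set())
        for addr in sorted(after - before):
            messages.append(("register", host, addr))
        for addr in sorted(before - after):
            messages.append(("unregister", host, addr))
    return messages
```

Diffing against an empty table, as in `diff_messages({}, table)`, yields a registration for every route, which corresponds to emitting the entire routing table.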

Database VMs

The Diego database VM consists of the following components.

Diego Bulletin Board System

  • Maintains a real-time representation of the state of the Diego cluster, including all desired LRPs, running LRP instances, and in-flight Tasks

  • Provides an RPC-style API over HTTP to Diego Core components and external clients, including the SSH Proxy and Route Emitter

  • Ensures consistency and fault tolerance for Tasks and LRPs by comparing desired state (stored in the database) with actual state (from running instances)

  • Acts to keep DesiredLRP count and ActualLRP count synchronized in the following ways:

    • If the DesiredLRP count exceeds the ActualLRP count, requests a start auction from the Auctioneer
    • If the ActualLRP count exceeds the DesiredLRP count, sends a stop message to the Rep on the Cell hosting an instance
  • Monitors for potentially missed messages, resending them if necessary

Refer to the Bulletin Board System repository on GitHub for more information.
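
The synchronization rules above can be sketched as a single comparison pass. This is a simplified model, assuming desired state is a count per LRP and actual state is a list of (Cell, instance index) pairs. The `converge` function and its return shapes are hypothetical, and the choice to stop the highest-indexed extras first is an assumption made for this example.

```python
def converge(desired, actual):
    """Compare desired and actual LRP counts.

    desired: {process_guid: desired instance count}
    actual:  {process_guid: [(cell_id, index), ...]}
    Returns (start_requests, stop_messages):
      start_requests: [(process_guid, number of instances to auction)]
      stop_messages:  [(cell_id, process_guid, index)]
    """
    start_requests = []
    stop_messages = []
    for guid in sorted(set(desired) | set(actual)):
        want = desired.get(guid, 0)
        instances = sorted(actual.get(guid, []), key=lambda inst: inst[1])
        have = len(instances)
        if want > have:
            # Too few running instances: ask for a start auction.
            start_requests.append((guid, want - have))
        elif have > want:
            # Too many: stop the extras on the Cells that host them.
            for cell_id, index in instances[want:]:
                stop_messages.append((cell_id, guid, index))
    return start_requests, stop_messages
```

An LRP that appears only in the actual state is treated as having a desired count of zero, so all of its instances receive stop messages.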

MySQL

  • Provides a consistent key-value data store to Diego

Locket

  • Provides a consistent key-value store for maintenance of distributed locks and component presence
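
A single-holder lock with a time-to-live is one common way to implement distributed locks and presence on top of a key-value store. The sketch below shows the idea in memory. It does not represent Locket's actual API: `LockStore` and its methods are hypothetical, and the caller supplies the current time explicitly to keep the example deterministic.

```python
class LockStore:
    """In-memory sketch of TTL-based locks keyed by name."""

    def __init__(self):
        self._locks = {}  # key -> (owner, expires_at)

    def acquire(self, key, owner, ttl, now):
        """Take or renew the lock; fail if another owner holds an unexpired lock."""
        holder = self._locks.get(key)
        if holder and holder[0] != owner and holder[1] > now:
            return False
        self._locks[key] = (owner, now + ttl)
        return True

    def release(self, key, owner):
        """Release the lock if `owner` holds it."""
        holder = self._locks.get(key)
        if holder and holder[0] == owner:
            del self._locks[key]
            return True
        return False
```

A holder that stops renewing before its TTL expires loses the lock, which lets another instance take over without manual cleanup.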

Go MySQL Driver

The Diego BBS stores data in MySQL. Diego uses the Go MySQL Driver to communicate with MySQL.

Refer to the Go MySQL Driver repository on GitHub for more information.

Consul

  • Provides dynamic service registration and load balancing through DNS resolution

Refer to the Consul repository on GitHub for more information.

Platform-specific Components

Garden Backends

Garden contains a set of interfaces that each platform-specific backend must implement. See the Garden topic or the Garden repository on GitHub for more information.

App Lifecycle Binaries

The following three platform-specific binaries deploy applications and govern their lifecycle:

  • The Builder, which stages a CF application. The Builder runs as a Task on every staging request. It performs static analysis on the application code and does any necessary pre-processing before the application is first run.

  • The Launcher, which runs a CF application. The Launcher is set as the Action on the DesiredLRP for the application. It executes the start command with the correct system context, including working directory and environment variables.

  • The Healthcheck, which performs a status check on a running CF application from inside the container. The Healthcheck is set as the Monitor action on the DesiredLRP for the application.
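
The relationship between the Launcher, the Healthcheck, and a DesiredLRP can be pictured with a small data model. This is a sketch under stated assumptions: the field names, binary paths, working directory, and the `--port` flag are hypothetical placeholders, not the real DesiredLRP schema or lifecycle binary interface.

```python
from dataclasses import dataclass, field


@dataclass
class RunAction:
    path: str
    args: list = field(default_factory=list)
    env: dict = field(default_factory=dict)
    working_dir: str = "/"


@dataclass
class DesiredLRP:
    process_guid: str
    instances: int
    action: RunAction   # what to run: the Launcher
    monitor: RunAction  # how to check it: the Healthcheck


def build_desired_lrp(process_guid, start_command, instances, env, port=8080):
    """Assemble a DesiredLRP whose Action is the Launcher and whose Monitor is the Healthcheck."""
    launcher = RunAction(
        path="/lifecycle/launcher",     # hypothetical path
        args=[start_command],
        env=dict(env, PORT=str(port)),  # system context: environment variables
        working_dir="/app",             # hypothetical working directory
    )
    healthcheck = RunAction(
        path="/lifecycle/healthcheck",  # hypothetical path
        args=["--port", str(port)],     # hypothetical flag
    )
    return DesiredLRP(process_guid, instances, launcher, healthcheck)
```

The Builder is not modeled here because it runs as a Task during staging rather than as part of the DesiredLRP.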

Current Implementations
