Using App Health Checks

This topic describes how to configure health checks for your Pivotal Web Services (PWS) apps.

Overview

An app health check is a monitoring process that continually checks the status of a running app.

Developers can configure a health check for an app using the Cloud Foundry Command Line Interface (cf CLI) or by specifying the health-check-http-endpoint and health-check-type fields in an app manifest.

To configure a health check using the cf CLI, see Configure Health Checks when Creating or Updating and Configure Health Checks for an Existing App below. For more information about using an app manifest to configure a health check, see health-check-http-endpoint and health-check-type in Deploying with App Manifests.

App health checks function as part of the app lifecycle managed by the Diego architecture. For more information, see Diego Components and Architecture.

Configure Health Checks when Creating or Updating

To configure a health check while creating or updating an app, run:

cf push APP-NAME -u HEALTH-CHECK-TYPE -t HEALTH-CHECK-TIMEOUT

Where:

  • APP-NAME is the name of your app.

  • HEALTH-CHECK-TYPE is the type of health check that you want to configure. Valid health check types are port, process, and http. For more information, see Health Check Types below.

  • HEALTH-CHECK-TIMEOUT is the amount of time allowed to elapse between starting an app and the first healthy response. For more information, see Health Check Timeouts below.

For more information about the cf push command, see push in the Cloud Foundry CLI Reference Guide.

Note: The health check configuration that you provide with cf push overrides any configuration in the app manifest.
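As a concrete illustration of the command form above, the following Python sketch assembles a cf push invocation and rejects values that fall outside the limits stated in this topic. The helper name build_push_command and the app name my-app are hypothetical and are not part of the cf CLI. The 180-second ceiling is the PWS maximum given under Health Check Timeouts.

```python
from typing import List, Optional

# Values taken from this topic: valid types and the PWS maximum timeout.
VALID_HEALTH_CHECK_TYPES = ("port", "process", "http")
MAX_TIMEOUT_SECONDS = 180


def build_push_command(
    app_name: str,
    health_check_type: str,
    timeout_seconds: Optional[int] = None,
) -> List[str]:
    """Return the argument list for `cf push` with health check flags.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    if health_check_type not in VALID_HEALTH_CHECK_TYPES:
        raise ValueError(f"unknown health check type: {health_check_type!r}")

    command = ["cf", "push", app_name, "-u", health_check_type]

    # The timeout is optional; validate it only when it is supplied.
    if timeout_seconds is not None:
        if not 0 < timeout_seconds <= MAX_TIMEOUT_SECONDS:
            raise ValueError(
                f"timeout must be between 1 and {MAX_TIMEOUT_SECONDS} seconds"
            )
        command += ["-t", str(timeout_seconds)]

    return command
```

For example, build_push_command("my-app", "http", 120) yields the arguments for cf push my-app -u http -t 120.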

Configure Health Checks for an Existing App

To configure a health check for an existing app or to add a custom HTTP endpoint, run:

cf set-health-check APP-NAME HEALTH-CHECK-TYPE --endpoint CUSTOM-HTTP-ENDPOINT

Where:

  • APP-NAME is the name of your app.

  • HEALTH-CHECK-TYPE is the type of health check that you want to configure. Valid health check types are port, process, and http. For more information, see Health Check Types below.

  • CUSTOM-HTTP-ENDPOINT is the custom HTTP endpoint that you want to add to the health check. An http health check defaults to using / as its endpoint unless you specify a custom endpoint. For more information, see Health Check HTTP Endpoints below.

For more information about the cf set-health-check command, see set-health-check in the Cloud Foundry CLI Reference Guide.

Note: After you set the health check configuration of a deployed app with the cf set-health-check command, you must restart the app for the change to take effect.

Note: You can also change the health check invocation timeout for an app. With cf CLI v6, use cf v3-set-health-check. With cf CLI v7, use cf set-health-check. Changing this option also requires restarting the app. For more information, see Apps (experimental) or Apps in the Cloud Foundry CLI Reference Guide.

Understand Health Checks

Health Check Lifecycle

The following table describes how app health checks work.

Stage | Description
1 | The app developer deploys an app to PWS.
2 | When deploying the app, the developer specifies a health check type for the app and, optionally, a timeout. If the developer does not specify a health check type, then the monitoring process defaults to a port health check.
3 | Cloud Controller stages, starts, and runs the app.
4 | Based on the type specified for the app, Cloud Controller configures a health check that runs periodically for each app instance.
5 | When Diego starts an app instance, the app health check runs every two seconds until a response indicates that the app instance is healthy or until the health check timeout elapses. The 2-second health check interval is not configurable.
6 | When an app instance becomes healthy, its route is advertised, if applicable. From then on, health checks run every 30 seconds. The 30-second health check interval is not configurable.
7 | If a previously healthy app instance fails a health check, Diego considers that particular instance to be unhealthy. As a result, Diego stops and deletes the app instance, then schedules a new app instance. Diego reports the stoppage and deletion of the app instance to the Cloud Controller as a crash event.
8 | When an app instance crashes, Diego immediately attempts to restart the app instance several times. After three failed restarts, PWS waits 30 seconds before attempting another restart. The wait time doubles with each restart until the ninth restart, and remains at that duration until the 200th restart. After the 200th restart, PWS stops trying to restart the app instance.
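The restart behavior in stage 8 can be worked through as a short calculation. The following Python sketch is illustrative only and is not Diego's implementation. The function name restart_wait_seconds is hypothetical, and the code assumes one reading of the description: restarts 1 through 3 happen immediately, restart 4 waits 30 seconds, and the wait doubles on each restart through the ninth.

```python
from typing import Optional

IMMEDIATE_RESTARTS = 3      # assumed: the first three restarts have no wait
BASE_WAIT_SECONDS = 30      # wait before the first delayed restart
LAST_DOUBLING_RESTART = 9   # the wait stops growing at this restart
MAX_RESTARTS = 200          # no restart is attempted after this one


def restart_wait_seconds(restart_number: int) -> Optional[int]:
    """Return the assumed wait before a given restart attempt.

    Returns None when no further restart is attempted.
    """
    if restart_number < 1:
        raise ValueError("restart_number starts at 1")
    if restart_number > MAX_RESTARTS:
        return None
    if restart_number <= IMMEDIATE_RESTARTS:
        return 0

    # Restart 4 has zero doublings; the count is capped at restart 9.
    capped = min(restart_number, LAST_DOUBLING_RESTART)
    doublings = capped - (IMMEDIATE_RESTARTS + 1)
    return BASE_WAIT_SECONDS * 2 ** doublings
```

Under these assumptions, the wait grows from 30 seconds at the fourth restart to 960 seconds at the ninth, and stays there through the 200th restart.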

Health Check Types

The following table describes the types of health checks available for apps and recommended circumstances in which to use them:

Health Check Type | Recommended Use Case | Explanation
http | The app can provide an HTTP 200 response. | The http health check performs a GET request to the configured HTTP endpoint on the app’s default port. When the health check receives an HTTP 200 response, the app is declared healthy. VMware recommends that you use the http health check type whenever possible. A healthy HTTP response ensures that the web app is ready to serve HTTP requests. The configured endpoint must respond within one second to be considered healthy.
port | The app can receive TCP connections, including HTTP web apps. | A health check makes a TCP connection to the port or ports configured for the app. For apps with multiple ports, a health check monitors each port. If you do not specify a health check type for your app, then the monitoring process defaults to a port health check. The TCP connection must be established within one second to be considered healthy.
process | The app does not support TCP connections. An example of such an app is a worker. | For a process health check, Diego ensures that any process declared for the app stays running. If the process exits, Diego stops and deletes the app instance.

Warning: To prevent false negatives with the http health check type, use a dedicated endpoint for health checks where response time and result do not depend on business logic.
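To make the port health check concrete, the following Python sketch shows what a TCP connection attempt with a one-second limit looks like. It models the behavior described in the table rather than reproducing Diego's own check. The function names port_is_healthy and all_ports_healthy are hypothetical.

```python
import socket
from typing import Iterable


def port_is_healthy(host: str, port: int, timeout_seconds: float = 1.0) -> bool:
    """Return True if a TCP connection is established within the timeout."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False


def all_ports_healthy(host: str, ports: Iterable[int]) -> bool:
    """Check every configured port; one failing port fails the check."""
    return all(port_is_healthy(host, port) for port in ports)
```

Note that this check only proves that something accepts connections on the port. It says nothing about whether the app can serve requests, which is why the http type is preferred for web apps.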

Health Check Timeouts

The value configured for the health check timeout is the amount of time allowed to elapse between starting an app and the first healthy response from the app. If the health check does not receive a healthy response within the configured timeout, then the app is declared unhealthy.

In Pivotal Web Services, the default timeout is 60 seconds and the maximum configurable timeout is 180 seconds.
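The interaction between the 2-second startup interval and the health check timeout can be sketched as a polling loop. The Python code below is a simplified model under stated assumptions, not the platform's implementation. The function name wait_until_healthy and its parameters are hypothetical, and the defaults mirror the 60-second default timeout and 2-second interval given in this topic. The clock and sleep parameters exist only so that the loop can be exercised without real delays.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout_seconds: float = 60.0,
    interval_seconds: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it succeeds or the timeout elapses.

    Returns True on the first healthy response, False if the timeout
    passes without one.
    """
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True
        # Stop if the next attempt would start after the deadline.
        if clock() + interval_seconds > deadline:
            return False
        sleep(interval_seconds)
```

An app that needs longer than the default to start would never return a healthy response inside this loop, which is the situation that a larger timeout value, up to the 180-second maximum, is meant to address.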

Health Check HTTP Endpoints

The --endpoint flag of the cf set-health-check command applies only to the http health check type. It specifies the path portion of a URI that the app must serve and that must return HTTP 200 when the app is healthy.

An http health check only checks the default port of the app.

Note: For HTTP apps, VMware recommends setting the health check type to http instead of a simple port check.
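As an illustration of a dedicated health endpoint, the following Python sketch serves a fixed HTTP 200 response on one path and includes a small probe that performs the same kind of GET request an http health check makes. Everything here is an assumption for illustration: the /health path, the HealthHandler class, and the serve and http_is_healthy functions are hypothetical names, and a real app would add this route to its existing web framework rather than run a separate server.

```python
import http.client
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

HEALTH_PATH = "/health"  # hypothetical path; the default endpoint is /


class HealthHandler(BaseHTTPRequestHandler):
    """Answer the health path with 200 and nothing that depends on business logic."""

    def do_GET(self) -> None:
        if self.path == HEALTH_PATH:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args) -> None:
        pass  # suppress per-request logging in this sketch


def serve(port: int) -> None:
    """Serve the health endpoint on all interfaces until interrupted."""
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()


def http_is_healthy(
    host: str,
    port: int,
    path: str = HEALTH_PATH,
    timeout_seconds: float = 1.0,
) -> bool:
    """Return True only if GET `path` answers with HTTP 200.

    The timeout applies to each blocking socket operation, which
    approximates the one-second limit described in this topic.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

With an endpoint like this in place, cf set-health-check APP-NAME http --endpoint /health would point the health check at it, followed by an app restart.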