Density

The density transform performs one-dimensional kernel density estimation over an input data stream and generates a new data stream of samples of the estimated densities.

// Any View Specification
{
  ...
  "transform": [
    {"density": ...} // Density Transform
    ...
  ],
  ...
}

Density Transform Definition

Property Type Description
density String

Required. The data field for which to perform density estimation.

groupby String[]

The data fields to group by. If not specified, a single group containing all data objects will be used.

cumulative Boolean

A boolean flag indicating whether to produce density estimates (false) or cumulative density estimates (true).

Default value: false

counts Boolean

A boolean flag indicating whether the output values should be probability estimates (false) or smoothed counts (true).

Default value: false

bandwidth Number

The bandwidth (standard deviation) of the Gaussian kernel. If unspecified or set to zero, the bandwidth value is automatically estimated from the input data using Scott’s rule.

extent Number[]

A [min, max] domain from which to sample the distribution. If unspecified, the extent will be determined by the observed minimum and maximum values of the density value field.

minsteps Number

The minimum number of samples to take along the extent domain for plotting the density.

Default value: 25

maxsteps Number

The maximum number of samples to take along the extent domain for plotting the density.

Default value: 200

resolve String

Indicates how parameters for multiple densities should be resolved. If "independent", each density may have its own domain extent and its own dynamically determined number of curve sample steps. If "shared", the density transform ensures that all densities are defined over a shared domain and the same curve steps, which enables stacking.

Default value: "shared"

steps Number

The exact number of samples to take along the extent domain for plotting the density. If specified, overrides both minsteps and maxsteps to set an exact number of uniform samples. Potentially useful in conjunction with a fixed extent to ensure consistent sample points for stacked densities.

as String[]

The output fields for the sample value and corresponding density estimate.

Default value: ["value", "density"]
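Several of the properties above can be combined in one transform. The following is a minimal sketch, not a canonical example: the inline values and the "measure" field are placeholders, and the bandwidth, extent, and steps values are arbitrary choices for illustration. It requests a cumulative estimate and renames the output fields with as.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: the inline values and the 'measure' field are placeholders, not real data.",
  "data": {
    "values": [
      {"measure": 1.2}, {"measure": 2.5}, {"measure": 2.9},
      {"measure": 4.1}, {"measure": 4.4}, {"measure": 7.8}
    ]
  },
  "transform": [
    {
      "density": "measure",
      "cumulative": true,
      "bandwidth": 0.5,
      "extent": [0, 10],
      "steps": 50,
      "as": ["x", "cdf"]
    }
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "x", "type": "quantitative"},
    "y": {"field": "cdf", "type": "quantitative"}
  }
}
```

Because as is set, the encoding refers to "x" and "cdf" instead of the default "value" and "density" fields.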

Usage

{"density": "measure", "groupby": ["key"]}

Performs density estimation for the "measure" field, with separate estimations performed for each group of records with a distinct "key" field value. The output data is of the form:

[
  {"key": "a", "value": 1, "density": 0.02},
  ...
]

Example: Density Plot
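A basic density plot might be specified along the following lines. This is a sketch under assumptions: the dataset URL data/movies.json, the "IMDB Rating" field, and the bandwidth of 0.3 are illustrative choices, so substitute your own data source and field.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: the dataset URL, field name, and bandwidth are assumptions.",
  "data": {"url": "data/movies.json"},
  "width": 400,
  "height": 100,
  "transform": [
    {"density": "IMDB Rating", "bandwidth": 0.3}
  ],
  "mark": "area",
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "IMDB Rating"},
    "y": {"field": "density", "type": "quantitative"}
  }
}
```

The transform replaces the input records with sampled points, so the encoding uses the default output fields "value" and "density" and not the original field.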

Example: Stacked Density Estimates

To plot a stacked graph of estimates, use a shared extent and a fixed number of subdivision steps to ensure that the points for each area align well. In addition, setting counts to true multiplies the densities by the number of data points in each group, preserving proportional differences:
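A spec following this advice could look like the sketch below. The dataset URL data/penguins.json, the "Species" and "Body Mass (g)" fields, and the specific extent and steps values are assumptions chosen for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: the dataset URL, field names, extent, and steps are assumptions.",
  "data": {"url": "data/penguins.json"},
  "width": 400,
  "height": 80,
  "transform": [
    {
      "density": "Body Mass (g)",
      "groupby": ["Species"],
      "extent": [2500, 6500],
      "steps": 200,
      "counts": true
    }
  ],
  "mark": "area",
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "Body Mass (g)"},
    "y": {"field": "density", "type": "quantitative", "stack": "zero"},
    "color": {"field": "Species", "type": "nominal"}
  }
}
```

With a fixed extent and steps, every group is sampled at the same x positions, so the stacked areas line up point for point.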

Example: Faceted Density Estimates

Density estimates of body mass in grams for different penguin species:
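One way to express this faceted view is sketched below. The dataset URL data/penguins.json, the "Species" and "Body Mass (g)" fields, and the extent are assumptions for illustration and are not confirmed by this page.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: the dataset URL, field names, and extent are assumptions.",
  "data": {"url": "data/penguins.json"},
  "width": 400,
  "height": 80,
  "transform": [
    {
      "density": "Body Mass (g)",
      "groupby": ["Species"],
      "extent": [2500, 6500]
    }
  ],
  "mark": "area",
  "encoding": {
    "x": {"field": "value", "type": "quantitative", "title": "Body Mass (g)"},
    "y": {"field": "density", "type": "quantitative"},
    "row": {"field": "Species"}
  }
}
```

The groupby field is carried through to the output records, which is what allows it to be used as the row facet.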