KDE Transform

The `kde` transform (≥ 5.4) performs one-dimensional kernel density estimation over an input data stream and generates uniformly spaced samples of the estimated densities. Unlike the related `density` transform, it supports grouping via `groupby` and can scale the densities to convey either probabilities or smoothed counts.
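For readers who want the underlying definition, the standard Gaussian kernel density estimator can be written as below. This is the textbook form, not a statement about the transform's internals: the symbols $x_i$ (the $n$ values of `field` within one group) and $h$ (the `bandwidth`) are notation introduced here for illustration.

```latex
\hat{f}(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),
\qquad
K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}
```

Under this reading, a probability density integrates to 1 over the real line. Smoothed counts would then plausibly correspond to $n \hat{f}(x)$, which integrates to the group size $n$; that scaling is an assumption here.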

Transform Parameters

Property	Type	Description
field	Field	Required. The data field for which to perform density estimation.
groupby	Field[ ]	The data fields to group by. If not specified, a single group containing all data objects will be used.
cumulative	Boolean	A boolean flag indicating whether to produce density estimates (`false`, default) or cumulative density estimates (`true`).
counts	Boolean	A boolean flag indicating if the output values should be probability estimates (`false`, default) or smoothed counts (`true`).
bandwidth	Number	An optional parameter that determines the width of the Gaussian kernel. If set to `0` (the default), the bandwidth value is automatically estimated from the input data using Scott’s method.
extent	Number[ ]	A [min, max] domain from which to sample the distribution. If unspecified, the extent will be determined by the minimum and maximum values of the observed value field.
minsteps	Number	The minimum number of samples (default 25) to take along the extent domain for plotting the density.
maxsteps	Number	The maximum number of samples (default 200) to take along the extent domain for plotting the density.
resolve	String	Indicates how parameters for multiple densities should be resolved. If `"independent"` (the default), each density may have its own domain extent and dynamic number of curve sample steps. If `"shared"`, the KDE transform will ensure that all densities are defined over the same domain and the same curve sample steps, enabling stacking. (≥ 5.5)
steps	Number	The exact number of samples to take along the extent domain for plotting the density. If specified, overrides both `minsteps` and `maxsteps` to set an exact number of uniform samples. This can be useful in conjunction with a fixed `extent` to ensure consistent sample points for stacked densities.
as	String[ ]	The output fields for the sample value and the associated density estimate. The default is `["value", "density"]`.
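As a minimal sketch of how several of these parameters combine, the fragment below fixes the sampling domain and the number of samples, and sets an explicit bandwidth instead of relying on automatic estimation. The field name `value` and the numbers chosen for `bandwidth`, `extent`, and `steps` are hypothetical and should be adapted to the data at hand.

```json
{
  "type": "kde",
  "field": "value",
  "bandwidth": 2.5,
  "extent": [0, 100],
  "steps": 50,
  "as": ["value", "density"]
}
```

Because `steps` is given, `minsteps` and `maxsteps` have no effect here, and every density is sampled at the same 50 uniformly spaced points over the `[0, 100]` extent.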

Usage

The following example performs kernel density estimation (with an automatically selected bandwidth) over the numbers in the `value` field of the input data stream, computing a separate density estimate for each group defined by the `key` field:

{
  "type": "kde",
  "groupby": ["key"],
  "field": "value"
}
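A second sketch, assuming the same hypothetical `key` and `value` fields, shows one way grouped densities might be prepared for stacking. It sets `resolve` to `"shared"` (≥ 5.5) so that all groups are sampled at the same points, uses `counts` so that larger groups contribute proportionally more, and then passes the result to a `stack` transform. The choice to stack by sample value and sort by `key` is illustrative only.

```json
[
  {
    "type": "kde",
    "groupby": ["key"],
    "field": "value",
    "resolve": "shared",
    "counts": true,
    "as": ["value", "density"]
  },
  {
    "type": "stack",
    "groupby": ["value"],
    "field": "density",
    "sort": {"field": "key"}
  }
]
```

With the default `"independent"` resolution, each group may be sampled at different points, so grouping the stack by sample value would not line the curves up.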