Density Transform
The density transform generates a new data stream of uniformly-spaced samples drawn from a one-dimensional probability density function (pdf) or cumulative distribution function (cdf). This transform is useful for representing probability distributions and generating continuous distributions from discrete samples using kernel density estimation.
Transform Parameters
Property | Type | Description |
---|---|---|
distribution | Distribution | Required. An object describing the distribution type and parameters. See the distribution reference for more. |
extent | Number[ ] | A [min, max] domain from which to sample the distribution. This argument is required in most cases, but can be omitted in the case of distributions (namely, kde) that can deduce their own extent. |
method | String | The type of distribution to generate. One of pdf (default) or cdf. |
minsteps | Number | The minimum number of samples (default 25) to take along the extent domain for plotting the density. Requires version ≥ 5.4. |
maxsteps | Number | The maximum number of samples (default 200) to take along the extent domain for plotting the density. Requires version ≥ 5.4. |
steps | Number | The exact number of samples to take along the extent domain for plotting the density. If specified, overrides both minsteps and maxsteps to set an exact number of uniform samples. Potentially useful in conjunction with a fixed extent to ensure consistent sample points for stacked densities. |
as | String[ ] | The output fields for the sample value and associated probability. The default is ["value", "density"]. |
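As an illustration of how these parameters combine, the following sketch samples the cumulative distribution function of a standard normal distribution at exactly 50 points and renames the output fields. The extent, the step count, and the field names x and cumulative are arbitrary choices for this example, not defaults.

```json
{
  "type": "density",
  "extent": [-3, 3],
  "method": "cdf",
  "steps": 50,
  "as": ["x", "cumulative"],
  "distribution": {
    "function": "normal"
  }
}
```

Because mean and stdev are omitted, the normal distribution falls back to its defaults of 0 and 1.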
Distribution Reference
# normal
Represents a normal (Gaussian) probability distribution with a specified mean and standard deviation stdev.
Property | Type | Description |
---|---|---|
function | String | The value "normal". |
mean | Number | The mean of the distribution (default 0). |
stdev | Number | The standard deviation of the distribution (default 1). |
# uniform
Represents a continuous uniform probability distribution over the interval [min, max).
Property | Type | Description |
---|---|---|
function | String | The value "uniform". |
min | Number | The minimum value (default 0). |
max | Number | The maximum value (default 1). |
# kde
Represents a kernel density estimate for a set of numerical values. This method uses a Gaussian kernel to estimate a smoothed, continuous probability distribution.
Property | Type | Description |
---|---|---|
function | String | The value "kde". |
from | Data | The name of the data set to analyze. |
field | Field | The data field containing the values to model. |
bandwidth | Number | An optional parameter that determines the width of the Gaussian kernel. If set to 0 (the default), the bandwidth value will be automatically estimated from the input data. |
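A minimal sketch of a kde distribution with an explicit bandwidth, assuming a data set named table with a numeric field named value (both names are hypothetical). The bandwidth of 2.5, the extent, and the step count are illustrative values only and would need tuning for real data.

```json
{
  "type": "density",
  "extent": [0, 100],
  "steps": 100,
  "distribution": {
    "function": "kde",
    "from": "table",
    "field": "value",
    "bandwidth": 2.5
  }
}
```

Omitting bandwidth (or setting it to 0) falls back to automatic estimation, and omitting extent lets the kde distribution deduce its own domain. Fixing both extent and steps, as here, keeps the sample points identical across several densities, which can help when stacking them.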
# mixture
Represents a (weighted) mixture of probability distributions. The distributions argument should be an array of distribution objects. The optional weights array provides proportional numerical weights for each distribution.
Property | Type | Description |
---|---|---|
function | String | The value "mixture". |
distributions | Distribution[ ] | An array of distribution definition objects. |
weights | Number[ ] | An optional array of weights for each distribution. If provided, the values will be normalized to ensure that weights sum to 1. Any unspecified weight values default to 1 prior to normalization. |
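The following sketch shows one way a weighted mixture might be specified, combining two normal distributions. The component parameters, the extent, and the weights are made-up values chosen only to illustrate the structure.

```json
{
  "type": "density",
  "extent": [0, 10],
  "distribution": {
    "function": "mixture",
    "distributions": [
      {"function": "normal", "mean": 3, "stdev": 1},
      {"function": "normal", "mean": 7, "stdev": 0.5}
    ],
    "weights": [3, 1]
  }
}
```

Here the weights [3, 1] are normalized to 0.75 and 0.25, so the first component contributes three quarters of the total probability mass.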
Usage
{
  "type": "density",
  "extent": [0, 10],
  "distribution": {
    "function": "normal",
    "mean": 5,
    "stdev": 2
  }
}
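Generates a data stream of data objects drawn from a normal distribution with mean 5 and standard deviation 2, sampled along the domain [0, 10]. Because steps is not specified, the number of samples falls between minsteps (default 25) and maxsteps (default 200).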
{
  "type": "density",
  "steps": 200,
  "distribution": {
    "function": "kde",
    "from": "table",
    "field": "value"
  }
}
Performs kernel density estimation (with automatically-selected bandwidth) over the numbers in the field value in the data stream named table. Generates a data stream by drawing 200 uniformly-spaced samples between the minimum and maximum observed data values.