Regression


The regression transform fits two-dimensional regression models to smooth and predict data. This transform can fit multiple models for input data (one per group) and generates new data objects that represent points for summary trend lines. Alternatively, this transform can be used to generate a set of objects containing regression model parameters, one per group.
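As a rough sketch of the second mode described above, the fragment below asks for model parameters instead of trend line points. It assumes the flag is named `params`, which this page's property table does not list, so check the name against the Vega-Lite version in use. The field names "y" and "x" are placeholders.

```json
{
  "regression": "y",
  "on": "x",
  "params": true // assumed flag name; not listed in the property table on this page
}
```

With this flag set, each output object is expected to describe one fitted model (for example a coefficient array and an R-squared value) rather than a point on the line.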

This transform supports parametric models for the following functional forms:

  • linear (linear): y = a + b * x
  • logarithmic (log): y = a + b * log(x)
  • exponential (exp): y = a * e^(b * x)
  • power (pow): y = a * x^b
  • quadratic (quad): y = a + b * x + c * x^2
  • polynomial (poly): y = a + b * x + … + k * x^order

All models are fit using ordinary least squares. For non-parametric locally weighted regression, see the loess transform.

// Any View Specification
{
  ...
  "transform": [
    {"regression": ...} // Regression Transform
     ...
  ],
  ...
}

Regression Transform Definition

Property Type Description
regression FieldName

Required. The data field of the dependent variable to predict.

on FieldName

Required. The data field of the independent variable to use as a predictor.

groupby FieldName[]

The data fields to group by. If not specified, a single group containing all data objects will be used.

method String

The functional form of the regression model. One of "linear", "log", "exp", "pow", "quad", or "poly".

Default value: "linear"

order Number

The polynomial order (number of coefficients) for the ‘poly’ method.

Default value: 3

extent Any[]

A [min, max] domain over the independent (x) field for the starting and ending points of the generated trend line.

as Any[]

The output field names for the smoothed points generated by the regression transform.

Default value: The field names of the input x and y values.
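The optional properties above can be combined in a single transform definition. The following is a minimal sketch, assuming input fields named "x" and "y"; the extent values and the output names "u" and "v" are purely illustrative.

```json
{
  "regression": "y",       // dependent variable (placeholder field name)
  "on": "x",               // independent variable (placeholder field name)
  "method": "poly",        // fit a polynomial instead of the default "linear"
  "order": 3,              // same as the default, written out for clarity
  "extent": [0, 10],       // illustrative [min, max] range for the trend line
  "as": ["u", "v"]         // illustrative output names for the x and y values
}
```

Without "as", the generated points reuse the input field names, which is usually convenient when the trend line is layered over the original data with the same encodings.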

Usage

{"regression": "y", "on": "x"}

Generate a linear regression trend line that models field "y" as a function of "x". The output data stream can then be visualized with a line mark, and takes the form:

[
  {"x": 1, "y": 2.3},
  {"x": 2, "y": 2.7},
  {"x": 3, "y": 3.0},
  ...
]
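To show how such an output stream might be drawn, here is a small layered specification that plots the raw points and overlays the fitted line. It is a sketch under stated assumptions: the inline data values are made up for illustration, and the schema URL assumes Vega-Lite version 5.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: data values and field names are made up.",
  "data": {
    "values": [
      {"x": 1, "y": 2.1},
      {"x": 2, "y": 2.9},
      {"x": 3, "y": 2.8},
      {"x": 4, "y": 3.6},
      {"x": 5, "y": 3.9}
    ]
  },
  "layer": [
    {
      "mark": "point",
      "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"}
      }
    },
    {
      "transform": [{"regression": "y", "on": "x"}],
      "mark": "line",
      "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"}
      }
    }
  ]
}
```

Placing the transform inside the second layer keeps the first layer's data untouched, so the points and the trend line share the same source but only the line layer sees the regression output.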

If the groupby parameter is provided, a separate trend line is fit for each group, and the output records additionally include all groupby field values.
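A grouped fit could look like the sketch below, which assumes a hypothetical nominal field named "category" alongside "x" and "y". Because the groupby values are carried into the output records, the same field can drive the color encoding of the line mark.

```json
{
  "transform": [
    // "category" is a hypothetical grouping field used only for illustration
    {"regression": "y", "on": "x", "groupby": ["category"]}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "x", "type": "quantitative"},
    "y": {"field": "y", "type": "quantitative"},
    "color": {"field": "category", "type": "nominal"}
  }
}
```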

Example