Transforms

Transforms process a data stream to filter data, calculate new fields, or derive new data streams. Transforms are typically specified within the transform array of a data definition. In addition, transforms that do not filter or generate new data objects can be used within the transform array of a mark definition to specify post-encoding transforms.
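
As a rough sketch of the second case, a post-encoding transform is placed in the transform array of the mark itself rather than of a data set. The fragment below uses the wordcloud layout transform on a text mark. The data set name (words) and its fields (text, count) are assumptions made for illustration, and the fragment would need a matching data definition to run:

```json
{
  "marks": [
    {
      "type": "text",
      "from": { "data": "words" },
      "encode": {
        "enter": {
          "text": { "field": "text" },
          "align": { "value": "center" },
          "baseline": { "value": "alphabetic" }
        }
      },
      "transform": [
        {
          "type": "wordcloud",
          "size": [400, 200],
          "text": { "field": "text" },
          "fontSize": { "field": "datum.count" },
          "fontSizeRange": [12, 48]
        }
      ]
    }
  ]
}
```

Because a mark-level transform runs over the encoded mark items rather than the raw data objects, the original data fields are reached through the datum prefix (datum.count here).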

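The following example defines a data set named table and applies two transforms in order: a filter that keeps only values greater than 5, followed by a stacked layout (e.g., for a stacked bar chart). The data source itself is omitted for brevity: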

{
  "data": [
    {
      "name": "table",
      "transform": [
        { "type": "filter", "expr": "datum.value > 5" },
        { "type": "stack", "field": "value", "groupby": ["category"] }
      ]
    }
  ]
}
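
To make the effect of these two transforms concrete, the same definition can be sketched with inline data. The values below are made up for illustration and are not part of the original example:

```json
{
  "data": [
    {
      "name": "table",
      "values": [
        { "category": "A", "value": 8 },
        { "category": "A", "value": 3 },
        { "category": "A", "value": 10 },
        { "category": "B", "value": 6 }
      ],
      "transform": [
        { "type": "filter", "expr": "datum.value > 5" },
        { "type": "stack", "field": "value", "groupby": ["category"] }
      ]
    }
  ]
}
```

With these inputs, the filter drops the object whose value is 3. Assuming the stack transform's default output fields (y0 and y1) and default zero offset, the two remaining category A objects span 0 to 8 and 8 to 18, and the category B object spans 0 to 6.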

All transforms require a type property that specifies the name of the transform. Transforms that produce a value as a side effect (in particular, the bin, extent, and crossfilter transforms) can include a signal property that names a unique signal to which the transform’s state value is bound.
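
A minimal sketch of the signal property, assuming a hypothetical signal name (valueExtent) and made-up inline values: the extent transform binds its computed [min, max] array to the signal, and a later bin transform in the same pipeline reads that signal as its extent parameter.

```json
{
  "data": [
    {
      "name": "table",
      "values": [
        { "value": 2 },
        { "value": 7 },
        { "value": 11 }
      ],
      "transform": [
        { "type": "extent", "field": "value", "signal": "valueExtent" },
        {
          "type": "bin",
          "field": "value",
          "extent": { "signal": "valueExtent" },
          "maxbins": 10
        }
      ]
    }
  ]
}
```

For this input, valueExtent would hold the array [2, 11]. Once bound, the signal can be referenced elsewhere in the specification like any other signal.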

Basic Transforms

Transforms for processing streams of data objects.

aggregate - Group and summarize a data stream.
bin - Discretize numeric values into uniform bins.
collect - Collect and sort all data objects in a stream.
countpattern - Count the frequency of patterns in text strings.
cross - Perform a cross-product of a data stream with itself.
density - Generate values drawn from a probability distribution.
dotbin - Perform density binning for dot plot construction. ≥ 5.7
extent - Compute minimum and maximum values over a data stream.
filter - Filter a data stream using a predicate expression.
flatten - Map array-typed fields to data objects, one per array entry. ≥ 3.1
fold - Collapse selected data fields into key and value properties.
formula - Extend data objects with derived fields using a formula expression.
identifier - Assign unique key values to data objects.
impute - Perform imputation of missing values.
joinaggregate - Extend data objects with calculated aggregate values.
kde - Estimate smoothed densities for numeric values. ≥ 5.4
loess - Fit a smoothed trend line using local regression. ≥ 5.4
lookup - Extend data objects by looking up key values on another stream.
pivot - Pivot unique values to new aggregate fields. ≥ 3.2
project - Generate derived data objects with a selected set of fields.
quantile - Calculate sample quantile values over an input data stream. ≥ 5.7
regression - Fit regression models to smooth and predict values. ≥ 5.4
sample - Randomly sample data objects in a stream.
sequence - Generate a new stream containing a sequence of numeric values.
timeunit - Discretize date-time values into time unit bins. ≥ 5.8
window - Calculate over ordered groups, including ranking and running totals.

Geographic and Spatial Transforms

Transforms for modeling spatial data, cartographic projection, and geographic guides.

contour - Deprecated. Model a spatial distribution using discrete levels.
geojson - Consolidate geographic data into a GeoJSON feature collection.
geopath - Map GeoJSON features to SVG path strings.
geopoint - Map (longitude, latitude) coordinates to (x, y) points.
geoshape - Map GeoJSON features to a shape instance for procedural drawing.
graticule - Generate a reference grid for cartographic maps.
heatmap - Generate heatmap images for raster grid data. ≥ 5.8
isocontour - Generate level set contours for raster grid data. ≥ 5.8
kde2d - Estimate 2D densities as output raster grids. ≥ 5.8

Layout Transforms

Transforms for calculating spatial coordinates to achieve various layouts.

force - Compute a force-directed layout via physical simulation.
label - Compute text position and opacity to label a chart. ≥ 5.16
linkpath - Route visual links between node elements.
pie - Compute angular layout for pie and donut charts.
stack - Compute stacked layouts for groups of values.
voronoi - Compute a Voronoi diagram for a set of points.
wordcloud - Compute a word cloud layout of text strings.

Hierarchy Transforms

Transforms for processing hierarchy (tree) data and performing tree layout.

nest - Generate a tree structure by grouping objects by field values.
stratify - Generate a tree structure using explicit key values.
treelinks - Generate link data objects for a tree structure.
pack - Tree layout based on circular enclosure.
partition - Tree layout based on spatial adjacency of nodes.
tree - Tree layout for a node-link diagram.
treemap - Tree layout based on recursive rectangular subdivision.

Cross-Filter Transforms

Transforms for supporting fast incremental filtering of multi-dimensional data.

crossfilter - Maintain a filter mask for multiple dimensional queries.
resolvefilter - Resolve crossfilter output to generate filtered data streams.

Custom Transforms

In addition to the transforms listed above, custom transforms can be added to Vega through its Extensibility API. See the Transformations section of the API documentation.