Transforms process a data stream to filter data, calculate new fields, or derive new data streams. Transforms are typically specified within the transform array of a data definition. In addition, transforms that do not filter or generate new data objects can be used within the transform array of a mark definition to specify post-encoding transforms.

The following example defines a new data set with transforms to filter values and then compute a stacked layout (e.g., for a stacked bar chart):

  "data": [
    {
      "name": "table",
      "transform": [
        { "type": "filter", "expr": "datum.value > 5" },
        { "type": "stack", "field": "value", "groupby": ["category"] }
      ]
    }
  ]
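
A post-encoding transform is specified in the same way, but inside a mark definition. As a minimal sketch, the fragment below applies a wordcloud layout to text marks after their encodings have been evaluated. The data set name "table" and its "word" and "count" fields are assumptions made for illustration, as are the size and font size values:

```json
{
  "marks": [
    {
      "type": "text",
      "from": { "data": "table" },
      "encode": {
        "enter": {
          "text": { "field": "word" },
          "align": { "value": "center" },
          "baseline": { "value": "alphabetic" }
        }
      },
      "transform": [
        {
          "type": "wordcloud",
          "size": [800, 400],
          "text": { "field": "text" },
          "fontSize": { "field": "datum.count" },
          "fontSizeRange": [12, 56],
          "padding": 2
        }
      ]
    }
  ]
}
```

Within a mark's transform array, field references resolve against the encoded mark items rather than the source data objects. This is why the sketch reads the encoded "text" property directly and reaches the original "count" value through "datum.count".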

All transforms require a type property, which specifies the name of the transform. Transforms that produce a value as a side-effect (in particular, the bin, extent, and crossfilter transforms) can also include a signal property that gives a unique signal name to which the transform's state value is bound.
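
As a hedged illustration of the signal property, the fragment below binds the output of an extent transform to a signal and then reuses that signal as the extent of a bin transform. The signal names "valueExtent" and "valueBins", the "value" field, and the maxbins setting are assumptions chosen for this sketch:

```json
{
  "data": [
    {
      "name": "table",
      "transform": [
        { "type": "extent", "field": "value", "signal": "valueExtent" },
        {
          "type": "bin",
          "field": "value",
          "extent": { "signal": "valueExtent" },
          "maxbins": 20,
          "signal": "valueBins"
        }
      ]
    }
  ]
}
```

Once bound, such a signal can be referenced elsewhere in the specification like any other signal, for example when configuring a scale domain.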

Basic Transforms

Transforms for processing streams of data objects.

  • aggregate - Group and summarize a data stream.
  • bin - Discretize numeric values into uniform bins.
  • collect - Collect and sort all data objects in a stream.
  • countpattern - Count the frequency of patterns in text strings.
  • cross - Perform a cross-product of a data stream with itself.
  • density - Generate values drawn from a probability distribution.
  • dotbin - Perform density binning for dot plot construction. ≥ 5.7
  • extent - Compute minimum and maximum values over a data stream.
  • filter - Filter a data stream using a predicate expression.
  • flatten - Map array-typed fields to data objects, one per array entry. ≥ 3.1
  • fold - Collapse selected data fields into key and value properties.
  • formula - Extend data objects with derived fields using a formula expression.
  • identifier - Assign unique key values to data objects.
  • impute - Perform imputation of missing values.
  • joinaggregate - Extend data objects with calculated aggregate values.
  • kde - Estimate smoothed densities for numeric values. ≥ 5.4
  • loess - Fit a smoothed trend line using local regression. ≥ 5.4
  • lookup - Extend data objects by looking up key values on another stream.
  • pivot - Pivot unique values to new aggregate fields. ≥ 3.2
  • project - Generate derived data objects with a selected set of fields.
  • quantile - Calculate sample quantile values over an input data stream. ≥ 5.7
  • regression - Fit regression models to smooth and predict values. ≥ 5.4
  • sample - Randomly sample data objects in a stream.
  • sequence - Generate a new stream containing a sequence of numeric values.
  • timeunit - Discretize date-time values into time unit bins. ≥ 5.8
  • window - Calculate over ordered groups, including ranking and running totals.
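
Basic transforms are commonly chained, with each transform consuming the output of the one before it. The following sketch derives a summary data set by grouping, computing a derived field, and sorting. The data set names "table" and "totals", the field names, and the threshold of 100 are hypothetical:

```json
{
  "data": [
    {
      "name": "totals",
      "source": "table",
      "transform": [
        {
          "type": "aggregate",
          "groupby": ["category"],
          "fields": ["value"],
          "ops": ["sum"],
          "as": ["total"]
        },
        { "type": "formula", "as": "isLarge", "expr": "datum.total > 100" },
        { "type": "collect", "sort": { "field": "total", "order": "descending" } }
      ]
    }
  ]
}
```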

Geographic and Spatial Transforms

Transforms for modeling spatial data, cartographic projection, and geographic guides.

  • contour - Deprecated. Model a spatial distribution using discrete levels.
  • geojson - Consolidate geographic data into a GeoJSON feature collection.
  • geopath - Map GeoJSON features to SVG path strings.
  • geopoint - Map (longitude, latitude) coordinates to (x, y) points.
  • geoshape - Map GeoJSON features to a shape instance for procedural drawing.
  • graticule - Generate a reference grid for cartographic maps.
  • heatmap - Generate heatmap images for raster grid data. ≥ 5.8
  • isocontour - Generate level set contours for raster grid data. ≥ 5.8
  • kde2d - Estimate 2D densities as output raster grids. ≥ 5.8

Layout Transforms

Transforms for calculating spatial coordinates to achieve various layouts.

  • linkpath - Route visual links between node elements.
  • pie - Compute angular layout for pie and donut charts.
  • stack - Compute stacked layouts for groups of values.
  • force - Compute a force-directed layout via physical simulation.
  • voronoi - Compute a Voronoi diagram for a set of points.
  • wordcloud - Compute a word cloud layout of text strings.

Hierarchy Transforms

Transforms for processing hierarchy (tree) data and performing tree layout.

  • nest - Generate a tree structure by grouping objects by field values.
  • stratify - Generate a tree structure using explicit key values.
  • treelinks - Generate link data objects for a tree structure.
  • pack - Tree layout based on circular enclosure.
  • partition - Tree layout based on spatial adjacency of nodes.
  • tree - Tree layout for a node-link diagram.
  • treemap - Tree layout based on recursive rectangular subdivision.
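
A typical hierarchy pipeline first builds a tree structure and then applies a layout to it. The sketch below, using made-up inline values, stratifies a small table by "id" and "parent" keys, computes a tidy tree layout, and derives a second data set of routed links. The data set names and the layout size are assumptions:

```json
{
  "data": [
    {
      "name": "tree",
      "values": [
        { "id": 1, "parent": null, "name": "root" },
        { "id": 2, "parent": 1, "name": "a" },
        { "id": 3, "parent": 1, "name": "b" }
      ],
      "transform": [
        { "type": "stratify", "key": "id", "parentKey": "parent" },
        { "type": "tree", "method": "tidy", "size": [400, 200] }
      ]
    },
    {
      "name": "links",
      "source": "tree",
      "transform": [
        { "type": "treelinks" },
        { "type": "linkpath", "orient": "horizontal", "shape": "diagonal" }
      ]
    }
  ]
}
```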

Cross-Filter Transforms

Transforms for supporting fast incremental filtering of multi-dimensional data.

  • crossfilter - Maintain a filter mask for multiple dimensional queries.
  • resolvefilter - Resolve crossfilter output to generate filtered data streams.
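
The two cross-filter transforms are intended to be used together: crossfilter maintains the filter mask on a source data set, and resolvefilter produces filtered streams from it. The sketch below assumes a hypothetical "flights" data set with "delay" and "time" fields, plus two range signals that would normally be driven by interaction:

```json
{
  "signals": [
    { "name": "delayRange", "value": [0, 60] },
    { "name": "timeRange", "value": [6, 18] }
  ],
  "data": [
    {
      "name": "flights",
      "transform": [
        {
          "type": "crossfilter",
          "signal": "xfilter",
          "fields": ["delay", "time"],
          "query": [{ "signal": "delayRange" }, { "signal": "timeRange" }]
        }
      ]
    },
    {
      "name": "filteredForTime",
      "source": "flights",
      "transform": [
        { "type": "resolvefilter", "filter": { "signal": "xfilter" }, "ignore": 2 }
      ]
    }
  ]
}
```

Here the ignore value is a bit mask over the crossfilter fields. A value of 2 skips the second field ("time"), so the derived data set reflects only the "delay" query. This is the usual arrangement when a chart of one dimension should not be filtered by its own selection.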