Encoding

An integral part of the data visualization process is encoding data with visual properties of graphical marks. Vega-Lite’s top-level encoding property represents key-value mappings between encoding channels (such as x, y, or color) and their definition objects. Each definition object describes the encoded data field or constant value, and the channel’s scale and guide (axis or legend).

{
  "data": ... ,
  "mark": ... ,
  "encoding": {     // Encoding
    "column": ...,
    "row": ...,
    "x": ...,
    "y": ...,
    "color": ...,
    "opacity": ...,
    "size": ...,
    "shape": ...,
    "text": ...,
    "detail": ...
  },
  ...
}

Encoding Channels
Channel Definition

Encoding Channels

The keys in the encoding object are encoding channels. This section lists supported encoding channels in Vega-Lite.

Mark Properties Channels

Mark properties channels map data fields directly to visual properties of the marks. Unlike other channel types, they can be mapped to constant values as well. Here are the supported mark properties:

Property	Type	Description
x, y	ChannelDef	X and Y coordinates for `point`, `circle`, `square`, `line`, `text`, and `tick` marks (or width and height for `bar` and `area` marks).
color	ChannelDef	Color of the marks – either fill or stroke color based on mark type. By default, fill color for `area`, `bar`, `tick`, `text`, `circle`, and `square` / stroke color for `line` and `point`. Supported color values include hex-color (e.g., `#0099ff`) and standard HTML/CSS color names (e.g., `"goldenrod"`). Please see scale range for more detail about color palettes.
opacity	ChannelDef	Opacity of the marks – either a single value or a range. Default value: `[0.3, 0.8]`.
shape	ChannelDef	The symbol’s shape (only for `point` marks). The supported values are `"circle"` (default), `"square"`, `"cross"`, `"diamond"`, `"triangle-up"`, `"triangle-down"`, or else a custom SVG path string.
size	ChannelDef	Size of the mark. • For `point`, `square` and `circle` – the symbol size, or pixel area of the mark. • For `bar` and `tick` – the bar and tick’s size. • For `text` – the text’s font size. • Size is currently unsupported for `line` and `area`.
text	ChannelDef	Text of the `text` mark.
column, row	ChannelDef	`row` and `column` are special encoding channels for faceting.

Additional Level of Detail Channel

Grouping data is another important operation in visualizing data. For aggregated plots, all encoded fields without aggregate functions are used as grouping fields in the aggregation (similar to fields in GROUP BY in SQL). For line and area marks, mapping a data field to color or shape channel will group the lines and stacked areas by the field.

The detail channel provides an additional grouping field (level of detail) for grouping data in aggregation without mapping the data to a specific visual channel.

Property	Type	Description
detail	ChannelDef	Additional levels of detail for grouping data in aggregate views and in line and area marks without mapping data to a specific visual channel. (See the examples below.)

Note: Since detail represents an actual data field in the aggregation, it cannot encode a constant value.

Examples

Here is a scatterplot showing average horsepower and displacement for cars from different origins. We map Origin to the detail channel to use the field as a group-by field without mapping it to any visual property of the marks.
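A minimal sketch of such a specification follows. The data URL and the exact field names (Horsepower, Displacement) are assumptions made for illustration; only Origin is named in the text above.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with Horsepower, Displacement, and Origin fields.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"aggregate": "mean", "field": "Horsepower", "type": "quantitative"},
    "y": {"aggregate": "mean", "field": "Displacement", "type": "quantitative"},
    "detail": {"field": "Origin", "type": "nominal"}
  }
}
```

Because Origin carries no aggregate function, it acts as the grouping field: the plot should contain one point per origin, all drawn with the same color and shape.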

Here is a line chart showing stock prices of 5 tech companies over time. We map the symbol field (the stock market ticker symbol) to detail to group the data into one line per company.
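The line chart could be specified roughly as follows. This is a sketch: the data URL and the date and price field names are hypothetical, and only the symbol field comes from the description above.

```json
{
  "description": "Sketch only: assumes a hypothetical data/stocks.json with date, price, and symbol fields.",
  "data": {"url": "data/stocks.json"},
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "price", "type": "quantitative"},
    "detail": {"field": "symbol", "type": "nominal"}
  }
}
```

Without the detail mapping, all rows would be joined into a single line. Mapping symbol to color instead would group the lines in the same way and also color them.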

Mark Order Channels

The order channel sorts the layer order of the marks, or their stacking order in stacked charts, while the path channel sorts the order of data points in line marks.

Property	Type	Description
order	ChannelDef	Layer order for non-stacked marks, or stack order for stacked marks.
path	ChannelDef	Order of data points in line marks.

Note: Since order and path represent actual data fields that are used to sort the data, they cannot encode constant values. In addition, in aggregate plots, they should have an aggregate function specified.

Example: Sorting Layer Order

Given a colored scatterplot:

By default, the layer order of the data points is determined by the original order of the data.

Mapping the field Origin to the order channel sorts the layers of data points by that field.
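As a sketch, the mapping might look like this. The data URL, the x and y fields, and the choice of the nominal type for order are assumptions for illustration.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with Horsepower, Displacement, and Origin fields.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Displacement", "type": "quantitative"},
    "color": {"field": "Origin", "type": "nominal"},
    "order": {"field": "Origin", "type": "nominal"}
  }
}
```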

Here we can see that data points from Origin A appear on the top.

Example: Sorting Stack Order

Given a stacked bar chart:

By default, the stacked bars are sorted by the stack grouping fields (color in this example).

Mapping the sum of yield to the order channel sorts the layers of each stacked bar by the sum of yield instead.
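A sketch of the stacked bar chart with this ordering is shown below. The data URL and the variety field name are assumptions; yield and site are taken from the surrounding description.

```json
{
  "description": "Sketch only: assumes a hypothetical data/barley.json with yield, variety, and site fields.",
  "data": {"url": "data/barley.json"},
  "mark": "bar",
  "encoding": {
    "x": {"aggregate": "sum", "field": "yield", "type": "quantitative"},
    "y": {"field": "variety", "type": "nominal"},
    "color": {"field": "site", "type": "nominal"},
    "order": {"aggregate": "sum", "field": "yield", "type": "quantitative"}
  }
}
```

The order channel carries an aggregate function here because the plot is an aggregate plot, in line with the note above.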

Here we can see that, for each type of barley, sites with higher yields are placed at the top of the stack (rightmost).

Example: Sorting Line Order

By default, line marks order the points in their paths by the field of channel x or y. However, to show how gasoline price and average miles driven per capita change together over time, we use the path channel to sort the points in the line by a time field (year).
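One way to write this is sketched below. The data URL and the miles and gas field names are hypothetical; only the year field is named in the text.

```json
{
  "description": "Sketch only: assumes a hypothetical data/driving.json with miles, gas, and year fields.",
  "data": {"url": "data/driving.json"},
  "mark": "line",
  "encoding": {
    "x": {"field": "miles", "type": "quantitative"},
    "y": {"field": "gas", "type": "quantitative"},
    "path": {"field": "year", "type": "temporal"}
  }
}
```

The resulting line connects the points in chronological order rather than from left to right, so it can loop back on itself.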

Facet Channels

row and column are special encoding channels that facet single plots into trellis plots (or small multiples).

Property	Type	Description
row, column	ChannelDef	Vertical and horizontal facets for vertical and horizontal trellis plots.

For more information, please see facet page.

Note: Since row and column represent actual data fields that are used to partition the data, they cannot encode constant values. In addition, in aggregate plots, they should not have an aggregate function specified.
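For illustration, a horizontal trellis of scatterplots could be sketched as follows, with one panel per value of the faceting field. The data URL and field names are assumptions.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with Horsepower, Displacement, and Origin fields.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "column": {"field": "Origin", "type": "nominal"},
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Displacement", "type": "quantitative"}
  }
}
```

Replacing column with row would stack the panels vertically instead.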

Channel Definition

Each channel definition object must describe the data field encoded by the channel and its data type, or a constant value directly mapped to the mark properties. In addition, it can describe the mapped field’s transformation and properties for its scale and guide.

Encoded Data

To encode a particular field in the data set with a particular channel, the channel definition must specify the field’s name with the field property.

Property	Type	Description
field	String	Name of the field from which to pull a data value.

Data Type

If a field is specified, the channel definition must describe the encoded data’s type of measurement (level of measurement). The supported data types are:

Quantitative: Quantitative data expresses some kind of quantity. Typically this is numerical data. For example 7.3, 42.0, 12.1.
Temporal: Temporal data supports date-times and times. For example 2015-03-07 12:32:17, 17:01, 2015-03-16.
Ordinal: Ordinal data represents ranked order (1st, 2nd, …) by which the data can be sorted. However, as opposed to quantitative data, there is no notion of relative degree of difference between values. For illustration, a “size” variable might have the values small, medium, large, and extra-large. We know that medium is larger than small and that extra-large is larger than large. However, we cannot compare the magnitudes of the differences, for example, between (1) small and medium and (2) medium and large.
Nominal: Nominal data, also known as categorical data, differentiates between values based only on their names or categories. For example, gender, nationality, music genre, and names are all nominal data. Numbers may be used to represent the values, but the numbers do not determine magnitude or ordering. For example, if a nominal variable contains the three values 1, 2, and 3, we cannot claim that 1 is less than 2 or 3.

Property	Type	Description
type	String	The encoded field’s type of measurement. This can be either a full type name (`"quantitative"`, `"temporal"`, `"ordinal"`, and `"nominal"`) or an initial character of the type name (`"Q"`, `"T"`, `"O"`, `"N"`). This property is case insensitive.

Note: Data type here describes the semantics of the data rather than a primitive data type in the programming language sense (number, string, etc.). The same primitive data type can have different types of measurement. For example, numeric data can represent quantitative, ordinal, or nominal data.
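The following sketch illustrates both points: a numeric field treated as ordinal, and the full and single-character forms of the type name. The data URL and the Cylinders and Horsepower field names are hypothetical.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with numeric Cylinders and Horsepower fields.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Cylinders", "type": "ordinal"},
    "y": {"field": "Horsepower", "type": "Q"}
  }
}
```

Here "Q" is shorthand for "quantitative". Cylinders holds numbers, but declaring it ordinal tells Vega-Lite to treat the values as ranked categories rather than as a continuous quantity.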

Field Transforms

To facilitate data exploration, Vega-Lite provides inline field transforms as a part of the channel definition. If a field is provided, the channel definition supports the following transformations:

Property	Type	Description
bin¹	Boolean | Object	Boolean flag for binning a `quantitative` field, or a bin property object for binning parameters. Default value: `false`
timeUnit¹	String	Time unit for a `temporal` field (e.g., `year`, `yearmonth`, `month`, `hour`). Default value: `undefined` (None)
aggregate¹,²	String	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
sort¹,²	String | Object	Sort order for a particular field. • For quantitative or temporal fields, this can be either `"ascending"` or `"descending"`. • For ordinal or nominal fields, this can be `"ascending"`, `"descending"`, `"none"`, or a sort field definition object for sorting by an aggregate calculation of a specified sort field. Default value: `"ascending"`

For more information about these field transforms, please see the following pages: bin, timeUnit, aggregate, and sort.

Notes:

¹ Inline field transforms are executed after the top-level transforms are executed, and are executed in this order: bin, timeUnit, aggregate, and sort.

² detail does not support aggregate and sort. When path and detail are used with non-grouping variables in aggregate plots, those variables should be aggregated to prevent additional groupings.
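As a sketch of how inline transforms combine, the specification below bins one field and aggregates another within each bin. The data URL and field names are assumptions for illustration.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with Horsepower and Displacement fields.",
  "data": {"url": "data/cars.json"},
  "mark": "bar",
  "encoding": {
    "x": {"bin": true, "field": "Horsepower", "type": "quantitative"},
    "y": {"aggregate": "mean", "field": "Displacement", "type": "quantitative"}
  }
}
```

Following the execution order in note ¹, Horsepower is binned first, and the mean of Displacement is then computed per bin, since the binned field has no aggregate function and therefore acts as a grouping field.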

Constant Value

For mark properties channels, if a field is not specified, constant values for the properties (e.g., color, size) can also be set directly with the channel definition’s value property.

Property	Type	Description
value	String | Number	A constant value in visual domain.

Note: detail, path, order, row, and column channels cannot encode constant values.

Example

For example, you can set color and shape of a scatter plot to constant values. Note that as the value is set directly to the color and shape values, there is no need to specify data type. In fact, the data type will be ignored if specified.
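A sketch of such a scatter plot follows. The data URL and the x and y fields are hypothetical; the color and shape values are taken from the supported values listed in the mark properties table.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with Horsepower and Displacement fields.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Displacement", "type": "quantitative"},
    "color": {"value": "#0099ff"},
    "shape": {"value": "square"}
  }
}
```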

Similarly, value for size channel of bar marks will adjust the bar’s size. By default, there will be 1 pixel offset between bars. The following example sets the size to 10 to add more offset between bars.
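The bar chart could be sketched as below. The data URL and field names are assumptions; the size value of 10 is the one mentioned above.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json with Cylinders and Horsepower fields.",
  "data": {"url": "data/cars.json"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "Cylinders", "type": "ordinal"},
    "y": {"aggregate": "mean", "field": "Horsepower", "type": "quantitative"},
    "size": {"value": 10}
  }
}
```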

Scale and Guide

Encoding channels that map data directly to visual properties of the marks must provide scales, which are functions that transform values in the data domain (numbers, dates, strings, etc.) to visual values (pixels, colors, sizes).

In addition, visualizations typically provide guides to aid interpretation of scales. There are two types of guides: axes and legends. Axes produce lines, ticks, and labels to convey how a spatial range represents a data range in the position channels (x and y). Meanwhile, legends aid interpretation of the scales for color, size, and shape.

By default, Vega-Lite automatically generates a scale and a guide for each field. If no properties are specified, the properties of the scale, axis, and legend are determined by the compiler based on a set of rules. The scale, axis, and legend properties of the channel definition can be used to customize them.

Property	Type	Description
scale	Object	A property object for a scale of a mark property channel.
axis	Boolean | Object	Boolean flag for showing an axis (`true` by default), or a property object for an axis of a position channel (`x` or `y`) or a facet channel (`row` or `column`).
legend	Boolean | Object	Boolean flag for showing a legend (`true` by default), or a property object for a legend of a non-position mark property channel (`color`, `size`, or `shape`).

For more information about scale, axis, and legend, please look at the respective pages.
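As a rough sketch, the specification below customizes all three. The data URL and field names are hypothetical, and the nested property names used here (a scale `type` and a `title` for the axis and legend) are assumptions that should be checked against the scale, axis, and legend pages.

```json
{
  "description": "Sketch only: assumes a hypothetical data/cars.json; the nested scale, axis, and legend property names are assumptions.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {
      "field": "Horsepower",
      "type": "quantitative",
      "scale": {"type": "log"},
      "axis": {"title": "Horsepower (log scale)"}
    },
    "y": {"field": "Displacement", "type": "quantitative", "axis": false},
    "color": {
      "field": "Origin",
      "type": "nominal",
      "legend": {"title": "Country of origin"}
    }
  }
}
```

Setting axis to false on y uses the Boolean form described in the table to hide that axis, while x and color use the object form.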