# Scales

**Scales** map data values (numbers, dates, categories, *etc.*) to visual values (pixels, colors, sizes). Scales are a fundamental building block of data visualization, as they determine the nature of visual encodings. Vega includes a range of scales for both continuous and discrete input data, and supports mappings for position, shape, size and color encodings.

To visualize scales, Vega specifications may include axes or legends. For more about supported color encodings, see the color scheme reference. Internally, Vega uses the scales provided by the d3-scale library; for more background see Introducing d3-scale by Mike Bostock.

## Documentation Overview

## Scale Properties

Properties shared across scale types.

Property | Type | Description |
---|---|---|

name | String | A unique name for the scale. Scales and projections share the same namespace; names must be unique across both.Required. |

type | String | The type of scale (default `linear` ). See the scale type reference for more. |

domain | Domain | The domain of input data values for the scale. For quantitative data, this can take the form of a two-element array with minimum and maximum values. For ordinal or categorical data, this may be an array of valid input values. The domain may also be specified as a reference to a data source. See the scale domain reference for more. |

domainMax | Number | Sets the maximum value in the scale domain, overriding the domain property. The domainMax property is only intended for use with scales having continuous domains. |

domainMin | Number | Sets the minimum value in the scale domain, overriding the domain property. The domainMin property is only intended for use with scales having continuous domains. |

domainMid | Number | Inserts a single mid-point value into a two-element domain. The mid-point value must lie between the domain minimum and maximum values. This property can be useful for setting a midpoint for diverging color scales. The domainMid property is only intended for use with scales supporting continuous, piecewise domains. |

domainRaw | Array | An array of raw values that, if non-null, directly overrides the domain property. This is useful for supporting interactions such as panning or zooming a scale. The scale may be initially determined using a data-driven domain, then modified in response to user input by setting the rawDomain value. |

range | Range | The range of the scale, representing the set of visual values. For numeric values, the range can take the form of a two-element array with minimum and maximum values. For ordinal or quantized data, the range may be an array of desired output values, which are mapped to elements in the specified domain. See the scale range reference for more. |

reverse | Boolean | If true, reverses the order of the scale range. |

round | Boolean | If true, rounds numeric output values to integers. Helpful for snapping to the pixel grid. |

## Scale Types

In addition, Vega can be extended at runtime with additional scales using the `vega.scale`

method.

## Quantitative Scales

A quantitative scale maps a continuous domain (numbers or dates) to a continuous output range (pixel locations, sizes, colors). The available quantitative scale *type* values are `linear`

, `pow`

, `sqrt`

, `log`

, `time`

and `utc`

. All quantitative scales except for `time`

and `utc`

use a default *domain* of [0, 1] and default unit *range* [0, 1].

Property | Type | Description |
---|---|---|

clamp | Boolean | A boolean indicating if output values should be clamped to the range (default `false` ). If clamping is disabled and the scale is passed a value outside the domain, the scale may return a value outside the range through extrapolation. If clamping is enabled, the output value of the scale is always within the scale’s range. |

interpolate | String | Object | The interpolation method for range values. By default, a general interpolator for numbers, dates, strings and colors (in RGB space) is used. For color ranges, this property allows interpolation in alternative color spaces. Legal values include `rgb` , `hsl` , `hsl-long` , `lab` , `hcl` , `hcl-long` , `cubehelix` and `cubehelix-long` (‘-long’ variants use longer paths in polar coordinate spaces). If object-valued, this property accepts an object with a string-valued type property and an optional numeric gamma property applicable to rgb and cubehelix interpolators. For more, see the d3-interpolate documentation. |

padding | Number | Expands the scale domain to accommodate the specified number of pixels on each of the scale range. The scale range must represent pixels for this parameter to function as intended. Padding adjustment is performed prior to all other adjustments, including the effects of the zero, nice, domainMin, and domainMax properties. |

nice | Boolean | Number | Extends the domain so that it starts and ends on nice round values. This method typically modifies the scale’s domain, and may only extend the bounds to the nearest round value. Nicing is useful if the domain is computed from data and may be irregular. For example, for a domain of [0.201479…, 0.996679…], a nice domain might be [0.2, 1.0]. Domain values set via domainMin and domainMax (but not domainRaw) are subject to nicing. Using a number value for this parameter (representing a desired tick count) allows greater control over the step size used to extend the bounds, guaranteeing that the returned ticks will exactly cover the domain. |

zero | Boolean | Boolean flag indicating if the scale domain should include zero. The default value is `true` for `linear` , `sqrt` and `pow` , and `false` otherwise. |

### Linear Scales

Linear scales (`linear`

) are quantitative scales scales that preserve proportional differences. Each range value *y* can be expressed as a linear function of the domain value *x*: *y = mx + b*.

### Power Scales

Power scales (`pow`

) are quantitative scales scales that apply an exponential transform to the input domain value before the output range value is computed. Each range value *y* can be expressed as a polynomial function of the domain value *x*: *y = mx^k + b*, where *k* is the exponent value. Power scales also support negative domain values, in which case the input value and the resulting output value are multiplied by -1.

Property | Type | Description |
---|---|---|

exponent | Number | The exponent to use in the scale transform (default `1` ). |

### Square Root Scales

Square root (`sqrt`

) scales are a convenient shorthand for power scales with an exponent of `0.5`

, indicating a square root transform.

### Logarithmic Scales

Log scales (`log`

) are quantitative scales scales in which a logarithmic transform is applied to the input domain value before the output range value is computed. Log scales are particularly useful for plotting data that varies over multiple orders of magnitude. The mapping to the range value *y* can be expressed as a logarithmic function of the domain value *x*: *y = m log(x) + b*.

As log(0) = -∞, a log scale domain must be strictly-positive or strictly-negative; the domain must not include or cross zero. A log scale with a positive domain has a well-defined behavior for positive values, and a log scale with a negative domain has a well-defined behavior for negative values. (For a negative domain, input and output values are implicitly multiplied by -1.) The behavior of the scale is undefined if you run a negative value through a log scale with a positive domain or vice versa.

Property | Type | Description |
---|---|---|

base | Number | The base of the logarithm (default `10` ). |

### Time and UTC Scales

Time scales (`time`

and `utc`

) are quantitative scales with a temporal domain: values in the input domain are assumed to be Date objects or timestamps. The `time`

scale type uses the current local timezone setting. UTC scales (`utc`

) instead use Coordinated Universal Time. Both `time`

and `utc`

scales use a default *domain* of [2000-01-01, 2000-01-02], and a default unit *range* [0, 1].

Property | Type | Description |
---|---|---|

nice | String | Object | Number | Boolean | If specified, modifies the scale domain to use a more human-friendly value range. For `time` and `utc` scale types, the nice value can additionally be a string indicating the desired time interval. Legal values are `"millisecond"` , `"second"` , `"minute"` , `"hour"` , `"day"` , `"week"` , `"month"` , and `"year"` . Alternatively, `time` and `utc` scales can accept an object-valued interval specifier of the form `{"interval": "month", "step": 3}` , which includes a desired number of interval steps. Here, the domain would snap to quarter (Jan, Apr, Jul, Oct) boundaries. |

### Sequential Scales

Sequential scales (`sequential`

) are similar to linear scales, but use a fixed interpolator to determine the output range. The major use case for sequential scales is continuous quantitative color scales. Sequential scales default to a *domain* of [0, 1].

Akin to quantitative scales, sequential scales support piecewise *domain* settings with more than two entries. In such cases, the output range is subdivided into equal-sized segments for each piecewise segment of the domain. For example, the domain [1, 4, 10] would lead to the interpolants `1 -> 0`

, `4 -> 0.5`

, and `10 -> 1`

. Piecewise domains are useful for parameterizing diverging color encodings, where a middle domain value corresponds to the mid-point of the color range.

Property | Type | Description |
---|---|---|

clamp | Boolean | A boolean indicating if output values should be clamped to the range (default `false` ). If clamping is disabled and the scale is passed a value outside the domain, the scale may return a value outside the range through extrapolation. If clamping is enabled, the output value of the scale is always within the scale’s range. |

domainMax | Number | Sets the maximum value in the scale domain, overriding the domain property. |

domainMin | Number | Sets the minimum value in the scale domain, overriding the domain property. |

domainMid | Number | Inserts a single mid-point value into a two-element domain. The mid-point value must lie between the domain minimum and maximum values. This property can be useful for setting a midpoint for diverging color scales. |

range | Scheme | Color[ ] | The Required.range value should either be a color scheme object or an array of color strings. If an array of colors is provided, the array will be used to create a continuous interpolator via basis spline interpolation in the RGB color space. |

## Discrete Scales

Discrete scales map values from a discrete domain to a discrete range. In the case of `band`

and `point`

scales, the range is determined by discretizing a continuous numeric range.

### Ordinal Scales

Ordinal scales (`ordinal`

) have a discrete domain and range. For example, an ordinal scale might map a set of named categories to a set of colors, or to a set of shapes. Ordinal scales function as a “lookup table” from a domain value to a range value.

This example uses an ordinal scale for color-coded categories, with up to 20 unique colors:

```
{
"scales": [
{
"name": "color",
"type": "ordinal",
"domain": {"data": "table", "field": "category"},
"range": {"scheme": "category20"}
}
]
}
```

### Band Scales

Band scales (`band`

) accept a discrete domain similar to ordinal scales, but map this domain to a continuous, numeric output range such as pixels. Discrete output values are automatically computed by the scale by dividing the continuous range into uniform *bands*. Band scales are typically used for bar charts with an ordinal or categorical dimension.

In addition to a standard numerical *range* value (such as `[0, 500]`

), band scales can be given a fixed *step* size for each band. The actual range is then determined by both the step size and the cardinality (element count) of the input domain. The step size is specified by an object with a *step* property that provides the step size in pixels, for example `"range": {"step": 20}`

.

This image from the d3-scale documentation illustrates how a band scale works:

Property | Type | Description |
---|---|---|

align | Number | The alignment of elements within each band step, as a fraction of the step size (default `0.5` ). This value must lie in the range [0,1]. |

padding | Number | Sets paddingInner and paddingOuter to the same padding value (default `0` ). This value must lie in the range [0,1]. |

paddingInner | Number | The inner padding (spacing) within each band step, as a fraction of the step size (default `0` ). This value must lie in the range [0,1]. |

paddingOuter | Number | The outer padding (spacing) at the ends of the scale range, as a fraction of the step size (default `0` ). This value must lie in the range [0,1]. |

### Point Scales

Point scales (`point`

) are a variant of band scales where the internal band width is fixed to zero. Point scales are typically used for scatterplots with an ordinal or categorical dimension. Similar to band scales, point scale *range* values may be specified using either a numerical extent (`[0, 500]`

) or a step size (`{"step": 20}`

). As point scales do not have internal band widths (only step sizes between bands), they do not accept the *paddingInner* property.

This image from the d3-scale documentation illustrates how a point scale works:

Property | Type | Description |
---|---|---|

align | Number | The alignment of elements within each band step, as a fraction of the step size (default `0.5` ). This value must lie in the range [0,1]. |

padding | Number | An alias for paddingOuter (default `0` ). This value must lie in the range [0,1]. |

paddingOuter | Number | The outer padding (spacing) at the ends of the scale range, as a fraction of the step size (default `0` ). This value must lie in the range [0,1]. |

## Discretizing Scales

Discretizing scales break up a continuous domain into discrete segments, and then map values in each segment to a range value.

### Quantile Scales

Quantile scales (`quantile`

) map a sample of input domain values to a discrete range based on computed quantile boundaries. The domain is considered continuous and thus the scale will accept any reasonable input value; however, the domain is specified as a discrete set of sample values. The number of values in (*i.e.*, the cardinality of) the output range determines the number of quantiles that will be computed from the domain. To compute the quantiles, the domain is sorted, and treated as a population of discrete values. The resulting quantile boundaries segment the domain into groups with roughly equals numbers of sample points per group.

Quantile scales are particularly useful for creating color or size encodings with a fixed number of output values. Using a discrete set of encoding levels (typically between 5-9 colors or sizes) sometimes supports more accurate perceptual comparison than a continuous range. For related functionality see quantize scales, which partition the domain into uniform domain extents, rather than groups with equal element counts. Quantile scales have the benefit of evenly distributing data points to encoded values. In contrast, quantize scales uniformly segment the input domain and provide no guarantee on how data points will be distributed among the output visual values.

This example color-codes quantile values in five groups, using colors sampled from a continuous color scheme:

```
{
"name": "color",
"scale": "quantile",
"domain": {"data": "table", "field": "value"},
"range": {"scheme": "plasma", "count": 5}
}
```

### Quantize Scales

Quantize scales (`quantize`

) are similar to linear scales, except they use a discrete rather than continuous range. The continuous input domain is divided into uniform segments based on the number of values in (*i.e.*, the cardinality of) the output range. Each range value *y* can be expressed as a quantized linear function of the domain value *x*: *y = m round(x) + b*.

Quantize scales are particularly useful for creating color or size encodings with a fixed number of output values. Using a discrete set of encoding levels (typically between 5-9 colors or sizes) sometimes supports more accurate perceptual comparison than a continuous range. For related functionality see quantile scales, which partition the domain into groups with equal element counts, rather than uniform domain extents.

Property | Type | Description |
---|---|---|

nice | Boolean | Number | Extends the domain so that it starts and ends on nice round values. This method typically modifies the scale’s domain, and may only extend the bounds to the nearest round value. Nicing is useful if the domain is computed from data and may be irregular. For example, for a domain of [0.201479…, 0.996679…], a nice domain might be [0.2, 1.0]. Domain values set via domainMin and domainMax (but not domainRaw) are subject to nicing. Using a number value for this parameter (representing a desired tick count) allows greater control over the step size used to extend the bounds, guaranteeing that the returned ticks will exactly cover the domain. |

zero | Boolean | Boolean flag indicating if the scale domain should include zero (default `false` ). |

This example color-codes a quantized domain using a 7-point color scheme:

```
{
"name": "color",
"scale": "quantize",
"domain": {"data": "table", "field": "value"},
"range": {"scheme": "blues", "count": 7}
}
```

### Threshold Scales

Threshold scales (`threshold`

) are similar to quantize scales, except they allow mapping of *arbitrary* subsets of the domain (not uniform segments) to discrete values in the range. The input domain is still continuous, and divided into slices based on a set of threshold values provided to the *domain* property. The *range* property must have N+1 elements, where N is the number of threshold boundaries provided in the *domain*.

Given the following scale definition,

```
{
"name": "threshold",
"type": "threshold",
"domain": [0, 1],
"range": ["red", "white", "blue"]
}
```

the scale will map domain values to color strings as follows:

```
-1 => "red"
0 => "white"
0.5 => "white"
1.0 => "blue"
1000 => "blue"
```

### Bin-Linear Scales

Binned linear scales (`bin-linear`

) are a special type of linear scale for use with data that has been subdivided into bins (for example, using Vega’s bin transform). The *domain* values for a binned linear scale must be the set of all bin boundaries, from the minimum bin start to maximum bin end. Input domain values are discretized to the appropriate bin, and then run through a standard linear scale mapping. The main benefit of using `bin-linear`

scales is that they provide “bin-aware” routines for sampling values and generating labels for inclusion in legends. They are particularly useful for creating binned size encodings.

The trickiest part of using binned linear scales is retrieving the correct set of bin boundaries for the *domain* property. Here is one way to do this in conjunction with a bin transform:

```
{
"data": [
{
"name": "input",
"transform": [
{ "type": "extent", "field": "value", "signal": "extent" },
{ "type": "bin", "extent": {"signal": "extent"}, "signal": "bins" }
]
}
],
"scales": [
{
"name": "size",
"type": "bin-linear",
"domain": {"signal": "sequence(bins.start, bins.stop + bins.step, bins.step)"},
"range": [1, 1000]
}
]
}
```

### Bin-Ordinal Scales

Binned ordinal scales (`bin-ordinal`

) are a special type of ordinal scale for use with data that has been subdivided into bins (for example, using Vega’s bin transform). The *domain* values for a binned ordinal scale must be the set of all bin boundaries, from the minimum bin start to maximum bin end. Input domain values are discretized to the appropriate bin, which is then treated as a standard ordinal scale input. The main benefit of using `bin-ordinal`

scales is that they provide “bin-aware” routines for sampling values and generating labels for inclusion in legends. They are particularly useful for creating binned color encodings.

The trickiest part of using binned ordinal scales is retrieving the correct set of bin boundaries for the *domain* property. Here is one way to do this in conjunction with a bin transform:

```
{
"data": [
{
"name": "input",
"transform": [
{ "type": "extent", "field": "value", "signal": "extent" },
{ "type": "bin", "extent": {"signal": "extent"}, "signal": "bins" }
]
}
],
"scales": [
{
"name": "color",
"type": "bin-ordinal",
"domain": {"signal": "sequence(bins.start, bins.stop + bins.step, bins.step)"},
"range": {"scheme": "greens"}
}
]
}
```

## Scale Domains

Scale domains can be specified in multiple ways:

- As an array literal of domain values. For example,
`[0, 500]`

or`['a', 'b', 'c']`

. Array literals may include signal references as elements. - A signal reference that resolves to a domain value array. For example,
`{"signal": "myDomain"}`

. - A data reference object that specifies field values in one or more data sets.

### Basic Data Reference

A basic *data reference* indicates a data set, field name, and optional sorting for discrete scales:

Property | Type | Description |
---|---|---|

data | String | The name of the data set containing domain values.Required. |

field | Field | The name of the data field (e.g., Required.`"price"` ). |

sort | Boolean | Sort | If a boolean `true` value, sort the domain values in ascending order. If object-valued, sort the domain according to the provided sort parameters. Sorting is only supported for discrete scale types. |

For example, `"domain": {"data": "table", "field": "value"}`

, derives a scale domain from the `value`

field of data objects in the `table`

data set. If the scale type is quantitative or a `quantize`

, the derived domain will be a two-element [min, max] array. If the scale type is discrete, the derived domain will be an array of all distinct values. If the scale type is `quantile`

, all values will be used to compute quantile boundaries.

### Multi-Field Data References

Scale domains can also be derived using values from multiple fields. Multiple fields from the same data set can be specified by replacing the *field* property with a *fields* property that takes an array of field names:

Property | Type | Description |
---|---|---|

data | String | The name of the data set containing domain values.Required. |

fields | Field[ ] | The names of the data field (e.g., Required.`["price", "cost"]` ). |

sort | Boolean | Sort | If a boolean `true` value, sort the domain values in ascending order. If object-valued, sort the domain according to the provided sort parameters. Sorting is only supported for discrete scale types. |

More generally, scale domains may also use values pulled from different data sets. In this case, the domain object should have a *fields* property, which is an array of basic data references:

Property | Type | Description |
---|---|---|

fields | DataRef[ ] | An array of basic data references indicating each data set and field value to include in the domain. In addition, array literals (e.g., Required.`[0, 100]` , `["a", "b", "c"]` ) may be included as elements of the fields array for inclusion in the domain determination. |

sort | Boolean | Sort | If a boolean `true` value, sort the domain values in ascending order. If object-valued, sort the domain according to the provided sort parameters. Sorting is only supported for discrete scale types. |

Here is an example that constructs a domain using the fields `price`

and `cost`

drawn from two different data sets:

```
"domain": {
"fields": [
{"data": "table1", "field": "price"},
{"data": "table2", "field": "cost"}
]
}
```

### Sorting Domains

The *sort* property of a domain data reference can accept, in addition to a simple boolean, an object-valued sort definition:

Property | Type | Description |
---|---|---|

field | Field | The data field to sort by. If unspecified, defaults to the field specified in the outer data reference. |

op | String | An aggregate operation to perform on the field prior to sorting. Examples include `count` , `mean` and `median` . This property is required in cases where the sort field and the data reference field do not match. The input data objects will be aggregated, grouped by data reference field values. For a full list of operations, see the aggregate transform. |

order | String | The sort order. One of `ascending` (default) or `descending` . |

This example sorts distinct `category`

field values in descending order by the associated median of the `value`

field:

```
{
"domain": {
"data": "table",
"field": "category",
"sort": {"op": "median", "field": "value", "order": "descending"}
}
}
```

This example sorts a multi-field domain in descending order based on the counts of each of the domain values:

```
{
"domain": {
"data": "table",
"fields": ["category1", "category2"],
"sort": {"op": "count", "order": "descending"}
}
}
```

**Note:** For domains drawn from multiple fields, the *sort.field* property is not allowed and the only legal *op* is `count`

.

## Scale Ranges

Scale ranges can be specified in multiple ways:

- As an array literal of range values. For example,
`[0, 500]`

or`['a', 'b', 'c']`

. Array literals may include signal references as elements. - A signal reference that resolves to a range value array. For example,
`{"signal": "myRange"}`

. - A color scheme reference for a color palette. For example,
`{"scheme": "blueorange"}`

. - For
`ordinal`

scales only, a data reference for a set of distinct field values. For example,`{"data": "table", "field": "value"}`

. - For
`band`

and`point`

scales only, a step size for each range band. For example,`{"step": 20}`

. - A string indicating a pre-defined scale range default. For example,
`"width"`

,`"symbol"`

, or`"diverging"`

.

### Scale Range Defaults

Scale ranges can also accept string literals that map to default values. Default values can be modified, and new named defaults can be added, by using custom config settings.

Value | Description |
---|---|

`"width"` |
A spatial range determined by the value of the `width` signal. |

`"height"` |
A spatial range determined by the value of the `height` signal. The direction of the range (top-to-bottom or bottom-to-top) is automatically determined according to the scale type. |

`"symbol"` |
The default plotting symbol set to use for shape encodings. |

`"category"` |
The default categorical color scheme to use for nominal data. |

`"diverging"` |
The default diverging color scheme to use for quantitative data. |

`"ordinal"` |
The default sequential color scheme to use for ordinal data. |

`"ramp"` |
The default sequential color scheme to use for quantitative data. |

`"heatmap"` |
The default sequential color scheme to use for quantitative heatmaps. |