Modes for Handling Invalid Data
This page discusses modes in Vega-Lite for handling invalid data (null and NaN in continuous scales).
The main configurations are mark.invalid and config.scale.invalid. In addition, you can use other Vega-Lite features, including conditional encodings, layering, or window transforms, to handle invalid and missing data.
Note: Vega-Lite does not consider null and NaN in categorical scales and text encodings as invalid data:
- Categorical scales can treat nulls and NaNs as separate categories.
- Similarly, text encodings can directly display nulls and NaNs.
Mark Invalid Mode
You can use mark.invalid (or config.mark.invalid) to configure how marks and their corresponding scales handle invalid data (null and NaN in continuous scales).
Property | Type | Description |
---|---|---|
invalid | String or Null | Invalid data mode, which defines how the marks and corresponding scales should represent invalid values (null and NaN in continuous scales). Note: If any channel's scale has an output for invalid values defined in config.scale.invalid, values for that channel are not filtered. |
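As a minimal sketch of where this property goes, the spec below sets the mode on a single line mark. The inline data and the field names "t" and "v" are hypothetical and only serve the illustration; the mode name is one of those described on this page.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: invalid mode set on one mark. Data and field names are hypothetical.",
  "data": {
    "values": [
      {"t": 1, "v": 10},
      {"t": 2, "v": 14},
      {"t": 3, "v": null},
      {"t": 4, "v": 22},
      {"t": 5, "v": 25}
    ]
  },
  "mark": {"type": "line", "invalid": "break-paths"},
  "encoding": {
    "x": {"field": "t", "type": "quantitative"},
    "y": {"field": "v", "type": "quantitative"}
  }
}
```

To apply the same mode to every mark in a spec, the equivalent setting would be "config": {"mark": {"invalid": "break-paths"}} at the top level instead of the property on the mark.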
Examples
To understand how these modes affect common marks, see the examples below, which visualize this dataset:
[
{"a": null, "b": 100},
{"a": -10, "b": null},
{"a": -5, "b": -25},
{"a": -1, "b": -20},
{"a": 0, "b": null},
{"a": 1, "b": 30},
{"a": 5, "b": 40},
{"a": 10, "b": null}
]
by assigning "a" to the x-axis (as both a quantitative and an ordinal field) and "b" to the y-axis.
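One way such a spec could look is sketched here, using a line mark with point overlays and the "filter" mode. The choice of mark and mode is an assumption made for illustration; any of the modes described below can be substituted for the value of "invalid".

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: the example dataset with a on x and b on y. Mark type and mode are illustrative choices.",
  "data": {
    "values": [
      {"a": null, "b": 100},
      {"a": -10, "b": null},
      {"a": -5, "b": -25},
      {"a": -1, "b": -20},
      {"a": 0, "b": null},
      {"a": 1, "b": 30},
      {"a": 5, "b": 40},
      {"a": 10, "b": null}
    ]
  },
  "mark": {"type": "line", "point": true, "invalid": "filter"},
  "encoding": {
    "x": {"field": "a", "type": "quantitative"},
    "y": {"field": "b", "type": "quantitative"}
  }
}
```

Changing the type of the x encoding to "ordinal" gives the ordinal variant of the same chart.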
"filter"
The "filter" invalid mode excludes all invalid values from the visualization's marks and scales.
For path marks (line, area, trail), this option creates paths that connect valid points, as if the points with invalid values did not exist.
"break-paths"
Break path marks (line, area, trail) at invalid values. For non-path marks, this is equivalent to "filter". All scale domains will exclude these filtered data points.
"break-paths-show-domains"
This option is like "break-paths", except that all scale domains will instead include these filtered data points.
"show"
Include all data points in the marks and scale domains. Each scale uses the output for invalid values defined in config.scale.invalid. If no output is specified, invalid values produce the same visual values as zero (if the scale includes zero) or the minimum value (if the scale does not include zero).
"break-paths-show-path-domains"
(Default)
For historical reasons, Vega-Lite 5 currently uses "break-paths-show-path-domains" as the default invalid data mode (to avoid breaking changes). This is equivalent to "break-paths-show-domains" for path-based marks (line, area, trail) and "filter" for other marks.
Scale Output for Invalid Values
You can use config.scale.invalid to define scale outputs per channel for invalid values.
Property | Type | Description |
---|---|---|
invalid | ScaleInvalidDataConfig | An object that defines scale outputs per channel for invalid values (nulls and NaNs on a continuous scale). See https://vega.github.io/vega-lite/docs/invalid-data.html for more details. |
Example: Output Color and Size with “Filter” Mode
A visualization with the "filter" invalid data mode will not filter (exclude) invalid values in the color and size encodings if config.scale.invalid.color and config.scale.invalid.size are specified.
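A sketch of this setup follows. The inline data and the field "c" are hypothetical, and the {"value": ...} form of each per-channel output is an assumption about the shape of ScaleInvalidDataConfig that should be checked against the schema of your Vega-Lite version.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: filter mode with explicit color and size outputs for invalid values. Data and field c are hypothetical.",
  "data": {
    "values": [
      {"a": 1, "b": 2, "c": 5},
      {"a": 2, "b": 4, "c": null},
      {"a": 3, "b": 3, "c": 9},
      {"a": 4, "b": 5, "c": null}
    ]
  },
  "mark": {"type": "point", "invalid": "filter"},
  "encoding": {
    "x": {"field": "a", "type": "quantitative"},
    "y": {"field": "b", "type": "quantitative"},
    "color": {"field": "c", "type": "quantitative"},
    "size": {"field": "c", "type": "quantitative"}
  },
  "config": {
    "scale": {
      "invalid": {
        "color": {"value": "#aaa"},
        "size": {"value": 4}
      }
    }
  }
}
```

With these outputs defined, the two rows whose "c" is null should stay in the chart, drawn in gray at the fixed size, rather than being filtered out.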
Compare this with a similar spec without config.scale.invalid, in which data points with invalid color or size values are excluded.
Example: Output Color with “Show” Mode
As discussed earlier, by default invalid values will produce the same visual values as zero (if the scale includes zero) or the minimum value (if the scale does not include zero).
However, you may use config.scale.invalid to override the output for invalid data values.
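For instance, the following sketch uses the example dataset with the "show" mode and overrides only the color output. The gray value "#aaa" and the choice to map "b" to color are illustrative assumptions.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: show mode with a custom color output for invalid values. The color value is an illustrative choice.",
  "data": {
    "values": [
      {"a": null, "b": 100},
      {"a": -10, "b": null},
      {"a": -5, "b": -25},
      {"a": -1, "b": -20},
      {"a": 0, "b": null},
      {"a": 1, "b": 30},
      {"a": 5, "b": 40},
      {"a": 10, "b": null}
    ]
  },
  "mark": {"type": "point", "invalid": "show"},
  "encoding": {
    "x": {"field": "a", "type": "quantitative"},
    "y": {"field": "b", "type": "quantitative"},
    "color": {"field": "b", "type": "quantitative"}
  },
  "config": {
    "scale": {
      "invalid": {
        "color": {"value": "#aaa"}
      }
    }
  }
}
```

Here the x and y positions of invalid values keep the default behavior described above, while points with a null "b" should be colored gray instead of taking the color for zero or the minimum.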
Other Solutions
Note that mark.invalid and config.scale.invalid are options for handling invalid data without changing data or marks.
However, you may use other Vega-Lite features such as conditional encoding, layering, and window transforms to encode invalid data.
Example: Conditional Encoding
If the color channel is not otherwise used, you may use a conditional color encoding to show invalid values in a specific color (e.g., gray).
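A sketch of this approach with the example dataset: the "show" mode keeps invalid points in the chart, and a test predicate colors them gray. The predicate, the gray value, and the fallback color are assumptions chosen for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: conditional color marks rows where a or b is null. Predicate and colors are illustrative choices.",
  "data": {
    "values": [
      {"a": null, "b": 100},
      {"a": -10, "b": null},
      {"a": -5, "b": -25},
      {"a": -1, "b": -20},
      {"a": 0, "b": null},
      {"a": 1, "b": 30},
      {"a": 5, "b": 40},
      {"a": 10, "b": null}
    ]
  },
  "mark": {"type": "point", "invalid": "show"},
  "encoding": {
    "x": {"field": "a", "type": "quantitative"},
    "y": {"field": "b", "type": "quantitative"},
    "color": {
      "condition": {
        "test": "datum.a === null || datum.b === null",
        "value": "#aaa"
      },
      "value": "steelblue"
    }
  }
}
```

The predicate only checks for null because the dataset contains no NaN values; data that may contain NaN would need a broader test.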
Example: Layering
You may also use different marks (such as bars) to encode null data.
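A sketch of the layering idea with the example dataset: the first layer draws the valid points as a line, and a second layer filters for rows where "b" is null and marks their x positions. A dashed rule is used here instead of a bar to keep the spec short; that substitution, the filter expression, and the styling are assumptions for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: a second layer marks x positions where b is null. Rule mark and styling are illustrative choices.",
  "data": {
    "values": [
      {"a": null, "b": 100},
      {"a": -10, "b": null},
      {"a": -5, "b": -25},
      {"a": -1, "b": -20},
      {"a": 0, "b": null},
      {"a": 1, "b": 30},
      {"a": 5, "b": 40},
      {"a": 10, "b": null}
    ]
  },
  "layer": [
    {
      "mark": {"type": "line", "point": true},
      "encoding": {
        "x": {"field": "a", "type": "quantitative"},
        "y": {"field": "b", "type": "quantitative"}
      }
    },
    {
      "transform": [{"filter": "datum.a !== null && datum.b === null"}],
      "mark": {"type": "rule", "color": "#aaa", "strokeDash": [4, 4]},
      "encoding": {
        "x": {"field": "a", "type": "quantitative"}
      }
    }
  ]
}
```

Because both layers encode "a" on x, they share one x scale, so each rule lines up with the position where the line has a missing value.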