Data

Akin to Vega’s data model, the basic data model used by Vega-Lite is tabular data, similar to a spreadsheet or a database table. Individual data sets are assumed to contain a collection of records, which may contain any number of named data fields.

Vega-Lite’s data property describes the visualization’s data source as part of the specification. The source can be either inline data (values) or a URL from which to load the data (url). Alternatively, an empty, named data source (name) can be declared and then bound at runtime or populated from top-level datasets.

In addition, Vega-Lite includes data generators which can generate data sets such as numerical sequences or geographic reference elements such as GeoJSON graticule or sphere objects.

Types of Data Sources

Inline Data

Inline data can be specified using the values property. Here is a list of all properties of an inline data source:

Property Type Description
values Array

Required. The full data set, included inline. This can be an array of objects or primitive values, an object, or a string. Arrays of primitive values are ingested as objects with a data property. Strings are parsed according to the specified format type.

name String

Provide a placeholder name and bind data at runtime.

format DataFormat

An object that specifies the format for parsing the data.

For example, the following specification embeds an inline data table with nine rows and two columns (a and b).
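A minimal sketch of such a specification, written as a JavaScript object literal so that it can be passed to Vega-Embed. The row values, the variable name inlineSpec, and the bar-chart encoding are illustrative assumptions.

```javascript
// Illustrative spec: nine rows with two fields, "a" (category) and "b" (number).
const inlineSpec = {
  "description": "A simple bar chart with embedded data.",
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
      {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
      {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"field": "b", "type": "quantitative"}
  }
};
```

With Vega-Embed loaded, the object could be rendered with vegaEmbed('#vis', inlineSpec).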

If the input data is simply an array of primitive values, each value is mapped to the data property of a new object. For example [5, 3, 8, 1] is loaded as:

[{"data": 5}, {"data": 3}, {"data": 8}, {"data": 1}]
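The wrapping described above can be imitated in plain JavaScript. This is only an illustration of the documented result, not Vega-Lite's internal code; the names primitives, wrapped, and primitiveSpec are hypothetical.

```javascript
// Illustration only: reproduce the documented wrapping of primitive values.
const primitives = [5, 3, 8, 1];
const wrapped = primitives.map(value => ({data: value}));

// A spec over primitive values therefore refers to the field named "data".
const primitiveSpec = {
  "data": {"values": primitives},
  "mark": "point",
  "encoding": {
    "y": {"field": "data", "type": "quantitative"}
  }
};
```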

You can also inline a string that will be parsed according to the specified format type.
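As a sketch of the string form, the following object inlines a small CSV document and sets the format type so that it is parsed as CSV. The column names, the values, and the variable name csvSpec are illustrative assumptions.

```javascript
// Illustrative spec: the inline string holds a header row plus three CSV records.
const csvSpec = {
  "data": {
    "values": "a,b\nA,28\nB,55\nC,43",
    "format": {"type": "csv"}
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"field": "b", "type": "quantitative"}
  }
};
```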

Data from URL

Data can be loaded from a URL using the url property. In addition, the format of the input data can be specified using the format property. By default, Vega-Lite infers the format type from the file extension.

Here is a list of all properties describing a data source from URL:

Property Type Description
url String

Required. A URL from which to load the data set. Use the format.type property to ensure the loaded data is correctly parsed.

name String

Provide a placeholder name and bind data at runtime.

format DataFormat

An object that specifies the format for parsing the data.

For example, the following specification loads data from a relative URL: data/cars.json. Note that no format is given, so the format type "json" is inferred from the file extension.
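A minimal sketch of a specification that loads data/cars.json. The field names Horsepower and Miles_per_Gallon are assumptions about the contents of that file, and urlSpec is a hypothetical variable name.

```javascript
// Illustrative spec: no "format" is given, so the type is taken from the ".json" extension.
const urlSpec = {
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    // Field names are assumed to exist in the loaded records.
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"}
  }
};
```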

Named Data Sources

Data can also be added at runtime through the Vega View API. Data sources are referenced by name, which is specified in Vega-Lite with the name property.

Here is a list of all properties describing a named data source:

Property Type Description
name String

Required. Provide a placeholder name and bind data at runtime.

New data may change the layout but Vega does not always resize the chart. To update the layout when the data updates, set autosize or explicitly use view.resize.

format DataFormat

An object that specifies the format for parsing the data.

For example, to create a data source named myData, use the following data property:

{
  "name": "myData"
}

You can use the Vega View API to load data at runtime and update the chart. Here is an example using Vega-Embed:

// Embed the chart, then insert rows into the named data source and re-run the view.
vegaEmbed('#vis', spec).then(res =>
  res.view
    .insert('myData', [
      /* some data array */
    ])
    .run()
);

You can also use a changeset to modify the data on the chart, as done in the data streaming demo.
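A changeset update might look like the following sketch. It assumes that vega is the Vega runtime object and that view is the View returned by Vega-Embed (res.view). The helper name slideWindow, the field x, and the cutoff parameter are hypothetical.

```javascript
// Hypothetical helper: in one update, append new rows to "myData" and
// drop existing rows whose x value is below `cutoff`.
function slideWindow(vega, view, newRows, cutoff) {
  const changes = vega
    .changeset()
    .insert(newRows)             // rows to add
    .remove(d => d.x < cutoff);  // predicate selecting rows to remove
  // Apply the changeset to the named data source and re-run the dataflow.
  return view.change('myData', changes).run();
}
```

It could be called from the Vega-Embed callback, for instance as vegaEmbed('#vis', spec).then(res => slideWindow(vega, res.view, rows, 10)), where rows is whatever array the application has just received.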

Format

The format object describes the data format and additional parsing instructions.

Property Type Description
type String

Type of input data: "json", "csv", "tsv", "dsv", or "topojson".

Default value: The default format type is determined by the extension of the file URL. If no extension is detected, "json" will be used by default.

parse Object | Null

If set to null, disable type inference based on the spec and only use type inference based on the data. Alternatively, a parsing directive object can be provided for explicit data types. Each property of the object corresponds to a field name, and the value to the desired data type (one of "number", "boolean", "date", or null (do not parse the field)). For example, "parse": {"modified_on": "date"} parses the modified_on field in each input record as a Date value.

For "date", values are parsed using JavaScript’s Date.parse(). Specific date formats can be provided (e.g., {foo: "date:'%m%d%Y'"}), using the d3-time-format syntax. UTC date format parsing is supported similarly (e.g., {foo: "utc:'%m%d%Y'"}). See the documentation on UTC time for details.
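To make the parse directives concrete, here is a sketch of a data property that combines several of them. The file data/events.csv and every field name other than modified_on are hypothetical, as is the variable name parsedData.

```javascript
// Illustrative data property with explicit per-field parsing.
const parsedData = {
  "url": "data/events.csv",            // hypothetical file
  "format": {
    "type": "csv",
    "parse": {
      "modified_on": "date",           // parsed via Date.parse()
      "created_on": "date:'%m%d%Y'",   // explicit d3-time-format pattern
      "logged_on": "utc:'%m%d%Y'",     // same pattern, interpreted as UTC
      "count": "number",
      "active": "boolean",
      "label": null                    // leave this field unparsed
    }
  }
};
```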

json

Loads a JavaScript Object Notation (JSON) file. Assumes row-oriented data, where each row is an object with named attributes. This is the default file format, and so will be used if no format property is provided. If specified, the format property should have a type property of "json", and can also accept the following:

Property Type Description
property String

The JSON property containing the desired data. This parameter can be used when the loaded JSON file may have surrounding structure or meta-data. For example "property": "values.features" is equivalent to retrieving json.values.features from the loaded JSON object.
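The sketch below shows a data property that uses property, followed by a plain JavaScript illustration of what the dotted path resolves to. The file name, the sample object, and the variable names are hypothetical, and the path lookup only imitates the documented behavior.

```javascript
// Illustrative data property: the records live under values.features in the file.
const nestedData = {
  "url": "data/response.json",  // hypothetical file
  "format": {"type": "json", "property": "values.features"}
};

// Illustration only: resolve the dotted path against an already loaded object.
const loaded = {"meta": {"count": 2}, "values": {"features": [{"id": 1}, {"id": 2}]}};
const records = nestedData.format.property
  .split(".")
  .reduce((obj, key) => obj[key], loaded);
// `records` is the same array as loaded.values.features
```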

csv

Load a comma-separated values (CSV) file. This format type does not support any additional properties.

tsv

Load a tab-separated values (TSV) file. This format type does not support any additional properties.

dsv

Load a delimited text file with a custom delimiter. This is a general version of CSV and TSV.

Property Type Description
delimiter String

Required. The delimiter between records. The delimiter must be a single character (i.e., a single 16-bit code unit); so, ASCII delimiters are fine, but emoji delimiters are not.
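A sketch of a pipe-delimited source, together with a small check of the single-code-unit constraint. The file name and the helper isValidDelimiter are hypothetical; the check relies on the fact that a JavaScript string's length counts 16-bit code units.

```javascript
// Illustrative data property for a pipe-delimited file.
const pipeData = {
  "url": "data/table.txt",  // hypothetical file
  "format": {"type": "dsv", "delimiter": "|"}
};

// Hypothetical helper: a delimiter is acceptable if it is exactly one 16-bit code unit.
const isValidDelimiter = s => typeof s === "string" && s.length === 1;

const pipeOk = isValidDelimiter("|");    // true
const emojiOk = isValidDelimiter("😀");  // false: an emoji outside the BMP takes two code units
```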

topojson

Load a JavaScript Object Notation (JSON) file using the TopoJSON format. The input file must contain valid TopoJSON data. The TopoJSON input is then converted into a GeoJSON format. There are two mutually exclusive properties that can be used to specify the conversion process:

Property Type Description
feature String

The name of the TopoJSON object set to convert to a GeoJSON feature collection. For example, in a map of the world, there may be an object set named "countries". Using the feature property, we can extract this set and generate a GeoJSON feature object for each country.

mesh String

The name of the TopoJSON object set to convert to a mesh. Similar to the feature option, mesh extracts a named TopoJSON object set. Unlike the feature option, the corresponding geo data is returned as a single, unified mesh instance, not as individual GeoJSON features. Extracting a mesh is useful for more efficiently drawing borders or other geographic elements that you do not need to associate with specific regions such as individual countries, states, or counties.
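The two options can be contrasted in a layered map: one layer extracts features so that each country is a separate shape, and the other extracts a mesh for drawing borders. This is a sketch under assumptions: the file data/world-110m.json is assumed to be TopoJSON containing an object set named "countries", and the projection, colors, and variable names are illustrative.

```javascript
// One GeoJSON feature per country (useful when regions carry data).
const countriesData = {
  "url": "data/world-110m.json",  // assumed TopoJSON file
  "format": {"type": "topojson", "feature": "countries"}
};

// A single unified mesh of the same object set (useful for borders only).
const bordersData = {
  "url": "data/world-110m.json",
  "format": {"type": "topojson", "mesh": "countries"}
};

const mapSpec = {
  "width": 500,
  "height": 300,
  "projection": {"type": "equalEarth"},
  "layer": [
    {"data": countriesData, "mark": {"type": "geoshape", "fill": "lightgray"}},
    {"data": bordersData, "mark": {"type": "geoshape", "filled": false, "stroke": "white"}}
  ]
};
```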

Data Generators

Sequence Generator

The sequence generator creates a set of numeric values based on given start, stop, and (optional) step properties. By default, new objects with a single field named data are generated; use the as property to change the field name.

Property Type Description
start Number

Required. The starting value of the sequence (inclusive).

stop Number

Required. The ending value of the sequence (exclusive).

step Number

The step value between sequence entries.

Default value: 1

as String

The name of the generated sequence field.

Default value: "data"

For example, the following specification generates a domain of number values and then uses calculate transforms to draw a sine curve:
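A minimal sketch of such a specification. The range 0 to 12.7, the step of 0.1, the field names x and sin_x, and the chart size are illustrative assumptions.

```javascript
// Illustrative spec: generate x values, derive sin(x), and draw a line.
const sineSpec = {
  "width": 300,
  "height": 150,
  // Sequence from 0 (inclusive) up to 12.7 (exclusive), stored in field "x".
  "data": {"sequence": {"start": 0, "stop": 12.7, "step": 0.1, "as": "x"}},
  "transform": [
    {"calculate": "sin(datum.x)", "as": "sin_x"}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "x", "type": "quantitative"},
    "y": {"field": "sin_x", "type": "quantitative"}
  }
};
```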

Graticule Generator

A graticule is a grid formed by lines of latitude and longitude. The graticule generator creates a geographic grid (as GeoJSON data) to serve as a guiding element to include in maps. The graticule generator can be specified with either a boolean true value (indicating the default graticule) or a graticule property object:

Property Type Description
extent Array

Sets both the major and minor extents to the same values.

extentMajor Array

The major extent of the graticule as a two-element array of coordinates.

extentMinor Array

The minor extent of the graticule as a two-element array of coordinates.

precision Number

The precision of the graticule in degrees.

Default value: 2.5

step Array

Sets both the major and minor step angles to the same values.

stepMajor Array

The major step angles of the graticule.

Default value: [90, 360]

stepMinor Array

The minor step angles of the graticule.

Default value: [10, 10]

The following example generates a custom graticule and visualizes it using an orthographic projection:
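A sketch of such a specification, assuming a custom step of 15 degrees in both directions. The rotation, size, and variable name are illustrative.

```javascript
// Illustrative spec: custom graticule drawn with an orthographic projection.
const graticuleSpec = {
  "width": 300,
  "height": 300,
  "projection": {"type": "orthographic", "rotate": [0, -45, 0]},
  // "step" sets both the major and minor step angles.
  "data": {"graticule": {"step": [15, 15]}},
  "mark": {"type": "geoshape"}
};
```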

Sphere Generator

A GeoJSON sphere represents the full globe. The sphere generator injects a dataset whose contents are simply [{"type": "Sphere"}]. The resulting sphere can be used as a background layer within a map to represent the extent of the Earth. The sphere property takes either a boolean true value or an empty object {} as its value.

The following example generates a layered base map containing a sphere (light blue fill) and a default graticule (black strokes):
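A sketch of such a layered base map. The projection, the size, the exact fill color, and the stroke width are illustrative assumptions.

```javascript
// Illustrative spec: sphere background plus default graticule.
const baseMapSpec = {
  "width": 300,
  "height": 200,
  "projection": {"type": "naturalEarth1"},
  "layer": [
    {
      "data": {"sphere": true},  // the full globe as a background
      "mark": {"type": "geoshape", "fill": "aliceblue"}
    },
    {
      "data": {"graticule": true},  // default graticule settings
      "mark": {"type": "geoshape", "stroke": "black", "strokeWidth": 0.5}
    }
  ]
};
```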

Datasets

Vega-Lite supports a top-level datasets property. This can be useful when the same data should be inlined in different places in the spec. Instead of setting values inline, specify datasets at the top level and then refer to the named data source in the rest of the spec. The datasets property is a mapping from a name to an inline dataset.

    "datasets": {
      "somedata": [1,2,3]
    },
    "data": {
      "name": "somedata"
    }
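To show the reuse that datasets enables, the sketch below defines one dataset and refers to it from two concatenated views. The dataset contents, the field names x and y, and the variable name sharedDataSpec are hypothetical.

```javascript
// Illustrative spec: one top-level dataset referenced by name in two views.
const sharedDataSpec = {
  "datasets": {
    "somedata": [
      {"x": 1, "y": 4}, {"x": 2, "y": 7}, {"x": 3, "y": 5}
    ]
  },
  "hconcat": [
    {
      "data": {"name": "somedata"},
      "mark": "line",
      "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"}
      }
    },
    {
      "data": {"name": "somedata"},
      "mark": "bar",
      "encoding": {
        "x": {"field": "x", "type": "ordinal"},
        "y": {"field": "y", "type": "quantitative"}
      }
    }
  ]
};
```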