This website is for Vega-Lite v2. Go to the main Vega-Lite homepage for the latest release.

Data

Akin to Vega’s data model, the basic data model used by Vega-Lite is tabular data, similar to a spreadsheet or a database table. Individual data sets are assumed to contain a collection of records, which may contain any number of named data fields.

Vega-Lite’s data property describes the visualization’s data source as part of the specification, which can be either inline data (values) or a URL from which to load the data (url). Alternatively, we can create an empty, named data source (name), which can be bound at runtime or populated from top-level datasets.

Types of Data Sources

Inline Data

Inline data can be specified using the values property. Here is a list of all properties of an inline data source:

| Property | Type | Description |
| --- | --- | --- |
| values | InlineDataset | Required. The full data set, included inline. This can be an array of objects or primitive values, an object, or a string. Arrays of primitive values are ingested as objects with a data property. Strings are parsed according to the specified format type. |
| name | String | Provide a placeholder name and bind data at runtime. |
| format | DataFormat | An object that specifies the format for parsing the data. |

For example, the following specification embeds an inline data table with nine rows and two columns (a and b).
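A minimal sketch of such a specification is shown here. The specific values and the bar-chart encoding are illustrative assumptions, not taken from the text above; only the shape (nine records with fields a and b under data.values) follows the description.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
      {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
      {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "ordinal"},
    "y": {"field": "b", "type": "quantitative"}
  }
}
```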

If the input data is simply an array of primitive values, each value is mapped to the data property of a new object. For example, [5, 3, 8, 1] is loaded as:

[ {"data": 5}, {"data": 3}, {"data": 8}, {"data": 1} ]
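Because each primitive value ends up under a field named data, an encoding can refer to that field directly. The following sketch assumes a tick mark and a quantitative x channel purely for illustration:

```json
{
  "data": {"values": [5, 3, 8, 1]},
  "mark": "tick",
  "encoding": {
    "x": {"field": "data", "type": "quantitative"}
  }
}
```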

You can also inline a string that will be parsed according to the specified format type.
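As a sketch of the string form, the data definition below inlines a small CSV string and sets format.type so that it is parsed as CSV. The column names and values are made up for illustration.

```json
{
  "data": {
    "values": "a,b\nA,28\nB,55\nC,43",
    "format": {"type": "csv"}
  }
}
```

Under these assumptions the string should yield three records with the fields a and b.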

Data from URL

Data can be loaded from a URL using the url property. In addition, the format of the input data can be specified using the format property (for example, format.type). By default, Vega-Lite infers the type from the file extension.

Here is a list of all properties describing a data source from URL:

| Property | Type | Description |
| --- | --- | --- |
| url | String | Required. A URL from which to load the data set. Use the format.type property to ensure the loaded data is correctly parsed. |
| name | String | Provide a placeholder name and bind data at runtime. |
| format | DataFormat | An object that specifies the format for parsing the data. |

For example, the following specification loads data from a relative URL: data/cars.json. Note that no format needs to be given here, because the format type is implicitly "json".
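A minimal sketch of a specification that loads from this URL follows. The field names Horsepower and Miles_per_Gallon and the point mark are assumptions about the contents of cars.json, used only for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"}
  }
}
```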

Named Data Sources

Data can also be added at runtime through the Vega View API. Data sources are referenced by name, which is specified in Vega-Lite with name.

Here is a list of all properties describing a named data source:

| Property | Type | Description |
| --- | --- | --- |
| name | String | Required. Provide a placeholder name and bind data at runtime. |
| format | DataFormat | An object that specifies the format for parsing the data. |

For example, to create a data source named myData, use the following data definition:

{
    "name": "myData"
}

Format

The format object describes the data format and additional parsing instructions.

| Property | Type | Description |
| --- | --- | --- |
| type | String | Type of input data: "json", "csv", "tsv", "dsv", "topojson". The default format type is determined by the extension of the file URL. If no extension is detected, "json" will be used by default. |
| parse | String \| Parse \| Null | If set to "auto" (the default), perform automatic type inference to determine the desired data types. If set to null, disable type inference based on the spec and only use type inference based on the data. Alternatively, a parsing directive object can be provided for explicit data types. Each property of the object corresponds to a field name, and the value to the desired data type (one of "number", "boolean", "date", or null (do not parse the field)). For example, "parse": {"modified_on": "date"} parses the modified_on field in each input record as a Date value. For "date", values are parsed using JavaScript’s Date.parse(). Specific date formats can be provided (e.g., {foo: 'date:"%m%d%Y"'}) using the d3-time-format syntax. UTC date format parsing is supported similarly (e.g., {foo: 'utc:"%m%d%Y"'}). See more about UTC time. |
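To make the parsing directive concrete, here is a sketch of a data definition that sets explicit types for three fields. The file name data/records.csv and the fields count and created_on are hypothetical; modified_on is the field from the example above.

```json
{
  "data": {
    "url": "data/records.csv",
    "format": {
      "type": "csv",
      "parse": {
        "count": "number",
        "modified_on": "date",
        "created_on": "date:'%m%d%Y'"
      }
    }
  }
}
```

In this sketch, modified_on would go through Date.parse(), while created_on uses an explicit d3-time-format pattern, so a value such as 03152018 would be read as month, day, and year.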

json

Loads a JavaScript Object Notation (JSON) file. Assumes row-oriented data, where each row is an object with named attributes. This is the default file format, and so will be used if no format parameter is provided. If specified, the format parameter should have a type property of "json", and can also accept the following:

| Property | Type | Description |
| --- | --- | --- |
| property | String | The JSON property containing the desired data. This parameter can be used when the loaded JSON file may have surrounding structure or meta-data. For example "property": "values.features" is equivalent to retrieving json.values.features from the loaded JSON object. |
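As a sketch, the data definition below extracts the records from a nested property of the loaded file. The file name data/response.json is hypothetical, and the file is assumed to hold an object of the form {"values": {"features": [ ... ]}}.

```json
{
  "data": {
    "url": "data/response.json",
    "format": {
      "type": "json",
      "property": "values.features"
    }
  }
}
```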

csv

Load a comma-separated values (CSV) file. This format type does not support any additional properties.

tsv

Load a tab-separated values (TSV) file. This format type does not support any additional properties.

dsv

Load a delimited text file with a custom delimiter. This is a general version of CSV and TSV.

| Property | Type | Description |
| --- | --- | --- |
| delimiter | String | Required. The delimiter between records. The delimiter must be a single character (i.e., a single 16-bit code unit); so, ASCII delimiters are fine, but emoji delimiters are not. |
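For instance, a pipe-delimited text file could be loaded with a data definition like the following sketch. The file name data/records.txt is hypothetical.

```json
{
  "data": {
    "url": "data/records.txt",
    "format": {
      "type": "dsv",
      "delimiter": "|"
    }
  }
}
```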

topojson

Load a JavaScript Object Notation (JSON) file using the TopoJSON format. The input file must contain valid TopoJSON data. The TopoJSON input is then converted into a GeoJSON format. There are two mutually exclusive properties that can be used to specify the conversion process:

| Property | Type | Description |
| --- | --- | --- |
| feature | String | The name of the TopoJSON object set to convert to a GeoJSON feature collection. For example, in a map of the world, there may be an object set named "countries". Using the feature property, we can extract this set and generate a GeoJSON feature object for each country. |
| mesh | String | The name of the TopoJSON object set to convert to mesh. Similar to the feature option, mesh extracts a named TopoJSON object set. Unlike the feature option, the corresponding geo data is returned as a single, unified mesh instance, not as individual GeoJSON features. Extracting a mesh is useful for more efficiently drawing borders or other geographic elements that you do not need to associate with specific regions such as individual countries, states or counties. |
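The following sketch shows the feature option in a data definition. It assumes a TopoJSON file at data/us-10m.json that contains an object set named "states"; both names are assumptions for illustration.

```json
{
  "data": {
    "url": "data/us-10m.json",
    "format": {
      "type": "topojson",
      "feature": "states"
    }
  }
}
```

To draw only the borders as one unified mesh, the same definition would use "mesh": "states" in place of the feature property, since the two properties are mutually exclusive.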

Datasets

Vega-Lite supports a top-level datasets property. This is useful when the same data would otherwise be inlined in several places in the spec. Instead of setting values inline, specify datasets at the top level and then refer to the named data source in the rest of the spec. datasets is a mapping from name to an inline dataset.

"datasets": {
  "somedata": [1,2,3]
},
"data": {
  "name": "somedata"
}
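Placed in a complete specification, the fragment above might look like the following sketch. The bar mark and the encodings are illustrative assumptions. Because somedata is an array of primitive values, each value is available under the data field, as described for inline data.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "datasets": {
    "somedata": [1, 2, 3]
  },
  "data": {
    "name": "somedata"
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "data", "type": "ordinal"},
    "y": {"field": "data", "type": "quantitative"}
  }
}
```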