Window

Edit this page

The window transform performs calculations over sorted groups of data objects. These calculations including ranking, lead/lag analysis, and aggregates such as running sums and averages. Calculated values are written back to the input data stream. If you only want to set the same aggregated value in a new field, you can use the simpler join aggregate transform.

Documentation Overview

Window Field Definition
Window Transform Definition
- Window Transform Field Definition
- Sort Field Definition
Window Only Operation Reference
Examples

Window Field Definition

// Any View Specification
{
  ...
  "transform": [
    {
      // Window Transform
      "window": [{
          "op": ...,
          "field": ...,
          "param": ...,
          "as": ...
      }],
      "sort": [
        {"field": ..., "order": ...}
      ],
      "ignorePeers": ...,
      "groupby": [
        "..."
      ],
      "frame": [...,...]
    }
     ...
  ],
  ...
}

Window Transform Definition

Property	Type	Description
window	WindowFieldDef[]	*Required.* The definition of the fields in the window, and what calculations to use.
frame	(Null \| Number)[]	A frame specification as a two-element array indicating how the sliding window should proceed. The array entries should either be a number indicating the offset from the current data object, or null to indicate unbounded rows preceding or following the current data object. The default value is `[null, 0]`, indicating that the sliding window includes the current object and all preceding objects. The value `[-5, 5]` indicates that the window should include five objects preceding and five objects following the current object. Finally, `[null, null]` indicates that the window frame should always include all data objects. If you this frame and want to assign the same value to add objects, you can use the simpler join aggregate transform. The only operators affected are the aggregation operations and the `first_value`, `last_value`, and `nth_value` window operations. The other window operations are not affected by this. Default value:: `[null, 0]` (includes the current object and all preceding objects)
ignorePeers	Boolean	Indicates if the sliding window frame should ignore peer values (data that are considered identical by the sort criteria). The default is false, causing the window frame to expand to include all peer values. If set to true, the window frame will be defined by offset values only. This setting only affects those operations that depend on the window frame, namely aggregation operations and the first_value, last_value, and nth_value window operations. Default value: `false`
groupby	String[]	The data fields for partitioning the data objects into separate windows. If unspecified, all data points will be in a single window.
sort	SortField[]	A sort field definition for sorting data objects within a window. If two data objects are considered equal by the comparator, they are considered “peer” values of equal rank. If sort is not specified, the order is undefined: data objects are processed in the order they are observed and none are considered peers (the ignorePeers parameter is ignored and treated as if set to `true`).

Window Transform Field Definition

Property	Type	Description
op	String \| String	*Required.* The window or aggregation operation to apply within a window (e.g., `"rank"`, `"lead"`, `"sum"`, `"average"` or `"count"`). See the list of all supported operations here.
param	Number	Parameter values for the window functions. Parameter values can be omitted for operations that do not accept a parameter. See the list of all supported operations and their parameters here.
field	String	The data field for which to compute the aggregate or window function. This can be omitted for window functions that do not operate over a field such as `"count"`, `"rank"`, `"dense_rank"`.
as	String	*Required.* The output name for the window operation.

Sort Field Definition

Property	Type	Description
field	String	*Required.* The name of the field to sort.
order	String \| Null	Whether to sort the field in ascending or descending order. One of `"ascending"` (default), `"descending"`, or `null` (do not sort).

Window Only Operation Reference

The valid operations include all aggregate operations plus the following window operations.

Operation	Parameter	Description
row_number	None	Assigns each data object a consecutive row number, starting from 1.
rank	None	Assigns a rank order value to each data object in a window, starting from 1. Peer values are assigned the same rank. Subsequent rank scores incorporate the number of prior values. For example, if the first two values tie for rank 1, the third value is assigned rank 3.
dense_rank	None	Assigns dense rank order values to each data object in a window, starting from 1. Peer values are assigned the same rank. Subsequent rank scores do not incorporate the number of prior values. For example, if the first two values tie for rank 1, the third value is assigned rank 2.
percent_rank	None	Assigns a percentage rank order value to each data object in a window. The percent is calculated as (rank - 1) / (group_size - 1).
cume_dist	None	Assigns a cumulative distribution value between 0 and 1 to each data object in a window.
ntile	Number	Assigns a quantile (e.g., percentile) value to each data object in a window. Accepts an integer parameter indicating the number of buckets to use (e.g., 100 for percentiles, 5 for quintiles).
lag	Number	Assigns a value from the data object that precedes the current object by a specified number of positions. If no such object exists, assigns `null`. Accepts an offset parameter (default `1`) that indicates the number of positions. This operation must have a corresponding entry in the fields parameter array.
lead	Number	Assigns a value from the data object that follows the current object by a specified number of positions. If no such object exists, assigns `null`. Accepts an offset parameter (default `1`) that indicates the number of positions. This operation must have a corresponding entry in the fields parameter array.
first_value	None	Assigns a value from the first data object in the current sliding window frame. This operation must have a corresponding entry in the fields parameter array.
last_value	None	Assigns a value from the last data object in the current sliding window frame. This operation must have a corresponding entry in the fields parameter array.
nth_value	Number	Assigns a value from the nth data object in the current sliding window frame. If no such object exists, assigns `null`. Requires a non-negative integer parameter that indicates the offset from the start of the window frame. This operation must have a corresponding entry in the fields parameter array.

Examples

Below are some common use cases for the window transform.

Cumulative Frequency Distribution

Here we use the window transform with frame: [null, 0] to accumulate count in a cumulative frequency distribution plot.

Rank Chart

We can also use rank operator to calculate ranks over time.

Top K

Here we use window transform to derive the total number of students along with the rank of the current student to determine the top K students and display their score.

Also see this example for a top-K plot with “others”.

Cumulative Running Average

Here we use window transform to visualize how the average MPG for vehicles have changed over the years.

Percent of Total

The window transform can be used to compute an aggregate and attach it to all records in order to derive a percent of total, however, a simpler approach is to use the join aggregate transform instead.