You can use our API to query the data in your Luzmo dataset. This can be useful when you'd like to:
Visualize some data yourself in your application (e.g. in your own custom widget)
Act based on the result of a query (e.g. only allow access to an embedded dashboard in case enough data was captured to power the dashboard)
Offer a (scoped) export of data that is not visualized in a dashboard.
If you have an embedded widget that displays your desired data, you could instead use the getData() method in your frontend (more info on this method can be found here).
In this guide, we'll first explain the different properties of the API Query Syntax, after which you can find some examples.
Via the Data API service, you can fire one or more queries simultaneously to any of the Luzmo datasets to which your token has at least "can use" access. Each query is specified as an object inside the queries array. Below you can see the full body structure of an API request to query data via Luzmo:
const Luzmo = require('@luzmo/nodejs-sdk');
const client = new Luzmo({
api_key: "< your API key >",
api_token: "< your API token >"
});
const data = await client.query({
queries: [
{
dimensions: [ { ... }, { ... }, ... ],
measures: [ { ... }, { ... }, ... ],
where: [ { ... }, { ... }, ... ],
order: [ { ... }, { ... }, ... ],
limit: { by: ###, offset: ### },
options: { ... }
},
...
]
});
Note: The above API request can be made either
with an API key & token, in which case the request should happen in your backend so that your API key and token are not exposed,
or with an end user's Embed key & token, in which case the request can be made in your frontend or backend.
The token must have at least "can use" access rights to the dataset(s) to allow querying the dataset freely (more info about the different access rights in this Academy article).
When querying data with an Embed token, its context will be taken into account; e.g. parameter overrides to filter to specific data points, connection overrides (account_overrides) to point towards different data sources, etc.
Dimension columns are typically columns that are grouped on.
Each dimension column specified should use the following structure:
"dimensions": [
{
"column_id": "< column_id >",
"dataset_id": "< dataset_id >"
},
...
]
For numeric columns, you can optionally bin values by adding a discretization property:
"dimensions": [
{
"column_id": "< column_id >",
"dataset_id": "< dataset_id >",
"discretization": {
"type": "linear" // Specify "type": "none" to disable the binning functionality.
"bins": ## // Width of bins. Omit if binning is not enabled.
},
...
] For
"dimensions": [
{
"column_id": "< column_id >",
"dataset_id": "< dataset_id >",
"level": #
},
...
]
Measure columns are aggregated fields.
When aggregating directly on a column, it's important to specify the appropriate aggregation type in the measure. The table below shows which aggregation types are supported for which column types; more information on the supported aggregation types can be found in this Academy article:
| Aggregation type | Numeric | Hierarchy | Datetime |
|---|---|---|---|
| count (*) | ✔ | ✔ | ✔ |
| distinctcount | ✔ | ✔ | ✔ |
| sum | ✔ (**) | ✘ | ✘ |
| average | ✔ | ✘ | ✘ |
| weightedaverage | ✔ | ✘ | ✘ |
| median | ✔ | ✘ | ✘ |
| min | ✔ | ✔ (***) | ✔ |
| max | ✔ | ✔ (***) | ✔ |
| stddev | ✔ | ✘ | ✘ |
| rate | ✔ | ✘ | ✘ |
| cumulativesum | ✔ | ✘ | ✘ |
Notes for table above:
*(*) There are two types of counts possible via the Luzmo API. You can count non-empty values of a specific column by specifying column_id: "<column_id>", or you can count the number of rows (including NULLs) by specifying "*" as column_id. In the latter case you can even omit the aggregation property entirely.*
*(**) The sum aggregation will be used by default if no aggregation property is specified for a numeric column.*
*(***) The min and max of a hierarchy is based upon alphabetical sorting of the values.*
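As a quick local sketch of the count variants described in note (*), using made-up sample values:

```javascript
// Local sketch of count with "*" (all rows, including NULLs), count on a
// column (non-empty values only), and distinctcount. Sample data is made up.
const values = ["spicy", null, "sweet", null, "spicy"];

const countAllRows = (col) => col.length;
const countNonEmpty = (col) =>
  col.filter((v) => v !== null && v !== undefined).length;
const distinctCount = (col) =>
  new Set(col.filter((v) => v !== null && v !== undefined)).size;

console.log(countAllRows(values)); // 5: every row, NULLs included
console.log(countNonEmpty(values)); // 3: non-empty values only
console.log(distinctCount(values)); // 2: "spicy" and "sweet"
```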
Each measure column specified should use the following structure:
"measures": [
{
"column_id": "< column_id >",
"dataset_id": "< dataset_id >",
"aggregation": {
"type": "< aggregation_type >"
}
},
...
]
For rate or weightedaverage, you can additionally specify the denominator/weight numeric column as follows:
"measures": [
{
"column_id": "< numeric_column_id >",
"dataset_id": "< dataset_id >",
"aggregation": {
"type": "< rate / weightedaverage >",
"column_id": "< denominator / weight numeric column id >",
"dataset_id": "< dataset_id >"
}
},
...
]
In case you'd like to query an Aggregation Formula you created via our UI or via the API, you can reference its ID as follows. When querying an Aggregation Formula, you no longer specify an aggregation type, as the aggregation is already defined in the formula. You also don't need to specify the dataset ID, since the formula has its own globally unique ID.
{
...,
"measures": [
{
"formula_id": "< formula_id >"
},
...
],
...
}
You can also query ad-hoc formulas by using the formula property as follows. In this case, you do need to specify the dataset ID.
{
...,
"measures": [
{
"dataset_id": "< dataset_id >",
"formula": "AVG( < dataset_id >:< column_id > ) / MIN( < dataset_id >:< column_id > )"
},
...
],
...
}
If desired, you can first validate the ad-hoc formula via the API.
The "where" property can be used to filter the requested data based on a combination of filters and filter groups. Filters can be nested within so-called Filter groups with logical operators and and or , allowing for more complex filtering conditions. Without any filter groups, filters will be concatenated using the and operator. All "where" filters will be applied before aggregating/grouping.
Note:
Filters or filter groups specified within the "Get Data" API request are always concatenated with filters applied on the Authorization token used to make the request, using "and" operators. This allows you to e.g. request an Embed token with specific parameter overrides, and use this token to request data via the Data API Service without having to specify the same multi-tenant filters in the request itself.
More information on the merging of data filters for Embed tokens can be found in the introduction of the "Create Authorization" documentation.
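Conceptually, the merge behaves as if both filter lists were concatenated into one top-level "where" list (a simplified sketch; the actual merging happens server-side in Luzmo, and the column IDs below are hypothetical):

```javascript
// Simplified sketch: filters baked into the Authorization token and filters
// from the request body end up AND-ed together, as if concatenated into one
// top-level "where" list. Real merging happens server-side.
function mergeWhereFilters(tokenFilters, requestFilters) {
  return [...tokenFilters, ...requestFilters]; // top-level entries are AND-ed
}

const tokenFilters = [
  { expression: "? = ?", parameters: [{ column_id: "tenant_column_id" }, "acme"] },
];
const requestFilters = [
  { expression: "? > ?", parameters: [{ column_id: "price_column_id" }, 10] },
];

console.log(mergeWhereFilters(tokenFilters, requestFilters).length); // 2
```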
A filter group is a list of one or more filters or filter groups, which will be concatenated using the specified and/or operator. Each filter group is an object whose key is the operator to use (either and or or) and whose value is a list of one or more filters or nested filter groups.
/* Concatenating with "AND" */
[ <filters_or_filter_groups_to_AND> ]
/* Concatenating with "OR" */
{
"or": [ <filters_or_filter_groups_to_OR> ]
}
Note:
Any items positioned directly within the outer "where" list will always be concatenated using an "and" operator. For example,
"where": [
{<or_filter_group>},
{<filter_2>},
...
]
is equivalent to filtering to WHERE ( <or_filter_group> ) AND <filter_2> AND ... in SQL.
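To illustrate these nesting semantics locally, here's a tiny evaluator in which filters are simplified to plain predicate functions (illustrative only; real filtering happens in Luzmo's query engine):

```javascript
// Minimal local evaluator for the and/or filter-group semantics described
// above. Filters are simplified to predicate functions for illustration.
function evaluate(node, row) {
  if (typeof node === "function") return node(row); // a single filter
  if (node.and) return node.and.every((n) => evaluate(n, row));
  if (node.or) return node.or.some((n) => evaluate(n, row));
  throw new Error("Unknown node type");
}

// "where": [ <or_filter_group>, <filter_2> ] => ( a OR b ) AND c
const where = [
  { or: [(r) => r.text === "value", (r) => r.number > 10] },
  (r) => r.other !== "other",
];
const matches = (row) => where.every((n) => evaluate(n, row)); // top level is AND-ed

console.log(matches({ text: "value", number: 3, other: "x" })); // true
console.log(matches({ text: "nope", number: 3, other: "x" })); // false
```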
Each filter should use the following structure. The format of <value(s)> depends on the filter expression you specify and can be found in the table below.
{
"expression": "< expression >",
"parameters": [
{
"column_id": "< your_column_id >",
"dataset_id": "< your_dataset_id >"
},
<value(s)>
]
}
You specify the dataset_id and column_id on which you'd like to set a filter. The supported expressions and expected values can be found in the table below:
| Filter expression | UI equivalent | Supported column types | <value(s)> | <value(s)> example(s) |
|---|---|---|---|---|
| "? is null" / "? is not null" | is missing / is not missing |  |  |  |
| "? in ?" / "? not in ?" | is in / is not in |  |  |  |
| "? = ?" / "? != ?" | equals / not equals |  |  |  |
| "? < ?" / "? <= ?" / "? > ?" / "? >= ?" | less than / less than or equal / greater than / greater than or equal |  |  |  |
| "? like ?" / "? not like ?" / "? starts with ?" / "? not starts with ?" / "? ends with ?" / "? not ends with ?" | contains / does not contain / starts with / does not start with / ends with / does not end with |  |  |  |
| "last_now" / "last_available" / "next_now" | last (x) period(s) / last available (x) period(s) / next (x) period(s) |  |  |  |
| "last_completed" / "to_date" / "next_full" | last completed period / period to date / next full period |  |  |  |
| "more_than_ago" / "or_more_ago" | more than (x) period(s) ago / (x) or more period(s) ago |  |  |  |
The following examples show Luzmo API filters and their SQL equivalents.
{
...,
"where": [
{
"expression": "? = ?",
"parameters": [
{
"column_id": "< hierarchy_column_id >",
"dataset_id": "< dataset_id >"
},
"value"
]
}
]
}
SQL equivalent of the above API filter:
WHERE text_column = 'value'
{
...,
"where": [
{
"expression": "? = ?",
"parameters": [
{
"column_id": "< hierarchy_column_id >",
"dataset_id": "< dataset_id >"
},
"value"
]
},
{
"expression": "? > ?",
"parameters": [
{
"column_id": "< numeric_column_id >",
"dataset_id": "< dataset_id >"
},
10
]
}
]
}
SQL equivalent of the "and" filter group:
WHERE
text_column = 'value'
AND number_column > 10
{
...,
"where": [
{
"or": [
{
"expression": "? = ?",
"parameters": [
{
"column_id": "< hierarchy_column_id >",
"dataset_id": "< dataset_id >"
},
"value"
]
},
{
"expression": "? > ?",
"parameters": [
{
"column_id": "< numeric_column_id >",
"dataset_id": "< dataset_id >"
},
10
]
}
]
}
]
}
SQL equivalent of the "or" filter group:
WHERE
text_column = 'value'
OR number_column > 10
{
...,
"where": [
{
"or": [
{
"expression": "? = ?",
"parameters": [
{
"column_id": "< hierarchy_column_id >",
"dataset_id": "< dataset_id >"
},
"value"
]
},
{
"expression": "? > ?",
"parameters": [
{
"column_id": "< numeric_column_id >",
"dataset_id": "< dataset_id >"
},
10
]
}
]
},
{
"expression": "? != ?",
"parameters": [
{
"column_id": "< other_hierarchy_column_id >",
"dataset_id": "< dataset_id >"
},
"other"
]
}
]
}
SQL equivalent of the nested API filter groups:
WHERE
(
text_column = 'value'
OR number_column > 10
)
AND another_text_column != 'other'
In contrast to where filters, "having" filters are applied after grouping and aggregating; this allows you to filter based on aggregated values (e.g. only retrieve groups in which the total price exceeds a specific value).
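A local sketch of the difference, with made-up rows: where prunes individual rows before grouping, while having prunes aggregated groups afterwards:

```javascript
// Made-up rows to contrast "where" (row-level, before aggregation) with
// "having" (group-level, after aggregation).
const rows = [
  { type: "spicy", price: 8 },
  { type: "spicy", price: 12 },
  { type: "sweet", price: 3 },
  { type: "sweet", price: 2 },
];

// WHERE price > 2: applied to raw rows, before grouping.
const filtered = rows.filter((r) => r.price > 2);

// GROUP BY type, SUM(price).
const totals = {};
for (const r of filtered) totals[r.type] = (totals[r.type] || 0) + r.price;

// HAVING SUM(price) > 10: applied to the aggregated groups.
const result = Object.entries(totals).filter(([, total]) => total > 10);

console.log(result); // [["spicy", 20]] - "sweet" (total 3) is dropped
```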
Just like in the where property, you can specify either a list of filters, or an and/or filter group with its nested filters or filter groups. Without any filter groups, filters will be concatenated using the and operator. The filter-group syntax is identical to the "where" property's filter groups:
/* Concatenating with "AND" */
{
"and": [ <having_filters_or_filter_groups_to_AND> ]
}
/* Concatenating with "OR" */
{
"or": [ <having_filters_or_filter_groups_to_OR> ]
}
You specify the dataset ID and column ID on which you'd like to set a "having" filter, as well as the aggregation type, filter expression, and filter value(s). You can also apply having filters on Aggregation Formulas (see below).
{
"expression": "< expression >",
"parameters": [
{
"column_id": "< your_column_id_to_aggregate >",
"dataset_id": "< your_dataset_id >",
"aggregation": {
"type": "< aggregation_type >",
"dataset_id": "< optional_denominator_or_average_weight_dataset_id >",
"column_id": "< optional_denominator_or_average_weight_column_id >"
}
},
<value(s)>
]
}
You can specify a having filter on an Aggregation Formula as follows:
{
"expression": "< expression >",
"parameters": [
{
"formula_id": "< your_formula_id >",
"dataset_id": "< your_dataset_id >"
},
<value(s)>
]
}
You can even specify an ad-hoc Aggregation Formula using the following syntax:
{
"expression": "< expression >",
"parameters": [
{
"formula": "AVG( < dataset_id >:< column_id > ) / MIN( < dataset_id >:< column_id > )",
"dataset_id": "< your_dataset_id >"
},
<value(s)>
]
}
The supported aggregations for each column type are listed here. The supported expressions and expected values can be found in the table below:
| Filter expression | UI equivalent | Aggregation types | <value(s)> | <value(s)> example(s) |
|---|---|---|---|---|
| "? is null" / "? is not null" | is missing / is not missing | All aggregation types are supported on all column types, except count and distinctcount; these count aggregations would always result in 0 instead of null. |  |  |
| "? in ?" / "? not in ?" | is in / is not in | All aggregations supported on all column types that result in numeric values (e.g. count, distinctcount, sum, average, etc.). Note that for min/max aggregations on columns of type Hierarchy or Datetime, you should specify a list of string or RFC3339 datetime filter value(s), respectively. |  |  |
| "? = ?" / "? != ?" / "? < ?" / "? <= ?" / "? > ?" / "? >= ?" | equals / not equals / less than / less than or equal / greater than / greater than or equal | All aggregations supported on all column types that result in numeric values (e.g. count, distinctcount, sum, average, etc.). Note that for min/max aggregations on columns of type Hierarchy or Datetime, you should specify a string or RFC3339 datetime filter value, respectively. |  |  |
| "? like ?" / "? not like ?" / "? starts with ?" / "? not starts with ?" / "? ends with ?" / "? not ends with ?" | contains / does not contain / starts with / does not start with / ends with / does not end with | Only supported on columns of type Hierarchy with aggregation types min or max; other aggregation and/or column types aren't supported for these filter expressions. |  |  |
The following examples show Luzmo "having" API filters and their effects.
Less than 5 distinct categories
{
...,
"having": [
{
"expression": "? < ?",
"parameters": [
{
"column_id": "category_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "distinctcount"
}
},
5
]
}
]
}
Weighted average score smaller than 7.5
{
...,
"having": [
{
"expression": "? < ?",
"parameters": [
{
"column_id": "score_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "weightedaverage",
"column_id": "weight_column_id",
"dataset_id": "dataset_id",
}
},
7.5
]
}
]
}"And" filter group: less than 5 distinct categories, and weighted average score less than 7.5
{
...,
"having": [
{
"and": [
{
"expression": "? < ?",
"parameters": [
{
"column_id": "category_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "distinctcount"
}
},
5
]
},
{
"expression": "? < ?",
"parameters": [
{
"column_id": "score_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "weightedaverage",
"column_id": "weight_column_id",
"dataset_id": "dataset_id",
}
},
7.5
]
}
]
}
]
}"Or" filter group: less than 5 distinct categories, or weighted average score less than 7.5
{
...,
"having": [
{
"or": [
{
"expression": "? < ?",
"parameters": [
{
"column_id": "category_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "distinctcount"
}
},
5
]
},
{
"expression": "? < ?",
"parameters": [
{
"column_id": "score_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "weightedaverage",
"column_id": "weight_column_id",
"dataset_id": "dataset_id",
}
},
7.5
]
}
]
}
]
}
Nested filter groups:
Less than 5 distinct categories or weighted average score less than 7.5,
and average age greater than 25
{
...,
"having": [
{
"and": [
{
"or": [
{
"expression": "? < ?",
"parameters": [
{
"column_id": "category_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "distinctcount"
}
},
5
]
},
{
"expression": "? < ?",
"parameters": [
{
"column_id": "score_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "weightedaverage",
"column_id": "weight_column_id",
"dataset_id": "dataset_id",
}
},
7.5
]
}
]
},
{
"expression": "? > ?",
"parameters": [
{
"column_id": "age_column_id",
"dataset_id": "dataset_id",
"aggregation": {
"type": "average"
}
},
25
]
}
]
}
]
}
Here you can specify one or more columns to sort the result on. The order in which they are listed dictates the sort precedence. For each, specify a dataset_id, a column_id and the order ("asc" or "desc").
"order": [
{
"column_id": "< column_id >",
"dataset_id": "< dataset_id >",
"level": #,
"aggregation": {
"type": "< aggregation_type >",
},
"order": "asc" // Or "desc"
},
...
]
You can also sort on an Aggregation Formula as follows:
{
"order": [
{
"formula_id": "FORMULA_ID",
"order": "asc" // Or "desc"
}
]
}
You can even specify an ad-hoc Aggregation Formula to sort on using the following syntax:
{
"order": [
{
"dataset_id": "< dataset_id >",
"formula": "< ad_hoc_formula_expression >",
"order": "asc" // Or "desc"
}
]
}
This property can be specified to limit the number of rows returned. You can specify both a limit (i.e. the number of data points to return) and an offset (i.e. the number of data points to skip first); this can be useful for sequential data retrieval.
When specifying a limit, we highly recommend also specifying an order to ensure the query returns the expected results.
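As an illustration, sequential retrieval boils down to increasing the offset by the page size on each successive query; the helper and page size below are arbitrary assumptions, not part of the Luzmo API:

```javascript
// Sketch of sequential retrieval: derive the "limit" object for each page.
// Combine this with a stable "order" so pages neither overlap nor skip rows.
function pageLimit(pageIndex, pageSize = 1000) {
  return { by: pageSize, offset: pageIndex * pageSize };
}

console.log(pageLimit(0)); // { by: 1000, offset: 0 }
console.log(pageLimit(2, 500)); // { by: 500, offset: 1000 }
```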
"limit": {
"by": ###,
"offset": ###
}
The following options can optionally be specified:
rollup_data (boolean): if false , non-aggregated data will be returned (i.e. row-level data). Defaults to true .
timezone_id (String): can be used to specify the timezone in which the data should be queried. This timezone ID needs to be a valid identifier from the IANA timezone database . By default UTC data will be queried and returned.
Keep in mind that the data is returned as UTC values when grouping by hour, minute, second & millisecond level of a column (i.e. you should still apply the offset yourself after data retrieval). When grouping by the date levels year, quarter, month, week & day, the data will be returned in the timezone specified.
For datetime columns of subtype "date", no timezone shift will be applied, as these columns (are assumed to) lack time information; instead, dates will always be returned as stored in the data source. More information about the date & datetime column subtype can be found in this Academy article .
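As an illustration of why timezone_id matters for day-level grouping, the same UTC instant can fall on a different calendar day in another IANA timezone; a quick Node.js sketch using the built-in Intl API (the timestamp is made up):

```javascript
// The same UTC instant rendered as a calendar day in two IANA timezones.
const instant = new Date("2024-01-01T03:00:00Z");

const dayIn = (timeZone) =>
  new Intl.DateTimeFormat("en-CA", { timeZone, dateStyle: "short" }).format(instant);

console.log(dayIn("UTC")); // 2024-01-01
console.log(dayIn("America/New_York")); // 2023-12-31, still New Year's Eve
```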
"options": {
"rollup_data": false, // => Row-level data
"timezone_id": "America/New_york" // => Timezone of data returned
}
To retrieve the total number of burritos savoured per type of burrito, we can use the following API request:
const Luzmo = require('@luzmo/nodejs-sdk');
const client = new Luzmo({
api_key: "< your API key >",
api_token: "< your API token >"
});
const data = await client.query({
queries: [{
dimensions: [
{
column_id: "< type of burrito column id >",
dataset_id: "< burritos dataset id >",
},
],
measures: [
{
column_id: "< number burritos savoured column id >",
dataset_id: "< burritos dataset id >",
},
],
}]
});
console.log("Result set: ", data.data);
// Prints: [["spicy", 1256],
// ["sweet", 913]]A more elaborate example: let’s record the weight per burrito as another numeric measure.
We want the weights to be classified into 10 equal-sized bins, for example [50, 100[ grams and [100, 150[ grams. These bins are created by dividing the range between the minimum and maximum values of the specified column into equal parts.
Now we want to retrieve the top 5 weight classes by number of burritos savoured, but only for burritos that are of the type ‘spicy’.
A discretization with bins can be used to group numeric data into bins of a certain width. You can disable the binning of numeric columns by setting the discretization type to "none".
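As a local sketch of how linear binning derives the bin boundaries (the sample weight range of 50-550 grams is made up):

```javascript
// Sketch of linear discretization: split [min, max] into `bins` equal-width
// intervals. The sample range is made up for illustration.
function linearBinEdges(min, max, bins) {
  const width = (max - min) / bins;
  return Array.from({ length: bins + 1 }, (_, i) => min + i * width);
}

console.log(linearBinEdges(50, 550, 10));
// [50, 100, 150, ..., 550] => bins [50, 100[, [100, 150[, ...
```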
const Luzmo = require('@luzmo/nodejs-sdk');
const client = new Luzmo({
api_key: "< your API key >",
api_token: "< your API token >"
});
const data = await client.query({
queries: [{
dimensions: [
{
column_id: "< burrito weight column id >",
dataset_id: "< burritos dataset id >",
discretization: {
type: "linear",
bins: 10,
},
},
],
measures: [
{
column_id: "< number burritos savoured column id >",
dataset_id: "< burritos dataset id >",
aggregation: {
type: "sum",
},
},
],
where: [
{
expression: "? = ?",
parameters: [
{
column_id: "< type of burrito column id >",
dataset_id: "< burritos dataset id >",
},
"spicy",
],
},
],
order: [
{
column_id: "< number burritos savoured column id >",
dataset_id: "< burritos dataset id >",
aggregation: {
type: "sum",
},
order: "desc",
},
],
limit: {
by: 5,
},
}]
});
console.log("Result set:", data.data);
// Prints: [["[ 100 - 150 [", 125],
// ["[ 150 - 200 [", 121]]Next up, let's calculate the average number of burritos savoured divided by the minimum weight of them per burrito type.
To do this, we'll use a Formula, which lets us define compound operations involving aggregation functions. There are two types of Formulas available:
Ad-hoc Formulas are defined inside the query request
Regular Formulas are defined separately using the createFormula endpoint.
For this example, we'll use both ways in the same request; in your application, use the one which fits your use-case best.
If you wish to follow along, create a Formula with the following definition: AVG(< burritos dataset id >:< number burritos savoured column id >) / MIN(< burritos dataset id >:< burrito weight column id >)
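To make the formula concrete, here's what AVG / MIN evaluates to within one burrito-type group, on made-up rows:

```javascript
// Local sketch of what AVG(savoured) / MIN(weight) computes within one
// burrito-type group. The rows are made up for illustration.
const rows = [
  { savoured: 10, weight: 100 },
  { savoured: 14, weight: 120 },
];

const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
const min = (xs) => Math.min(...xs);

const result = avg(rows.map((r) => r.savoured)) / min(rows.map((r) => r.weight));
console.log(result); // 12 / 100 = 0.12
```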
const Luzmo = require("@luzmo/nodejs-sdk");
const client = new Luzmo({
api_key: "< your API key >",
api_token: "< your API token >"
});
const data = await client.query({
queries: [
{
dimensions: [
{
column_id: "< type of burrito column id >",
dataset_id: "< burritos dataset id >",
},
],
measures: [
{
formula_id: "< created formula id >",
},
{
dataset_id: "< burritos dataset id >",
formula: "AVG(< burritos dataset id >:< number burritos savoured column id >) / MIN(< burritos dataset id >:< burrito weight column id >)",
},
],
},
],
});
console.log("Result set:", data.data);
// Prints: [["spicy", 13.22, 13.22],
// ["savoury", 3.53, 3.53]]