Functions & Calculations

Row-Level Transformation Functions

Transformation functions only affect a single row of data and do not require data from multiple rows in order to execute correctly. Any functions added as transformation functions are fundamentally “safe” to the system and can be applied either in the model transformation layer or the field calculation layer. Since the application of transformation functions can create new dimensions or groupings of values, all transformation functions are applied prior to aggregation in the Pano query service.

A Field must be created before it can be referenced in a calculation. Currently, calculations referencing other fields can only go six layers deep, meaning there cannot be more than five degrees of abstraction between the final Field and the original mapped fields it references.

Text Functions

PARSE( {DIMENSION}, DELIMITER_CHARACTER, INDEX )

  • Input: a single dimension field; a delimiter character used to split the text into chunks; an integer index referencing which chunk to output
  • Output: a single dimension value, not an array or multiple values; NULL if the list of chunks is too short

TRIM( {DIMENSION} )

  • Input: a single text dimension field
  • Output: the same field stripped of leading and trailing whitespace (spaces, tabs and newlines)

UPPER( {DIMENSION} )

  • Input: a single text dimension field
  • Output: the same field with the entire string converted to upper case

LOWER( {DIMENSION} )

  • Input: a single text dimension field
  • Output: the same field with the entire string converted to lower case

CONCAT( {FIELD}, [{FIELD} | CONSTANT, ...] )

  • Input: a list of one or more fields and/or user-defined constants, separated by commas
  • Output: a single text dimension field with all values appended together in order
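For example, text functions can be nested in a single calculation. A minimal sketch, assuming a hypothetical mapped field {referrer_url} (the field names and PARSE index are illustrative only):

  • source = LOWER( TRIM( PARSE( {referrer_url}, '/', 3 ) ) )
  • source_label = CONCAT( {source}, ' - ', 'web' )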

Conditional Functions

Conditional functions can be added into transformations in either the field “calculation” or the model “sql_ref”.

IFF( LOGIC_STATEMENT, {FIELD}, [{FIELD}] )

  • Input: a logic statement that evaluates to TRUE or FALSE; the desired output if TRUE; optionally, the desired output if FALSE (default NULL)
  • Output: a single field that matches the outcome of the logic statement; the data type of the output will match the field

IFS( LOGIC_STATEMENT, {FIELD}, LOGIC_STATEMENT, {FIELD}, ..., [{FIELD}] )

  • Input: multiple logic statements, evaluated in order, each followed by the desired field output if it is TRUE; optionally, a last field that is the output if all statements are FALSE (default NULL)
  • Output: a single field that matches the outcome of the first TRUE logic statement; the data type of the output will match the field

(PARENTHESES)

  • Input: add parentheses into any calculation to dictate order of operations
  • Output: N/A

CONTAINS( {FIELD}, MATCH_VALUE, [MATCH_VALUE, ...] )

  • Input: a single field to search; one or more user-defined values to search the string for
  • Output: Boolean TRUE if any of the match values exist in the field, otherwise FALSE
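For example, assuming a hypothetical mapped dimension {campaign_name}, and assuming constant values are accepted where {FIELD} outputs are expected:

  • campaign_type = IFF( CONTAINS( {campaign_name}, 'brand' ), 'Brand', 'Non-Brand' )
  • funnel_stage = IFS( CONTAINS( {campaign_name}, 'prospecting' ), 'Upper', CONTAINS( {campaign_name}, 'retargeting' ), 'Lower', 'Unknown' )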

Logic Statement Operators

Operator        Name

IS NULL         IS NULL
IS NOT NULL     IS NOT NULL
&&              AND
||              OR
NOT             NOT
=               EQUAL TO
!=              NOT EQUAL TO
<>              NOT EQUAL TO
>               GREATER THAN
>=              GREATER THAN OR EQUAL
<               LESS THAN
<=              LESS THAN OR EQUAL

Algebraic Functions

Function    Name

+           ADDITION
-           SUBTRACTION
*           MULTIPLICATION
/           DIVISION
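Algebraic operators can be combined with parentheses in a field calculation. A minimal sketch, assuming hypothetical mapped metrics {spend} and {impressions}:

  • cpm = ( {spend} / {impressions} ) * 1000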

Casting Functions

TO_BOOL( {FIELD}, [NULLIFY_ERRORS] )

  • Input: a single dimension or metric; an optional Boolean flag defining whether to return an error message or convert errors to NULL values
  • Output: a dimension with “True” or “False” values, or NULL for errors/unknowns

TO_DATE( {FIELD}, FORMAT_STRING, [NULLIFY_ERRORS] )

  • Input: a timestamp or datetime; a format string; an optional Boolean flag defining whether to return an error message or convert errors to NULL values
  • Output: a dimension containing a date with the time component (00:00:00) dropped

TO_TEXT( {FIELD}, [NULLIFY_ERRORS] )

  • Input: any field; an optional Boolean flag defining whether to return an error message or convert errors to NULL values
  • Output: a dimension containing the same value as a string

TO_URL( {FIELD}, [NULLIFY_ERRORS] )

  • Input: a text dimension; an optional Boolean flag defining whether to return an error message or convert errors to NULL values
  • Output: a dimension containing a validated URL string that can be rendered accordingly by a UI client

TO_NUMBER( {FIELD}, [PRECISION], [NULLIFY_ERRORS] )

  • Input: a dimension or metric; an optional integer PRECISION for the number of decimal places (default 0); an optional Boolean flag defining whether to return an error message or convert errors to NULL values
  • Output: a metric formatted with the given number of decimal places
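For example, assuming a hypothetical text field {raw_revenue} and timestamp field {created_at} (the format string shown is illustrative, not a documented token set):

  • revenue = TO_NUMBER( {raw_revenue}, 2, true )
  • signup_date = TO_DATE( {created_at}, 'YYYY-MM-DD', true )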

Date / Time Functions

CONVERT_TIMEZONE( {time_taxon}, {from_timezone}, {to_timezone} )

  • Input: a date_time taxon (e.g. 2020-05-01 02:00:00) that is aggregated at the hourly level or below; this field does not need to include timezone info
  • from_timezone: either a timezone taxon that points to data in the DB specifying the time zone, or an input value matching an available, supported timezone
  • to_timezone: an input value matching an available, supported timezone, specifying the time zone to convert to
  • Output: a date/time field shifted to the desired timezone

DATE_DIFF( {interval}, {start_time}, {end_time} )

  • interval: the units the diff should be calculated in, one of 'second', 'minute', 'hour', 'day', 'week', 'month', 'year'
  • start_time: a date field or the input “current_time” (“current_time” is a reserved word that resolves to SQL “now()”)
  • end_time: a date field or the input “current_time”
  • Output: a metric indicating the number of the defined intervals between start_time and end_time

DATE( {time_taxon} )

  • Reduces the granularity of a time field to daily granularity
  • Output: a new date/date_time taxon with all values rolled up to the defined granularity
  • Example: input '2020/05/22 09:54:23' becomes '2020/05/22'

DATE_HOUR( {time_taxon} )

  • Reduces the granularity of a time field to hourly granularity
  • Output: a new date/date_time taxon with all values rolled up to the defined granularity
  • Example: input '2020/05/22 09:54:23' becomes '2020/05/22 09:00:00'

DATE_WEEK( {time_taxon} )

  • Reduces the granularity of a time field to weekly granularity
  • Output: a new date/date_time taxon with all values rolled up to the defined granularity
  • Example: input '2020/05/22 09:54:23' becomes '2020/05/17'

DATE_MONTH( {time_taxon} )

  • Reduces the granularity of a time taxon to monthly granularity
  • Output: a new date/date_time taxon with all values rolled up to the defined granularity
  • Example: input '2020/05/22 09:54:23' becomes '2020/05/01'

YEAR( {time_taxon} )

  • Extracts the integer year from a date/time taxon
  • Output: an integer reflecting the year, e.g. 2020

MONTH_OF_YEAR( {time_taxon} )

  • Truncates the days, hours, minutes and seconds of a timestamp and returns data at a monthly granularity
  • Output: an integer reflecting the month number of the year (1-12)

WEEK_OF_YEAR( {time_taxon} )

  • Combines all timestamps within a given week into one record snapped to the first Sunday of each week
  • Output: an integer reflecting the ISO week of the year (1-53)

DAY_OF_WEEK( {time_taxon} )

  • Output: the named day of the week associated with the date_time value (Mon, Tues, Wed, etc.)

HOUR_OF_DAY( {time_taxon} )

  • Output: the hour of day associated with the date_time value (0-23)
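For example, assuming a hypothetical mapped timestamp {order_time} stored in UTC (the timezone identifiers shown are illustrative values):

  • order_time_local = CONVERT_TIMEZONE( {order_time}, 'UTC', 'America/New_York' )
  • days_since_order = DATE_DIFF( 'day', {order_time}, current_time )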

Aggregation Functions

The following functions are only available as aggregation types and are not allowed in field calculations or model transformations.

Aggregation functions are necessary for all mapped metric fields. For example, if I have a column in my database table for "impressions" and I set the aggregation on that field to be SUM(), I am telling Pano that any time I ask for "impressions", no matter what dimensions I group by, I will get the SUM(impressions) returned in the query results.
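In SQL terms, a request for "impressions" grouped by a hypothetical "campaign" dimension would resolve to something like the following (table and column names are illustrative):

  SELECT campaign, SUM(impressions) AS impressions
  FROM ads_table
  GROUP BY campaign;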

An aggregation type is applied only on the first mapped field pointing directly to the data table; all references to that field inherit the same aggregation type.

Standard Metrics

SUM( {METRIC} )

  • Single dataset SQL: SUM( {a} )

COUNT( {FIELD} )

  • Input: one dimension or metric
  • Output: a count (integer) of the total number of values (including duplicates) that appear for that field within the query parameters
  • Single dataset SQL: COUNT( {a} )

AVERAGE( {METRIC} )

  • The exact metric used for SUM() is also used for COUNT()
  • Single dataset SQL: SUM( {a} ) / COUNT( {a} )

AVERAGE_WEIGHTED( {METRIC}, {WEIGHTING_METRIC} )

  • Single dataset SQL: SUM( {a} * {a’} ) / SUM( {a’} )
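For example, a weighted average CPM could be mapped as AVERAGE_WEIGHTED( {cpm}, {impressions} ) (hypothetical fields), resolving to SQL like:

  SUM( cpm * impressions ) / SUM( impressions )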

Lifetime & Unique

COUNT_DISTINCT( {FIELD}, {FIELD}, ETC )

  • Input: a list of dimensions and/or metrics from a single dataset
  • Output: a count (integer) of the distinct values that appear for the set of fields within the query parameters
  • Single dataset SQL: COUNT( DISTINCT {a}, {b}, etc. )

FIRST_BY( {FIELD}, {TIME_DIMENSION} )

  • All dimensions from the data request for the relevant dataset must be passed into the PARTITION BY () clause of the function
  • Single dataset SQL: FIRST_VALUE( {a} ) OVER (PARTITION BY ( {request_dimensions} ) ORDER BY {time_dimension} ASC)

LAST_BY( {FIELD}, {TIME_DIMENSION} )

  • All dimensions from the data request for the relevant dataset must be passed into the PARTITION BY () clause of the function
  • Single dataset SQL: LAST_VALUE( {a} ) OVER (PARTITION BY ( {request_dimensions} ) ORDER BY {time_dimension} ASC)
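For example, a "first campaign touched" field could be mapped as FIRST_BY( {campaign_name}, {event_time} ) (hypothetical fields). If the data request groups by a user_id dimension, this would resolve to SQL like:

  FIRST_VALUE( campaign_name ) OVER (PARTITION BY user_id ORDER BY event_time ASC)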

Date / Time

MIN( {TIME_DIMENSION} )

  • Description: takes one date/datetime value as input and returns a metric value for the lowest/first date/time within the query parameters
  • Input: any date or time taxon
  • Output: the earliest date in the set, at the same granularity as the input

MAX( {TIME_DIMENSION} )

  • Description: takes one date/datetime value as input and returns a metric value for the highest/last date/time within the query parameters
  • Input: any date or time taxon
  • Output: the latest date in the set, at the same granularity as the input

Special Functions

Special functions are functions created by Pano that do not correspond to any standard SQL functions. These functions have special behaviors that provide useful functionality when querying your data. Special functions are not available in the model level “sql_ref”.

Conditional

  • ? (Optional)

    • Similar to the NULLIF() and ZEROIFNULL() functions in SQL, this operator can be added in front of any field in a transformation and signifies to the Pano system that the calculation should still execute successfully even if that field is not available.

    • A common use case is a cross-dataset blended metric that should also remain available when looking at single-dataset Dataframes.

      • E.g. CPC = (facebook|spend + ?twitter|spend) / (facebook|clicks + ?twitter|clicks)

      • In this example, CPC would be available when building a blended Dataframe with both the facebook and twitter Datasets, and also when building a facebook-only Dataframe, since the twitter metrics are marked optional. Conversely, CPC would not be available in a twitter-only Dataframe, since the formula requires facebook.

Dimensions

  • MERGE( {DIMENSION}, {DIMENSION}, ETC )

    • The Merge function allows for Dimension Blending: the ability to "blend" different dimensions from multiple datasets.

    • The Merge function can take multiple dimension fields as inputs.

    • You cannot include more than one dimension from each dataset.

    • The dimensions included can point directly to data or they can be derived, but they cannot already include merged dimensions. (See the examples after this list.)

  • OVERRIDE( {DIMENSION}, LOOKUP_PATH, FILTER_UNKNOWNS )

    • This function allows you to clean your data by overriding existing values with new values that you would like to use. This is very common when labeling technical API enums with presentable names, or when standardizing values across datasets so they can be blended in reports.

    • Input

      • A single dimension field to use as the input values for the override function

      • A path/link/reference to a predefined lookup dataset

      • FILTER_UNKNOWNS: a Boolean flag to either strip unmatched values from the response or include the data grouped as “Unknown”
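Examples (the dataset, field and lookup-path names below are illustrative, not part of the function definitions):

  • channel = MERGE( {facebook|channel}, {twitter|channel} )

  • channel_clean = OVERRIDE( {channel}, company|channel_lookup, true )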