Expression processor
The Expression processor is a versatile processor that takes an expression as an argument and outputs its result in a field.
The expression can be composed of:
Identifiers
Literals (numbers, textual literals, Booleans and keywords)
Operators
Functions
Remember that to create an expression, you must always start with an equal sign (=). If you don't, the processor will not evaluate the expression. The starting equal sign (=) will be omitted through all the following examples.
Setting the processor
To set the parameters of the Expression processor, follow the indications from the table below.
Label | Description | Mandatory |
Expression | The expression pattern | Yes |
Output field | Field where the expression will be applied. It can be an existing field chosen from the list, or a new field created for that purpose (in that case, choose "Input text" from the list and write the name of the new field in the text area below). | Yes |
Building an expression
An expression is an instruction for the processor to perform an operation and then to return the value in a new or existing column of the dataset.
These expressions look like what is called "formulas" in a spreadsheet software. The main difference is that instead of referencing cells (e.g: A1 + B2
), the processor can perform operations with the values of given columns (e.g: column_1 + column_2
).
Like in a formula, expressions can be any combination of various elements. The Expression processor supports the following elements:
Identifiers (e.g: a field name like
column_1
or a forced field name like$column_1
),Literals:
- Numbers (e.g: 2
or 3.6
)
- Textual literals (e.g "hello"
or 'world'
)
- Booleans (e.g: 'true'
or 'false'
)
- Keywords (e.g: a mathematical constant like pi
and e
)
Operators (e.g:
+
,OR
,not
,<=
)Functions (e.g:
now()
,sin(number)
,startswith("hello", 'he')
)
Identifiers
The main feature of the Expression processor is the ability to perform operations on a record's columns. The technical identifier (or field name) of a column can be used in any expression to access the value of this field for the current record. This technical identifier can be found in the field options, as the Name
of this field.
Examples:
column_1
to access a field namedcolumn_1
(and maybe labeled "Column 1")name_en
to access a field namedname_en
(and maybe labeled "Name (EN)")
In some cases, the field name can be ambiguous—for example if it is a number, if it starts with a number, or if it is a reserved keyword like "pi" (the mathematical constant PI) or "e" (the mathematical constant Euler's number).To force the Expression processor to evaluate an identifier as a field name, it is possible to prefix any identifier with the dollar sign ($). The dollar sign can be used for any field name, but it is only mandatory for ambiguous field names.
Examples:
$column_1
to access a field namedcolumn_1
(and maybe labeled "Column 1")$name_en
to access a field namedname_en
(and maybe labeled "Name (EN)")$20_to_25_yo
to access a field named20_to_25_yo
(and maybe labeled "20 to 25 years old")$33
to access a field named33
$pi
to access a field namedpi
In all the following examples, any number or textual literal can be replaced by a field name holding values of the same type. The Expression processor will extract the value for the specified column and perform the required operation with it.
Several fields (or even the same field several times) can be used at the same time in an expression.
Literals
Literals like numbers, textual literals (single or double quoted), Booleans and keywords can be used in any expression.
Examples:
3
2.5
"Hello"
or'Hello'
'A bigger sentence'
'true'
or'false'
pi
,PI
orPi
e
orE
Operators
Operators are symbols that behave generally like functions but are used with a more natural syntax.
The Expression processor supports three kinds of operators, depending on the number of parameters around them:
Unary operators can be used as prefixes or suffixes to alter the value of one expression
Binary operators can be arithmetic operators to perform a calculus between two expressions, or Boolean operators to compare the result of two expressions
The ternary operator, to convert a conditional expression to either one of two possible results
Operator precedence works in the following order: factorial, exponential, sign, euclidian division, function evaluation, multiplication/division, addition/substraction, concatenation, not, comparison, and, or, ternary operator, ternary operator without else.
Unary operators
Operator | Description | Example |
| Prefix that negates the following value |
|
| Boolean operator that inverts the following condition |
|
| Suffix that computes the factorial of an expression |
|
Binary operators
Operator | Description | Example |
| Arithmetic operators: add, subtract, multiply, divide, modulo, euclidian division, power |
|
| Boolean operators: and, or |
|
| Comparison operators: greater than, lower than, greater or equal to, lower or equal to, equal to, different than return a Boolean |
|
| String concatenation operator |
|
Ternary operator
The expression [condition] ? [result if true] : [result if false]
is called the ternary operator, and allows to return different results depending on a condition.
A ternary expression can be read as "if [condition] then [result if true] else [result if false]".
Examples:
'true' ? 'hello' : 'goodbye'
returns'hello'
'false' ? 'hello' : 'goodbye'
returns'goodbye'
4 > 3 ? '4 is bigger' : '3 is bigger'
returns'4 is bigger'
10 <= 9 ? '9 is bigger' : '10 is bigger'
returns'10 is bigger'
The last part of the ternary operator is optional, which means that the following expressions are valid:
'true' ? 'hello'
returns'hello'
'false' ? 'hello'
returns an empty result
Some operations can fail if they involve a field which type was wrongly guessed by the platform at the creation of the dataset, even if the field type was modified afterward. In that case, contact Opendatasoft support team.
Functions
Functions are used to perform more advanced operations in an expression, such as:
Text handling
Mathematical functions
Date and time handling
Text handling
Boolean functions
Function | Description | Example |
| True if text only contains letters |
|
| True if text only contains numbers |
|
| True if text only contains letters or numbers |
|
| True if text is a valid decimal number |
|
| True if text is a single digit |
|
| True if text is lowercase |
|
| True if text is uppercase |
|
| True if text is empty |
|
| True if text (1st argument) contains, starts with or ends with text (2nd argument) |
|
Processing functions
Function | Description | Example |
| Return the length of the text |
|
| Convert text to lowercase, to uppercase |
|
| Capitalize the first letter of the text, the first letter of each word |
|
| Left, right and center justify a text (1st argument), until it reaches [numeric] characters with another text (3rd argument) |
|
| Convert a text to its ascii representation |
|
| Extract a substring of text, starting at index indicated by 2nd argument and of a length indicated by 3rd argument (optional) |
|
| Return the number of times the 2nd argument is found in the 1st argument |
|
| Extract a part of text (1st argument), from the left, right or the middle; the size of the extract is given by the 2nd argument for |
|
Mathematical functions
Function | Description | Example |
| Absolute value |
|
| Square root |
|
| Floor and ceiling functions |
|
| Max and min functions |
|
| Return the nearest integer |
|
| Random number generator (between 0 and [numeric]) |
|
| Power function |
|
| Exponential, logarithm and base 10 logarithm functions |
|
| Convert an angle from degrees to radians |
|
| Convert an angle from radians to degrees |
|
| Cosine, hyperbolic cosine, sine, hyperbolic sine, tangent, hyperbolic tangent (in radians) |
|
| Inverse cosine, inverse cosine hyberbolical, inverse sine, inverse sine hyperbolical, inverse tangent, inverse tangent hyperbolical (in radians) |
|
Date and time handling
Function | Description | Example |
| Extract the year, quarter, month, week, day, day of week, hours, minutes, seconds from a date/datetime |
|
|
|
|
| Convert a timestamp to a datetime |
|
| Convert a quarter (e.g: "2014Q2", "2019q1") to a date range |
|
| Count the number of units between the two datetime expressions. |
|