Commit

Move SQL UDF content
Into the separate page, and adjust the generic content to be suitable
for any UDF language.
mosabua committed Dec 16, 2024
1 parent 71f5135 commit a1fcbce
Showing 3 changed files with 105 additions and 89 deletions.
8 changes: 1 addition & 7 deletions docs/src/main/sphinx/udf.md
@@ -4,17 +4,11 @@ A user-defined function (UDF) is a custom function authored by a user of Trino
in a client application. UDFs are scalar functions that return a single output
value, similar to [built-in functions](/functions).

-[Declare the UDF](udf-declaration) with a `FUNCTION` definition using the
-supported statements. A UDF can be declared and used as an [inline
-UDF](udf-inline) or declared as a [catalog UDF](udf-catalog) and used
-repeatedly.
-
-UDFs are defined and written using the [SQL routine language](/udf/sql).
-
More details are available in the following sections:

```{toctree}
:titlesonly: true
:maxdepth: 1
udf/introduction
udf/function
109 changes: 28 additions & 81 deletions docs/src/main/sphinx/udf/introduction.md
@@ -1,16 +1,25 @@
-# Introduction to user-defined functions
+# Introduction to UDFs

A user-defined function (UDF) is a custom function authored by a user of Trino
in a client application. UDFs are scalar functions that return a single output
value, similar to [built-in functions](/functions).

-[Declare the UDF](udf-declaration) with a `FUNCTION` definition using the
-supported statements. A UDF can be declared and used as an [inline
-UDF](udf-inline) or declared as a [catalog UDF](udf-catalog) and used
-repeatedly.
-
UDFs are defined and written using the [SQL routine language](/udf/sql).

+:::{note}
+User-defined functions can alternatively be written in Java and deployed as a
+plugin. Details are available in the [developer guide](/develop/functions).
+:::
+
+(udf-declaration)=
+## UDF declaration
+
+Declare the UDF with a [](/udf/function) keyword using the supported statements
+for [](/udf/sql).
+
+A UDF can be declared and used as an [inline UDF](udf-inline) or declared as a
+[catalog UDF](udf-catalog) and used repeatedly.
+
(udf-inline)=
## Inline user-defined functions

@@ -20,10 +29,11 @@ query:

```sql
WITH
-FUNCTION abc(x integer)
+FUNCTION doubleup(x integer)
RETURNS integer
RETURN x * 2
-SELECT abc(21);
+SELECT doubleup(21);
+-- 42
```

Inline UDF names must follow SQL identifier naming conventions, and cannot
@@ -39,13 +49,14 @@ invocation.

```sql
WITH
-FUNCTION abc(x integer)
+FUNCTION doubleup(x integer)
RETURNS integer
RETURN x * 2,
-FUNCTION xyz(x integer)
+FUNCTION doubleupplusone(x integer)
RETURNS integer
-RETURN abc(x) + 1
-SELECT xyz(21);
+RETURN doubleup(x) + 1
+SELECT doubleupplusone(21);
+-- 43
```

Note that inline UDFs can mask and override the meaning of a built-in function:
@@ -94,75 +105,11 @@ Use the [](/connector/memory) in a catalog for simple storing and
testing of your UDFs.
:::

-(udf-declaration)=
-## UDF declaration
-
-Refer to the documentation for the [](/udf/function) keyword for more
-details about declaring the UDF overall. The UDF body is composed with
-statements from the following list:
-
-* [](/udf/sql/begin)
-* [](/udf/sql/case)
-* [](/udf/sql/declare)
-* [](/udf/sql/if)
-* [](/udf/sql/iterate)
-* [](/udf/sql/leave)
-* [](/udf/sql/loop)
-* [](/udf/sql/repeat)
-* [](/udf/sql/return)
-* [](/udf/sql/set)
-* [](/udf/sql/while)
-
-Statements can also use [built-in functions and operators](/functions) as well
-as other UDFs, although recursion is not supported for UDFs.
-
-Find simple examples in each statement documentation, and refer to the
-[](/udf/sql/examples) for more complex use cases that combine multiple
-statements.
-
-:::{note}
-User-defined functions can alternatively be written in Java and deployed as a
-plugin. Details are available in the [developer guide](/develop/functions).
-:::
-
-(udf-sql-label)=
-## Labels
-
-SQL UDFs can contain labels as markers for a specific block in the declaration
-before the following keywords:
-
-* `CASE`
-* `IF`
-* `LOOP`
-* `REPEAT`
-* `WHILE`
-
-The label is used to name the block to continue processing with the `ITERATE`
-statement or exit the block with the `LEAVE` statement. This flow control is
-supported for nested blocks, allowing to continue or exit an outer block, not
-just the innermost block. For example, the following snippet uses the label
-`top` to name the complete block from `REPEAT` to `END REPEAT`:
-
-```sql
-top: REPEAT
-SET a = a + 1;
-IF a <= 3 THEN
-ITERATE top;
-END IF;
-SET b = b + 1;
-UNTIL a >= 10
-END REPEAT;
-```
-
-Labels can be used with the `ITERATE` and `LEAVE` statements to continue
-processing the block or leave the block. This flow control is also supported for
-nested blocks and labels.
-
## Recommendations

Processing UDFs can potentially be resource intensive on the cluster in
terms of memory and processing. Take the following considerations into account
-when writing and running SQL UDFs:
+when writing and running UDFs:

* Some checks for the runtime behavior of UDF are in place. For example,
UDFs that take longer to process than a hardcoded threshold are
@@ -181,14 +128,14 @@ when writing and running SQL UDFs:

## Limitations

-The following limitations apply to SQL UDFs.
+The following limitations apply to UDFs.

-* SQL UDFs must be declared before they are referenced.
+* UDFs must be declared before they are referenced.
* Recursion cannot be declared or processed.
* Mutual recursion can not be declared or processed.
-* Queries cannot be processed in a SQL UDF.
+* Queries cannot be processed in a UDF.

-Specifically this means that SQL UDFs can not use `SELECT` queries to retrieve
+Specifically this means that UDFs can not use `SELECT` queries to retrieve
data or any other queries to process data within the UDF. Instead queries can
use UDFs to process data. UDFs only work on data provided as input values and
only provide output data from the `RETURN` statement.
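
The last limitation can be illustrated with a minimal sketch that is not part of this commit. It reuses the `doubleup` function from the inline examples; the `VALUES` relation, its alias `t`, and the column name `n` are assumptions chosen only for illustration. The query supplies the rows, and the UDF only sees each value passed in as `x`:

```sql
-- Inline UDF: it receives data only through its argument x
-- and returns data only through RETURN.
WITH
  FUNCTION doubleup(x integer)
  RETURNS integer
  RETURN x * 2
-- The surrounding query retrieves the data and calls the UDF per row.
SELECT n, doubleup(n) AS doubled
FROM (VALUES 1, 2, 3) AS t(n);
-- rows (1, 2), (2, 4), (3, 6), in no guaranteed order
```

The reverse arrangement, a `SELECT` inside the function body, is what the limitation rules out.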
77 changes: 76 additions & 1 deletion docs/src/main/sphinx/udf/sql.md
@@ -1,7 +1,30 @@
# SQL user-defined functions
+
+A SQL user-defined function, also known as SQL routine, is a [user-defined
+function](/udf) that uses the SQL routine language and statements for the
+definition of the function.
+
+## SQL UDF declaration
+
+Declare a SQL UDF using the [](/udf/function) keyword and the following
+statements can be used in addition to [built-in functions and
+operators](/functions) and other UDFs:
+
+* [](/udf/sql/begin)
+* [](/udf/sql/case)
+* [](/udf/sql/declare)
+* [](/udf/sql/if)
+* [](/udf/sql/iterate)
+* [](/udf/sql/leave)
+* [](/udf/sql/loop)
+* [](/udf/sql/repeat)
+* [](/udf/sql/return)
+* [](/udf/sql/set)
+* [](/udf/sql/while)
+
```{toctree}
:titlesonly: true
:hidden:
sql/examples
sql/begin
@@ -15,4 +38,56 @@ sql/repeat
sql/return
sql/set
sql/while
-```
+```
+
+A minimal example declares the UDF `doubleup` that returns the input integer
+value `x` multiplied by two. The example shows declaration as [](udf-inline) and
+invocation with the value 21 to yield the result 42:
+
+```sql
+WITH
+FUNCTION doubleup(x integer)
+RETURNS integer
+RETURN x * 2
+SELECT doubleup(21);
+-- 42
+```
+
+The same UDF can also be declared as [](udf-catalog).
+
+Find simple examples in each statement documentation, and refer to the
+[](/udf/sql/examples) for more complex use cases that combine multiple
+statements.
+
+(udf-sql-label)=
+## Labels
+
+SQL UDFs can contain labels as markers for a specific block in the declaration
+before the following keywords:
+
+* `CASE`
+* `IF`
+* `LOOP`
+* `REPEAT`
+* `WHILE`
+
+The label is used to name the block to continue processing with the `ITERATE`
+statement or exit the block with the `LEAVE` statement. This flow control is
+supported for nested blocks, allowing to continue or exit an outer block, not
+just the innermost block. For example, the following snippet uses the label
+`top` to name the complete block from `REPEAT` to `END REPEAT`:
+
+```sql
+top: REPEAT
+SET a = a + 1;
+IF a <= 3 THEN
+ITERATE top;
+END IF;
+SET b = b + 1;
+UNTIL a >= 10
+END REPEAT;
+```
+
+Labels can be used with the `ITERATE` and `LEAVE` statements to continue
+processing the block or leave the block. This flow control is also supported for
+nested blocks and labels.
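
The labeled `REPEAT` snippet is a fragment, so the following sketch, which is not part of this commit, wraps it in a complete inline UDF. The function name `count_after_three` and the two `DECLARE` statements are hypothetical additions for illustration; only the loop body comes from the documentation text.

```sql
WITH
  FUNCTION count_after_three()
  RETURNS bigint
  BEGIN
    -- Hypothetical counters; the original snippet leaves them undeclared.
    DECLARE a bigint DEFAULT 0;
    DECLARE b bigint DEFAULT 0;
    top: REPEAT
      SET a = a + 1;
      IF a <= 3 THEN
        -- Skip the rest of the block and continue with the next pass.
        ITERATE top;
      END IF;
      -- Reached only when a is greater than 3.
      SET b = b + 1;
      UNTIL a >= 10
    END REPEAT;
    RETURN b;
  END
SELECT count_after_three();
```

Under these assumptions, `b` is incremented only for `a` from 4 through 10, so the call should return 7.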
