---
layout: page
title: bandicoot - specification
spec: true
---

Table of contents:

- Language Specification
- Transactions
- API

# Language Specification

## Keywords

Here are the keywords which are currently in use:

    extend   fn    int     join    long    minus
    project  real  rename  return  select  string
    summary  time  type    union   var     void

The time keyword currently has no meaning; it is reserved for a future release.

## Program Structure

A program is defined in a single source file. The file is evaluated from top to bottom in one pass (similar to the C language). The top-level elements of the program can be of the following types:

- relational type declarations
- relational variable declarations
- function declarations

By convention, Bandicoot source files use the .b extension.

## Primitive Types

Primitive types are scalar types and are used for attributes within relations, as well as input parameters for functions. There are four types available:

| Type   | Size         | Description               |
|--------|--------------|---------------------------|
| int    | 32-bit       | signed integer            |
| long   | 64-bit       | signed integer            |
| real   | 64-bit       | IEEE 754 double precision |
| string | 0-1024 bytes | UTF-8 encoded string      |

The primitive types are referenced within this specification as PrimitiveType.
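The size limits in the table can be made concrete with a small range check. The sketch below is written in Python purely as a modelling aid; it is not Bandicoot code, and the function name `fits` is hypothetical. It only encodes the limits stated in the table.

```python
# Limits taken from the primitive type table; `fits` is a hypothetical helper.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1
STRING_MAX_BYTES = 1024


def fits(type_name, value):
    """Return True if `value` is within the range of the named primitive type."""
    if type_name == "int":
        return isinstance(value, int) and INT_MIN <= value <= INT_MAX
    if type_name == "long":
        return isinstance(value, int) and LONG_MIN <= value <= LONG_MAX
    if type_name == "real":
        # A Python float is an IEEE 754 double.
        return isinstance(value, float)
    if type_name == "string":
        # The limit is expressed in bytes, so a multi-byte UTF-8 character
        # consumes more than one unit of the budget.
        return isinstance(value, str) and len(value.encode("utf-8")) <= STRING_MAX_BYTES
    raise ValueError("unknown primitive type: " + type_name)
```

Note that the string limit applies to the encoded length: a string of 513 two-byte characters is too long even though it has fewer than 1024 characters.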

Bandicoot is a strongly-typed language and converting a primitive expression of a given type into another type must be explicit. The current version of Bandicoot supports only conversion from one numeric type to another. There is no support for conversion between strings and numbers. The following syntax forms are supported:

    (int  PrimitiveExpr)
    (real PrimitiveExpr)
    (long PrimitiveExpr)

## Identifiers

An identifier matches the regular expression [_a-zA-Z0-9]+ and is at most 32 characters long. The rest of this specification refers to identifiers by the following names:

- TypeName
- AttrName
- VarName
- ParamName
- FuncName
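The identifier rule can be checked mechanically. The following Python sketch applies the regular expression and the length limit exactly as stated; the helper name `is_identifier` is hypothetical. It takes the rule literally, so a name starting with a digit is accepted, and it does not check for clashes with keywords because the specification does not say how those are handled.

```python
import re

# Pattern and length limit as given in the Identifiers section.
IDENTIFIER = re.compile(r"[_a-zA-Z0-9]+")
MAX_LENGTH = 32


def is_identifier(name):
    """Return True if `name` satisfies the identifier rule as written."""
    return len(name) <= MAX_LENGTH and IDENTIFIER.fullmatch(name) is not None
```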

## Relational Types

There are two ways to declare a relational type: named and inline. Named declarations give an identifier to some particular type so that it can be referenced in the code later. Inline (or anonymous) declarations are useful when the type is used only once (e.g. as an input or output function parameter).

A named type is declared as follows:

    type TypeName
    {
        AttrName PrimitiveType [,]
        [more attributes]
    }

An inline type is declared as follows:

    {
        AttrName PrimitiveType [,]
        [more attributes]
    }

The relational types (both inline and named) are referenced within this specification as RelType.

## Relational Variables

Relational variables are used for keeping the program state. The system provides two types of variables:

- global variables
- local (temporary) variables

A global variable named VarName is declared as follows:

    var VarName RelType ;

The relational variables are referenced within this specification as RelVar.

## Functions

Functions are identified by names which must be unique across the whole program source file. A function can make complex state transformations on top of the global variables (see the Transactions section).

    fn FuncName ( FuncArgs ) FuncReturn
    {
        FuncBody
    }

FuncArgs can contain at most one relational argument and several arguments of a primitive type, all separated by commas. Each argument has the following structure:

    ParamName RelType | PrimitiveType

FuncReturn defines the result type of a function. It is either a relational type or, if the function returns no result, the keyword void:

    RelType | void

The function body (FuncBody) is a list of statements evaluated from top to bottom. Each statement ends with a semicolon (";"). Statements can be of three types:

- global variable assignment

        VarName = RelExpr ;

- temporary variable declaration and assignment

        var VarName = RelExpr ;

- return statement (only if a function declares its output type)

        return RelExpr ;

A function cannot call another function. Also, only one assignment per global relational variable is possible within a function body. After the assignment, the global variable can no longer be accessed within the same function. This is a temporary limitation, and you can work around it with the help of temporary variables.

## Relational Operators and Expressions

Bandicoot implements eight relational operators for manipulating data. Some of the operators are binary (take two relations as input) and some are unary (take one relation as input). Apart from the relational inputs, an operator usually takes additional arguments specific to that operator. Every operator returns a new relation and does not modify its inputs. The language provides these operators as functions with arguments:

    OperatorName (arg1) (arg2) ... (argN)

The parentheses around an argument are mandatory only if that argument is itself an operator with at least one argument.

Every relational variable (global or local) is an operator as well and returns the value of the variable. The operators are the main building blocks in the language. Complex relational expressions (RelExpr) can be created by nesting the relational operators to compute the desired results.

### Rename

    rename ToAttrName = FromAttrName [,] [more attributes] RelExpr

This operator creates a new relation with the specified attributes renamed; the relational body (the tuples) does not change.
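The effect of rename can be modelled in a few lines of Python. This is only a semantic sketch, not Bandicoot code: a relation is represented as a list of dicts (one dict per tuple, keyed by attribute name), and the function name and argument layout are assumptions made for illustration.

```python
def rename(mapping, relation):
    """Model of rename. `mapping` is {ToAttrName: FromAttrName};
    `relation` is a list of dicts. Returns a new list, input untouched."""
    # Invert the mapping so each old name can be looked up directly.
    old_to_new = {old: new for new, old in mapping.items()}
    return [
        {old_to_new.get(name, name): value for name, value in t.items()}
        for t in relation
    ]
```

For example, `rename({"name": "title"}, [{"title": "A", "pages": 10}])` yields a relation whose single tuple has the attributes `name` and `pages` with the same values.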

### Project

    project AttrName [,] [more attributes] RelExpr

The result contains only the attributes defined as the first argument. It can have fewer tuples than the input because duplicate tuples are removed.
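The duplicate removal is the part of project that is easy to overlook. The Python sketch below models it under the assumption that a relation is a list of dicts; the function name is hypothetical and the code is a semantic model, not Bandicoot syntax.

```python
def project(attrs, relation):
    """Model of project: keep only `attrs`, then drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[name] for name in attrs)
        if key not in seen:  # two inputs may collapse to one output tuple
            seen.add(key)
            result.append({name: t[name] for name in attrs})
    return result
```

Projecting two books with the same price onto `price` alone leaves a single tuple.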

### Extend

    extend AttrName = PrimitiveExpr [,] [more attributes] RelExpr

The operator adds the attributes defined as the first argument to each tuple of the input relation. The values are computed by primitive expressions.
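As a semantic sketch in Python (not Bandicoot code), extend can be modelled by evaluating one expression per new attribute against each tuple. Representing a primitive expression as a Python callable is an assumption of this model, as is the function name.

```python
def extend(new_attrs, relation):
    """Model of extend. `new_attrs` maps each new attribute name to a
    callable that computes its value from a tuple (a dict)."""
    result = []
    for t in relation:
        extended = dict(t)  # copy, so the input relation is not modified
        for name, expr in new_attrs.items():
            extended[name] = expr(t)
        result.append(extended)
    return result
```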

### Select

    select BooleanExpr RelExpr

The result contains only those tuples of the input relation which match the boolean expression defined as the first argument.
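In a Python model where a relation is a list of dicts, select is a plain filter. The boolean expression is represented here as a callable, which is an assumption of the sketch; the syntax of BooleanExpr in Bandicoot itself is not shown.

```python
def select(predicate, relation):
    """Model of select: keep the tuples for which `predicate` is true."""
    return [t for t in relation if predicate(t)]
```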

### Union

    union RelExpr RelExpr

or

    RelExpr + RelExpr

The union operator creates a new relation containing the tuples of both input relations, with duplicate tuples removed. Both inputs need to have the same attributes.
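A Python model of union, again treating a relation as a list of dicts, might look as follows. The function name is hypothetical, and raising `ValueError` for mismatched attributes is a choice of this sketch; the specification does not say how Bandicoot reports the mismatch.

```python
def union(left, right):
    """Model of union: all tuples of both inputs, duplicates removed."""
    if left and right and set(left[0]) != set(right[0]):
        raise ValueError("both inputs need to have the same attributes")
    seen = set()
    result = []
    for t in left + right:
        key = frozenset(t.items())  # order-independent identity of a tuple
        if key not in seen:
            seen.add(key)
            result.append(t)
    return result
```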

### Minus (Semidifference)

    minus RelExpr RelExpr

or

    RelExpr - RelExpr

The minus operator removes from the first input the tuples which match tuples in the second input. The matching logic is an equality on the common attributes.
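Because matching uses only the common attributes, the two inputs of minus do not need identical headings. The Python sketch below models this with relations as lists of dicts; the function name is hypothetical. Deriving the common attributes from the first tuple of each input is a shortcut of the model. When there are no common attributes, the sketch treats every tuple as matching, by analogy with the join rule; the specification does not spell this case out for minus.

```python
def minus(left, right):
    """Model of minus (semidifference): keep the tuples of `left` that
    match no tuple of `right` on the attributes the two inputs share."""
    if not left or not right:
        return list(left)  # nothing to remove, or nothing to remove from
    common = sorted(set(left[0]) & set(right[0]))
    right_keys = {tuple(t[name] for name in common) for t in right}
    return [t for t in left if tuple(t[name] for name in common) not in right_keys]
```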

### Natural Join

    join RelExpr RelExpr

or

    RelExpr * RelExpr

This operator creates a result where the tuples are combinations of matching tuples from both input relations. The matching logic is an equality on the common attributes. If there are no common attributes, the result is a Cartesian join (i.e. every tuple from the first input matches every tuple in the second input). All the attributes from the input relations are present in the result.
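Both cases described above, matching on common attributes and the Cartesian fallback, come out of the same nested loop. The following Python sketch is a semantic model only (relations as lists of dicts, hypothetical function name) and makes no claim about how Bandicoot evaluates joins internally.

```python
def join(left, right):
    """Model of natural join on the attributes shared by both inputs."""
    if not left or not right:
        return []
    common = sorted(set(left[0]) & set(right[0]))
    result = []
    for l in left:
        for r in right:
            # With no common attributes, all() over an empty sequence is
            # True, so every pair is combined (the Cartesian case).
            if all(l[name] == r[name] for name in common):
                result.append({**l, **r})
    return result
```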

### Summary

#### Unary version

    summary AttrName = SumFunc [,] [more attributes] RelExpr

#### Binary version

    summary AttrName = SumFunc [,] [more attributes] RelExpr RelExpr

Both the unary and the binary version of the summary operator produce tuples containing summary data grouped according to the specified attributes. In the unary version, the grouping is done by a virtual relation with zero attributes, so the result contains at most one tuple. The binary version creates the groups according to all the attributes of the second relation.

The result type is expressed as an extension of the empty relation (unary summary) or of the rightmost relation (binary summary). Each added attribute is computed by a summary function (SumFunc). The currently supported functions are:

- add - sum up the values of an attribute

        (add AttrName DefVal)

- avg - average of values of an attribute

        (avg AttrName DefVal)

- cnt - count the number of tuples

        (cnt)

- max - maximum value of an attribute

        (max AttrName DefVal)

- min - minimum value of an attribute

        (min AttrName DefVal)

DefVal is a constant expression whose type should match the type of both the result and the summarized attribute. The exception is the avg function, where the default value and the result are always real numbers. DefVal is used when the RelExpr body is empty. In the binary summary this also happens when a tuple in the right RelExpr has no matching tuple in the left RelExpr.
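The interplay of grouping and default values is easier to see in a small model. The Python sketch below represents relations as lists of dicts and each summary function as a `(name, attribute, default)` triple; these representations and the function names are assumptions of the sketch, not Bandicoot syntax. It further assumes that every attribute of the grouping relation also exists in the summarized relation, and it models the unary version as grouping by a single empty tuple, so it always returns exactly one tuple.

```python
def aggregate(func, attr, default, tuples):
    """Apply one summary function to the tuples of a single group."""
    if func == "cnt":
        return len(tuples)
    if not tuples:
        return default  # DefVal: the group has no tuples to summarize
    values = [t[attr] for t in tuples]
    if func == "add":
        return sum(values)
    if func == "avg":
        return sum(values) / len(values)  # always a real number
    if func == "max":
        return max(values)
    if func == "min":
        return min(values)
    raise ValueError("unknown summary function: " + func)


def summary(aggregates, relation, groups=None):
    """Model of summary. `aggregates` maps a result attribute to a
    (func, attr, default) triple. With `groups` omitted this is the unary
    version; otherwise `groups` plays the role of the second relation."""
    if groups is None:
        groups = [{}]  # stand-in for the virtual zero-attribute relation
    result = []
    for g in groups:
        matching = [t for t in relation if all(t[a] == v for a, v in g.items())]
        row = dict(g)  # the result extends the grouping tuple
        for name, (func, attr, default) in aggregates.items():
            row[name] = aggregate(func, attr, default, matching)
        result.append(row)
    return result
```

In the binary case, a grouping tuple with no matching tuples still appears in the result, carrying the default values (and a count of zero).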

# Transactions

Each invocation of a function implicitly creates a transaction. All the statements within a function are part of the same transaction. There are no explicit keywords to commit or roll back a transaction. If there is an error, the rollback is performed automatically and an error code is returned to the client.

Two transactions cannot modify the same global variable at the same time. Therefore, two functions which modify the same variable are serialized and executed one after the other. Read-only functions are executed in parallel with other read/write functions.

The isolation level is always serializable: if a function reads the same variable several times, it always gets the same data, even if a different function modifies the variable at the same time.
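The observable behaviour described above (automatic rollback on error, writers to the same variable running one at a time, and repeated reads returning the same data) can be illustrated with a toy model. The Python sketch below is one possible way to obtain that behaviour, using a snapshot per transaction and a lock per variable. It says nothing about how Bandicoot is actually implemented, it does not by itself guarantee full serializability, and the class and method names are hypothetical. Values are stored as immutable tuples so that a shallow copy is a safe snapshot.

```python
import threading


class Store:
    """Toy model of global variables with one transaction per function call."""

    def __init__(self, variables):
        self._committed = dict(variables)  # name -> immutable value
        self._write_locks = {name: threading.Lock() for name in variables}
        self._guard = threading.Lock()     # protects _committed itself

    def run(self, func, writes=()):
        """Run func(snapshot) as one transaction. `writes` names the
        global variables the function assigns to."""
        held = [self._write_locks[name] for name in sorted(writes)]
        for lock in held:  # writers to the same variable are serialized
            lock.acquire()
        try:
            with self._guard:
                snapshot = dict(self._committed)  # stable view for all reads
            result = func(snapshot)               # an exception skips the commit
            with self._guard:
                for name in writes:
                    self._committed[name] = snapshot[name]
            return result
        finally:
            for lock in held:
                lock.release()
```

A function that raises leaves the committed state untouched, which corresponds to the automatic rollback; a read-only function takes no write locks and therefore never waits for a writer.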

# API

The Bandicoot API is based on the HTTP/1.1 protocol. The interface exposes all the functions defined in a program source file through http://server:port/FuncName URLs. The HTTP POST method must be used to invoke a function with an input parameter. Otherwise, HTTP GET is required.

Both input and output parameters are exchanged in "comma separated values" format. The tuples are delimited with the \n end-of-line character. The first line is a relational head definition in the following format:

    AttrName PrimitiveType [,][more attributes]

A comma or an end-of-line character can be escaped with the \ character. Bandicoot then does not interpret that character as a delimiter and treats it as part of your data.
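A client for this interface needs little more than a CSV encoder and an HTTP call. The Python sketch below uses only the standard library. The helper names, the trailing newline after the last tuple, and the absence of an explicit Content-Type header are assumptions, since the specification does not cover them; the server address and function name passed to `call` are placeholders.

```python
import urllib.request


def escape(value):
    """Escape the two delimiters (comma and end-of-line) with a backslash."""
    return str(value).replace(",", "\\,").replace("\n", "\\\n")


def encode_relation(head, tuples):
    """Build a request body. `head` is a list of (AttrName, PrimitiveType)
    pairs; `tuples` is a list of dicts keyed by attribute name."""
    lines = [",".join(name + " " + ptype for name, ptype in head)]
    for t in tuples:
        lines.append(",".join(escape(t[name]) for name, _ in head))
    # Assumption: the last tuple is also terminated by an end-of-line.
    return "\n".join(lines) + "\n"


def call(base_url, func_name, body=None):
    """Invoke a function: POST when there is an input body, GET otherwise."""
    data = body.encode("utf-8") if body is not None else None
    method = "POST" if data is not None else "GET"
    request = urllib.request.Request(base_url + "/" + func_name, data=data, method=method)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call such as `call("http://server:port", "FuncName", encode_relation(head, tuples))` would then send the relation to the named function and return the raw response text.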
