Develop18 #79

Open
wants to merge 198 commits into
base: develop
Commits
9a28c8a
Initial dbtoaster-related commit
amirsh Jul 14, 2017
f02701f
Calc AST started
amirsh Jul 14, 2017
ba042ad
add include parser and ast node
praal Jul 17, 2017
4748b94
stream WIP
praal Jul 18, 2017
7700891
create stream parsing is done
praal Jul 18, 2017
5153504
ignoring comments
praal Jul 18, 2017
17cc2e9
adding some features (types, optional from, where not)
praal Jul 19, 2017
36502ae
natural join is added
praal Jul 19, 2017
8617fa8
Add CalcAST.
Sajerner Jul 19, 2017
59ed076
all and some are done
praal Jul 19, 2017
fa00ca5
gitignore updated
Sajerner Jul 19, 2017
1cc2845
WIP
praal Jul 19, 2017
2a90bd2
sql parsing is completed
praal Jul 20, 2017
a0a85b5
Merge pull request #1 from epfldata/calc-ast
praal Jul 21, 2017
7f315d9
initializing calc parser (create table and stream parser with error)
praal Jul 21, 2017
4ba8441
WIP
praal Jul 21, 2017
30ea92c
WIP
praal Jul 24, 2017
d0d7259
calc parser is almost completed
praal Jul 24, 2017
78c602c
keywords added to calc parser and all tests (simple and tpch) passed …
praal Jul 25, 2017
766ff4f
calc parser parses a simple query. (without relation)
praal Jul 25, 2017
7d13551
fixing valueExpression operations and some sql queries
praal Jul 26, 2017
c47e5f4
sql simple queries are converted to calc queries.
praal Jul 26, 2017
db95a76
all tests (tpch and simple) passed
praal Jul 26, 2017
92213d2
calc queries
praal Jul 26, 2017
e552228
Translate the [FROM] clause of a select query to its corresponding Ca…
Sajerner Jul 26, 2017
a1f04fb
calc printer initialized.
praal Jul 26, 2017
17a26f5
Merge branch 'sql-parser' of https://github.com/epfldata/dblab-toaste…
Sajerner Jul 26, 2017
1edaf8e
calc printer is completed.
praal Jul 27, 2017
d8009b0
fixing
praal Jul 27, 2017
f89fe47
some problems fixed.
praal Jul 27, 2017
f0ac077
Translate the [From] and [WHERE] clauses of a simple SQL query to its…
Sajerner Jul 28, 2017
6d5817c
initial calc optimizer
praal Aug 3, 2017
7676a86
nomalize and rewrite are done (without testing)
praal Aug 3, 2017
75408dd
a simple test passed
praal Aug 4, 2017
e39d9f5
Changes to be committed:
Sajerner Aug 6, 2017
c01e5fb
schema of expression and some fixings.
praal Aug 9, 2017
bfcb6a2
nesting rewrite is almost done.
praal Aug 18, 2017
5d080fc
testing nesting rewrites
praal Aug 18, 2017
83c678a
Changes to be committed:
Sajerner Aug 18, 2017
0c719b3
DDLInterpreter and Schema support streams, skeleton for SQLNamer
amirsh Aug 19, 2017
092562b
WIP
Sajerner Aug 19, 2017
2f6e4e0
Merge branch 'calc-ast' of https://github.com/epfldata/dblab-toaster …
Sajerner Aug 19, 2017
cdafa4d
SQLParser uses a consistent way to produce select all queries.
amirsh Aug 19, 2017
76533f6
SQLNamer works for select all with one relation.
amirsh Aug 19, 2017
b9cf9a3
SQLNamer for select all for only a subset of relations.
amirsh Aug 19, 2017
aa8b83b
SQLNamer for subquery WIP
amirsh Aug 19, 2017
d8bfdbf
SQLNamer handles subqueries.
amirsh Aug 19, 2017
0332137
WIP
Sajerner Aug 19, 2017
ffad7b4
Merge remote-tracking branch 'origin/namer' into calc-ast
Sajerner Aug 19, 2017
3f9e7cd
nesting rewrites is completed. (make tmp var is added)
praal Aug 21, 2017
f0f5635
SQLNamer substitutes star for all simple queries.
amirsh Aug 21, 2017
f906209
Exists added
Sajerner Aug 21, 2017
b398a06
SQLParser expands the content of include files.
amirsh Aug 21, 2017
669c48f
SQLParser makes all relation names lowercase.
amirsh Aug 21, 2017
e427769
SQLNamer works for all TPCH queries.
amirsh Aug 21, 2017
3b8d498
Merge remote-tracking branch 'origin/calc-ast' into namer
amirsh Aug 21, 2017
2c6af13
Merge remote-tracking branch 'origin/sql-parser' into calc-ast
Sajerner Aug 21, 2017
d68bd06
Extractor for all SQL expression providing a factory method and the l…
amirsh Aug 21, 2017
9146a64
SQLNamer puts aliasing for from fields deep inside the SQL expressions.
amirsh Aug 21, 2017
48ef358
SQLNamer now makes all field identifiers fully qualified.
amirsh Aug 21, 2017
7611fec
mk_sum fixed, Or added to calc of condition and inline naming fixed
Sajerner Aug 21, 2017
a184569
union added & lift bug in calc_of_sql_expr fixed
Sajerner Aug 22, 2017
2aecb2a
Added pardis-synthesis dependency.
amirsh Aug 24, 2017
bd0dc1a
WIP
Sajerner Aug 24, 2017
a759235
additional relation bug fixed
Sajerner Aug 24, 2017
1e45802
CalcExpr nodes follow SC pardis synthesizer design
amirsh Aug 24, 2017
e2b7c42
Added transformer for Calc and a sample rule
amirsh Aug 24, 2017
2a3a2c2
additional relation and exists bug fixed
Sajerner Aug 24, 2017
f0b1a59
Divide added , cases for add completed
Sajerner Aug 25, 2017
f1db384
Not Exists added and Sources became Option in all functions.
Sajerner Aug 25, 2017
a05a365
some fixing
praal Aug 25, 2017
4084683
Merge remote-tracking branch 'origin/calc-ast' into rules
amirsh Aug 25, 2017
329ce6a
Fixed an issue with SQLNamer in naming group by.
amirsh Aug 25, 2017
f0d41e4
some fixing
praal Aug 25, 2017
ab890ef
SQLNamer converts natural join to a proper inner join.
amirsh Aug 25, 2017
4cb1301
InList in progress
Sajerner Aug 25, 2017
4504982
some fixing (almost all the tests passed)
praal Aug 28, 2017
1588fbd
deep scoping bug fixed
Sajerner Aug 29, 2017
d7d9b85
InList problem in AST fixed
Sajerner Aug 29, 2017
b25ba37
all simple calc queries passed
praal Aug 29, 2017
cd8931e
Merge remote-tracking branch 'origin/rules' into sql-parser
praal Aug 29, 2017
2525801
Merge remote-tracking branch 'origin/rules' into calc-ast
Sajerner Aug 29, 2017
1ffac4f
Merge remote-tracking branch 'origin/sql-parser' into calc-ast
Sajerner Aug 29, 2017
ca1062f
Driver changed
Sajerner Aug 29, 2017
3f94312
some rules are added
praal Aug 29, 2017
beba594
all the nesting rewrites and normalize rules are added
praal Aug 29, 2017
fbd1433
Merge remote-tracking branch 'origin/sql-parser' into rules
amirsh Aug 29, 2017
481f2ad
Recursive rule based transformation added
amirsh Aug 29, 2017
67a014c
all simple test checked
Sajerner Aug 30, 2017
32009fa
calc raw queries
praal Aug 30, 2017
d4080ee
correcting raw queries
praal Aug 30, 2017
d5a3fff
Merge remote-tracking branch 'origin/rules' into sql-parser
praal Aug 30, 2017
e61a747
CalcParserTest fixed
praal Aug 30, 2017
ff25019
bottom up/top down transformers added, calc rules fixed, tests for ca…
praal Aug 30, 2017
8cd4d83
List of queries and count(distinct) fixed
Sajerner Aug 30, 2017
e4b5abc
Merge remote-tracking branch 'origin/rules' into calc-ast
Sajerner Aug 30, 2017
f8f7c99
Merge remote-tracking branch 'origin/sql-parser' into calc-ast
Sajerner Aug 30, 2017
48d845d
all tpch tests passed in rulebased opt
praal Aug 30, 2017
c7a4d22
in list fixed and ScanForExistance added
Sajerner Aug 31, 2017
f89130d
Modifications to QueryInterpreter, Testing still WIP
amirsh Aug 31, 2017
9bb7e68
names fixed
praal Aug 31, 2017
5e08cc4
Merge remote-tracking branch 'origin/calc-ast' into rules
amirsh Aug 31, 2017
d0351dc
moving general functions to CalcUtils
praal Aug 31, 2017
77f64b2
moving getcalcfiles
praal Aug 31, 2017
a46afbe
deleting unnecessary returns
praal Aug 31, 2017
f4f9972
Fixes for PlanExecuter, tpch Q1 correctly works now
amirsh Aug 31, 2017
8b6c7cf
Select Distinct fixed
Sajerner Sep 1, 2017
b821d46
Minor fixes for SQLNamer.
amirsh Sep 1, 2017
20a6638
SQLTyper WIP, SQLAnlyzer completely removed.
amirsh Sep 1, 2017
493370d
Merge remote-tracking branch 'origin/sql-parser' into rules
amirsh Sep 1, 2017
591752b
DBToaster driver uses proper arguments as compilation options.
amirsh Sep 1, 2017
b9e8bb4
listmax added
Sajerner Sep 1, 2017
534a6aa
Merge remote-tracking branch 'origin/rules' into calc-ast
Sajerner Sep 1, 2017
a704362
namer bug fixed
Sajerner Sep 1, 2017
52f3238
SQL AST nodes are using Pardis Tpe instead of String
amirsh Sep 1, 2017
7f9239d
A test added for SQLTyper
amirsh Sep 1, 2017
bb616a0
some other functions added
Sajerner Sep 1, 2017
34416e6
Merge remote-tracking branch 'origin/calc-ast2' into calc-ast
amirsh Sep 1, 2017
3b58cbe
formatting
amirsh Sep 1, 2017
9562efc
Merge pull request #2 from epfldata/sql-parser
amirsh Sep 1, 2017
1969d39
SQLTyper properly takes schema context into account
amirsh Sep 3, 2017
d1d8c00
Merge remote-tracking branch 'origin/develop' into sql-parser
amirsh Sep 3, 2017
8c95b94
SQLTyper works for TPCH and simple queries of DBToaster
amirsh Sep 3, 2017
9d18fd7
Costing and search infrastructure added
amirsh Sep 4, 2017
cf04e83
calc compiler initializing
Sep 23, 2017
da007f4
WIP
Sep 23, 2017
f3bdcac
compile table and some data structures
Sep 23, 2017
6396995
fixing
Sep 23, 2017
8e2862d
constant calc values are added.
praal Sep 23, 2017
c9aa693
Date added, method names changed to camelCase format
Sajerner Sep 25, 2017
fe14634
source of findTable changed to schema.tables
Sajerner Sep 25, 2017
990bb3e
tables removed from all methods
Sajerner Sep 25, 2017
aad555f
varOfSqlVar parameter's type changed from string to Tpe and TPCH test…
Sajerner Sep 25, 2017
488f26c
subtract and like added
Sajerner Sep 27, 2017
cdd5200
WIP
praal Oct 2, 2017
9604c66
delta of expression is completed.
praal Oct 2, 2017
1d43226
WIP extract renamings
praal Oct 3, 2017
2f1f8fd
Case added , BoolLiteral added to SQL-AST
Sajerner Oct 4, 2017
df60269
materialize as external is done
praal Oct 9, 2017
754cf0b
compile map is almost done.
praal Oct 14, 2017
293e61f
compile functions are almost completed.
praal Oct 15, 2017
1b3dc08
some fixing
praal Oct 15, 2017
f8ff984
fixing todos
praal Oct 16, 2017
6bb8262
fold and rewrite in Calculus
praal Oct 17, 2017
bbd3c1d
CalcCosting with no difference between 1d and 2d in cardinality
Sajerner Nov 1, 2017
7ebc974
simple test passed before implementing lift
Sajerner Nov 5, 2017
960fe20
almost all test accepted
Sajerner Nov 5, 2017
75a65da
compiler can compile tpch and simple calc queries (without exceptions)
praal Nov 13, 2017
023398a
code cleaning
Sajerner Dec 1, 2017
61dc134
Merge branch 'calc-compiler' into cost-compiler
amirsh Jan 4, 2018
ec2d343
Removed unnecessary outputs and disabled the test on cost-based optim…
amirsh Jan 4, 2018
54cd0ef
Improved what the driver outputs for Plan and Calc.
amirsh Jan 4, 2018
fe1fe57
plan to m3
praal Jan 5, 2018
bc43f52
minor bug
praal Feb 18, 2018
799e5d0
Removed unnecessary output logs.
amirsh Feb 18, 2018
a45fbbc
Merge branch 'calc-compiler' of github.com:epfldata/dblab-toaster int…
amirsh Feb 18, 2018
c17556b
Added Plan as an option for the output language.
amirsh Feb 18, 2018
f956ea7
dbtoaster and olap-engine sbt commands added.
amirsh Feb 18, 2018
9c95c90
Fixing a problem with PlanExecutor WIP
amirsh Feb 18, 2018
9898f23
Fixed bugs with handling joins in SQLNamer and SQLTyper.
amirsh Feb 19, 2018
926c021
Bugs fixed with accessing a field of a row in PlanExecutor.
amirsh Feb 19, 2018
4075608
Fixed a problem with naming and typing the HAVING clause.
amirsh Feb 19, 2018
dabbaa4
The test file for olap interpreter ignores precision of double values.
amirsh Feb 20, 2018
f0e1509
Namer names the join clause as well now.
amirsh Feb 20, 2018
f082587
Tentative fixes for SQL to plan convertor for the case of semi join.
amirsh Feb 20, 2018
f0c983c
Query plan naive optimizer handles projection and subqueries.
amirsh Feb 20, 2018
cf8d995
Schema has a method for finding a table by its attribute name.
amirsh Feb 20, 2018
8afb664
Removed unnecessary console outputs.
amirsh Feb 20, 2018
1c538e5
Updated the test file for query interpreteration of TPCH queries.
amirsh Feb 20, 2018
9c191d7
Updated TPCH queries 2 and 17.
amirsh Feb 20, 2018
8ed123c
Refactored SQL AST nodes for SELECT *
amirsh Feb 20, 2018
4080c90
Added a simple infrastructure for cost estimation and a function to v…
plechoss Mar 12, 2018
8246aea
Small fix for join tables list
plechoss Mar 12, 2018
3f9a6b7
Added filterSelectivity, joinOutputEstimation and Selinger join optim…
plechoss Mar 26, 2018
6a8ba42
Added simplified visualisation of queries with edge weights showing t…
plechoss Apr 6, 2018
8dac664
Added nested loop joins to join types
plechoss Apr 6, 2018
d128555
Cleaning up code
plechoss Apr 6, 2018
7eb956a
Added primary and foreign key information to the schema. Selinger alg…
plechoss Apr 16, 2018
b3c1e4d
Optimizer and visualiser handle subqueries well now
plechoss May 2, 2018
fa59288
Selinger optimizer returns results similar to hyperWeb for queries 2 …
plechoss May 7, 2018
c705034
Trees returned by selinger optimizer now contain projection, map, ord…
plechoss May 22, 2018
492f9e7
Deleted useless methods, refactored selinger function
plechoss May 24, 2018
4053a9c
Refactored the getTableNames(e: Expression) method
plechoss May 24, 2018
74a739b
Cleaned up more code, fixed a bug in collecting filter expressions in…
plechoss Jun 1, 2018
536079f
Fixed a bug that stopped some subqueries from optimizing. Added selec…
plechoss Jun 6, 2018
ded534c
Added Selection nodes for Aggregations. Improved size estimation for …
plechoss Jun 7, 2018
fd56097
Removed useless methods in Selinger Optimizer. Cleaned up comments.
plechoss Jun 7, 2018
aa89c0b
Refactored the getJoinColums method
plechoss Jun 7, 2018
c813864
Minor fixes
plechoss Jun 7, 2018
8c19552
Minor fixes.
plechoss Jun 8, 2018
28a4f58
Moved the visualize method to its own object
plechoss Jun 8, 2018
7c83c5e
Moved the visualize method
plechoss Jun 8, 2018
f582499
Removed the addStats helper method from QueryInterpreter
plechoss Jun 8, 2018
61feb91
Added the HashJoin and IndexNestedLoopJoin types.
plechoss Jun 8, 2018
ba52316
Renamed the visualizer package.
plechoss Jun 8, 2018
ada6d7f
Removed an expression from getFilterSelection which caused wrong size…
plechoss Jun 8, 2018
b13dec1
Fixed the visualization of ScanOpNodes
plechoss Jun 8, 2018
61d703b
Added qualifiers to ScanOpNodes
plechoss Jun 8, 2018
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -4,6 +4,8 @@
*.swo
*.swn

*.iml

# sbt specific
.cache/
.history/
@@ -30,3 +32,5 @@ result.csv
run_config.cfg

tpchdata/

.idea/
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@ object Config {
/** Specifies whether to obtain statistics during schema definition */
val gatherStats: Boolean = false
/** Specifies whether to show information about the query plan generation during execution */
- val debugQueryPlan: Boolean = true
+ val debugQueryPlan: Boolean = false
/** Specifies whether to specialize the loader or not */
var specializeLoader: Boolean = true
/** Specifies whether to specialize the query engine or not */
158 changes: 158 additions & 0 deletions components/src/main/scala/ch/epfl/data/dblab/dbtoaster/Driver.scala
Original file line number Diff line number Diff line change
@@ -0,0 +1,158 @@
package ch.epfl.data
package dblab
package dbtoaster

import schema._
import frontend.parser._
import frontend.optimizer._
import frontend.analyzer._
import utils.Utilities._
import java.io.PrintStream

import ch.epfl.data.dblab.frontend.parser.CalcAST._
import ch.epfl.data.dblab.frontend.parser.DDLAST.UseSchema
import ch.epfl.data.dblab.frontend.parser.SQLAST._
import ch.epfl.data.sc.pardis.annotations.mutable
import ch.epfl.data.sc.pardis.types.{ IntType, Tpe }
import frontend.parser.OperatorAST._
import config._
import schema._
import sc.pardis.language.Language

import scala.collection.mutable.ArrayBuffer

object Driver {
/**
* The starting point of DBToaster
*
* @param args the setting arguments passed through command line
*/
def main(args: Array[String]): Unit = {
if (args.length < 1) {
System.out.println("ERROR: Invalid number (" + args.length + ") of command line arguments!")
System.out.println("USAGE: dbtoaster <SQL/Calc query> -l SQL|CALC|PLAN|M3 -O")
System.out.println("Example: dbtoaster experimentation/dbtoaster/queries/tpch/query6.sql -l M3")
System.exit(1)
}
val options = nextOption(Map(), args.toList)
val queryFiles = options.get('queries) match {
case Some(l: List[String]) => l
case x => throw new Exception(s"No queries provided: $x")
}
val outputLang: Language = options.get('lang).map(_.toString().toUpperCase()) match {
case Some("CALC") => Calc
case Some("M3") => M3
case Some("SQL") => SQL
case Some("PLAN") => CalcPlan
case _ =>
throw new Exception("No proper -l defined!")
}
val shouldOptimize = options.get('opt).map(_.asInstanceOf[Boolean]).getOrElse(false)

for (q <- queryFiles) {
def getCalc(): List[CalcExpr] = if (q.endsWith(".calc")) {
CalcParser.parse(scala.io.Source.fromFile(q).mkString)

} else {
val sqlParserTree = SQLParser.parseStream(scala.io.Source.fromFile(q).mkString)
val sqlProgram = sqlParserTree.asInstanceOf[IncludeStatement]
val tables = sqlProgram.streams.toList.map(x => x.asInstanceOf[CreateStream]) // ok ?
val ddlInterpreter = new DDLInterpreter(new Catalog(scala.collection.mutable.Map()))
val query = sqlProgram.body
// println(query)
val schema = ddlInterpreter.interpret(UseSchema("DBToaster") :: tables)

def listOfQueries(q: TopLevelStatement): List[TopLevelStatement] = {
q match {
case u: UnionIntersectSequence if u.connectionType.equals(SEQUENCE) =>
List(u.top) ++ listOfQueries(u.bottom)
case x => List(x)
}
}
val queries = listOfQueries(query)

val sqlToCalc = new SQLToCalc(schema)

// sql_to_calc.init()
val namer = new SQLNamer(schema)
val typer = new SQLTyper(schema)
// val tpchSchema = TPCHSchema.getSchema("experimentation/dbtoaster/queries/sf0.001/", 0.0001)
// val calcCoster = new CalcCosting(tpchSchema)

queries.flatMap({ q =>
val namedQuery = namer.nameQuery(q)
val typedQuery = typer.typeQuery(namedQuery)
val calcExpr = sqlToCalc.calcOfQuery(None, typedQuery)

// println()
// println("Costing : ")
// println()
// println(schema)
// println()
// println(tpchSchema)

// calcExpr.foreach({
// case (name, exp) =>
// println(name + ":")
// println(calcCoster.cost(exp))
// })
// println()
// println()
// println()

calcExpr.map({ case (tgt_name, tgt_calc) => tgt_calc })
})
}
def getCalcPlan(calcExprs: List[CalcExpr]): (Plan, List[CalcQuery]) = {
val queries = calcExprs.collect({ case ce: CalcQuery => ce })
CalcCompiler.compile(Some(1), queries, Schema(ArrayBuffer(), Statistics()))
}
outputLang match {
case Calc =>
val calcExprs = getCalc()
for (calcExpr <- calcExprs) {
calcExpr match {
case CalcQuery(x, y) => println(s"$x:\n${prettyprint(y)}")
case _ =>
}
}
case CalcPlan =>
println("Outputting PLAN")
val calcExprs = getCalc()
val (plan, qs) = getCalcPlan(calcExprs)
for (cds <- plan.list) {
println("description")
println(cds.description)
println("triggers")
println(cds.triggers)

}
// println(plan)
// println(qs)
case M3 =>
println("Outputting M3")
val calcExprs = getCalc()
val (plan, qs) = getCalcPlan(calcExprs)
val m3 = PlanToM3.planToM3(Schema(ArrayBuffer(), Statistics()), plan)
println(m3)
case lang =>
throw new Exception(s"Output language $lang is not supported yet!")
}
}
}

type OptionMap = Map[Symbol, Any]

def nextOption(map: OptionMap, list: List[String]): OptionMap = {
list match {
case Nil => map
case "-O" :: tail =>
nextOption(map ++ Map('opt -> true), tail)
case "-l" :: value :: tail =>
nextOption(map ++ Map('lang -> value), tail)
case string :: tail =>
println(s"foo-$string")
nextOption(map ++ Map('queries -> (map.getOrElse('queries, List()).asInstanceOf[List[String]] :+ string)), tail)
}
}
}
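The recursive option parsing done by `nextOption` can be illustrated in isolation. The following is a minimal standalone sketch, not the PR's code: `ParsedArgs`, `ArgsSketch` and `parseArgs` are hypothetical names, and a typed case class is assumed in place of the `Map[Symbol, Any]` used above so that no casts are needed.

```scala
import scala.annotation.tailrec

// Hypothetical typed container for the parsed command line (not part of the PR).
final case class ParsedArgs(
  queries: List[String] = Nil, // positional arguments, in the order given
  lang: Option[String] = None, // value following -l, if any
  optimize: Boolean = false)   // true when -O is present

object ArgsSketch {
  // Consumes the argument list from the left, one option at a time.
  // As in nextOption, a trailing "-l" without a value falls through to
  // the last case and is treated as a query file name.
  @tailrec
  def parseArgs(args: List[String], acc: ParsedArgs = ParsedArgs()): ParsedArgs =
    args match {
      case Nil                   => acc
      case "-O" :: tail          => parseArgs(tail, acc.copy(optimize = true))
      case "-l" :: value :: tail => parseArgs(tail, acc.copy(lang = Some(value)))
      case file :: tail          => parseArgs(tail, acc.copy(queries = acc.queries :+ file))
    }

  def main(args: Array[String]): Unit = {
    val parsed = parseArgs(List("query6.sql", "-l", "M3", "-O"))
    println(parsed)
  }
}
```

With a typed result, the `asInstanceOf` calls and the unchecked `List[String]` pattern in `main` would no longer be necessary; whether that trade-off suits the driver is a design choice for the authors.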
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
package ch.epfl.data
package dblab
package dbtoaster

import sc.pardis.language._

case object SQL extends Language
case object Calc extends Language
case object CalcPlan extends Language
case object M3 extends Language
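As an illustration of how the `-l` flag value maps onto these objects, here is a small self-contained sketch. It assumes a local `OutputLanguage` trait as a stand-in for `sc.pardis.language.Language`, and `fromFlag` is a hypothetical helper that returns an `Option` instead of throwing as the driver does.

```scala
import java.util.Locale

// Local stand-ins for the four language objects; in the PR they extend
// sc.pardis.language.Language, which is not available in this sketch.
sealed trait OutputLanguage
case object SQL extends OutputLanguage
case object Calc extends OutputLanguage
case object CalcPlan extends OutputLanguage
case object M3 extends OutputLanguage

object OutputLanguage {
  // Case-insensitive lookup using the same flag spellings as the driver's
  // usage message: SQL, CALC, PLAN, M3. Unknown values yield None.
  def fromFlag(flag: String): Option[OutputLanguage] =
    flag.toUpperCase(Locale.ROOT) match {
      case "SQL"  => Some(SQL)
      case "CALC" => Some(Calc)
      case "PLAN" => Some(CalcPlan)
      case "M3"   => Some(M3)
      case _      => None
    }
}

object LanguageSketch {
  def main(args: Array[String]): Unit = {
    println(OutputLanguage.fromFlag("plan"))
    println(OutputLanguage.fromFlag("xml"))
  }
}
```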
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
package ch.epfl.data
package dblab
package frontend
package analyzer

import ch.epfl.data.dblab.frontend.parser.SQLAST
import parser.CalcAST._
import sc.pardis.search.CostingContext
import sc.pardis.ast.Node
import schema._

import scala.math._

class CalcCosting(schema: Schema) extends CostingContext {
def apply(node: Node): Double = cost(node.asInstanceOf[CalcExpr])
var count = 0

def searchCost(x: Double): Double = max(1, scala.math.log(x) / scala.math.log(2))

def findVars(exp: CalcExpr): scala.collection.mutable.Map[VarT, CalcExpr] = {
val vars = scala.collection.mutable.Map[VarT, CalcExpr]()
exp match {
case CalcProd(lst) => lst.foldLeft(vars)((a, b) => a ++ findVars(b))
case CalcSum(lst) => lst.foldLeft(vars)((a, b) => a ++ findVars(b))
case AggSum(v, e) => findVars(e)
case CalcNeg(e) => findVars(e)
case Lift(vr, e) => vars += (vr -> e)
case _ => vars
}
}

def cost(exp: CalcExpr): Double = {
count += 1
val vars = findVars(exp)
costExpr(exp, vars)
}

def costExpr(exp: CalcExpr, vars: scala.collection.mutable.Map[VarT, CalcExpr]): Double = {

exp match {
case CalcProd(List()) => 0.0
case CalcProd(lst) =>
val headCardinality = cardinality(lst.head, vars)._2
lst.map(x => costExpr(x, vars)).sum +
lst.foldLeft((1.0, 0.0))((a, b) => {
if (cardinality(b, vars)._1 == 2)
(a._1 * cardinality(b, vars)._2, a._2 + a._1 * cardinality(b, vars)._2)
else if (cardinality(b, vars)._1 == 1 && a._1 == 1)
(cardinality(b, vars)._2, a._2 + cardinality(b, vars)._2)
else if (cardinality(b, vars)._1 == 1 && a._1 != 1)
(a._1, a._2 + a._1)
else
a
})._2 - headCardinality
case CalcSum(lst) =>
val headCardinality = cardinality(lst.head, vars)._2
lst.map(x => costExpr(x, vars)).sum + lst.map(x => searchCost(cardinality(x, vars)._2)).sum +
headCardinality - searchCost(headCardinality) //TODO what if they had different rows ?
case CalcNeg(e) => costExpr(e, vars) + cardinality(e, vars)._2
case AggSum(List(), e) => costExpr(e, vars) + cardinality(e, vars)._2
case AggSum(v, e) => costExpr(e, vars) + cardinality(e, vars)._2 *
searchCost(cardinality(e, vars)._2) + cardinality(e, vars)._2 // TODO: scale the sorting term by a constant once the sorting algorithm is known
case _: Rel => cardinality(exp, vars)._2
case Cmp(c, first, second) => costExpr(first, vars) + costExpr(second, vars) + max(cardinality(first, vars)._2, cardinality(second, vars)._2)
case External(_, inps, outs, tp, _) => ???
case CmpOrList(v, consts) => ???
case Lift(vr, e) => searchCost(vars.size)
case Exists(term) => costExpr(term, vars) + cardinality(term, vars)._2
case CalcValue(v: ArithFunc) => 5.0
case CalcValue(v: ArithVar) => cardinality(v, vars)._2
case CalcValue(_) => 1.0
case ArithVar(v) =>
if (vars.contains(v))
costExpr(vars(v), vars)
else
schema.stats.getCardinalityOrElse("PART", 1) // TODO: look up the size of the actual relation in the schema instead of hard-coding "PART"
case ArithProd(lst) =>
lst.map(x => costExpr(x, vars)).sum +
lst.foldLeft((1.0, 0.0))((a, b) => {
if (cardinality(b, vars)._1 == 1 && a._1 == 1)
(cardinality(b, vars)._2, a._2 + a._1)
else if (cardinality(b, vars)._1 == 1 && a._1 != 1)
(a._1, a._2 + a._1)
else
a
})._2

case _: ArithFunc => 5.0
case ArithConst(s: SQLAST.StringLiteral) => s.v.length.toDouble
case _: ArithConst => 1.0
}
}

def cardinality(exp: CalcExpr, vars: scala.collection.mutable.Map[VarT, CalcExpr]): (Int, Double) = {
exp match {
case CalcProd(lst) => (lst.foldLeft(1)((a, b) => max(a, cardinality(b, vars)._1)), lst.foldLeft(1.0)((a, b) => {
if (cardinality(b, vars)._1 == 2)
a * cardinality(b, vars)._2
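// NOTE: unreachable as written, because the branch above already matches `_1 == 2`;
// `_1 == 1` may have been intended, mirroring the CalcProd case in costExpr.
else if (cardinality(b, vars)._1 == 2 && a == 1)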
cardinality(b, vars)._2
else
a
}))
case CalcSum(lst) => (lst.foldLeft(1)((a, b) => max(a, cardinality(b, vars)._1)),
lst.foldLeft(0.0)((a, b) => max(a, cardinality(b, vars)._2)))
case CalcNeg(e) => cardinality(e, vars)
case AggSum(List(), e) => (1, 1)
case AggSum(v, e) => (cardinality(e, vars)._1, min(cardinality(e, vars)._2,
v.foldLeft(1)((a, b) => schema.stats.getDistinctAttrValuesOrElse(b.name, cardinality(e, vars)._2.toInt) * a)) * 0.5)
case Rel(_, name, _, _) => (2, schema.stats.getCardinalityOrElse("PART", 1)) // TODO: use `name` instead of the hard-coded "PART"
case Cmp(c, first, second) => (1, max(cardinality(first, vars)._2, cardinality(second, vars)._2) * 1) // TODO : +we can have some inference here
case External(_, inps, outs, tp, _) => ???
case CmpOrList(v, consts) => ???
case Lift(vr, e) => (0, 1)
case Exists(term) => cardinality(term, vars) //TODO we can have inference here
case CalcValue(v: ArithVar) => cardinality(v, vars)
case CalcValue(_) => (1, 1)
case ArithVar(v) =>
if (vars.contains(v))
cardinality(vars(v), vars)
else
(1, schema.stats.getCardinalityOrElse("PART", 1)) // TODO: look up the size of the actual relation in the schema instead of hard-coding "PART"
case _: ArithConst => (1, 1)
case _: ArithFunc => (1, 1)
case ArithProd(lst) => (lst.foldLeft(1)((a, b) => max(a, cardinality(b, vars)._1)), lst.foldLeft(1.0)((a, b) => {
if (cardinality(b, vars)._1 == 2 && a == 1)
cardinality(b, vars)._2
else
a
}))

}
}
}
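To make the arithmetic of the aggregation cases concrete, the following sketch reproduces `searchCost` and the two `AggSum` cost formulas on plain numbers. `CostSketch`, `groupedAggCost` and `plainAggCost` are hypothetical names, and the body cost and cardinality are passed in directly rather than derived from a `CalcExpr`.

```scala
object CostSketch {
  // Same formula as searchCost in CalcCosting: log2(x), with a floor of 1.
  def searchCost(x: Double): Double =
    math.max(1.0, math.log(x) / math.log(2))

  // Mirrors the AggSum(v, e) case with group-by variables:
  // cost of the body, plus card * log2(card) for the sort-like step,
  // plus one pass over the body's rows.
  def groupedAggCost(bodyCost: Double, card: Double): Double =
    bodyCost + card * searchCost(card) + card

  // Mirrors the AggSum(List(), e) case: cost of the body plus one pass.
  def plainAggCost(bodyCost: Double, card: Double): Double =
    bodyCost + card

  def main(args: Array[String]): Unit = {
    // A relation's own cost equals its cardinality in the Rel case,
    // so the same number is used for both arguments here.
    val card = 1024.0
    println(f"searchCost:     ${searchCost(card)}%.2f")
    println(f"grouped AggSum: ${groupedAggCost(card, card)}%.2f")
    println(f"plain AggSum:   ${plainAggCost(card, card)}%.2f")
  }
}
```

For a body of 1024 rows the grouped case is dominated by the `card * log2(card)` term, which is roughly ten times the cost of a single pass.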