No, you don't have to write the parser!
Instead, a complete parser and type checker for the MiniSQL language is provided with the skeleton code.
This small subset of SQL includes the basic commands CREATE
, DROP
,
INSERT
, SELECT
, UPDATE
, DELETE
, DESCRIBE
, and EXPLAIN
.
Supported data types include INTEGER
, FLOAT
, and STRING
(notice the slight deviations from the SQL standard).
The following (incomplete) list illustrates what is not included in the language:
NULL
values
WHERE a = b + 1
FROM Employees E
) DISTINCT
and ORDER BY
GROUP BY
, HAVING
, etc.
Once a MiniSQL statement is parsed, the abstract syntax tree (AST) is passed to the
Optimizer
,
which in turn dispatches the query to the corresponding class implementing the
Plan
interface.
In this project, you will implement several of these plan classes, using the provided
parser
and system
Catalog
.
Your first task is to implement the
CreateIndex
and DropIndex
commands, allowing you to build and destroy hash indexes on tables.
Your implementation of these and the remaining Plan
classes must meet the following general requirements:
QueryCheck
to fulfill this requirement.
Catalog
to maintain them.
Some useful hints and tips:
CreateTable
and DropTable
as a reference. CREATE INDEX
should actually build the hash index! CreateTable
to "index")
Your next task is to implement the Insert
and Select
classes, allowing you to create and query actual data.
Remember to fulfill the general requirements listed in part one.
For Select
, you will implement a basic query optimizer.
The parser will give you an array of table names to select from, an array of (unique) column names to project,
and an array of selection predicates (in conjunctive normal form).
The basic plan is to use FileScan
s
and SimpleJoin
s for all the tables,
add a Selection
for each conjunct,
and have one Projection
at the root of the
Iterator
tree.
For example:
Given the tables T1(a, b) and T2(c, d) and the following query:
EXPLAIN SELECT d, a FROM T1, T2 WHERE a = c and b = d or a = 5;
The default (naive) execution plan is as follows:
(Note how the conditions of the WHERE clause change into the predicates)
Projection : {3}, {0}
Selection : b = d OR a = 5
Selection : a = c OR a = 5
SimpleJoin : (cross)
FileScan : T2
FileScan : T1
In order to get the full credit, you need to implement the pushing selections optimization technique described in textbook (section 12.5.1): if the predicates of a selection involve only the attributes of one table, you should execute the selection before the join
Some more useful hints and tips:
INSERT
statements.Select
's constructor is to create an Iterator
query tree.execute()
is call iter.explain()
or iter.execute()
)
Besides basic evaluation plan, there is much space for optimization.
We are going to provide you with some MiniSQL queries and expect you to utilize information from the Catalog to optimize the queries.
Think about possibilities to optimize those queries.
Query 1
Query 2
Query 3
Here is Project documentation, and Project skeleton code.
Note that this code is a complete starting point for the project,
i.e. the bufmgr
, heap
, index
, and relop
packages are provided for you (in jar files).
The framework classes may be sightly different than the ones used in previous projects.
Running and testing this project will be quite different from the others.
Instead of using an automated test driver, you will run the provided command-line utility,
Msql
,
which resembles the behavior of SQL*Plus
.
Several test queries are provided with the skeleton code, but more queries will be tested at the time of grading.
Use the STATS
command to view performance counters.
The Msql program can receive input from the command line, or from a file.
In the latter case, you need to provide the file name as an input parameter. For example:
If you use an IDE, feel free to set up your run configuration to accept that input parameter.
java -classpath bin global.Msql mytest.sql
For individual parts, submit you work via Blackboard. As usual, please include together with your code the Makefile, Readme (which describes your project). All files need to be zipped in a file named: your_career_login_qe.zip.
For group part, one of the group member will submit the work via Blackboard. Also please include with your code the Makefile, Readme (which lists group members, how you do the optimization, new features that you want us to know, and roles of each member, i.e. who do what). All files need to be zipped in a file named: your_career_login1_your_career_login2_your_career_login3_qe.zip.
I should be able to compile/run your program using make on a CS deparment Unix machine. The directory structure of your zip file should be identical to the directory structure of the provided zip file (i.e., having the directory src, the Makefile, ...), except the top-level name (should be your career login above). Your grade may be deduced 5% off if you don’t follow this.