Creating and
Manipulating Relational Databases
Ø CREATE TABLE <table name> (
<column name> <data type> <constraint>,
<column name> <data type> <constraint>);
Ø CREATE TABLE Item (
Item_Identifier char(5) NOT NULL,
Item_Description char(15) NOT NULL,
CONSTRAINT pk_item PRIMARY KEY (Item_Identifier) );
Ø CHAR(n) Fixed length string of exactly n characters.
Ø NUMERIC(p,q) Decimal number, p digits with decimal point q digits from the right.
Ø INTEGER Signed integer.
Ø DECIMAL (p,q), FLOAT, DOUBLE PRECISION, REAL, SMALLINT, ...
Ø There are many synonyms, always check your implementation.
Ø You can eliminate the need to control the order in which tables and constraints are created by executing all the create table statements and then adding FK constraints using alter table statements. PK constraints can be added at either stage.
Ø ALTER TABLE <tablename> ADD
Ø
columns,
constraints
Ø ALTER TABLE Employee ADD birthdate DATE;
Ø
NOT NULL is
not allowed with ALTER TABLE ADD column as all the rows will have a null value
immediately after the column is added
Ø ALTER TABLE Employee ADD PRIMARY KEY(EmployeeID);
Ø ALTER TABLE Product ADD FOREIGN KEY(prod_group_code)
Ø
REFERENCES
Product_Group (prod_group_code);
Inserting values into a database
Ø INSERT INTO employees VALUES
(98765, 00011, Jones,Cecile, 822-8947, 3343, null);
Ø INSERT INTO employees VALUES
(98775, 00011, Cray,Jean, 822-8947, 3344, 834-3382);
Ø Queries in SQL are performed using the SELECT statement.
Ø The SELECT statement has 6 clauses:
Ø
SELECT <attribute list>
Ø
FROM <table list>
Ø
[WHERE <selection criteria and join
condition>]
Ø
[GROUP BY <grouping attribute(s)>]
Ø
[HAVING <group selection condition>]
Ø
[ORDER BY <attribute list>]
Ø
Clauses in
brackets [ ... ] are optional
Ø SELECT *
Ø
* refers to
all attributes in the order they were created in CREATE TABLE
Ø
SELECT * FROM
item;
Ø
This
statement projects all columns from all tuples from the specified table.
Ø SELECT supplier_identifier, supplier_name
Ø
if a list of
attributes is provided then only these attributes are projected and in the
order specified.
Ø SELECT ALL / DISTINCT
Ø
ALL means
that all tuples are selected. DISTINCT means that only one tuple for each
distinct value in the specified attributes will be selected.
Ø
If not
specified, the default is ALL
Ø
Null values
are considered equal and will be treated as duplicate values.
Ø As noted above, the FROM clause is required in all queries.
SELECT DISTINCT item_description
FROM supplier_item;
Ø The simplest query is of the form: SELECT * FROM tablename;
Ø the FROM clause specifies the tables in which the requested data is found.
Ø You can create a table alias in the from clause in order to make the select statement more readable.
SELECT i.item_description,
s.supplier_name, si.item_code
FROM supplier_item
si, supplier s, item i
WHERE si.item_identifier=i.item_identifier
AND si.supplier_identifier=s.supplier_identifier
Ø
The WHERE
clause specifies the criteria for selecting individual tuples
(the relational SELECT operation)
Ø
The WHERE
clause specifies the join condition for joining two relations
(the relational JOIN operation)
Ø Single table queries have no join conditions, hence the where clause specifies only the criteria for selecting tuples.
Ø The standard comparison operators are used in the selection criteria
Ø
= Equal
Ø
<> Not equal
Ø
> Greater than
Ø
< Less than
Ø
>= Greater than or Equal
Ø
<= Less than or Equal
SELECT item_description
FROM item
WHERE item_identifier
= 00722;
AND
Ø
AND is used
to specify multiple selection criteria, all of which must be met for a row to
be selected.
Ø
AND is useful
when formulating search criteria involving multi-attribute keys:
SELECT unit_price
FROM supplier_item
WHERE item_identifier
= '00722'
AND supplier_identifier
= '92327';
OR
Ø
OR is used to
specify multiple selection criteria, any of which must be met for a row to be
selected.
Ø
OR is useful
to select rows outside a range:
SELECT unit_price
FROM supplier_item
WHERE item_identifier
= '00722'
OR supplier_identifier
= '92327';
NOT
Ø
NOT is placed
before an entire expression to negate the truth value
SELECT item_code
FROM supplier_item
WHERE NOT supplier_identifier = '92327';
WHERE
supplier_identifier NOT = 123456 is incorrect
No abbreviations of logical
conditions are allowed. |
Cannot be written as: |
SELECT * FROM item WHERE item_identifier = 12345 OR item_ identifier = 19961 |
SELECT * FROM item WHERE item_ identifier = 12345 OR 19963 |
BETWEEN
SELECT *
FROM item
WHERE item_identifier
BETWEEN 12000 AND 21000
BETWEEN
is inclusive.
IN
Ø
IN selects
those rows for which a specified value appears in a list of constant values
enclosed in parentheses
SELECT *
FROM item
WHERE item_identifier
IN (67211, 27412)
NULL
Ø
NULL selects
those rows for which a specified value is the special SQL null value.
SELECT *
FROM employees
WHERE pager
IS NULL
LIKE
Ø
LIKE selects
those rows that match a given search pattern.
Ø
LIKE can only
be used with character data.
SELECT *
FROM supplier_item
WHERE item_code LIKE %345%
Ø
The following
wildcards are supported:
%Avenue% |
any character string containing the word Avenue. |
%Avenue |
any character string consisting of any number of characters ending in
the word Avenue. |
Avenue% |
Any character string beginning with the word Avenue. |
b_g |
Any three letter string beginning with b and ending in g. Only a
three character string can match this pattern. |
b_g% |
Any character string beginning with b, followed by any single
character, followed by g, followed by any string of zero or more characters. |
%b_g% |
Any character string that has a three character string beginning with
b and ending with g anywhere in it. |
WHERE specifies join conditions
Ø Join conditions are specified by a comparison between the values of two columns in different tables. Consider the following tables:
create table SUPPLIER
(
SUPPLIER_IDENTIFIER char(5) not null,
SUPPLIER_NAME char(30) not null,
SUPPLIER_ADDRESS char(45) not null,
constraint pk_supplier primary key
(SUPPLIER_IDENTIFIER)
)
create table SUPPLIER_ITEM
(
ITEM_CODE char(5) not null,
ITEM_IDENTIFIER char(5) not null,
SUPPLIER_IDENTIFIER char(5) not null,
UNIT_PRICE money not null,
constraint pk_supplier_item primary key
(ITEM_CODE)
)
Ø The join condition is based on the equivalence of the value of the foreign key supplier_identifier in the table supplier_item with the value of the primary key supplier_item in the table supplier.
Ø
For every
item, list the items identifier, code, unit price, and the name of the supplier
that supplies that item at that price.
SELECT item_identifier,
item_code, unit_price, supplier_name
FROM supplier_item
si, supplier s
WHERE s.supplier_identifier = si.supplier_identifier
Aggregate Functions
Ø COUNT, SUM, MAX, MIN, AVG (your DBMS may have additional aggregate functions)
SELECT MAX(unit_price),
MIN(unit_price), AVG(unit_price)
FROM supplier_item;
SELECT COUNT(item_identifier)
FROM supplier_item
Is
equivalent to COUNT(*) which is a count of how many tuples in the relation.
SELECT COUNT
(DISTINCT item_identifier)
FROM supplier_item
Is
the number of distinct item identifiers in the table.
Ø Aggregate functions require the GROUP BY clause
Ø
An aggregate
function gives a result based on a group of rows in a table. Count, Sum, Max,
Min, Average all are operations on groups of values.
Ø
Remember that
the result of an operation on a relation is another relation. Relations are rectangular
in shape, thus any operation must produce a rectangular result.
Ø
It follows
that you cant mix aggregate functions, which return one value, with standard
SQL statements that return a set of individual values. For example, the query
SELECT item_identifier,
AVG(unit_price)
FROM supplier_item;
would result in multiple rows for the product_code
part and a single row for the AVG(unit_price) part.
Ø The purpose of the GROUP BY clause is to group those columns in the SELECT clause that are not the arguments of an aggregate function so that the resulting table will be rectangular.
Ø All attributes appearing in SELECT attribute list must appear in the GROUP BY list unless they are the argument of an aggregate function. This guarantees that each attribute in the SELECT clause will be single valued per group.
SELECT item_identifier,
AVG(unit_price)
FROM supplier_item
GROUP BY item_identifier;
This
query will give the expected results. The aggregate function is calculated on each
group (product_code).
Another
way to think about GROUP BY is that it specifies the control break for
calculating the values of the aggregate functions.
Ø HAVING selects groups of tuples just like the selection criteria in the WHERE clause selects individual tuples.
Ø Think of the HAVING clause as specifying the selection criteria for group values generated by aggregate functions.
v
Which items have an average
unit_price greater than 100?
SELECT item_identifier,
AVG(unit_price)
FROM supplier_item
GROUP BY item_identifier;
HAVING AVG(unit_price)
> 100;
Ø ORDER BY permits the ordering of the tuples in the result of a query by one or more attributes. The default order is ascending. The order used for each attribute in the ORDER BY clause can be specified by following the attribute name with DESC for descending or ASC for ascending.
v
Generate a
list of item identifiers having an average unit price of more than 100. List
the output in order of the item identifiers.
SELECT item_identifier,
AVG(unit_price)
FROM supplier_item
GROUP BY item_identifier;
HAVING AVG(unit_price)
> 100
ORDER BY item_identifier;
v
Generate a
list of item identifiers having an average unit price of more than 100. List the
item identifiers in order of largest to smallest average unit price.
SELECT item_identifier,
AVG(unit_price)
FROM supplier_item
GROUP BY item_identifier;
HAVING AVG(unit_price)
> 100
ORDER BY AVG(unit_price)
DESC;
Ø SELECT statements can be nested to an arbitrary number of levels. The innermost statement is evaluated first.
Ø Remember that the result of any query on a relation is another relation.
v
Create a list
that shows the items whose average unit price is greater than 100$. Your list
should show the item id, description, and average price.
SELECT i.item_identifier,
i.item_description, AVG(si.unit_price)
FROM item
i, supplier_item si
WHERE i.item_identifier
= si.item_identifier AND
i.item_identifier
IN (SELECT item_identifier
FROM
supplier_item
GROUP
BY item_identifier
HAVING
AVG(unit_price) > 100)
GROUP BY i.item_identifier,
i.item_description
ORDER BY i.item_description;
Ø The inner querys result is dependent on the row being processed by the outer query
Ø
The subquery
result varies with the row being processed by the outer query. The defining
characteristic is a reference to one or more outer query tables in the WHERE or
HAVING clause of the subquery.
Ø
the subquery
is re-evaluated for each row in the outer query (consumes resources).
v List items whose average unit price > 150
SELECT i.item_description
FROM item
i
WHERE 150
<
(SELECT
avg(unit_price)
FROM
supplier_item si
WHERE
i.item_identifier = si.item_identifier)
Ø An SQL view is a single, virtual table derived from other tables.
Ø It exists not as the intension itself, but only as the instructions to create that intension.
Ø The tables from which the view is derived can be base tables (physically exist) or other views.
Ø The advantage of a view is that it is always up to date.
CREATE VIEW WhoSuppliesWhat AS
SELECT s.supplier_name,
si.item_identifier, i.item_description,
si.item_code, si.unit_price
FROM supplier
s, item i, supplier_item si
WHERE si.item_identifier=i.item_identifier
AND si.supplier_identifier=s.supplier_identifier;
CREATE VIEW Catalogue (Who, What, Describe, CatNumber,
Price) AS
SELECT s.supplier_name,
si.item_identifier, i.item_description,
si.item_code, si.unit_price
FROM supplier
s, item i, supplier_item si
WHERE si.item_identifier=i.item_identifier
AND si.supplier_identifier=s.supplier_identifier;
Ø Not specifying the column names in the view means the view (virtual table) takes on the corresponding column names from the column list of the select statement from which it is created.
Ø Once created, the view acts as a regular table EXCEPT that its values cannot be updated. Updates are performed on the base tables on which the view is defined.
SELECT Who, Describe, Price
FROM
catalogue
WHERE Describe = Sweater_1
ORDER BY Who;
Ø DROP is used to delete objects from the database
Ø
DROP TABLE
employees;
Ø
DROP VIEW
catalogue;
Ø
ALTER TABLE item
DROP COLUMN item_description CASCADE;
Ø
CASCADE will
drop not only the specified column, but also all view definitions and integrity
constraints using this column. Note that
this is not a cascading delete of values of attributes in other tables but rather
of objects (views and constraints) defined using the dropped column.
Ø A second advantage of a view is that it provides some level of security.
Ø
It limits the
columns and rows that a user can see, thus limiting access to sensitive data
Ø
It is read
only, thus preventing the alteration of any data values.
Ø A second level of security is provided through passwords that permit connection to the database
Ø A third level of security is provided by according certain privileges to certain passwords. This is done through the GRANT statement which takes the following form:
GRANT privileges
ON object
TO users [WITH GRANT OPTION);
Ø Objects are things on which privileges are granted and include the following:
Ø
DOMAIN
domain_name
Ø
[TABLE]
table_name (table includes views in this case)
Ø
CHARACTER SET
character-set_name
Ø
COLLATION
collation_name
Ø
TRANSLATION
translation_name
Ø The key word TABLE is not required for the table object.
Ø The privileges granted must make sense for the object they are granted on.
Ø Privileges is a comma separated list of privileges, or the phrase ALL PRIVILEGES, on the specified object for which the current userID (the one issuing the GRANT) has grant authority.
Ø USAGE
Ø
(to use a
specific domain, character set, collation, or translation).
Ø SELECT
Ø
(access all
columns of a named table, including columns added later).
Ø INSERT(x) or UPDATE(x)
Ø
(insert into
or update a specific column x of a named table).
Ø INSERT or UPDATE
Ø
(update all columns
of a named table, including columns added later).
Ø DELETE
Ø
(delete rows
from a specified table).
Ø REFERENCES(x) or REFERENCES
Ø
(to impose any integrity constraint on column x or
all columns of a named table. In the case of REFERENCES this includes all
columns added later).
Ø An owner of an object is automatically granted all privileges that make sense for that object.
Ø
If a
privilege is granted with grant option then the recipient has grant authority
for that privilege, which means that the recipient can in turn grant that same
privilege, with or without grant option, to some further user.
GRANT delete
ON employees
TO personnelassistant
GRANT select
ON employees
TO PUBLIC
Ø PUBLIC is a special key word meaning all userids known to the system at any given time.
Ø REVOKE is used to take privileges on objects away from users. The user issuing the revoke must be the user who granted the privileges in the first place. The statement has the general form:
Ø
REVOKE
privileges ON object FROM users
Ø Dropping a base table, column, or view automatically revokes all privileges on the dropped object from all users.
Ø If A grants a privilege to B and C on Table T1 with grant option, and both B and C grant the privilege to D on T1, and then later B revokes this same privilege from D, D still retains the privilege on T1 based on Cs granting of the privilege.
Ø Ambiguous: can mean dont know or doesnt exist (no/0) or not applicable
Ø Nulls in the argument of aggregate functions are eliminated before the aggregate function is applied, except in the case of COUNT(*), where nulls are handled just like non-null values.
Ø Effect on conditional expressions:
Ø
If A is null,
or B is null, or both are null, then the following all evaluate to unknown:
Ø
A=B, A<B,
A>B, A<>B, A<=B, A>=B
Ø
Two nulls are
not considered equal: A=A evaluates to unknown.
Ø This means that we need to adopt a three-valued truth condition for any expression involving nulls
Ø
true, false,
unknown (null) = AMBIGUITY
AND |
t |
U |
f |
|
OR |
T |
u |
f |
|
NOT |
|
T |
t |
U |
f |
|
t |
T |
t |
t |
|
t |
f |
U |
u |
U |
f |
|
u |
T |
u |
u |
|
u |
u |
F |
f |
F |
f |
|
f |
T |
u |
f |
|
f |
t |
NULLS in other
conditions
Ø y BETWEEN x AND z is equivalent to x <= y AND y <= x
Ø
either
component may evaluate to null with the result of either unknown (one or both
null) or false (one false and the other either false or null) depending on the
value of the other component expression.
Ø string LIKE pattern
Ø
is null if
either string or pattern is null
Ø x IN (a, b)
Ø
is equivalent
to x = a OR x = b. The result will be unknown unless one of the conditions is
true. If all are false and at least one unknown then the overall result is
unknown.
Ø All rows that evaluate to false or unknown are eliminated from the WHERE and HAVING clauses.
Ø All nulls are considered equal for purposes of ORDER BY, however whether they are greater or less than all non-null values is implementation defined.
Ø An integrity constraint is an expression that must not evaluate to false. Hence it is not considered to be violated when it evaluates to unknown (formally it is not known if the constraint is violated).
Ø
For the WHERE
clause, unknown = false. For domain and check constraints, unknown = true.
Ø For FK constraints unknown is generally interpreted as false.
Ø PK constraint is defined as including NOT NULL.
DESIGN
TO AVOID NULLS |