SQL - DQL

Data Query Language

This section uses JavaScript to display math formulas. If you keep it disabled, you will most likely only see garbage where formulas should be.

The following section handles the most complex part of SQL: issuing queries to a database table.

When you have created tables in your database, you routinely might want to access the stored information. Assume the following created tables:

      Supplier (sId: integer {PK}, name: char(40), city: char(30))
      Part (pId: integer {PK}, name: char(50), color: char(10))
      SP (sId: integer {->Supplier}, pId: integer {->Part}, qty: integer, price: numeric(4,2); {PK: (sId,pId)})

Most of the above columns are already introduced. The qty and price specify how much of a part a supplier has in stock and for what unit price.

For the viewing of data, you use the select ... statement. As there are many variations and additions to this statement, I assembled a list of possible uses, ordered by ascending complexity:

Projection: $\pi_{a,b,...}(R)$
This will select rows $a$, $b$, etc. from the given relation (set of tuples) $R$. In SQL, this is realized by the from clause:

        select name,city from Supplier;
        
        Result:
        name      city
        -----     -----
        Smith     London
        Borgaiev  Budapest
        ...       ...

Selection $\sigma_{e}(R)$
The where clause will check against a given expression $e$ and only select matching rows:

      select name from Part
        where color='red';
      
      name
      -----
      M1714-A
      M2450-B
      ...

Rename $\rho_{(b1,b2,...)}(R)$
This operation will rename the respective values $a1,a2,...$ of the relation to the given ones:

      select sId as sup, pId as part from SP;
      
      sup     part
      -----   -----
      Smith   M1221-A
      Wells   M1714-A
      ...     ...

Cross Join $R_1 \times R_2$
Much like the mathematical cross product, the cross join statement will produce a table where every possible combination of values from $R_1$ and $R_2$ is present:
```
      select S.name as sname, S.city as city, P.name as pname from Supplier S cross join Part P;
      
      sname   city        pname
      -----   -----       ------
      Smith   London      M1221-A
      Wells   San Diego   M1221-A
      ...     ...         ...
    
```
In the above example, we used an alias for the tables Supplier (alias S) and Part (alias P) to save time on writing and make the statement more readable.

Most of the time, the cross join is not the desired method, as it joins unrelated data together.
Natural Join $R_1 \bowtie R_2$
The natural join is a very useful statement: It joins two tables together by matching attribute names. This way you can acquire data from many tables in one single result table. For example, show all supplier names and the pId's they have in stock:
```
      select name, pId from Supplier natural join SP;
      
      name    pId
      -----   -----
      Smith   1
      Smith   4
      Smith   5
      ...     ...
    
```
This works because both the Supplier and the SP table have a sId attribute. This is used to join the tables together naturally.

You could also write inner join SP using (sId) instead of natural join SP, if you want to be more clear.
Equi-Join $R_1 \bowtie_\theta R_2$
This extended variant of the natural join checks if a given expression is true, and then joins matching rows together. This is needed because sometimes you might not have a convenient setup for natural joins. The following join TABLE on ... clause is equivalent to a natural join:
```
      select A.name, B.qty from Part A join SP B on A.pId=B.pid;
      
      A.name    B.qty
      -----     -----
      M1221-A   120
      M1221-A   230
      ...       ...
    
```
As you can see, there are multiple entries for a part if it has multiple entries in the SP table. This is the correct behaviour.

Another note: The join clause is an abbreviation for inner join. When doing an inner join, only fully matching rows are assembled into the result set. There are different join methods: left outer join, right outer join and full outer join. These will include rows from the left/right/left and right table into the result set even if there was no respective matching row in the other table. Unknown values will be set to null.
This could be useful if there is an optional piece of information a user can supply (e.g. an email address) and you want to see all users and their email address, but including users who have not given their email address:
```
      select username, email from User left outer join ContactInfo using (uId);
      
      username    email
      -----       -----
      john11      john.cusack@airmail.cc
      pyrax       NULL
      fredg       frederic@gonzo.fr
      ...         ...
    
```
Grouping and Aggregation $\gamma_{(a,b,...)}(R)$
By grouping a table according to a given column, all rows with matching values in this column will be put into a single row. When using an aggregate function, the given row is reduced in some way so that it can be grouped properly. For example, to show the total stock for each supplier:
```
      select sId, name, sum(qty) as totalQty from Supplier natural join SP
        group by sId;
      
      sId     name      totalQty
      -----   -----     -----
      1       Smith     23760
      3       Wells     75420
      ...     ...       ...
    
```
sum(...) returns the sum of the argument over all input rows. It is a good idea to group by a column that has to be present, like the sId of a Supplier.

There are many useful aggregate functions: avg, count, sum, max, min, round(NUM, DGT).

Sorting $\tau_{(a,b,...)}(R)$
This will sort a table by the given attributes:

      select S.name as sname, qty, P.name as pname from Supplier S natural join SP natural join Part P
        where qty > 100
        order by sname asc;
        
      sname     qty     pname
      -----     -----   -----
      Antonov   320     M1440-A
      Antonov   140     M2210-B
      ...
      Grant     510     M1430-B
      ...
      Smith     250     M1221-A
      ...

Let's investigate some advanced examples:

Give the name and qty of all parts which have a value of at least 2000.00, sorted by descending qty.
```
      select P.name as pname, qty from SP natural join Part P
        where (select sum(qty*price) as value from Part P1 
          group by pId
          having P1.pId=p.pId) > 2000.00
        order by qty desc;
    
```
Two major new things are introduced here. You can use subqueries almost anywhere and use them like actual tables. The having clause evaluates a grouped column and only puts it in the result set if the expression is true. In the example, a subquery should find the value of the current part; then the outer query checks if the value is higher than 2000.
Give the name, city and price of the most expensive parts.
```
      select P.name as pname, city, price from Part P natural join SP natural join Supplier
        where price = (select max(price) from SP);
    
```
All rows except are discarded, except the one with the most expensive price. If many parts have this price, all of them will appear in the result set.
Give the name and the number of suppliers of each part, ordered by the descending number of suppliers.
```
      select * from ( 
        select name as pname, count(SP.pId) as num_suppliers from Part P left outer join SP using (P.pId)
          group by P.pId
        union
        select name as pname, 0 as num_suppliers from Part P
          where pId not in (select pId from SP)
      ) T
        order by T.num_suppliers desc;
    
```
The UNION clause acts like the mathematical union: it will merge two tables together. To do this, the column names and types must match. Finally, the resulting table is wrapped in a FROM clause, so that it can be sorted by the num_suppliers attribute. The union allows you to include parts with no supplier: those parts will get a num_suppliers of 0.

There are two other set operations available: INTERSECT, which will drop entries that do not exist in both tables, and EXCEPT, which will remove the corresponding second table's entries from the first table.

More examples are coming... soon...