The SQL GROUP BY clause  

  • The SQL construct to form groups of tuples based on common attribute values is:

     SELECT    ...
     FROM      relation-list
     WHERE     tuple-boolean-condition      
     GROUP BY  grouping-attributes
    

    Effect:

    • Categorize (= group) the tuples selected by the basic SQL clause into groups based on:

      • common values in the grouping-attributes values

  Restriction posed by the GROUP BY clause on the SELECT attricutes  

  • The SELECT attributes are restricted to:

    1. A subset of the GROUP BY attributes

    2. A set function expression

  • I.e.:

       SELECT   subset of {A, B, C, D}, setFunc1(..), setFunc2(..), ...    
       FROM     ...
       WHERE    ...
       GROUP BY A, B, C, D
    

    In other words:

    • You cannot select an attribute that is not a grouping attribute

  Review: applying set functions on sub-groups  

  • Query:

    • Find the average salary for each department in the company

  Example SELECT command with a GROUP BY clause and a set function  

  • Suppose the Employee relation is:

            fname  lname    sex    dno         salary    
            ------ -------- ------ ----------- --------- 
            James  Borg     M      1           55000.00  
            Alicia Zelaya   F      4           25000.00  
            Jennif Wallace  F      4           43000.00  
            Ahmad  Jabbar   M      4           25000.00  
            John   Smith    M      5           30000.00  
            Frankl Wong     M      5           40000.00  
            Ramesh Narayan  M      5           38000.00  
            Joyce  English  F      5           25000.00       
    

  • Example SQL query with a GROUP BY clause and its effect:

        SELECT    dno, avg(salary)           +-----+--------------+
        FROM      employee                   | dno | avg(salary)  |
        GROUP BY  dno                        +-----+--------------+
                                             |   1 | 55000.000000 |
    					 |   4 | 31000.000000 |
    					 |   5 | 33250.000000 |
    					 +-----+--------------+ 
    

  Forming groups using multiple grouping attributes  

  • Suppose the Employee relation is: (re-arranged by common dno and sex values)

         +--------+---------+-----+------+----------+             
         | fname  | lname   | dno | sex  | salary   |
         +--------+---------+-----+------+----------+
         | James  | Borg    |   1 | M    | 55000.00 |
         | Alicia | Zelaya  |   4 | F    | 25000.00 |
         | Jennif | Wallace |   4 | F    | 43000.00 |
         | Ahmad  | Jabbar  |   4 | M    | 25000.00 |
         | Joyce  | English |   5 | F    | 25000.00 |
         | John   | Smith   |   5 | M    | 30000.00 |
         | Frankl | Wong    |   5 | M    | 40000.00 |
         | Ramesh | Narayan |   5 | M    | 38000.00 |
         +--------+---------+-----+------+----------+
    

  • Sample GROUP BY query using the 2 GROUPING attributes dno and sex:

        SELECT    dno, sex, sum(salary)    +-----+------+-------------+    
        FROM      employee                 | dno | sex  | sum(salary) |
        GROUP BY  dno, sex                 +-----+------+-------------+
                                           |   1 | M    |    55000.00 |
          				       |   4 | F    |    68000.00 |
     Note:				       |   4 | M    |    25000.00 |
                                	       |   5 | F    |    25000.00 |
       Group (1, F) is not in output !     |   5 | M    |   108000.00 |
    				       +-----+------+-------------+