CS554 Syllabus & Progress


CS554 - Advanced Database Systems
Syllabus and Progress

  1. Introduction



  2. Review of undergraduate database material (my CS377 class notes is here: click here )

    • Review of Relational Algebra:
      • Relations: click here   ------  
      • The Relational (Data) Model click here   ------  

      • Overview of Relational Algebra

      • Set functions:
        • The Aggregate (= set) Functions: click here   ------  
        • Forming sub groups and applying set functions on sub groups: click here   ------  
        • Using groups and set functions in relational algebra queries: click here   ------  


      • Bags, sets and the δ operator: click here   ------  


        Homework 1: click here



    • Review of SQL:
      • Accessing the SQL database software:

      • Intro to SQL: click here   ------  

      • Intro to the SELECT command

      • Qualifying attributes and aliasing: click here   ------  

      • The tuple conditions that can be used in the where clause: click here   ------  


      • Nested queries:

      • Correlated nested queries:

      • Useful trick to get other attributes using a set of foreign keys in a nested query: click here   ------  


      • Set functions:
        • Using set function in the SELECT clause: click here   ------  
        • Using set function in the WHERE clause: click here   ------  
        • Example queries using set functions: click here   ------  

      • Forming groups based on grouping attributes and conditions on a group:
        • Forming groups that have common attribute values:
        • Conditions on a group:

      • Virtual table (relation) or view: click here   ------  


        Homework 2: click here





    Accessing data stored on disk                      

  3. Secondary Storage (Disks) management





  4. Indexing





    Query processing                      

  5. Cost and constraint on query execution (= processing a physical query plan)




  6. One-pass Algorithms for Query execution (processing a physical query plan)




  7. The nested-loop Algorithms for Cartesian Product and Join




  8. 2-pass Algorithms for Query execution that are based on hashing

    • Intro to two-pass algorithms: click here (+ slides)

    • Introduction to 2-pass hashing-based algorithms: click here (+ slides)

    • Unary operators:
      • Selection σ: click here     ***
      • Projection π: click here    ***
      • Duplicate elimination δ: click here (+ slides)     +++ 1st presentation
        • Partition R using hashing
        • Process each partition using the one-pass algorithm
      • Grouping γ: click here (+ slides)     +++ 2nd presentation
        • Partition R using hashing
        • Process each partition using the one-pass algorithm


    • Binary operators:
      • Union ∪
        • Bag union: click here    ***
        • Set union: click here (+ slides)     +++ 1st presentation
          • Partition R and S using hashing
          • Process each partition using the one-pass algorithm
      • Intersection ∩
        • Bag intersection: click here (+ slides)     +++ 2nd presentation
          • Partition R and S using hashing
          • Process each partition using the one-pass algorithm
        • Set intersection: click here (+ slides)     +++ 3rd presentation
          • Partition R and S using hashing
          • Process each partition using the one-pass algorithm
      • Difference −
        • Bag difference: --- same procedure as in the +++ presentation
          • Partition R and S using hashing
          • Process each partition using the one-pass algorithm
        • Set difference: --- same procedure as in the +++ presentation
          • Partition R and S using hashing
          • Process each partition using the one-pass algorithm

      • Product (cartesian product) × click here (+ slides)    ***

      • Join ⋈: click here (+ slides) --- same procedure as in the +++ presentation
        • Partition R and S using hashing
        • Process each partition using the one-pass algorithm

    • Summary: click here




  9. 2-pass Algorithms for Query execution that are based on (TPMMS) sorting





  10. Multi-pass algorithms




  11. Algorithms that are based on indexing:





    Query optimization                      

  12. Overview: query optimization: click here (+ slides)




  13. Parsing and pre-processing




  14. Converting a Parse Tree into an initial logical query plan (tree)





  15. Algebraic Laws used to transform/optimize logical query plans




  16. Heuristic-based approach to finding the optimal logical query plan




  17. Prelude to cost-estimation-based approach to finding the optimal join ordering: cost estimation


    Homework 6: click here



  18. Cost-estimation-based approach to find the optimal join ordering




  19. A greedy heuristic to find the best left-deep join ordering:




  20. The physical query plan


    Homework 7: click here






    Ensuring database consistency                      

  21. Recoverability: protecting database from system failure (logging)




  22. Serializability: correctness of concurrent execution of transactions




  23. Serializability and Recoverability






  24. Deadlock





    High-performance (parallel and distributed) database systems                      

  25. Parallel Data Processing Algorithms




  26. The Map-Reduce Parallelism framework




  27. Distributed Databases: query processing




  28. Commiting Distributed Transactions




  29. "Big Data" systems (and NOSQL)