Sets and bags

  • Set:

    • Set = a collection of elements that does not contain duplicates

    Example:

       S = { 1, 2, 3 }         // A set
    


  • Bag:

    • Bag = a collection of elements that can contain duplicate elements

    Example:

       B = { 1, 2, 2, 3 }    // A bag
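
    The difference can be sketched in Python. This is only an illustration:
    the built-in set plays the role of a set, and collections.Counter is
    used here as one possible stand-in for a bag (Python has no built-in
    bag type).

```python
from collections import Counter

# A set keeps at most one copy of each element,
# so writing 2 twice has no effect.
S = {1, 2, 2, 3}
print(S)

# A bag (multiset) remembers how many times each element occurs.
# Counter maps each element to its multiplicity.
B = Counter([1, 2, 2, 3])
print(B[2])                   # multiplicity of element 2
print(sorted(B.elements()))   # all elements, duplicates included
```

    The set ends up with 3 elements, while the bag still holds 4.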
    

Relations in (1) Relational Algebra vs. in (2) database systems

  • A relation in Relational Algebra (Math) is a set of tuples

    • A set does not have duplicates


  • A relation in relational database systems is a bag of tuples

    Because:

    • Detecting/removing duplicate tuples takes extra work (the system typically has to sort or hash the tuples)

    Therefore:

    • No extra work is performed to check/remove duplicates by default

    • Users can request the database system to remove duplicates using the "duplicate elimination" operation
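
    A small sketch of this default behavior, using Python's sqlite3 module
    with an in-memory database. The table employee and its columns are
    hypothetical names chosen for the illustration. Every stored row is
    different, yet the query result contains a duplicate, because the
    system does not check for duplicates unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical example table: no two rows are identical.
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "Sales"), ("Cid", "IT")],
)

# Selecting only the dept column: the result is a bag,
# so "Sales" shows up once per matching row.
depts = [row[0] for row in conn.execute("SELECT dept FROM employee")]
print(sorted(depts))

conn.close()
```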

The duplicate elimination operator δ

  •  δ removes duplicates

    • δ(B) = the set of tuples in (bag) B with duplicates removed

    Example:

    • δ({1, 2, 2, 3})   =   {1, 2, 3}
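
    A minimal sketch of δ in Python, assuming the bag is given as a list.
    The function name delta is made up for this illustration. It uses a
    hash set to remember which tuples were already seen, which is one
    common way to find duplicates (sorting is another).

```python
def delta(bag):
    """Return the elements of `bag` with duplicates removed.

    The first occurrence of each element is kept, in its original order.
    Elements must be hashable (e.g. numbers or tuples).
    """
    seen = set()      # elements already emitted
    result = []
    for t in bag:
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result


B = [1, 2, 2, 3]      # the bag from the example
print(delta(B))       # prints [1, 2, 3]
```

    The extra set and the membership test on every element are the
    overhead that a database system avoids by not removing duplicates
    unless requested.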


  • Note: in SQL, the duplicate elimination operator is:

      SELECT  DISTINCT  .....

    (Some systems, e.g. Oracle, also accept UNIQUE as a synonym for DISTINCT.)
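
    The effect can be tried out with Python's sqlite3 module. This is a
    sketch only: the table B and its column x are hypothetical names that
    mirror the bag {1, 2, 2, 3} used above. SQLite follows standard SQL
    here and spells the keyword DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table holding the bag {1, 2, 2, 3}.
conn.execute("CREATE TABLE B (x INTEGER)")
conn.executemany("INSERT INTO B VALUES (?)", [(1,), (2,), (2,), (3,)])

# Plain SELECT: duplicates are kept (bag semantics).
bag_rows = [r[0] for r in conn.execute("SELECT x FROM B")]

# SELECT DISTINCT: duplicates are eliminated (the δ operator).
set_rows = [r[0] for r in conn.execute("SELECT DISTINCT x FROM B")]

# Sorted only to make the printout stable; SQL does not
# guarantee a row order without ORDER BY.
print(sorted(bag_rows))
print(sorted(set_rows))

conn.close()
```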