The 2 fundamental data structures in computer science

  • Data structure:

    • Variables that store information used by algorithms

  • Ideal data structure provides:

    • Fast insertion (when adding a new item to data structure)
    • Fast deletion (when removing an item from the data structure)
    • Fast lookup
    • Minimum memory usage

    Unfortunately:   There is no ideal data structure


  • There are 2 fundamental data structures in Computer Science:

    • Array
    • Linked list

    All other data structures are based on the concepts of these two data structures

Review: the array

  • Array = a series of variables that:

    • Are of the same data type

    • Are stored in consecutive memory locations

  • The memory used to store an array must be allocated "up front"


  • Strength of an array:

    • Fast access using an array index (e.g.: x[i])

      (Because the memory address of x[i] can be computed easily)
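
The address computation can be illustrated with a small model. This is only a sketch: Java does not expose memory addresses, so the base address 8000 and the helper addressOf below are hypothetical, chosen to show the arithmetic (address of x[i] = base + i * element size).

```java
class ArrayAddressDemo {
    // Simplified model (an assumption, not a real Java API):
    // address of x[i] = address of x[0] + i * (size of one element)
    static long addressOf(long baseAddress, int elementSize, int index) {
        return baseAddress + (long) index * elementSize;
    }

    public static void main(String[] args) {
        long base = 8000;      // hypothetical address of x[0]
        int elementSize = 4;   // a Java int occupies 4 bytes
        for (int i = 0; i < 4; i++) {
            System.out.println("x[" + i + "] is at address "
                               + addressOf(base, elementSize, i));
        }
    }
}
```

One multiplication and one addition locate any element, no matter how large the index is.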


  • Weaknesses of an array --- cannot increase the array size

    • To increase the array size, we must use the memory cells that follow the last element of the array

      However:   these memory cells may not be available (used by another variable)
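
Because an array cannot be extended in place, the usual workaround is to allocate a new, larger array and copy the old values into it. The sketch below shows that workaround with the standard Arrays.copyOf method; the class name ArrayGrowDemo and the helper grow are illustrative names only.

```java
import java.util.Arrays;

class ArrayGrowDemo {
    // Returns a NEW, larger array holding the old values.
    // The old array itself is left unchanged.
    static int[] grow(int[] old, int newLength) {
        return Arrays.copyOf(old, newLength);   // extra slots are filled with 0
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3};
        int[] bigger = grow(x, 5);
        System.out.println(Arrays.toString(bigger));  // [1, 2, 3, 0, 0]
        System.out.println(bigger == x);              // false: a different array object
    }
}
```

Every element is copied, so "growing" an array this way costs time proportional to its length.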


  • Weaknesses of an array --- it takes a long time to:

    • Insert a value in the middle of an array

      (You need to shift many elements over)
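
The shifting can be sketched as follows. The example assumes the array has spare capacity at the end and tracks the number of used slots in a separate count; the names ArrayInsertDemo and insertAt are illustrative.

```java
import java.util.Arrays;

class ArrayInsertDemo {
    // Inserts value at index pos in a[0..count-1] and returns the new count.
    // Every element from pos onward is moved one slot to the right first.
    static int insertAt(int[] a, int count, int pos, int value) {
        if (count >= a.length) {
            throw new IllegalStateException("array is full");
        }
        if (pos < 0 || pos > count) {
            throw new IndexOutOfBoundsException("bad position: " + pos);
        }
        for (int i = count; i > pos; i--) {
            a[i] = a[i - 1];          // shift one element over
        }
        a[pos] = value;
        return count + 1;
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40, 0};   // 4 used slots, 1 spare slot
        int count = insertAt(a, 4, 1, 15);
        System.out.println(Arrays.toString(a));  // [10, 15, 20, 30, 40]
        System.out.println(count);               // 5
    }
}
```

Inserting at index 1 moved three elements; inserting at the front of a large array moves all of them.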

The linked list data structure

  • The linked list and array are complementary to each other.

  • Characteristics of a linked list:

    1. Each list element is allocated individually (and then linked into the list)

        (Memory for all array elements is allocated up front)

    2. It's easy to insert/delete elements in a linked list

        (It's not easy to insert/delete elements in an array)

    3. It is slow to look up an element in a linked list by its index

        (It is fast to look up an element in an array by its index)

  • Comment:

    • Because each list element is allocated individually:

      • There is no need to "shift" list elements

What does a linked list look like

  • A linked list consists of a chain of list objects:
     

  • A list object is often called a "node"

  • Every node consists of 2 parts:

    1. One or more data fields (contain the information stored in the linked list)

    2. A link (reference) variable (contains the reference (= address) of the next node/list element)

      The link in the last node is null (= end of the list)

Programmer perspective of a linked list

  • A Java program will have a reference variable (commonly named head or first) that contains the reference to the first node (= list element)

  • Consequently: only the data stored in the first node is immediately accessible

  • Accessing the data stored in the other nodes:

    1. Data in the 2nd node is accessed through the link in the 1st node

    2. Data in the 3rd node is accessed through the link in the 2nd node

    3. And so on
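
The access pattern above can be sketched in Java. The example assumes a minimal node type with an int field item and a link field next; the class name LinkAccessDemo and the helper buildList are hypothetical and exist only to set up a three-node list.

```java
class LinkAccessDemo {
    // Minimal node: one data field and one link to the next node.
    static class Node {
        int item;
        Node next;
    }

    // Hypothetical helper: builds a list from the given values,
    // linking the nodes from back to front.
    static Node buildList(int... values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            Node n = new Node();
            n.item = values[i];
            n.next = head;
            head = n;
        }
        return head;
    }

    public static void main(String[] args) {
        Node head = buildList(11, 22, 33);
        System.out.println(head.item);            // 1st node: 11
        System.out.println(head.next.item);       // 2nd node, via the link in the 1st: 22
        System.out.println(head.next.next.item);  // 3rd node, via the link in the 2nd: 33
    }
}
```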

Defining a Node class for a linked list

  • Suppose we want to create a linked list where:

    • Each node stores an int (integer)

  • We can define the following Node class for this purpose:

       public class Node
       {
          int     item;  // int data stored in Node
          Node    next;  // Link that references the next node
                         // in the linked list
       }
    

  • Note:

    • The class Node contains a reference variable next that refers to a Node object (an object of the same class)

    • The next variable is used to create a chain of Nodes

    • You can define other data fields depending on what you want to store in a Node
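
As a minimal sketch of how the next variable creates a chain, the example below links three nodes by hand and then walks the chain until it reaches null. Node is nested inside the demo class only to keep the example self-contained; the names NodeChainDemo and describe are illustrative.

```java
class NodeChainDemo {
    // Same fields as the Node class above, nested to keep this example in one file.
    static class Node {
        int item;
        Node next;
    }

    // Follows the next links from head and lists every item.
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node p = head; p != null; p = p.next) {
            sb.append(p.item).append(" -> ");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        Node a = new Node();  a.item = 1;
        Node b = new Node();  b.item = 2;
        Node c = new Node();  c.item = 3;

        a.next = b;   // 1st node links to the 2nd
        b.next = c;   // 2nd node links to the 3rd
                      // c.next stays null: end of the list

        System.out.println(describe(a));   // 1 -> 2 -> 3 -> null
    }
}
```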

How is a linked list stored in computer memory?

  • Consider the following linked list:

    Remember that: memory locations are identified uniquely by their addresses


  • Internally, this linked list is stored in the memory as follows:

    Notice:

    • head contains the address (reference) of the first Node object

    • Each next field contains the address (reference) of the subsequent Node object
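
Java does not show the addresses themselves, but the fact that head and next hold references (not copies) can be observed. The sketch below is one way to see it, using a hypothetical ReferenceDemo class: a change made through one reference is visible through every other reference to the same Node object.

```java
class ReferenceDemo {
    static class Node {
        int item;
        Node next;
    }

    public static void main(String[] args) {
        Node first = new Node();
        first.item = 1;
        Node second = new Node();
        second.item = 2;

        first.next = second;   // stores the reference to the second object
        Node head = first;     // head refers to the same object as first

        System.out.println(head == first);         // true: same object
        System.out.println(head.next == second);   // true: next refers to the second object

        second.item = 99;                          // change the object through one reference
        System.out.println(head.next.item);        // 99: visible through the chain
    }
}
```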

What can Linked Lists do for us?

It's easy to extend (= add a node to) a linked list:

  • Before inserting the node "not":

  • After adding the node "not":

    Because each node is created on demand and linked into the list
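
Extending a list can be sketched as below. The example assumes nodes that store a String, and the words in the list are illustrative; append is a hypothetical helper that creates one new node and links it after the current last node.

```java
class AppendDemo {
    // Node variant that stores a String (an assumption made for this example).
    static class Node {
        String item;
        Node next;
    }

    // Creates a new node on demand and links it to the end of the list.
    // Returns the head, which changes only when the list was empty.
    static Node append(Node head, String value) {
        Node n = new Node();
        n.item = value;            // n.next is null: n will be the last node
        if (head == null) {
            return n;
        }
        Node last = head;
        while (last.next != null) {
            last = last.next;      // walk to the current last node
        }
        last.next = n;
        return head;
    }

    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node p = head; p != null; p = p.next) {
            sb.append(p.item).append(" -> ");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        Node head = null;
        head = append(head, "to");
        head = append(head, "be");
        head = append(head, "or");
        System.out.println(describe(head));   // to -> be -> or -> null
        head = append(head, "not");
        System.out.println(describe(head));   // to -> be -> or -> not -> null
    }
}
```

No existing node is moved or copied; only the link in the old last node changes.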


It's easy to insert a node at any position in linked list:

  • Before inserting the node "be":

  • After inserting the node "be":

    Because each node is created on demand and linked into the list
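
Inserting in the middle comes down to changing two links. The sketch below assumes String nodes and illustrative words; insertAfter is a hypothetical helper that splices a new node in directly after a given node.

```java
class InsertDemo {
    // Node variant that stores a String (an assumption made for this example).
    static class Node {
        String item;
        Node next;
    }

    // Splices a new node in directly after prev.
    static void insertAfter(Node prev, String value) {
        Node n = new Node();
        n.item = value;
        n.next = prev.next;   // new node points at prev's old successor
        prev.next = n;        // prev now points at the new node
    }

    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node p = head; p != null; p = p.next) {
            sb.append(p.item).append(" -> ");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        Node head = new Node();
        head.item = "to";
        insertAfter(head, "not");   // to -> not
        insertAfter(head, "or");    // to -> or -> not
        System.out.println(describe(head));   // to -> or -> not -> null

        insertAfter(head, "be");    // splice "be" in after "to"
        System.out.println(describe(head));   // to -> be -> or -> not -> null
    }
}
```

The order of the two assignments matters: setting prev.next first would lose the reference to the rest of the list.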

Array vs Linked List

  • Array:

    • Array elements are stored contiguously in memory

    • All array elements are allocated at once

    • Once allocated, the number of elements in the array is fixed

    • Only stores data fields; does not need to store non-data fields

    • Accessing the kth element in an array is fast

    • Inserting a value in the middle of an array is difficult (need to shift elements over)

  • Linked list:

    • List elements (= "nodes") do not need to be stored contiguously in memory

    • Nodes can be allocated piecemeal when needed

    • We can increase the number of elements in a list easily by increasing the length of the chain

    • Requires the use of a linking field (next) to create a chain

    • Need to traverse the chain to reach the kth element --- slow

    • Inserting a Node in the middle of a linked list is easy (splice it in...)
     

Arrays and linked lists complement each other in their strengths and weaknesses.
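
The difference in reaching the kth element can be sketched as follows. The helper get is hypothetical: it follows k links from head, whereas the array version is a single index operation.

```java
class KthElementDemo {
    static class Node {
        int item;
        Node next;
    }

    // Returns the item in the kth node (k = 0 is the first node).
    // The chain must be followed link by link, so this takes k steps.
    static int get(Node head, int k) {
        if (k < 0) {
            throw new IndexOutOfBoundsException("index: " + k);
        }
        Node p = head;
        for (int i = 0; i < k && p != null; i++) {
            p = p.next;
        }
        if (p == null) {
            throw new IndexOutOfBoundsException("index: " + k);
        }
        return p.item;
    }

    public static void main(String[] args) {
        int[] x = {10, 20, 30};

        // Build the equivalent linked list 10 -> 20 -> 30, back to front.
        Node head = null;
        for (int i = x.length - 1; i >= 0; i--) {
            Node n = new Node();
            n.item = x[i];
            n.next = head;
            head = n;
        }

        System.out.println(x[2]);          // array: one address computation
        System.out.println(get(head, 2));  // list: follows 2 links to reach the same value
    }
}
```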