Let us denote this tuple as (v, 1, Δ)
Replace:

   ..... (v_i, g_i, Δ_i), (v_{i+1}, g_{i+1}, Δ_{i+1}) ....

by:

   ..... (v_{i+1}, g_i + g_{i+1}, Δ_{i+1}) ....
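The replacement rule above can be sketched in Java as follows. This is only a minimal sketch: the class name `MergeSketch`, the method name `deleteTuple`, and the `int[] {v, g, Δ}` tuple layout are illustrative assumptions, not part of the original notes.

```java
import java.util.ArrayList;
import java.util.List;

class MergeSketch {
    // Each tuple is stored as {v, g, delta}; this layout is an assumption.
    // Deletes tuple i by folding its g value into its right neighbour i+1.
    static void deleteTuple(List<int[]> s, int i) {
        int[] cur = s.get(i);
        int[] next = s.get(i + 1);
        next[1] += cur[1];   // g_{i+1} becomes g_i + g_{i+1}
        s.remove(i);         // v_{i+1} and delta_{i+1} stay unchanged
    }

    public static void main(String[] args) {
        List<int[]> s = new ArrayList<>();
        s.add(new int[] {11, 1, 0});
        s.add(new int[] {12, 1, 0});
        deleteTuple(s, 0);
        // The summary now holds the single tuple (12, 2, 0).
        System.out.println(s.get(0)[0] + " " + s.get(0)[1] + " " + s.get(0)[2]);
    }
}
```

Only the g value of the surviving tuple changes, so the total of all g values (the number of items seen) is preserved by the merge.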
Example:
   Input stream:   ... 5 9 6 ....

   Insertion into Summary:

   Summary:   ... (5, 1, ?), (6, 1, ?), (9, 1, ?) ....
So ideally, you want entries with large g...
the value of g cannot grow arbitrarily large...
   2εn = 14

                  Δ
   -------------------------------
   Band(0):   14
   Band(1):   12 13
   Band(2):   11 10 9 8
   Band(3):   7 6 5 4 3 2 1 0
Hence, a larger value for the band is more desirable
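   int findband(int Δ, int p)
   {
      int diff;

      diff = p - Δ + 1;

      if ( diff == 1 )
      {
         return(0);
      }
      else
      {
         // band = floor( log2(diff) ), computed with integer arithmetic
         // to avoid rounding errors in Math.log(diff)/Math.log(2)
         return( 31 - Integer.numberOfLeadingZeros(diff) );
      }
   }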
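As a check, the band table for 2εn = 14 can be regenerated from the band function. The sketch below is self-contained and assumes an integer floor-log2 in place of the floating-point logarithm; the class name `BandSketch` is hypothetical.

```java
class BandSketch {
    // Band of a tuple with the given delta when p = 2*eps*n.
    // Assumes 0 <= delta <= p, so that diff >= 1.
    static int findBand(int delta, int p) {
        int diff = p - delta + 1;
        // floor(log2(diff)) for a positive int
        return 31 - Integer.numberOfLeadingZeros(diff);
    }

    public static void main(String[] args) {
        int p = 14;
        // Print the band of every possible delta, largest delta first.
        for (int delta = p; delta >= 0; delta--) {
            System.out.println("delta = " + delta + "  band = " + findBand(delta, p));
        }
    }
}
```

For p = 14 this assigns Δ = 14 to band 0, Δ = 13 and 12 to band 1, Δ = 11 down to 8 to band 2, and Δ = 7 down to 0 to band 3, which agrees with the table.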
A group of tuples is defined using a tree-structure imposed upon the values of their band
   Given: any sequence of integers

   Example:   0 1 3 0 2 1 3 0 1 2 3
Lift up the largest value:
       3       3       3
   0 1   0 2 1   0 1 2
Lift up the next largest value:
       3       3       3
           2         2
   0 1   0   1   0 1
Lift up the next largest value:
       3       3       3
           2         2
     1       1     1
   0     0       0
Tree: parent(i) = first number to your right that is > i
   root
    |
    +-- 3
    |   +-- 1
    |       +-- 0
    +-- 3
    |   +-- 2
    |   |   +-- 0
    |   +-- 1
    +-- 3
        +-- 2
            +-- 1
                +-- 0
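The parent rule can be sketched in Java as follows. This is a minimal illustration on the example sequence; the class name `BandTreeSketch`, the method name `parents`, and the use of -1 to mark children of the root are assumptions made here.

```java
import java.util.Arrays;

class BandTreeSketch {
    // parent[i] = index of the first element to the right of i that is
    // strictly larger than seq[i]; -1 means the element hangs off the root.
    static int[] parents(int[] seq) {
        int[] parent = new int[seq.length];
        for (int i = 0; i < seq.length; i++) {
            parent[i] = -1;
            for (int j = i + 1; j < seq.length; j++) {
                if (seq[j] > seq[i]) {
                    parent[i] = j;
                    break;
                }
            }
        }
        return parent;
    }

    public static void main(String[] args) {
        int[] seq = {0, 1, 3, 0, 2, 1, 3, 0, 1, 2, 3};
        System.out.println(Arrays.toString(parents(seq)));
    }
}
```

With 0-based indices, the three 3's (indices 2, 6 and 10) get parent -1, so they are the children of the root, as in the tree drawn above. The quadratic scan is kept for clarity; a stack-based pass from right to left would compute the same array in linear time.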
   ε is the error margin (a parameter of the algorithm)

   S = {};     // S contains the summary structure, which is:
               //     <(v_0, g_0, Δ_0), (v_1, g_1, Δ_1) ... >
               // NOTE: S is an ordered list !!!

   N = 0;      // Number of items processed

   while ( not EOS )
   {
      /* ---------------------------------------------
         Delete phase: executed once every 1/(2×ε) insertions
         --------------------------------------------- */
      if ( N % [1/(2×ε)] == 0 )
      {
         COMPRESS();      // <-------- Delete some entries in summary
      }

      /* ------------------------------------
         Insert phase
         ------------------------------------ */
      v = next value in input

      /* --------------------------------------------
         Find insert position for v in S
         -------------------------------------------- */
      Find a tuple (v_i, g_i, Δ_i) ∈ S such that:   v_{i-1} ≤ v < v_i

      if ( v is inserted at the head or tail of S )
         Δ = 0;
      else
         Δ = g_i + Δ_i - 1;

      INSERT "(v, 1, Δ)" into S between v_{i-1} and v_i;

      N++;
   }
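The insert phase of the pseudocode can be sketched in Java as follows. This is a minimal sketch that leaves out the delete phase; the class name `InsertSketch`, the method name `insert`, and the `int[] {v, g, Δ}` tuple layout are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InsertSketch {
    // Inserts v into the ordered summary s as the tuple (v, 1, delta).
    // Each tuple is stored as {v, g, delta}; this layout is an assumption.
    static void insert(List<int[]> s, int v) {
        // Find the first tuple i with v < v_i, so that v_{i-1} <= v < v_i.
        int i = 0;
        while (i < s.size() && s.get(i)[0] <= v) {
            i++;
        }
        int delta;
        if (i == 0 || i == s.size()) {
            delta = 0;                              // head or tail of S
        } else {
            delta = s.get(i)[1] + s.get(i)[2] - 1;  // g_i + delta_i - 1
        }
        s.add(i, new int[] {v, 1, delta});
    }

    public static void main(String[] args) {
        List<int[]> s = new ArrayList<>();
        for (int v : new int[] {12, 10, 11, 10}) {
            insert(s, v);
        }
        for (int[] t : s) {
            System.out.print("(" + t[0] + "," + t[1] + "," + t[2] + ") ");
        }
        System.out.println();
    }
}
```

A linear scan is used to find the insert position for readability; since S is ordered, a binary search would also work.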
COMPRESS()
   Input:  S = {};   // S contains the summary structure, which is:
                     //     <(v_0, g_0, Δ_0), (v_1, g_1, Δ_1) ... (v_{s-1}, g_{s-1}, Δ_{s-1})>
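The body of COMPRESS() is not shown in these notes. The Java sketch below is reconstructed from the two tests applied in the worked example (band of the left tuple ≤ band of the right tuple, and g_i + g_{i+1} + Δ_{i+1} < 2εN). It is an assumption, not the author's listing: it merges one tuple at a time rather than a whole subtree, it never deletes the first tuple, and the names `CompressSketch`, `compress` and the parameter `p` (standing for 2εN) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CompressSketch {
    // Band of a tuple: floor(log2(p - delta + 1)); assumes delta <= p.
    static int findBand(int delta, int p) {
        return 31 - Integer.numberOfLeadingZeros(p - delta + 1);
    }

    // One right-to-left pass over neighbouring tuples {v, g, delta}.
    // p stands for 2*eps*N.
    static void compress(List<int[]> s, int p) {
        for (int i = s.size() - 2; i >= 1; i--) {   // i >= 1: first tuple is kept
            int[] cur = s.get(i);
            int[] next = s.get(i + 1);
            boolean bandOk = findBand(cur[2], p) <= findBand(next[2], p);
            boolean fits = cur[1] + next[1] + next[2] < p;
            if (bandOk && fits) {
                next[1] += cur[1];   // fold g_i into g_{i+1}
                s.remove(i);         // delete tuple i
            }
        }
    }

    public static void main(String[] args) {
        int[][] init = {{1, 1, 0}, {10, 1, 0}, {10, 1, 0}, {10, 1, 0}, {11, 1, 0}, {12, 1, 0}};
        List<int[]> s = new ArrayList<>();
        for (int[] t : init) {
            s.add(t);
        }
        compress(s, 3);
        for (int[] t : s) {
            System.out.print("(" + t[0] + "," + t[1] + "," + t[2] + ") ");
        }
        System.out.println();
    }
}
```

Run on the six-tuple summary with 2εN = 3, this sketch leaves (1,1,0) (10,1,0) (10,2,0) (12,2,0).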
The lemmas make use of the way that the tuples are organized in bands.
Hence, one period is equal to 1/(2 ε) = 2 items
The COMPRESS() method is invoked once after every 2 items received
   Input stream:   12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3

   Insert 12:

      S = (12, 1, 0)

   Insert 10:

      Items seen (sorted):   10 12

      S = (10, 1, 0) (12, 1, 0)
   COMPRESS:   2εN = 1

   According to the band program (more on bands later):

                    Δ
                 -------
      band(0) =     1
      band(1) =     0

      S =  (10, 1, 0)   (12, 1, 0)
            band(1)      band(1)

      Testing: (10, 1, 0), (12, 1, 0)
         Band: 1 ≤ 1        ==> TRUE
         1 + 1 + 0 < 1      ==> FALSE     cannot delete (10, 1, 0)

      DONE
   Insert 11:

      Items seen (sorted):   10 11 12

      S = (10, 1, 0) (11, 1, 0) (12, 1, 0)

      Δ = 1 + 0 - 1 = 0

   Insert 10:

      Items seen (sorted):   10 10 11 12

      S = (10, 1, 0) (10, 1, 0) (11, 1, 0) (12, 1, 0)

      Δ = 1 + 0 - 1 = 0
   COMPRESS:   2εN = 2

   According to the band program (more on bands later):

                    Δ
                 -------
      band(0) =     2
      band(1) =     1 0

      S =  (10, 1, 0)   (10, 1, 0)   (11, 1, 0)   (12, 1, 0)
            band(1)      band(1)      band(1)      band(1)

      Testing: (11, 1, 0), (12, 1, 0)
         Band: 1 ≤ 1        ==> TRUE
         1 + 1 + 0 < 2      ==> FALSE     cannot delete (11, 1, 0)

      Testing: (10, 1, 0), (11, 1, 0)
         Band: 1 ≤ 1        ==> TRUE
         1 + 1 + 0 < 2      ==> FALSE     cannot delete (10, 1, 0)

      Testing: (10, 1, 0), (10, 1, 0)
         Band: 1 ≤ 1        ==> TRUE
         1 + 1 + 0 < 2      ==> FALSE     cannot delete (10, 1, 0)

      DONE
   Insert 1:

      Items seen (sorted):   1 10 10 11 12

      S = (1,1,0) (10,1,0) (10,1,0) (11,1,0) (12,1,0)

      (Δ = 0 because (1, 1, 0) is inserted at the head of S)

   Insert 10:

      Items seen (sorted):   1 10 10 10 11 12

      S = (1,1,0) (10,1,0) (10,1,0) (10,1,0) (11,1,0) (12,1,0)

      Δ = 1 + 0 - 1 = 0
   COMPRESS:   2εN = 3 !!!!

   According to the band program (more on bands later):

                    Δ
                 -------
      band(0) =     3
      band(1) =     1 2
      band(2) =     0

      S =  (1,1,0)  (10,1,0)  (10,1,0)  (10,1,0)  (11,1,0)  (12,1,0)
           band(2)   band(2)   band(2)   band(2)   band(2)   band(2)

      Testing: (11,1,0) (12,1,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 1 + 0 < 3      ==> TRUE      DELETE subtree (11, 1, 0):
                                          replace (11,1,0) (12,1,0) by (12,2,0)

      S =  (1,1,0)  (10,1,0)  (10,1,0)  (10,1,0)  (12,2,0)
           band(2)   band(2)   band(2)   band(2)   band(2)

      Testing: (10,1,0) (12,2,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 2 + 0 < 3      ==> FALSE     cannot delete (10, 1, 0)

      Testing: (10,1,0) (10,1,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 1 + 0 < 3      ==> TRUE      DELETE (10, 1, 0):
                                          replace (10,1,0) (10,1,0) by (10,2,0)

      S =  (1,1,0)  (10,1,0)  (10,2,0)  (12,2,0)
           band(2)   band(2)   band(2)   band(2)

      Testing: (10,1,0) (10,2,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 2 + 0 < 3      ==> FALSE     cannot delete (10, 1, 0)

      Do not delete: (1, 1, 0)   (the first tuple of S is kept)

      S =  (1,1,0)  (10,1,0)  (10,2,0)  (12,2,0)

      DONE
Assessing the state:
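   Input:   1 10 10 10 11 12

   State:   S = (1,1,0) (10,1,0) (10,2,0) (12,2,0)

      Or:   S = 1:[1..1] 10:[2..2] 10:[4..4] 12:[6..6]

   (Each entry is v:[rmin..rmax], where rmin is the running sum of the
    g values up to and including that tuple, and rmax = rmin + Δ)

   Sample query:

      Rank:     1    2    3    4    5    6
      --------------------------------------
      Value:    1   10   10   10   11   12

      Answer is taken from:   S = 1:[1..1] 10:[2..2] 10:[4..4] 12:[6..6]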
We can answer any φ-quantile query with an error of at most 1 rank position.
That is acceptable because ⌊ε×N⌋ is 1.
Notice that we have removed 2 tuples and now need to maintain less information
   Insert 11:

      Items seen (sorted):   1 10 10 10 11 11 12

      S = (1,1,0) (10,1,0) (10,2,0) (11,1,1) (12,2,0)

      Δ = 2 + 0 - 1 = 1

   Insert 9:

      Items seen (sorted):   1 9 10 10 10 11 11 12

      S = (1,1,0) (9,1,0) (10,1,0) (10,2,0) (11,1,1) (12,2,0)

      Δ = 1 + 0 - 1 = 0
   COMPRESS:   2εN = 4   (wiggle room)

   According to the band program (more on bands later):

                    Δ
                 -------
      band(0) =     4
      band(1) =     2 3
      band(2) =     0 1

      S =  (1,1,0)  (9,1,0)  (10,1,0)  (10,2,0)  (11,1,1)  (12,2,0)
           band(2)  band(2)   band(2)   band(2)   band(2)   band(2)

      Testing: (11,1,1) (12,2,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 2 + 0 < 4      ==> TRUE      DELETE (11, 1, 1):
                                          replace (11,1,1) (12,2,0) by (12,3,0)

      S =  (1,1,0)  (9,1,0)  (10,1,0)  (10,2,0)  (12,3,0)

      Testing: (10,2,0) (12,3,0)
         Band: 2 ≤ 2        ==> TRUE
         2 + 3 + 0 < 4      ==> FALSE     cannot delete (10, 2, 0)

      Testing: (10,1,0) (10,2,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 2 + 0 < 4      ==> TRUE      DELETE (10, 1, 0):
                                          replace (10,1,0) (10,2,0) by (10,3,0)

      S =  (1,1,0)  (9,1,0)  (10,3,0)  (12,3,0)

      Testing: (9,1,0) (10,3,0)
         Band: 2 ≤ 2        ==> TRUE
         1 + 3 + 0 < 4      ==> FALSE     cannot delete (9, 1, 0)

      Do not delete: (1, 1, 0)

      S =  (1,1,0)  (9,1,0)  (10,3,0)  (12,3,0)

      DONE
   Max positional error: εN = 2   (more wiggle room now !!)

   Input processed:   1 9 10 10 10 11 11 12

   Summary:   S = (1,1,0) (9,1,0) (10,3,0) (12,3,0)

        Or:   S = 1:[1..1] 9:[2..2] 10:[5..5] 12:[8..8]

   User query processing:

      Rank:             1    2    3    4    5    6    7    8
      ---------------------------------------------------------
      Actual answer:    1    9   10   10   10   11   11   12

      r = 1   ===>   rmax(9)  = 2 > 1      answer:  1:[1..1]
      r = 2   ===>   rmax(10) = 5 > 2      answer:  9:[2..2]
      r = 3   ===>   rmax(10) = 5 > 3      answer:  9:[2..2]
      r = 4   ===>   rmax(10) = 5 > 4      answer:  9:[2..2]
      r = 5   ===>   rmax(12) = 8 > 5      answer:  10:[5..5]
We can answer any φ-quantile query with an error of at most 2 rank positions.
That is acceptable because ⌊ε×N⌋ is now 2 !!
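The query processing in the walk-through above can be sketched in Java as follows. This is a minimal sketch of the rule used there (scan for the first tuple whose rmax exceeds the requested rank r and answer with the tuple just before it); the class name `QuerySketch`, the method name `query`, and the `{v, g, Δ}` array layout are assumptions.

```java
class QuerySketch {
    // Answers a rank query on a summary of tuples {v, g, delta}.
    // rmin of a tuple is the running sum of g; rmax = rmin + delta.
    // Returns the value of the last tuple whose rmax does not exceed r.
    static int query(int[][] s, int r) {
        int rmin = 0;
        int answer = s[0][0];
        for (int[] t : s) {
            rmin += t[1];
            int rmax = rmin + t[2];
            if (rmax > r) {
                break;           // first tuple with rmax > r: stop here
            }
            answer = t[0];
        }
        return answer;
    }

    public static void main(String[] args) {
        int[][] s = {{1, 1, 0}, {9, 1, 0}, {10, 3, 0}, {12, 3, 0}};
        for (int r = 1; r <= 8; r++) {
            System.out.println("r = " + r + "  answer = " + query(s, r));
        }
    }
}
```

On the final summary this returns 1 for r = 1, 9 for r = 2 to 4, 10 for r = 5 to 7, and 12 for r = 8, so every answer lies within 2 rank positions of the true item.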