|
Let us denote this tuple as (v, 1, Δ)
|
Merge operation: replace the adjacent pair
..... (v_i, g_i, Δ_i), (v_{i+1}, g_{i+1}, Δ_{i+1}) ....
by:
..... (v_{i+1}, g_i + g_{i+1}, Δ_{i+1}) ....
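The merge step can be sketched in a few lines of Java. This is only an illustration: the class name `MergeSketch`, the helper names, and the choice to store a tuple as `int[]{v, g, Δ}` are assumptions made for the example, not part of the notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MergeSketch {
    /** Merges tuple i into tuple i+1; each tuple is an int[]{v, g, delta}. */
    static void mergeIntoSuccessor(List<int[]> summary, int i) {
        int[] removed = summary.remove(i);   // (v_i, g_i, Δ_i) disappears
        int[] successor = summary.get(i);    // the old tuple i+1 now sits at index i
        successor[1] += removed[1];          // g_{i+1} := g_i + g_{i+1}; v and Δ stay unchanged
    }

    /** Renders the summary in the notation used in the notes. */
    static String format(List<int[]> summary) {
        StringBuilder sb = new StringBuilder();
        for (int[] t : summary) {
            if (sb.length() > 0) sb.append(' ');
            sb.append('(').append(t[0]).append(',').append(t[1]).append(',').append(t[2]).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<int[]> s = new ArrayList<>(Arrays.asList(
            new int[]{11, 1, 1}, new int[]{12, 2, 0}));
        mergeIntoSuccessor(s, 0);
        System.out.println(format(s));   // prints (12,3,0)
    }
}
```

Only g changes in the surviving tuple, so the sum of all g values (the number of items covered by the summary) is preserved by a merge.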
|
Example:
Input stream: ... 5 9 6 ....
Insertion into Summary
Summary: ... (5, 1, ?), (6, 1, ?), (9, 1, ?) ....
|
So ideally, you want entries with large g...
|
the value of g cannot grow arbitrarily large...
|
2 ε n = 14
Δ
-------------------------------
Band(0): 14
Band(1): 12 13
Band(2): 11 10 9 8
Band(3): 7 6 5 4 3 2 1 0
|
Hence, a larger value for the band is more desirable
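// Returns the band of Δ, where p = 2εn (rounded down to an integer).
int findband(int Δ, int p)
{
    // diff is at least 1 as long as Δ ≤ p
    int diff = p - Δ + 1;
    if ( diff == 1 )
    {
        return(0);
    }
    else
    {
        // floor(log2(diff)), computed with integer arithmetic
        // so that no floating-point rounding is involved
        return( 31 - Integer.numberOfLeadingZeros(diff) );
    }
}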
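As a quick check of the band table above, the following sketch groups every Δ from p down to 0 by its band. The class name `BandTable` and the method names are hypothetical; the band formula is the one from findband, written with an integer logarithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BandTable {
    /** Band of delta for p = 2*eps*n: floor(log2(p - delta + 1)). */
    static int band(int delta, int p) {
        int diff = p - delta + 1;
        if (diff < 1) {
            throw new IllegalArgumentException("delta must not exceed p");
        }
        return 31 - Integer.numberOfLeadingZeros(diff);
    }

    /** Maps each band number to the list of delta values that fall into it. */
    static Map<Integer, List<Integer>> table(int p) {
        Map<Integer, List<Integer>> bands = new TreeMap<>();
        for (int delta = p; delta >= 0; delta--) {
            bands.computeIfAbsent(band(delta, p), b -> new ArrayList<>()).add(delta);
        }
        return bands;
    }

    public static void main(String[] args) {
        // For p = 14 this prints the four bands listed in the table above.
        table(14).forEach((b, deltas) -> System.out.println("Band(" + b + "): " + deltas));
    }
}
```

Each band holds twice as many Δ values as the one before it, which is why tuples with a small Δ end up in the highest band.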
|
A group of tuples is defined by a tree structure imposed on the values of their bands
Given: any sequence of integers
Example: 0 1 3 0 2 1 3 0 1 2 3
Lift up the largest value:
    3       3       3
0 1   0 2 1   0 1 2
|
Lift up the next largest value:
    3       3       3
        2         2
0 1   0   1   0 1
|
Lift up the next largest value:
    3       3       3
        2         2
  1       1     1
0     0       0
|
Tree: parent(i) = the first number to the right of i that is larger than i
        ------------ root ------------
       /              |               \
      3               3                3
     /               / \              /
    /               2   \            2
   /               /     \          /
  1               /       1        1
 /               /                /
0               0                0
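The parent rule can be checked mechanically on the example sequence. The sketch below is an illustration only: the class name `BandTree`, the constant `ROOT`, and the use of 0-based positions are assumptions of the example.

```java
import java.util.Arrays;

final class BandTree {
    static final int ROOT = -1;   // marks entries whose parent is the root

    /** parent[i] = position of the first value to the right of i that is larger than bands[i]. */
    static int[] parents(int[] bands) {
        int[] parent = new int[bands.length];
        for (int i = 0; i < bands.length; i++) {
            parent[i] = ROOT;
            for (int j = i + 1; j < bands.length; j++) {
                if (bands[j] > bands[i]) {
                    parent[i] = j;
                    break;
                }
            }
        }
        return parent;
    }

    public static void main(String[] args) {
        int[] example = {0, 1, 3, 0, 2, 1, 3, 0, 1, 2, 3};
        // prints [1, 2, -1, 4, 6, 6, -1, 8, 9, 10, -1]
        System.out.println(Arrays.toString(parents(example)));
    }
}
```

The three entries with value 3 have no larger value to their right, so they hang directly under the root, as in the tree above.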
|
ε is the error margin (a parameter of the algorithm)
S = {}; // S contains the summary structure, which is:
// <(v_0, g_0, Δ_0), (v_1, g_1, Δ_1) ... >
// NOTE: S is an ordered list !!!
N = 0; // Number of items processed
while ( not EOS )
{
/* ---------------------------------------------
Delete phase:
executed once every 1/(2×ε) insertions
--------------------------------------------- */
if ( N % [1/(2×ε)] == 0 )
{
COMPRESS(); // <-------- Delete some entries in summary
}
/* ------------------------------------
Insert phase
------------------------------------ */
v = next value in input
/* --------------------------------------------
Find insert position for v in S
-------------------------------------------- */
Find a tuple (v_i, g_i, Δ_i) ∈ S such that: v_{i-1} ≤ v < v_i
if ( v is inserted at the head or tail of S )
Δ = 0;
else
Δ = g_i + Δ_i - 1 ;
INSERT "(v, 1, Δ)" into S between v_{i-1} and v_i;
N++;
}
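The insert phase of the loop above could look roughly like this in Java. It is a minimal sketch, not the course's implementation: the class `GkInsertSketch`, the nested `Tuple` type, and the linear scan for the insert position are all assumptions of the example, and the delete phase is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class GkInsertSketch {
    /** One summary entry (v, g, Δ). */
    static final class Tuple {
        final int v;
        final int g;
        final int delta;

        Tuple(int v, int g, int delta) {
            this.v = v;
            this.g = g;
            this.delta = delta;
        }

        @Override
        public String toString() {
            return "(" + v + "," + g + "," + delta + ")";
        }
    }

    /** Inserts (v, 1, Δ) so that S stays ordered by value. */
    static void insert(List<Tuple> s, int v) {
        int i = 0;
        while (i < s.size() && s.get(i).v <= v) {
            i++;                       // afterwards: v_{i-1} <= v < v_i
        }
        boolean headOrTail = (i == 0 || i == s.size());
        int delta = headOrTail ? 0 : s.get(i).g + s.get(i).delta - 1;
        s.add(i, new Tuple(v, 1, delta));
    }

    public static void main(String[] args) {
        List<Tuple> s = new ArrayList<>();
        for (int v : new int[]{12, 10, 11, 10, 1, 10}) {
            insert(s, v);
        }
        // prints [(1,1,0), (10,1,0), (10,1,0), (10,1,0), (11,1,0), (12,1,0)]
        System.out.println(s);
    }
}
```

A linear scan keeps the sketch short; an implementation meant for long streams would more likely use a binary search or a balanced tree to find the position.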
|
COMPRESS()
Input:
S; // the current summary structure, which is:
// <(v_0, g_0, Δ_0), (v_1, g_1, Δ_1) ... (v_{s-1}, g_{s-1}, Δ_{s-1})>
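The following Java sketch shows a simplified, pairwise version of COMPRESS() that matches the two tests used in the worked example in these notes: Band(Δ_i) ≤ Band(Δ_{i+1}) and g_i + g_{i+1} + Δ_{i+1} < 2εN. It is an assumption-laden illustration: it scans from right to left, never merges away the first tuple, and does not sum the g values of a whole subtree the way the tree-based formulation does. The names `GkCompressSketch`, `Tuple`, and `of` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class GkCompressSketch {
    static final class Tuple {
        final int v;
        int g;                 // grows when a neighbour is merged into this tuple
        final int delta;

        Tuple(int v, int g, int delta) {
            this.v = v;
            this.g = g;
            this.delta = delta;
        }

        @Override
        public String toString() {
            return "(" + v + "," + g + "," + delta + ")";
        }
    }

    /** floor(log2(p - delta + 1)), the band formula from the notes. */
    static int band(int delta, int p) {
        return 31 - Integer.numberOfLeadingZeros(p - delta + 1);
    }

    /** Pairwise compress; p is the integer value of 2*eps*N. */
    static void compress(List<Tuple> s, int p) {
        for (int i = s.size() - 2; i >= 1; i--) {      // index 0 is never removed
            Tuple cur = s.get(i);
            Tuple next = s.get(i + 1);
            boolean bandOk = band(cur.delta, p) <= band(next.delta, p);
            boolean fits = cur.g + next.g + next.delta < p;
            if (bandOk && fits) {
                next.g += cur.g;                       // (v_{i+1}, g_i + g_{i+1}, Δ_{i+1})
                s.remove(i);
            }
        }
    }

    /** Builds a summary from a flat list v, g, Δ, v, g, Δ, ... */
    static List<Tuple> of(int... flat) {
        List<Tuple> s = new ArrayList<>();
        for (int k = 0; k + 2 < flat.length; k += 3) {
            s.add(new Tuple(flat[k], flat[k + 1], flat[k + 2]));
        }
        return s;
    }

    public static void main(String[] args) {
        List<Tuple> s = of(1, 1, 0, 9, 1, 0, 10, 1, 0, 10, 2, 0, 11, 1, 1, 12, 2, 0);
        compress(s, 4);
        System.out.println(s);   // prints [(1,1,0), (9,1,0), (10,3,0), (12,3,0)]
    }
}
```

Because a merged tuple stays at the index that is examined next, one right-to-left pass is enough to let a tuple absorb several neighbours in a row.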
|
The lemmas make use of the way that the tuples are organized in bands.
|
Hence, one period is equal to 1/(2ε) = 2 items (that is, ε = 1/4)
The COMPRESS() method is invoked once after every 2 items received
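The schedule can be tabulated with a small sketch. The class `CompressSchedule` and its method names are hypothetical, and ε = 0.25 is derived from the period 1/(2ε) = 2 stated above.

```java
final class CompressSchedule {
    /** Number of insertions between two COMPRESS() calls: floor(1 / (2*eps)). */
    static int period(double eps) {
        return (int) Math.floor(1.0 / (2.0 * eps));
    }

    /** The threshold 2*eps*N, rounded down to an integer. */
    static int threshold(double eps, int n) {
        return (int) Math.floor(2.0 * eps * n);
    }

    public static void main(String[] args) {
        double eps = 0.25;
        for (int n = 1; n <= 8; n++) {
            if (n % period(eps) == 0) {
                // prints N = 2, 4, 6, 8 with thresholds 1, 2, 3, 4
                System.out.println("N = " + n + ": COMPRESS with 2εN = " + threshold(eps, n));
            }
        }
    }
}
```

The threshold grows by one per period, so each COMPRESS() call has slightly more room to merge tuples than the previous one.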
Input stream: 12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
12
S = (12, 1, 0)
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
10 12
S = (10, 1, 0) (12, 1, 0)
2εN = 1
According to the band program: (more on bands later)
Δ
-------
band(0) = 1
band(1) = 0
S = (10, 1, 0) (12, 1, 0)
band(1) band(1)
Testing: (10, 1, 0), (12, 1, 0)
Band: 1 ≤ 1 ==> TRUE
1 + 1 + 0 < 1 ==> FALSE
cannot delete (10, 1, 0)
DONE
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
10 11 12
S = (10, 1, 0) (11, 1, 0) (12, 1, 0)
Δ = 1 + 0 - 1 = 0
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
10 10 11 12
S = (10, 1, 0) (10, 1, 0) (11, 1, 0) (12, 1, 0)
Δ = 1 + 0 - 1 = 0
|
2εN = 2
According to the band program: (more on bands later)
Δ
-------
band(0) = 2
band(1) = 1 0
S = (10, 1, 0) (10, 1, 0) (11, 1, 0) (12, 1, 0)
band(1) band(1) band(1) band(1)
Testing: (11, 1, 0), (12, 1, 0)
Band: 1 ≤ 1 ==> TRUE
1 + 1 + 0 < 2 ==> FALSE
cannot delete (11, 1, 0)
Testing: (10, 1, 0), (11, 1, 0)
Band: 1 ≤ 1 ==> TRUE
1 + 1 + 0 < 2 ==> FALSE
cannot delete (10, 1, 0)
Testing: (10, 1, 0), (10, 1, 0)
Band: 1 ≤ 1 ==> TRUE
1 + 1 + 0 < 2 ==> FALSE
cannot delete (10, 1, 0)
DONE
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
1 10 10 11 12
S = (1,1,0) (10,1,0) (10,1,0) (11,1,0) (12,1,0)
(Δ = 0 because (1, 1, 0) is inserted at the head of S)
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
1 10 10 10 11 12
S = (1,1,0) (10,1,0) (10,1,0) (10,1,0) (11,1,0) (12,1,0)
Δ = 1 + 0 - 1 = 0
|
2εN = 3 !!!!
According to the band program: (more on bands later)
Δ
-------
band(0) = 3
band(1) = 1 2
band(2) = 0
S = (1,1,0) (10,1,0) (10,1,0) (10,1,0) (11,1,0) (12,1,0)
band(2) band(2) band(2) band(2) band(2) band(2)
Testing: (11,1,0) (12,1,0)
Band: 2 ≤ 2 ==> TRUE
1 + 1 + 0 < 3 ==> TRUE
DELETE subtree (11, 1, 0): replace (11,1,0) (12,1,0) by (12,2,0)
S = (1,1,0) (10,1,0) (10,1,0) (10,1,0) (12,2,0)
band(2) band(2) band(2) band(2) band(2)
Testing: (10, 1, 0) (12,2,0)
Band: 2 ≤ 2 ==> TRUE
1 + 2 + 0 < 3 ==> FALSE
cannot delete (10, 1, 0)
Testing: (10,1,0) (10,1,0)
Band: 2 ≤ 2 ==> TRUE
1 + 1 + 0 < 3 ==> TRUE
DELETE (10, 1, 0): replace (10,1,0) (10,1,0) by (10,2,0)
S = (1,1,0) (10,1,0) (10,2,0) (12,2,0)
band(2) band(2) band(2) band(2)
Testing: (10, 1, 0) (10,2,0)
Band: 2 ≤ 2 ==> TRUE
1 + 2 + 0 < 3 ==> FALSE
cannot delete (10, 1, 0)
Do not delete: (1, 1, 0)
S = (1,1,0) (10,1,0) (10,2,0) (12,2,0)
DONE
|
Assessing the state:
Input:
1 10 10 10 11 12
State:
S = (1,1,0) (10,1,0) (10,2,0) (12,2,0)
Or:
S = 1:[1..1] 10:[2..2] 10:[4..4] 12:[6..6]
Sample query:
1 2 3 4 5 6
--------------------------
1 10 10 10 11 12
| |
+----------+
Answer:
S = 1:[1..1] 10:[2..2] 10:[4..4] 12:[6..6]
|
We can answer any φ-quantile query with an error of at most 1 rank position.
That is acceptable because ⌊ε×N⌋ is 1.
Notice that we have removed 2 tuples, so less information needs to be maintained
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
1 10 10 10 11 11 12
S = (1,1,0) (10,1,0) (10,2,0) (11,1,1) (12,2,0)
Δ = 2 + 0 - 1 = 1
|
12 10 11 10 1 10 11 9 6 7 8 11 4 5 2 3
1 9 10 10 10 11 11 12
S = (1,1,0) (9,1,0) (10,1,0) (10,2,0) (11,1,1) (12,2,0)
Δ = 1 + 0 - 1 = 0
|
2εN = 4 (wiggle room)
According to the band program: (more on bands later)
Δ
-------
band(0) = 4
band(1) = 2 3
band(2) = 0 1
S = (1,1,0) (9,1,0) (10,1,0) (10,2,0) (11,1,1) (12,2,0)
band(2) band(2) band(2) band(2) band(2) band(2)
Testing: (11, 1, 1) (12,2,0)
Band: 2 ≤ 2 ==> TRUE
1 + 2 + 0 < 4 ==> TRUE
DELETE (11, 1, 1): replace (11,1,1) (12,2,0) by (12,3,0)
S = (1,1,0) (9,1,0) (10,1,0) (10,2,0) (12,3,0)
Testing: (10, 2, 0) (12,3,0)
Band: 2 ≤ 2 ==> TRUE
2 + 3 + 0 < 4 ==> FALSE
cannot delete (10, 2, 0)
Testing: (10, 1, 0) (10,2,0)
Band: 2 ≤ 2 ==> TRUE
1 + 2 + 0 < 4 ==> TRUE
DELETE (10, 1, 0): replace (10,1,0) (10,2,0) by (10,3,0)
S = (1,1,0) (9,1,0) (10,3,0) (12,3,0)
Testing: (9,1,0) (10, 3, 0)
Band: 2 ≤ 2 ==> TRUE
1 + 3 + 0 < 4 ==> FALSE
cannot delete (9, 1, 0)
Do not delete: (1, 1, 0)
S = (1,1,0) (9,1,0) (10,3,0) (12,3,0)
DONE
|
Max positional error: εN = 2 (more wiggle room now !!)
Input processed:
1 9 10 10 10 11 11 12
Summary:
S = (1,1,0) (9,1,0) (10,3,0) (12,3,0)
Or:
S = 1:[1..1] 9:[2..2] 10:[5..5] 12:[8..8]
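The interval notation follows from the tuples by a running sum: rmin of a tuple is the sum of all g values up to and including it, and rmax = rmin + Δ. A small sketch (the class name `RankIntervals` and the `int[]{v, g, Δ}` layout are assumptions of the example):

```java
final class RankIntervals {
    /** Renders each tuple {v, g, delta} as v:[rmin..rmax]. */
    static String intervals(int[][] summary) {
        StringBuilder sb = new StringBuilder();
        int rmin = 0;
        for (int[] t : summary) {
            rmin += t[1];                 // rmin = g_0 + ... + g_i
            int rmax = rmin + t[2];       // rmax = rmin + Δ_i
            if (sb.length() > 0) sb.append(' ');
            sb.append(t[0]).append(":[").append(rmin).append("..").append(rmax).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[][] s = {{1, 1, 0}, {9, 1, 0}, {10, 3, 0}, {12, 3, 0}};
        System.out.println(intervals(s));   // prints 1:[1..1] 9:[2..2] 10:[5..5] 12:[8..8]
    }
}
```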
User query processing:
Rank:           1  2  3  4  5  6  7  8
actual answer:  1  9 10 10 10 11 11 12

Rank r   actual answer   test                 reported tuple
-------+---------------+--------------------+----------------
  1      1               rmax(9)  = 2 > 1     1:[1..1]
  2      9               rmax(10) = 5 > 2     9:[2..2]
  3      10              rmax(10) = 5 > 3     9:[2..2]
  4      10              rmax(10) = 5 > 4     9:[2..2]
  5      10              rmax(12) = 8 > 5     10:[5..5]
|
We can answer any φ-quantile query with an error of at most 2 rank positions.
That is acceptable because ⌊ε×N⌋ is now 2 !!
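The lookup used in the query example can be sketched as follows. The rule is inferred from the worked example (scan for the first tuple whose rmax exceeds r and report the tuple just before it), so treat it as an assumption rather than the definitive query procedure; the class name `RankQuery` and the `int[]{v, g, Δ}` layout are hypothetical.

```java
final class RankQuery {
    /** Returns the summary value reported for rank r; summary must be non-empty. */
    static int query(int[][] summary, int r) {
        int rmin = 0;
        int answer = summary[0][0];       // fallback when even the first rmax exceeds r
        for (int[] t : summary) {
            rmin += t[1];
            int rmax = rmin + t[2];
            if (rmax > r) {
                return answer;            // value of the tuple just before this one
            }
            answer = t[0];
        }
        return answer;                    // no rmax exceeds r: report the last tuple
    }

    public static void main(String[] args) {
        int[][] s = {{1, 1, 0}, {9, 1, 0}, {10, 3, 0}, {12, 3, 0}};
        for (int r = 1; r <= 5; r++) {
            // prints 1, 9, 9, 9, 10 for r = 1..5
            System.out.println("r = " + r + " -> " + query(s, r));
        }
    }
}
```

For r = 4 the sketch reports 9, whose rank is 2, while the true answer 10 has rank 4; the difference of 2 positions is within the allowed ⌊ε×N⌋ = 2.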