/* ---------------------------------------
Prelude (Sampling Scan)
--------------------------------------- */
F = Scaled-Sample(R) with probability p;
// Potentially heavy elements
+------------------------------------------------+
| Actual scan... |
| Do not include elements in F in the processing |
| |
| Assume the output is G |
+------------------------------------------------+
/* ---------------------------------------
Postlude (Cleaning Scan)
--------------------------------------- */
F = G ∪ F;
// Add G to F --> potential solutions
F = Count(F);
// Remove false positives
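The prelude/postlude pattern above can be sketched in Python. This is a minimal illustration, not the source's implementation: the function names `scaled_sample` and `cleaning_scan`, the threshold rule applied to the scaled sample counts, and the stand-in value for G are all assumptions, since the actual scan is elided in the source.

```python
import random
from collections import Counter

def scaled_sample(R, p, threshold, rng=None):
    """Prelude: keep each tuple with probability p, scale counts by 1/p,
    and return items whose scaled count reaches the threshold (assumed rule)."""
    rng = rng or random.Random(0)
    sample = Counter(x for x in R if rng.random() < p)
    return {x for x, c in sample.items() if c / p >= threshold}

def cleaning_scan(R, candidates, threshold):
    """Postlude: count the candidates exactly and drop false positives."""
    exact = Counter(x for x in R if x in candidates)
    return {x: c for x, c in exact.items() if c >= threshold}

R = ["a"] * 6 + ["b"] * 5 + ["c"] * 2 + ["d"]
F = scaled_sample(R, p=1.0, threshold=5)   # p = 1.0 only to keep the demo deterministic
G = {"c"}                                  # hypothetical output of the actual scan
F = G | F                                  # F = G ∪ F
result = cleaning_scan(R, F, threshold=5)  # F = Count(F)
```

Here `c` enters the candidate set through G but is removed by the cleaning scan, because its exact count is below the threshold.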
A1[1..(m/2)] = buckets used for hash function h1
A2[1..(m/2)] = buckets used for hash function h2
The result of each pass is summarised in a bit array variable.
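As a rough sketch of how two bucket arrays of size m/2 and their bit arrays could work together: each hash function counts into its own array, each array is then summarised as a bitmap of heavy buckets, and an item stays a candidate only if its bucket is heavy under both hash functions. The source states only the bucket layout and the bit-array summary, so the combination rule, the function name `two_hash_candidates`, and the toy hash functions are assumptions.

```python
def two_hash_candidates(R, m, threshold, h1, h2):
    """Count into two arrays of m/2 buckets, then filter with both bitmaps."""
    half = m // 2
    A1 = [0] * half  # buckets used for hash function h1
    A2 = [0] * half  # buckets used for hash function h2
    for x in R:
        A1[h1(x) % half] += 1
        A2[h2(x) % half] += 1
    # Summarise each array as a bit array: True marks a heavy bucket.
    bitmap1 = [c >= threshold for c in A1]
    bitmap2 = [c >= threshold for c in A2]
    return {x for x in set(R)
            if bitmap1[h1(x) % half] and bitmap2[h2(x) % half]}

R = [1] * 5 + [5] + [2] * 2 + [9] * 3
candidates = two_hash_candidates(
    R, m=8, threshold=5,
    h1=lambda x: x,       # toy hash functions chosen for illustration
    h2=lambda x: x // 3,
)
```

In this toy input, items 1, 5 and 9 share a heavy bucket under h1, but only item 1 also lands in a heavy bucket under h2, so the second bitmap removes two false positives.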
/* --------------------------------
Prelude
-------------------------------- */
S = Scaled-Sample(R) using a suitable probability p;
F = select f most frequent items in S;
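The two prelude steps above might look like this in Python. It is a sketch under stated assumptions: the name `select_frequent` is hypothetical, sampling is done per tuple with probability p, and ties among equally frequent items are broken arbitrarily by `Counter.most_common`.

```python
import random
from collections import Counter

def select_frequent(R, p, f, rng=None):
    """Sample R with probability p, then return the f most frequent sampled items."""
    rng = rng or random.Random(0)
    S = Counter(x for x in R if rng.random() < p)   # S = Scaled-Sample(R)
    return {x for x, _ in S.most_common(f)}         # F = f most frequent items in S

R = ["a"] * 6 + ["b"] * 5 + ["c"] * 2 + ["d"]
F = select_frequent(R, p=1.0, f=2)  # p = 1.0 only to keep the demo deterministic
```

Scaling the sampled counts by 1/p does not change their ranking, so the sketch ranks on raw sample counts.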
Conclusion: