If the Y attribute values in R(X,Y) and S(Y,Z) are disjoint:
    T( R(X,Y) ⋈ S(Y,Z) ) = 0
If Y is a key of S and a foreign key in R (so every tuple of R joins with exactly one tuple of S):
    T( R(X,Y) ⋈ S(Y,Z) ) = T(R)
If every tuple in R and S has the same Y attribute value:
    T( R(X,Y) ⋈ S(Y,Z) ) = T(R) × T(S)
Range of T( R ⋈ S ):
    0 ≤ T( R ⋈ S ) ≤ T(R) × T(S)
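
The three boundary cases above can be checked on small hand-made relations. The following is a minimal sketch, assuming relations are represented as Python lists of tuples; the helper `join_size` and all sample data are hypothetical and serve only as illustration.

```python
def join_size(r, s):
    """Count the tuples in the natural join of R(X, Y) and S(Y, Z).

    r is a list of (x, y) pairs and s is a list of (y, z) pairs.
    """
    return sum(1 for (_, y_r) in r for (y_s, _) in s if y_r == y_s)


# Case 1: the Y values of R and S are disjoint, so no pair of tuples matches.
r_disjoint = [("a", 1), ("b", 2)]
s_disjoint = [(3, "p"), (4, "q")]
size_disjoint = join_size(r_disjoint, s_disjoint)

# Case 2: Y is a key of S, and every Y value in R also appears in S.
r_key = [("a", 1), ("b", 1), ("c", 2)]
s_key = [(1, "p"), (2, "q"), (3, "r")]
size_key = join_size(r_key, s_key)

# Case 3: every tuple in R and S carries the same Y value.
r_same = [("a", 7), ("b", 7), ("c", 7)]
s_same = [(7, "p"), (7, "q")]
size_same = join_size(r_same, s_same)

print(size_disjoint, size_key, size_same)
```

In this sample data the first join is empty, the second has one result tuple per tuple of R, and the third pairs every tuple of R with every tuple of S.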
Simplifying Assumptions...
Fact: Without any assumptions on the joining attribute values, it is not possible to estimate the result size T(R ⋈ S).
Assumptions that help us estimate T( R(X,Y) ⋈ S(Y,Z) ):
The containment of value sets assumption:
    An attribute Y in a relation R(...,Y) always takes on a prefix of a fixed list of values:
        y1 y2 y3 y4 ....
Example:
Relations:
R( .... , Y )
S( .... , Y )
U( .... , Y )
Then:
Attr values of Y in R can be one of: y1 y2 ..... yR
Attr values of Y in S can be one of: y1 y2 ........ yS
Attr values of Y in U can be one of: y1 y2 ... yU
(The containment of value sets assumption will help us estimate T(R ⋈ S).)
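
One consequence of the prefix rule is that the Y value set of a relation with fewer distinct Y values is contained in the Y value set of a relation with more. The sketch below illustrates this under stated assumptions: the list `FIXED_Y_VALUES`, the helper `y_values`, and the distinct-value counts 3, 5, and 2 are all made up for the example.

```python
# Hypothetical fixed list of Y values shared by all relations.
FIXED_Y_VALUES = ["y1", "y2", "y3", "y4", "y5", "y6"]


def y_values(distinct_count):
    """Return the Y value set of a relation that has `distinct_count`
    distinct Y values, taken as a prefix of the fixed list."""
    return set(FIXED_Y_VALUES[:distinct_count])


y_in_r = y_values(3)  # R uses y1..y3
y_in_s = y_values(5)  # S uses y1..y5
y_in_u = y_values(2)  # U uses y1..y2

# Prefixes of one list are nested: the smaller set is always a subset.
contained = y_in_u <= y_in_r <= y_in_s

# The Y values that R and S share are exactly the Y values of R.
shared = y_in_r & y_in_s
print(contained, sorted(shared))
```

Because the sets are nested, every Y value of the relation with fewer distinct values is guaranteed to occur in the other relation.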
The preservation of value sets assumption:
    The join operation R(X,Y) ⋈ S(Y,Z) will preserve all the possible values of the non-joining attributes.
In other words:
The attribute values taken on by X in R(X,Y) ⋈ S(Y,Z) and in R(X,Y) are the same.
The attribute values taken on by Z in R(X,Y) ⋈ S(Y,Z) and in S(Y,Z) are the same.
(The preservation of value sets assumption will help us estimate T(R ⋈ S ⋈ U).)
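
The preservation of value sets assumption can be illustrated on a small example. This is a sketch only: the helper `natural_join` and the sample tuples are hypothetical, and the data was chosen so that every tuple finds a join partner. A tuple without a partner would drop out of the join and could violate the assumption.

```python
def natural_join(r, s):
    """Natural join of R(X, Y) and S(Y, Z) on Y, as a list of (x, y, z)."""
    return [(x, y, z) for (x, y) in r for (y_s, z) in s if y == y_s]


r = [("x1", 1), ("x2", 2), ("x3", 2)]
s = [(1, "z1"), (2, "z2"), (2, "z3")]
joined = natural_join(r, s)

# Values of the non-joining attributes before and after the join.
x_in_r = {x for (x, _) in r}
x_in_join = {x for (x, _, _) in joined}
z_in_s = {z for (_, z) in s}
z_in_join = {z for (_, _, z) in joined}

print(x_in_join == x_in_r, z_in_join == z_in_s)
```

Here the join keeps every X value of R and every Z value of S, which is what the assumption asserts in general.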