2.2. VISUAL FEATURES
• Visual Clarity Score (VCS): Measures the distribution difference between two image sets:
one is the image set in a certain news event (event set) and the other is the image set
containing images from all events (collection set). The visual clarity score is measured as the
Kullback–Leibler divergence between two language models representing the event image set
and the collection image set, respectively. A bag-of-visual-words image representation, built
from local descriptors such as SIFT [82] or SURF [11], can be used to define language models for images.
Specifically, let $p(w \mid c)$ and $p(w \mid k)$ denote the term frequency of visual word $w$ in the
collection set and the event set, respectively. The visual clarity score is then defined as
\[
\mathrm{VCS} = D_{\mathrm{KL}}\bigl(p(w \mid c) \,\|\, p(w \mid k)\bigr). \tag{2.12}
\]
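As a rough illustration of Eq. (2.12), the sketch below computes the divergence from two lists of visual-word indices. It assumes the images have already been quantized into integer word IDs in the range 0 to `vocab_size - 1`. The function names and the additive smoothing constant `eps` are hypothetical choices made here to avoid division by zero. They are not part of the original method.

```python
import math
from collections import Counter


def word_distribution(visual_words, vocab_size, eps=1e-9):
    """Smoothed term-frequency distribution over the visual vocabulary.

    `eps` is an illustrative smoothing constant (an assumption), added so
    that words missing from one set do not produce log(0) or division by 0.
    """
    counts = Counter(visual_words)
    total = len(visual_words) + eps * vocab_size
    return [(counts.get(w, 0) + eps) / total for w in range(vocab_size)]


def visual_clarity_score(collection_words, event_words, vocab_size):
    """D_KL(p(w|c) || p(w|k)), using the argument order of Eq. (2.12)."""
    p_c = word_distribution(collection_words, vocab_size)
    p_k = word_distribution(event_words, vocab_size)
    return sum(pc * math.log(pc / pk) for pc, pk in zip(p_c, p_k))
```

When the event set has the same word distribution as the collection set, the score is zero. It grows as the event's visual words concentrate on a small part of the vocabulary.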
• Visual Coherence Score (VCoS): Measures the degree of coherence among the images in a
news event. This feature is computed from the visual similarity among images and
quantitatively reflects how closely the images in a news event are related. More specifically,
the average of the similarity scores between every pair of images $i_j$ and $i_k$ is computed
as the coherence score as follows:
\[
\mathrm{VCoS} = \frac{1}{|M(M-1)|} \sum_{\substack{j,k = 1, \dots, M \\ j \neq k}} \mathrm{sim}(i_j, i_k). \tag{2.13}
\]
Here $M$ is the number of images in the event set and $\mathrm{sim}(i_j, i_k)$ is the visual
similarity between image $i_j$ and image $i_k$. In implementation, the similarity between
image pairs is calculated based on their GIST [101] feature representations.
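A minimal sketch of Eq. (2.13) follows. It assumes each image is already represented by a precomputed feature vector (for example, a GIST descriptor). The text does not specify the similarity measure, so cosine similarity is used here purely as an assumption. The function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def visual_coherence_score(features, sim=cosine_similarity):
    """Average similarity over all ordered image pairs (j, k) with j != k."""
    m = len(features)
    if m < 2:
        return 0.0  # coherence is undefined for fewer than two images
    total = sum(
        sim(features[j], features[k])
        for j in range(m)
        for k in range(m)
        if j != k
    )
    return total / (m * (m - 1))
```

Passing a different `sim` callable swaps in another similarity measure without changing the averaging step.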
• Visual Similarity Distribution Histogram (VSDH): Describes inter-image similarity at
a finer level of granularity. It characterizes the image distribution with a set of values
obtained by quantizing the similarity matrix of the image pairs in an event. The visual
similarity matrix $S$ is obtained by calculating pairwise image similarity in a news event.
The visual similarity is again computed based on GIST [101] feature representations. The
similarity matrix $S$ is then quantized into an $H$-bin histogram by mapping each element
$m_{j,k}$ of the matrix into its corresponding bin, which results in a feature vector of $H$
dimensions representing the similarity relations among images:
\[
\mathrm{VSDH}(h) = \frac{1}{M^2}\,\bigl|\{(j,k) \mid j,k \le M,\; m_{j,k} \in h\text{-th bin}\}\bigr|, \quad h = 1, \dots, H. \tag{2.14}
\]
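The histogram in Eq. (2.14) can be sketched as below. The sketch assumes that the similarity values lie in [0, 1] and that the H bins have equal width. Neither detail is stated in the text, so both are illustrative assumptions, and the function name is hypothetical.

```python
def similarity_histogram(sim_matrix, num_bins):
    """Fraction of entries of an M x M similarity matrix falling in each bin.

    Assumes similarities in [0, 1] and equal-width bins (both assumptions).
    All M * M entries are counted, matching the 1 / M^2 normalization.
    """
    m = len(sim_matrix)
    if m == 0:
        return [0.0] * num_bins
    counts = [0] * num_bins
    for row in sim_matrix:
        for s in row:
            s = min(max(s, 0.0), 1.0)  # clamp stray values into [0, 1]
            h = min(int(s * num_bins), num_bins - 1)  # s == 1.0 goes in the last bin
            counts[h] += 1
    return [c / (m * m) for c in counts]
```

For example, `similarity_histogram([[1.0, 0.2], [0.2, 1.0]], 2)` places the two off-diagonal entries in the first bin and the two diagonal entries in the second, giving `[0.5, 0.5]`.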
• Visual Diversity Score (VDS): Measures the visual difference of the image distribution.
First, images are ranked according to their popularity on social media, based on the
assumption that popular images better represent the news event. Then, the diversity of an
image is defined as its minimal difference from the images ranked before it in the entire
image set [161]. Finally, the visual diversity score is calculated as