14 2. WHAT NEWS CONTENT TELLS

sentences around sentence $s_i$. We can also introduce an attention mechanism to learn weights that measure the importance of each sentence, and the news article vector $a$ is computed as follows:

$$a = \sum_{i=1}^{N} \alpha_i h_i, \tag{2.10}$$

where $\alpha_i$ measures the importance of the $i$-th sentence for the news piece $a$, and $\alpha_i$ is calculated as follows:

$$o_i = \tanh(W h_i + b), \qquad \alpha_i = \frac{\exp(o_i o_s^{T})}{\sum_{k=1}^{N} \exp(o_k o_s^{T})}, \tag{2.11}$$

where $o_i$ is a hidden representation of $h_i$ obtained by feeding the hidden state $h_i$ to a fully connected embedding layer, and $o_s$ is the weight parameter that represents the sentence-level context vector.
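The attention pooling of Eqs. (2.10) and (2.11) can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `attention_pool`, the argument names, and the toy values are assumptions made for the example, and in practice `W`, `b`, and the context vector would be learned jointly with the rest of the network.

```python
import math

def attention_pool(hidden_states, W, b, context):
    """Return (article_vector, weights) for a list of sentence hidden states.

    hidden_states: list of N sentence vectors h_i (lists of floats)
    W, b:          weights and bias of the layer giving o_i = tanh(W h_i + b)
    context:       sentence-level context vector (same length as o_i)
    All names here are illustrative, not taken from the text.
    """
    def project(h):
        # o_i = tanh(W h_i + b), computed row by row
        return [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
                for row, bias in zip(W, b)]

    # Unnormalised score of each sentence: dot product of o_i with the context vector.
    scores = [sum(o * c for o, c in zip(project(h), context))
              for h in hidden_states]

    # Softmax over sentences; subtracting the max keeps exp() numerically stable.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Article vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    article = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return article, weights

# Toy example with three 2-dimensional sentence states (made-up values).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
article, alpha = attention_pool(H, W, b, context=[1.0, 1.0])
```

In this toy setting the third sentence aligns best with the context vector, so it receives the largest weight, while the first two sentences receive equal weights.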

2.2 VISUAL FEATURES

Visual cues have been shown to be an important means of manipulation in fake news propaganda. As we have described, fake news exploits the individual vulnerabilities of people and thus often relies on sensational or even fake images to provoke anger or other emotional responses in consumers. Visual features are extracted from visual elements (e.g., images and videos) to capture the distinguishing characteristics of fake news. Visual features are generally categorized into three types [21]: Visual Statistical Features, Visual Content Features, and Neural Visual Features.

2.2.1 VISUAL STATISTICAL FEATURES

Visual statistical features represent the statistics of the images attached to fake and real news pieces. Some representative visual statistical features include the following [60].

• Count: the occurrence of images in news pieces, measured as the total number of images in a news event and the ratios of news posts containing at least one image or more than one image.

• Popularity: the popularity of an image, indicated by the number of times it is shared on social media.

• Image type: some images have a particular type in resolution or style. For example, long images are images with a very large length-to-width ratio. The ratio of these types of images is also counted as a statistical feature.
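The statistical features listed above reduce to simple counting. The following is a minimal sketch under stated assumptions: the post layout (a dict with an `"images"` list of `(width, height)` pairs and a `"shares"` count), the function name, and the `long_ratio` threshold of 3.0 are all hypothetical choices made for illustration, not values given in the text.

```python
def visual_statistical_features(posts, long_ratio=3.0):
    """Compute count, popularity, and image-type statistics for one news event.

    posts: list of dicts such as {"images": [(width, height), ...], "shares": int}
           (a hypothetical layout chosen for this sketch).
    long_ratio: assumed threshold above which an image counts as a "long image".
    """
    n_posts = len(posts)
    images = [size for post in posts for size in post["images"]]
    image_posts = [post for post in posts if post["images"]]

    def is_long(size):
        width, height = size
        return max(width, height) / min(width, height) >= long_ratio

    return {
        # Count features
        "image_count": len(images),
        "ratio_posts_with_image":
            len(image_posts) / n_posts if n_posts else 0.0,
        "ratio_posts_with_multiple_images":
            sum(1 for p in posts if len(p["images"]) > 1) / n_posts if n_posts else 0.0,
        # Popularity feature: average number of shares of image-bearing posts
        "mean_shares_of_image_posts":
            sum(p["shares"] for p in image_posts) / len(image_posts) if image_posts else 0.0,
        # Image-type feature: share of images with an extreme aspect ratio
        "ratio_long_images":
            sum(1 for s in images if is_long(s)) / len(images) if images else 0.0,
    }
```

Averaging shares over image-bearing posts is only one way to summarize popularity; a per-image share count could be substituted if the data provides it.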

2.2.2 VISUAL CONTENT FEATURES

Research [60] has shown that image contents in fake news and real news have different characteristics. The representative visual content features are detailed as follows.

https://www.wired.com/2016/12/photos-fuel-spread-fake-news/


• Visual Clarity Score (VCS): Measures the distribution difference between two image sets: one is the image set in a certain news event (event set) and the other is the image set containing images from all events (collection set). The visual clarity score is measured as the Kullback–Leibler divergence between two language models representing the event image set and the collection image set, respectively. A bag-of-words image representation, such as SIFT [82] or SURF [11] features, can be used to define language models for images. Specifically, let $p(w \mid c)$ and $p(w \mid k)$ denote the term frequency of visual word $w$ in the collection set and the event set, respectively; the visual clarity score is then denoted as

$$\mathrm{VCS} = D_{\mathrm{KL}}\big(p(w \mid c) \,\|\, p(w \mid k)\big). \tag{2.12}$$

• Visual Coherence Score (VCoS): Measures the degree of coherence among the images in a certain news event. This feature is computed based on the visual similarity among images and quantitatively reflects how the images in a news event relate to one another. More specifically, the average of the similarity scores between every two images $i_j$ and $i_k$ is computed as the coherence score as follows:

$$\mathrm{VCoS} = \frac{1}{|M(M-1)|} \sum_{j,k=1,\dots,M;\; j \neq k} \mathrm{sim}(i_j, i_k). \tag{2.13}$$

Here $M$ is the number of images in the event set and $\mathrm{sim}(i_j, i_k)$ is the visual similarity between image $i_j$ and image $i_k$. In implementation, the similarity between image pairs is calculated based on their GIST [101] feature representations.

• Visual Similarity Distribution Histogram (VSDH): Describes inter-image similarity at a fine level of granularity. It evaluates the image distribution with a set of values by quantizing the similarity matrix of the image pairs in an event. The visual similarity matrix $S$ is obtained by calculating pairwise image similarity in a news event. The visual similarity is also computed based on GIST [101] feature representations. The similarity matrix $S$ is then quantized into an $H$-bin histogram by mapping each element of the matrix into its corresponding bin, which results in a feature vector of $H$ dimensions representing the similarity relations among images,

$$\mathrm{VSDH}(h) = \big|\{(j,k) \mid j,k \le M,\; m_{j,k} \in h\text{-th bin}\}\big|, \quad h = 1, \dots, H, \tag{2.14}$$

where $m_{j,k}$ is the $(j,k)$-th element of $S$.

• Visual Diversity Score (VDS): Measures the visual difference of the image distribution. First, images are ranked according to their popularity on social media, based on the assumption that popular images better represent the news event. Then, the diversity of an image is defined as its minimal difference from the images ranked before it in the entire image set [161]. Finally, the visual diversity score is calculated as a weighted average of dissimilarity over all images, where top-ranked images have larger weights [35],

$$\mathrm{VDS} = \sum_{j=1}^{M} \frac{1}{j} \sum_{k=1}^{j} \big(1 - \mathrm{sim}(i_j, i_k)\big). \tag{2.15}$$

• Visual Clustering Score: Evaluates the image distribution over all images in the news event from a clustering perspective. Representative clustering methods such as the hierarchical agglomerative clustering (HAC) algorithm [66] can be utilized to obtain the image clusters.
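Several of the visual content features above can be computed directly from visual-word frequencies and a pairwise similarity matrix. The sketch below is illustrative only and rests on several assumptions: similarities lie in [0, 1], the histogram uses equal-width bins, the rows of `S` are already ordered by image popularity for the diversity score, and the diversity score takes the weighted form sum over j of (1/j) times the sum over k up to j of (1 - sim(i_j, i_k)). The function names and the smoothing constant `eps` are hypothetical. The clustering score is not covered here.

```python
import math

def visual_clarity(p_collection, p_event, eps=1e-12):
    """KL divergence D(p(w|c) || p(w|k)) over a shared visual-word vocabulary.

    eps guards against division by zero when a word is absent from the
    event set (an assumed smoothing choice).
    """
    return sum(pc * math.log(pc / max(pk, eps))
               for pc, pk in zip(p_collection, p_event) if pc > 0)

def visual_coherence(S):
    """Mean similarity over all ordered image pairs (j, k) with j != k."""
    M = len(S)
    total = sum(S[j][k] for j in range(M) for k in range(M) if j != k)
    return total / (M * (M - 1))

def similarity_histogram(S, H):
    """Count the entries of S falling into each of H equal-width bins on [0, 1]."""
    counts = [0] * H
    for row in S:
        for s in row:
            counts[min(int(s * H), H - 1)] += 1  # s == 1.0 lands in the last bin
    return counts

def visual_diversity(S):
    """Rank-weighted dissimilarity; row j of S is the image at popularity rank j + 1."""
    return sum(sum(1.0 - S[j][k] for k in range(j + 1)) / (j + 1)
               for j in range(len(S)))

# Toy similarity matrix for three images (made-up values).
S = [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.5],
     [0.3, 0.5, 1.0]]
coherence = visual_coherence(S)
histogram = similarity_histogram(S, H=5)
diversity = visual_diversity(S)
```

In a real pipeline the matrix `S` would come from a similarity measure over image descriptors such as GIST, and the word distributions from quantized SIFT or SURF features; both steps are outside the scope of this sketch.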

2.2.3 NEURAL VISUAL FEATURES

Multi-layer neural networks have been widely used for learning image feature representations. Specifically, the specially designed architecture of CNNs is very powerful in extracting visual features from images, which can be used for various tasks [143, 162]. VGG 16 is one of the state-of-the-art CNNs (see Figure 2.3) for learning neural visual representations [143]. It is composed of three basic types of layers: convolutional layers for extracting translation-invariant features from images, pooling layers for reducing the number of parameters, and fully connected layers for classification tasks. To prevent CNNs from over-fitting and to ease the training of deep CNNs, dropout layers [145] and residual layers [50] are introduced to CNN structures. Recent work that uses images for fake news detection has adopted the VGG model [57, 143] to extract neural visual features [165].

[Figure 2.3 diagram. Feature-map sizes from input to output: 224 × 224 × 3, 224 × 224 × 64, 112 × 112 × 128, 56 × 56 × 256, 28 × 28 × 512, 14 × 14 × 512, 7 × 7 × 512, 1 × 1 × 4096, 1 × 1 × 1000. Legend: Convolution + ReLU, Maxpooling, Fully Connected + ReLU, Softmax.]

Figure 2.3: The illustration of VGG 16 framework for learning neural image features.
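The feature-map sizes shown in Figure 2.3 follow from the layer arithmetic of VGG 16: 3 × 3 convolutions with padding preserve the spatial size, and each 2 × 2 max-pooling layer halves it. The short sketch below traces these shapes without loading any model; the names `VGG16_BLOCKS` and `vgg16_feature_shapes` are illustrative, and in practice a pretrained network from a deep learning library would be used to extract the actual features.

```python
# (output channels, number of 3x3 conv layers) for the five convolutional blocks
VGG16_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]
NUM_FC_LAYERS = 3  # two 4096-unit layers and the final 1000-way classifier

def vgg16_feature_shapes(height=224, width=224):
    """Return the (H, W, C) shapes of the conv blocks plus the final pooled map."""
    shapes = []
    h, w = height, width
    for channels, _num_convs in VGG16_BLOCKS:
        shapes.append((h, w, channels))  # padded 3x3 convs keep H and W unchanged
        h, w = h // 2, w // 2            # 2x2 max-pooling halves H and W
    shapes.append((h, w, VGG16_BLOCKS[-1][0]))  # map after the last pooling layer
    return shapes

def vgg16_weight_layers():
    """Number of layers with learnable weights (convolutional + fully connected)."""
    return sum(n for _channels, n in VGG16_BLOCKS) + NUM_FC_LAYERS

shapes = vgg16_feature_shapes()
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]  # input size of the first FC layer
```

The final 7 × 7 × 512 map is flattened before the fully connected layers; the 4096-dimensional activations of those layers are a common choice of neural visual feature.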

2.3 STYLE FEATURES

Fake news publishers often have malicious intent to spread distorted and misleading information

and inﬂuence large communities of consumers, requiring particular writing styles necessary to