4.2. WEAKLY SUPERVISED FAKE NEWS DETECTION 65
seeks to find a subset of news articles that frequently cluster together in different configurations of the tensor decomposition of the previous step. The intuition is that news articles that tend to frequently appear near each other across different rank configurations, while having the same ranking within their latent factors, are more likely to ultimately belong to the same category. The ranking of a news article with respect to a latent factor is derived by simply sorting the coefficients of that latent factor, which correspond to the clustering membership of the news articles in the latent factor.
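As a concrete illustration, the per-factor ranking described above can be computed by sorting each latent factor's coefficients in descending order. The small factor matrix below is a made-up stand-in for the article-mode factor matrix produced by one tensor decomposition:

```python
import numpy as np

# Hypothetical article-mode factor matrix from one tensor decomposition:
# rows = news articles (N = 5), columns = latent factors (k = 3).
A = np.array([
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.1],
    [0.1, 0.7, 0.6],
    [0.2, 0.9, 0.4],
    [0.3, 0.2, 0.8],
])

# Rank the articles within each latent factor by sorting its coefficients
# in descending order; ranks[i, j] is the position of article i in factor j.
order = np.argsort(-A, axis=0)            # article indices, largest coefficient first
ranks = np.empty_like(order)
for j in range(A.shape[1]):
    ranks[order[:, j], j] = np.arange(A.shape[0])

print(ranks)
```

Articles that keep similar rank positions when this is repeated for every decomposition in the ensemble are candidates for the same co-cluster.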
To this end, we can combine the clustering results of each individual tensor decomposition into a collective (news-article by latent-factor) matrix, from which we are going to extract co-clusters of news articles and the corresponding latent factors (coming from the ensemble of decompositions). For example, as shown in Figure 4.5, we can perform the tensor decomposition three times with different ranks 3, 4, and 5, and then construct a collective feature matrix $\mathbf{F}'$. The co-clustering objective with $\ell_1$-norm regularization for the combined matrix $\mathbf{F}'$ [103] is shown as follows:
$$\min_{\mathbf{R}, \mathbf{Q}} \left\| \mathbf{F}' - \mathbf{R}\mathbf{Q}^\top \right\|_F^2 + \left( \|\mathbf{R}\|_1 + \|\mathbf{Q}\|_1 \right), \qquad (4.12)$$

where $\mathbf{R} \in \mathbb{R}^{N \times k}$ is the representation matrix of news articles, $\mathbf{Q} \in \mathbb{R}^{M \times k}$ is the coding matrix, and the term $(\|\mathbf{R}\|_1 + \|\mathbf{Q}\|_1)$ enforces the sparsity constraints.
4.2.3 A PROBABILISTIC GENERATIVE UNSUPERVISED APPROACH
Existing work on fake news detection is mostly based on supervised methods. Although they have shown some promising results, these supervised methods suffer from a critical limitation: they require a reliably pre-annotated dataset to train a classification model. However, obtaining a large number of annotations is time-consuming and labor-intensive, as the process needs careful checking of news contents as well as other additional evidence such as authoritative reports.
The key idea is to extract users' opinions on the news by exploiting the auxiliary information of the users' engagements with the news tweets on social media, and to aggregate their opinions in a well-designed unsupervised way to generate the estimation results [174]. As news propagates, users engage with it through different types of behaviors on social media, such as publishing a news tweet, or liking, forwarding, or replying to a news tweet. This information can, to a certain extent, reflect the users' opinions on the news. For example, Figure 4.6 shows two news tweet examples regarding the aforementioned news. According to the users' tweet contexts, we see that the user in Figure 4.6a disagreed with the authenticity of the news, which may indicate the user's high credibility in identifying fake news. On the other hand, it appears that the user in Figure 4.6b falsely believed the news or intentionally spread the fake news, implying the user's deficiency in the ability to identify fake news. Besides, as for other users who engaged with these tweets, it is likely that the users who liked/retweeted the first tweet also doubted the news, while those who liked/retweeted the second tweet may also have been deceived by the news. The users'