Detecting Communities with Multiplex Semantics by Distinguishing Background, General, and Specialized Topics

Abstract

Finding semantic communities by jointly using network topology and content is an active topic in community detection. Existing methods often use word attributes indiscriminately to help find communities. Our analysis shows that words in networked content often embody a hierarchical semantic structure: some words reflect a background topic shared by the whole network, some reflect a high-level general topic covering several topically related communities, and some reflect a high-resolution specialized topic that describes a single community. Ignoring this structure leads to poor modeling of networked content, because its deeper semantics are not fully exploited. To solve this problem, we propose a new Bayesian probabilistic model. By distinguishing words drawn from a background topic from those drawn from two-level topics (i.e., general and specialized topics), the model not only makes better use of networked content to help find communities, but also provides a clearer multiplex semantic interpretation of the communities. We then give an efficient variational algorithm for model inference. The superiority of the approach is demonstrated by comparison with ten state-of-the-art methods on nine real networks and an artificial benchmark. A case study further shows its strong ability to provide deep semantic interpretation of communities.
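
To make the three-level word structure concrete, the following is a minimal toy sketch in Python of how a word attached to a node might be drawn from a background, general, or specialized topic. It is not the model or the variational inference proposed in the paper. The vocabularies, the community-to-general-topic mapping, the level probabilities, and the uniform choice within each vocabulary are all hypothetical assumptions made for illustration only.

```python
import random

# Hypothetical toy vocabularies; none of these come from the paper.
BACKGROUND = ["network", "data", "method"]
GENERAL = {"ml": ["learning", "model"], "bio": ["gene", "protein"]}
SPECIALIZED = {
    "vision": ["image", "pixel"],
    "nlp": ["sentence", "token"],
    "genomics": ["sequence", "genome"],
}
# Assumed hierarchy: each community falls under one general topic.
PARENT = {"vision": "ml", "nlp": "ml", "genomics": "bio"}
# Assumed probabilities that a word comes from each semantic level.
LEVEL_PROBS = {"background": 0.3, "general": 0.3, "specialized": 0.4}


def sample_word(community, rng):
    """Draw one (level, word) pair for a node in the given community."""
    levels = list(LEVEL_PROBS)
    weights = [LEVEL_PROBS[level] for level in levels]
    level = rng.choices(levels, weights=weights)[0]
    if level == "background":
        vocab = BACKGROUND                      # shared by the whole network
    elif level == "general":
        vocab = GENERAL[PARENT[community]]      # shared by related communities
    else:
        vocab = SPECIALIZED[community]          # specific to this community
    # Uniform choice within a vocabulary is a simplification for this sketch.
    return level, rng.choice(vocab)


def sample_document(community, n_words, seed=0):
    """Draw n_words (level, word) pairs using a seeded generator."""
    rng = random.Random(seed)
    return [sample_word(community, rng) for _ in range(n_words)]


if __name__ == "__main__":
    for level, word in sample_document("nlp", 5):
        print(level, word)
```

In this sketch, two communities under the same general topic (here "vision" and "nlp") share general-level words but differ in their specialized words, which is the kind of distinction the abstract describes. A model that treats all words indiscriminately would have no such separation.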

Publication
IEEE Transactions on Knowledge and Data Engineering
Pengfei Jiao
Professor