Short-text learning in social media: a review


Social networks occupy a ubiquitous and pervasive place in the lives of their users. The substantial amount of content generated and shared by social networking users offers new research opportunities across a wide variety of disciplines, including media and communication studies, linguistics, sociology, psychology, information and computer sciences, and education. This situation, combined with the continuous growth of social media data, creates an imperative need for content organisation. Thus, large-scale text learning tasks in social environments arise as one of the most relevant problems in machine learning and data mining. Social media data pose several challenges due to their sparse, high-dimensional and large-volume characteristics. This survey reviews the field of social media data learning, focusing on classification and clustering techniques, as they are two of the most frequent learning tasks. It reviews not only new techniques developed to tackle the challenges posed by short texts, but also how traditional techniques can be adapted to overcome such challenges. Finally, open issues and research opportunities for social media data learning are discussed.

The Knowledge Engineering Review, 34, e7.