Word embedding has benefited a broad spectrum of text analysis tasks by learning distributed word representations to encode word semantics. Word representations are typically learned by modeling local contexts of words, assuming that words sharing similar surrounding words are semantically close. We argue that local contexts can only partially define word semantics in unsupervised word embedding learning. Global contexts, referring to the broader semantic units, such as the document or paragraph where the word appears, can capture different aspects of word semantics and complement local contexts. We propose two simple yet effective unsupervised word embedding models that jointly model both local and global contexts to learn word representations. We provide theoretical interpretations of the proposed models to demonstrate how local and global contexts are jointly modeled, assuming a generative relationship between words and contexts. We conduct a thorough evaluation on a wide range of benchmark datasets. Our quantitative analysis and case study show that despite their simplicity, our two proposed models achieve superior performance on word similarity and text classification tasks.

Unsupervised word representation learning, or word embedding, has shown remarkable effectiveness in various text analysis tasks, such as named entity recognition (Lample et al., 2016), text classification (Kim, 2014) and machine translation (Cho et al., 2014). Words and phrases, which are originally represented as one-hot vectors, are embedded into a continuous low-dimensional space. Typically, the mapping function is learned based on the assumption that words sharing similar local contexts are semantically close. For instance, the famous word2vec algorithm (Mikolov et al., 2013a,b) learns word representations from each word's local context window (i.e., its surrounding words) so that the local contextual similarity of words is preserved. The Skip-Gram architecture of word2vec uses the center word to predict its local context, and the CBOW architecture uses the local context to predict the center word. GloVe (Pennington et al., 2014) factorizes a global word-word co-occurrence matrix, but the co-occurrence is still defined upon local context windows.

In this paper, we argue that apart from local context, another important type of word context, which we call global context, has been largely ignored by unsupervised word embedding models. Global context refers to the larger semantic unit that a word belongs to, such as a document or a paragraph. While local context reflects the local semantic and syntactic features of a word, global context encodes general semantic and topical properties of words in the document, which complements local context in embedding learning. Neither local context nor global context alone is sufficient for encoding the semantics of a word. For example, Figure 1 is a text snippet from the 20 Newsgroups dataset.
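As a rough illustrative sketch (not the authors' implementation), the Python snippet below contrasts the two kinds of context discussed above: local context pairs generated Skip-Gram-style from a sliding window over surrounding words, and global context pairs that simply associate each word with the document it occurs in. The toy corpus, window size, and pairing scheme are assumptions made here for illustration only.

```python
# Sketch of the two notions of context: local (Skip-Gram-style window pairs)
# vs. global (word-document pairs). Not the paper's models, only an
# illustration of the raw training evidence each context type provides.

def local_context_pairs(tokens, window=2):
    """Yield (center_word, context_word) pairs from a local window."""
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]

def global_context_pairs(doc_id, tokens):
    """Yield (word, document_id) pairs, treating the document as global context."""
    for token in tokens:
        yield token, doc_id

# Toy corpus (hypothetical), standing in for documents such as 20 Newsgroups posts.
corpus = {
    "doc_0": "the team traded its star pitcher before the season".split(),
    "doc_1": "the new graphics card renders the game smoothly".split(),
}

for doc_id, tokens in corpus.items():
    local_pairs = list(local_context_pairs(tokens))
    global_pairs = list(global_context_pairs(doc_id, tokens))
    print(doc_id, local_pairs[:3], global_pairs[:3])
```

The proposed models are described as jointly modeling both signals during training; the snippet only shows how the two context types differ as inputs, with local pairs capturing nearby syntactic and semantic cues and global pairs capturing document-level topical association.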