The Self-focus category: motivation reflected on topical coverage in Wikipedia

Other authors

Rodríguez Hontoria, Horacio

Sallent Ribes, Sebastián

Publication date

2011-04-15

Abstract

“Wikipedia is a free web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation”: this is how the Wikipedia article in the English language edition begins its definition. This means it can be modified at any time, by anyone and from anywhere. These foundations, and the success of its participation model, make Wikipedia an excellent social object of study which, being a technological construct, can also be approached with techniques from natural language processing, information retrieval or data mining. However, current research clearly lacks software that supports an integral approach. Taking this into account, we carry out an in-depth characterization of Wikipedia with the end goal of understanding which elements and structures compose its data and how they can be obtained with an analytical tool. We start from the existing API called wikAPIdia, which we extend with new functionalities until it is ready to use in multiple scenarios and problems in the social sciences. Looking for a practical case to test it, we review the current state of the art on the motivation of editors and on topical coverage in the repository. This leads us to consider the aim of understanding Wikipedia from the perspective that each language edition has a different cultural configuration. Phrased as a question: “is there a national or self-representative motivation which is reflected in the content and thus shapes each edition differently?”. Autoreferentiality is the concept we present in order to analyse this hypothetical higher interest in local content. We identify and collect articles on heterogeneous topics, which may refer to local history, sports teams or pop culture, but still maintain a semantic relation to the context of the editors.
Later, we propose a multidimensional analysis of these articles based on features that may be significant indicators, in order to reach common conclusions and evaluate the language editions through an index of Autoreferentiality. Finally, we point out the impact of this content and the risk of ignoring its existence when designing applications based on user-generated content.

Document Type

Master thesis

Language

English

Publisher

Universitat Politècnica de Catalunya

Rights

http://creativecommons.org/licenses/by-nc-sa/3.0/es/

Open Access

Attribution-NonCommercial-ShareAlike 3.0 Spain