Workshop on Wikipedia research

The Wikimedia Foundation's wikis - first of all the Wikipedia project to create a free encyclopaedia - play an important role in the success of wikis. Wikipedia is currently the 17th most visited website worldwide, so it is of general public interest to find out more about its structure and processes. Research on Wikipedia is supported by making its data publicly available. Wikipedians reflect a lot on their project and there is a small but growing number of scientific papers in Wikipedia research. Most of the large wikis use the same MediaWiki engine as Wikipedia, so general wiki research can also profit from Wikipedia research. However, it is difficult to analyze Wikipedia without basic knowledge of its particularities and some involvement in the community. This workshop wants to bring together people interested in further Wikipedia research. A short overview of current Wikipedia research will be given, as well as some practical guidelines on how to analyze data and where to get in contact with the community. Together we want to talk about differences and commonalities of Wikipedia and other wikis and about hot topics in Wikipedia research.

Jakob Voss is closely involved in the German Wikipedia project. He got his degree in library and information science with a master's thesis about Wikipedia and maintains a weblog and a bibliography on Wiki(pedia) research.

Angela Beesley will also participate.

Topics

  1. minimal introduction to Wikipedia's structure
  2. short overview of current Wikipedia research
  3. practical guidelines on how to analyze data and where to get in contact with the community
  4. differences and commonalities of Wikipedia and other wikis
  5. hot topics in Wikipedia research

A critical Review of Wikipedia Research

Introduction

I must admit that there is probably a bias towards research in German compared to other languages because it is my native language, but as the German Wikipedia is the second largest, there is probably also the second highest number of publications about Wikipedia in German. If I missed essential works in Icelandic or Nauruan, please let me know.

Motivation and scope of this review

  • ...
  • different research traditions with a great deal of variability in vocabulary, methods and goals.
  • a high rate of uncitedness among papers in Wikipedia research
  • ...

Where:

  • In Wikis: Mostly (but not only) research within the community
  • Wikimania: Mostly researchers from inside
  • WikiSym: Mostly researchers from outside
  • Other conferences: Even more outside

Wikipedia

...

Definition of Wikipedia Research

Research is a very common term with differing meanings and emphases. As noted by Cormac Lawler[1], there are three meanings of research in the context of Wikipedia:

  • Primary research on Wikimedia projects (analysing the content and processes of Wikimedia projects and possibly other wikis too)
  • Secondary research using Wikimedia projects (e.g. using Wikipedia as an academic resource)
  • Research and development of Wikimedia projects (technical aspects and management)[2]

In this review I will focus on the first meaning. However, there can be papers about how people use Wikipedia as an academic resource, and primary research on Wikipedia can also help or be part of research and development of Wikimedia projects. I will also subsume research on other Wikimedia projects (Wikibooks, Wikinews, Wikisource etc.) under Wikipedia research, simply because there is no specific research on single other Wikimedia projects and because, despite significant differences, all Wikimedia projects arose from Wikipedia to hold content that is not appropriate for an encyclopaedia (tutorials, news, original sources etc.).

...so what is "analysing the content and processes of Wikimedia projects"?

Where to find Wikipedia Research

Besides the different backgrounds of researchers, there is a vast diversity of publication types. I found relevant content on wiki and Wikipedia research in:

  • articles in magazines and newspapers (journalism)
  • master's theses and student research projects. In Germany there are already six finished master's or diploma theses about Wikipedia, and at least the same number is being worked on.
  • weblog entries and personal essays
  • papers in journals, conference proceedings, and anthologies
  • books
  • technical reports
  • and last but not least on many wiki pages
    • Category:Research at meta
    • de:Wikipedia:Wikipedistik

Some PhD theses are also in progress, but a PhD normally takes at least three years and Wikipedia research did not start before 2004. Some may argue that only peer-reviewed journals allow genuine research and that nothing else can be considered a scientific publication. But the ease of publishing and sharing information on the web has radically changed the way new information is distributed. To be fair, you should admit that the quality of research depends more on the content, and that publication types are more or less exchangeable. Especially until 2004 there was no ... in traditional ... but wiki pages - and that's where you still find some ... statements ... that are now presented as new in scientific journals.

But you also find immature suppositions, surveys without statistical significance, or just nonsense. Such deficits in information quality also occur in peer-reviewed papers, but maybe less often (this is another assumption without statistical significance, by the way).

In traditional research, peer review is meant to ensure quality. However, there are some intrinsic problems with peer review, especially in the context of Wikipedia.

  • Can only distinguish between
  • There is no Wikipedia research as a discipline so there are no peers able to judge adequately
  • In Wikipedia

Interdisciplinarity. Low quality in general.

Critics say that peer review can

However, the number and diversity of peers is very low

WHERE: conferences (for instance the Hawaii International Conference on System Sciences?) but above all Wikimania (xx) and WikiSym. There is no dedicated journal of WR - if there were, it would probably be some kind of open peer-reviewed wiki. Publications are ... in WRB. Discussion: mailing lists (....), weblogs (reagle, wikipedistik, wikimetrics) and wiki pages.

Topics

see below

  • Where and how to use Wikis and Wikipedia
    • Teaching & Learning, e-Learning, Wikis and Wikipedia in schools etc.
    • Knowledge Management, Knowledge Sharing...

Early papers to mention are

  • (Aronsson, 2002) about Wiki
  • (Ciffolilli, 2003) about Wikipedia (first paper in December 2003) => 2.5 years of Wikipedia research.

Methods

...

Hot topics, evaluation?

...

References

  1. http://meta.wikimedia.org/w/index.php?title=Wikimania_2006/Program/Research_thingy&oldid=317854, 29 March 2006
  2. This was the main goal of the "Wikimedia Research Network" that was started in May 2005: http://meta.wikimedia.org/wiki/Wikimedia_Research_Network

Topics

Coming from a broad spectrum of disciplines, Wikipedia research treats many topics. However, you can find three broad foci, namely content, users and impact ("cui"). Works that deal with Wikipedia's content especially look at its quality and structure, works that deal with Wikipedia's users especially look at their motivation and collaboration, and works that deal with Wikipedia's impact look at how it is influenced by and influences other systems - for instance by means of comparisons with other encyclopaedias, or explanations of its success.


Content

Quality (OK)

  • Criticism
    • Sanger (2004)
    • Lanier (2006): Digital Maoism
  • Brändle (2005)
  • Lih (2004): reliable sources?
  • Giles (2005): Nature study
  • c't-test
  • Denning (2005): Wikipedia risks
  • Meyer (2006): Defense and Illustration of Wikipedia (was declared dead)
  • In Anthony et al (2005) edits are assumed to be of good or bad quality depending on whether they are retained in an article or not (see below).
  • Stvilia et al. (2005): discussions in wikipedia ... (also below)

Structure (TODO)

A first overview was given by Voss (2005)...

  • Capocci: Network growth, preferential attachment...

Users (TODO)

  • Ingo Frost compared Wikipedia with ...
  • Schroer: user survey

Collaborative authoring: Multiauthorship in scientific publishing is ... but in Wikipedia different... The ... of author and reader is discussed in several places; for instance, Miller (2005) writes about the ... but the paper does not bring any new results.

The number of edits is unevenly distributed. Confirmed with... A division into strongly committed users and passer-by contributors is also ... Lotka, but with a smooth transition!
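
How uneven the distribution of edits is can be checked with a few lines of code. The following is only a minimal sketch: the function names `top_share` and `lotka_exponent` and the toy edit counts are assumptions made for illustration and are not taken from any of the cited studies or from a Wikipedia dump. It computes the share of edits made by the most active users and a rough log-log estimate of the exponent in a Lotka-style distribution, where the number of users with n edits is proportional to n to the power of minus a.

```python
from collections import Counter
import math


def top_share(edit_counts, fraction=0.1):
    """Share of all edits made by the most active `fraction` of users."""
    ranked = sorted(edit_counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


def lotka_exponent(edit_counts):
    """Rough exponent estimate: least-squares slope of
    log(number of users) against log(edits per user), sign flipped.
    Needs at least two distinct edit counts."""
    freq = Counter(edit_counts)  # edits per user -> number of such users
    xs = [math.log(n) for n in freq]
    ys = [math.log(c) for c in freq.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Hypothetical toy data: one entry per user, value = number of edits.
edits = [1] * 100 + [2] * 25 + [4] * 6 + [10] * 1

print(top_share(edits, 0.1))   # 0.25
print(lotka_exponent(edits))   # close to 2 for this toy data
```

A simple regression like this is crude, and a smooth transition between passer-by and committed users would show up as a continuous curve rather than as two separate groups.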


In the Dutch and French Wikipedia, Anthony et al (2005) find that contributions are more likely to be retained in an article with an increasing number of contributions for registered users, and with a decreasing number of contributions for anonymous contributors. They conclude that anonymous "Good Samaritans" who rarely contribute, as well as committed experts ("Zealots"), contribute high-quality content to Wikipedia because their edits remain. However, the role of vandalism is not discussed in the paper, so you could question the method and the definition of quality.
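
To make the idea of retention as a proxy for quality concrete, here is a minimal sketch. It is not the measure used by Anthony et al; the function `retained_fraction` and its bag-of-words comparison are assumptions chosen for brevity. It takes the text before an edit, the text after the edit, and the current text, and returns the fraction of words added by the edit that still occur in the current version.

```python
from collections import Counter


def retained_fraction(before, after, current):
    """Fraction of the words added by an edit (before -> after)
    that still appear in the current revision.
    Returns None if the edit added no words."""
    # Multiset difference: words that the edit introduced.
    added = Counter(after.split()) - Counter(before.split())
    if not added:
        return None
    # Multiset intersection: added words that survive in `current`.
    surviving = added & Counter(current.split())
    return sum(surviving.values()) / sum(added.values())


# Hypothetical example revisions.
before = "Wikipedia is an encyclopaedia"
after = "Wikipedia is a free encyclopaedia anyone can edit"
current = "Wikipedia is a free encyclopaedia"

print(retained_fraction(before, after, current))  # 0.4
```

A score like this only says whether text survived, not whether it is correct, so vandalism that nobody reverted would count as retained content.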


Impact:

The impact of Wikipedia on traditional publishing and information services is treated by Kuhlen (2005)

  • comparison with other reference works
  • usage of Wikipedia and its impact: Lih
  • teaching & learning
  • history
  • ethics
  • Technical issues
  • ...

Wikipedia as a corpus. Especially in information retrieval, Wikipedia is also used as an instrument and data source for other research. Ahn et al. (2004) test automatic question answering with Wikipedia in a track at the Text Retrieval Conference (TREC). Sigurbjörnsson et al. (2006) present a search engine for structured text with Wikipedia as an example. Named entities can be detected and disambiguated using Wikipedia with a method by Bunescu and Pasca (2006). Strube and Ponzetto (2006) combine categories and article content to compute the semantic relatedness of concepts. They show that Wikipedia can successfully be used as a knowledge base for artificial intelligence and natural language processing. Mahoney (2006) started a new benchmark with large files of Wikipedia data to compare and test data compression algorithms for natural language text.


Most work focuses on Wikipedia only or on broader aspects like collaborative authoring or open content in general. ... But Bibauw (2005) analysed the lexicographical structure of Wiktionary.