Wikipedia:Projekt IMAGINATION/Informationen über den zweiten Workshop

Information about the second workshop

User scenarios for Wikipedia users

Short description of the users

The Wikimedia Commons archive contains a large number of media files (2,895,379 at the time of writing). These images are used to illustrate Wikipedia articles and can be uploaded by Wikipedia users. In addition, the complete contents of an external archive are sometimes integrated into Wikimedia Commons.
Images are available in different resolutions, e.g. small sizes for overview pages (thumbnails) and larger sizes for detailed views (layout images) or for download. Downloadable images can carry different usage rights, e.g. free for private use only or free for general use. Each image is also linked to an owner, namely the user who uploaded it. This makes it possible to navigate to that user's page, where additional information about the user and further images may be found. Finally, images can be assigned to categories such as ‘nature’ or ‘science’, which provides a kind of image set functionality.
The well-known approach to image annotation and search is tag-based: images are annotated with tags (i.e. words) that describe them, and an image searcher enters tags as a query. As with Google or other information retrieval systems, the matching images are then returned and displayed to the user.
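The tag-based approach described above can be illustrated with a minimal sketch. It is not how Wikimedia Commons or ImageNotion implement search; the function names, the file names, and the choice to require every query tag to match are assumptions made for illustration.

```python
from collections import defaultdict


def build_tag_index(annotations):
    """Map each lower-cased tag to the set of image names that carry it."""
    index = defaultdict(set)
    for image, tags in annotations.items():
        for tag in tags:
            index[tag.lower()].add(image)
    return index


def search(index, query_tags):
    """Return the images that carry every query tag, sorted by name."""
    matches = [index.get(tag.lower(), set()) for tag in query_tags]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Hypothetical annotations: image name -> tags entered by annotators.
annotations = {
    "eastwood_portrait.jpg": ["Clint Eastwood", "actor", "portrait"],
    "forest.jpg": ["nature", "trees"],
    "eastwood_on_set.jpg": ["Clint Eastwood", "film set"],
}
index = build_tag_index(annotations)
print(search(index, ["clint eastwood"]))
print(search(index, ["Clint Eastwood", "portrait"]))
```

Because tags are plain words, a query only matches images whose annotators happened to use the same words, which is the limitation that semantic annotations are meant to address.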
Text-based annotation of images in Wikimedia is collaborative, which means that everyone can participate in the annotation process (based on Web 2.0 techniques). Wikipedia users are generally very familiar with the Internet and with the state of the art in image annotation and search described above.
The ImageNotion system may provide the following benefits for this user group. So far, only complete images are annotated, so interesting parts of an image cannot carry annotations of their own. In ImageNotion, image parts can be annotated as well. Moreover, ImageNotion uses semantic annotations, which may improve the quality of image search and may open up new ways of navigating the Wikimedia Commons archive.
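To make the difference between whole-image and image-part annotations concrete, here is a small sketch of one possible data model. The class names, the rectangular region shape, and the example coordinates are all hypothetical; the source does not describe ImageNotion's internal representation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in pixel coordinates (assumed shape of an image part)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass(frozen=True)
class Annotation:
    concept: str                     # identifier of a semantic concept
    region: Optional[Region] = None  # None means the annotation covers the whole image


@dataclass
class AnnotatedImage:
    name: str
    annotations: List[Annotation] = field(default_factory=list)

    def concepts_at(self, px: int, py: int) -> List[str]:
        """Concepts of all annotated image parts that contain the point (px, py)."""
        return [a.concept for a in self.annotations
                if a.region is not None and a.region.contains(px, py)]

    def whole_image_concepts(self) -> List[str]:
        """Concepts attached to the image as a whole."""
        return [a.concept for a in self.annotations if a.region is None]


# Hypothetical example image with one whole-image and two part annotations.
photo = AnnotatedImage("film_set.jpg", [
    Annotation("film set"),
    Annotation("Clint Eastwood", Region(40, 30, 120, 160)),
    Annotation("camera", Region(300, 200, 80, 60)),
])
print(photo.whole_image_concepts())
print(photo.concepts_at(100, 100))
```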

In the following, we assume that some or all of the Wikimedia Commons images are already included in the ImageNotion system. A Wikipedia user can take one of two roles in the system. The first role is that of the annotator, who is highly interested in high-quality image annotations in the Wikimedia archive. To support this role, automated processes should speed up the generation of semantic annotations and of annotations for image parts. Ideally, the system should provide a very high level of automation, because the number of images to annotate is huge.

The other role is that of the image searcher, who wants to find the desired images as quickly as possible and thereby benefits from the semantic technologies used in ImageNotion.

User scenario focusing on image search

Searching for images of all female lead actors who appeared together with Clint Eastwood

Hans would like to create a Wikipedia article about female lead actors who appeared together with Clint Eastwood. To achieve this, he has to search for suitable images in the image archive powered by the ImageNotion system.

To create the article, Hans carries out the following tasks:

Find the female actors: Hans uses the ImageNotion system to get a list of all female actors who appeared together with Clint Eastwood and for whom images are available in the archive. He writes down the list of names so that he can use it directly in his article.
Find an image of each of these female actors: As a next step, Hans finds one image of each of these female actors in the system and stores it locally for use in his Wikipedia article.
Find an image of each female actor together with Clint Eastwood: Hans would also like to show in his article that the female actors really appeared together with Clint Eastwood. To do that, he searches, for each of these female actors, for an image in which she appears together with Clint Eastwood.
Find the names of the movies in which the female actors appeared together with Clint Eastwood: Hans would also like to include, for each female actor, the list of movies in which she appeared together with Clint Eastwood. To do so, he searches for the names of these movies.
Find the categories of the movies: Hans notices that the list of movies is quite long, so he would like to group the movies by category. He therefore uses the system to determine the categories of the movies from the previous list, e.g. to determine that “Dirty Harry” is an action movie.
Find images for the movies: Hans would like to link to other Wikipedia pages that describe these movies in greater detail. He notices that these pages do not contain any movie images. To illustrate the movies, he collects three images for each movie from the image archive and downloads them.
Upload images: Hans has some new images of the movie “Million Dollar Baby” that are not yet in the archive. He uploads these images to the archive, reviews the automatically created annotations, corrects them, and adds new ones manually if necessary.
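The first, third, and fourth search tasks above are the kind of query that semantic annotations make possible. The following sketch shows one way they could be answered over a small set of subject-predicate-object statements. The predicate names (`acted_in`, `gender`, `depicts`), the helper functions, and the image file names are assumptions made for illustration, and the statements are a tiny hand-picked sample rather than archive data; the source does not specify ImageNotion's query interface.

```python
# Illustrative statements (subject, predicate, object); image names are hypothetical.
TRIPLES = {
    ("Clint Eastwood", "acted_in", "The Gauntlet"),
    ("Sondra Locke", "acted_in", "The Gauntlet"),
    ("Clint Eastwood", "acted_in", "The Bridges of Madison County"),
    ("Meryl Streep", "acted_in", "The Bridges of Madison County"),
    ("Clint Eastwood", "acted_in", "Unforgiven"),
    ("Morgan Freeman", "acted_in", "Unforgiven"),
    ("Clint Eastwood", "gender", "male"),
    ("Morgan Freeman", "gender", "male"),
    ("Sondra Locke", "gender", "female"),
    ("Meryl Streep", "gender", "female"),
    ("img_001.jpg", "depicts", "Sondra Locke"),
    ("img_001.jpg", "depicts", "Clint Eastwood"),
    ("img_002.jpg", "depicts", "Sondra Locke"),
    ("img_003.jpg", "depicts", "Meryl Streep"),
}


def subjects(predicate, obj):
    """All subjects s for which (s, predicate, obj) is stated."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def objects(subject, predicate):
    """All objects o for which (subject, predicate, o) is stated."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def female_costars(actor):
    """Map each female co-star with at least one image to the shared movies."""
    result = {}
    for movie in objects(actor, "acted_in"):
        for person in subjects("acted_in", movie):
            if person == actor:
                continue
            if "female" not in objects(person, "gender"):
                continue
            if not subjects("depicts", person):
                continue  # no image of this person in the archive
            result.setdefault(person, set()).add(movie)
    return result


def images_together(person_a, person_b):
    """Images that depict both persons, sorted by name."""
    return sorted(subjects("depicts", person_a) & subjects("depicts", person_b))


costars = female_costars("Clint Eastwood")
for name in sorted(costars):
    print(name, sorted(costars[name]), images_together(name, "Clint Eastwood"))
```

With purely tag-based annotations, the same questions would require Hans to know the names of the co-stars and movies in advance; here they follow from the relations between the annotated concepts.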

User scenario focusing on image annotation

Upload and annotate images

In this scenario, another Wikipedia user, Karin, has to integrate a larger external image archive into Wikimedia Commons, which is powered by the ImageNotion system. She therefore has to carry out tasks that are normally done by professional image annotators in an image archive.

These tasks are the following:

Select training images: Karin would like to automate the annotation process as much as possible. To reach this goal, she selects training data for the system. Three persons are completely new to the image archive, and the system has not yet been trained to recognize them. She has to select 15 images of each person.
Train the system: Karin trains the system with the 45 previously selected images so that it can detect the faces of the three new persons.
Rate the quality of automated processes: After finishing the training process, Karin uploads the remaining images to the archive. The automated annotation process starts automatically for each uploaded image. Karin checks the created annotations and rates their quality to assess how much work integrating the whole external archive will involve.
Correct automatic annotations: Karin reviews the automatically created annotations. She corrects the incorrect ones and adds missing annotations.
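One way Karin could rate the automated annotations is to compare them with the annotations that remain after her corrections. The sketch below uses precision and recall for this purpose; the source does not say which measure is used, so the metric, the function name, and the placeholder person names are assumptions.

```python
def rate_annotations(automatic, corrected):
    """Compare automatic annotations of one image with the manually corrected set.

    precision: share of automatic annotations that were kept as correct
    recall:    share of the final annotations that the system found by itself
    """
    automatic, corrected = set(automatic), set(corrected)
    kept = automatic & corrected
    precision = len(kept) / len(automatic) if automatic else 1.0
    recall = len(kept) / len(corrected) if corrected else 1.0
    return {
        "precision": precision,
        "recall": recall,
        "to_remove": sorted(automatic - corrected),  # wrong automatic annotations
        "to_add": sorted(corrected - automatic),     # annotations the system missed
    }


# Hypothetical example: the system proposed four persons, three were right,
# and Karin added one person that the system missed.
report = rate_annotations(
    automatic=["Person A", "Person B", "Person C", "Person X"],
    corrected=["Person A", "Person B", "Person C", "Person D"],
)
print(report)
```

The sizes of `to_remove` and `to_add` give a rough per-image estimate of the manual correction effort, which is what Karin needs in order to judge the work involved in integrating the whole external archive.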