Commons:Structured data/Modeling/Depiction

Show your love! Please help! – Work in progress
You can help by working on guidelines and providing examples below. Visit the talk page to discuss this process in general.

Modeling of various types of depiction information in structured data, for files on Wikimedia Commons, is mostly done through the depicts (P180) property and its qualifiers. See Commons:Essential information for a section on different styles of sources.

The Wikidata depicts (P180) property is widely used on Wikidata to describe the content of certain items. It is now available on Wikimedia Commons to give structure to media descriptions, as part of the Structured Data on Commons project. It is used by more than 8 million files on Commons (May 2022).

The importance of depicts

Language and translation

Prior to the introduction of structured data, file information and categories on Commons could only be added in a single language, which was most commonly English. The Wikidata depicts (P180) property, however, can be described in multiple languages. Adding depicts statements to media files therefore transforms Wikimedia Commons into a truly multilingual platform, where readers of any language can find, understand, and use media files.

Searchability, findability, and reuse

Depicts statements are used by MediaSearch to find the most relevant results.

First, MediaSearch runs queries against statements and captions that use Structured Data. Next, it expands the search terms based on Wikidata labels (the most common name that a Wikidata item would be known by). Lastly, it can expand further based on Commons categories and text-based information. As a result, in addition to full-text matching, MediaSearch also includes results that have a depicts statement for any Wikidata entity matching the user's search term, as well as results that have a relevant digital representation of (P6243) statement.

By taking advantage of statements like depicts, MediaSearch works in any language supported by MediaWiki and does not require knowledge of English. You can learn more about how MediaSearch works here.

MediaSearch also powers the visual editor on Wikipedia, allowing more image results, in more languages, to show up to illustrate Wikipedia articles.

Depicts are also used by tools that help users find relevant images for unillustrated Wikipedia articles, a new feature in development. You can learn more about image suggestions here.

Making use of depicts

How to find depicts statements

A gif representing the tab where Structured Data appears on a Commons file, how to add a depicts statement, and how to mark a statement as prominent

The depicts statements on Wikimedia Commons are available on the page for the media file, but on a secondary tab, the Structured data tab, which can be found beside the File information tab.

Depicts statements are also used to suggest images in MediaSearch on Wikimedia Commons. This search works by ranking images according to certain structured data, such as depicts statements, added to the files. Through the Structured Data Across Wikimedia initiative, depicts statements will also be used to suggest images for Wikipedia articles in various languages, in the image recommendations project.

How to add depicts statements

Currently, the main way depicts statements can be added to Wikimedia Commons media files is through the media file's main page, on the Structured data tab, but there are some other means.

Users that upload to Wikimedia Commons using Upload Wizard are also led to insert depicts statements (the only Structured Data information requested) into their files to help identify what is portrayed in the media.

Tools with depicts capabilities are also available, like the ISA tool, AC/DC, SDC, Image Annotator, and QuickStatements. Learn how to use them on this page.

More advanced users are also adding depicts statements with bots, through a Pywikibot script.
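As a rough illustration of what such a bot edit involves, the sketch below builds the form parameters for a Wikibase `wbcreateclaim` request that adds a depicts (P180) statement to a file. It is not the Pywikibot script mentioned above: the helper name `depicts_claim_params` is hypothetical, the token is a placeholder, and the sketch assumes the file's structured data entity ID is "M" followed by the file's page ID. Nothing is sent over the network.

```python
import json


def depicts_claim_params(page_id, item_qid, csrf_token):
    """Build form parameters for a wbcreateclaim request adding depicts (P180).

    Hypothetical helper: it only assembles the parameters; sending them
    (as a POST with a valid CSRF token) is left to the caller.
    """
    if not item_qid.startswith("Q") or not item_qid[1:].isdigit():
        raise ValueError(f"not a Wikidata item ID: {item_qid!r}")
    # The value of an item-valued statement is passed as a JSON string.
    value = {"entity-type": "item", "numeric-id": int(item_qid[1:])}
    return {
        "action": "wbcreateclaim",
        "format": "json",
        "entity": f"M{page_id}",  # assumption: MediaInfo ID is "M" + page ID
        "property": "P180",       # depicts
        "snaktype": "value",
        "value": json.dumps(value),
        "token": csrf_token,      # placeholder, obtained separately
    }


# Example: parameters for adding dog (Q144) to a file with page ID 12345
# (the page ID is made up for illustration).
params = depicts_claim_params(12345, "Q144", "PLACEHOLDER-TOKEN")
```

Building the parameters separately from sending them makes it easy to review a batch of edits before any of them is saved.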

How to search depicts statements and their qualifiers

There are three ways to search depicts statements on Wikimedia Commons: using the usual Structured data interface in every Commons file (or Structured data tab), using the MediaSearch, or through Wikimedia Commons Query Service (WCQS). All these alternatives allow searches with depicts and qualifiers. The MediaSearch also allows you to search depicts statements on specific Commons categories.

Find more details about how to search Structured Data on Commons here and how to use the Wikimedia Commons Query Service here.

Every Structured Data on Commons edit, including depicts (P180) added to media files, will take a few days to be displayed on the Wikimedia Commons Query Service, as this tool is still being developed.
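For readers who want to try the query service, here is a minimal sketch that only assembles the text of a SPARQL query for files with a given depicts value, optionally restricted by one qualifier. The function name `depicts_sparql` is hypothetical, and the sketch assumes the usual Wikibase prefixes (`wd:`, `wdt:`, `p:`, `ps:`, `pq:`) are predefined by WCQS; it does not run the query.

```python
def depicts_sparql(item_qid, qualifier=None, limit=10):
    """Return SPARQL text selecting files whose depicts (P180) is item_qid.

    qualifier, if given, is a (property, item) pair such as ("P462", "Q23445").
    Assumes the standard Wikibase prefixes are predefined by the service.
    """
    if qualifier is None:
        # Simple form: match the statement value directly.
        pattern = f"?file wdt:P180 wd:{item_qid} ."
    else:
        # Full statement form, needed to reach the qualifier.
        q_prop, q_value = qualifier
        pattern = (
            "?file p:P180 ?statement .\n"
            f"  ?statement ps:P180 wd:{item_qid} ;\n"
            f"             pq:{q_prop} wd:{q_value} ."
        )
    return f"SELECT ?file WHERE {{\n  {pattern}\n}}\nLIMIT {limit}"


# Files depicting a house cat (Q146) with the color (P462) qualifier black (Q23445).
print(depicts_sparql("Q146", ("P462", "Q23445")))
```

The resulting text can be pasted into the query service interface. Because of the delay described above, recent edits may not appear in the results.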

Suggested best practices

Level of detail

The primary purpose of depicts (P180) statements on Commons is to identify, in a structured way, the items clearly visible in a media file.

Wikidata has many entities at varying levels of detail and MediaSearch can't reliably infer relationships between them. Therefore, to optimize a media file for discovery, you should add multiple depicts (P180) statements, both general and specific. For example, this image of the fictional dog Lassie should have the statements dog (Q144), Rough Collie (Q38650), and Lassie (Q941640). To differentiate the level of importance of each statement, you should mark the most significant element as prominent and apply qualifiers (see both sections below).

Another example is this image of a Sequoia tree, which should have both Sequoia (Q1975652) and tree (Q10884) depicts statements. Even though a Sequoia is a tree, in the Wikidata ontology Sequoia is considered a subclass of taxon (Q16521). For that reason, a search for tree alone will not find media files whose only depicts statement is Sequoia (Q1975652). In other words, MediaSearch currently looks only at the Wikidata items directly listed as depicted on the file itself; it cannot make logical inferences, such as the fact that depicting a Sequoia entails depicting a tree. This redundant tagging is disputed, so please don't do it on a large scale.
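The effect of direct matching can be pictured with a toy model. The sketch below is not how MediaSearch is implemented; it only mimics the behavior described above, where a file matches when the searched item is listed directly in its depicts statements. The file names are made up for illustration.

```python
# Toy index: file name -> set of items listed in its depicts (P180) statements.
# File names are hypothetical; the item IDs are the ones used in the text.
FILES = {
    "Lassie.jpg": {"Q144", "Q38650", "Q941640"},   # dog, Rough Collie, Lassie
    "Sequoia_only.jpg": {"Q1975652"},              # Sequoia
    "Sequoia_and_tree.jpg": {"Q1975652", "Q10884"},  # Sequoia, tree
}


def direct_match(files, item_qid):
    """Return files that list item_qid directly; no subclass inference is done."""
    return sorted(name for name, depicted in files.items() if item_qid in depicted)


# Searching for tree (Q10884) misses the file tagged only with Sequoia.
print(direct_match(FILES, "Q10884"))
# Searching for dog (Q144) finds Lassie because the general item was added too.
print(direct_match(FILES, "Q144"))
```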

To clarify, it's also very important to understand that Structured data on Commons does not function like the Commons category system, currently does not intend to, and does not have the same clear hierarchy and division. Structured data in search, and the functionality of the search engine, are still very incomplete and not even near the usability of the category system, which itself is quite hard to use.

Mark as prominent

This section is not up to date with the new modeling guidelines. New examples should be added.

Pale Blue Dot (Q474472), photograph of Earth taken in 1990, by Voyager 1, from about 6 billion kilometers

In the depicts (P180) statement, it’s also possible to identify the main features of the image using the ‘mark as prominent’ option. This is particularly important if many depicts statements have been added to a file.

Multiple depicts statements make it hard to know which statements are the most important or relevant ones in a file. Statements essentially only have two states: something is in the file, or it is not. There is no further detail about just how relevant something is in that file.

To choose which depicts statement to 'mark as prominent', you should ask yourself:

Are the depicts statements equally important, or is one of them the obvious subject and the other a less relevant background detail? If so, which? Is a depicts statement on one file more prominent than the same depicts statement on another?

Consider the Pale Blue Dot photographs: even though Earth makes up less than a pixel in the image set, it's a significant feature of the images. The 'mark as prominent' feature of depicts (P180) statements is provided to address some of these issues.

The Sessão do Conselho de Estado (Q43485263) painting is also a good example of the use of 'mark as prominent'. There are several characters depicted in the work, most of them arranged in a similar way, but the main subject and focus is the female character, Maria Leopoldina of Austria (Q84239), who is also represented in a different color.

Sessão do Conselho de Estado (Q43485263), a Brazilian painting by Georgina de Albuquerque (Q2855284)

It's important to differentiate between the 'mark as prominent' feature, in the depicts (P180) section, and main subject (P921), which is also used by MediaSearch to rank the media files displayed first in searches on Commons. Sometimes both refer to the same element: Mona Lisa (Q11879536), the person, is both the main subject of Mona Lisa (Q12418), the painting, and the most prominent element depicted in it. In other cases they differ, as in the Sessão do Conselho de Estado (Q43485263) painting, in which the main element is the woman, Maria Leopoldina of Austria (Q84239), but the main subject is the Independence of Brazil (Q1548600).

As a feature, 'mark as prominent' enhances the accessibility of media files for people with visual disabilities, as it's a structured way to differentiate between elements displayed in an image. This matters especially because not all media files on Wikimedia Commons have a Wikidata item (or are notable enough to have one) through which they could be described in a structured and multilingual way on that platform.

'Mark as prominent' is used in the edit interface on the Structured data tab for each file and in some tools, like the ISA tool. It can also be added through bots, but it's not available in other upload tools, such as Pattypan or OpenRefine.
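For bot authors, the sketch below shows one way a depicts statement might be represented as Wikibase statement JSON. It assumes that 'mark as prominent' corresponds to the Wikibase "preferred" rank, with other statements keeping the "normal" rank; the helper name `depicts_statement` is hypothetical, and the sketch only builds the data without saving it anywhere.

```python
def depicts_statement(item_qid, prominent=False):
    """Build statement JSON for depicts (P180) pointing at item_qid.

    Assumption: a statement marked as prominent carries the "preferred"
    rank, while all other statements carry the "normal" rank.
    """
    numeric_id = int(item_qid[1:])  # "Q144" -> 144
    return {
        "type": "statement",
        "rank": "preferred" if prominent else "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": "P180",
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {
                    "entity-type": "item",
                    "numeric-id": numeric_id,
                    "id": item_qid,
                },
            },
        },
    }


# The Lassie example: three statements, with the most specific one prominent.
statements = [
    depicts_statement("Q144"),                     # dog
    depicts_statement("Q38650"),                   # Rough Collie
    depicts_statement("Q941640", prominent=True),  # Lassie
]
```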

Works of art

If a file depicts a work of art for which we have an item on Wikidata, we can link to it. Regardless of the type of work (2D or 3D) or how prominent it is in the image (from hiding somewhere in a corner to an exact reproduction of a 2D work), we always add a depicts (P180) statement pointing to the item on Wikidata. Depending on the type of artwork and how it's depicted, more statements might be added. This gives a scale from almost minimal inclusion (de minimis) to exact reproduction (not an original work).

Main subject work of art

If the main subject of an image is a work of art for which we have an item on Wikidata, main subject (P921) is used (together with the same depicts (P180) statement). This can be used for any type of visual artwork. If the file uses {{Artwork}} (or one of the related templates), then the information from Wikidata about the work of art will be shown.

Digital representation of work of art

If the image is a digital representation of a two-dimensional visual artwork and an item for the artwork exists, digital representation of (P6243) is used (together with the same depicts (P180) and main subject (P921) statements). Only do this in cases where the principles of PD-art or PD-scan apply. In other cases main subject (P921) should be used. If the file uses {{Artwork}} (or one of the related templates), then the information from Wikidata about the work of art will be shown.

Any statements in the structured data of the file that apply to the work of art (and not to the file) should be moved to the item on Wikidata.

If the media file is an installation photograph, showing the two-dimensional work of art in context or even alongside other works, treat the work of art as a three-dimensional work and not simply a digital representation. In this case main subject (P921) should be used instead of digital representation of (P6243). The line between a photograph being a digital representation or not is blurry and follows the discussion of what counts as an original work. This is the reason why main subject (P921) is always added alongside digital representation of (P6243).
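The rules in this section can be summarized as a small decision function. This is only a sketch of the guidance above, with a hypothetical function name and two yes/no inputs that an editor would have to judge for each file; it is not part of any Commons tool.

```python
def artwork_properties(is_main_subject, is_faithful_2d_reproduction):
    """Return the properties that should point to the artwork's Wikidata item.

    is_main_subject: the artwork is the main subject of the image.
    is_faithful_2d_reproduction: the file is a digital representation of a
        two-dimensional work where PD-art or PD-scan principles apply.
    """
    properties = ["P180"]  # depicts is always added
    if is_main_subject or is_faithful_2d_reproduction:
        # main subject always accompanies digital representation of
        properties.append("P921")
    if is_faithful_2d_reproduction:
        properties.append("P6243")  # digital representation of
    return properties


# Artwork visible in a corner of the photo: depicts only.
print(artwork_properties(False, False))
# Installation photograph centered on the artwork: depicts and main subject.
print(artwork_properties(True, False))
# Faithful scan of a painting: all three statements.
print(artwork_properties(True, True))
```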

Applies to part

The applies to part (P518) qualifier could help improve ranking, but such qualifiers are currently rarely used on Commons, though they have precedent on Wikidata. For example, on the Wikidata item for Mona Lisa, the depicted elements have 'applies to part' qualifiers that specify foreground or background, which could provide additional signals to the search ranking algorithm if used on Commons.

Other qualifiers

Qualifiers provide context for depicts, making it more specific and helping readers or machines to know which subject on the media file that specific depicts statement is referring to.

While the feature doesn't yet influence search ranking, it's possible to search for a qualifier on MediaSearch using the haswbstatement keyword. For a search to find every file that depicts a black cat, the files should have a depicts statement for house cat (Q146) with a color (P462) qualifier set to the Wikidata item for black (Q23445).

haswbstatement:P180=Q146[P462=Q23445]
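Search terms of this form can also be assembled programmatically. The sketch below builds the term and a MediaWiki search API URL for the File namespace; the helper names are hypothetical, only a single qualifier is handled, and no request is actually sent.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def haswbstatement(prop, value, qualifier=None):
    """Build a haswbstatement search term, optionally with one qualifier.

    qualifier, if given, is a (property, value) pair such as ("P462", "Q23445").
    """
    term = f"haswbstatement:{prop}={value}"
    if qualifier is not None:
        q_prop, q_value = qualifier
        term += f"[{q_prop}={q_value}]"
    return term


def search_url(term, limit=10):
    """Build a search API URL restricted to the File namespace (6)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srnamespace": 6,
        "srlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


# The black cat search from the text above.
black_cats = haswbstatement("P180", "Q146", ("P462", "Q23445"))
print(black_cats)
print(search_url(black_cats))
```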

In the case of a person, the qualifier can be shown with features (P1354) to describe their physical characteristics, such as waist-length hair (Q14130), or even wears (P3828), hair color (P1884), or uses (P2283).

Some relevant qualifiers allowed for depicts

A gif showing the 'depicts' section of a media file using the quantity (P1114), color (P462), shape (P1419), and made from material (P186) qualifiers, in different languages

To know which values can be used as qualifiers without generating a warning, see the section 'allowed qualifiers constraint' within the depicts (P180) property.

Modelling various styles of depicts

Case: A highly-detailed picture with hundreds of structures
Statement: depicts (P180): Paris; Seine
Comment: Even though this picture technically depicts hundreds of structures, only the main ones should be added to the depicts section, and the city of Paris (Q90) should be 'marked as prominent'.
Questions: Should you also add satellite imagery (Q725252)?

Case: A picture of a three-dimensional creative work
Statement: depicts (P180): lion
Comment: The creative work is in the form of a lion (Q140), so that statement should be added as depicts (P180).
Questions: Should you also add figurine (Q1066288), even though that could be modelled as medium?

Case: A picture of a three-dimensional creative work depicting another creative work
Statement: depicts (P180): vase; Dionysus; maenad; satyr
Comment: The artistic work is both the vase, with its own creative elements, and the painting, with its characters. At least vase (Q191851) and Dionysus (Q41680) should be 'marked as prominent'.
Questions: It would also be helpful to clarify the use of depicts (P180) vs. instance of (P31) as used on Wikidata. For this example, we would enter 'vase' as instance of (P31) rather than depicts (P180). All the tags we've been adding refer to the depicted subject matter and not what the object actually is. I can see how it would be helpful to have this on Structured Data on Commons, so should we be adding instance of (P31) items as depicts (P180) statements on Structured Data on Commons?

Case: A picture of a structure depicting a creative work
Statement: depicts (P180): Moses; mural; building
Comment: The Moses (Q1989744) painting, the mural, and the building should all be added to the depicts section, but the painting should be 'marked as prominent'.

Case: A media file using a qualifier as a reference (or a tag)
Statement: depicts (P180): Heracles (qualifier: determination method: Metropolitan Museum of Art Tagging Initiative); nude (qualifier: determination method: Metropolitan Museum of Art Tagging Initiative)
Comment: This media file uses the determination method (P459) property as a qualifier in a way that resembles a reference and that also states its source: it is 'institutional metadata', uploaded through a GLAM partnership. For a more general approach, there's the determined by GLAM institution and stated at its website (Q61848113) option.

Case: Digital representation of a creative work that depicts a notable person
Statement: digital representation of (P6243): Self-portrait. depicts (P180): Vincent van Gogh
Comment: On a digital representation of a creative work, digital representation of (P6243) should link to the Wikidata item for the creative work and depicts (P180) should link to the main features depicted. In the case of this painting, depicts should specifically link to Vincent van Gogh (Q5582), as he's a notable person.
Questions: Should you also add Impressionism (Q40415), even though that could be modelled as the genre?

Case: Creative work that depicts a character that symbolizes an idea, feeling, or another abstract concept
Statement: depicts (P180): woman (qualifier: symbolizes: idleness)
Comment: Some features in creative works are depicted to symbolize or represent certain ideas or feelings, and this data should be added to the 'depicts' section. This is the case for the Idleness (Q19953492) painting, which depicts a woman who represents idleness (Q278781).

Case: A picture that depicts an inscription
Statement: depicts (P180): Shugborough inscription (qualifier: inscription: O U O S V A V V, undetermined language)
Comment: In a media file that depicts an inscription, the exact words or characters of that inscription should be added to the depicts (P180) section, even if they are in an undetermined language, as in the example work.

Case: A video being described using depicts statements and qualifiers
Statement: depicts (P180): solar eclipse; Sun (qualifier: applies to part: background); Moon (qualifier: applies to part: foreground)
Comment: Modeling the 'depicts' for a video can be difficult when the file shows many elements, but it's possible using qualifiers.

Case: A file in which the concept of 'depiction' is not as clear, like an audio file using a depicts (P180) statement
Statement: depicts (P180): wedding music
Comment: This audio file uses wedding music (Q106015533), which is a music genre (Q188451), as a depicts (P180) value.
Questions: Should depicts (P180) have a statement that is a music genre (Q188451), or should it have a statement referring to the file, like sound recording process (Q5057302) or audio file (Q26987229)? For an audio file with lyrics, should it also contain the words?

Examples

A list of files that serve as examples of best practices on how to model various situations.

See also