Commons:Machine-readable data


Shortcut: COM:MRD

On Wikimedia Commons, much of the metadata (including the license and the author) is not machine-readable. There is an API module, iiprop=extmetadata, that can be used to retrieve some values, but the results are imperfect because the information is entered as free text on the file description page itself. There are plans to move the metadata into the database, but that will not happen soon.

To compensate, and to ease the transition to more structured data in the future, Wikimedia Commons uses a set of standard templates that have been made machine-readable in certain ways through HTML elements. Some scripts already make use of them. It is worth noting that this data is available to any wiki that uses Wikimedia Commons, where it can be read from the HTML of the File: page, just like other local data.

Machine readable data set by infobox templates

There are several standard infobox templates that tag different elements of the template with different tags so that the information can be parsed. Several styles of tags are used:

  • Microformat tags follow industry standards and can be parsed by existing tools.
  • <td> id attributes (identifiers) are custom markings that allow more complete tagging but have to be read by custom tools. Most universal infoboxes have a two-column structure: column 1 holds the name of the field and column 2 holds the value.
    • Traditionally, <td> id attributes were used to tag the name cell in the first column of a row. To get the data, you need to read the contents of the following <td> cell in the second column.
    • The {{Creator}} and {{Institution}} templates have a more complicated structure, so the cells with the actual data are tagged directly; the original table marks these attributes with a magenta background.
| Template | Template parameter name | Description | <td> id attribute | Microformat | Comment |
| --- | --- | --- | --- | --- | --- |
| {{Information}} | description | description of the file | fileinfotpl_desc | hProduct.description | Often contains multiple languages annotated with {{Lang}}. |
| {{Information}} | date | original creation date of the work | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by {{Date}} template |
| {{Information}} | source | source of the file | fileinfotpl_src | | Often contains entire tables. We have no good way to deal with these source templates yet. Source templates often have references to catalogue IDs, but these are also not machine-readable. |
| {{Information}} | author | author of the file | fileinfotpl_aut | | This can be the author, creator and/or copyright holder, and usage is mixed. Often contains the {{Creator}} template, which is described below. |
| {{Information}} | permission | license/permission of the file | fileinfotpl_perm | | |
| {{Information}} | other versions | other versions of the file | fileinfotpl_ver | | |
| {{Artwork}} | description | description of the artwork | fileinfotpl_desc | hProduct.description | |
| {{Artwork}} | date | original creation date of the artwork | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by {{Date}} template |
| {{Artwork}} | source | source of the file | fileinfotpl_src | | |
| {{Artwork}} | artist | creator of the artwork | fileinfotpl_aut | "hProduct.fn value" | |
| {{Artwork}} | author | author of the artwork | fileinfotpl_aut | "hProduct.fn value" | |
| {{Artwork}} | permission | license/permission of the file and the artwork | fileinfotpl_perm | | |
| {{Artwork}} | other versions | other versions of the file | fileinfotpl_ver | | |
| {{Artwork}} | title | title of the artwork | fileinfotpl_art_title | hProduct.fn | |
| {{Artwork}} | object type | object type of the artwork | fileinfotpl_art_object_type | | |
| {{Artwork}} | medium | technique and medium of the artwork | fileinfotpl_art_medium | | |
| {{Artwork}} | dimensions | dimensions of the artwork | fileinfotpl_art_dimensions | | |
| {{Artwork}} | gallery | institution that holds the artwork | fileinfotpl_art_gallery | | |
| {{Artwork}} | location | location of the artwork within the institution | fileinfotpl_art_location | hProduct.locality | |
| {{Artwork}} | accession number | accession number of the artwork | fileinfotpl_art_id | hProduct.identifier | |
| {{Artwork}} | object history | object history of the artwork | fileinfotpl_art_object_history | | |
| {{Artwork}} | exhibition history | exhibition history of the artwork | fileinfotpl_art_exhibition_history | | |
| {{Artwork}} | credit line | credit line of the artwork | fileinfotpl_art_credit_line | | |
| {{Artwork}} | inscriptions | inscriptions on the artwork | fileinfotpl_art_inscriptions | | |
| {{Artwork}} | notes | notes about the artwork | fileinfotpl_art_notes | | |
| {{Artwork}} | references | references related to the artwork | fileinfotpl_art_references | | |
| {{Book}} | Author | author of the book | fileinfotpl_author | | |
| {{Book}} | Editor | editor of the book | fileinfotpl_book_editor | | |
| {{Book}} | Translator | translator of the book | fileinfotpl_book_translator | | |
| {{Book}} | Illustrator | illustrator of the book | fileinfotpl_book_illustrator | | |
| {{Book}} | Title | title of the book | fileinfotpl_book_title | | |
| {{Book}} | Subtitle | subtitle of the book | fileinfotpl_book_subtitle | | |
| {{Book}} | Series title | title of the book series | fileinfotpl_book_series-title | | |
| {{Book}} | Authority file | authority control data | fileinfotpl_book_authority | | |
| {{Book}} | Publisher | publisher of the book | fileinfotpl_book_publisher | | |
| {{Book}} | Printer | printer of the book | fileinfotpl_book_printer | | |
| {{Book}} | Year of publication | date or year of publication of the book | fileinfotpl_date | | |
| {{Book}} | Place of publication | place or city of publication of the book | fileinfotpl_book_place-of-publication | | |
| {{Book}} | Language | language of the book | fileinfotpl_book_language | | |
| {{Book}} | Description | description of the book | fileinfotpl_desc | | |
| {{Creator}} | Name | name of the creator | creator | vCard.fn | |
| {{Creator}} | Alternative names | alternative names of the creator | fileinfotpl_creator_alt-name_value | vCard.nickname | |
| {{Creator}} | Description | nationality and occupation(s) of the creator | fileinfotpl_creator_desc_value | vCard.note | |
| {{Creator}} | Date of death | date of death of the creator | fileinfotpl_creator_deathdate_value | | |
| {{Creator}} | Date of birth | date of birth of the creator | fileinfotpl_creator_birthdate_value | vCard.bday | |
| {{Creator}} | Location of birth/death | place of death of the creator | fileinfotpl_creator_deathloc_value | | |
| {{Creator}} | Location of birth | place of birth of the creator | fileinfotpl_creator_birthloc_value | | |
| {{Creator}} | Work period | period of activity of the creator | fileinfotpl_creator_work-period_value | | |
| {{Creator}} | Work location | work location of the creator | fileinfotpl_creator_work-location_value | | |
| {{Creator}} | Image | portrait or photo showing the creator | fileinfotpl_creator_image | | |
| {{Creator}} | Authority file | authority control related to the creator | fileinfotpl_creator_authority_value | | |
| {{FileContentsByBot}} | (various) | depends, please check {{FileContentsByBot}} | (various) | hproduct-by-bot | large and still growing data set, please check {{FileContentsByBot}} |
| {{Photograph}} | title | title of the photograph | fileinfotpl_art_title | hProduct.fn | |
| {{Photograph}} | description | description of the photograph | fileinfotpl_desc | hProduct.description | |
| {{Photograph}} | original description | original archival description of the photograph | fileinfotpl_desc | hProduct.description | |
| {{Photograph}} | date | creation date of the original artwork | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by {{Date}} template |
| {{Photograph}} | medium | technique and medium of the photograph | fileinfotpl_art_medium | | |
| {{Photograph}} | dimensions | dimensions of the photograph | fileinfotpl_art_dimensions | | |
| {{Photograph}} | artist | creator of the photograph | fileinfotpl_aut | "hProduct.fn value" | |
| {{Photograph}} | institution | institution that holds the photograph | fileinfotpl_art_gallery | | |
| {{Photograph}} | location | location of the photograph within the institution | fileinfotpl_art_location | hProduct.locality | |
| {{Photograph}} | source | source of the file | fileinfotpl_src | | |
| {{Photograph}} | permission | license/permission of the file and the artwork | fileinfotpl_perm | | |
| {{Photograph}} | other versions | other versions of the file | fileinfotpl_ver | | |
| {{Photograph}} | accession number | accession number of the photograph | | hProduct.identifier | |
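
The traditional layout described above (an id on the label cell, the data in the following <td>) could be read roughly as in the sketch below. It uses only Python's standard html.parser; the class name InfoFieldParser, the helper extract_fields and the sample markup are illustrative assumptions, not part of any Commons tool, and the sketch does not cover templates such as {{Creator}} whose ids sit on the data cell itself.

```python
from html.parser import HTMLParser


class InfoFieldParser(HTMLParser):
    """Map each fileinfotpl_* label cell id to the text of the next <td>."""

    def __init__(self):
        super().__init__()
        self.fields = {}       # label id -> text of the following cell
        self._pending = None   # id of the label cell seen most recently
        self._current = None   # id whose value cell is being read
        self._depth = 0        # <td> nesting depth inside the value cell
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            if tag == "td":
                self._depth += 1   # nested table inside the value cell
            return
        if tag != "td":
            return
        cell_id = dict(attrs).get("id") or ""
        if cell_id.startswith("fileinfotpl_"):
            self._pending = cell_id
        elif self._pending is not None:
            # First <td> after a label cell: treat it as the value cell.
            self._current, self._pending = self._pending, None
            self._depth, self._chunks = 1, []

    def handle_endtag(self, tag):
        if self._current is not None and tag == "td":
            self._depth -= 1
            if self._depth == 0:
                # Collapse whitespace; formatting is deliberately dropped.
                text = " ".join("".join(self._chunks).split())
                self.fields[self._current] = text
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_fields(html):
    parser = InfoFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical, heavily simplified excerpt of a rendered {{Information}} table.
SAMPLE = """
<table>
<tr><td id="fileinfotpl_desc">Description</td><td>A <b>red</b> bridge</td></tr>
<tr><td id="fileinfotpl_aut">Author</td><td>Jane Example</td></tr>
</table>
"""

print(extract_fields(SAMPLE))
```

Real file pages carry far more markup than this sample, so a production reader would probably want a full HTML library and some handling for repeated ids.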

Alternative format for CommonsMetadata

Because the table + id based format proved very hard to add to templates which were not formatted similarly to the Commons information template, CommonsMetadata allows an alternative format, similar to license templates: the whole information template has to be enclosed in a fileinfotpl class and the tag containing the specific information needs to have a fileinfotpl_* class (same names as above, but class, not id).
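
As a rough illustration of the class-based format, the sketch below collects the text of every element whose class starts with fileinfotpl_. It is an assumption-laden example built on Python's standard html.parser: the names ClassFieldParser and extract_class_fields and the sample markup are hypothetical, and the sketch does not check that the elements actually sit inside a fileinfotpl wrapper.

```python
from html.parser import HTMLParser

# Elements without an end tag; they must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassFieldParser(HTMLParser):
    """Collect (class name, text) pairs for classes with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.results = []   # (class name, text), in closing order
        self._open = []     # stack of [class name, depth, text chunks]

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        for entry in self._open:
            entry[1] += 1
        for name in (dict(attrs).get("class") or "").split():
            if name.startswith(self.prefix):
                self._open.append([name, 1, []])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        for entry in self._open:
            entry[1] -= 1
        while self._open and self._open[-1][1] == 0:
            name, _, chunks = self._open.pop()
            self.results.append((name, " ".join("".join(chunks).split())))

    def handle_data(self, data):
        for entry in self._open:
            entry[2].append(data)


def extract_class_fields(html, prefix="fileinfotpl_"):
    parser = ClassFieldParser(prefix)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup for a template that is not laid out as a table.
SAMPLE = """
<div class="fileinfotpl">
  <p>Title: <span class="fileinfotpl_art_title">View of <i>Delft</i></span></p>
  <p>Author: <span class="fileinfotpl_aut">J. Example</span></p>
</div>
"""

print(extract_class_fields(SAMPLE))
```

Returning a list of pairs rather than a dictionary keeps repeated fields, which matters because a class, unlike an id, may occur several times on one page.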

Machine readable data set by license templates

Introduced in October 2010, using classes <span class="licensetpl_XXX">

licensetpl
An element identifying a license. Wraps the entire license code and should be a SINGLE license, not a multi license.
licensetpl_short
Short name of the license: “Public domain”, “CC BY-SA 3.0”, “CC by 2.0 fr”, etc.
licensetpl_long
Long name of the license: “Public domain”, “Creative Commons Attribution-Share Alike 3.0”, etc.
licensetpl_attr_req
Whether attribution is required. “true” or “false”.
licensetpl_attr
The requested attribution: Free text.
licensetpl_link_req
Whether a link to the license is required for this license. “true” or “false”.
licensetpl_link
The link to the license deed. “www.creativecommons.org/licenses/by-sa/XXX/YYY”
licensetpl_nonfree
“true” if this is a non-free license (not used on Commons, only on wikis with an EDP)

Multiple licensetpl blocks for the same work might be wrapped in a block using the class licensetpl_wrapper.
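
A minimal sketch of how these classes might be read, assuming each licensetpl_* element is a leaf element that contains only text. The parser name LicenseParser, the helper extract_licenses and the sample markup are illustrative assumptions; real license templates carry considerably more markup.

```python
from html.parser import HTMLParser

PREFIX = "licensetpl_"
BOOLEAN_KEYS = {"attr_req", "link_req", "nonfree"}


class LicenseParser(HTMLParser):
    """Group licensetpl_* values by their enclosing licensetpl element."""

    def __init__(self):
        super().__init__()
        self.licenses = []   # one dict per licensetpl block
        self._key = None     # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "licensetpl" in classes:
            self.licenses.append({})
        for name in classes:
            if (name.startswith(PREFIX) and name != "licensetpl_wrapper"
                    and self.licenses):
                self._key = name[len(PREFIX):]
                self.licenses[-1][self._key] = ""

    def handle_data(self, data):
        if self._key is not None:
            self.licenses[-1][self._key] += data

    def handle_endtag(self, tag):
        # Assumption: field elements are leaves, so any end tag closes them.
        self._key = None


def extract_licenses(html):
    parser = LicenseParser()
    parser.feed(html)
    parser.close()
    result = []
    for raw in parser.licenses:
        entry = {}
        for key, value in raw.items():
            value = value.strip()
            # "true"/"false" fields become booleans, the rest stay text.
            entry[key] = (value == "true") if key in BOOLEAN_KEYS else value
        result.append(entry)
    return result


# Hypothetical, simplified output of two license templates on one file page.
SAMPLE = """
<div class="licensetpl_wrapper">
<div class="licensetpl">
<span class="licensetpl_short" style="display:none;">CC BY-SA 3.0</span>
<span class="licensetpl_attr_req" style="display:none;">true</span>
<span class="licensetpl_link" style="display:none;">https://creativecommons.org/licenses/by-sa/3.0/</span>
</div>
<div class="licensetpl">
<span class="licensetpl_short" style="display:none;">GFDL</span>
<span class="licensetpl_attr_req" style="display:none;">true</span>
</div>
</div>
"""

for license_info in extract_licenses(SAMPLE):
    print(license_info)
```

Keeping one dictionary per licensetpl block reflects the rule that each block describes a single license, even when several are wrapped together.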

Templates setting this information

  • Templates setting licensetpl include:

{{PD-Layout}}, {{Cc-by-sa-3.0-migrated}}, {{Cc-by-layout}}, {{Cc-by-sa-layout}}, {{Cc-zero}}, {{FAL}}, {{GFDL}}, {{GFDL-1.2}}, {{GPL}} and {{LGPL}}.

Machine readable data set by style formatting templates

Style formatting templates, meant to provide uniform styles to different families of non-license templates, carry machine readable data identifying these families.

| Template | Purpose | Class name |
| --- | --- | --- |
| {{Restriction-Layout}} | used by Restriction tags | restrictiontemplate |
| {{FoP-Layout}} | used by freedom of panorama tags | foptemplate |
| {{Partnership-Layout}} | used by Partnership templates | partnershiptemplate |
| {{Source-Layout}} | used by generic Source templates | sourcetemplate |
| {{Created with}} | used by Created with ... templates | createdwithtemplate |

Machine readable data set by non-copyright restriction templates

Templates regarding non-copyright legal restrictions carry these classes to identify specific types of restrictions.

| Template(s) | Purpose | Class name |
| --- | --- | --- |
| {{Trademarked}} | Trademarked images | restriction-trademarked |
| {{Copydesign}} | Copyrighted designs | restriction-design |
| {{Communist symbol}} | Communist symbols | restriction-communist |
| {{Italy-MiBAC-disclaimer}} {{Soprintendenza}} | Italian cultural goods | restriction-ita-mibac |
| {{Australian Commonwealth reserve}} | Australian reserves | restriction-aus-reserve |
| {{Personality rights}} {{Romania personality rights}} | Personality rights | restriction-personality |
| {{2257}} | Child Protection and Obscenity Enforcement Act warning (United States) | restriction-2257 |
| {{Costume}} | Costuming | restriction-costume |
| {{Fan art}} | Fan art | restriction-fan-art |
| {{Currency}} | Currency | restriction-currency |
| {{IHL Symbol}} | Symbols restricted by International Humanitarian Law | restriction-ihl |
| {{Nazi symbol}} | Nazi and fascist symbols | restriction-nazi |
| {{Insignia}} | Official insignia | restriction-insignia |

Machine readable data set by specific templates

Other templates set further machine-readable data. Here is a non-exhaustive list:

{{Personality rights}}
<span class="commons-template-name" style="display:none" id="commons-template-personality-rights">Personality rights</span>
{{Credit line}}
<td id="fileinfotpl_credit" class="fileinfo-paramfield fileinfotpl_credit" style=""></td>

Machine-readable data set by location templates

{{Location}} and similar templates add machine-readable geocodes in the following format: <span class="geo">12.34;24.68</span> (latitude and longitude as floating-point numbers, separated by a semicolon). The coordinates use the WGS84 system (the same as GPS and most online maps). See Commons:Geocoding for more details.
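
The geocode format can be read with a few lines of code. The sketch below is one possible approach using Python's standard html.parser; GeoParser and extract_coordinates are hypothetical names, and the range check is an added sanity test rather than something the templates guarantee.

```python
from html.parser import HTMLParser


class GeoParser(HTMLParser):
    """Collect (latitude, longitude) pairs from class="geo" elements."""

    def __init__(self):
        super().__init__()
        self.coords = []
        self._in_geo = False
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if "geo" in (dict(attrs).get("class") or "").split():
            self._in_geo = True
            self._text = ""

    def handle_data(self, data):
        if self._in_geo:
            self._text += data

    def handle_endtag(self, tag):
        if not self._in_geo:
            return
        self._in_geo = False
        try:
            # Expected form: "<latitude>;<longitude>" as decimal numbers.
            lat, lon = (float(part) for part in self._text.split(";"))
        except ValueError:
            return   # wrong number of parts or not numeric: skip it
        if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0:
            self.coords.append((lat, lon))


def extract_coordinates(html):
    parser = GeoParser()
    parser.feed(html)
    parser.close()
    return parser.coords


print(extract_coordinates('<span class="geo">12.34;24.68</span>'))
```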

Usage

MediaWiki API

The MediaWiki API now serves a limited amount of metadata through iiprop=extmetadata. A query using this property returns some useful parameters such as Credit, Artist, LicenseUrl and Copyrighted, and is used by Media Viewer, for example.
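
As a sketch of how such a query might be built and its result read, the example below constructs an imageinfo request with iiprop=extmetadata and picks a few fields out of a response. The file title and the sample response are made up for illustration, the helper names are assumptions, and no network request is made.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def extmetadata_url(title):
    """Build an imageinfo query that asks for extmetadata of one file."""
    params = {
        "action": "query",
        "prop": "imageinfo",
        "iiprop": "extmetadata",
        "titles": title,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def pick_fields(response,
                keys=("Artist", "Credit", "LicenseUrl", "Copyrighted")):
    """Return {page title: {key: value}} for the requested extmetadata keys."""
    picked = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for info in page.get("imageinfo", []):
            meta = info.get("extmetadata", {})
            picked[page.get("title")] = {
                key: meta[key]["value"] for key in keys if key in meta
            }
    return picked


# Made-up response in the general shape returned by the API (not real data).
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "12345": {
                "title": "File:Example.jpg",
                "imageinfo": [{
                    "extmetadata": {
                        "Artist": {"value": "Jane Example"},
                        "LicenseUrl": {"value": "https://creativecommons.org/licenses/by-sa/3.0/"},
                        "Copyrighted": {"value": "True"},
                    }
                }],
            }
        }
    }
}

print(extmetadata_url("File:Example.jpg"))
print(pick_fields(SAMPLE_RESPONSE))
```

Values such as Artist are derived from free-text template fields and may contain HTML, so a consumer would likely need to clean them before display.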

Scripts that use machine-readable data

External tools

See also

Defining new machine readable data

  • Do NOT use HTML ids; use classes. An id can only be used once per page, and most of these fields can occur multiple times per page. Consider, for instance, descriptions of derivative works, which can include information about both the original and the derivative.
  • When possible, wrap the actual data, not a field header. Wrapping the header is what all our Information templates have historically done, but it is much harder to support in the long run.
  • Wrap data, not the way the data is formatted.
  • Expect that formatting is lost when converting to data. Visual dress up is not part of the information.
  • Don't wrap multiple units of information inside one field. There is a difference between a publication date and a creation date. Both are dates, but both are different 'data fields'. Also CC BY-SA-4.0-3.0-2.5 is not a license name, those would be 3 licenses with the name CC BY-SA-##.
  • Make sure that the data value has one unit, or outputs one consistent unit.

Problems

There are a few things that are currently not recognizable, or only poorly. These include:

  • Derivative works
  • Works included in works. See also Category:FoP_templates
  • Licenses of derivatives or of works included in works, which are a mess
  • Author vs. copyright holder
  • Usernames vs. 'real names'
  • Catalogue IDs etc.
  • VRTS permissions
  • Publication date vs. creation date