Wikidata:WikiProject Tabular data
Commons tabular data offers an alternative to Wikidata that can be especially useful for time series, numerical data, or data that are only available under a CC-BY or CC-BY-SA licence. To make these datasets more usable, we need to link to them on Commons and standardize them using Wikidata conventions and identifiers.
Comparison between Wikidata and tabular data on Commons
Subject | Wikidata items | Tabular data |
---|---|---|
Format | Wikibase | JSON |
Comment on format | Relatively complex and expressive. Each statement can have an arbitrary number of qualifiers and sources of various types | Lightweight. |
Suited for | Can handle complex, heterogeneous data | Easier to handle for harmonized time series or standardized data |
Data structure | Semantic structure: 1 concept = 1 item, with various links between items | Organized by file. There can be several files on the same topic, with different sources, timescales, etc. |
Indexation and search | Search engine on Wikidata, and powerful search functionalities using SPARQL | Data can be hard to find when they are not linked or documented from another page. The Commons data namespace supports neither categorization nor template-style documentation |
Use of data on Wikis | Data can be retrieved using wiki markup or various locally developed templates. However, some statements have rather intricate or hackish data structures, which can produce unexpected results when no ad hoc template has been developed | Data are simpler and easy to use, provided the data structure is known. |
External use | Dumps and SPARQL endpoints allow various external reuses | Data can be downloaded |
Human readability | Statements are multilingual. However, items can be long and messy, and the exact meaning of the various properties may not always be clear to the casual reader. | Clean, spreadsheet-like pages, but the content itself may sometimes be obscure in the absence of complete documentation |
Manual editing | Interactive editor. | Editing the JSON source code. |
Multilingualism | Native multilingual support for data of type "item". | Text can be translated, but only file by file. Commons identifiers can be used to automate translation on the client site. |
Vandalism risk | Moderate. Edit summaries and history are precise, but the diversity of the data and the large number of edits can make real-time monitoring difficult. Risk of well-intentioned but counterproductive edits. | Probably low. Data are not very visible. Formal constraints prevent hasty edits. |
Bots and tools | Large community, varied tools. | Nothing for now? |
Licence | CC0 (public domain equivalent). | CC0, CC Attribution, or CC Attribution-ShareAlike. |
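To illustrate the "lightweight" JSON format described in the right-hand column, here is a minimal Python sketch that turns the JSON source of a tabular data page into one dictionary per row. The sample payload, its field names and its values are invented for illustration. The top-level keys (`license`, `description`, `schema`, `data`) are assumed to follow the layout used by Commons `Data:*.tab` pages.

```python
import json

# Hypothetical payload, shaped like the JSON source of a Data:*.tab page.
# All values are made up.
SAMPLE = """
{
  "license": "CC0-1.0",
  "description": {"en": "Example population series (made-up values)"},
  "schema": {
    "fields": [
      {"name": "year", "type": "number", "title": {"en": "Year"}},
      {"name": "population", "type": "number", "title": {"en": "Population"}}
    ]
  },
  "data": [[2000, 1000], [2010, 1200]]
}
"""


def rows_as_dicts(payload):
    """Pair each data row with the field names declared in the schema."""
    names = [field["name"] for field in payload["schema"]["fields"]]
    return [dict(zip(names, row)) for row in payload["data"]]


table = json.loads(SAMPLE)
rows = rows_as_dicts(table)
print(rows[0])
```

Because the schema travels with the data, a consumer only needs to know the field names to use a file, which is the "provided the data structure is known" caveat in the table.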
Properties
- Sandbox-Tabular data (P4045)
- weather history (P4150)
- tabular population (P4179)
- tabular software version (P4669)
- tabular case data (P8204)
- based on tabular data (P8265)
Up-to-date list via SPARQL: https://w.wiki/ZC9
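The query behind the short link is not reproduced on this page. As a hedged sketch, a query along the following lines should list the properties whose datatype is tabular data. It assumes the `wikibase:TabularData` datatype IRI and the public query service endpoint, and the helper name `query_url` is only illustrative.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Selects every property whose datatype is "tabular data".
# This is a guess at an equivalent query, not the one behind the short link.
QUERY = """
SELECT ?property ?propertyLabel WHERE {
  ?property wikibase:propertyType wikibase:TabularData .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?property
"""


def query_url(query=QUERY):
    """Build a GET URL that asks the endpoint for JSON results."""
    params = {"query": query.strip(), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


print(query_url())
```

The printed URL can be opened in a browser or passed to any HTTP client; no request is sent by the sketch itself.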
Potential Wikipedia use cases
- Demographic data (e.g. Template:US Census population (Q6157335))
- Climate data (Template:Weather box (Q6441801))
Data structure
Data linked from the same property should usually have similar data structures. When possible, the names of the fields should contain a Wikidata identifier for machine-readability.
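As a sketch of what machine-readability could mean in practice, the following Python snippet extracts Wikidata identifiers from field names. The naming pattern it accepts (an identifier on its own, or separated from other text by an underscore, as in `P1082_population`) is an assumption made for illustration, not a convention defined by this project.

```python
import re

# Matches a Wikidata property (P...) or item (Q...) identifier that is not
# glued to other letters or digits; underscores count as separators.
WIKIDATA_ID = re.compile(r"(?<![A-Za-z0-9])([PQ][1-9][0-9]*)(?![A-Za-z0-9])")


def wikidata_ids(field_name):
    """Return the Wikidata identifiers found in a field name, in order."""
    return WIKIDATA_ID.findall(field_name)


# Hypothetical field names.
for name in ["P585", "P1082_population", "notes"]:
    print(name, wikidata_ids(name))
```

A client that finds an identifier this way can look up the property's label on Wikidata and show a translated column header without any per-file translation.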
Demographic data
- Recommended data structure: fields for point in time (P585), population (P1082), criterion used (P1013), determination method or standard (P459), stated in (P248) and reference URL (P854).
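A minimal sketch of a dataset following the recommended structure, written in Python. The field types, the use of bare property identifiers as field names, the placeholder `sources` text and the example row are all assumptions or invented values. Only the list of properties comes from the recommendation above.

```python
import json

# (field name, type, English title). Names reuse the recommended property IDs.
# Assumption: "string" and "number" are one plausible type mapping.
FIELDS = [
    ("P585", "string", "point in time"),
    ("P1082", "number", "population"),
    ("P1013", "string", "criterion used"),
    ("P459", "string", "determination method or standard"),
    ("P248", "string", "stated in"),
    ("P854", "string", "reference URL"),
]


def build_table(rows, description):
    """Assemble a tabular data payload, checking that rows match the schema."""
    for row in rows:
        if len(row) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} values, got {len(row)}")
    return {
        "license": "CC0-1.0",
        "description": {"en": description},
        "sources": "Placeholder: describe the real source here",
        "schema": {
            "fields": [
                {"name": name, "type": kind, "title": {"en": title}}
                for name, kind, title in FIELDS
            ]
        },
        "data": [list(row) for row in rows],
    }


# Made-up row: unknown values are left as None (null in JSON).
example = build_table(
    [["2020-01-01", 12345, None, None, None, "https://example.org/census"]],
    "Example demographic series (invented values)",
)
print(json.dumps(example, indent=2))
```

Keeping the same field order and names across files linked from the same property is what lets a single template or script read all of them.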
Participants
The participants listed below can be notified using the following template in discussions: {{Ping project|Tabular data}}
See also
- phab:T181319 (Support external tabular datasets in WDQS)