Wikidata:Project chat/Archive/2014/04

Error when adding commonscat

Hi. I frequently get the error Malformed input: xxx when adding a commonscat to an item. Does anyone know why? Trijnstel (talk) 22:02, 14 March 2014 (UTC)

I get this error often too when trying to add Commons category (P373). --Infovarius (talk) 18:41, 15 March 2014 (UTC)
I can reproduce it by adding a trailing space after the category name. If I remove the space after the category name, then I can save it without any error.--Micru (talk) 18:47, 15 March 2014 (UTC)
Can you two confirm please if the issue Micru mentioned is also the case for you? If so please open a bug on bugs.wikimedia.org. Thanks! :) --Lydia Pintscher (WMDE) (talk) 12:07, 16 March 2014 (UTC)
I have always assumed this is a pretty universal phenomenon, so much so that I forget where it occurs ("parent taxon"?); it is awkward when doing a copy-and-paste from some place where there is a space behind a word. - Brya (talk) 19:40, 16 March 2014 (UTC)

It's a phenomenon all over Wikidata, and it's the single most annoying thing in a user interface that is pretty annoying even without this silly bug. It's virtually impossible to copy and paste anything from Wikipedia or library Websites to authority data fields without getting this silly error every second time. It's not just spaces, it's possibly other invisible little elements. Mostly, deleting excess spaces helps, but not always. The software should spare us this task. --FA2010 (talk) 15:53, 3 April 2014 (UTC)

This is bug 45925. Mushroom (talk) 14:56, 4 April 2014 (UTC)
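As a rough illustration of the cleanup described in this thread (trailing spaces and other invisible characters picked up by copy-and-paste), here is a minimal Python sketch. The function name `clean_pasted_value` is hypothetical, and this is not how Wikidata itself validates input; it only shows what a bot or script could do to a value before submitting it.

```python
import unicodedata


def clean_pasted_value(raw: str) -> str:
    """Normalize a pasted string before using it as a claim value.

    Hypothetical helper: drops invisible format characters and
    strips leading/trailing whitespace.
    """
    # Drop characters in Unicode category "Cf" (format), such as the
    # zero-width space or the left-to-right mark, which are invisible
    # but can still make a value differ from what the user sees.
    visible = "".join(ch for ch in raw if unicodedata.category(ch) != "Cf")
    # str.strip() removes ordinary and non-breaking spaces at both ends.
    return visible.strip()
```

With this helper, a category name pasted with a trailing space comes out identical to the name typed by hand.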

Wikiquote notability

Two types of categories

I think we currently only have Wikimedia category (Q4167836). In most Wikimedia sites there are two types of categories, and they often have different policies or practices. There are categories that are supposed to contain only content pages (e.g. Category:Sports governing bodies (Q6827986)), and categories for project/community pages (e.g. Category:Wikipedians (Q4654299)), and within project categories there are maintenance categories which group together pages that need similar maintenance activity (e.g. Category:Candidates for speedy deletion (Q5964)). I propose that we create at least 'Wikimedia content category page' and 'Wikimedia maintenance category page' as subclasses of Wikimedia category (Q4167836). John Vandenberg (talk) 02:53, 27 March 2014 (UTC)

A while ago I created Wikimedia administration category (Q15647814), but I think I'm the only one who uses it. I'll add your suggested names as aliases. --Arctic.gnome (talk) 03:37, 27 March 2014 (UTC)
And don't forget Wikimedia disambiguation category (Q15407973). I don't like the 'Wikimedia content category page' concept. Just create subclasses for the special cases where you think it would be useful. For the normal content categories it's just a lot of extra work to replace the claim and no real added value. Multichill (talk) 18:46, 27 March 2014 (UTC)

@Arctic.gnome:, you have now added a description to Wikimedia administration category (Q15647814) of "use with 'instance of' (P31) for wiki categories that contain non-content pages". That description excludes Category:Candidates for speedy deletion (Q5964), as that category contains content pages to be processed by administrators. Perhaps what we need is to separate 'topical categories' from 'issue tracking categories'. user:Multichill, topical/content categories should all have a category's main topic (P301) (or a category combines topics (P971) for parts of the category tree that are insanely long), and it should usually be unique. The WPs normally have a very similar set of categories, and Wikidata should be proactively helping them unify the category trees into a usable ontology. OTOH, non-content categories are a mess, and it isn't important for Wikidata to fix that mess as they are of little value to anyone outside of Wikimedia. I would go as far as to say that non-content categories should have been declared out of scope for Wikidata, as they only map the internal processes of Wikipedias. It is now too late to have that discussion, as the bots have imported nearly all of these non-content categories, causing Wikimedia category (Q4167836) to be so full of junk that it is almost junk data. If we do not create an item dedicated to content categories (or relabel Q4167836 to be that), bots and people will continue to use Q4167836 for non-content categories, instead of using more precise subclasses. John Vandenberg (talk) 02:30, 28 March 2014 (UTC)

A grey area can be seen with Category:Living people (Q5312304). It is either a content category or an admin category depending on what it is being used for, by the local editing community and by the reader (and data consumer). For a lot of purposes, it is an admin category - it exists primarily to enable policing of WP:BLP articles, and has so many members so the 'is living' test is easy to do and tools like w:Special:RecentChangesLinked/Category:Living_people catch all changes. However it is descriptive of the subject rather than the project, so some people might say it is a content category, and on many projects it is not a hidden category. This case is further complicated by the differing subcategories and sister-categories on different Wikipedia projects, such as Category:Date of birth missing (living people) (Q8363960) (English Wikipedia) and Category:Living people aged 1 (Q10217452) (Indonesian), Category:Living Centenarians (Q7887859) (Indonesian and Russian) Category:Living Australians (Q9946623) (Romanian), and less clearly 'living': Category:National leaders (Q5745782) (Commons & many wikis; not English) and Category:People involved in ongoing events (Q16056013) (Portuguese and Swedish), and the more clear admin subcats like Category:Wikipedia indefinitely semi-protected biographies of living people (Q8100724). Category:Possibly living people (Q8858736) is another one that straddles the line between an admin and content category.

Stub categories are another interesting case. They describe both subject and content status; i.e. membership of Category:Soviet football biography stubs (Q6996386) indicates the article is a Soviet footballer, but also that the article content is not mature. This case is easier to solve, as we can create a new subclass for 'Wikimedia stub category', which allows data consumers to easily understand thousands of categories very accurately. John Vandenberg (talk) 13:55, 5 April 2014 (UTC)

DBpedia headsup

Hoi, I was at a meeting at the Dutch chapter. We discussed importing data from the Dutch DBpedia. The plan is to create a procedure whereby basic information of "humans" will be included. They are

  • "instance of" "human"
  • sex or gender
  • date of birth
  • date of death
  • place of birth
  • place of death
  • father
  • mother
  • spouse
  • child

When there is a discrepancy, it will be listed. The choice of the properties is intentionally limited. This exercise is to get experience with running a process like this. I expect that many items will get their first statements as a result. What is nice is that this same process can easily be run on any other language DBpedia. It will also result in an update of the DBpedia that monitors Wikidata.
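To make the "list the discrepancies" idea concrete, here is a minimal Python sketch of such a comparison. It is not the actual import procedure: the data layout (plain dictionaries keyed by item and property name) and the names `Discrepancy` and `compare_claims` are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    """One property on which the two sources disagree (illustrative type)."""
    item: str
    prop: str
    wikidata_value: str
    external_value: str


def compare_claims(wikidata, external):
    """Compare external claims against existing Wikidata claims.

    Both arguments map item id -> {property name -> value}.
    Returns (missing, discrepancies):
      missing       -- (item, prop, value) triples absent from Wikidata
      discrepancies -- Discrepancy records where both sides have a value
                       and the values differ
    """
    missing = []
    discrepancies = []
    for item, ext_claims in external.items():
        wd_claims = wikidata.get(item, {})
        for prop, ext_value in ext_claims.items():
            if prop not in wd_claims:
                # Candidate for import: Wikidata has no value yet.
                missing.append((item, prop, ext_value))
            elif wd_claims[prop] != ext_value:
                # Both sides have a value: report it, do not overwrite.
                discrepancies.append(
                    Discrepancy(item, prop, wd_claims[prop], ext_value)
                )
    return missing, discrepancies
```

The point of the design is that disagreements are only reported; deciding which side is right is left to the communities.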

A precondition of the project is the explicit statement by that DBpedia community that they are happy to share their data with us. GerardM (talk) 11:05, 27 March 2014 (UTC)

I would say don't import the data but just compare it in an external tool, analyze the results and afterwards, if the results are good, do the import. I think for large data imports we need a short discussion after a comparison test with the data we currently have, in order to estimate whether the import would improve WD or not.
Again, we are facing major sceptical comments from some WPs about the quality and the sources of the data from WD, so if we can't show better data quality than what already exists in the WPs, few projects will use Wikidata as a data source. It is like in the market: when customers request good product quality, offering a product of worse quality doesn't mean you will be able to sell it. Right now the big WPs are not interested in Wikidata and they are not ready to exchange their current data for what is available in Wikidata (even if some templates are using Lua to extract some data from WD, priority is given to the existing sets of data in the articles, indicating that WD is considered a second choice). Snipre (talk) 17:03, 27 March 2014 (UTC)
We have discussed that again and again: with no data, there is no chance to improve the data. These major Wikipedias (be precise please) have all the tools, and if they have high-quality data, we have to find a way to import it, and they can filter sources, and it will be a win-win. Otherwise nothing will move, and we will have nothing to give to them. If we were in the market, we would not contribute for free. TomT0m (talk) 17:22, 27 March 2014 (UTC)
@TomT0m: Yes, we are speaking about that again and again because I have the impression that some people never leave WD for a moment to go to WP and discuss the use of Wikidata by Wikipedians a little. WPs don't want bullshit data, that's a fact, and until now I haven't received a proper proposal to fulfil that kind of request. So TomT0m, don't spend time in discussion with me, but please go to WP, try to sell Wikidata and try to hear what the criticisms of WPs about Wikidata are. And if you want to use an economic comparison, you are just trying to sell a product whose origin you don't know and whose good quality you just assume without having a simple tool to assess it: you are not a manufacturer of data but just an intermediary trying to sell a suspect product. You want to become the leader in the market? Then first study that market and try to understand the needs of the customers.
Nothing is moving because nobody needs WD: it is still simpler to find good data using traditional ways. The Internet is full of data, but no bots are collecting data across the Internet to import it into WP. And DBpedia has existed for some time, but which WP has imported data from DBpedia until now? If, as you assume, DBpedia were a source of good-quality data, be sure that some bot operators would have been asked to perform the import. But until now the data exchange between DBpedia and WPs has gone in only one direction. Why? Just answer that question; just go to the different projects of your WP and ask that question. Snipre (talk) 12:46, 28 March 2014 (UTC)
I would be very reluctant to see this happening. As I understand it, DBpedia is a database extracted from Wikipedia. Thus, importing the contents from DBpedia means just another form of copying the same data. More or less the same as Freebase which is also extracted from Wikipedia. The whole purpose of Wikidata is to get away from the indiscriminate 'data' as it exists in Wikipedia, and to get to the good stuff.
It looks to me that the swamp is getting deeper and deeper. There is a mass of data of dubious standard, which is being extracted by one database after the other and then copied to other databases. Like a runaway elephant, trampling all in its path. What there is of value in Wikidata will be harder and harder to find. Why not find a way to import (or enter) real data? - Brya (talk) 17:40, 27 March 2014 (UTC)
A lot of the 'real data' is in databases which are under restricted licenses. The advantage of adding this data from DBPedia, which comes originally from Wikipedia, is that we can then get bots to systematically compare the data to the data on other databases, and highlight the discrepancies which, as I understand it (but IANAL) is allowed.
Wikipedias are right not to use Wikidata as it is. The solution (IMHO) is to make it easier to edit wikidata from within wikipedia so that wikipedians can improve wikidata, one infobox at a time. When you edit an infobox in wikipedia it should open a window where you can edit wikidata, with the appropriate properties filled in. Filceolaire (talk) 18:46, 27 March 2014 (UTC)
Wikipedias don't use Wikidata because the inclusion syntax is too complicated for everyone but ubernerds, the parser function implementation is useless which means it's impossible to fetch any meaningful kind of data without knowing Lua. Let's not pretend that has a thing to do with quality or sources.--Underlying lk (talk) 19:48, 27 March 2014 (UTC)
DBpedia has a license which seems incompatible with uploading the data to Wikidata. --Denny (talk) 20:00, 27 March 2014 (UTC)
So what would we gain? A reliable source or just another imported from Wikimedia project (P143) nightmare? --Succu (talk) 20:24, 27 March 2014 (UTC)
If the result is having thousands of items with one or more claims which previously were empty and of no use to anyone, I'd say that imported from Wikimedia project (P143) DBpedia (Q465) is just what we need.--Underlying lk (talk) 20:30, 27 March 2014 (UTC)
I'm sorry, but this is not about a race. --Succu (talk) 21:03, 27 March 2014 (UTC)
No, but surely it is about content, and that is what's being offered here.--Underlying lk (talk) 21:18, 27 March 2014 (UTC)
Sure, but it seems to be about unreliable content. --Succu (talk) 21:44, 27 March 2014 (UTC)
It's about content we already have. What do you prefer: focusing on cleaning up the data collaboratively, or on boring and tedious database filling (mostly from the same data sources; high-quality datasets do not appear out of the blue)? – The preceding unsigned comment was added by TomT0m (talk • contribs) at 16:27, 28 March 2014 (UTC).

@GerardM:, the import will only create claims? It won't create items? Using DBpedia instead of Wikipedia is taking advantage of their data parsing. When was their data extracted from Wikipedia? The older their data extract is, the more wrong/outdated claims will be added. Often no claim is better than a wrong/old claim. John Vandenberg (talk) 01:37, 28 March 2014 (UTC)

Answering the questions, addressing the reservations

First of all, importing from DBpedia has been discussed before. The DBpedia people do run their processes all the time. Given the properties we are talking about, we are talking about properties that are extremely stable. The difference with the typical bot process is in two ways: this process will run regularly and, it will report the differences of existing values. The reporting will be of benefit to our community and to the Dutch Wikipedia community. It is for both communities to decide what to do with it.

In my original posting I indicated that the Dutch DBpedia chapter will have to state as a pre-condition that the data they add to Wikidata is available under a CC-0 license. They run the Dutch DBpedia, they have the right to do this. In the final analysis adding data in Wikidata has as a precondition that the data is made available under CC-0. Therefore they will be saying twice that there is no license issue.

Statistics indicate that over 50% of our items have one statement or none. All those items have no informational value. We are adding data using bots in the same "indiscriminate" way this process is accused of. Except our usual processes do not report at all. It is even considered ok to have multiple values when multiple Wikipedias disagree and, this is NOT reported.

The "nightmare" we are having is that our community, our data will be enriched by people who are not the usual suspects. They are highly organised and they have academic credentials. They are on top of their game and, DBpedia is the most connected resource on the Semantic Web. They want to give back to Wikipedia. The best strategy they see is by helping us improve and enrich our data. Thanks, GerardM (talk) 06:03, 28 March 2014 (UTC)

The way you make it sound, DBpedia is so good that the best thing we can do is close down Wikidata and advocate using DBpedia.
DBpedia has its qualities. Wikidata has different qualities. The point is that we can collaborate and do all better. I am firmly a Wikidata person, not a DBpedia person. GerardM (talk) 08:29, 28 March 2014 (UTC)
The 'discussion' you point to looks like nothing more than an earlier attempt to promote DBpedia? - Brya (talk) 06:43, 28 March 2014 (UTC)
A RFC is exactly that, a request for comment. GerardM (talk) 08:29, 28 March 2014 (UTC)
@GerardM: A RFC with 5 contributors can't represent the WD community. Just by putting one comment on the project chat you got the same number of contributors expressing their doubts about that process. Opening a RFC doesn't mean that a decision can be taken: in order to use that argument you will need more comments, and a good description of what you want to do so that contributors can judge the proposal, as you did with this comment in the project chat.
Then, as said above, if DBpedia is so reliable, can you give some examples of direct data imports from DBpedia to WPs? If not, do you have an explanation of why nobody has tried this? WPs are full of bot operators who can do that job, and DBpedia has existed for some time. You are just selling the DBpedia stuff under the cover of Wikidata, so if WPs and projects had good reasons to avoid importing from DBpedia, these reasons will stay the same with Wikidata, and worse, all other data present in WD will be judged in the same way even if it comes from other sources. Snipre (talk) 13:04, 28 March 2014 (UTC)
You fail to understand what they do. Just like our own bot operators they harvest information from Wikipedia. They are good at that. The arguments against their data are exactly the same arguments that apply to any and all of the existing bots. They will improve on what our bots do because they will report where the data in Wikidata differs from sources like nl.wp. GerardM (talk) 13:36, 28 March 2014 (UTC)
I am not judging the work of DBpedia; I am judging a data import which doesn't answer the needs of Wikipedians and doesn't show any advantage for Wikidata: just being a mirror of DBpedia doesn't give any value to WD, especially when DBpedia is only a collection of Wikipedia data. Collaboration with DBpedia, data comparison or data analysis in order to provide information about the best WP for future imports are not a problem, but instead of using DBpedia as a source it is better to go to the primary source of DBpedia, which is Wikipedia. Snipre (talk) 13:56, 28 March 2014 (UTC)
So young as a project and already have some "don't change anything" advocates. Extracting datas from Wikipedia is already done by Wikipedia, let's collaborate some way with them. You do not propose anything, implying you don't propose anything efficient. TomT0m (talk) 17:04, 28 March 2014 (UTC)
I am working in my field and trying to provide good data: look at Wikidata_talk:WikiProject_Chemistry#Collaboration_with_PubChem and at Wikidata:WikiProject Chemistry/ChemID before saying that I propose nothing. I am in touch with the chemistry projects in WP:en and WP:de, so I know quite well what these projects think of WD and what their criticisms are. I was defending WD during the last survey about WD in WP:fr, Wikidata:Project_chat/Archive/2014/02#Small_survey_in_WP:fr. I was pushing for the development of the Help:Sources page. But as you are using personal arguments instead of answering my questions (why has nobody used DBpedia for data import in WP until now? What are the needs of the Wikipedians? Do you know them?), I think your argumentation is not grounded.
My proposals are not proposals, they are common sense: find reliable references (databases, books, articles), organize projects and distribute the data import tasks among contributors. Each subject is different, so there is no single solution. I don't say "don't change anything", but I am not afraid of empty databases; I work for the long term and I try to understand what the future use of Wikidata will be, because Wikidata alone makes no sense. Snipre (talk) 19:11, 28 March 2014 (UTC)
An all too common scenario is that a bunch of people gather a lot of stuff ("some data is better than no data, and a lot of data is better than some data"), leading to a mediocre bunch of stuff (the good, the bad and the ugly). They then get some poor souls to laboriously get out of the worst of the errors ("every contribution is appreciated!"). Then some bright light suggests linking and comparing this to other 'datasets'. In practice this means that all the original errors resurface ("Look, a difference between datasets! This must be meaningful!") and back at square one we are. The www will retain all errors and will never let them die. - Brya (talk) 17:36, 28 March 2014 (UTC)
That's vague and you are not providing any long term plan at all. Except "high quality datas will pop up". TomT0m (talk) 23:45, 28 March 2014 (UTC)
What do you want ? TomT0m (talk) 18:00, 28 March 2014 (UTC)
 Support adding all the info in the existing wikipedia infoboxes to wikidata. This is data which is considered good enough to have on wikipedia. Adding it to wikidata would jump start the process of improving the quality of this information by exposing it to every kind of programmatic tool. UI improvements to make it easier for wikipedians to edit wikidata will be another important step in improving our data. Sourcing and referencing tools that can generate a reference from a url will provide a further improvement. Step by step it will get better and transferring the existing wikipedia infobox data into wikidata is, I believe, the first step. Filceolaire (talk) 23:43, 28 March 2014 (UTC)
 Support a pilot, adding claims to existing Wikidata items only where the property is missing, and focusing on Dutch people only. Then the community should review the work, and review the import bot code, before the import starts creating new items, altering existing claims, or working on topics that are not Dutch in nature.
  • Regarding my reservation about creating new items, in my import of periodicals I found many Dutch Wikipedia articles were not linked to the other wikis. International Review of Cytology (Q2687019), Mathematische Nachrichten (Q1389846), Algebra Universalis (Q4723972) and Memoirs of the American Mathematical Society (Q6815285) are a few examples of this, but there were lots of them. By using authority control records, the import process avoids creating more duplicates, and can even identify potential duplicates already in Wikidata.
  • Regarding my reservation about when the DBpedia extraction occurred, and the quality of the data, the last dump of DBpedia is February 2013. That is very old, and the Wikipedia extraction is probably much older. Wikipedia pages often have basic facts like DOB wrong, and they are fixed over time. One common source of this problem is when the infobox is copied from an existing article, and some facts about the wrong person are saved into the new article. Dutch Wikipedia is not updated as much as English Wikipedia, so the Dutch Wikipedia infobox data may be wrong when the English Wikipedia infobox data is correct. Dutch Wikipedia is probably better for Dutch people. It is not better for British people.
We need good bots importing high quality data, and good bot coders sharing their code so the next bots are smarter. John Vandenberg (talk) 01:03, 29 March 2014 (UTC)
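The point above about authority control records can be sketched in a few lines of Python: if two items carry the same authority identifier, they are candidates for a merge rather than for separate imports. The function name `find_potential_duplicates` and the input layout are assumptions for illustration only, not part of any existing bot.

```python
from collections import defaultdict


def find_potential_duplicates(authority_ids):
    """Group items that share an authority control identifier.

    authority_ids maps item id -> identifier string (or None when the
    item has no identifier). Returns {identifier: [item ids]} for every
    identifier carried by more than one item.
    """
    by_identifier = defaultdict(list)
    for item_id, identifier in authority_ids.items():
        if identifier:  # items without an identifier cannot be matched
            by_identifier[identifier].append(item_id)
    return {
        identifier: sorted(items)
        for identifier, items in by_identifier.items()
        if len(items) > 1
    }
```

An import process could run a check like this before creating a new item, and list the matches for human review instead of merging automatically.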
@Denny:. Essentially, apart from the US government, no website, not even Wikipedia, has a licence compatible with CC0. If we can't extract data from them, I guess we really need to change Wikidata's licence. --Zolo (talk) 06:48, 30 March 2014 (UTC)
I blogged about Wikidata quality. My opinion is very much that many arguments are beside the point. When DBpedia had a dump is not relevant because it does not mean that it is used for exporting data. When subjects have multiple items, they clearly need to be merged but you cannot merge when they do not have a Wikidata item. In order to identify what needs to be merged, you NEED items so adding items is good. It is good because it enables bots to add statements.
When you want to raise quality, what you need to consider is what tools you need to raise quality. How can we enrich items more easily. How do we identify items to be merged. How and where is our data used and how can we bring more quality to that use case.
When DBpedia imports data and, when it identifies the differences between what we have and what it knows, this import process is a massive improvement over all our previous import processes. It reports. Consequently, it is less of an issue when that data is "dated" because the process of adding information will run again. GerardM (talk) 07:11, 30 March 2014 (UTC)
@GerardM: I think you are making a wrong assumption: adding data is not quality improvement. You can have very little data and the highest quality. And what I don't understand in your explanation is the need to import data before doing any evaluation of some data sets between DBpedia and the current database of WD. Just take one type of data, perform an external comparison and give us a comparison analysis that supports your assumption that importing DBpedia's data is an improvement. You have in front of you people who are in favor of using Wikidata, and you have no facts to convince them that your import will be a good thing for WD. So how will you convince Wikipedians to use WD when they have more doubts than us? I am playing the devil's advocate because I am thinking about the next step, which is to convince Wikipedians to use WD, and if you only bring data of the same quality as what WPs have now in their articles, few Wikipedians will be interested. If it is just a question of data import from DBpedia, WPs can do that directly using their bots without the need for WD.
If you want to have a win-win strategy, forget about massive imports, find one or two projects interested in WD, ask them what kind of data they want and try to help them with that objective. Small steps, one foot after the other. Don't hurry; WPs have worked for years without WD, and they can continue to work without WD. Snipre (talk) 15:21, 31 March 2014 (UTC)
Imo what we really need is improvements everywhere; there is not one big blocking point, there are several small ones. Nothing will happen while it is hard to build infoboxes. When infobox migration is easy, it will happen naturally. So we need to make infobox migration easy. What is the motivation to write actual easy-to-use code for infoboxes? There is data in Wikidata that we don't have. Where does this data come from? (That's the thing you don't really provide an answer to.) Maybe from other big Wikipedias. Then Wikidata can be the bridge between them. It will then become natural to use Wikidata to add data in the first place, and we can drop the DBpedia import code (because, yes, DBpedia already is that bridge). There are other ways to put Wikidata more in the minds of Wikipedians: focus on things that Wikidata is an answer to, such as identifying red links and providing data for them through Reasonator, for example; see Gerard's blog post. A collaborative project is the sum of small steps, sure. But a lot of small steps, in every direction. We should not ignore any of them. It's a puzzle: adding a piece, any piece, improves the whole. TomT0m (talk) 15:41, 31 March 2014 (UTC)

I second the doubts expressed against a simple upload of data from a low-quality external source. There are high quality external sources, and I am quite in favor of using them. These sources might be authoritative. And then it is a pleasure to add them as a source. But DBpedia as a source? An error-prone extraction performed on a tertiary text?

But there is no need to simply upload the data. I think DBpedia's data might be extremely useful in other ways suggested above: for example, to compare it with existing data and create reports on the errors. Or find missing data and let users act on that missing data, and add external sources, e.g. with a tool like the Wikidata annotation tool. Or similar tools.

I was always afraid of, and I have repeatedly warned against, letting Wikidata become the biggest data heap on the Web. Too fast growth of the content is not what we are aiming for. We need to ensure that the content grows no faster than the community does. Wikidata is in the comfortable position of having the fifth-largest active community of all Wikimedia projects (we passed the French Wikipedia in February), but our bots have created content on a much faster scale. This is fine for bootstrapping, but I think we should consider whether we are outgrowing that bootstrapping phase, and whether we should pay much more attention to quality, to sources, to processes than to the explosive growth we have encountered so far.

Again - uploads from high quality databases? Yes, absolutely. If we can upload census data from the US government about US cities, I hardly see a reason against that. But uploading data of lower quality?

You point to DBpedia's centrality in the Web of Data as an argument. I disagree with that argument. I would claim that basically all of the data sources linking to DBpedia do this because they want to use Wikipedia as an entity library to link to, and DBpedia provides a convenient, standards-conforming way to do that. They do not link to DBpedia in order to use the knowledge in DBpedia. I would be happy to be proven wrong. Find, say, two applications that actually use the data from DBpedia. I doubt that the quality of DBpedia data is sufficient to warrant even an initial upload of large parts of it to Wikidata.

Furthermore, even though my license argument was dismissed, and although it is said that the DBpedia community has agreed to change it, I cannot find this change anywhere. Please point me to the actual license change that allows us to upload data to Wikidata. It should be obvious that this has to be settled before we can upload anything to Wikidata. This is true for any external database that anyone wants to upload to Wikidata. If an external datasource is more restrictive in the usage of their data, then no, it is not possible to upload the data to Wikidata.

So, to summarize - this was too much of a rant, sorry:

  • we should not upload data from sources not having sufficient quality.
  • data from such sources can still be useful to create reports and to power external tools that help with curating and growing Wikidata.
  • we should not upload data from sources with incompatible licenses.
  • uploading high quality or even authoritative data with compatible licenses sounds like a good thing. These are usually domain-specific. --Denny (talk) 19:20, 31 March 2014 (UTC)
+1000 to everything Denny said. We're not in the race to be the biggest pile of useless data. We're trying to create a useful and reasonably accurate knowledgebase. There is no deadline! --LydiaPintscher (talk) 19:27, 31 March 2014 (UTC)
@Denny, LydiaPintscher: Any suggestions of high-quality datasets under CC0? TomT0m (talk) 19:45, 31 March 2014 (UTC)
Why does it even matter? But since you ask: World Factbook, NASA, anything by the US government basically, or check the Creative Commons site on CC0. There is plenty of that. But again, I do not think that the sheer current availability of this data is so relevant to the discussion. --Denny (talk) 20:19, 31 March 2014 (UTC)
It matters because we are in Wikidata :) This was a real question. TomT0m (talk) 21:20, 31 March 2014 (UTC)
@TomT0m: Our goal is to build up a free (semantic) database, not to import one from various sources. Our wikipedias started small and grew up to full-blown encyclopedias. This was done by users, not bots. Bots are only a tool to make editing a little bit simpler. Quality is important, not speed. --Succu (talk) 21:11, 31 March 2014 (UTC)
@Succu: speed of the tools is important for doing important tasks instead of clicking many, many times to do boring things. The more interesting the tasks you propose, the fewer tired users you will get. I find it stupid not to reuse data that has been accumulated through the work of thousands of people, and to redo everything one more time just a little differently. When we started, we imported a lot of language links, with some mistakes but mostly with correct sitelinks, because the work had mostly been done by humans and bots. Let's use the work already done in infoboxes and all. TomT0m (talk) 21:18, 31 March 2014 (UTC)
Repeating a „boring“ task is the foundation of our knowledge. Some examples I'm familiar with are the works of Linné, Darwin, Rutherford or Marie Curie. The outcomes of doing these „boring“ tasks are (hopefully) well known. And let me be blunt: the last heavy selling of unverified data was done in the Middle Ages. Sure we need better (inline) tools. Today creating a „real” source (eg. Petenaeaceae, a new angiosperm family in Huerteales with a distant relationship to Gerrardina (Gerrardinaceae) (Q15989664)) is nearly impossible. But without the help of interested user like John Vandenberg it's impossible to generate „helper tools“ which makes sourcing easier . --Succu (talk) 22:20, 31 March 2014 (UTC) PS: Nearly every day superfluous sitelink-items are created, by humans and bots. Nothing has changed.

Some more questions answered

The biggest point in what we do is NOT that we collect data to become the biggest. The point is that we want to use that data. When there is no use case for data, there is not much point in having it.

One of the purposes of Wikidata is for it to be used in the context of the WMF projects. It is intended to support its aim: to "share in the sum of all knowledge". When you consider the items of Wikidata, most items are linked to Wikipedia articles. Most items have an article in only one language. By having an article, the subject is notable enough; however, we cannot share information that is locked in an article. We can once it is extracted from that article, because once it is in a statement, we can present it in any language.

How this extraction process is performed is not so relevant. It can be done manually or by bot, operated either by ourselves or by others. Manual extraction is likely to bring the most errors into Wikidata. Bot processes may report on what they find.

Given the statistics about Wikidata content, we are arguably far away from covering the basic information of the majority of the items we have. By connecting items to other items, by adding labels to items the existing data gains value and can be presented. There is a subset of data that already has a quite reasonable quality. This quality becomes only apparent when it is expressed in for instance the Reasonator or through WDQ.

At this stage the most important thing we can do for ourselves is invest in tools and use tools that help us relate more information. Having and using the best tools will make us spend our time effectively and provide us with increased quality and quantity of information. Thanks, GerardM (talk) 13:45, 1 April 2014 (UTC)

@GerardM: Thanks for blogging this inside, not outside of our community as you prefer to do. You wrote „The point is that we want to use that data”. Sure „we” want this. But to be convincing we have to do a boring job: adding „real” sources. --Succu (talk) 22:16, 1 April 2014 (UTC)
No, we do not. The boring job is for those who want to be bored with it. At this stage we are not mature enough to insist on it. Maybe for the niche that you are working on. As it is taxonomy, it sure has its issues aside from sources. GerardM (talk) 04:41, 2 April 2014 (UTC)
I deeply appreciate everyone working in their "niche" of expertise. Subject matter experts are extremely valuable for knowledge base creation. Could we refrain from trying to hurt and belittle each other when all we do is to resolve some procedural disagreements? --Denny (talk) 17:14, 5 April 2014 (UTC)
This sure sounds like "I am collecting a big bunch of stuff, because I can, and because it is fun, but I am leaving the real work of turning it into something sensible to the little people.". This is a disease we have seen too much of. - Brya (talk) 05:44, 2 April 2014 (UTC)
Hardly. The edits I like to do are about prizes, they are about people who died, they are correcting things that are arguably wrong. I like to add successions either in a job or in a dynasty. Yes, I add big stuff as well but do explain why and how this has a negative impact on our quality. My understanding is that the value of Wikidata is in the connections of its items. Thanks, GerardM (talk) 07:17, 2 April 2014 (UTC)

Is our data really bad?

There have been quite a few discussions on data quality recently. I would like to propose the following thoughts. (1) First of all, the sourcing of our data is not worse than Wikipedia's. On Wikipedia most individual data parcels (birthdays, chemical formulas, etc.) are not sourced. We should view the sources in the item and all articles as a "sourcing environment". Confidence in text and data grows in that environment. (2) Many people confuse data sourcing with data quality. Just because a statement is sourced does not mean it is a) the correct source, b) correct data, or c) the same data as in the source. Confidence in our data is governed by the Wiki principle. Many people checking data and iterative improvements are more important than the academic approach to data quality (although we want to move in the direction of having both). Also, things like "repeating experiments in different laboratories" can be seen as a concept similar to a Wiki. (3) Wikidata will not solve the problems of incorrect data in scientific literature, mass media and the internet. This is pretty obvious to anybody from the natural sciences. As methods improve, a lot of things are proven wrong (or are improved). Until these things are published it can take years. It can take decades until they reach textbooks. (4) Wikidata is not a scientific database. We should not [!], and cannot [!], expect our data to be as accurate as the cutting edge of a scientific field. Should we fill Wikidata with scatter plots, error propagation formulas, multivariate statistics and whatever else you can think of? No. We should gather established data and leave the uncertainty for the Wikipedia article.
I hope that some people with a pessimistic outlook will consider this in the future. We are really doing well and moving in the right direction. --Tobias1984 (talk) 18:43, 28 March 2014 (UTC)

Who is talking about WD as a scientific database? We never requested the highest-quality data; we just want to follow this conclusion: WD is not a reference authority, so every piece of data should be sourced. And the reason to avoid Wikipedia as a source is very simple: Wikipedia defines itself as not being a source. That's it. It is no more complex than that.
We can speak for hours about what WD will be or what WD should be. I don't care about what you think, because I have something better than an opinion: I spent time discussing with Wikipedians, with the people who will use WD, and I heard what they said about WD.
In the French WP we had a survey: 33% against the use of Wikidata in WP now, 43% saying that WD should be modified before large-scale data use, and only 23% ready to use it without any constraint. Your speech about data quality is bullshit for the people who answered that survey. They have their principles/ideas; if you don't match them, they won't care about WD, because they were able to work without WD until now. That's the reality.
There is a single question: What is the purpose of Wikidata? If Wikidata is meant to support Wikipedia, then you have to focus on what Wikipedia is requesting; if Wikidata is meant to be a database full of unreliable data and not used by anybody, do what you want.
What is wrong in your argumentation is the assumption "Many people checking data". OK, that's correct IF THERE ARE SOME PEOPLE USING YOUR DATA, but when people say "We won't use your data before we have a certain confidence in them", your assumption is wrong and your explanation of how WD will improve is wind. Snipre (talk) 19:37, 28 March 2014 (UTC)
@Snipre: Can you pass a link to the discussion on the French Wikipedia?--Underlying lk (talk) 20:03, 28 March 2014 (UTC)
@Underlying lk: Wikidata:Project_chat/Archive/2014/02#Small_survey_in_WP:fr and for the direct link w:fr:Wikipédia:Sondage/Wikidata Phase 2 Snipre (talk) 20:08, 28 March 2014 (UTC)
Thanks. I must say that I agree with the overall objections of the French Wikipedians, which seems to be that the data becomes 'invisible' to editors and readers alike and somewhat beyond their reach, and that the editing and sourcing need to be made easier before Wikidata can be deployed massively on a large Wikipedia. However, I don't see it as being a judgement on the accuracy of the data itself, which was mentioned very seldom by those who were opposed.--Underlying lk (talk) 20:19, 28 March 2014 (UTC)
Look at the objections of the people against the use of Wikidata: 4 out of 16 mention the quality of the data as their main objection, and around 7 out of the 60 people who answered mention the problem of quality. 10% is not the majority, but as the other objections can easily be corrected by programming, accuracy will remain the critical question in the end. And I had other discussions with other wikiprojects, like the English and German chemistry projects, and accuracy was a common objection. Snipre (talk) 20:40, 28 March 2014 (UTC)
In my opinion what is needed for wikipedians to start using data from Wikidata is a much simpler mechanism for wikipedians to edit and add sources to wikidata. Adding or editing an infobox to wikipedia should open a window where you can edit wikidata but with lots of nudges appropriate to that infobox - suggested properties etc.
Having all the data in the existing wikipedia infoboxes added automagically via DBpedia would, in my opinion be a great way to start this process. Filceolaire (talk) 23:32, 28 March 2014 (UTC)
In the discussion any complaint about accuracy was quite simply drowned out by the number and recurrence of complaints about the complexity and opacity of contributing to Wikidata in its current implementation. Overcoming similar hurdles is not that trivial unfortunately, or we wouldn't be discussing them some 18 months after Wikidata has been launched.--Underlying lk (talk) 02:49, 29 March 2014 (UTC)
Filceolaire: your idea sounds similar to TemplateData for Visual Editor, and given past experiences a similar solution is likely to be strongly resisted by the English Wikipedia at a minimum. Editing information through forms is slower and more restrictive compared to using a simple text editor, and currently requires being acquainted with a large number of properties, each with its own specific usage which often requires reading talk pages just to be understood. And when it comes to sourcing, in the time it takes to add a reference to a Wikidata claim you can add five with Wikipedia's cite tool.--Underlying lk (talk) 03:03, 29 March 2014 (UTC)
I agree. Until we fix those problems I don't think we are going anywhere. Filceolaire (talk) 12:49, 29 March 2014 (UTC)
Module:Wikidata at test2wiki does not support sources today. To be able to prove the benefit of Wikidata to the wikipedians, we need modules with such functions. Very few of us know how to create such modules. -- Lavallen (talk) 13:20, 29 March 2014 (UTC)
I think the main problems Wikidata has to face are... 1) simplify sourcing (speeding it up), maybe a copypaste of Wikipedia templates (cite book, cite journal,...) and somehow pasting it in Wikidata and the system "translating" this into our database could be "science fiction"/"rocket science" right now but... On the other hand I think it could be necessary to protect well-sourced data: once a statement is "academically" sourced, it should be difficult for an IP or an unqualified user to delete or modify it. Changes like these would lead to greater confidence in our data and a quicker development of the project. And yes, I'm a Wikidata-n00b. :P Totemkin (talk) 22:53, 29 March 2014 (UTC)
Both Lavallen and Totemkin are right: sourcing is not possible for anyone who does not operate a bot (in WP, you can create a huge list and add one source; here you need ~5 edits per statement), and sources cannot be used in WP yet, as most of our sources have their own items (and this makes perfect sense). As we still cannot access arbitrary items from Lua, showing sources is just impossible. Thus, it is obvious that no one at WP is interested in using WD.
@Tobias: Even assuming our data quality is not worse than that of WP (it is often just a mix, importing the worst?/best? from all WPs) and that individual data is also not sourced in WPs, the author of the article has more or less the responsibility for data quality. Why should they use WD, i.e., give up control of the data in their article, while getting nothing for it (currently)? Our data quality must be better, not equally bad, to be convincing.  — Felix Reimann (talk) 08:02, 31 March 2014 (UTC)
@FelixReimann: Yes, and that is actually all I am trying to say. We are comparable to Wikipedia in data accuracy. I am just fed up with some people advocating terms like "data cemetery" and "data swamp" for Wikidata. We have long passed this test. Implementation of our data on Wikipedia is a whole different story, and you are also right that when it is done it has to improve the sourcing situation of the article. --Tobias1984 (talk) 21:54, 1 April 2014 (UTC)

Broken gadgets

I already reported it above, but MediaWiki:Gadget-AuthorityControl.js and MediaWiki:Gadget-Descriptions.js are still broken. Since they are essential for many contributors (see this thread in the French project chat for example), I hope someone will be able to fix them (ping @Ricordisamoa, Legoktm, Tpt, Fomafix, Bene*, Lydia Pintscher (WMDE)). Thanks! — Ayack (talk) 10:49, 31 March 2014 (UTC)

I've made a small change and authority-control works for me now. Tpt (talk) 14:21, 31 March 2014 (UTC)
Thanks, it's working for me now. — Ayack (talk) 10:13, 1 April 2014 (UTC)
Still not working for me, despite that I commented out all of my common.js. I guess it's a side-effect of one of the other gadgets that Infovarius reported as broken:
✗ LabelLister, Merge, Move, FindRedirectsForAliases, Preview, SitelinkCheck, and NavPopups (in items), MediaWiki:Gadget-AuthorityControl.js and MediaWiki:Gadget-Descriptions.js, seemingly related to this change (suggested by Underlying lk).
Questions for dev team & @Lydia Pintscher (WMDE): Am I the only one to wonder how such a large-impact change gets committed without anyone preparing for it? Are all other Wikimedia projects similarly impacted? Is there some WD maintenance team that can describe necessary changes to Wikidata:Tools ? LaddΩ chat ;) 18:56, 31 March 2014 (UTC)
That was kinda my fault. We've made these changes to improve performance. I didn't realize that they will break gadgets. The dev team is only 10 people and not all of them work on Wikidata full-time. Things like this suck and I apologize for it but it happens. I'll try to keep a better eye on it. --Lydia Pintscher (WMDE) (talk) 19:03, 31 March 2014 (UTC)


Shouldn't the Authority Control gadget have some dependencies defined? Don't know if it will help matters any (still no workie here) but it seems like some module(s) should be present in order for that thing to load right.

Something like...

* AuthorityControl[ResourceLoader|dependencies=mediawiki.util,jquery.ui.dialog,jquery.spinner|default]|AuthorityControl.js

...but I'm not the most fluent person when it comes to this stuff.

Can some coder-type please look into this either way? TIA. -- George Orwell III (talk) 00:16, 3 April 2014 (UTC)
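
For readers unfamiliar with the definition syntax suggested above, here is a minimal Python sketch that splits such a line into its gadget name, bracketed options, and page list. It is only an illustration of how the line reads; the function name `parse_gadget_definition` is hypothetical, the parser is deliberately simplified, and it is not how MediaWiki itself processes gadget definitions.

```python
import re

# Simplified pattern for one definition line:
#   * Name[option|key=a,b|option]|Page1.js|Page2.css
# This is an illustrative assumption, not MediaWiki's own parser.
DEFINITION = re.compile(
    r"^\*?\s*(?P<name>[^\[\|]+?)\s*(?:\[(?P<opts>[^\]]*)\])?\s*\|(?P<pages>.+)$"
)


def parse_gadget_definition(line):
    """Return the name, options and pages found in one definition line."""
    match = DEFINITION.match(line.strip())
    if match is None:
        raise ValueError("not a gadget definition line: %r" % line)

    options = {}
    for part in (match.group("opts") or "").split("|"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        # "key=a,b" becomes a list; a bare flag such as "default" becomes True.
        options[key.strip()] = (
            [v.strip() for v in value.split(",")] if sep else True
        )

    pages = [p.strip() for p in match.group("pages").split("|") if p.strip()]
    return {"name": match.group("name").strip(), "options": options, "pages": pages}


if __name__ == "__main__":
    line = ("* AuthorityControl[ResourceLoader|dependencies=mediawiki.util,"
            "jquery.ui.dialog,jquery.spinner|default]|AuthorityControl.js")
    print(parse_gadget_definition(line))
```

Read this way, the suggested line declares three modules that would have to be loaded before AuthorityControl.js runs.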

Changes to the default site typography coming soon

This week, the typography on Wikimedia sites will be updated for all readers and editors who use the default "Vector" skin. This change will involve new serif fonts for some headings, small tweaks to body content fonts, text size, text color, and spacing between elements. The schedule is:

  • April 1st: non-Wikipedia projects will see this change live
  • April 3rd: Wikipedias will see this change live

This change is very similar to the "Typography Update" Beta Feature that has been available on Wikimedia projects since November 2013. After several rounds of testing and with feedback from the community, this Beta Feature will be disabled and successful aspects enabled in the default site appearance. Users who are logged in may still choose to use another skin, or alter their personal CSS, if they prefer a different appearance. Local common CSS styles will also apply as normal, for issues with local styles and scripts that impact all users.

For more information:

-- Steven Walling (Product Manager) on behalf of the Wikimedia Foundation's User Experience Design team

It is good to see some serif fonts being used, but it would make more sense to reverse this (serif body text, sans-serif headings), given how long the lines are (or can be). - Brya (talk) 05:59, 1 April 2014 (UTC)

What is the best way to report errors to VIAF?

Hi folks, Over at enWS we have been working on linking Author pages to Wikidata. In the course of this we've been doing authority control connections as well. Unfortunately, we have found several errors in VIAF records (such as parallel records that need to be merged, etc.). I've tried reporting them over on Wikipedia, but no action seems to be taken. Does anyone know a better way to report these problems to OCLC/VIAF to have action taken to fix these errors? I figured the WikiData crew might be in closer contact with OCLC/VIAF. Thanks in advance! Tertiaryresources (talk) 13:30, 1 April 2014 (UTC)

I reported such issues (mostly related to Russian authors with many name transliteration variants) to them via e-mail at oclcresearch at oclc dot org (mentioned somewhere on their contact page). Some of the problems were fixed. But I think a simpler way to report problems on their site should be implemented. --EugeneZelenko (talk) 14:21, 1 April 2014 (UTC)
You can contact VIAF by e-mail but nobody will respond. (I've tried it several times.) VIAF just harvests authority data; the corrections have to be made in the original authority files like GND (they respond quickly, doing merges and adding additional information) or LCNAF (they only correct serious errors). --Kolja21 (talk) 19:49, 1 April 2014 (UTC)

was created by ‎Magnus Manske.--GZWDer (talk) 14:48, 1 April 2014 (UTC)

added a few claims, creating a few items while doing this hope they are not duplicates, please check articles in your languages :) TomT0m (talk) 15:14, 1 April 2014 (UTC)

qLabel - a jQuery library using Wikidata (and other) for creating multilingual content

I just published qLabel, a JavaScript library that lets you annotate HTML elements with Wikidata Q-IDs (or Freebase IDs, or, technically, with any other Semantic Web / Linked Data URI), and then grabs the labels and displays them in the user's selected language. Put differently, it allows for the easy creation of multilingual structured websites. And it is one more way in which Wikidata data can be used, by anyone. Contributors and users are more than welcome! --Denny (talk) 18:07, 1 April 2014 (UTC)

Font

Is anybody else seeing text on Wikidata in a different font? Nowhere else on Wikimedia, just here. What's going on? --AmaryllisGardener (talk) 19:13, 1 April 2014 (UTC)

Did you read Wikidata:Project_chat#Changes_to_the_default_site_typography_coming_soon? --Stryn (talk) 19:16, 1 April 2014 (UTC)
Nope, thanks! --AmaryllisGardener (talk) 19:17, 1 April 2014 (UTC)
This font is really horrible. Hard to read and hurts my eyes. We might as well switch to Comic Sans. How to change this back? Multichill (talk) 19:20, 1 April 2014 (UTC)
To change it back we cause an uproar forcing the Foundation to revert it. That's the only way atm. John F. Lewis (talk) 19:22, 1 April 2014 (UTC)
Or you can switch to Monobook. --Rschen7754 19:31, 1 April 2014 (UTC)
Or we could uproar.. John F. Lewis (talk) 19:33, 1 April 2014 (UTC)
The new default font on Wikidata has odd-looking Fs
On my system (Chrome, WinXP) it looks like someone has emboldened every letter "f". Horrible, Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:40, 1 April 2014 (UTC)
Screenshot now attached. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:31, 1 April 2014 (UTC)
Try turning on anti-aliasing and see if any problem persists. --—Wylve (talk) 16:29, 2 April 2014 (UTC)
No uproar or switching to Monobook needed. See Mw:Typography_refresh#Can_I_opt_out_of_changes_to_the_default_fonts.3F. --AmaryllisGardener (talk) 19:50, 1 April 2014 (UTC)
But Lydia said I could uproar :( John F. Lewis (talk) 19:52, 1 April 2014 (UTC)

Ugh, on the default size everything is so freaking huge now, was this really necessary?--Underlying lk (talk) 20:21, 1 April 2014 (UTC)

It looks good to me. Filceolaire (talk) 20:48, 1 April 2014 (UTC)
I'm with you... it's pretty nice. But, it's a change, so it must be worse than what was there before. Ajraddatz (talk) 21:14, 1 April 2014 (UTC)
It looks like we are stuck with it. Let's see if it stops hurting the eyes after a few days (and how many days). I notice I am starting to change my way of phrasing, to compensate. - Brya (talk) 05:49, 2 April 2014 (UTC)
Arimo is a brilliant font, and Liberation is nice too. But it's change, so like Ajraddatz I will probably hate it for at least three weeks from now... Rotsee (talk) 07:23, 2 April 2014 (UTC)
Even after modifying my vector.css I'm still seeing the font on items. --AmaryllisGardener (talk) 12:09, 2 April 2014 (UTC)
I am certainly fine with the change (Windows 7, FF2*)--Ymblanter (talk) 20:13, 2 April 2014 (UTC)
Complaints from IPs at the help desk at enwiki are pouring in, and they have to log in to create a personal css; and creating one doesn't change the font on all pages (like the special pages at enwiki and items here). This ugly font has to go. --AmaryllisGardener (talk) 15:59, 4 April 2014 (UTC)

Q1471454

Hi, I have some problems with this item: on the French Wikipedia, Phinée fils de Bélos has no link to the English version, but the other ones have. Thanks for your help. Franz53sda (talk) 21:08, 1 April 2014 (UTC)

You added that link to it six minutes ago :) Ajraddatz (talk) 21:12, 1 April 2014 (UTC)

Huey Tlatoani vs |Tlatoani; and... I can not edit manually those or other at all...

Can somebody fix both entries? because there are different articles on Wikipedia for both. Also, I cannot edit manually any entry at all since a few months ago... anyone has the same problem?... Best regards. —Jmvgpartner (talk) 06:12, 2 April 2014 (UTC)

Link:

Statements for items referring to more than one event

I've been adding statements to all of the FA Cup Final matches, and I'm not sure what to do about some items that refer to 2 separate matches (where the first match ended in a draw so a replay was required). An example of this is 1911 FA Cup Final. I thought it was reasonable to add two separate point in time values (although not ideal), but then realised we would need 2 separate values for all other statements which are different for each match (e.g. event location & attendance), and it would then not be clear which statements referred to which of the 2 matches. Any ideas about how best to deal with this situation? NavinoEvans (talk) 13:16, 2 April 2014 (UTC)

And what if you enter only the last one? You will lose some precision, but how many cases like that do you have? Again, I don't think we can model every exception. I know this will reduce the quality of the data, but on the other hand adding too many details will be a nightmare when dealing with the infoboxes or with the syntax of the data import code. Perhaps you can leave a comment on the talk page mentioning that data selection, to keep a trace. Snipre (talk) 13:38, 2 April 2014 (UTC)
I have tested one thing in Q239276. That item is about a province and an island with the same name. It links with has part(s) (P527) to one item about the province and one to the island, since they need both separate types of statements, and different claims. The province is located on an island, the island is located in a sea. The province was founded a few hundred years ago, the island was founded several thousands of years ago. One is a administrative unit, the other is a geological and geographic feature.
I guess you can do the same with your item. Have a statement in the first item that points to the two matches. -- Lavallen (talk) 15:12, 3 April 2014 (UTC)
Not really, because the best way to describe your case is to create 2 items: one for the geographical item and one for the administrative item. Snipre (talk) 18:26, 3 April 2014 (UTC)
But the articles are about both. -- Lavallen (talk) 07:02, 4 April 2014 (UTC)

Just to answer the first question from Snipre, I've come across 3 examples so far and I'm only up to 1911 FA Cup Final, so I imagine there will be quite a few more just in this series of items. For me, Lavallen's suggestion makes the most sense. I think having a separate distinct item for each match might be the only way to store the relevant data consistently. For example, if we just choose the last match for the data entry, then a graph of attendance of FA Cup Finals would appear to have a huge dip in the years where there were 2 matches (the examples so far have shown a massive reduction in attendance for the replay match). NavinoEvans (talk) 23:55, 3 April 2014 (UTC)
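
To make the two options discussed in this thread concrete, here is a rough Python sketch using plain dictionaries. Everything in it is an assumption for illustration: the item IDs (`Q_FINAL`, `Q_MATCH_1`, `Q_REPLAY`) are made-up placeholders, the attendance numbers are invented and not real figures, and the function names are hypothetical. Only the property has part(s) (P527) comes from the discussion above.

```python
# Placeholder data: IDs and attendance values are invented for illustration only.
ITEMS = {
    "Q_FINAL": {"label": "cup final (article-level item)",
                "P527": ["Q_MATCH_1", "Q_REPLAY"]},  # has part(s)
    "Q_MATCH_1": {"label": "first match", "attendance": 70000},
    "Q_REPLAY": {"label": "replay", "attendance": 20000},
}


def attendance_last_match_only(items, final_id):
    """Option 1: keep a single value per final, taken from the last match."""
    last_part = items[final_id]["P527"][-1]
    return items[last_part]["attendance"]


def attendance_per_match(items, final_id):
    """Option 2: follow has part(s) and return one value per match item."""
    return [(part, items[part]["attendance"]) for part in items[final_id]["P527"]]


if __name__ == "__main__":
    print(attendance_last_match_only(ITEMS, "Q_FINAL"))
    print(attendance_per_match(ITEMS, "Q_FINAL"))
```

With the first option a chart of finals would show only the replay value for such a year, while the second keeps both values but requires the consumer to follow one extra link, which is the added extraction complexity raised in this thread.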

NavinoEvans, having two items for the two matches is a good way to solve your problem, but the complexity of the data structure increases and data extraction in WP becomes very difficult. In my opinion we have to store data in WD in a way similar to how we use it in WP. And before you can display the data you have to extract it, so you need specific code able to read the data and pick out what you want. Splitting an event into 2 different items means that the extraction code has to understand that situation and go deeper into the data structure to find the desired value. Do what you want, but I have the impression that this complex configuration would be a problem for data use. Snipre (talk) 05:14, 4 April 2014 (UTC)
@snipre ... Wikipedia and Wikidata have not much in common when you consider complexity. The practices at Wikipedia are not consistent with ease of data retrieval in a database sense. Thanks GerardM (talk) 07:04, 4 April 2014 (UTC)
@GerardM I agree, but here we need a clear decision from the projects/contributors working on that kind of data to see 1) how accurate the data should be and 2) which structure should be used to describe this exception. I am afraid that NavinoEvans will develop their own solution and that nobody will follow that data structure in other cases or be aware of these exceptions.
The use of the data is not the priority, but it is important to think about it, especially at the beginning, because if the data structure is too complex or contains too many exceptions, nobody will use Wikidata, since extracting the data in Wikipedia would need too complex a Lua module. Access to the data is essential to attract people, especially if you expect their help to improve Wikidata. Snipre (talk) 09:11, 4 April 2014 (UTC)
Sorry, you do not parse. Thanks GerardM (talk) 10:38, 4 April 2014 (UTC)

@Snipre, I can really see how the extra freedom for item creation could cause issues with complexity, but I do think that it limits the potential power of Wikidata to remain too rigidly fixed to Wikipedia articles. After all, the decision about whether to create a single Wikipedia article or split a topic into a series of separate articles rests with the individual editors involved, so there is a lot of inconsistency between different topics and language versions (for example, there are hosts of paintings that are only listed in a single article). I just think the whole system would be so much more powerful if every distinct entity can potentially be an item, obviously with rigid guidelines in place, and all of the new items linked into the existing network of data by correct use of properties like part of (P361) and has part(s) (P527). Of course, something much simpler needs to be decided for this particular FA Cup Final issue though :) NavinoEvans (talk) 21:54, 4 April 2014 (UTC)

@NavinoEvans I was not thinking about the problem of the corresponding WD item - WP article because this is only one part of the problem. I was thinking on the data extraction of the date of your event in an infobox about football matches. If you never try to extract the information you put in WD or if you are not able to query the list of all dates for FA finals using WDQ, I think you aren't able to understand what complexity is. WD is a database and the purpose of a database isn't to be full of detailed things, but to provide complete answers when a query is performed. Snipre (talk) 04:01, 5 April 2014 (UTC)
Ah, I see what you mean now. Thanks @Snipre. So do you think it's best that I just enter the details of the second match played as you first suggested? NavinoEvans (talk) 20:02, 5 April 2014 (UTC)

ask for handling

Hey Wikidatians,
At the end of April, Austrian, Polish, German and Czech Wikimedians will come together to do some work for free content :) . We want to collect photos and geodata for National Heritage Sites in the Oberlausitz. Articles about these houses are relevant in the German Wikipedia; we also collect them in lists.

My plan is to use Wikidata to generate maps like this (some of the houses are marked here; maybe 300-500 of them are possible during this weekend). So we need an item for every house; most of them have no article yet. The aim is later to have a big map of all houses still existing in Germany, Poland and Czechia. What do you think is the right way to do this collection?

Thank you very much for ideas, Conny (talk) 15:14, 2 April 2014 (UTC). BTW: If you are a specialist, we need your help there - maybe you can join us and help to make things happen

To be clearer: is it a problem that I need items for National Heritage Sites without Wikipedia articles? Greetings, Conny (talk) 07:48, 3 April 2014 (UTC).

Theoretically, no. I think it fits the idea of Wikidata very well. But we have many users who read WD:N too literally, and they may start to question such items today. If you find a way to create internal links to such items (from ns:0 here), the problem will disappear. -- Lavallen (talk) 09:53, 3 April 2014 (UTC)

2 years of Wikidata development \o/

Hey folks :)

2 years ago today we started development on Wikidata. It's been an amazing ride. Thanks to every one of you for being a part of it and making this the great project it is. I wanted to share a real cake with all of you but technology failed me. Here's a picture of the yummy cake we had here in the office though: https://twitter.com/wikidata/status/451407996132659200/photo/1 ;-)

Cheers --Lydia Pintscher (WMDE) (talk) 17:25, 2 April 2014 (UTC)

Happy two years and thanks a lot for all the amazing work that has been done! Tpt (talk) 17:58, 2 April 2014 (UTC)
Thank you for creating Wikidata. Unfortunately I'm not able to eat that cake, but I can smell it through my screen. --Stryn (talk) 18:04, 2 April 2014 (UTC)
\o/ TomT0m (talk) 16:24, 3 April 2014 (UTC)

Tool required for data input for same source

When adding reference data WITH sources from biographical dictionaries, the process is tedious: for each piece of data, I have to add the same source in multiple parts (find and add source; reopen, then add volume number, add page number, and author). What I would like to be able to do is to set the source first with the source detail, then add the biographical detail. When the items can be things like date of birth, date of death, academic degrees, alma mater, parents, spouse, children, let us say that it becomes very tiresome, very quickly.

Alternative solutions accepted. Please ping me if there are specific answers.  — billinghurst sDrewth 23:45, 2 April 2014 (UTC)

@billinghurst: Use a bot. My proposal is simple: put everything in an Excel sheet and find a bot that can do the import task. Some requests for a gadget that allows reusing the same source data several times were already made, but it is difficult to find someone working in that field now. Snipre (talk) 07:34, 3 April 2014 (UTC)
In the past I developed a bot for that task (see user:Chembot), but I haven't modified it for several months and it isn't updated to the latest modifications of Wikidata. But I think that every bot operator can easily import data from an Excel sheet or a table if the structure is well defined. Snipre (talk) 07:40, 3 April 2014 (UTC)
@Snipre: Though I am not solely working on WD data, that would bore me shitless, and I have done my hard slog on indexing and not wanting that t-shirt again soon. I am usually working on a person xwiki, often enWS, Commons, enWP and here, so I want to update a person's details here, pull properties and move on.  — billinghurst sDrewth 04:16, 4 April 2014 (UTC)

The ability to cut and paste a reference easily from one statement to another would be a major step forward for the user interface. It should be a priority if we ever hope to build Wikidata's reputation for reliability and verifiability. Pichpich (talk) 17:46, 3 April 2014 (UTC)
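A rough sketch of the "set the source once, then add the details" workflow asked for in this thread. The dictionary layout, the function names and the sample values are all placeholders made up for illustration; this is not Wikidata's data model or any existing gadget, only the idea of defining one reference and copying it onto several statements.

```python
from copy import deepcopy

def make_reference(stated_in, volume, page, author):
    """Build one reference as a plain dict (hypothetical layout)."""
    return {"stated in": stated_in, "volume": volume, "page": page, "author": author}

def attach_reference(statements, reference):
    """Return new statements, each carrying its own copy of the reference."""
    return [
        dict(s, references=s.get("references", []) + [deepcopy(reference)])
        for s in statements
    ]

# Placeholder source and details, entered once each.
source = make_reference("Example Biographical Dictionary", "12", "345", "Example Author")
details = [
    {"property": "date of birth", "value": "placeholder"},
    {"property": "date of death", "value": "placeholder"},
    {"property": "alma mater", "value": "placeholder"},
]
sourced = attach_reference(details, source)
```

Each statement gets an independent copy, so correcting the page number on one of them later would not silently change the others.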

"Attaching" Q-data blue links (for existing articles) to red links (on Wikipedias where the said article does not exist) - feasible?

Is it possible for Wikidata to "attach" so that redlinks on other Wikis can show a listing of existing articles about this subject, using the Wikidata system?

For example, on the Russian Wikipedia a red link for the Russian for the American journalist en:Clifford J. Levy is paired with a blue link to the English Wikipedia: ru:Новая гуманитарная школа

On the Spanish Wikipedia: es:Wikipedia:Café/Archivo/Políticas/Actual#Question_about_inter-wikipedia_links_to_articles_on_other_language_editions_that_do_not_have_versions_on_the_Spanish_Wikipedia - An editor has concerns that a blue link to one or two languages would show a bias towards that language (for instance linking to the English Wikipedia article about a U.S.-related subject (that is not in Spanish) would skew things towards the English Wikipedia) - In my opinion linking to a list of existing articles would be the best solution but I want to check if it is feasible.

Thanks WhisperToMe (talk) 12:46, 3 April 2014 (UTC)

Hi, this has been achieved through templates: see Template:Red wikidata link (Q15977575). GerardM blogged about this. I myself added a link to the French wikidata link template, which was created a while ago. Of course this is not a really strong link. TomT0m (talk) 15:50, 3 April 2014 (UTC)

Discussions on possibly obscure pages

I'm not sure if anyone will see them otherwise, so I will post this here. I have commented on Property_talk:P133#Point, Property_talk:P134#Dialects_vs._languages, Property_talk:P1098#Native_speakers.3F. πr2 (t • c) 17:32, 3 April 2014 (UTC)

Karl-Heinz Bernhardt merging

I have linguistic difficulties understanding whether Q1453803 and Q1729814 are about the same person. They have the same VIAF ID, which seems to be about two different persons. --Sbisolo (talk) 13:01, 4 April 2014 (UTC)

It looks like Q1453803 is about a meteorologist born in 1935 and Q1729814 is about a theologian who was born in 1927 and died in 2004. Different. One must have the wrong VIAF ID. --AmaryllisGardener (talk) 14:27, 4 April 2014 (UTC)
I've added to both items some statements, which should help to identify them as two different persons. -- Pütz M. (talk) 19:06, 4 April 2014 (UTC)

Citing spreadsheets

Population of French communes should soon be uploaded to Wikidata. The data come from an 8 MB Excel file http://www.insee.fr/fr/ppp/bases-de-donnees/recensement/populations-legales/pages2013/xls/ensemble.xls . I thought it should be the URL cited in the source. However user:JulesWinnfield-hu (WD:Bot requests#Population of French communes) suggested that it would be better to link to the webpage from which the spreadsheet is available. This page is surely easier to read for most readers - and faster to load - but this is not really the page from which the data were extracted, and given that there are several side links and other links on the page, it might be slightly confusing. Do/should we have guidelines for that ? --Zolo (talk) 16:54, 4 April 2014 (UTC)

I think an item with a label which states what it is, with two properties such as official site and raw data, would be great. A URL is not really informative by itself, and a link to a spreadsheet is kind of useless without any context information. We should be able to know that we cite the French census without having to click. TomT0m (talk) 16:59, 4 April 2014 (UTC)
Help:sources section webpage. Snipre (talk) 17:51, 4 April 2014 (UTC)
And for the title, just use Populations légales 2011. Snipre (talk) 17:56, 4 April 2014 (UTC)
@TomT0m:
  • for official websites, there is already official website (P856), but I do not think it makes sense to use it in the sources part.
  • I think the source section should provide (directly or through the item contained in stated in (P248)) the data that would be needed if we wanted to provide the reference in Wikipedia (or in a scholarly article for that matter). That includes author, title, but not "French census" as such I think. To me if we want to say that these figures were produced as part of the French census, that should go to the qualifiers rather than the source (and actually in this case the figures were, in most cases, not a direct product of a sample but an extrapolation from previous censuses)
  • A dedicated "raw data" property sounds like a good idea.
@Snipre: I do not see anything answering my question in help:Sources (my question being which URL(s) we should provide as a source). --18:54, 4 April 2014 (UTC)
Sorry to say that but your question is brain masturbation: the goal of a source is to provide a link between WD and the data source. If you can easily access the Excel sheet from the webpage, put the webpage which links to the Excel sheet; or, if the Excel sheet is more difficult to find, put the direct link to the Excel sheet. In WP we never define how to use templates like here, so I don't understand why this should be different in WD. Snipre (talk) 04:26, 5 April 2014 (UTC)
An Excel sheet needs metadata to know what it is and where it comes from. In that case we need a dedicated item to describe the source. A census from an official organism, an independent estimation, a link to the website of the organism ... TomT0m (talk) 13:46, 5 April 2014 (UTC)
'Source' should link to the source, not to an excel spreadsheet which reproduces the source info in an easily accessible form - we already have the info in an easily accessible form on wikidata.
The bigger issue is how we will store all the census info on wikidata. If we are just going to store the Total population figure then it can be on the page for the census defined area but if we are going to store the breakdown of that total by race, religion, age and whatever other factor the census provides then I think we will have to have a separate item for "2010 census for Bar census district".
I think that when we start to store election results on wikidata we will need to do something similar - creating an item for each electoral contest such as "2010 general election in Foo electoral district".
Wherever you have a table of info you need to create a new item for it. Even economic info.
or we add hundreds more properties to existing items. Filceolaire (talk) 01:44, 5 April 2014 (UTC)
@Filceolaire: What do you mean by "'Source' should link to the source"? The Excel spreadsheet is the source. It is what we relied on to fill Wikidata, and there is no such thing as a "single official source for French census data" that would take precedence over it. The government's decree about municipal population figures has this bafflingly vague sentence: "Communes' population figures are those that can be seen in the tables on the INSEE's website" (art.2). --Zolo (talk) 08:10, 5 April 2014 (UTC)
He means that the real source is a process, which is an identifiable entity, of which the spreadsheet is the result. A spreadsheet is ... just a document, not always self-explanatory. TomT0m (talk) 13:46, 5 April 2014 (UTC)
The source is a process ? Ok, TomT0m, go to the end of your idea and for the url, just provide the web address of the INSEE, http://www.insee.fr. By the way just provide the http://, that's enough.
Please can we stop that joke ? WP is using url to source values/comments since several years without problem and suddenly this is a problem in WD ? Are we more stupid than wikipedians ?
There is no rule; the guy who puts the URL chooses how he will select it. The important thing is to be able to find the Excel sheet with the data if necessary. If there is a web page which allows that Excel sheet to be found easily and also provides some explanations, use the webpage. Otherwise provide the direct link to the Excel sheet. A Chinese or Indian guy doesn't care about the French explanations. Snipre (talk) 15:44, 5 April 2014 (UTC)
OK, the question has never been asked so :). This is no joke and pushing my reasoning comes nowhere near where you want to push it. TomT0m (talk) 16:54, 5 April 2014 (UTC)

Hi, for ERA 2010 journal list (Q15735759) and ERA 2012 journal list (Q15794938), I have used official website (P856) to link to the webpage, and reference URL (P854) to link to the data file. John Vandenberg (talk) 01:09, 6 April 2014 (UTC)

One of the groups from the #WikiHack event at the Sunlight Foundation today has put together a schema for adding Sunlight's bill data to Wikidata. Comments would be much appreciated! -- TaraInDC (talk) 21:06, 5 April 2014 (UTC)

Relationships between Wikimedia category pages in Wikidata

While waiting for comment/approval of user:MedalBot, I have been working on a tool to monitor Wikipedia activity related to an international event (e.g. the Paralympics). The current code is at User:JVbot/wikipedia-sync.py (If you want to see it 'working', I suggest running "wikipedia-sync.py -qid:1045816 -once" as I have been working on the related Wikidata items for a week now, on and off).

It currently suggests when a new Wikipedia category should be created, based on category membership of the same category on other languages. For example, on Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926), it reports that Christoph Kunz (Q1085367),Philipp Bonadimann (Q14530680),Markus Salcher (Q15919654) would all be suitable members of a category on German Wikipedia. However, there are two issues:

  1. it doesn't yet guess whether German Wikipedia wants Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926) ; the category system on each Wikipedia is slightly different, and the community has different goals.
  2. it can't (easily) guess what the new Wikipedia category for Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926) on German Wikipedia should be called.

Both of those could be 'solved' by linking Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926) to Category:Alpine skiers at the 2010 Winter Paralympics (Q6989544). If German Wikipedia did have that category for the 2010 Paralympics, they probably want the same category for the 2014 Paralympics, and replacing '2010' with '2014' in the category name is an easy solution to the second issue above.

I could solve this problem by adding a lot of per-Wikipedia category knowledge to the bot, but it would be nice if the link between Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926) to Category:Alpine skiers at the 2010 Winter Paralympics (Q6989544) was recorded in Wikidata in a consistent manner. John Vandenberg (talk) 00:51, 15 March 2014 (UTC)
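A minimal sketch of the "replace '2010' with '2014'" idea above, assuming the earlier category's title on the target wiki is already known. The function name is hypothetical and this is not taken from wikipedia-sync.py. It refuses to guess when the year is missing or appears more than once, since those titles need a human.

```python
import re

def guess_category_title(previous_title, previous_year, new_year):
    """Swap the year in a category title; return None if the swap is ambiguous."""
    # Match the year only as a standalone number, not inside a longer digit run.
    pattern = r"(?<!\d)" + re.escape(str(previous_year)) + r"(?!\d)"
    new_title, count = re.subn(pattern, str(new_year), previous_title)
    return new_title if count == 1 else None

guess = guess_category_title(
    "Category:Alpine skiers at the 2010 Winter Paralympics", 2010, 2014
)
```

The English title is used here only because it appears in the thread; the same call would be made with whatever title the 2010 item carries on the wiki in question.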

category combines topics (P971) is useful, but it isn't a good solution. Ideally we should be able to say that
  1. Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926) and Category:Alpine skiers at the 2010 Winter Paralympics (Q6989544) are both instance of (P31) Category:Paralympic alpine skiers by year (Q8713614);
  2. Category:Alpine skiers at the 2014 Winter Paralympics (Q15909926) is a part of (P361) Category:Alpine skiing at the 2014 Winter Paralympics (Q15925239), and is part of Category:Competitors at the 2014 Winter Paralympics (Q15915754);
  3. Category:Competitors at the 2014 Winter Paralympics (Q15915754) is an instance of Category:Paralympic competitors by year (Q8713879); etc.
It would be great to record, in Wikidata, the structure of Wikipedia categories across all languages. John Vandenberg (talk) 15:38, 15 March 2014 (UTC)
I would better use subclass of (P279) for 1 and 3. --Infovarius (talk) 18:49, 15 March 2014 (UTC)
Categories are neither subclasses nor instances (options 1 and 3) of other categories (as is trivially evidenced by the fact that they are instances of categories). What should be done is by connecting the categories to their actual topics.
As for the structure of all categories (option 2), I'd oppose this--category structures will differ across the wikis. It also induces a ton of maintenance upkeep, trying to agree with the wikis on which categories are subcategories of others. category combines topics (P971) combined with the relations established by topic's main category (P910)/category's main topic (P301) and instance of (P31)/subclass of (P279) would seem to do the job that you want with option 2. --Izno (talk) 19:31, 15 March 2014 (UTC)
Hi Izno, the complete category trees on each Wikipedia are of course different, and I do not want to create parent/sub categories relationships, because as you say, the relationship between categories can differ between Wikipedias. (And I am also not interested in relationships between pages and categories, for that reason and others). I am trying to semantically describe categories' relationships with other categories, where it makes sense. For example, two very top level categories, Category:Sports by year (Q7017138) and Category:Sport by year (Q12362196), both combine sport and years. How do we distinguish them with data? That will be hard, and probably not worth trying to do. But a more reasonable expectation is that Category:2014 Winter Olympics competitors for the United Kingdom (Q15726967) is most closely related to Category:2014 Winter Olympics competitors by country (Q15709210) on Wikipedia; somehow we need to model that relationship. Category:2014 Winter Olympics competitors for the United Kingdom (Q15726967) is implicitly a very strictly defined category - any Wikipedia that has that category will have the exact same kind of children categories. The semantic relationship between those two category items will always be the same.
Category:Competitors at the 2014 Winter Paralympics by country (Q15928741) only exists in Norwegian and Swedish Wikipedia, but they are the same thing. English has Category:2014 Winter Paralympians of Sweden (Q15865024) and doesn't have Category:Competitors at the 2014 Winter Paralympics by country (Q15928741), but that doesn't change the semantic relationship between Category:2014 Winter Paralympians of Sweden (Q15865024) and Category:Competitors at the 2014 Winter Paralympics by country (Q15928741).
Describing categories across the wikis is a lot of work, but I am interested in doing it for the areas I am working in, because it helps me to be able to navigate between category items in Wikidata, and I also use Wikidata category items in my bots to find Wikipedia pages in any other languages. For example in the Olympics and Paralympics, the 'first' article/thing about a medalist is likely to be written in the language of their country, or one of the bigger wikis. Often the first 'thing' to exist about a medalist is actually a Commons category, as they often appeared in photos before they become 'notable' enough for Wikipedia. The Wikimedia category trees are the easiest way to find the new pages as they are created, so that they can be added to Wikidata or linked to an existing item in Wikidata.
The category trees on each wikimedia project are remarkably similar. I have found that the category systems are often not linked together with interwikis (now Wikidata items), but they share the same structures anyway. The biggest difference is that smaller wikis are missing parts of the category system. There are occasionally big differences, but it is common for group of languages to use the same category system that is different from the category system used on other Wikipedias. (the Scandinavian language for example, care about winter sports more than anyone else; many European languages categorise periodicals differently to English; etc). Perhaps the similarities are because Wikimedia Commons categories have provided a unified category system for all Wikipedias to link to, or other wikis are copying category tree designs from each other but not bothering to create interwiki links. There are a lot of duplicate category items in Wikidata. The bots create a duplicate Wikidata item each time a Wikipedia creates the same category; if the category items had more semantic data, the bots should be able to automatically link new categories to the existing category item. John Vandenberg (talk) 17:22, 2 April 2014 (UTC)
@John Vandenberg: You have to be aware of that the fragmentation of the cat-tree in the Scandinavian projects is mainly a product of very few users with a very high editing-rate. That fragmentation has a very weak support in the community in those projects. -- Lavallen (talk) 18:07, 2 April 2014 (UTC)
@Lavallen:, in this case I think the explanation is that the Scandinavian Wikipedias have very good coverage of winter sports, and their coverage of Paralympic Winter Sports is often better than English Wikipedia due to their strong cultural ties with winter sports. Spanish and Portuguese Wikipedia also include the 'competitors at <event> by country' category structure. I know English Wikipedia is not keen on adopting it, calling it 'over categorisation' as they have the 'competitors at <event> by sport' and they don't want lots of categories on articles. Other Wikipedias are slowly adopting 'competitors at <event> by country', but smaller Wikipedias usually only have one or two of the children categories (i.e. 'competitors at <event> from <country>') for the countries which that language has a strength in (e.g. Ukrainian Wikipedia has <country> = Ukraine but not <country> = Australia). John Vandenberg (talk) 05:09, 3 April 2014 (UTC)
What I try to describe is that it's not the same users who set up the articles as the categories. "by year" is highly disputed and "people from" is also a very disputed fragmentation on svwp for example, and it has caused a lot of discussions. But we have failed to stop the few users who create these categories. The edit-rate of these users is 100-1000 times the edit-rate of normal high-active users. I have even questioned the "by country"-categories, with very little feedback. In what way is Norwegian skiers different from Swedish? It's only when they open their mouth to talk you may notice some difference and I doubt that somebody from Ukraine would notice it. -- Lavallen (talk) 06:32, 3 April 2014 (UTC) (And, yes, where I live, we have snow > 6 months per year.)
I think we should get rid of most of the Category items by combining them with their main topic items - both deal with the same topic. This will require changes to the software (See bugzilla 52971) so in the meantime we should link the main topic items, using 'subclass of' etc., and link the categories to these main topic items, using 'Category main topic' etc. IMHO Filceolaire (talk) 07:30, 16 March 2014 (UTC)
I don't think that's going to happen. Multichill (talk) 19:32, 20 March 2014 (UTC)

merging

hi;

congratulations on making merging (FAR) more complicated than it needs to be! ^__^

this single-entry item: https://www.wikidata.org/wiki/Q5621538

belongs in this listing: https://www.wikidata.org/wiki/Q1555465#sitelinks-wikipedia

i tried to add it, but apparently that was too easy & simple & obvious for wikidata. i saw your "merge requests" thing, but I CAN'T BE BOTHERED GOING TO ALL THAT TROUBLE, FOR WHAT SHOULD BE A SIMPLE <1-MINUTE PROCEDURE.

since i can't fix this item myself, i leave it to somebody else to do (extra) work here.

if you ever get yourselves sorted out, so that such simple merge (etc.) functions don't have to be handled in such a cumbersome, roundabout manner, let me know? till then, i shall avoid putting any effort into discovering duplications/needed merges, because this is bloody ridiculous.

adieu,

Lx 121 (talk) 03:06, 5 April 2014 (UTC)

Well, let's translate your statement: "I'm new to Wikidata. Can someone explain to me how to merge items?" Of course, Lx 121, we are happy to help you. Click on Preferences / Gadgets:
Merge: This script adds a tool for merging items
After this 5 seconds you need 5 more seconds to merge the two items. Makes 10 seconds, including RfD that has been automatically created. Cheers --Kolja21 (talk) 05:13, 5 April 2014 (UTC)
Well, everything is easy when you know how to handle it. But in the case of merging different Wikidata-files, it seems that only some elected people can do it. I also do not know how to merge, or even how to delete a false Q-type here. This Q15110449 has to be deleted, as I noted yesterday in the article (without anyone noticing), because it belongs to Q3058793. If anyone likes, (s)he may change it or leave it wrong as it is! I am in despair; I wish things were as easy as in every local wiki, but here it is terrible!!! Thanks to everyone who understands :) Chivista (talk) 15:47, 5 April 2014 (UTC)
P.S. I mean that I do not know how to merge different Q-identities. This is difficult, indeed. Would it be easier, more people could/would help!!! Chivista (talk) 15:54, 5 April 2014 (UTC)

Merging items on Wikidata:

  • Remove link(s) from one item.
  • Add the link(s) to the other item.
  • Request deletion of empty items at WD:RfD.

Or activate your merge gadget in your preferences, and the above is done automatically. More detailed info is found at Help:Merge. I don't see how that is any more complicated than the merging and deletion procedure at Wikipedia. There you have a complex system of templates, wiki markup syntax and PROD/AfD/RM/CfD/TfD etc. appeals to go through, or you can activate the Twinkle gadget to do it automatically. If you can think of a way of improving the ease of merging, please suggest it at WD:Contact the development team. Delsion23 (talk) 15:41, 6 April 2014 (UTC)
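The three manual steps listed above can be sketched on plain dictionaries. This only illustrates the procedure and is not the merge gadget's code; the function name, the item layout and the placeholder sitelinks are assumptions. A sitelink is left in place when the target already has a different page for the same site, because that case needs a human decision.

```python
def merge_items(source, target):
    """Move sitelinks from source to target.

    Returns (conflicts, source_is_empty). An empty source item is the one
    that would then be listed at WD:RfD.
    """
    conflicts = {}
    for site, title in list(source["sitelinks"].items()):
        existing = target["sitelinks"].get(site)
        if existing is not None and existing != title:
            conflicts[site] = title          # leave it; needs manual review
            continue
        del source["sitelinks"][site]        # step 1: remove from one item
        target["sitelinks"][site] = title    # step 2: add to the other item
    return conflicts, not source["sitelinks"]

# Placeholder items, not real Wikidata content.
duplicate = {"sitelinks": {"enwiki": "Example"}}
main_item = {"sitelinks": {"dewiki": "Beispiel"}}
conflicts, request_deletion = merge_items(duplicate, main_item)
```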

Searching categories

Why when I am typing "Unionoida" in search field I am not getting Q9728608 in the list? Infovarius (talk) 16:24, 6 April 2014 (UTC)

It is a bug: when the label starts with "Category:", autocomplete works, but Special:Search doesn't. John Vandenberg (talk) 17:05, 6 April 2014 (UTC)

Do you find what you need with the Reasonator ? Thanks, GerardM (talk) 07:02, 7 April 2014 (UTC)
Are you asking me? If so, I don't understand your post here in this thread. John Vandenberg (talk) 08:55, 7 April 2014 (UTC)

New report

Hi, I've created a new report on property creation. You can see it at User:Jakec/List of users by number of properties created if you're interested. --Jakob (talk) 00:49, 7 April 2014 (UTC)

IEG proposal on the category system in the English Wikipedia, with an eye toward Wikidata

I have submitted a proposal for an Individual Engagement Grant for the first phase of a project looking at the category systems in Wikimedia wikis. In this first phase I will research the nature of the English Wikipedia's category system, as the first step in designing ways to optimize category systems throughout WMF wikis. In later phases, I plan to

  • Research how readers and editors utilize the category system in the English Wikipedia.
  • Investigate the category systems in other language Wikipedias and in other WMF projects.
  • Explore the value and feasibility of using Wikidata as the basis for the category system across WMF wikis. If deemed appropriate by the community, work with the community to develop and implement this.
  • Utilize user-centered design methodologies to prototype various enhancements to the category system to improve the user experience. If deemed appropriate by the community, work with the community to develop and implement such enhancements.

If you would like to endorse this proposal, you can do so here. I would also appreciate any other feedback, pro or con, which can be posted here. Thanks! Libcub (talk) 07:26, 7 April 2014 (UTC)

Jsc

What do you wonder about the Template: JscQ16126323, at http://wikidata.org/wiki/Q16126323

We don't wonder now, Sjoerddebruin wondered for us and deleted it. John F. Lewis (talk) 10:44, 7 April 2014 (UTC)

New Wikidata page

L'Auberge du Pont de Collonges (Q1880610) >> How can I add the English version? Thanks Mike Coppolano (talk) 11:07, 8 April 2014 (UTC)

In this case, pages Q1880610 & Q6455626 had to be merged. Already done for you. Matěj Suchánek (talk) 11:16, 8 April 2014 (UTC)
Thanks Mike Coppolano (talk) 11:19, 8 April 2014 (UTC)

Can I quote you on that, sister? (aka Wikiquote is here)

Hey folks :)

Just wanted to let you know that we have just enabled interwiki links for Wikiquote via Wikidata. Issues, questions and more please to Wikidata:Wikiquote.

Welcome, Wikiquote!

Cheers --Lydia Pintscher (WMDE) (talk) 17:55, 8 April 2014 (UTC)

Problematic edits including Commons category unknown value

Some of Special:Contributions/90.182.83.10 should probably be reverted. The Danish April sitelink edits were good, but needed a bit of tidying up. The changes to Category:Biblical apostles (Q7165129), Category:Aladdin (Q8381850) and Category:2014 in Israel (Q13626865) are all good. The others could have been mistakes or unintentional software issues. Is setting Commons category (P373) to 'unknown value' a bug? I've seen that quite a bit; can we set up an edit filter to prevent those? John Vandenberg (talk) 11:20, 8 April 2014 (UTC)

Do you know any other way to remove an incorrect commonscat and prevent bots from re-adding it (as they did with many items before: https://www.wikidata.org/w/index.php?title=Q6648272&action=history)? JAn Dudík (talk) 20:12, 8 April 2014 (UTC)

Flooders!

Goddamn flooders blocking various pages, like new pages! There, just wanted to get that off my chest. Palosirkka (talk) 17:36, 8 April 2014 (UTC)

Try https://www.wikidata.org/w/index.php?title=Special:NewPages&hidebots=1 maybe? --Jakob (talk) 17:42, 8 April 2014 (UTC)
Thanks, sure but it only blocks people who have the bot/flooder flag. Palosirkka (talk) 17:43, 8 April 2014 (UTC)
The problem is mass creation of new items via Widar. This blocks Special:NewItem. --Succu (talk) 17:49, 8 April 2014 (UTC) PS: Same problem as this --Succu (talk) 17:53, 8 April 2014 (UTC)
Shouldn't users obtain a flood flag before they use Widar to mass create items? —Wylve (talk) 12:01, 9 April 2014 (UTC)
See Wikidata:Bureaucrats' noticeboard#flood flag. --John Vandenberg (talk) 12:09, 9 April 2014 (UTC)

List of Qs

If I have a long list of pages in some wikipedia, how can I get a second column with their item IDs? --geraki talk 15:24, 9 April 2014 (UTC)

If you want to use the API, a query like this should do the trick. --YMS (talk) 15:43, 9 April 2014 (UTC)
I am talking about wiki pages or text files containing hundreds of links (never mind the exact format). I'm looking for an easy way; your suggestion needs a lot of scripting that I'm not capable of doing. -geraki talk 16:05, 9 April 2014 (UTC)
First solution: a bot. Second solution: the API, as YMS said (another example: 3 items), but there is a limit on the length of the string. Third solution: if the text list was created from a category, you can use Category tree intersection (example). --ValterVB (talk) 18:47, 9 April 2014 (UTC)
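For the API route, a hedged sketch of the scripting involved. It builds wbgetentities request URLs in batches, because the API limits how many titles fit in one request (50 is assumed here as the usual limit for ordinary accounts), and it turns a decoded JSON response into a title-to-ID mapping. The function names, the default site and the response layout shown in the comments are my assumptions about the API, not something given in this thread; the code performs no network access itself.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def build_queries(titles, site="elwiki", batch_size=50):
    """Return one wbgetentities URL per batch of page titles."""
    urls = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "wbgetentities",
            "sites": site,
            "titles": "|".join(batch),
            "props": "sitelinks",
            "sitefilter": site,
            "format": "json",
        }
        urls.append(API + "?" + urlencode(params))
    return urls

def titles_to_ids(response, site="elwiki"):
    """Map page titles to item IDs from a decoded response.

    Assumed layout: {"entities": {"Q42": {"sitelinks": {site: {"title": ...}}}}}.
    Pages without an item carry no sitelinks and are skipped.
    """
    mapping = {}
    for entity_id, entity in response.get("entities", {}).items():
        link = entity.get("sitelinks", {}).get(site)
        if link:
            mapping[link["title"]] = entity_id
    return mapping

sample = {"entities": {"Q42": {"sitelinks": {"enwiki": {"title": "Douglas Adams"}}},
                       "-1": {"missing": ""}}}
second_column = titles_to_ids(sample, site="enwiki")
```

Fetching each URL and decoding the JSON is left out on purpose; titles in the response may be normalised, so matching them back to the original list may need care.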

Time localisation bug

It seems that there was an update to show time values in the correct format for (at least) the Greek language, with the genitive form of the month (e.g. "9 Μαρτίου 2014" and not "9 Μάρτιος 2014"). But now, for every time value whose precision is a year or coarser, it adds "Ιανουαρίου" in front of it, e.g. "Ιανουαρίου 2014" (January 2014) and not just "2014", and "Ιανουαρίου 20. century"! -geraki talk 16:12, 9 April 2014 (UTC)

A new version of Wikibase was deployed last night (UTC) and had these issues. It is not something we can fix quick enough, so we just reverted Wikibase back to the previously deployed version. This does not affect Wikiquote at all. The localisation bugs should be gone. (If you have non-JS, you might need to purge a page if the dates are misformatted.) Aude (talk) 16:42, 9 April 2014 (UTC)

Uncertainty in a number

I added the HDI for Rio de Janeiro, but the value came out as 0.761±0.001 instead of 0.761. When I edit it to remove the trailing part, the page won't let me save. Other places, such as the USA, have the HDI without this uncertainty. How do I remove that trailing part? Rjclaudio (talk) 17:58, 9 April 2014 (UTC)
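A guess at where the ±0.001 comes from, for illustration only: quantity values are stored with a lower and an upper bound, and if no uncertainty is typed in, the bounds appear to default to one unit of the last digit entered. The sketch below reproduces that arithmetic under this assumption; the function name is hypothetical and this is not the Wikibase parser.

```python
from decimal import Decimal

def implied_bounds(text):
    """Return (lower, upper, step) assuming +/- one unit of the last digit."""
    amount = Decimal(text)
    # The exponent of the last written digit, e.g. -3 for "0.761".
    step = Decimal(1).scaleb(amount.as_tuple().exponent)
    return amount - step, amount + step, step

lower, upper, step = implied_bounds("0.761")
```

Under that assumption 0.761 is stored as the range 0.760 to 0.762, which the interface then displays as 0.761±0.001.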

Undeletion redux

Divorced but date unknown

I have added David Gilmour's past and current wives to his entry, Q178517. He and the first, Ginger Gilmour, were divorced before his remarriage, but we don't have the date. How do I show the divorce, so that we don't wrongly paint him as a bigamist? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:31, 9 April 2014 (UTC)

What about the qualifier end time (P582) with the special value "unknown"? --YMS (talk) 15:47, 9 April 2014 (UTC)
Nice idea, but that gives me a "no valid value recognized" error and won't let me save; it seems that the property requires an actual date. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:21, 9 April 2014 (UTC)
I made the edit YMS meant. — TintoMeches, 20:39, 9 April 2014 (UTC)
Andy, The "unknown value" that these guys talk about is a special value that can be selected from the tiny icon that appears on the left side of the value field, after you have selected "edit" to change it. - LaddΩ chat ;) 20:59, 9 April 2014 (UTC)
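For anyone doing this through a script rather than the icon: in the Wikibase JSON, "unknown value" is a snak whose snaktype is "somevalue" and which carries no datavalue. A minimal sketch of a spouse statement with such an end time (P582) qualifier follows; the helper name is hypothetical, P26 is assumed to be the spouse property, and the dictionary is trimmed to the parts relevant here.

```python
def spouse_statement(spouse_id):
    """Build a spouse (P26) statement whose end time (P582) is 'unknown value'."""
    return {
        "type": "statement",
        "mainsnak": {
            "snaktype": "value",
            "property": "P26",
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {"entity-type": "item", "numeric-id": int(spouse_id[1:])},
            },
        },
        # 'unknown value' is expressed by the snaktype alone; there is no datavalue.
        "qualifiers": {"P582": [{"snaktype": "somevalue", "property": "P582"}]},
    }

stmt = spouse_statement("Q5563037")  # Ginger Gilmour, as named in this thread
```

This is also why typing the word "unknown" into the date field fails: the field expects a date, while the special value is a different kind of snak.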
When I look at David Gilmour (Q178517) and his marriages, I'm missing one piece of information: the properties tell us that he married Ginger Gilmour (Q5563037) in 1975 and that this marriage ended at some unknown time, but they don't tell us why it ended. Besides divorce there's also the possibility that his wife died. (Those are the two obvious possibilities for me; I'm sure there are other, more obscure ways to end a marriage in the legal systems of the world.) I think this is important information and there should be a way to have this information in Wikidata. --Slomox (talk) 07:12, 10 April 2014 (UTC)
You're right, we have cause of destruction (P770) and cause of death (P509) but they don't fit this kind of usage. If you make a property proposal (something like "cause of ending") I will support it. Mushroom (talk) 09:44, 10 April 2014 (UTC)

Modifying license ?

Is there a possibility to change the current license of WD from CC0 to CC-BY in order to have access to more databases? We now have the structure to include sources for each statement imported from external sources, so we can upgrade the license in order to share more data with other databases. The big problem is to decide whether data coming from WP and sourced with "imported from" can be considered as sourced with a valid reference. Snipre (talk) 14:14, 4 April 2014 (UTC)

As far as I'm concerned changing the license defeats the purpose of this site.今時 (talk) 20:28, 4 April 2014 (UTC)
Can you explain why ? Snipre (talk) 20:52, 4 April 2014 (UTC)
In my opinion, CC0 is the ideal licence. It is the freest licence, allowing the greatest possible reuse of data and the closest thing to the public domain possible in many legal systems. CC-BY starts putting up boundaries and restrictions. Besides which, raw data is not always protected by copyright, so CC-BY may be copy fraud for a lot of content. - AdamBMorgan (talk) 23:45, 4 April 2014 (UTC)
@AdamBMorgan The CC-BY license corresponds to the correct spirit of Wikimedia: Wikimedia is not a reference source, so every piece of information should be sourced. I think people never understand what data is. It is a value AND a source. You can't split them, because if you split them you won't be able to do comparisons when different values exist for the same statement. Without a source nobody can trust a value and be confident to use it in another project. Snipre (talk) 04:15, 5 April 2014 (UTC)
@Snipre: I think CC0 is closer to the ideal of the spirit of Wikimedia and I try to release my own stuff under that licence whenever possible. CC-BY is just pragmatism, acknowledging that most people won't just dedicate their work to the public domain. Neither CC0 nor CC-BY have anything to do with sourcing. A downstream user can state a source for CC0 data if they wish and CC0 does not stop Wikidata from providing sources for all of its data. If the downstream user doesn't, that's their problem; they had the option and chose not to make use of it. CC-BY is just an application of French-style moral rights for the creator(s) of a piece of work, and even then, it can just be their pseudonymous name, which means nothing source-wise to anyone else reading the attribution. - AdamBMorgan (talk) 20:52, 7 April 2014 (UTC)
@AdamBMorgan You are right that CC-BY doesn't deal with sourcing, but if WD applies a strict policy about sourcing, why not directly use CC-BY, which requires citing the author, and have the opportunity to deal with more databases? Sourcing and stating who provided the original data is not optional: 1) it is just an act of justice for the author, 2) it is an important step towards understanding the value of a datum and making comparisons. My reasoning is very simple: we want to do a correct job with WD, so we source every datum taken from an external source; by doing that we comply with CC-BY, so we can use that license. CC-BY is not a constraint because it is the natural way to do a good job in documentation. Snipre (talk) 21:31, 7 April 2014 (UTC)
@Snipre: Wikidata has internal policies to deal with sourcing. The licence only matters for re-users: a CC-BY licence only means that re-users have to name Wikidata as the source. It does not mean Wikidata has to provide sources. That being the case, being CC-BY won't make Wikidata any more reliable, nor any more well sourced. What CC-BY will do is put up barriers. We would be saying to people "No, you can't use our data unless you jump through these hoops." CC0 is free. It has no barriers, restrictions, hoops, hurdles, obstacles or constraints (except those that are inescapable by law). It means anyone can use this data, in any way, for any reason. This project it probably powering Google's Knowledge Graph information boxes, or is intended to (Google did put up a lot of the money to fund Wikidata). Google can do so under CC0 without linking back to Wikidata or crediting each and every person back down the chain. Everyone else can do so too. That is providing everyone with the sum of human knowledge; CC-BY is providing it only to those that agree to certain preconditions and aren't scared off by the legalese of the licensing terms. - AdamBMorgan (talk) 22:15, 7 April 2014 (UTC)
@AdamBMorgan: No, you aren't right: you are mixing different things. WP and WD are not creators of data. The author mentioned in the sources should always be cited. WD can receive some attribution but should never be considered the author. So CC-BY requires giving the name of the author, which implies giving the original source of a datum. So if you provide the sources of your data you respect that point of the CC-BY license, which is the most critical one. I don't understand why you consider citing sources a barrier. Why is it a restriction? Again, it is never a constraint if you are doing a good job, because a good job automatically implies citing the sources. And here you see the critical point of a database: to be able to use a set of data you need comparable data. If I take Denny's example of the cities of the world and one typical datum like the population, you can do a good job only if you have data of the same year and, if possible, calculated with the same method. For that you need to find sources offering the largest set of data with the same parameters, that is, other databases. Working with databases also reduces the list of authors to cite, because you can source a whole table with the author of the data (not WD) and mention WD as attribution if you want (depending on the use of the data from WD; in the case of a short citation the name of the author is sufficient). Snipre (talk) 23:16, 7 April 2014 (UTC)
@Snipre:No, I think you're mixing things up. Licensing is completely irrelevant to sourcing. Wikidata can source claims while under a CC0 licence. Wikidata does source claims while under a CC0 licence. Licensing only affects downstream users. If I paint a picture and license it as CC-BY I am placing a barrier on re-use (they have to give me credit; if they don't, they can't use it). If I released the picture as CC0 there is no barrier (anyone can use it without having to fit into arbitrary constraints such as giving me credit). The same applies to Wikidata's data. Under CC-BY a reuser (e.g. Google, but this applies to everyone) would have to give credit to Wikidata, all the original sources, and possibly all the editors inbetween. What if Google doesn't want to do that? What is Google doesn't care about doing a good job and just wants a quick information box? Under CC-BY we would be forcing Google to source each and every datum; we would be placing a restriction on access to the data. Under CC0, Google has the option of citing our sources; they can choose for themselves to either add citations or not bother. (Again, I am using Google only for convenience; this applies to each and every user.) None of that affects whether or not Wikidata sources its claims, nor whether or not Wikidata is doing a "good job". - AdamBMorgan (talk) 12:25, 8 April 2014 (UTC)
No, I know the difference, but as I said, if we apply a strict policy about sourcing, we are safe concerning CC-BY; this means we don't have to modify our work. You are right about the license, it will affect the data users, but if the data user applies a strict policy about sourcing too, he just has to mention WD as an additional parameter in his references to respect CC-BY. What I mean is that CC-BY is not a problem for a data user who works in an intelligent way. The main question is not CC-BY but whether we apply a strict policy about sources. If yes, CC-BY won't change anything for us. For WD users it is the same. I am not working for Google, but for WP, and the Wikipedians want to have sources, so yes, every datum should have a source in WD because the main users of WD, the Wikipedias, require it. If you don't provide sources, WD is a copy of Freebase, so why do we have to spend hours on data import if Google can do it with its computers? If we have a proper organization for sourcing (and we have it), respecting CC-BY is not a problem because you can grab the source data in the same way as the value you want.
And you care about data users who don't want to source, but what about those who do source their data? Following your reasoning they will have hundreds of different sources for one type of data, because with a CC0 license we weren't able to import data from a single database.
Again, here we have a question about the objective of WD. If WD is only a free data supplier on the web, I think we are making the wrong choice, because Google will always make better tools than us (or we will do something similar to DBpedia) and specialists will always use their specific databases. Now, if we want to provide good quality data for WP, changing the license won't be a problem because they will respect the CC-BY license. Snipre (talk) 11:57, 9 April 2014 (UTC)
In the opinion of the WMF legal team "For EU databases, bots or other automated ways of extracting data should also be avoided because of the Directive’s prohibition on “repeated and systematic extraction” of even insubstantial amounts of data." Raw data is subject to this directive once it is put in a database. Many of the most useful databases are under CC-BY compatible licenses and I believe it would be worth switching Wikidata to that license so we can import their data. Any reputable reuser of our data would be happy to note where the info came from so I don't think it will create problems for them. Filceolaire (talk) 01:59, 5 April 2014 (UTC)
Agree! If a lot of data from Europe have to be > 15 years old before we can add them here, the gain of Wikidata becomes limited. -- Lavallen (talk) 09:15, 5 April 2014 (UTC)
Agree too. That said, I am wondering about uses like qlabel mentioned above. Would we need to make an exception for the label, or would there be other solutions (since the label does not necessitate any mention of a more primary source than Wikidata, perhaps it is ok in this case to just mention somewhere in the client website that it directly or indirectly uses data from Wikidata). --Zolo (talk) 09:30, 5 April 2014 (UTC)
For applications like qlabel a simple mention is enough. There are a lot of websites which are copyrighted but this is mentioned in only one place. Snipre (talk) 11:10, 5 April 2014 (UTC)

I stated my opinion on free data here. I think it would be a huge mistake to not have a CC0 license. You say "For applications like qlabel a simple mention is enough" - which is not correct. CC-BY requires "You must give appropriate credit, provide a link to the license, and indicate if changes were made." Appropriate credit is defined as "If supplied, you must provide the name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice, and a link to the material. Prior versions of CC licenses have slightly different attribution requirements." (Source: CC-BY)

Furthermore, you cannot simply grab the data from Wikidata and display it - the suggestion is to change the license of Wikidata so we can integrate data from external CC-BY sources. This in turn means that whenever some data from Wikidata is used, the reuser needs to check whether the statement has a source which itself is CC-BY, because that source needs to be mentioned as well - it is not sufficient to give credit only to Wikidata; all sources must be credited as well. Imagine a barchart of the population of the largest cities by continent, and imagine that each city's population comes from a different, CC-BY licensed source. A simple barchart would require lines and lines of required crediting, and half a dozen links. Is that really what you want?

Also, to estimate the benefit of that switch - Which sources could we add additionally? What would be the gain? Why not CC-BY-SA, which would open even more sources? Etc.

I find the original argument "let's choose a more restrictive license, because it allows us to add more data" as faulty as if we would have been saying in 2003 for Wikipedia "let's choose a more restrictive license because then we could easier agree with content providers who want to donate their content to Wikipedia". No. If a data source sees the ridiculous and erroneous need not to be under CC0, well, let them do so until they understand how foolish and wrongheaded that is. But please keep Wikidata free. --Denny (talk) 17:10, 5 April 2014 (UTC)
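To make the barchart scenario above concrete, here is a minimal sketch in Python of the bookkeeping a reuser would face. Everything in it is a hypothetical illustration: the city names, source names and the flat `license` field are invented for the example and do not reflect how Wikidata actually stores references or licence information.

```python
def required_credits(statements):
    """Return the distinct sources that would need a credit line, in first-seen order.

    Each statement is a plain dict with hypothetical keys 'source' and 'license';
    this is an illustrative model, not the Wikidata data model.
    """
    credits = []
    for statement in statements:
        needs_credit = statement["license"] != "CC0"
        if needs_credit and statement["source"] not in credits:
            credits.append(statement["source"])
    return credits


# Invented example data: one population figure per city, each from its own source.
statements = [
    {"city": "City A", "population": 1000000, "source": "Statistics office A", "license": "CC-BY"},
    {"city": "City B", "population": 800000, "source": "Statistics office B", "license": "CC-BY"},
    {"city": "City C", "population": 600000, "source": "Open dataset C", "license": "CC0"},
    {"city": "City D", "population": 400000, "source": "Statistics office A", "license": "CC-BY"},
]

credit_lines = required_credits(statements)
```

In this toy model the chart over four cities already needs two separate credit lines in addition to any credit for Wikidata itself; with one distinct attribution-requiring source per city the list grows with every bar.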

I agree with Denny, especially the last paragraph. Wikimedia was a key player in the adoption of CC-BY-SA and CC-BY, and that makes sense for artistic works. It doesn't make sense for databases of facts. I personally hope Wikidata never contains any data which is copyrightable - that is creative/artistic content, and it belongs on a separate project better suited to evaluating and combining creative contributions. Beyond the personal preference, I should also point out that using CC-BY is going to introduce all the problems of attribution of deleted entities - when we merge items, the history of the deleted item is not archived in a CC-BY compliant manner currently. If we assume copyright isn't relevant, the remaining issue is database rights. The extent to which EU databases are going to present a problem is yet to be seen. The EU directive is only one aspect. Wikipedia and Wikimedia Commons have had a very good track record in changing copyright practices of the content producers, and I am quite sure Wikidata will be just as successful. We are already seeing partners complain that Wikimedia Commons is not CC0. I expect that most databases we want to import will switch to CC0, or Open Database License (Q2419612) (like OpenStreetMap (Q936)) for the facts contained in their database, with CC-BY or similar for the artistic works contained within the database. John Vandenberg (talk) 04:17, 6 April 2014 (UTC)

Where I live, something can be protected by law even if it isn't "creative/artistic". I cannot add such material here and I cannot use such material in WP, no matter if it's only facts. I even doubt that CC0 even is compatible with the law here, since I theoretically never can give up the copyright of my work. -- Lavallen (talk) 07:59, 6 April 2014 (UTC)
Which country are you referring to? is there a court opinion in your jurisdiction which causes you to believe non-"creative/artistic" enjoy copyright protection? John Vandenberg (talk) 14:40, 6 April 2014 (UTC)
49 § Swedish copyright law protects databases, if they lack creative content, for 15 years. If the database is modified, the 15 years are prolonged.
And photos which are not considered creative are protected for 50 years.
It's the workload which is protected, not the artistic work.
Also photos taken from aeroplanes and of strategic installations, like military camps, are protected. I even break the law when I tell you that the local branch office of the national bank in my neighbour city has a yellow painting. -- Lavallen (talk) 17:42, 7 April 2014 (UTC)
Switching to CC-BY would not solve any of these problems, though. --Denny (talk) 21:44, 7 April 2014 (UTC)
A lot of databases are accessible under a CC-BY-compatible license already today. That's why we today can use them on Wikipedia. Without a link to the database, the use of that data would be illegal also on WP. -- Lavallen (talk) 11:11, 8 April 2014 (UTC)
@Denny:. Actually, it depends on what we consider to be copyrighted in a copyrightable database. Apparently, there is no database right in the US, and that means we can just extract whatever we please. But things are different in the EU and other jurisdictions. What should we make of the legal team opinion cited by Filceolaire? If it means that we can't upload data from CC-BY-but-not-CC0 compatible European databases, this is a major loss (it seems to include all major open data websites, including such basic data as official population figures).
I do not think that the Wikipedia comparison is entirely relevant. In Wikipedia, if we can't get a ready-made external article, contributors can always create it using info they got from copyrighted sources. Here, if we are not allowed to get the data from an external database, there may not be any good way to get them at all. --Zolo (talk) 08:49, 6 April 2014 (UTC)
@Zolo:: I find the answer to your question obvious and painful: I am not a lawyer. I cannot and must not give legal advice. Listen to the legal team. They are amazing, and they know their job. If they say not to write bots that copy data from restricted databases, don't do that. (And if you are in the EU, please get your governments to release basic data under non-restrictive licenses and get your chapters to lobby for that). Let's have Wikidata flourish and bloom - and one day they will want to get in. And then they will need to change. Don't have us change for them. --Denny (talk) 16:44, 7 April 2014 (UTC)
We are together with the se.chapter trying to have the geoshapes of some entities under a free license. But since it's a private company that makes them, I doubt that it would be easy to make them release them under PD/CC0. Map designers and local authorities are the customers of this company and if they can no longer make any money, I'm afraid that the information will not even be produced in the future. The national government is not interested in paying the company that money. -- Lavallen (talk) 18:09, 7 April 2014 (UTC)
@Lavallen:: Is that company willing to release the data under CC-BY instead of CC0? If yes, why would their customers pay for that data? I fail to see how releasing the data under CC-BY or CC-0 makes any difference for the business model of the company. In both cases their data can be used for free. What am I missing? --Denny (talk) 19:37, 7 April 2014 (UTC)
They are maybe willing to release a few years old data to a free license. Now we have only access to 20 year old data and that by copying data from one of their customers, a customer we already are cooperating with. -- Lavallen (talk) 11:11, 8 April 2014 (UTC)
Another example might be shapefiles for municipalities, counties and electoral districts in Sweden. I believe that since the Swedish Ordnance Survey is by law obliged to give necessary data to the election authority and since they do not sell the data to that authority, it is free under Swedish Copyright Law, URL 9. However, I am not absolutely sure about my interpretation and therefore I asked the Ordnance Survey with a reference to Commons licensing. The response was that it is OK to use the data provided that they are attributed with "© Lantmäteriet". These shapefiles of election districts are of high quality and can be used to extract borders of communes and counties and can form the base for maps of a number of other administrative areas. They are of course perfect for visualisations of election results. Edaen (talk) 13:46, 8 April 2014 (UTC)
@Denny: the strange thing is that we are apparently allowed to use data from non-CC0 database as long as it is not done in a systematic fashion. But crowd-sourcing + human/bot intermingling might make the distinction quickly moot. We may not be allowed to upload the whole official population database of the EU, but we are allowed to import parts of it. What if it is copied by tidbits over a few years by tens of contributors ? For the record, there was a discussion about importing data from the French National Library (which are under a CC-BY of sorts, and the institution's open data manager apparently didn't see how importing them to Wikidata could be a problem, see Wikidata:Requests for permissions/Bot/SamoaBot 32). --Zolo (talk) 05:19, 8 April 2014 (UTC)
+1 That is a problem nobody foresaw: for most data imports we can work under the short citation rule, which means that by sourcing the data we don't have to obtain a use agreement. But for some data there are very few good sources, or users will select some sources because of their accessibility, and after some years we will have a large portion of different databases included in WD. Here we will have a problem even if it was not our intention. Most public data providers require credit as a condition for using their data, and often prohibit commercial use. We will have a problem with the commercial-use condition, but if we can already fix the problem of the credit, we will protect our interests. Snipre (talk) 12:14, 9 April 2014 (UTC)
+1 It doesn't make sense that "adding data in small pieces" is considered valid and importing the same data with a bot "illegal", because, as has been said, maybe after several years (or decades) we'll have imported the same amount of data through "individual edits", probably from the same sources too (population figures at least). As long as we avoid this "attribution issue", the better for us. The point is this licensing change... that'd be a point of no return, wouldn't it? Totemkin (talk) 12:42, 9 April 2014 (UTC)
After some Google-search, it seems that that just like for Sweden, lawyers consider that the CC0 license is meaningless in France (because the law says that the author's moral right are "inalienable"). I guess that keeping a CC0 license would mean that we do not try to respect local law (and presumably that we only try to respect US law). In that case, and given that there are no database-rights in the US, I suppose we can upload data from non CC0 databases as well ? --Zolo (talk) 21:23, 9 April 2014 (UTC)
I would strongly recommend to ask the WMF legal team about input on this question.
Also, my understanding of CC0 is not that you give up your authorship (i.e. your moral rights), but rather that you explicitly state that everyone is free to do whatever they want with your creation, without restrictions - something that actually requires your authorship in the first place. That is the part relevant here.
Also, claiming that CC0 is meaningless in France and Sweden seems like FUD to me. Is there any court decision that actually confirms that? Any statement by CC? --Denny (talk) 17:23, 10 April 2014 (UTC)
After this discussion and after the answer of @Lydia Pintscher (WMDE): in Wikidata:Contact_the_development_team#License I am a bit worried. Maybe it is better if I delete all the population data of Italian municipalities that I have added with my bot, because the only source of the data is Italian National Institute of Statistics (Q214195) and they have a "CC-BY" license (Legal notes). In this situation no one can add population (P1082), so I will delete all the population (P1082) statements that I have added with my bot this weekend (Saturday or Sunday). --ValterVB (talk) 19:32, 10 April 2014 (UTC)
Yeah, too bad. I wanted to add the Dutch municipality population figures with my bot, but the Central Bureau for Statistics requires attribution, so it's not CC0. Michiel1972 (talk) 19:49, 10 April 2014 (UTC)
Same for French and Swiss data about statistics. 141.6.11.19 09:00, 11 April 2014 (UTC)

geographical Properties missing: height (e.g. of mountain peak) and area (e.g. of a lake)

  • Height/Elevation (e.g. of mountain peak) unit=meter
  • surface area (e.g. of a lake) unit=km*km;

no such thing in wikidata, but wikipedia has them typically:

Aleks-ger (talk) 23:02, 11 April 2014 (UTC)

We don't have the number with dimension datatype yet. Not sure when it's supposed to arrive. --Jakob (talk) 23:40, 11 April 2014 (UTC)
How can we push the process? These numbers could be restricted to SI units at first. Aleks-ger (talk) 09:10, 12 April 2014 (UTC)
See Wikidata:Development plan#Quantities and Wikidata:Requests for comment/Dimensions and units for the quantity datatype. - LaddΩ chat ;) 14:04, 12 April 2014 (UTC)

All labels are broken

I see all these <wikibase-itemlink>‎‎, <wikibase-history-title-with-label>, <wikibase-sitelinks-wikipedia>, <wikibase-aliases-label> and what is worse - I cannot see label changing in diffs. Infovarius (talk) 06:08, 10 April 2014 (UTC)

Same problem. Both in english and greek. -geraki talk 06:17, 10 April 2014 (UTC)

:( Arctic.gnome (talk) 06:31, 10 April 2014 (UTC)

Arghhhh. We're looking into it. Not cool. Sorry folks. --Lydia Pintscher (WMDE) (talk) 06:59, 10 April 2014 (UTC)
Still investigating. It's only happening now in JS (obviously no good) and trying to figure out how to fix this. Aude (talk) 09:27, 10 April 2014 (UTC)
Messages should be fixed now on Wikidata. You might need to purge your browser caches. There still are issues with a few messages in the clients (e.g. Commons) and working on that. Aude (talk) 13:35, 10 April 2014 (UTC)
@Aude: Nope. Labels are still defective for me. - LaddΩ chat ;) 13:40, 10 April 2014 (UTC)
We refreshed the caches, rebuilt all the things and everything possible, and put wikidata back on wmf21 (core). I think it should be fixed for real now, but can folks please confirm. Aude (talk) 15:17, 10 April 2014 (UTC)
Now switched to correct labels for me. Ahoerstemeier (talk) 15:20, 10 April 2014 (UTC)
Yay! I think I know what to do to fix this quicker if it happens again (which scripts to run) and maybe how to avoid this in the first place. (scripts to run yesterday when we switched wikibase back to older version, due to time localisation bug) Thanks for patience with the problem. Aude (talk) 15:45, 10 April 2014 (UTC)
All is correct now. Thank you, friends! --Infovarius (talk) 16:47, 10 April 2014 (UTC)
It's happening again, but only on certain things. I'm seeing <wikibase-statements> and <wikibase-sitelinks-wikipedia>. --AmaryllisGardener (talk) 14:25, 11 April 2014 (UTC)
Nevermind, it's ok now. --AmaryllisGardener (talk) 14:27, 11 April 2014 (UTC)
Hm, I see raw variables here. Infovarius (talk) 18:57, 12 April 2014 (UTC)
Good to see this fixed. But to put it in perspective, this did not make the interface all that much worse; something like the "In other languages"-hindrance is at least as troublesome. - Brya (talk) 07:32, 13 April 2014 (UTC)

Deletionists

Why was Pong (Q16263330) deleted? It was not an orphan item - Wat Phra That Doi Yuak (Q16004213) links it as the subdistrict in which that temple is located; and if I am not mistaken my bot already had statements added to it. That item clearly is within the notability policy! If Wikidata does not want entries on all administrative country subdivisions, even they don't have an Wikipedia article yet, then I can stop editing here! This happened the second time to me already, again without even being notified, so I am starting to get pissed! Ahoerstemeier (talk) 16:52, 10 April 2014 (UTC)

The item as you created it was blank besides the label, so how were we supposed to know? Then again, that was a pretty rapid deletion. This should be discussed at the administrators' noticeboard.--Jasper Deng (talk) 17:16, 10 April 2014 (UTC)
@Ahoerstemeier: I restored it. Please try to add more statements to items to protect them from being deleted. Mistakes happen during housekeeping but it is one of the most important tasks. Tobias1984 (talk) 17:41, 10 April 2014 (UTC)
OK, seems it was one of the around 30 items I created last evening, all linked but not yet filled by my bot - planned to do that once I had all the subdistricts needed to connect the Thai temples with their location, which was earlier today when the bot spotted the missing item. Why was just one of the 30 deleted, was it just the name which looked suspicious to be a test? Seems to me that housekeeping process is still flawed here, there need to be more helper pages like User:Pasleim/notability to spot the really problematic items. By random manual deletion of suspicious items it will have many bad items survive unspotted, and collateral damage like this one. Ahoerstemeier (talk) 20:26, 10 April 2014 (UTC)
@Ahoerstemeier: This item was probably more problematic because it is a short string and therefore causes many duplicate warnings (most likely with the game "Pong"). It would be better if the bot added an instance of (P31) or subclass of (P279) statement immediately after item creation. The bot should also fill the description and not just the label. The job of manually housekeeping a database of this size will always cause some mistakes. Hopefully we will have better tools in the future to tackle this. In any case it is good that we all keep an eye out for each other and assume good faith. That will keep this project on its course. Tobias1984 (talk) 21:22, 10 April 2014 (UTC)
Thanks for notifying me of this discussion, guys... But anyway, I was just cleaning up and there was only a label for this item. Sorry, I will wait longer next time. Sjoerd de Bruin (talk) 13:46, 11 April 2014 (UTC)
@Sjoerddebruin: Sorry, I thought I added you to the ping template but I forgot. Tobias1984 (talk) 11:47, 13 April 2014 (UTC)

@Ahoerstemeier: If you have any suggestions to improve User:Pasleim/notability or ideas for new helper pages, feel free to tell them. --Pasleim (talk) 15:00, 13 April 2014 (UTC)
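As a rough illustration of Tobias1984's suggestion above (label, description and an instance of (P31) statement supplied together at creation time), here is a minimal Python sketch that only builds the JSON a bot could pass as the `data` parameter of the Wikibase `wbeditentity` API module. It makes no network request. The helper name `build_new_item_data`, the description text and the class item `Q99999999` are placeholders invented for the example; a real bot would use the actual class item and its own framework's login and token handling.

```python
import json


def build_new_item_data(label, description, class_qid, lang="en"):
    """Build entity JSON with a label, a description and one P31 statement.

    class_qid is the item the new item is an instance of, e.g. "Q123".
    """
    numeric_id = int(class_qid.lstrip("Q"))
    return {
        "labels": {lang: {"language": lang, "value": label}},
        "descriptions": {lang: {"language": lang, "value": description}},
        "claims": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P31",  # instance of
                    "datavalue": {
                        "value": {"entity-type": "item", "numeric-id": numeric_id},
                        "type": "wikibase-entityid",
                    },
                },
                "type": "statement",
                "rank": "normal",
            }
        ],
    }


# "Q99999999" is a placeholder, not the real class item for Thai subdistricts.
payload = build_new_item_data("Pong", "subdistrict in Thailand", "Q99999999")
data_param = json.dumps(payload)  # value for the 'data' parameter of wbeditentity
```

An item created this way is never in the "label only" state that made Pong (Q16263330) look like a test edit, which is the point of the suggestion.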

Data release email templates

Splitting off from #Modifying license ?, we should develop template emails to send to rights owners of datasets, asking them for CC0 release and also a release of Database-Rights, like commons:Commons:Email templates. For datasets we already host, the template should inform them that some of their data is in Wikidata, and will be deleted if they do not explicitly consent to the data being in Wikidata. John Vandenberg (talk) 23:36, 10 April 2014 (UTC)

I really like this idea. We should probably have the legal team do the final review on those, but we can probably draft good templates to use in a wide array of cases. --Denny (talk) 03:38, 11 April 2014 (UTC)
 Support with help from m:Legal. And a notice like "if you don't give us your data, we will take it from a better source!" XD --Ricordisamoa 03:15, 12 April 2014 (UTC)
Wikidata:OTRS but here we need the Wikidata team to do the job: they have to define what is required to be in the good side of the law.
@Ricordisamoa Don't expect too much from the legal team of the WMF: as Wikidata is a separate project, they always send you back to the Wikidata team when you ask too precise questions. I have already had this experience with questions about databases. WD is not under the responsibility of the WMF. Snipre (talk) 07:05, 12 April 2014 (UTC)
I do not understand this idea. When we aggregate data from many sources, we will overlap with other databases. As I understand the law, there is no issue at all when we do so. The fact that we include the same data does not make us a copy of a database. My question is what is the point ? GerardM (talk) 08:30, 12 April 2014 (UTC)
@GerardM "When we aggregate data from many sources..." there is the problem: if we are not using enough different sources we will include large parts of some databases and then we will fall under the law. And in some cases there is only one source. One example is the population of the communes in different countries: there is only one source for this kind of data; even if you find the data in different documents, there is only one authoritative source. So through a systematic import by different contributors you will in the end have a large part of the official databases included in WD, and then we have to deal with the law about databases. Snipre (talk) 09:49, 12 April 2014 (UTC)
When multiple sources are used that could be traced back to one source, it is unlikely to the extreme that we will have any problems. GerardM (talk) 11:35, 12 April 2014 (UTC)
@GerardM You can plead good faith and avoid any fees, but if the authors require deletion we will have to do it, especially if the authors use a CC-BY licence: if the intermediary documents didn't do the correct sourcing it is not our fault, but this won't be an excuse for not respecting the licence. So in the end the result is the same for WD: no data. So if you want to spend time importing data and risk a massive deletion at any moment, fine for you but not for me. Snipre (talk) 15:49, 12 April 2014 (UTC)
@Snipre you assume a lot. As I understand the rules on facts, you are completely mistaken. To me it seems that you have a paranoid view on these things. Facts cannot be copyrighted and consequently they cannot be licensed. GerardM (talk) 16:23, 12 April 2014 (UTC)
@GerardM Before saying that I am paranoid, read that and please understand that if there is a law about facts in databases, you can have trouble. All my assumptions are based on those laws; what is your assumption that nothing can happen based on? What you don't understand is that while there is no problem with isolated facts, once you start to collect them in a systematic way you enter another world. Snipre (talk) 09:45, 13 April 2014 (UTC)

Adminship anniversary

4 days ago, it was one year since my RFA passed (3 days ago for Hahc21). I know we haven't a reconfirmation process anymore, but I'd like to hear some opinions from the community about my sysop activity. I apologize for being less active recently – because of both real-life duties and deeper involvement in other wiki projects – but I am sure I will soon be back full-time. --Ricordisamoa 16:24, 8 April 2014 (UTC)

So I suppose you have no objections... --Ricordisamoa 19:15, 13 April 2014 (UTC)

Data owners

Over at Wikimedia Sverige (Q15279144) we've been talking about the possibility of having data owners automatically updating entities related to their data. Examples could be:

  • Swedish National Heritage Board (Q631844) automatically updating the legal protection of a monument if the corresponding object changes in their database.
  • Statistics Sweden (Q1472511) automatically updating the population of a city when they have new statistics.
  • An authority automatically adding their identifier to an object which gets tagged with same_as:Wikidata:Q... in their database.

This could either be done by the authorities/data owners themselves or through some script watching for changes to their datasets. In all cases the datasets would be CC0 and the data owners would be aware of the process taking place.

What I wanted to know though is the position of the Wikidata community with regards to these things. Are there previous examples of organisations doing this (e.g. freebase-identifiers)? Is there a general feeling to how frequent such updates would be allowed to be? Would there be a difference whether:

  1. new entities were added (conforming with existing guidelines on relevance);
  2. new statements were added to existing entities;
  3. new values were added to existing statements.

Cheers, André Costa (WMSE) (talk) 11:59, 12 April 2014 (UTC)

@André Costa (WMSE): This is the hot topic of the moment. Before any automatic system we have to solve the problem of the licence, and without a clear authorization from the data owners to provide data under a CC0 licence we face a legal problem. Snipre (talk) 12:49, 12 April 2014 (UTC)
According to this, a CC0 licence is not compatible with the copyright terms of Statistics Sweden:
"Question: Am I required to state the source when statistics are further distributed?
Answer: Yes, you are always required to state Statistics Sweden as the source. ("Source: Statistics Sweden")"
So the first task is to get the agreement for a data release under CC0. Snipre (talk) 13:02, 12 April 2014 (UTC)
When the copyright holder uploads information to Wikidata, it is implicit that the data becomes available under our license. Thanks, GerardM (talk) 15:12, 12 April 2014 (UTC)
@GerardM: No, we need a special agreement in order to be sure that everything is clear. For me, if I just contribute once, I see only this in the editing interface:
By saving changes, you agree to the Terms of Use, and you irrevocably agree to release your contribution under the CC BY-SA 3.0 License and the GFDL. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
So if you want everything to be settled by a simple upload, we should specify in the editing interface that data is under a CC0 licence. Again, a single edit taken separately from the rest is not a problem; it is the whole database which is dangerous: even if one official organisation uploads some data, this is not proof that we can upload everything from their database. With implicit things you can always have problems; a clear agreement is better than anything. Snipre (talk) 15:38, 12 April 2014 (UTC)
My fault for using a bad example. Let's ignore the SCB/licensing issue for now though; the main question is how we look at these types of automatic systems in general. (If I misunderstood and the licensing/OTRS issue in general is the current blocker for these efforts then my apologies). /André Costa (WMSE) (talk) 16:03, 12 April 2014 (UTC)
Another technical problem is maybe that we have to solve the cases when Swedish National Heritage Board (Q631844) has several items in their database, where Wikidata/WP have only one. The example about Lunds domkyrka you mention at WD:PP is such an example. It also looks like fmis also have an item for many of the items in the bbr-database. -- Lavallen (talk) 17:06, 12 April 2014 (UTC)
That is a good point. For the case of many-to-one relations I guess it would be down to which one wikidata/or the data owner considers to be the same_as/equivalent object. If it's the case that several identifiers are linked to one wikidata item then I would assume that any syncing mechanism would have to be set up so as to only update non conflicting statements. For the case of protected buildings listed by Swedish National Heritage Board (Q631844) their structure is building < complex < environment so for a complex with only one building the two are equivalent (as is the case for Lund Cathedral (Q1236689)). Similarly objects can belong to several databases (e.g. bbr and fmis as mentioned above or a person can be in both a database of painters and one of authors). /André Costa (WMSE) (talk) 09:35, 13 April 2014 (UTC)
My opinion is that we should have one item here for every item in other organisations' databases. There is a related RFC about that. Two items in the bbr-database (protected buildings) would then result in two items here, but maybe not an extra one for fmis (ancient monuments), since it can be considered another database, even if both bbr and fmis belong to Swedish National Heritage Board (Q631844). The two items about Lunds domkyrka should then be related to the WP article by an item which links to these two items, like Domback (Q1879056) does today. Observe that this is my opinion; we have not fully agreed to work this way yet. So far, I have not seen any better proposal. -- Lavallen (talk) 14:50, 13 April 2014 (UTC)
@André Costa (WMSE): There is no technical problem in importing data: depending on the data, we need a preprocessing step. But we already have some massive imports from official databases. The only problem is the licence and the agreement to release data under CC0. It is not interesting to speak about technical problems and solutions if in the end there is no possibility to import for licence reasons. Snipre (talk) 07:45, 14 April 2014 (UTC)
@Snipre: I am talking to big enough institutions that will import data to Wikidata. For them the license of Wikidata is not a problem. Your premise that technical issues are not relevant at this time is just wrong. GerardM (talk) 11:15, 14 April 2014 (UTC)
There are obviously two points of view: Bot owners do not see a major issue in importing data. Using pywikibot or other frameworks, this requires 20 lines of code (however, these are often problem specific). Thus, if you have data (in a specific format) ask here or at Wikidata:Bot requests and you will get help. However, I want to support Snipre: I also struggle with the fact that most data collections out there that I'm interested in are not compatible with CC-0. While facts per se cannot be copyrighted, a collection of facts might bear copyrights nonetheless.  — Felix Reimann (talk) 12:18, 14 April 2014 (UTC)
Then let's start to change this situation and get people to understand why opening up their data completely is a good thing. Wikipedia and Commons have done this before very successfully. We can do it too. --Lydia Pintscher (WMDE) (talk) 12:21, 14 April 2014 (UTC)
True. @André: Regarding your question: Update frequency for these cases is IMHO not a problem. Do it whenever you have new data.  — Felix Reimann (talk) 13:00, 14 April 2014 (UTC)
@GerardM: Ok, GerardM, next time you talk with your contacts, ask them to send a mail to the wikidata team with an agreement for CC0 release in the conditions specified in commons. In discussion no commitment is taken and this is the key point for WD. Snipre (talk) 15:22, 14 April 2014 (UTC)
@Lydia Pintscher (WMDE): Lydia, can you tell us whether somebody from the Wikidata team or Wikimedia Deutschland can provide a template for requesting a release under CC0? Or do we have to build something ourselves? There are several people working with population data who are waiting for a solution before doing any imports, so if we can provide something quite fast we can preserve their enthusiasm. Wikidata:OTRS is still red. Snipre (talk) 15:22, 14 April 2014 (UTC)
I will ask. --Lydia Pintscher (WMDE) (talk) 15:41, 14 April 2014 (UTC)

@Snipre: - you say "For me if I just contribute once I see only this in the edition interface: By saving changes, you agree to the Terms of Use, and you irrevocably agree to release your contribution under the CC BY-SA 3.0 License and the GFDL. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license."

This is not correct. If you edit data, it actually states "By clicking "save", you agree to the terms of use, and you irrevocably agree to release your contribution under the Creative Commons CC0 License.". Therefore a right owner of a database by uploading data to Wikidata explicitly agrees to CC0. This is allowed. Try with an incognito mode if this doesn't show for you anymore, because you can make that statement for the future, which is why you might be not seeing it currently.

Also, André Costa's original message stated that the data would be under CC0, which means there is no problem in the first place. I would prefer to keep this thread on the actual discussion, like entity reconciliation and how to map properties, and how to come up with processes where an external institution wants to keep a set of items up to date. --Denny (talk) 17:47, 14 April 2014 (UTC)
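
As an illustration of the "only update non-conflicting statements" idea raised above for many-to-one relations, here is a minimal Python sketch. It assumes each external record linked to one Wikidata item has already been reduced to a plain mapping of property name to value; the property names and values are hypothetical and no real Wikidata or data-owner API is used.

```python
from typing import Dict, List


def non_conflicting_updates(records: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the property values that all linked external records agree on.

    `records` holds one mapping per external record linked to the same item.
    A property is dropped as soon as two records disagree on its value.
    """
    seen: Dict[str, str] = {}
    conflicted = set()
    for record in records:
        for prop, value in record.items():
            if prop in seen and seen[prop] != value:
                conflicted.add(prop)
            seen.setdefault(prop, value)
    return {p: v for p, v in seen.items() if p not in conflicted}


# Hypothetical example: two external records linked to one item.
building = {"protection_status": "listed building", "municipality": "Lund"}
monument = {"protection_status": "ancient monument", "municipality": "Lund"}
updates = non_conflicting_updates([building, monument])
```

In this example only `municipality` would be synced; `protection_status` is left for a human to resolve because the two records disagree.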

Commons taxon categories

I am trying to obtain the Wikidata item that describes commons:File:Appias_hombroni.JPG. I know its species from the category 'commons:Category:Appias hombroni', but I can't see how to travel from that to Appias hombroni (Q1963032) via the MediaWiki UI / APIs. This isn't an isolated case; I've noticed that the vast majority of taxon categories on Wikimedia Commons do not have a Wikimedia category (Q4167836) here, and some do not have a Commons category (P373). Is this intentional?

The only API I can find that returns an item for the query taxon name (P225)=>'Appias hombroni' is WikiDataQuery - e.g. https://wdq.wmflabs.org/api?q=string[225:%22Appias%20hombroni%22] That works for me, but it doesn't help the average person.
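
For anyone wanting to script the lookup above, this is a small Python sketch that builds the same WikiDataQuery URL for an arbitrary string property and value. The endpoint and the `string[...]` syntax are taken from the example URL in this post; the function name is made up, and whether the service is reachable is not checked here.

```python
from urllib.parse import quote

# Endpoint as quoted in the post above.
WDQ_ENDPOINT = "https://wdq.wmflabs.org/api"


def wdq_string_query_url(property_number: int, value: str) -> str:
    """Build a WDQ URL asking for items whose string property equals `value`."""
    query = f'string[{property_number}:"{value}"]'
    # Keep brackets and the colon readable; quotes and spaces are percent-encoded.
    return f"{WDQ_ENDPOINT}?q={quote(query, safe='[]:')}"


url = wdq_string_query_url(225, "Appias hombroni")
```

Values that themselves contain a double quote are not handled by this sketch.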

Do we want Category items for all Commons categories? The majority of Commons categories are not going to have interwiki links to Wikipedia, due to Commons having a much more detail-oriented categorisation system. I'm guessing that bots haven't already created items for Commons as we're waiting for a better solution, such as items for File: pages, which would allow describing an image without needing the Commons category system. John Vandenberg (talk) 05:31, 14 April 2014 (UTC)

I've seen that on some categories like commons:Category:Terpsiphone mutata there are interwikis to the Wikipedias, and those Wikipedia pages typically will have data items. But there are many species without a Wikipedia page in any language, like commons:Category:Sinthusa indrasari.[1]. I also see that my problem is Wikidata:Requests for comment/Commons links. ;-( John Vandenberg (talk) 15:59, 14 April 2014 (UTC)

Need some comment about this request: Delete all population (P1082) added to Italian municipality

I think the right solution is to delete the data, but it is better to have a more extensive discussion. --ValterVB (talk) 18:39, 14 April 2014 (UTC)

population entry precision

i was adding a population for the state of Maryland Q1391, and the ±1 is imposed. this is not correct. this estimate has its own precision, or error that is not the last significant figure. is there a way to indicate this? Slowking4 (talk) 15:24, 14 April 2014 (UTC)

@Slowking4: Just add +/-n to the end of the number when you add it in (where n is the uncertainty). --Jakob (talk) 15:53, 14 April 2014 (UTC)
Concerning the number datatype, I've noticed that it tends to round some figures up/down. Is this intended? -Digipoke (talk) 18:42, 14 April 2014 (UTC)
sorry don't know the number, i only know some census statisticians are perturbed by your information presentation. automatically assuming and presenting false precision is wrong. might want to fix it. Slowking4 (talk) 20:55, 14 April 2014 (UTC)
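
To make the "+/-n" suggestion concrete, here is a hedged Python sketch of how such input could be split into an amount with lower and upper bounds. It is not the actual Wikidata parser; the field names `amount`, `lowerBound` and `upperBound` mirror the quantity data model, the default uncertainty of 1 mimics the ±1 behaviour described in this thread, and the numbers are invented for illustration.

```python
import re
from decimal import Decimal

# Matches e.g. "1000", "1000+/-50" or "1000 ± 50".
_QUANTITY = re.compile(
    r"^\s*([+-]?\d+(?:\.\d+)?)\s*(?:(?:\+/-|±)\s*(\d+(?:\.\d+)?))?\s*$"
)


def parse_quantity(text: str, default_uncertainty: str = "1") -> dict:
    """Split 'amount+/-n' into amount and bounds (illustrative only)."""
    match = _QUANTITY.match(text)
    if match is None:
        raise ValueError(f"not a recognised quantity: {text!r}")
    amount = Decimal(match.group(1))
    uncertainty = Decimal(match.group(2) or default_uncertainty)
    return {
        "amount": amount,
        "lowerBound": amount - uncertainty,
        "upperBound": amount + uncertainty,
    }


explicit = parse_quantity("1000+/-50")
implicit = parse_quantity("1000")
```

The second call shows the complaint raised here: without an explicit uncertainty, a precision is assumed that the source never claimed.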

Linking an audio file pronouncing a topic to that object

Is there a property for an audio file which contains the pronunciation of that particular object? For example: Commons has the file De-Berlin.ogg, which is the German pronunciation for Q64. Please let me know if it exists. Ogmios (talk) 20:49, 14 April 2014 (UTC)

Four possible properties: Search:P:Audio. - LaddΩ chat ;) 21:42, 14 April 2014 (UTC)
Thanks, that was what I was looking for. Ogmios (talk) 17:16, 15 April 2014 (UTC)
pronunciation audio (P443); don't forget to add a language qualifier.. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:27, 15 April 2014 (UTC)
Thank you too. But what do you mean with language qualifier? Ogmios (talk) 17:16, 15 April 2014 (UTC)
Paris (Q90) is pronounced pAris in English and parI in French and parIs in Swedish; you therefore have to tell what language the pronunciation file is in. -- Lavallen (talk) 17:39, 15 April 2014 (UTC)
@Ogmios: See Property talk:P443 and Help:Qualifiers - LaddΩ chat ;) 18:40, 15 April 2014 (UTC)

Approximate dates

How do uncertain dates get entered? I can't find anything in the help or policy pages. This is necessary with, for example, birth or death dates that are "circa X" or "between X and Y". I think I can approximate the latter with start time (P580) and end time (P582) as qualifiers, but that doesn't feel like the best way to handle it. Not sure how to handle "circa" as that does not give a strict range of dates. - AdamBMorgan (talk) 12:33, 15 April 2014 (UTC)

Unfortunately you can't do it through the UI... — Ayack (talk) 12:57, 15 April 2014 (UTC)
Yes you can, to a limited extent. When editing a date you will see a link labelled "advanced adjustments". Click on this and you get options to switch from Gregorian to Julian and to change the precision from day to month to year to decade to century to millennium etc. Filceolaire (talk) 21:50, 15 April 2014 (UTC)
Thanks. I've noticed that the Time data type has "before" and "after" values for uncertainty/precision. Are these currently used? Otherwise: I already use the month/year/etc precision (although not decade or higher so far), so that's one solution, and I'm thinking of proposing new qualifiers to be added to uncertain dates, depending on whether they would be redundant to before/after values. - AdamBMorgan (talk) 17:06, 16 April 2014 (UTC)
I've gone ahead and proposed a new property (floruit) and three new qualifiers (earliest date, latest date and circa) to cover this problem. I think this approach will work. - AdamBMorgan (talk) 22:27, 16 April 2014 (UTC)
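
For readers working through the API rather than the UI, the precision levels mentioned above correspond to numeric codes in the Time data type. The Python sketch below builds such a value; the codes (6 = millennium up to 11 = day) and the Gregorian calendar model URI follow the Wikibase data model as far as I know, while the helper name and the exact timestamp formatting are assumptions for illustration.

```python
# Precision codes of the Wikibase time datatype (coarser values omitted).
PRECISION = {
    "millennium": 6,
    "century": 7,
    "decade": 8,
    "year": 9,
    "month": 10,
    "day": 11,
}
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def time_value(year: int, month: int = 0, day: int = 0,
               precision: str = "year") -> dict:
    """Build a time value dict; unknown month/day stay 0 for coarse dates."""
    sign = "+" if year >= 0 else "-"
    return {
        "time": f"{sign}{abs(year):04d}-{month:02d}-{day:02d}T00:00:00Z",
        "timezone": 0,
        # The "before"/"after" fields asked about in this thread; left at 0 here.
        "before": 0,
        "after": 0,
        "precision": PRECISION[precision],
        "calendarmodel": GREGORIAN,
    }


circa_1850s = time_value(1850, precision="decade")
```

A decade-precision value like this is one way to approximate "circa"; it does not express an explicit earliest/latest range, which is what the proposed qualifiers are meant to cover.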

large dispatch lag currently

Hey folks,

Due to a too high edit frequency there is currently a massive lag in the change propagation to Wikipedia and co. This means changes don't show up timely there (but will eventually). We've already started stopping the worst offenders. I hope the lag goes down now. If not I will have to forcefully stop a few more. If you are editing at a high frequency with a bot or Widar please stop until the lag goes down. Thanks!

We're taking measures to prevent this in the future.

Please keep an eye on the lag at https://www.wikidata.org/wiki/Special:DispatchStats

Cheers --Lydia Pintscher (WMDE) (talk) 13:57, 16 April 2014 (UTC)

Ok it seems the situation is actually worse. No changes are being dispatched. We're investigating. --Lydia Pintscher (WMDE) (talk) 15:26, 16 April 2014 (UTC)
We've found the issue on MediaWiki core and fixed it. Dispatches are working again. It will take some time to catch up. So please for now no bots and Widar still. You can run them again when the average lag at https://www.wikidata.org/wiki/Special:DispatchStats is down to a few minutes. Sorry again for the issue. --Lydia Pintscher (WMDE) (talk) 18:03, 16 April 2014 (UTC)
It's done. So let the bots loose. :) --Succu (talk) 22:06, 16 April 2014 (UTC)

Hello,
Does anyone know when the featured/good article interwiki system will be managed by Wikidata, so that the Link FA/Link GA templates can be removed from each Wikipedia? Is it still planned? I was told, when Wikidata was launched, that it was forecast.
It would be a great step, because bots are not very efficient at dealing with promoted/upgraded/downgraded/demoted articles and manual updating on each Wikipedia is tedious.
Thanks for your answer. Best regards. Gemini1980 (talk) 15:27, 16 April 2014 (UTC)

See Wikidata:Development plan#Badges. -- Lavallen (talk) 15:48, 16 April 2014 (UTC)

What can we do with ... Foldscope (Q15935848)?

Do we yet have properties for cost, resolution, magnification range and so on, or will this wait until we finally get quantities with units? TomT0m (talk) 20:27, 16 April 2014 (UTC)

Confusion about redirects

Hi. Regarding data item The Gambler (Q7735675), I'm having an issue over redirects. I'm not sure how Wikidata handles redirects and I'm having trouble finding help; so any links for that would be good. The problem is this: On the English Wikipedia, there is an article for the entire Gambler TV movie series (The Gambler (TV movie series)) and each movie is a redirect which points to it. The data item is however clearly referring to the first film but its link was pointing to the series. I tried to fix this by adding the correct title of the redirect Kenny Rogers as The Gambler but Wikidata changes my entry to the target of the redirect, which is the series. Is this the behaviour we want? Jason Quinn (talk) 21:21, 16 April 2014 (UTC)

Everlasting issue, see Wikidata:Project_chat/Archive/2014/01#redirects. No development planned for that. In the mean time, "edit [the WP redirect page] to make it an ordinary page, add the page to [WD item], and then immediately make it a redirect again". - LaddΩ chat ;) 00:08, 17 April 2014 (UTC)
Or create new data items for every film in the series, see Kenny Rogers as The Gambler (Q16363554) Michiel1972 (talk) 09:12, 17 April 2014 (UTC)

Do we have showcase items for work of literature, collection of such works, etc?

I'm also interested in metadata examples which describe a work. I'm aware of the characters and narrative set in properties, but I'd like to know how to describe the approximate time frame narrated in a work (for example, war, revolution, natural disaster, etc); the topic or problem raised in a work (social issues, politics, travel notes describing a particular place, etc); etc.

Which property should be used for date when work was created?

EugeneZelenko (talk) 03:13, 17 April 2014 (UTC)

I can't help too much because I'd like to know the answer too (I saw your post on Wikidata talk:Wikisource and hoped someone would answer there). The best way to incorporate and describe literature seems to still be in the process of being sorted out; I don't think even the finalised decisions like the work/edition separation are well understood by many users yet. That said, I can solve one thing: main subject (P921) is for the topic of a work; for example, see this AutoList query. I don't think there is any "date of creation" property beyond publication date (P577) nor anything for time frame, although significant event (P793) might work and some situations are possibly covered by specific properties (e.g. conflict (P607) might work for wars or battles). It's likely that we need more properties to support this. An equivalent of narrative location (P840) for period instead of location would be nice, for example. - AdamBMorgan (talk) 17:16, 17 April 2014 (UTC)

Identifying properties that represent items in other linked data sets

We have a number of properties, whose values represent identifiers in other online databases/ linked data repositories. IMDb, for example. Some of these datasets are not well known, outside a region or a topical specialism, or both.

Suppose a string "foo" represents a data entry at example.com whose URI is https://example.com/soometing/1/foo and P1 may have the value, "foo", to represent the equivalent entry in that database.

The documentation of P1 should include something that indicates the relationship, and URI structure, in a human- and machine-readable form.

By "URI structure", I mean that we tell parsers to concatenate "[string]"+"[value]" to make a URI; where, in this example, "[string]" is "https://example.com/soometing/1/".

I have implemented this on the OpenStreetMap wiki, as can be seen in the infobox on http://wiki.openstreetmap.org/wiki/Key:openplaques_plaque using a simple "URI pattern" parameter.

How might this be achieved on Wikidata? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:41, 8 April 2014 (UTC)

Properties on properties: Wikidata:Development plan#Statements on properties :)  — Felix Reimann (talk) 15:55, 8 April 2014 (UTC)
Thank you, but that seems to be something different - how would you envisage it working, for the use-case described above? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:50, 8 April 2014 (UTC)
Add a statement to property P1. This statement would have a new property "URI pattern" with the value "https://example.com/soometing/1/". This depends on the software being changed to allow statements on properties and on the creation of the "URI pattern" property with string datatype. 86.6.107.229 00:14, 10 April 2014 (UTC)
That won't work, because we can't assume that the [value] part is always the rightmost component of the URI. Consider "http://example.com/foo/[value]/bar". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:14, 10 April 2014 (UTC)
You could do something like a $1, as is common in regexes e.g. http://example.com/foo/$1/bar. I don't see it as a difficult problem to solve; we just have to let external reusers know the expected format for each URI "pattern". --Izno (talk) 03:30, 11 April 2014 (UTC)
What happens if the URL structure at the target site already includes the string "$1"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:31, 15 April 2014 (UTC)
@Izno: Did you see this? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:04, 18 April 2014 (UTC)
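
The two approaches discussed in this thread can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the example URLs are the ones used above, and no percent-encoding of the value is attempted.

```python
def uri_by_concatenation(prefix: str, value: str) -> str:
    """Original proposal: "[string]" + "[value]"."""
    return prefix + value


def uri_from_pattern(pattern: str, value: str, placeholder: str = "$1") -> str:
    """Placeholder proposal: substitute the value anywhere in the pattern.

    Refuses patterns where the placeholder does not occur exactly once, so a
    target URL that already contains the placeholder text is reported rather
    than silently mangled.
    """
    if pattern.count(placeholder) != 1:
        raise ValueError(
            f"pattern must contain {placeholder!r} exactly once: {pattern!r}"
        )
    return pattern.replace(placeholder, value)


simple = uri_by_concatenation("https://example.com/soometing/1/", "foo")
nested = uri_from_pattern("http://example.com/foo/$1/bar", "foo")
```

If a target site really does use "$1" in its URLs, one option under this sketch is to store a different placeholder alongside the pattern and pass it via the `placeholder` argument.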

I have now been informed of this page on meta, where Reasonator does the same thing. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:14, 10 April 2014 (UTC)

Hey folks :) Tpt has been awesome and developed a new feature. It allows a Wikimedia project to automatically link to other Wikimedia projects in their sidebars based on Wikidata data. So for example Wikipedia could automatically add Commons links in the sidebar of their articles. http://lists.wikimedia.org/pipermail/wikidata-l/2014-April/003690.html has more details as well as the process for getting it enabled on your project. Thanks Tpt!  – The preceding unsigned comment was added by Lydia Pintscher (WMDE) (talk • contribs) at 10:41, 12 April 2014 (UTC).

Is it possible to add optional icons (could be enabled/disabled in user preferences) to links? Such eye candy was implemented in Russian Wikipedia.
I think will be good idea to hide Data item link which duplicates Wikidata in your implementation. This was also implemented in Russian Wikipedia.
It's necessary to disable mw:Extension:RelatedSites in Wikivoyage if your code will be deployed there.
EugeneZelenko (talk) 14:21, 12 April 2014 (UTC)
@EugeneZelenko, Tpt: If you want to replace mw:Extension:RelatedSites, please consider how to link to dmoz using Wikidata's Curlie ID (P998). A lot of Wikivoyage pages have a dmoz link in the related sites section. --GZWDer (talk) 03:59, 13 April 2014 (UTC)
I think it will be reasonable to have additional links for family of projects of particular project. For example VIAF for all Wikisource editions and national library links for particular language editions. Sure, configuration options will be needed. --EugeneZelenko (talk) 14:09, 13 April 2014 (UTC)
@EugeneZelenko: About icons I believe it isn't in vector skin spirit so I would prefer an on wiki implementation using a gadget. But if there is a strong demand from more than one projects, an implementation in Wikibase is maybe a good idea.
Since all project names start with Wiki, icons make names more recognizable. Language names are more diverse in comparison. A common implementation would be better than project-specific ones. --EugeneZelenko (talk) 14:07, 14 April 2014 (UTC)
About mw:Extension:RelatedSites, we should find a way to migrate smoothly from RelatedSites to this new Wikibase feature. It's something that won't be very easy and will require some help from Wikivoyage contributors.
@EugeneZelenko, GZWDer: About dmoz/VIAF links, as they aren't managed as Wikibase sitelinks but as statements, it would require some very specific code. I'm not sure that the Wikidata dev team will love such things.
Tpt (talk) 07:30, 14 April 2014 (UTC)
Regarding DMOZ on Wikivoyage, they might not continue using that. See comment by user:Jmh649 at v:Wikivoyage:Travellers' pub#Formal request: Renaming "Open Directory" as "DMOZ" in the sidebar[2]. However they are also considering adding OpenStreetmap - see v:Wikivoyage:Travellers' pub#Formal request: Including OpenStreetMap in the sidebar. I think it might be better for these extra related sites to continue being part of mw:Extension:RelatedSites, but have that extension use Wikidata. i.e. complete sitematrix/sitelinks support in Wikidata client core ; other sites in Extension:RelatedSites (using the interwiki table I presume). John Vandenberg (talk) 07:58, 14 April 2014 (UTC)
I'd presume the migration from mw:extension:RelatedSites would be done in the same manner as was done for the inter-language links; a robot pulls the [[wikipedia: (or [[fr: [[de: [[it: or whatever) link out of articles as the sidebar links are moved to Wikidata? K7L (talk) 17:01, 16 April 2014 (UTC)

Is it possible to show a Commons category instead of a Commons page (or additionally) if there is no Commons page? --RolandUnger (talk) 15:03, 18 April 2014 (UTC)

List of all properties

We have a problem with the list of properties: it is not updated and the way it is split no longer matches the real use of properties. There is a proposal to create a single table with all properties, with one column defining the field of application. See Wikidata_talk:List_of_properties#Modify_the_list. Please comment and propose ideas, because it is important to be able to look for properties, especially for newbies who are not familiar with the current names of properties, so a search by application is necessary. Snipre (talk) 09:58, 17 April 2014 (UTC)

One effective solution would be to push the use of the missing props gadget which could . Otherwise think of organising a by-class list of properties using the {{Class documentation}} and {{Item documentation}} templates, and associate a relevant list of properties with a class in its documentation. Soon I'll add the code to also show the properties of the parent classes in the list; it's easy to code in Lua. Also consider making this part of the property proposal process, and agree with property creators that they should update the class doc when they create a property. TomT0m (talk) 10:18, 17 April 2014 (UTC)
Also note that the property proposal place in Wikidata seems like a big mess to me, and that the creation rate of properties and the number of properties in the pipeline are also quite high; this highlights that we also have a problem with it. TomT0m (talk) 10:28, 17 April 2014 (UTC)
Whenever I need a list of all properties, I use this: Wikidata:Database reports/Constraint violations/All properties. It's much more intuitive and up-to-date than the normal list.--Underlying lk (talk) 13:29, 17 April 2014 (UTC)
But are all properties on that list ?
We could also put a maintenance category on the {{Property documentation}} template. TomT0m (talk) 16:10, 17 April 2014 (UTC)
✓ Done : Category:All Properties, there only is one or two extra pages we should try to remove from this category. TomT0m (talk) 16:18, 17 April 2014 (UTC)
Thanks but this is useless for search. Snipre (talk) 08:05, 18 April 2014 (UTC)
The category "All Properties" is good: with the statements on properties (see this announcement of Lydia) we can create automatically sub-categories, but for me this table is the better thing to search between properties (see comment above by Snipre), even if only one table will be very difficult to manage. With LUA it is possible to create tables like this. Is it possible to recover automatically the lists of properties actually existing? --Paperoastro (talk) 09:52, 18 April 2014 (UTC)
I'd be surprised if there was nothing in the API to retrieve the content of this category, which has no subcategory so the code is trivial. TomT0m (talk) 10:22, 18 April 2014 (UTC)
To search for a property use the search page and put P: in front of the search term.
Yes this is not documented anywhere. And Yes there should be a "Properties" option on the search page but in the meantime this works. Filceolaire (talk) 14:17, 18 April 2014 (UTC)
@Filceolaire: If you copy the final line from User:Jakec/common.js into your own common.js page, you'll see a "search for property" link in the tools section of the sidebar. --Jakob (talk) 14:23, 18 April 2014 (UTC)
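
On TomT0m's point about retrieving the category through the API: the standard MediaWiki `list=categorymembers` query module can do this. The Python sketch below only builds the request URL and extracts titles from a response; the helper names are made up, the sample response is hand-written for illustration rather than real output, and paging via `cmcontinue` is left to the caller.

```python
from typing import List, Optional
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def category_members_url(category: str, limit: int = 500,
                         cmcontinue: Optional[str] = None) -> str:
    """Build a categorymembers query URL for one page of results."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": str(limit),
        "format": "json",
    }
    if cmcontinue:
        params["cmcontinue"] = cmcontinue
    return API + "?" + urlencode(params)


def titles_from_response(response: dict) -> List[str]:
    """Pull page titles out of a decoded categorymembers response."""
    members = response.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


url = category_members_url("Category:All Properties")

# Hand-written sample in the shape the API returns (not real data).
sample = {"query": {"categorymembers": [
    {"pageid": 1, "ns": 120, "title": "Property:P1"},
    {"pageid": 2, "ns": 120, "title": "Property:P2"},
]}}
titles = titles_from_response(sample)
```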

Notability: Wikipedia a requirement?

We seem to have an issue with some admins claiming that the "general spirit of the notability policy is that Wikipedia [or, presumably a sister project] finds you notable", even though that's not what WD:N says and other admins clearly do not see that requirement. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:05, 18 April 2014 (UTC)

There is no issue really. An item must be one of the three notability points, which when rounded up means an item must either have a link to any page on any Wikimedia project or contribute to the knowledgebase where if the item did not exist, it would mean an item is not correctly representative of the subject or information is missing. The item you linked to meets this as it is used in another item, thus contributes to the thoroughness and accuracy of the knowledgebase. So to what you said - yes, a sister project does class you as notable in Wikidata as well as other notable items. John F. Lewis (talk) 15:10, 18 April 2014 (UTC)
That's my reading too - but we still have colleagues, including admins, insisting on a Wikipedia [sic] link. How is this best dealt with? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:16, 18 April 2014 (UTC)
If that is what they insist - tell them to re-read the policy. An admin position is not a dictator position. If such items in the past have been deleted due to this, feel free to poke me or post on the admin's noticeboard and if it happens in future, do the same. If we feel an item has been incorrectly deleted by an admin - we do speak to them before/after we undelete it to clarify things. If policy is the thing needing to be clarified - we will. John F. Lewis (talk) 15:21, 18 April 2014 (UTC)
@Pigsonthewing: If you're still looking to have your item undeleted, please just drop it and quit wasting the community's time. I wasn't kidding when I told you to stop forum shopping for this. I will repeat one more time: the community's consensus was that you failed criterion 2 (the only one you could've possibly been eligible for), and even if you do meet the notability policy, you don't get to self-promote (you have not explained why else you would so strongly desire your item, as Hahc21 said on my talk page), especially not when you have a conflict of interest.--Jasper Deng (talk) 17:31, 18 April 2014 (UTC)
That's the second time you've accused me of forum shopping, after I've accepted your invitations to start discussion on a new page ("If anyone disagrees with my close, they should bring up their objection(s) on my talk page"]; "General discussion of notability can always occur, feel free to start one at project chat about notability in general"). I also note that you have now closed the discussion on your talk page, rather than justify, when challenged, your unwarranted behaviour. I suggest you note John F. Lewis's comment, immediately preceding yours here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:38, 18 April 2014 (UTC)
I thought I was clear that when I said that, I wasn't endorsing a re-hash of the discussion over your item in particular. If you want evidence for forum-shopping: why are you posting here and on John's talk page? John also never made a specific comment on this particular dispute, only general ones that don't lend any strength to your argument.
I closed the discussion on my talk page partly because of Ajraddatz's comment, because frankly, it's a waste of time to keep repeating the same comments to you. I am now involved, but my warning that you can be blocked for this editing still stands.--Jasper Deng (talk) 17:43, 18 April 2014 (UTC)
Last thing first: I'm posting here by your invitation ("General discussion of notability can always occur, feel free to start one at project chat about notability in general"; I thought I already made that clear), and on John's talk page by his invitation ("If such items in the past have been deleted due to this, feel free to poke me"). As to your comment about "a re-hash of the discussion over your item in particular", where do I mention that in my post in this section? I started a "general discussion of notability", as you invited me to do. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:52, 18 April 2014 (UTC)
Right, @Pigsonthewing, Jasper Deng: can we please drop this now? As in no more discussion of the matter, no more accusations, no more continuation of this at all? This is really starting to get annoying now. Also common sense should prevail here. If you've been warned - take it as a policy 'shut up now'. As pointed out elsewhere, this is starting to border on w:WP:POINT really. John F. Lewis (talk) 17:55, 18 April 2014 (UTC)

Honorific suffixes

This good-faith edit appears problematic; while the subject does indeed hold a Queen's Police Medal, that entitles them to use the honorific suffix "QPM". The words "Queens Police Medal" are, AIUI, never used as an honorific suffix. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:30, 18 April 2014 (UTC)

honorific suffix (P1035) = QPM (Q7265514) was wrong since it is a Wikimedia disambiguation page (Q4167410), so I merely replaced it with the best existing item: King's Police Medal (Q2792177). I agree property award received (P166) might be a better fit for the latter, though. What would you suggest: creating a dedicated Q-item for each honorific suffix like "QPM" or "PhD", or changing property honorific suffix (P1035) to type string? LaddΩ chat ;) 18:26, 18 April 2014 (UTC)
I'm not sure, but possibly P166? But not all suffixes are for awards. Perhaps honorific suffix (P1035) (and honorific prefix (P511)) should apply to awards, official posts and so on, and not to people? So we can say that Derrick Capper had an award called the Queen's Police Medal, and from that we can determine that that gave him the hon suffix "QPM". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:57, 18 April 2014 (UTC)
Makes sense. Since honorific suffix (P1035) was primarily meant to help identify which honorific suffixes should be applied to humans, we might: a) rename it to something like "honorific qualification", pointing to applicable items such as awards and academic degrees (e.g. King's Police Medal (Q2792177)), and b) associate the applicable string suffix to the award, degree or qualification. I wonder if there is already a string property for that? P743 (P743), like on Doctor of Philosophy (Q752297) ? - LaddΩ chat ;) 19:44, 18 April 2014 (UTC)
So how do we bring this about? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:38, 19 April 2014 (UTC)
Discussion moved to Property talk:P1035#Honorific suffixes. LaddΩ chat ;) 15:44, 19 April 2014 (UTC)
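The modelling discussed above (keep the suffix string on the award or degree, then derive a person's suffixes from what they hold) could be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary layout and the `suffixes_for` helper are hypothetical, the person is keyed by a plain label rather than an item ID, and no string property for the suffix had been agreed in this thread.

```python
# Hypothetical in-memory data, not the Wikidata API.
# The suffix string is stored on the award/degree item, not on the person.
QUALIFICATION_SUFFIX = {
    "Q2792177": "QPM",  # King's Police Medal (Q2792177), as cited in the thread
    "Q752297": "PhD",   # Doctor of Philosophy (Q752297)
}

# award received (P166) statements per person; the key is a placeholder label
AWARDS_RECEIVED = {
    "Derrick Capper": ["Q2792177"],
}

def suffixes_for(person):
    """Derive honorific suffixes from the awards a person holds."""
    held = AWARDS_RECEIVED.get(person, [])
    return [QUALIFICATION_SUFFIX[q] for q in held if q in QUALIFICATION_SUFFIX]

print(suffixes_for("Derrick Capper"))
```

With this layout the suffix is entered once per award, and nothing about the suffix needs to be stated on the person's own item.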

It's been pointed out at Talk:White King at en.wp that the interwiki links, based on Q912226#sitelinks-wikipedia, are incorrect. "Eau de Javel" and "Bleekloog" appear to refer to sodium hypochlorite, the active ingredient in bleach, and not to this particular brand. I'm not sure about the others. Can someone who knows their way around this project sort this out? Thanks, Adrian J. Hunter (talk) 04:20, 19 April 2014 (UTC)

If you don't know what to do during the Easter holiday ....

There are still around 225,000 items on persons waiting to be moved from P107 to P31 = "human". AutoList is a powerful tool for adding claims to a large number of items in a short time, based on searches in categories of your local Wikipedia.
Please note that there are also mythological figures and fictional characters defined as P107 "person". They, of course, should not get a claim "human". Dexbot removes the claim P107 as soon as it detects a claim P31 "human". -- Pütz M. (talk) 14:33, 19 April 2014 (UTC)

Subclass / Instance

I think we need yet another subclass of (P279) and / or instance of (P31) debate about this edit: https://www.wikidata.org/w/index.php?title=Q15631639&diff=0&oldid=112326667 - How is it possible that a file format is instance of (P31)? How does it qualify as an individual? What is the point then of saying "instance of = human", if we can't distinguish whether that statement means that somebody is an individual human being or a subclass of human? Tobias1984 (talk) 18:37, 18 April 2014 (UTC)

Item classes must be subclass of (P279) more general items but may also be instance of (P31) some other class type; for example there are dozens of instances of ship classes, all of them one way or another subclasses of ship (Q11446).
Help:Basic membership properties needs to be beefed up with a few rules and explanations. Does anyone recall where there was a good description of how to identify what's a class and what's an instance? - LaddΩ chat ;) 20:21, 18 April 2014 (UTC)
The best explanation is: if, following one classification line, you arrive at the end (lowest level) of it, you should use instance of (P31). Snipre (talk) 06:21, 19 April 2014 (UTC)
@Stiegenaufgang: Tobias1984 (talk) 08:54, 19 April 2014 (UTC)
A class is a set of instances. Barack Obama is a human, and there are a lot of humans like him.
⟨ Barack Obama ⟩ instance of (P31) ⟨ human ⟩
is valid. A rule of thumb: are there a lot of other things of which I could say "is a <human>"? If yes, <human> is a class. Where it gets a little more complicated: classes can be very general.
  • A bit further: Barack Obama is a <human>; he is also a <living organism>. There are a lot of <living organism>s like Barack Obama; Dolly, the first clone, was one, for example. Living organism is hence also a class. Now every <human> is a <living organism>. We can denote this relation as
    ⟨ human ⟩ subclass of (P279) ⟨ living organism ⟩
    , which means exactly that: every human (instance of human) is also a living organism (instance of living organism).
  • Further still: we can classify things in a lot of possible ways. There can be edge cases or corner cases for which different persons or different organisations (statistical bodies or state administrations, for example) use a slightly different definition of a concept, which can result in a different set of instances for a class. Maybe, for example, in France the INSEE defines a city as a village with more than 10,000 inhabitants, but the United States uses 15,000. To reflect that, we can classify the classes themselves and create one class item for each definition: <city as in INSEE definition> and <city as in US definition>. Then we can create one item to group all the classes defined by INSEE and another to group the classes defined by the US, and sort them using instance of (P31):
    ⟨ city as in INSEE definition ⟩ instance of (P31) ⟨ class as defined by INSEE ⟩
    and
    ⟨ city as in US definition ⟩ instance of (P31) ⟨ class as defined in the US ⟩
    . This still fits the first point and gives extra flexibility in building and filtering the classes we want to show, while being NPOV. TomT0m (talk) 10:02, 19 April 2014 (UTC)
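To make the rule "every instance of human is also an instance of living organism" concrete, here is a minimal sketch that walks subclass of (P279) upwards from an item's instance of (P31) values. It assumes a toy in-memory representation with labels standing in for item IDs; the `all_classes` helper is hypothetical and is not how Wikidata itself evaluates these statements.

```python
# Toy statements mirroring the examples above; labels stand in for item IDs.
SUBCLASS_OF = {"human": {"living organism"}}   # subclass of (P279)
INSTANCE_OF = {                                # instance of (P31)
    "Barack Obama": {"human"},
    "Dolly": {"living organism"},
}

def all_classes(item):
    """Return every class the item belongs to, following P279 upwards."""
    found = set()
    todo = list(INSTANCE_OF.get(item, ()))
    while todo:
        cls = todo.pop()
        if cls not in found:
            found.add(cls)
            # every superclass of a class the item belongs to also applies
            todo.extend(SUBCLASS_OF.get(cls, ()))
    return found

print(sorted(all_classes("Barack Obama")))
```

Only one P31 statement is stored for Barack Obama; membership in the more general class follows from the P279 link, which is why an item does not need a separate P31 statement for each superclass.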
Some time ago, I looked into some details on how cities/towns are classified in a few states in the US and Canada. There were not only Cities and Towns, but also "First class cities", "Second class cities" etc. And if a city had once qualified for one status, it was not automatically declassified only because it no longer fulfilled the qualifications, or because the qualifications had been changed. And in some cases, a town can choose to stay a town, even if it qualifies to be "promoted" to a city. To be promoted, the town has to make an application to the state authorities, and the town can for political reasons choose not to apply for a promotion. These promotion processes make it hard to "translate" terms from one standard to another. The authorities in France would never review the status of a town in Alabama, and the Alabama authorities would never review the status of a city in France.
Administrative cities no longer exist in Sweden, but when they existed, a municipality could choose not to be promoted to a city. A higher status implied more local administration and therefore higher taxes. If a municipality could fulfil the needs of the population without being promoted, it chose not to be. The smallest cities in Sweden had a population < 20 and the largest non-cities had a population > 5000. In the latter case, many things such as firefighting, water supply and wastewater were administered by private companies. All of this is difficult to "translate" since it's not about simple logic, it's politics. -- Innocent bystander (talk) (The user previously known as Lavallen) 09:42, 20 April 2014 (UTC)
In those cases we need several items and properties to state a weak equivalence between these items. Then the translations are easier, as we can translate as USA city or anything better. This is one of the reasons why we need a good way to classify classes. TomT0m (talk) 09:53, 20 April 2014 (UTC)

Help with properties and queries for importance football templates

Pls see User:Xaris333/Football. 22:07, 19 April 2014 (UTC)

$number needs to be a integer

Hi, when I try to link a page from within a wikipedia (e.g. en, de...), I get the following error message in the "Link with page" window: "An unexpected error occurred. $number needs to be a integer". I got this message more than once in different language versions. E.g. try to link en:RSNA with de:RSNA, using the "Add links" button in wikipedia (not wikidata). Can you please fix this? thx --Teilzeittroll (talk) 08:57, 20 April 2014 (UTC)

Sorry for the issue. It's been reported and fixed already in the code. The fix will go live on the sites soon. --Lydia Pintscher (WMDE) (talk) 09:06, 20 April 2014 (UTC)
Thanks for the information! --Teilzeittroll (talk) 09:10, 20 April 2014 (UTC)

List of items without images but possibly free images available

The weekly update includes a link to "List of items without images but possibly free images available". That's useful, but has issues. Who made it?

Specifically, the first two examples I tried, Valeriana dioica (Q159036); Callidium aeneum (Q882815), already listed their respective commons categories. There's no point sending people to Flickr to obtain free images, when we already have them in that sister project. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:36, 20 April 2014 (UTC)

Magnus created this tool. He also created this one. It does look for pictures on Commons. NB It is good to have both tools ... GerardM (talk) 11:02, 20 April 2014 (UTC)
@Magnus Manske: Please can you add a check for Commons category (P373) to each of these tools? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:41, 20 April 2014 (UTC)
Today I am playing around with image tracking like Commons category tracking and coordinates tracking. This template sets a category based on what is available locally and on Wikidata (image (P18)). It's now enabled on a limited number of pages (about 2000), but already contains a lot of useful suggestions. We should expand this to more templates and to other languages to get more articles illustrated.
For example with Valeriana dioica (Q159036) the Dutch article already contains an infobox with an image that could easily be imported here. Multichill (talk) 15:09, 20 April 2014 (UTC)
Both tools use a pre-computed list; adding categories would be ... tricky. Meanwhile, this link lists ~750K items that have a Commons category but no image. Skip over the first few (months? WTF?) to find some good ones. I may turn this into a one-click tool later. --Magnus Manske (talk) 15:58, 20 April 2014 (UTC)

Proposal: change "source" to "reference"

There is, in my opinion, a small mistake in the UI of statements, as it uses the word "source" instead of "reference". I would like to see that changed. Reasons:

  • "source" implies that the claim is taken from that source. "Reference" implies only that the reference supports the claim.
  • "source" seems to carry a stronger legal impact than "reference". Copying content from a source can be problematic. Citing external content as a reference should not raise legal issues (but IANAL).
  • there could only be one source per claim, but numerous references. The latter is what the software supports. It also seems to be more useful for our goals.
  • it aligns more to the data model, the code, and the APIs, where "reference" is used. It is only the UI that is wrong about this.

Hope this finds support. --Denny (talk) 20:12, 14 April 2014 (UTC)

I agree - In all honesty, you should have fixed this before you left for Google though :p John F. Lewis (talk) 20:14, 14 April 2014 (UTC)
I completely agree. It was an oversight on my part. As said, the data model and the code speak of "references" throughout. It really is just the UI message. It might have been that the shorter word was preferred and therefore used, to make the UI more concise. Not being a native speaker I did not realize the implication. --Denny (talk) 20:21, 14 April 2014 (UTC)
 Support That was one of the main misunderstandings happening during the early talks on sourcing, still a good time to correct it.--Micru (talk) 22:33, 14 April 2014 (UTC)
 Support. --AmaryllisGardener (talk) 22:40, 14 April 2014 (UTC)
 Support. If I am not mistaken, this is a MediaWiki message, and this proposal is to alter the English language message only (in the core code or the wiki interface message?), but with the expectation that other languages might update their translation if appropriate. 'source' has two meanings in English that are relevant - one meaning is the actual source where a fact was obtained from; the other meaning is any source where you may find further information about the fact. I really like 'sources', as that ambiguity works quite well in English for native English speakers, but I can imagine it will confuse many. Wrt 'there could only be one source per claim', that is complicated by scenarios where qualifiers may only be verifiable in a different source than the main claim. Anyway, I think Wikipedia contributors will find 'references' more intuitive, and that is what Wikidata should be aiming for at this stage. John Vandenberg (talk) 03:52, 15 April 2014 (UTC)
 Support Wylve (talk) 05:27, 15 April 2014 (UTC)
 Support. Ralgis (talk) 05:51, 15 April 2014 (UTC)
 Support Good plan. Lymantria (talk) 06:29, 15 April 2014 (UTC)
 Support - Brya (talk) 10:43, 15 April 2014 (UTC) As long as the "imported from this here Wikipedia" is not transposed as well (a Wikipedia is not a reference).
@Brya:, I think imported from Wikimedia project (P143) would appear in this new 'References' section. That is why I personally prefer the ambiguity of 'Sources'. However, we should eventually replace all 'imported from Wikipedia' with 'Stated in <authoritative source>'. John Vandenberg (talk) 12:29, 15 April 2014 (UTC)
@John Vandenberg:, including imported from Wikimedia project (P143) might be acceptable when it concerns an external database, but the "imported from this here Wikipedia" is not information, but just metadata without real value. BTW: I don't really expect to live to see the day when all 'imported from Wikipedia' are replaced with 'Stated in <authoritative source>'. - Brya (talk) 16:29, 15 April 2014 (UTC)
"the 'imported from this here Wikipedia' is not information, but just metadata without real value"
+1. Emw (talk) 01:27, 17 April 2014 (UTC)
 Support Tpt (talk) 11:36, 15 April 2014 (UTC)
 Support --Paperoastro (talk) 16:36, 15 April 2014 (UTC)
 Support Totemkin (talk) 17:42, 15 April 2014 (UTC)
 Support Jason Quinn (talk) 21:23, 16 April 2014 (UTC)
 Support ΛΧΣ21 01:03, 17 April 2014 (UTC)
 Support. Emw (talk) 01:27, 17 April 2014 (UTC)

I made a patch to the discussed effect. Please feel free to merge. --Denny (talk) 16:45, 17 April 2014 (UTC)

does this also apply to the documentation? Help:Sources talks about sources and sourcing all over the page. I would like to also adjust the misleading German translation but the English help page is inconsistent as well. -- JakobVoss (talk) 20:54, 20 April 2014 (UTC)
That will also need to be changed, yes. The patch by Denny doesn't cover that (because it can't). --Lydia Pintscher (WMDE) (talk) 08:47, 21 April 2014 (UTC)

There are quite a lot of place-related categories (born in, died in, etc.), so there is a need to automate finding such categories from the corresponding property value. The easiest solution is to create a property for each type of category, but that will not scale and will complicate the maintenance of place items. What other solutions might there be? --EugeneZelenko (talk) 15:03, 19 April 2014 (UTC)

Maybe use subclass of (P279) with some type of categories (if I understand correctly what you want). Do you need to single out a class of categories? --Infovarius (talk) 07:19, 21 April 2014 (UTC)
@EugeneZelenko, Infovarius: category combines topics (P971).--GZWDer (talk) 11:03, 21 April 2014 (UTC)
This is a reasonable way to classify categories, but how could you find Category:People from XYZ starting from item XYZ? --EugeneZelenko (talk) 14:16, 21 April 2014 (UTC)

I think we need an RFC on categories. See Wikidata:Project_chat/Archive/2014/04#Two_types_of_categories and Wikidata:Project_chat/Archive/2014/04#Relationships_between_Wikimedia_category_pages_in_Wikidata. I see Wikidata:Requests for comment/Define lists on both "Wikimedia lists" and "Wikimedia categories", but I have to admit I don't quite understand where that RfC is going. John Vandenberg (talk) 12:06, 21 April 2014 (UTC)

Death dates conflicts

Hi, I ran a bot on recent deaths to find date-of-death conflicts and report them. There is a list at User:Ladsgroup/Date of death problems. If you think it's a useful task, please tell me so that I can request approval and run it regularly. If you have any suggestions, I would be happy to hear them. Amir (talk) 17:53, 19 April 2014 (UTC)

Good job, but which WP did you use? The English one? Perhaps by using data from different WPs you can resolve the conflicts automatically? Perhaps you can do the same job with the German WP and put all data in the same table. It could be a good subject for a scientific paper: which level of matching is necessary between different sources to select the good data. Snipre (talk) 20:19, 19 April 2014 (UTC)
I ran it on English Wikipedia but I can expand it very easily Amir (talk) 21:12, 19 April 2014 (UTC)
✓ Done 3 ambiguous and 3 not found news --ValterVB (talk) 09:02, 21 April 2014 (UTC)
Thank you Amir (talk) 11:50, 22 April 2014 (UTC)
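The thread does not describe how the bot works internally, so the following is only a rough sketch of the general idea of a conflict report: compare the date of death recorded in two sources and list the items where both have a value and the values differ. The function name, the item keys and the sample dates are all made up for illustration.

```python
from datetime import date

def find_conflicts(wikidata_dates, wikipedia_dates):
    """Return {item: (wikidata_date, wikipedia_date)} where both sources
    have a date of death and the two dates differ."""
    conflicts = {}
    for item, wd_date in wikidata_dates.items():
        wp_date = wikipedia_dates.get(item)
        if wp_date is not None and wp_date != wd_date:
            conflicts[item] = (wd_date, wp_date)
    return conflicts

# Made-up sample data; the keys are placeholders, not real item IDs
wikidata = {"item-A": date(2014, 4, 17), "item-B": date(2014, 4, 18)}
enwiki = {
    "item-A": date(2014, 4, 17),
    "item-B": date(2014, 4, 19),
    "item-C": date(2014, 4, 1),
}
print(find_conflicts(wikidata, enwiki))
```

Extending this to several Wikipedias, as suggested above, would mean collecting one such mapping per wiki and looking at how many sources agree on each date.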

Wikiquote interwiki conflicts

Hello all, this is the list of interwiki conflicts in Wikiquote: User:Ladsgroup/Wikiquote integration. Please review it and help to fix them. It covers all Wikiquote languages, so everyone's help is needed. Best Amir (talk) 16:10, 20 April 2014 (UTC)

maybe someone else was there before me but every one of the wikiquote pages listed here that I checked had a good link in Wikidata. Am I missing something? Filceolaire (talk) 15:18, 21 April 2014 (UTC)
AFAICS, those are pages whose wikitext contains one or more interwikis which could not be added to the wikidata item due to a conflict. So you have to compare the local list of interwikis to wikidata's and find out why they differ, then fix it. --Nemo 18:21, 21 April 2014 (UTC)
you're right dear Nemo :) Amir (talk) 11:50, 22 April 2014 (UTC)
I did check a few of these links. In most cases the lack of a wikidata link was correct as the link was to a more general page - e.g. from the page for a book to the page for the book's author in another language, usually via a redirect. What is a good way to mark something as sorted? Filceolaire (talk) 14:26, 22 April 2014 (UTC)
Remove incorrect interwiki links (to more general ones) and if there is no other problem mark it as fixed (remove it from the report) Amir (talk) 15:01, 22 April 2014 (UTC)
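The manual check described above (compare the interwikis in a page's wikitext with the sitelinks on its Wikidata item and see where they differ) can be sketched as a simple comparison of two language-to-title mappings. This is a hypothetical helper with made-up sample titles; it does not fetch anything from the wikis.

```python
def interwiki_differences(local_links, wikidata_sitelinks):
    """Compare two {language: title} mappings.

    Returns (only_local, mismatched):
      only_local  - links present in the wikitext but not on the item
      mismatched  - languages where the two point at different titles
    """
    only_local = {
        lang: title
        for lang, title in local_links.items()
        if lang not in wikidata_sitelinks
    }
    mismatched = {
        lang: (title, wikidata_sitelinks[lang])
        for lang, title in local_links.items()
        if lang in wikidata_sitelinks and wikidata_sitelinks[lang] != title
    }
    return only_local, mismatched

# Made-up example: a book page whose local "fr" link points at the author
local = {"de": "Buch X", "fr": "Auteur Y", "it": "Libro X"}
wikidata = {"de": "Buch X", "fr": "Livre X"}
only_local, mismatched = interwiki_differences(local, wikidata)
print(only_local)
print(mismatched)
```

A mismatch such as the "fr" entry here is the book-to-author case mentioned above, where the local link is the one to remove; an entry in `only_local` still needs a human to decide whether it belongs on the item.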

next office hour

Hey everyone :)

I'll be doing another Wikidata office hour on IRC. It will take place on May 19th at 5PM UTC in #wikimedia-office. For your timezone please see here. I'll be giving a status update and then answer whatever Wikidata-related questions you have. Hope to see many of you there.

Cheers --Lydia Pintscher (WMDE) (talk) 13:41, 22 April 2014 (UTC)

The nl-article of the above item has the name Jobsbo en Marleberg. At least the statistics reports from 2000 and 2005 confirm Marleberg, but the 2010 report says Marieberg. And I see nothing on my maps that confirms Marleberg; instead I see Marieberg everywhere. Marleberg is possible to pronounce in Swedish, but it's a challenge to say. All of this makes me think Marleberg is a mistake by Statistics Sweden. It's not unusual to find this kind of mistake in this kind of entity. Statistics Sweden does not spend much time giving them a good designation; they care more about the statistics code. -- Innocent bystander (talk) (The user previously known as Lavallen) 15:13, 18 April 2014 (UTC)

This 2005 XLS-file indeed names Marleberg, while the newer 2010 XLS-file has Marieberg. I have changed the name at nlwiki. Lymantria (talk) 13:57, 23 April 2014 (UTC)

Self Contained Water Fountain

I have seen on TV a demonstration of a water fountain that uses no electricity, and re-uses its own water in a closed system. How do I build one? 96.238.219.251 05:45, 23 April 2014 (UTC)

you're definitely in the wrong place here. wikidata will not host any manuals on building water fountains. --Akkakk 06:48, 23 April 2014 (UTC)

Translated namespace prefixes

Right now Help:Sources/de redirects to a page titled "Help:Belege". This mix of languages ("Help" is English, "Belege" is German) is against the spirit of a translated user interface. Help pages especially should not provide a mix of English and another language. This is confusing to non-English speakers! To solve this, one could add a namespace alias for the Help namespace for each language (e.g. Hilfe:Belege). I guess the labels for "Help" in other languages are already given in the standard translation of MediaWiki, but how do we get them in Wikidata? -- JakobVoss (talk) 20:14, 21 April 2014 (UTC)

Fixed in your example. It's also possible to do the same on other pages. Not sure whether it is really worth to implement all known namespace prefixes, though. Vogone talk 21:00, 21 April 2014 (UTC)
Personally I don't like to translate namespaces in those messages. Because here only exists the English namespaces, so this just makes mess to "create" namespaces which does not exist. --Stryn (talk) 04:59, 22 April 2014 (UTC)
This fix does not work, try https://www.wikidata.org/wiki/Hilfe:Belege - the URL should match the title that is shown to the user. -- JakobVoss (talk) 20:09, 22 April 2014 (UTC)
As I said above, "here only exists the English namespaces, so this just makes mess" when translating namespaces on those messages. This wiki (like other multilingual wiki's) only contains English namespaces (User, User talk, Item, Property, etc.). --Stryn (talk) 20:14, 22 April 2014 (UTC)
Namespaces are not of a given language, they are just labelled in English ... don't mix up the namespace and its label. I think it's just one of the problems stemming from the fact that Wikidata does not really have several linguistic versions; it's a single project, unlike the Wikipedias, which are linguistic subprojects in their own right. Here it's just a translation of a namespace label wrt. the language of the user. We already have mechanisms to deal with this, such as Discussion in templates ({{Int:talk}}) for example ... TomT0m (talk)
The fact that namespaces are just labelled in English is no explanation but the core of the problem! Imagine as non-English speaker you visit Wikidata in your language but all documentation pages are labeled with "XXXX" instead of "Help". -- JakobVoss (talk) 13:26, 23 April 2014 (UTC)
Theoretically one could create namespace aliases for each language though that would cause a lot of work and clutter. But it's not like there wouldn't be any technical solution for this problem. Vogone talk 19:41, 24 April 2014 (UTC)
In fact it would also be beneficial if the translation extension recognised the translated page titles and looked for them in case no local page with that label exists (so that if you link/search for Hilfe:Belege it would look for a translation with the label "Hilfe:Belege" and redirect you there in case no "real" page with the title "Hilfe:Belege" exists). Vogone talk 19:44, 24 April 2014 (UTC)
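The fallback suggested in the last comment could look roughly like this: serve a real page if one exists under the requested title, otherwise look the title up among translated page titles and go to the corresponding translation. This is a hypothetical sketch only; `resolve_title` and both data structures are invented for illustration and are not part of MediaWiki or the Translate extension.

```python
def resolve_title(requested, real_pages, translated_titles):
    """Return the page to serve for `requested`, or None if nothing matches.

    real_pages:        set of titles that exist as actual pages
    translated_titles: {translated display title: actual page title}
    """
    if requested in real_pages:
        # a "real" page with that title always wins
        return requested
    # otherwise fall back to a translation whose label matches
    return translated_titles.get(requested)

real_pages = {"Help:Sources", "Help:Sources/de"}
translated_titles = {"Hilfe:Belege": "Help:Sources/de"}

print(resolve_title("Hilfe:Belege", real_pages, translated_titles))
print(resolve_title("Help:Sources", real_pages, translated_titles))
```

Checking real pages first keeps the behaviour unchanged for every title that already exists, so the lookup of translated labels only applies to links that would otherwise be red.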

$number needs to be an integer

Hi. When trying to add de:(243073) Freistetter to it:243073 Freistetter or vice versa I get the error message $needs to be an integer when going through "Edit interlanguage links". Bug #63278 is tagged "Status: VERIFIED FIXED". However, the problem remains and I can't find a workaround. Or maybe I don't understand what the status means. Any suggestions? Thanks and regards, --Gereon K. (talk) 18:26, 23 April 2014 (UTC)

It will take a few more days to go live unfortunately. --Lydia Pintscher (WMDE) (talk) 18:30, 23 April 2014 (UTC)
I have merged. --ValterVB (talk) 18:52, 23 April 2014 (UTC)
Thanks. --Gereon K. (talk) 19:54, 23 April 2014 (UTC)

Futsal team

Hello. We have association football club (Q476028) for association football. Do we have something for futsal (Q171401) as well? I can't use association football club (Q476028) because it is a different sport. I want to use it with instance of (P31). Xaris333 (talk) 03:30, 24 April 2014 (UTC)

No, we don't have such an item yet. In fact, most futsal teams don't have a P31 statement at all, while some are tagged as Q476028. Go ahead and create a futsal team item, I'd say. (Main category would be either Category:Futsal clubs (Q7843233) or, more directly but less complete, Category:Futsal teams (Q13263276) - I can't tell which would be more correct.) --YMS (talk) 08:34, 24 April 2014 (UTC)
We need guidelines for sport: a team is not a match; is a match an instance of the sport itself; what is the status of a pro league versus an amateur league; and so on. I made a few attempts myself for some sports at random, but there clearly is a pattern. TomT0m (talk) 11:31, 24 April 2014 (UTC)
For sport other than football, I generally use instance of (P31) = sports team (Q12973014). --Casper Tinan (talk) 12:54, 24 April 2014 (UTC)

Done. futsal team (Q16632355) Xaris333 (talk) 15:59, 24 April 2014 (UTC)

I'm against creating lots of items that won't have much in the way of properties. I suggest that instance of (P31) = sports team (Q12973014) should be used with the additional property
sport (P641) = futsal (Q171401) so we don't need to have an item for a type of team in addition to the existing items we already have for each sport. Filceolaire (talk) 23:02, 24 April 2014 (UTC)

I support Filceolaire's solution. It would also enable us to handle the case of multisport and omnisport clubs. Casper Tinan (talk) 09:44, 25 April 2014 (UTC)
I think it won't reduce the number of items and won't solve anything. If we have a notable junior women's volleyball team in a big club, we won't really be able to model this easily in one statement. The creation of items is just organic, and a few items per sport is unnoticeable: there are a lot more teams than there are clubs, a lot more clubs than there are sports, and a lot more players. TomT0m (talk) 10:16, 25 April 2014 (UTC)
I prefer to have Stockport Futsal Club (Q16632363) instance of (P31) = futsal team (Q16632355) rather than Stockport Futsal Club (Q16632363) instance of (P31) = sports team (Q12973014). Xaris333 (talk) 23:13, 25 April 2014 (UTC)
We have also association football club (Q476028), cycling team (Q1785271), ice hockey team (Q4498974), baseball team (Q13027888), handball team (Q10517054), basketball team (Q13393265), field hockey team (Q15020890), rugby league team (Q15221215) etc. Xaris333 (talk) 23:17, 25 April 2014 (UTC)
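The two modelling options debated above lead to slightly different lookups, which a small sketch can show side by side. The item and property IDs are the ones quoted in this thread; the `ITEMS` layout, the `example-team` entry and the `is_futsal_team` helper are hypothetical and only illustrate the comparison.

```python
FUTSAL_TEAM = "Q16632355"   # futsal team
SPORTS_TEAM = "Q12973014"   # sports team
FUTSAL = "Q171401"          # futsal

# Toy claims per item: {property ID: [value item IDs]}
ITEMS = {
    # Stockport Futsal Club, modelled with the dedicated class
    "Q16632363": {"P31": [FUTSAL_TEAM]},
    # hypothetical item modelled with the generic class plus sport (P641)
    "example-team": {"P31": [SPORTS_TEAM], "P641": [FUTSAL]},
}

def is_futsal_team(claims):
    """True if the item is a futsal team under either modelling."""
    instance_of = claims.get("P31", [])
    if FUTSAL_TEAM in instance_of:
        return True
    return SPORTS_TEAM in instance_of and FUTSAL in claims.get("P641", [])

print([item for item, claims in ITEMS.items() if is_futsal_team(claims)])
```

As long as both styles coexist, anyone querying for futsal teams has to check both patterns, which is one practical argument for settling on a single convention.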

References on Wikipedia

I work a great deal on Wikipedia with References, and quite a lot of my job is to convert raw text into instances of one or other of the {{Cite…}} templates. Would it be possible to consider storing the details of these references here? There is quite a lot of scope for normalisation and synchronisation, particularly when it comes to names and wikilinks. Assuming I haven't missed something obvious, where would I look to start a discussion about this? TIA HAND —Phil Boswell (talk) 12:39, 24 April 2014 (UTC)

Yes, according to Wikidata:Requests for comment/Source items and supporting Wikipedia sources#Supporting Wikipedia sources we want to store this data. At Help:Sources you see how this works. However, note that it is not yet possible to use this in Wikipedia as accessing another Wikidata item from a Wikipedia article is not yet supported. But this feature is under development and will - hopefully - be enabled very soon.  — Felix Reimann (talk) 15:12, 24 April 2014 (UTC)
Thank you. Unfortunately, despite reading those pages, I am somewhat at a loss as to how this would actually work in practice. Is there some kind of Wikidata for Dummies I can read to get up to speed in preparation? —Phil Boswell (talk) 13:55, 25 April 2014 (UTC)
I do not know of a basic introduction to Wikidata; Wikidata:Glossary might be helpful to cope with the new vocabulary. May I give it a try? Suppose you want to reference the article Schmid, S. M., Fügenschuh, B., Kissling, E., & Schuster, R. (2004): Tectonic map and overall architecture of the Alpine orogen. Eclogae Geologicae Helvetiae, 97(1), 93-117. [3]. You search Wikidata and do not find this article. Thus, you create a new item (see on the left-hand side: "Create a new item", which gives you an empty item like Q13416617) for this specific article. Then you add all the properties defined at Help:Sources which are relevant for the article, like authors, title, pages, ... Then you have Q13416617. As this article is part of a journal and the journal is a different subject (the article vs. the journal containing several articles), we need a different item for it. Perhaps you are lucky: the item for the journal exists already and you found it by searching: Q13416933. Then you add to Tectonic map and overall architecture of the Alpine orogen (Q13416617) the statement part of (P361) = Swiss Journal of Geosciences (Q13416933). Now we have all the information required for creating a reference. In your Wikipedia article, you can add {{CiteQ|Q13416617}}, which is a Lua module (more powerful than the "former" templates) and delivers:
Let's assume you and other authors have now added the same reference to 20 different Wikipedia articles (also in different language chapters: de-wiki, fr-source, pt-voyage, ...), just by using {{CiteQ|Q13416617}}. Now a user (or a bot) detects that there is the ISBN 123-123-123-123 for the article. If it adds the property ISBN-13 (P212)=123-123-123-123 to Q13416617, the reference string in all 20 articles is updated accordingly. Note that this {{CiteQ}} template does not exist yet as this would require a Wikidata feature which is still under development.
What else? You could now also query Wikidata to get all scientific articles by the first author Bernhard Fügenschuh and add the corresponding reference strings as a list to his Wikipedia biography creating his bibliography automatically. Tell me, if I lost you somewhere. :-)  — Felix Reimann (talk) 14:53, 25 April 2014 (UTC)
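As a rough illustration of the idea behind the proposed {{CiteQ}} (which, as noted above, does not exist yet), the sketch below builds a reference string from an article item and the journal item it is part of. It is written in Python rather than Lua purely for illustration; the `ITEMS` layout, the field names and the `cite` function are hypothetical, and the output format is arbitrary.

```python
# Hypothetical item records; plain field names stand in for property IDs.
ITEMS = {
    "Q13416933": {"title": "Swiss Journal of Geosciences"},
    "Q13416617": {
        "authors": ["Schmid, S. M.", "Fügenschuh, B.", "Kissling, E.", "Schuster, R."],
        "year": 2004,
        "title": "Tectonic map and overall architecture of the Alpine orogen",
        "part of": "Q13416933",  # part of (P361) pointing at the journal item
        "pages": "93-117",
    },
}

def cite(item_id, items):
    """Build a reference string from the article item and its journal item."""
    item = items[item_id]
    journal = items[item["part of"]]["title"]
    text = "{} ({}): {}. {}, {}.".format(
        "; ".join(item["authors"]), item["year"], item["title"], journal, item["pages"]
    )
    if "ISBN-13" in item:
        text += " ISBN " + item["ISBN-13"] + "."
    return text

print(cite("Q13416617", ITEMS))

# Someone later adds the identifier from the example above to the item;
# every page that calls cite() picks it up on the next render.
ITEMS["Q13416617"]["ISBN-13"] = "123-123-123-123"
print(cite("Q13416617", ITEMS))
```

The point of the design is that the reference is stored once: pages hold only the item ID, so a correction on the item changes the rendered string everywhere it is used.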

testing needed for next deployment

Hey folks :)

The next deployment is on Tuesday. We've reworked handling of time and geocoordinates quite considerably. It'd be awesome if you could give it some testing at test.wikidata.org and let me know about any issues you encounter. --Lydia Pintscher (WMDE) (talk) 13:49, 25 April 2014 (UTC)

I can't directly enter the date "2010" with year precision. It is always displayed as "2010s" with decade precision. --Pasleim (talk) 09:05, 26 April 2014 (UTC)
I can't enter a time of day in the time/date field - best precision is 'day'. Filceolaire (talk) 20:25, 26 April 2014 (UTC)

Dedicated namespace for Task Forces/WikiProjects

Self-explaining. Opinions? --Ricordisamoa 00:24, 26 April 2014 (UTC)

Not really, why would we want to have that? What does it add? Multichill (talk) 16:00, 26 April 2014 (UTC)
I think either zhwiki or kowiki (can't remember which) does this, but we don't have very many task forces, so I see no harm in keeping them in WDspace. --Jakob (talk) 16:04, 26 April 2014 (UTC)

How to list all games in a video game series?

Is there a proper way of doing this, should I do it, and if so how can I do it? For example, the Half-Life series would include Half-Life, Half-Life 2, etc. The series is listed from the game's pages, but this isn't true vice versa. I imagine this'd be something like "Games in this series"? Not sure.

If possible, please use Template:replyto when replying to this thread. I don't visit my Wikidata Watchlist a huge amount, so I don't want to miss your replies! --Nicereddy (talk) 00:43, 26 April 2014 (UTC)

@Nicereddy: How about using has part(s) (P527) as in this example.--Underlying lk (talk) 03:04, 26 April 2014 (UTC)
@Underlying lk: Thanks! While this works, it unfortunately doesn't seem to be used in any other series' data items. It's mostly used for bands, as far as I can tell. I'm not sure if we should start using it without precedent? --Nicereddy (talk) 05:19, 26 April 2014 (UTC)
@Nicereddy: Actually, the only feasible way to do this would be bugzilla:49165. --Ricordisamoa 03:49, 26 April 2014 (UTC)
@Ricordisamoa: Was an RfC ever made on that topic? I don't see anything in the comments saying it was, but it seems like a rather important feature to at least consider. --Nicereddy (talk) 05:19, 26 April 2014 (UTC)
If adding reciprocal properties really would require changing "the essential structure of Wikikdata properties" as said in comment 4 of the bugzilla, I think that has part(s) (P527) is in fact the only feasible solution.--Underlying lk (talk) 05:48, 26 April 2014 (UTC)
@Nicereddy: Wikidata:Requests for comment/Make Symmetric & Inverse Property addition automatic and bidirectional was held about a similar topic, but with very low participation. --Ricordisamoa 13:17, 26 April 2014 (UTC)

If you are talking about series, it may be better to use sequence properties like follows (P155) and followed by (P156). (By the way, I'll recall my proposal at Wikidata:Property proposal/Generic#in list to discuss how we use those properties to say that an item follows (P155) and is followed by (P156) some other items in possibly several sequences ... do we use part of (P361)?) TomT0m (talk) 13:45, 26 April 2014 (UTC)

The proper way to do this is to use the property part of the series (P179) on the elements of the series, as has been done on Half-Life (Q279744) and Half-Life 2 (Q193581), and not put anything in the item for the series. There is a bug filed with the developers to give us an easy way to track reverse properties so that, for instance, we can easily find all the statements that have Half-Life (Q752241) as their object (all the items that link to this item), though "What links here" sort of does this. Filceolaire (talk) 20:08, 26 April 2014 (UTC)
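
A reverse lookup of this kind can be sketched as follows. This is a minimal Python illustration over a hand-written list of statements, not an existing Wikidata feature or API. The triple layout and the function names are assumptions, and the third statement is there only to show that other properties are ignored.

```python
from collections import defaultdict

# Hypothetical statements as (subject item, property, object item).
STATEMENTS = [
    ("Q279744", "P179", "Q752241"),  # Half-Life: part of the series
    ("Q193581", "P179", "Q752241"),  # Half-Life 2: part of the series
    ("Q193581", "P155", "Q279744"),  # illustrative "follows" statement
]


def build_reverse_index(statements):
    """Map (property, object) to the list of subjects that point at it."""
    index = defaultdict(list)
    for subject, prop, obj in statements:
        index[(prop, obj)].append(subject)
    return index


def items_in_series(series_qid, statements=STATEMENTS):
    """Return all items that state part of the series (P179) = series_qid."""
    return sorted(build_reverse_index(statements)[("P179", series_qid)])


print(items_in_series("Q752241"))
```

The series item itself stays empty; the list of games is derived from the statements on the individual games.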

Clarifying the requirements for property creation

Conversation moved to: Wikidata:Requests for comment/Clarifying the requirements for property creation.--Micru (talk) 01:20, 27 April 2014 (UTC)

Dates manufactured

What property would be best to use for inputting the dates between which a car was manufactured? For instance, the Pontiac Phoenix was made from 1977-1984. How would I enter that info on the appropriate item? Mr. Millard Fillmore (talk) 21:41, 25 April 2014 (UTC)

service entry (P729) and service retirement (P730) ? Have a look at their talk pages and original proposal discussions. LaddΩ chat ;) 23:13, 25 April 2014 (UTC)
service entry (P729) is not practical, it would create ambiguities for any vehicle that is in military service, like T-90 (Q192757). The truth is that we still don't have many basic time properties, there are 22 of them in total compared to hundreds for items or strings. Some suggest using significant event (P793) with a qualifier but personally I think we should just create a dedicated property for this.--Underlying lk (talk) 03:11, 26 April 2014 (UTC)
Have a look at the talk page of significant event (P793). A solution for this very case is proposed. Casper Tinan (talk) 07:39, 26 April 2014 (UTC)
Having to use qualifiers for every time claim is not much of a solution.--Underlying lk (talk) 15:37, 26 April 2014 (UTC)
I agree. significant event (P793) is too impractical to be used on large scale. /ℇsquilo 05:23, 27 April 2014 (UTC)
@Underlying lk Why is it not a solution? Is having 30-50 properties simpler, especially when they are in the middle of ~2000 properties? Not for a newbie. Snipre (talk) 10:56, 28 April 2014 (UTC)
I understand where you're coming from because having lots of properties is bound to confuse things, but the proposed solution is no better as it requires the user to remember not only the property, but also the right qualifier to use for each case, which is even less user-friendly. significant event (P793) should be for cases where the property-qualifier combination is relatively rare, but if a claim requires the same qualifier for several thousand items, it's better to create a dedicated property.--Underlying lk (talk) 11:04, 28 April 2014 (UTC)

Private templates

Should we keep Template:Maintenance? It is placed on dozens of item talk pages without explanation. --Kolja21 (talk) 21:22, 27 April 2014 (UTC)

There are hundreds of semantic conflicts, many of them where language links are grouped together although one subset is about an occupation and the other about a field of occupation. Wikidata started as a quick-and-dirty (Q2718087) project. Some people, especially @Kolja21, seem to think that nothing exists beyond their imagination; otherwise I cannot understand his "contributions" such as [4].
The template (today a draft) is part of a multiple week job.

{{maintenance|SUBJECT=Q2207288|LIST=|TREEVIA=31<!--361 -->|DDC=680|DDCMAINITEM=|DDCMAIN=|TANDEM=Q1294787|TANDEMLIST=|TANDEMTREEVIA=279|DDCTANDEM=331.794|SHOWALL=|SILENT=y|WMFLCODE=en}}
craft (Q2207288) · purge · T · WLH · tree · reasonator · DDC: 680 · DDCTANDEM: 331.794 · tree using instance of (P31) · TANDEM: artisan (Q1294787) · purge · T · WLH · tree · reasonator  · tree using subclass of (P279)

In many languages there is only one article about Handwerk = craft and Handwerker = craftsmen. To my understanding (that is, if you do not want Wikidata to be a farce) Wikidata pages need to be semantically unambiguous. It will take many months to achieve this, and many items will, for historical reasons, remain dual use for a long time. Continuing to add information via bots will just postpone this and create more work. לערי ריינהארט (talk) 04:30, 28 April 2014 (UTC)

{{maintenance|SUBJECT=Q40634|LIST=|TREEVIA=361|DDC=|DDCMAINITEM=Q8162|DDCMAIN=410|TANDEM=Q13418253|TANDEMLIST=|TANDEMTREEVIA=279|DDCTANDEM=409.2<!-- ,410.92-->|SHOWALL=|SILENT=y|WMFLCODE=eo}}
philology (Q40634) · purge · T · WLH · tree · reasonator · linguistics (Q8162) · DDCMAIN: 410 · DDCTANDEM: 409.2 · tree using part of (P361) · TANDEM: philologist (Q13418253) · purge · T · WLH · tree · reasonator  · tree using subclass of (P279)

is an example about some fields of works and the related professions. I have not seen @Kolja21 contributing here.
With the template it is possible to see if the "tree" is consistent for both the fields of works and the related professions across Wikidata and the related / linked sites. It will not be possible to follow the systematics from all linked sites because of historical changes and variations across cultures and languages. But it can be possible to keep the amount of differences as small as possible.
Regarding maintenance there are a lot of issues to do:

keeping WD items semantically unambiguous; this includes the naming of titles in Wikidata (also the singular / plural differences); if necessary, language links have to be moved to the semantically correct WD item;
keeping aliases unambiguous; this will require removing a lot of aliases added by bots from the language Wikipedia sites because of existing redirects;
creating missing FreeBase entries etc.
adding carefully whatever statements, the external and the relational statements (instance of, part of, subclass of)

All this will be possible only with human interaction. gangLeri 06:59, 28 April 2014 (UTC)

Sure there are "hundreds of semantical conflicts" but it doesn't help if you add cryptic thoughts to talk pages. We had the same problem with you on Wikipedia where you flooded talk pages in the same way, after your edits in the articles have been criticized and often reverted. Why don't you correct your old errors like adding multiple identifiers (different persons, wrong numbers etc., see Property talk:P1003 and Property talk:P1005) instead of starting new chaos? --Kolja21 (talk) 12:24, 28 April 2014 (UTC)

Is it possible to import IMDB database?

IMDB is a database containing movie and TV information. Since data doesn't have a copyright, is it possible to import cast, crew, genre, rating, production company, and technical specs information from it?--維基小霸王 (talk) 07:31, 28 April 2014 (UTC)

On imdb.com it is written: You may not use data mining, robots, [...] on this site, except with our express written consent and We do allow the limited use of robots and crawlers, such as those from certain search engines, with our express written consent. --Pasleim (talk) 07:52, 28 April 2014 (UTC)
I see. But facts are facts. Facts won't change if they are collected in other ways. I suppose if I index that site on my own behalf, they won't become illegal data.--The Master (talk) 08:29, 28 April 2014 (UTC)
Facts are facts, but collections of facts fall under other rules. You can take some facts from a database or any document, but once you import large parts of the database, you can't justify it by appealing to the notion of facts. Snipre (talk) 10:03, 28 April 2014 (UTC)
Given that they say they do give consent to some robots, it is not outside the bounds of possibility that consent would be given to a bot that imported data here. It may therefore be worth engaging in a discussion with them about it. Thryduulf (talk: local | en.wp | en.wikt) 10:18, 28 April 2014 (UTC)
Better not to. IMDb is not a very reliable site. --Stryn (talk) 13:16, 28 April 2014 (UTC)

I think that when an item links to a page in a particular language Wikipedia, the label in that language should be added automatically, with any disambiguating text in parentheses removed along with the parentheses themselves. There should be a delay of 1 hour or so instead of doing it instantly, in order to allow editors to correct potential mistakes. (They may find mistakes shortly after applying the interwiki change.) It may not be perfect, but it would help a lot. How about that?--The Master (talk) 08:02, 28 April 2014 (UTC)

I think that would be bugzilla:57564, and there are tools like Wikidata:Tools/User_scripts#Label_Collector.--Micru (talk) 09:21, 28 April 2014 (UTC)
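
The label derivation proposed above (use the page title, minus a trailing disambiguator in parentheses) could look roughly like this. It is a minimal Python sketch, not the Label Collector script or the behaviour requested in the bug. The function name and the choice to strip only one trailing parenthesised group are assumptions.

```python
import re

# Matches one parenthesised group at the very end of a title,
# together with the whitespace around it.
_TRAILING_PARENS = re.compile(r"\s*\([^()]*\)\s*$")


def label_from_title(title: str) -> str:
    """Derive a label from a page title by dropping a trailing '(...)'."""
    stripped = _TRAILING_PARENS.sub("", title).strip()
    # If the whole title was in parentheses, keep it unchanged
    # rather than returning an empty label.
    return stripped or title.strip()


print(label_from_title("Mercury (planet)"))
print(label_from_title("Half-Life 2"))
```

Titles where the parentheses are part of the actual name would still need a human check, which is what the proposed one-hour delay is meant to allow.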

Default statement for human settlements

I work a lot with Swedish urban areas. Urban areas often have a history from before urban areas were defined as a statistical entity. And sometimes urban areas are amalgamated with other urban areas and lose their statistical purpose. Other times the population becomes too small or less dense, so that they no longer fulfil the requirements to be an urban area.

How am I supposed to describe them then? In Swedish, I would use the word "ort", something we do not have a Swedish article about. It's a populated place, or a former populated place. Which item is the best to describe a human settlement which does not fulfil any special administrative or statistical definitions? -- Innocent bystander (talk) (The user previously known as Lavallen) 11:52, 28 April 2014 (UTC)

I would use "instance of:human settlement (Q486972)" with qualifiers "start/end:date with century precision", and then keep filling it out with "instance of:village" (start/end century), "instance of:municipality" (start/end date). Maybe also ask on Wikidata:Country subdivision task force, it can be that there is another approach...--Micru (talk) 12:19, 28 April 2014 (UTC)

When multiple P41 values exist, how to retrieve the correct one

When using {{#property:p41}} in a template or on a page, it returns the name of the flag image of a country/state/..., but in the case of country United States it returns:

Flag of the United States.svg, US flag 49 stars.svg, US flag 48 stars.svg

Three file names! In this case {{#ifexist:media:{{#property:p41}}|[[File:{{#property:p41}}{{!}}35px{{!}}]]}} has a false condition and does not show a flag image. A difference from other countries (and other geographical entities) is the presence of a start date and an end date. So to solve my problem, I need to access the one with the highest start date or the one without an end date in case of multiple values for P41. But I also need to get the ones that have no start/end date in case of a single value for p41. What can I do? --FredTC (talk) 14:55, 26 April 2014 (UTC)

I tried to find an answer but couldn't. As you figured out, it's caused by the multiple values in the "flag image" field of Q30. I see you were sent here from Wikipedia:Help desk. Sorry to keep sending you around but it's a Wikidata feature so people at Wikidata:Project chat probably know more if you don't get an answer here. PrimeHunter (talk) 11:19, 28 April 2014 (UTC)

Above question text is copied from Wikipedia. Does someone here know the answer? --FredTC (talk) 12:05, 28 April 2014 (UTC)

Set one of the statements to preferred and only that one will be sent to Wikipedia. --Lydia Pintscher (WMDE) (talk) 13:24, 28 April 2014 (UTC)
Use [5] but I think the selection according to the date is not yet developed. Snipre (talk) 13:50, 28 April 2014 (UTC)
That was the solution, Thanks. --FredTC (talk) 14:00, 28 April 2014 (UTC)
The Italian version can handle dates in qualifiers; for example, it can select the most recent value with: {{#invoke:Wikidata|formatStatements|property=p41|qualifier=p580|qualifiertype=latest}}, or select by rank, e.g.: {{#invoke:Wikidata|formatStatements|property=p41|rank=preferred}}. --ValterVB (talk) 20:35, 28 April 2014 (UTC)
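
The selection rules discussed in this thread (prefer a statement ranked preferred, otherwise one without an end date, otherwise the latest start date, and fall back to a single undated value) can be sketched as follows. This is a Python illustration of the logic only, not the code of it:Modulo:Wikidata or of the {{#property:}} parser function. The dictionary layout, the function name and the dates are assumptions for the example.

```python
def pick_current_value(statements):
    """Pick one value from statements with optional rank/start/end keys."""
    usable = [s for s in statements if s.get("rank") != "deprecated"]
    preferred = [s for s in usable if s.get("rank") == "preferred"]
    candidates = preferred or usable
    if not candidates:
        return None
    # Prefer statements that have no end date (still valid today).
    open_ended = [s for s in candidates if s.get("end") is None]
    pool = open_ended or candidates
    # Among those, take the latest start date; ISO date strings in the
    # same format sort correctly, and a missing start sorts first.
    best = max(pool, key=lambda s: s.get("start") or "")
    return best["value"]


# Illustrative data modelled on the three flag image (P41) values above.
US_FLAGS = [
    {"value": "Flag of the United States.svg", "start": "1960-07-04"},
    {"value": "US flag 49 stars.svg", "start": "1959-07-04", "end": "1960-07-04"},
    {"value": "US flag 48 stars.svg", "start": "1912-07-04", "end": "1959-07-04"},
]

print(pick_current_value(US_FLAGS))
```

With a single statement that has neither start nor end date, the same function simply returns that value, which covers the other countries mentioned in the question.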

Welcome Helen and Anjali!

Hey folks :)

Last week, the GNOME Foundation announced the interns that were accepted for the May - August 2014 round of the Outreach Program for Women (OPW). Eight interns were selected to work on Wikimedia projects, including two who will contribute to Wikidata outreach efforts. I will be mentoring them with Thiemo as my backup.

Anjali Sharma of India will be focusing on the design of mobile app concepts and promoting Wikidata on social media platforms. Her original project proposal can be found here: https://www.mediawiki.org/wiki/WikiHunt_the_%27Property%27:_Wikidata_Outreach_Initiative She will be blogging about her progress here: http://discoveranjali.wordpress.com/

Helen Halbert of Canada will be evaluating and expanding Wikidata documentation and contributing to a GuidedTours extension for introducing new users to Wikidata. Her original project proposal can be found here: https://www.mediawiki.org/wiki/Feed_the_Gnomes_-_Wikidata_Outreach She will be blogging about her progress here: http://www.helenhalbert.com/

OPW is part of Wikimedia’s mentorship programs and is intended to encourage new and intermediate contributors to grow their skills as well as provide pathways for richer community collaboration. Community feedback is essential to the success and sustainability of our interns’ projects and will be sought by Anjali and Helen throughout their work. They'll be in touch with you as the projects progress.

Please welcome them to Wikidata. I'm really excited to have them here to help us out.

Cheers --Lydia Pintscher (WMDE) (talk) 19:30, 28 April 2014 (UTC)

@Lydia Pintscher (WMDE): Congrats Helen and Anjali, it is great to have more students working here :) Just one question, why was OPW preferred over GSoC?--Micru (talk) 20:10, 28 April 2014 (UTC)
It was a matter of resources - specifically developer time. I didn't want to have one of the developers spend time this summer on mentoring one or more students as we really need to make fast progress on Commons and queries for example. I can however make the time in my own schedule for another crucial task: improving our documentation and outreach ;-) Hope that explains. --Lydia Pintscher (WMDE) (talk) 20:13, 28 April 2014 (UTC)
Shame that we couldn't have both this year, but I understand. And anyhow better one program than none :) --Micru (talk) 21:05, 28 April 2014 (UTC)
@Lydia Pintscher (WMDE): it looks like you forgot to mention their usernames; I think they are User:Discoveranjali and User:Thepwnco, right? :-) --Ricordisamoa 21:45, 28 April 2014 (UTC)
Hah. You're right. And yes those should be their user pages. --Lydia Pintscher (WMDE) (talk) 21:48, 28 April 2014 (UTC)

Wikidata:Project chat#Norse_mythology

Gylfi (Q2296321)
aka Gangleri

Hi! This paragraph relates to many items linked to from narrative universe (P1080) WLH. It was a job of more than 24 hours.
You may help

verifying the topics
adding FreeBase items for missing topics as Líf (Q3261873) , Lífþrasir (Q16513752) and many others.
verifying and adding the descriptions and labels
verifying that aliases are unique
verifying the existence of all items listed at en:Template:Norse paganism topics, no:template:Norrøn mytologi, is:Skepnur (norræn goðafræði) and all items relevant to Norse mythology from en:list of mythological objects
useful links: en:Family tree of the Norse gods

Please remember that this is not a quick-and-dirty (Q2718087) job.

many WMF pages were linked wrongly between languages, especially those related to multiple characters such as Þjálfi and Röskva (Q1464046) and many others
most pages were orphaned
to be continued

FYI: Reasonator does not handle properties involving fictional entity (Q14897293) properly (see Reasonator: Loki (Q133147) ).

לערי ריינהארט (talk) 07:58, 23 April 2014 (UTC)

some help: Norse mythology (Q128285) · purge · T · WLH · tree · reasonator · DDCTANDEM: 000 · tree using subclass of (P279) · TANDEM: Norse cosmology (Q1798538) · purge · T · WLH · tree · reasonator  · tree using subclass of (P279)

Reasonator does show a special form for humans; Loki is not one. Thanks, GerardM (talk) 11:06, 24 April 2014 (UTC)
@GerardM I posted a message at the m:Talk:Reasonator page about handling of fictional entities. לערי ריינהארט (talk) 22:14, 26 April 2014 (UTC)

At the end of the day: http://tools.wmflabs.org/wikidata-todo/tree.html?lang=en&q=128285&rp=279,361&method=d3&live to be continued לערי ריינהארט (talk) 22:45, 25 April 2014 (UTC)

I moved many from narrative universe (P1080) WLH statements. They are used now mainly as qualifiers.
WLH:"Norse mythology" lists those items; most of them. לערי ריינהארט (talk) 22:14, 26 April 2014 (UTC)
last "bottle" at: nb:talk:Norrøn mytologi לערי ריינהארט (talk) 10:31, 29 April 2014 (UTC)

Bot request

Would it be possible for an automatic tool to go through w:Template:Shakespeare's sonnets and s:Template:Sonnets and link the wp and ws pages on Sonnet 1, Sonnet 2, ..., together? I have chosen a few at random and linked them up manually, but it would be quite tedious to do them all by hand. It Is Me Here t / c 23:18, 28 April 2014 (UTC)

bot requests go to WD:BR --Akkakk 23:50, 28 April 2014 (UTC)
Anyway, ✓ Done --Ricordisamoa 00:26, 29 April 2014 (UTC)
Thank-you! And sorry I couldn't find the page; would it be possible to add a link to it to Template:Discussion navigation? It Is Me Here t / c 10:49, 29 April 2014 (UTC)

Error: "Please translate this into қазақша."

When I visit Wikidata:Property_proposal/Generic I see that message written next to each section title. Does anyone experience the same problem? My language interface is set to English, I tried to remove some babel boxes, but it didn't help, and Module:TranslateThis hasn't been changed lately.--Micru (talk) 11:55, 29 April 2014 (UTC)

Actually I am also seeing transcluded item labels in Kazakh language, but property labels in English :-? --Micru (talk) 12:04, 29 April 2014 (UTC)
Turning my user interface language to en I experience the same problem. --Pasleim (talk) 12:13, 29 April 2014 (UTC)
I have been periodically reporting this problem on IRC for months. I do appreciate that the devs are doing fantastic stuff and I don't mind the occasional қазақша to keep life interesting. John Vandenberg (talk) 12:28, 29 April 2014 (UTC)
Now it is gone :D --Micru (talk) 12:59, 29 April 2014 (UTC)
?action=purge always fixes the page. I mostly see it on the Property proposal pages, and only ever when using English for my skin language. And it is almost always Kazakh that appears instead of English.
fwiw, occasionally for the last few days I am also seeing <wikibase-x-y-z> on Q pages instead of the English strings. John Vandenberg (talk) 13:25, 29 April 2014 (UTC)

Position Held (p39) Normalization of member of parliament (q486839)

I propose that we normalize the data on the position held property. The patterns currently in use are very varied, and I think we should normalize them so we can use the data later. For the member of parliament item, here (Reasonator) are the patterns that I see on items right now. I think we can use pattern 2. What do you think?

Pattern 1: Position Held (p37)

   member of parliament (q486839)
       of (p642) = House of Representative of Country (qXX)

Pattern 2: Position Held (p37)

   member of parliament (q486839)
       country (p17) = country name (qXX)
       of (p642) = Representing district (qXX)

Pattern 3: Position Held (p37)

   member of parliament (q486839)
       member of (p463) = Hellenic Parliament (q477089)

Also, on a related topic, should we use a generic item with qualifiers rather than a specific item? Some positions held have specific items, like presidents and other well-known positions. But we should normalize this to be able to use the data in a usable way in the future.

Generic: Position Held (p37)

   president (Q30461)
       country (p17) = country name (qXX)

Specific: Position Held (p37)

   president of country XYZ (QXXX)

--Napoleon.tan (talk) 15:32, 24 April 2014 (UTC)

It's P39, not P37.
Please don't use country (P17) as a qualifier property to position held (P39). Remember that these properties need to be usable for local councils and regional parliaments as well. I would propose
 position held (P39):Local councillor (or member of parliament or other generic office)
    of (P642):Foo county (or other administrative entity) (For larger entities use the item for the parliament/congress/senate)
    electoral district (P768):Bar ward (item for the electoral constituency) (Can be omitted for local councils where we don't have items for each electoral ward.)
    start time (P580)
    end time (P582)
OK?
I would be against using of (P642) with an item for the particular parliament (i.e. use United States Congress (Q11268) not 34th United States Congress (Q4635460)) so that a long stay in congress is one statement. Filceolaire (talk) 22:43, 24 April 2014 (UTC)
No reference to the election item? Something like statement is subject of (P805)=Foo county general election 2014 ? LaddΩ chat ;) 00:30, 25 April 2014 (UTC)
Thanks for the position held correction. I agree with your idea Filceolaire on the "of" administrative division plan. I think we should remove it also for presidents and such. To only refer to position held as president and not position held = "President of XXX country" directly.

Laddo, I think we can add the reference to the election as well. What do others think about this idea?

I think the two questions above are related. What does GerardM think about this? I think changing this would change the display in Reasonator a lot. There would be no more items under president of XXX country; instead all would be related to the generic position president. I believe he promotes the Reasonator project in his blog.

--Napoleon.tan (talk) 00:59, 25 April 2014 (UTC)

@LaddΩ: I would use successful candidate (P991) to link from the election item to the person but I don't think we need another qualifier for position held (P39) here. Filceolaire (talk) 21:24, 25 April 2014 (UTC)

I think the no specific item solution makes harder to build a list of the persons who held this office chronologically. It's easy enough to create those items and not really a big deal. TomT0m (talk) 07:14, 25 April 2014 (UTC)

One problem is that there can be different local titles for each city. I know Poland has different titles for its mayors, depending on the size and tradition of the city.
In Sweden we have 12 borgarråd in Stockholm alone; each has its own specific title, describing his/her area of responsibility. -- Innocent bystander (talk) (The user previously known as Lavallen) 07:48, 25 April 2014 (UTC)
Aren't they all subclasses of mayor in Poland, for example? TomT0m (talk) 08:45, 25 April 2014 (UTC)
Yes, and if we have a solution like Mayor of Whatever, we need to specify the subclass in the specific statement. Only "Mayor" is not specific enough. -- Innocent bystander (talk) (The user previously known as Lavallen) 09:02, 25 April 2014 (UTC)

I think the member of parliament item does not make any sense: it lumps together every kind of assembly and legislative system ... We would shoehorn in assemblies elected proportionally or by territory, assemblies with different kinds of powers, democracies and dictatorships alike. I much prefer membership of a specific assembly, with its peculiarities dealt with in its own item once and for all, not in every statement for every elected person. TomT0m (talk) 08:52, 25 April 2014 (UTC)

It makes better sense to indicate for instance "Member of the Lok Sabha". So you would have one for each parliament and in that way do justice to the differences between systems. It is part of the Lok Sabha.. and they refer to constituencies of the Lok Sabha. Member of the Lok Sabha is imho a subclass of "member of parliament". Thanks GerardM (talk) 18:32, 25 April 2014 (UTC)

  1. but what about other levels of government?
  2. Do we create an item for each county council separate from the item for the county?
  3. Are there statements that apply to the Council but not the County?
  4. Do we create an item for "Mayor of Foo" and "Counsellor of Foo" separate from the items for "Mayor of Bar" and "Counsellor of Bar"?
  5. If we have one item for all "Mayors" with the qualifier "of" then why not have the same pattern for members of parliament and for Presidents?
Filceolaire (talk) 21:15, 25 April 2014 (UTC)
Actually, the problem I am thinking of concerns querying. If we allow any value to be set in position held, such as a specific item (e.g. Mayor of city X), then how do we query data across different countries? --Napoleon.tan (talk) 05:39, 30 April 2014 (UTC)
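
One way the cross-country query could still work with specific items is the subclass approach suggested earlier in the thread: each specific office is a subclass of a generic one, and the query follows subclass of (P279) upwards. The following Python sketch only illustrates that idea. The office names, the people and the function names are invented placeholders, and no real query service is involved.

```python
# Hypothetical subclass of (P279) links: specific office -> generic office.
SUBCLASS_OF = {
    "member of the Lok Sabha": "member of parliament",
    "member of the Hellenic Parliament": "member of parliament",
    "mayor of Foo": "mayor",
}

# Hypothetical position held (P39) statements: (person, office).
POSITION_HELD = [
    ("Person A", "member of the Lok Sabha"),
    ("Person B", "member of the Hellenic Parliament"),
    ("Person C", "mayor of Foo"),
    ("Person D", "member of parliament"),
]


def is_a(office, generic, subclass_of=SUBCLASS_OF):
    """Follow subclass links upwards until 'generic' is found or the chain ends."""
    seen = set()
    while office is not None and office not in seen:
        if office == generic:
            return True
        seen.add(office)  # guards against accidental subclass loops
        office = subclass_of.get(office)
    return False


def holders_of(generic, position_held=POSITION_HELD):
    """List everyone holding the generic office or any subclass of it."""
    return [person for person, office in position_held if is_a(office, generic)]


print(holders_of("member of parliament"))
```

Under this assumption a query for the generic office returns holders of every specific office, regardless of country, as long as the subclass links are maintained.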

Templates using data from Wikidata

Hey folks :)

I am constantly being asked to show people templates that make use of Wikidata. Are you actively working on templates in a Wikipedia or other sister project? It'd be awesome if you could help by expanding the pages linked in Q11985372 so we get a better overview and have showcases. Thanks a lot! --Lydia Pintscher (WMDE) (talk) 10:45, 25 April 2014 (UTC)

On it.wiki (probably not complete): Template:Authority control (Q3907614), Template:Coord (Q6294369), Template:Find a Grave (Q6203793), Template:IMDb title (Q5640735) only check, Template:Infobox software (Q5621231)(logo image (P154)-developer (P178)-software version identifier (P348) with qualifier publication date (P577)-official website (P856)), Template:Infobox Metro station (Q6508423) (for coordinate) --ValterVB (talk) 13:07, 25 April 2014 (UTC)
On it.wikisource: s:it:Modulo:Autore uses sitelinks when not supplied locally, and sex or gender (P21) only from Wikidata, while s:it:Modulo:Controllo di autorità makes exclusive use of authority control codes of ours! --Ricordisamoa 14:11, 25 April 2014 (UTC)
Nice! Thanks. Do you maybe also have links to articles where they are used that I can give to a journalist as a showcase? --Lydia Pintscher (WMDE) (talk) 14:34, 25 April 2014 (UTC)
Lua infobox, see w:fr:Module:InfoboxBuilder/Composé chimique using only Wikidata as source and comparison of the result with the current Wiki infobox w:fr:Modèle:Infobox Chimie in w:fr:Undéc-1-ène. Snipre (talk) 15:31, 25 April 2014 (UTC)
Just for example, it:Rita Levi-Montalcini uses VIAF ID (P214), Library of Congress authority ID (P244), SBN author ID (P396) and Find a Grave memorial ID (P535) from Wikidata, while it:Stellarium uses logo image (P154), developer (P178), copyright license (P275), official website (P856) and software version identifier (P348) (with publication date (P577) as qualifier). Some articles about populated places could include geo-coordinates, but actual migration of non-trivial data is currently under discussion by both communities, given the importance of reliable sources. --Ricordisamoa 15:52, 25 April 2014 (UTC)
We really need to have a unified version of Module:Wikidata (Q12069631) across Wikimedia projects. it:Modulo:Wikidata seems to be much more advanced than en:Module:Wikidata but right now the module would have to be updated across every Wikipedia each time a change is made, and even that is not possible unless conflicts between the local versions are corrected.--Underlying lk (talk) 18:20, 25 April 2014 (UTC)
bugzilla:50329 --Ricordisamoa 00:26, 26 April 2014 (UTC)
You should suggest to the English WikiProject Video Games to implement this for some of the parameters in their infobox! --Nicereddy (talk) 00:35, 26 April 2014 (UTC)
bugzilla:50329 was reported on 2013-06-27. Is it taking so long because it is complicated to solve, or because it is not considered a priority?--Underlying lk (talk) 03:14, 26 April 2014 (UTC)
It's not a priority, because it's too important :) and complex. Don't expect to see this fixed within the next decade, or at least not by WMF. --Nemo 07:27, 26 April 2014 (UTC)
For cswiki the template using wd most heavily is w:cs:Šablona:Infobox rabín (check its code), but there is also w:Šablona:Commonscat, using only one property, but for most of its 103,436 usages. --Jklamo (talk) 06:40, 26 April 2014 (UTC)
And the video game infobox in cswiki, where 19/28 of the data may come from WD. My plan is to use this infobox as a model for all infoboxes modified from now on. Our Wikidata module has also been edited "our" way. Matěj Suchánek (talk) 07:15, 26 April 2014 (UTC)
The usage of Wikidata is still limited to some edge cases/pioneer templates, sadly. I have no idea why it's taking so long to import truly crucial data from Wikipedias; can someone explain it to me? For instance it.wiki would immediately adopt Wikidata for its 250 thousand biographies (w:it:Template:Bio) if the data were brought here, but Wikidata:Bot requests#Persons and it.wikipedia sees no action (other than careful property mapping by me, ValterB and others). --Nemo 07:27, 26 April 2014 (UTC)
@Nemo_bis: WP is not a source for WP: if you import the source along with the data from WP, OK, but in other cases that should be avoided. If WP:it wants unverified data in its infoboxes, that is its choice, but other WPs want to be able to select data, especially when there are several values for the same property. Snipre (talk) 10:37, 26 April 2014 (UTC)
I'm not discussing philosophy (again); that got boring in two years of talking. The facts are that Wikidata is useless as a source for the Italian Wikipedia {{bio}}, that is, useless for 1/4 of our Wikipedia (the template controls all the crucial parts of a biography article).
We survived 5 years without the template and then 8 years (and counting) with the template filled locally; we're not going to die any time soon if we can't move this activity to Wikidata. It's not mandatory for Wikidata to be important, i.e. to host crucial activities which were so far scattered on local wikis. So, no need to convince me of that; everyone can have their opinions, but let's get the facts straight first. --Nemo 08:00, 27 April 2014 (UTC)
@Snipre:, Joseph Stalin (Q855), date of birth (P569), source: en.wiki. Fail. --Cpaolo79 (talk) 09:02, 29 April 2014 (UTC)


I believe one of the reasons it is taking so long to build infoboxes that use Wikidata is that so many needed properties don't exist yet. To get WP communities interested in using such templates and in feeding and updating Wikidata, there is a need for showcase templates that provide the same information as the older templates. If editors need to fill some parameters in the local article's template and also fill some properties in Wikidata to populate the infobox, no one will care. The easy way for every WP user is to keep doing what he/she already knows how to do in one place: fill a template.

There are so many important properties waiting for creation, waiting for the number datatype. For example, I build a Wikidata infobox for locations, where area and altitude properties are really needed. Height is needed for athletes. This information exists in the current infobox. An infobox holding most of the information in Wikidata and only these as template parameters is inconvenient and will not have full support from Wikipedia users.

Let's face it: the usage of infoboxes is what will bring proper sourcing for data, and not the other way around. Nothing will have proper sourcing if the data is not used by the crowd that is capable of doing this. Build the way for Wikidata to be useful and it will be used. -geraki talk 09:43, 30 April 2014 (UTC)

I agree that the missing datatypes are a blocker (as is random access to items), but there is so much data that could be imported and replaced on Wikipedia (like person infoboxes). If any Wikipedia fears using "unsourced data", then as a first step import the data and let each Wikipedia use only the data that was imported from its own site. Later on we can start thinking about tools to compare different information and use ranks to prefer sourced, reliable data.--Micru (talk) 10:03, 30 April 2014 (UTC)

I added a tool (Crotos) on this page, but I don't see it when I try to translate this page. What's wrong? Pyb (talk) 06:32, 30 April 2014 (UTC)

Only translation administrators can register new messages. What you can do is add a new message in <translate> tags without any T:XX id, which will mark this page as outdated here, and then wait, or ask for translation administrator permissions. --Lockal (talk) 07:56, 30 April 2014 (UTC)
✓ Done Translation administrators regularly check the pages to update. For this type of request, you can also use the Translators' noticeboard. --β16 - (talk) 08:34, 30 April 2014 (UTC)
ok, thanks Pyb (talk) 10:06, 30 April 2014 (UTC)