Wikidata:Project chat/Archive/2020/05


Q36930430 and Machowicz

This originated with a wrong spelling of a category for Richard Machowicz (Q2150152) on Commons. I have fixed it on Commons. There are still dozens of language labels to be corrected in the WD item. Do I have to change them by hand, or is there some other way? Kpjas (talk) 06:38, 1 May 2020 (UTC)

I think it's better to create a new one and delete the above artefact. --- Jura 06:47, 1 May 2020 (UTC)
- Some more pseudo-Scottish surnames: Macha (Q37572368) (Slavic surname Macha (Q60524884)?), Q37277683 (Polish and other Slavic languages surname Machnik (Q65490929)), Q36902648 (like Polish surname Maciąg (Q27883518)), Q36960469 (like Anna Maciołek (Q73041914)?), Q37473598 (Macura (Q60482576)) and Q37289908 (Macurak). Kpjas (talk) 09:55, 1 May 2020 (UTC)
  - There may be some duplicates, like Macha (Q60524884). Sorry, you spotted that already, Macha (Q37572368) can probably be merged with it. Ghouston (talk) 10:18, 1 May 2020 (UTC)
    - As the spelling is different, I don't think they should be repurposed or merged, but deleted. Did that. --- Jura 10:39, 1 May 2020 (UTC)
I emptied the labels / descs / aliases on Q36930430, so that they can be repopulated with the right version as desired. Ghouston (talk) 10:05, 1 May 2020 (UTC)

Empty mass and Gross mass

Hi, Is there a proper way to define the empty mass and the gross mass of, for instance, a car? I could not find proper qualifiers for the mass property. Thanks. Sovxx (talk) 10:35, 1 May 2020 (UTC)

How do we keep bot owners from importing the same bad data over and over

How do we keep bot owners from importing the same bad data over and over?

For example, there are multiple instances on Virgil (Q1398) where the bot BotMultichillT:

- imports a large batch of bad data
- which is then removed
- then imported again a few months later

How are we ever going to make progress as editors cleaning up bad data if the bot owners keep putting the bad data back in? --EncycloPetey (talk) 14:29, 26 March 2020 (UTC)

See Help:Deprecation. You keep the entry but mark it as a false claim. The entry can't be reinserted as it already exists but no one pays attention to it because it is recorded as false. From Hill To Shore (talk) 15:20, 26 March 2020 (UTC)
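The deprecation approach described above can be illustrated with a minimal sketch. It assumes a simplified in-memory model of statements with a rank field; the `Statement`, `Item`, `import_value` and `visible_values` names are hypothetical and are not the Wikibase API.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    value: str
    rank: str = "normal"  # "preferred", "normal" or "deprecated"


@dataclass
class Item:
    statements: list = field(default_factory=list)


def import_value(item: Item, value: str) -> bool:
    """Add value unless a statement with it already exists at any rank."""
    if any(s.value == value for s in item.statements):
        return False  # present (possibly deprecated), so nothing is re-added
    item.statements.append(Statement(value))
    return True


def visible_values(item: Item) -> list:
    """Values a data consumer would normally use: everything not deprecated."""
    return [s.value for s in item.statements if s.rank != "deprecated"]
```

In this model a deprecated statement stays on the item, so a second import of the same value becomes a no-op, while consumers that skip deprecated ranks never see it.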

It's not possible to mark aliases that way. --EncycloPetey (talk) 15:32, 26 March 2020 (UTC)

In the specific example you link above where the bot is inserting several different language labels into the English fields, you need to flag it up to the bot operator. Either the source that the bot is using needs to be put on a black list or the logic of where the bot is inserting the data needs to be improved. From Hill To Shore (talk) 15:28, 26 March 2020 (UTC)

Several people have brought this and similar edits to the bot owner's attention multiple times. There are at least two active threads about ULAN data (from three editors) on his talk page right now. --EncycloPetey (talk) 15:32, 26 March 2020 (UTC)

The first step is to ping the bot owner to see if they will engage here. @Multichill:. If the bot owner doesn't respond to multiple requests on their talk page or pings to related discussions then you can flag it to administrators to intervene. What often happens is that we find the bot owner is busy and had either not picked up the earlier messages or misunderstood the implications. Most issues can be resolved once the bot operator engages in the discussion. From Hill To Shore (talk) 16:21, 26 March 2020 (UTC)

@From Hill To Shore: The bot owner has replied below. He refuses to change his bot and is accusing me of vandalism. He has reverted me [1] and claims all the data is valid as English aliases, contrary to my explanations below, simply because they are in another database. --EncycloPetey (talk) 16:51, 26 March 2020 (UTC)

"Bad" doesn't really say much. Sometimes it happens that bots have code errors or do mis-mappings and the result should be different or added elsewhere. This means that the bot's code has a design issue and should be blocked.

Here this doesn't seem true. I think we discussed these aliases here before and concluded that it's a good idea to add them. Why do you keep deleting them?

It's a common feature of people born before spellings of names were standardized that there isn't just one that was used to refer to them. Bear in mind that Wikidata is not Wikipedia nor an encyclopedia.

If you think some are problematic, you could add them as statements with the given reference, deprecated rank, and a reason for deprecation. Such cleanup would be most helpful, merely suppressing referenced data is not. --- Jura 15:34, 26 March 2020 (UTC)

"Publiusz Wergiliusz Maro" is not English; it's Polish.

"Publio Virgilio Maron" is not English; it's French.

"Vergil." is not an alias; it's Vergil with a period added.

You think that "... Virgil" is a valid alias? Why? Why would we need an alias with preceding ellipsis?

In short, I don't think you properly looked at the list of added aliases. I don't see how adding 68 statements about deprecation is a good solution to the problem of alias cruft. --EncycloPetey (talk) 15:41, 26 March 2020 (UTC)

I think it's bad that "P. Virgil Maro" is missing. --- Jura 15:44, 26 March 2020 (UTC)
Can you point to examples of Romans whose praenomen has been abbreviated that way for an English alias? I would expect such abbreviations only in the Latin alias. --EncycloPetey (talk) 15:55, 26 March 2020 (UTC)
- The question is if it would appear in English texts and one would want to search for it on Wikidata. Other information can be included in the statements. --- Jura 16:26, 26 March 2020 (UTC)
  No, the question is whether it is a valid alias in that language. I can find French words and French quotations in English texts. That doesn't make them English. --EncycloPetey (talk) 16:53, 26 March 2020 (UTC)

I have no intention of changing the behavior of the bot because the Union List of Artist Names (ULAN) considers these valid aliases. You should not be removing these aliases. See https://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500337098 to see these aliases are all individually sourced. Removing these aliases borders on vandalism. Multichill (talk) 16:41, 26 March 2020 (UTC)

Removing bad data is not vandalism; it improves the quality of Wikidata content. Why are you refusing to alter the behavior of your bot? --EncycloPetey (talk) 16:51, 26 March 2020 (UTC)

In your opinion it's bad data. This opinion is not backed by any sources. In my opinion these are valid and useful aliases and my opinion is backed by various sources. Shouting generally doesn't improve your point. Multichill (talk) 17:03, 26 March 2020 (UTC)

In your opinion, does "Vernacular" mean "English"? Your bot edits indicate that you think so. The data at ULAN you are adding is marked as Vernacular, but your bot is dumping the entire lot of it into the English aliases field, regardless of what language the data is in. The ULAN data does not indicate its language; it is therefore inappropriate to repeatedly claim that it is English. I have not "shouted" anywhere; I have bolded the key lines of my discussion for the ease of readers who do not wish to slog through all of the discussion. Emphasizing is not shouting. --EncycloPetey (talk) 17:45, 26 March 2020 (UTC)

You are asserting that the ULAN aliases are perfect. But this is clearly not the case. Once imported, if those aliases are improved, the improvements should persist. It would appear that bot-imported aliases, plus improvements by other Wikidata editors, are superior to the original ULAN aliases, and so the original ULAN aliases should not take precedence. —Scs (talk) 22:49, 26 March 2020 (UTC)

That's not my field, but in general: the fact that some organisation considers something valid does not mean that something is valid. There are many examples in chemistry databases where there are names that are obviously incorrect for an alias in WD (e.g. broader/narrower/related concepts that should have or already have different item in WD). Wostr (talk) 16:50, 26 March 2020 (UTC)

@Multichill, call me a vandal if you dare, will you? — Mike Novikoff 14:20, 31 March 2020 (UTC)

It is indeed a problem, I see a lot of incorrect aliases of chemical compounds that are a result of automatic imports (incl. aliases in different languages matched to English, aliases that are names of broader/narrower/related concepts, aliases with capital letters despite the fact that the same alias is already present without capitals). Sometimes such erroneous aliases are propagated to other languages (eg. English aliases are copied to Welsh, British English etc.) However, I don't think there is a universal solution for this — many of these errors are caused by imperfections in imported databases — but I think that every mass import should be discussed before it starts (at least a month earlier) in relevant WikiProject or in other place. This could reduce the number of errors. Wostr (talk) 16:47, 26 March 2020 (UTC)

- I think there was some gene bot that messed up a lot of aliases. I think they were mostly fixed. Also, importing redirects from Wikipedia as aliases, as done for some languages, is known to be problematic. None of this is relevant to the referenced additions by the bot above. --- Jura 17:07, 26 March 2020 (UTC)
What is the primary purpose of aliases? To me they are simply a way to improve search results. If a variant of a name is likely to be used for search purposes, then it is useful to have that as an alternate label. The more aliases the better, generally. Is there some other use for aliases of which I am unaware, that is damaged by having too many? ArthurPSmith (talk) 17:29, 26 March 2020 (UTC)
But should Polish, French, and German aliases appear in the English alias field? Should aliases that simply add a period after the name be added? --EncycloPetey (talk) 17:52, 26 March 2020 (UTC)

I'd say that enabling searching is one of two equally-important purposes of aliases. But "enabling searching" does not necessarily require explicitly including every possible misspelling and abbreviation and punctuation variation. —Scs (talk) 12:09, 27 March 2020 (UTC)

I'm guessing the problem here is that the Union List of Artist Names does not tag aliases by language, but of course Wikidata does. So yes, the ULAN aliases are all valid; what's invalid is importing them into Wikidata and tagging them as "en". So we need to figure out a way of augmenting the ULAN aliases with language mappings for proper importing, or else find a way to import them into Wikidata either without a language tag, or with some kind of "unknown" or "unspecified" language tag. Or -- here's another idea -- instead of deleting the "bad" aliases, re-tag them with better languages, and then teach the bot not to re-import an alias if it's present under any language tag. —Scs (talk) 17:49, 26 March 2020 (UTC)
That's part of the problem. The ULAN database also has aliases that are the same as other aliases, but with punctuation added, such as the name followed by a period, or the name preceded by ellipses. --EncycloPetey (talk) 17:50, 26 March 2020 (UTC)

I was about to say, the bot should be filtering those out, too, because in this case they're clearly unnecessary. The tricky part is that there are other cases where punctuation can be more significant. So it's not immediately obvious what a bot's rules should be for when to strip insignificant punctuation, or which punctuation is insignificant.

In any case, it seems we do need a better consensus on bot activity. This is the second time in as many weeks we've had complaints here about bots importing poor-quality data. It seems to me that in such situations bot operators should not just be falling back on the defense of "the bot is fine, and the data is fine, stop complaining". —Scs (talk) 18:00, 26 March 2020 (UTC)

This issue was raised back in 2018 (part1, part2), where I noted that many of the aliases being added by BotMultichillT from ULAN simply do not conform to our alias policy. The issue of adding alternative names marked as "vernacular" or "undetermined language" as English aliases was called out, as was the fact that many ULAN aliases are simply related concepts. The botop was asked to make the bot respect the work of editors removing these incorrect aliases, but apparently nothing was done. Bovlb (talk) 18:19, 26 March 2020 (UTC)

Can we block the bot until the problem is fixed? It's added the bad data in yet again since this discussion started. --EncycloPetey (talk) 20:15, 26 March 2020 (UTC)

So, having spent a lot of time in editing items about ancient Greek and Latin literature, I agree with many opinions expressed above, namely by @EncycloPetey, Scs, Bovlb:: it's true that ULAN contains a lot of aliases whose language is not stated; consequently, all these names exist, but it's wrong to add them indiscriminately as English aliases, because most of them aren't used in English sources, but in sources written in other languages. I have myself removed a lot of such aliases in the past years and months (this today). Since the problem has already been discussed (as stated by Bovlb), I think the bot should stop additions - or be blocked - until a consensus is reached; my suggestion is that the bot should at least never add an English alias when it is present in at least one label or alias other than English (of course there are a few exceptions, e.g. Publius Vergilius Maro can be a valid label/alias both in English and in Latin and in other languages, but it's better to manage those cases manually). --Epìdosis 20:34, 26 March 2020 (UTC)
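The rule suggested above (skip an English alias when the same string already appears as a label or alias under another language) could be sketched roughly as follows. The `terms_by_language` mapping is an assumed, simplified stand-in for an item's labels and aliases, and the function name is hypothetical.

```python
def should_add_english_alias(candidate, terms_by_language):
    """Return True only if candidate is absent from every language's terms.

    terms_by_language maps a language code to the set of labels and aliases
    already recorded for that language (a simplified model, not the real
    entity JSON).
    """
    for terms in terms_by_language.values():
        if candidate in terms:
            # Already present somewhere (English or not): do not add to "en".
            return False
    return True
```

Exact string matching is used here for simplicity. The exceptions mentioned above, such as a name that is valid in both English and Latin, would still need manual handling.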

Thanks for those links, @Bovlb:. At the risk of reopening that debate (and at the further risk of seeming to criticize the unsung work of bot operators, which is certainly not my intent), I have to point something out.

It's claimed that these "extra" aliases, the ones that some here are complaining about and trying to trim down, are important for searching. That might be true if we had a really dumb search engine, but we don't: the Mediawiki search engine(s) is/are pretty good. If you were to search for, say, "P. Vergilius Maro", and if Q1398 did not have an alias with that exact spelling, your search would still find Q1398 perfectly easily. (I tested this hypothesis by searching for "Q. Vergilius Maro". Similarly if you search for "Mantoano Virgilio" even though the closest explicitly-listed alias is "Virgilio Mantoano".) So there may be a reason for preserving the full breadth of these "extra" aliases, but enabling better searching isn't it. —Scs (talk) 11:31, 27 March 2020 (UTC)

Easily with Special:Search, maybe, provided that the relevant string property is indexed, but most searching at Wikidata is done with the entity selector. If it hasn't happened in 8 years and it's still not possible otherwise by now, it's unlikely ever to happen.

Anyway, it's still not stated why it's a problem to have more than 2 aliases for an item. What are you trying to do with them? Maybe the use case for not having them should be stated. --- Jura 11:51, 27 March 2020 (UTC)

Some of the points I have seen for retaining this data dump of aliases under the English tag suggest that the field is used solely for searching on Wikidata. This is incorrect. The aliases are reused elsewhere, such as in the Commons Creator template, providing a way for users of many projects to interact with our content. By dumping multiple language alias data against the English entry, we present English users with a nonsensical list of names (many of which can't even be read). Of a more serious and damaging nature though, we are hiding the native labels from non-English readers, because the content is against the wrong filter. From Hill To Shore (talk) 13:00, 27 March 2020 (UTC)
- Can you provide us a sample from Commons creator template you see as problematic? --- Jura 13:12, 27 March 2020 (UTC)
  - I can't give you a specific example of a creator that has been affected by this problem, as the scale of the issue is unclear. If bots and human editors are edit warring over this, there may be no specific cases that have had their data read by Commons. However, as a hypothetical example, see Commons:Creator:Rowland Langmaid. This shows a number of aliases if you are viewing Commons in English but will show different aliases if you are set to a different language. From Hill To Shore (talk) 13:30, 27 March 2020 (UTC)
It's a problem to have supposed aliases that aren't really aliases because someone writing something is liable to feel free to choose one as a stylistic matter, and end up writing something inappropriate. - Jmabel (talk) 15:37, 27 March 2020 (UTC)

I looked at some cases people are not happy about the bot importing many aliases (Virgil (Q1398), Homer (Q6691), Jerome (Q44248) & Dante Alighieri (Q1067)). The common denominator is that these people don't seem to be a (subclass of) visual artist (Q3391743). I updated the query to only work on visual artists. Is that a good compromise? Multichill (talk) 18:18, 27 March 2020 (UTC)

This update will limit the problems, but this is not a generic solution. --NicoScribe (talk) 18:40, 27 March 2020 (UTC)

Moreover do you plan to remove the incorrect values that have already been imported? --NicoScribe (talk) 15:52, 31 March 2020 (UTC)

The problem (the import of "aliases that aren't really aliases" by the bots of Multichill) has multiple variants. Some variants can be found in old discussions (for instance, in chronological order, 2015, 2016, 2017, 2017, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020). In one of these discussions, in June 2019, I have asked to avoid the import of foreign common expressions into English aliases (for instance the Italian "detto", the German "genannt", the French "surnommé", which all mean "called"). --NicoScribe (talk) 18:40, 27 March 2020 (UTC)

This is not a problem of a bot, but a different understanding of what we should/should not be allowed as alias between one user and one bot-operator. Blaming the bot-script is not going to solve this. Starting a discussion with the bot-owner might either solve the issue, or end as agree to disagree. Edoderoo (talk) 10:29, 30 March 2020 (UTC)

What do you mean by "one user"? There's a lot of users complaining for more than four years already. Could it for once end as enough is enough? — Mike Novikoff 15:33, 30 March 2020 (UTC)

@Edoderoo: I am not really blaming the bots of Multichill. They are doing what their operator wants. Moreover "The contributions of a bot account remain the responsibility of its operator [...] In the case of any damage caused by a bot, the bot operator is asked to stop the bot. [...] The bot operator is responsible for cleaning up any damage caused by the bot" per Wikidata:Bots policy.

What do you mean by "one user"? Look at the old discussions and the current one: this is not "between one user and one bot-operator", this is "between 25 users and one bot-operator". Starting the old discussions with the bot-operator did not solve all the issues. --NicoScribe (talk) 15:39, 31 March 2020 (UTC)

"What do you mean by one user?" Counter question: why is the subject "bot owners" in the plural? When the problem is with only one bot owner, it is presented as a generic problem between bot owners and the smart people. Discussing with such pre-positioning does not offer any solution. Edoderoo (talk) 16:11, 31 March 2020 (UTC)

@Edoderoo: yes, this discussion's title should include the words one bot owner instead of bot owners. So, what is your solution when 25 users disagree with one bot-operator? --NicoScribe (talk) 16:26, 31 March 2020 (UTC)

@Edoderoo: what is your solution when 25 users disagree with one bot-operator, please? If your solution is to talk, that is exactly what we are trying to do here. --NicoScribe (talk) 14:37, 4 April 2020 (UTC)

I guess our next move should be to ask for a block of BotMultichillT at AN. It clearly goes against the consensus, yet the bot owner says "I *am* right, period". Some other admin will put *another* period there, won't he? Personally, I'd suggest forbidding the use of ULAN at all for any of Multichill's bots. Four years were more than enough to show his mighty skills to filter out junk from crap, let's strike a balance now. And I sincerely hope that WD is not like ruwiki (which I had left more than a year ago) where any user with admin flag will do whatever he wants and nothing will ever stop him. — Mike Novikoff 21:44, 7 April 2020 (UTC)

Multichill already "filter out the non-latin strings" so I do not understand why Multichill would be unable to filter out some keywords, such as "called", "dit", "detto", "genannt", "surnommé", "plus connu sous le nom de", "eigentlich", "known as". --NicoScribe (talk) 15:52, 31 March 2020 (UTC)
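A keyword filter of the kind asked for above might look like the sketch below. The keyword list is taken from the comment; the function name and the sample strings in the usage note are hypothetical, and whole-word matching is an assumption about what would be acceptable.

```python
import re

# Connective expressions quoted in the discussion; they signal that the
# string is a phrase ("X called Y") rather than a plain alias.
KEYWORDS = [
    "called", "dit", "detto", "genannt", "surnommé",
    "plus connu sous le nom de", "eigentlich", "known as",
]

# Longest keywords first so multi-word phrases are tried before short words.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(
        re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True)
    ) + r")\b",
    re.IGNORECASE,
)


def has_connective_keyword(alias: str) -> bool:
    """True if the alias contains one of the keywords as a whole word."""
    return _PATTERN.search(alias) is not None
```

For instance, `has_connective_keyword("Giovanni detto il Rosso")` is true for this made-up string, while a name that merely contains the letters "dit" inside a longer word is not flagged. A short keyword such as "dit" could still produce false positives on real names, so flagged aliases would probably need review rather than automatic rejection.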
Note also the Wikidata v. Renon case. Looks like utter madness, doesn't it? I can't readily propose an algorithm to filter out such things, and moreover I don't suppose it's my burden. What does it mean in practice? Does it mean that Renon should persist for another five or ten years? Just imagine all the people... no, alas, just imagine how globally this "Renon" will be replicated all across the Universe... well, across the Internet by then. — Mike Novikoff 18:25, 31 March 2020 (UTC)

I wrote to the Getty (the Vocabulary Program chief editor and 2 IT people): There's a heated discussion about Multichill (the author of the Sum of All Paintings project) importing ULAN names and aliases indiscriminately. The example being discussed is Virgil. Two problems are pointed out:
- 1. Many aliases are a duplicate of another, with just a trailing dot added. Sometimes the dot indicates an abbreviation, but not always. Even an example with leading ellipsis is pointed out. In some cases the dots are required to show abbreviation (e.g. "Pub.V.M." and "P.V.M.") but in other cases they are parasitic (e.g. "Wergiliusz" vs "Wergiliusz."). Since the dots are not useful for searching, can the Getty fix some of these problems?
- 2. There is no language tag, so his dumping of all aliases in Wikidata as EN is incorrect. But I don't see an easy way for the Getty to fix this... --Vladimir Alexiev (talk) 06:21, 13 April 2020 (UTC)
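The first problem in the list above (punctuation-only duplicates) could be handled on the importing side with a conservative rule: drop a variant only when the same string without the extra punctuation is also in the list. The following is a sketch under that assumption; the function name is hypothetical.

```python
import re

# A leading "..." or "…" followed by optional whitespace.
_LEADING_ELLIPSIS = re.compile(r"^\s*(?:\.{3}|…)\s*")


def drop_punctuation_duplicates(aliases):
    """Remove aliases that only add a leading ellipsis or a trailing period
    to another alias in the same list. Order of the rest is preserved."""
    present = set(aliases)
    kept = []
    for alias in aliases:
        stripped = _LEADING_ELLIPSIS.sub("", alias)
        if stripped != alias and stripped in present:
            continue  # e.g. "... Virgil" when "Virgil" is present
        if alias.endswith(".") and alias[:-1] in present:
            continue  # e.g. "Wergiliusz." when "Wergiliusz" is present
        kept.append(alias)
    return kept
```

With this rule, "Pub.V.M." and "P.V.M." survive because no dot-less twin exists in the list, while "Wergiliusz." is dropped when "Wergiliusz" is present. It does not try to decide whether a lone trailing dot marks an abbreviation.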

If this set of aliases is generally good (that is, if it's worth bulk-importing at all), but if it contains these anomalies that can't readily be automatically filtered out or properly tagged, one solution would be to let the volunteers here clean them up (as indeed some volunteers have been trying to do). That would work if the bot can be persuaded not to re-import the same data over and over. —Scs (talk) 11:13, 13 April 2020 (UTC)

If I were running this bot, I would alter its current algorithm:

    for each alias a in external database D
        if a is not present on the relevant entity e in Wikidata
            add a to entity e

and change it to:

    for each alias a in external database D
        if a is not in auxiliary list X and is not present on the relevant entity e in Wikidata
            add a to entity e
            add a to auxiliary list X

—Scs (talk) 12:17, 14 April 2020 (UTC)
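The modified algorithm above can be written out as a small runnable sketch. It assumes plain Python sets stand in for the entity's aliases and for auxiliary list X; a real bot would need persistent storage for X, and the function name is hypothetical.

```python
def sync_aliases(external_aliases, entity_aliases, already_offered):
    """Add each external alias at most once over the bot's lifetime.

    external_aliases: iterable of aliases from external database D
    entity_aliases:   set of aliases currently on entity e (mutated)
    already_offered:  auxiliary set X of aliases this bot has added (mutated)
    Returns the list of aliases added in this run.
    """
    added = []
    for alias in external_aliases:
        if alias in already_offered or alias in entity_aliases:
            continue
        entity_aliases.add(alias)
        already_offered.add(alias)
        added.append(alias)
    return added
```

If an editor removes an alias between two runs, the second run finds it in X and leaves the removal alone. As in the pseudocode, aliases imported before X existed are not in X, so each of those could be re-added once unless X is seeded from earlier imports.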

@scs: Such a change would resolve the issue of bots edit-warring with human editors but, as ghuron noted in a different discussion, "I need to create infrastructure that can cache hundreds of millions of individual statements. This task alone is significantly more complex than the rest of the bot code." Bovlb (talk) 16:50, 14 April 2020 (UTC)

@Bovlb: That's a good point; thanks for reminding me.

The difference here is that (a) I don't get the impression there are hundreds of millions of ULAN aliases being imported, and (b) aliases don't have rank, so the people who would like to see the aliases cleaned up have no choice but to try to convince the bot operator to change its algorithm.

(If this problem persists, I'm thinking that the people who would like to see the aliases cleaned up are going to have no choice but to seriously propose adding a whole new ranking/deprecation mechanism for aliases.)

The other question I would ask is whether the ULAN aliases are so vital a dataset that a 100% synchronization process between them and Wikidata has to be ongoing. If not, it seems like it would be enough to make one pass through the importation process, then call it done and move on to some other database to import. Or (since it's true that new people worthy of inclusion are being born and identified every day) there could be a simple mechanism to limit the continuous import to process only people newly added to ULAN. (That would be simple, without requiring an entire "auxiliary list X", if it happens to be the case that ULAN's IDs, like Wikidata's Q-ids, are monotonic.) —Scs (talk) 11:39, 15 April 2020 (UTC)
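The "only newly added people" idea in the paragraph above amounts to keeping a high-water mark. The sketch below assumes, as that paragraph itself does, that source IDs only ever increase; the record shape and function name are hypothetical.

```python
def select_new_records(records, last_seen_id):
    """Return records with an ID above the stored mark, plus the new mark.

    records:      iterable of dicts with an integer "id" key (assumed shape)
    last_seen_id: highest ID processed by any earlier run
    """
    fresh = [r for r in records if r["id"] > last_seen_id]
    new_mark = max((r["id"] for r in fresh), default=last_seen_id)
    return fresh, new_mark
```

Only a single integer has to be stored between runs, which is the appeal over a full auxiliary list. It does nothing for records that existed before but gained new aliases later.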

Abandoned the ULAN alias bot

No way to please people and I'm done with the unconstructive comments, nasty remarks and demotivating comments. Zero motivation to spend any more time on this so I'm abandoning it. I hope this makes you all very happy and proud of what you achieved. Multichill (talk) 18:54, 16 April 2020 (UTC)

I don't know if you're referring to me, but no, of course it doesn't make me happy. I don't like to see useful functionality discarded, and I don't like to see anyone as angry as you obviously are. But I really don't think much that was said here was nasty or unconstructive -- I at least was trying very hard (perhaps without success) to be constructive. I'm sorry if I failed at that. —Scs (talk) 12:21, 17 April 2020 (UTC)

Perhaps we are supposed to show much woe (like in the story of Juliet and Romeo), but I'll just say "thank you" instead.

And please get it right: thanks are not for all these years, just for now. I hope that we really won't see any of these 'Ko-zyöl Shmo-zёl' or 'kooklo-wod Brat.chuk' anymore.

And seriously, thank you for seeing the reason. An impossible situation in the fscking ruwiki. Please don't get offended, it's just that we are all here to improve the quality of the data rather than to be proud of such things like a "number of edits" or "I had written a script" or whatever. Remember the CC0 at last. — Mike Novikoff 01:11, 22 April 2020 (UTC)

Suggest to desysop Multichill based on this topic. --117.136.54.124 20:51, 23 April 2020 (UTC)

If you are going to make this personal about him, maybe you should have at least the courage to identify yourself. - Jmabel (talk) 23:43, 23 April 2020 (UTC)

I suppose I should explain the link to the next topic here, as some user who seems primarily to revert other people keeps changing the section header. Above we saw a series of edits showing how users reverted the same edit by the bot several times without attempting to address the matter in a way that could allow us to expand Wikidata's base. Clearly it's an aspect of the story we need to attempt to address. --- Jura 18:51, 27 April 2020 (UTC)

MGIMO finished? — Mike Novikoff 19:29, 27 April 2020 (UTC)

@Jura1: I'm not sure what you mean by "allow us to expand Wikidata's base", and I'm not sure what you would expect those aggrieved editors to do.

The impression I get is that concerns about the quality of some of those bot imports have been raised, many times, and have been repeatedly dismissed. The consensus seems to be that what the bots are doing is fine, and that those (few?) people who are concerned about them should find something else to worry about. (More on this below.) —Scs (talk) 13:52, 1 May 2020 (UTC)

If anything, I'd support desysopping a user who had done THAT much harm to the project, who had been doing it persistently for years (just imagine where any IP or new user would be if they had done one hundredth of this!), and hasn't even suggested rolling back these edits himself, leaving the task as an exercise for ordinary rollbackers like me. I'm sure I haven't even seen most of these yet, so there's a lot of room for future uncivil undo summaries. :( — Mike Novikoff 19:29, 27 April 2020 (UTC)

Multichill, what did you mean at 06:28, 23 April 2020?!! — Mike Novikoff 22:44, 27 April 2020 (UTC)

@Mike Novikoff: I'm not Multichill, but based on the arguments that have been taking place here I believe I can summarize the argument for including that alias:

The ULAN aliases come from a respected database, curated by Getty. Any alias that is good enough for ULAN is good enough for Wikidata. The individuals who claim that some few of those aliases are somehow "bad" have adduced no objective proofs for their claims, or at any rate, no proofs which outweigh the demonstrated value of the ULAN aliases as a whole. It's the bot's job to make sure that any alias in ULAN is also in Wikidata, so if someone manually removes one of these aliases from Wikidata, the bot is only doing its job by putting it back.

I'm not saying I agree with this argument (I don't), but I believe that's the argument. If I have misrepresented it I'm happy to be corrected. —Scs (talk) 13:52, 1 May 2020 (UTC)

But no: the ULAN alias bot has - very unfortunately - not been abandoned at all, and persists, apparently on a weekly basis, in including as aliases anything considered as such by ULAN, like here advertisements for enamels, as here on April 16, here on April 23, and here today. I assume next time will be on May 7? Sapphorain (talk) 08:00, 30 April 2020 (UTC)

It seems to me we have an impasse between three factions:

1. We have a few bot operators who interpret their mandate quite broadly: Wikidata should be a proper union of every other database we consider worthy. For a variety of reasons, it is best if our inclusion of another database's items is 100%. Trying to include 99% would mean having to define and remember and apply some definition of the 1% which we wish not to include, and that's not worth it.
2. We have a few editors who have observed that a few pieces of data imported from those external databases are (to put it charitably) of considerably lower quality. These editors feel that they are improving Wikidata by removing that low-quality data. These editors are frustrated when the bots continue to re-import the same data. If the data in question were imported as statements, they could be deprecated in various ways, but the problem is particularly acute for aliases, which have no ranking or qualification mechanism.
3. Finally, the majority of editors -- and this includes the more senior, respected, consensus makers -- seem to, basically, not care. Either they fully support the actions of the bots, or they figure that the bots do so much good overall that it's okay if they import a bit of chaff along with the wheat, or they feel that it's too unseemly to ask an underappreciated bot operator to do even more work by reprogramming the bot to be more selective. Furthermore, this majority is (it seems) tired of the complaints of faction #2, and wishes they would go away.

It's always dangerous to factionalize (Lord knows there's far too much of it in the world today), and I feel bad about presenting these divisions so starkly. I know I've made a lot of unwarranted assumptions, and I know what I've presented is an oversimplification, but I'm doing so to make the point that this is how the situation feels to me, and, I suspect, to the editors in faction #2. Please, someone, explain how I've gotten this wrong, or even better, suggest what you think we ought to do in order to resolve this imbroglio. Should we be asking bot operators to make their bots more selective? Should we be asking the folks in #2 to settle for "good enough" and move on? Should we be asking the wikibase folks to add qualifiers or ranking mechanisms for aliases, analogous to statements? Or is this not worth talking about? —Scs (talk) 13:52, 1 May 2020 (UTC), revised 14:39, 1 May 2020 (UTC)

@Scs: To me the solution is fairly simple: if the ULAN alias isn't tagged with something more precise than vernacular, then the bot shouldn't import it, because it may well not be in English (and in a number of cases it's not in English). That would considerably lower the amount of bad data being imported, since a good many aliases aren't language-refined. Alternatively, and I think you already mentioned it, why should the bot run every single week on every single item? If the source record wasn't updated in the meantime, there is no point in doing so. --Jahl de Vautban (talk) 14:55, 1 May 2020 (UTC)
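Both suggestions above (import only aliases with a specific language tag, and skip records that have not changed since the last run) can be combined in a short sketch. The record layout, the tag values in `UNSPECIFIC` and the function name are assumptions for illustration, not ULAN's actual export format.

```python
from datetime import datetime

# Tags treated as "no usable language information" (assumed values).
UNSPECIFIC = {None, "", "vernacular", "undetermined"}


def aliases_to_import(record, last_run):
    """Group importable aliases by language for one source record.

    record:   dict with "modified" (datetime) and "names", a list of dicts
              with "text" and an optional "language" key (assumed shape)
    last_run: datetime of the previous bot run
    Returns {} when the record is unchanged since last_run.
    """
    if record["modified"] <= last_run:
        return {}
    by_language = {}
    for name in record["names"]:
        language = name.get("language")
        if language in UNSPECIFIC:
            continue  # language unknown: do not guess "en"
        by_language.setdefault(language, []).append(name["text"])
    return by_language
```

Aliases without a specific tag are simply left out rather than guessed as English, and each remaining alias goes under its own language code.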

I never argued that the aliases weren't "good". I argued that the supposedly English-language aliases often are not things that are ever used in the English language, and I stand by that. - Jmabel (talk) 16:02, 1 May 2020 (UTC)

@Scs: Before I begin, I want to clarify that I have never seen myself as part of a faction nor do I wish to be part of one. I try to consider any situation on the merits of the arguments presented and leave room to be swayed in support of a position I originally disagreed with (or perhaps to abstain if I can't support but have been convinced to withdraw an objection). My view on this is that all datasets have a degree of error and there is a need to correct those errors and maintain the data. A bot or script is a great way to import data en masse, but we need to acknowledge that it will also speed up the importation of the errors in the original dataset.

If a bot imports a large body of data in a single run and then switches off, I can see no problem. Any errors imported with the data will be cleaned by human editors over time. However, if we have a bot that is continuing to run with the same orders over a long period of time, it will attempt to overwrite the cleaned data with the errors in the original source dataset. We need bot operators to acknowledge that there are errors in the source data they are importing and come up with a method to stop their bots or scripts from reinserting the same error. Most bot operators I have encountered acknowledge this and make attempts to alter their bot's behaviour when an error or problematic action is pointed out to them. The source of the discussion here appears to have been flagged to the bot operator several times over a period of at least two years. That suggests either the bot operator was unwilling to fix the problem, or the nature of the source dataset made it impossible to implement a fix within the script (I have no knowledge of the previous cases or actions, so I make no judgement either way).

In situations where a bot continues to reinsert bad data over cleaned data, the first action is to fix the bot. If that doesn't work, the next action is to stop the bot. When a bot stops the correction of known errors, it has to be seen as crossing the line from a beneficial tool to a detriment to the project. From Hill To Shore (talk) 16:42, 1 May 2020 (UTC)
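One way a bot could avoid reinserting cleaned-up errors is sketched below. It assumes the bot can obtain, by some means not shown here, the set of values that human editors have already removed from an item; the function name and all three inputs are hypothetical.

```python
from typing import Iterable, List, Tuple


def plan_additions(
    source_values: Iterable[str],
    current_values: Iterable[str],
    removed_by_humans: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Split source values into those to add and those to skip.

    A value is skipped when a human editor has already removed it,
    so the bot does not overwrite the cleanup on its next run.
    """
    current = set(current_values)
    removed = set(removed_by_humans)
    to_add: List[str] = []
    skipped: List[str] = []
    for value in source_values:
        if value in current:
            continue  # already present, nothing to do
        if value in removed:
            skipped.append(value)  # respect the earlier human removal
        else:
            to_add.append(value)
    return to_add, skipped
```

The skipped list could be logged and reported to the source dataset's maintainers, so that the error is eventually fixed upstream as well.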

I've re-read the discussion above and I can't square the debate with SCS's summary. The vast majority of people who have commented have objected to the error and a couple of editors have defended the bot operator's right to import source data regardless of quality. Summarising the debate into opposing camps and stating both are equal looks like a misrepresentation of the consensus. Also, I am unclear on where the third category of "the majority of editors... seem to, basically, not care" has come from. Was this based on the vast majority of editors not choosing to participate in the discussion? If we held all decisions on Wikimedia projects to the burden of participation, we would never reach a consensus. Reading a motive into silence is a logical fallacy. From Hill To Shore (talk) 17:21, 1 May 2020 (UTC)

@From Hill To Shore: Here are the facts, then: (1) users continue to complain about "bad", bot-imported data; (2) the operators of the bots in question continue to not change their bots, and to either not respond to the complaints at all, or state baldly that the bots are fine and they have no intention of changing them; (3) the bots in question continue to not get blocked by administrators. You're right, reading a motive into silence is a fallacy, but the silence here is deafening, and the fact that these bots don't get blocked does, I think, represent a de facto ruling by administrators that the bots are fine.

The conclusion I'm left with is that the users who would delete and/or complain about the bad data are indeed wasting their time. If they delete the data it gets re-inserted; if they complain about the bad data here they may get notes of agreement, but nothing changes. They have no choice, it seems, but to heave a heavy sigh and concede that Wikidata's data quality can't be brought up to their desired level in this regard.

(But I've got to remember that I, too, am wasting my time, and the time of anyone who is reading all this. So far, the only thing I've accomplished is to give the impression that I'm anti-bot, when nothing could be farther from the truth. So I'll try, again, to stop blathering on this topic.) —Scs (talk) 19:22, 1 May 2020 (UTC)

It may be worth drawing up a list of principles for bot editing (we could build on statements in the existing bot processes) and then initiate an RfC to generate a discussion. I'm thinking something along the lines of "bots and scripts are beneficial to wikidata and they allow... however, bots and scripts can sometimes cause problems such as... Bot and script operators should..." The RfC shouldn't be a vote to support or oppose but rather a discussion to talk around the merits of the principles; should we have them, are they the right ones, should they be adopted as a formal guideline or policy and should they be enforced in some way? When does a bot edit switch from being beneficial to being a burden?

As a starting point, I'd advise having further discussions under a more neutral heading. The title at the top of this section implies that bot operators as a group are causing problems, which is not going to set the right tone for a constructive discussion. The subheading is also misleading as we have moved on from talking about an individual bot to a more general discussion of managing situations where a bot operator edits against consensus. From Hill To Shore (talk) 21:20, 1 May 2020 (UTC)

I agree, and I wish someone luck with that. My earlier attempt at defining some bot principles bombed rather spectacularly, as numerous commentators seemed to feel that it treated bot operators much too harshly. —Scs (talk) 23:13, 1 May 2020 (UTC)

Well, all this talk seems in general to be leading nowhere. But. In the particular case of the ULAN alias bot, Multichill has stated that he abandoned it. Could he then kindly actually do it? For real? And not just state he did it? This would at least solve this small part of the problem. Sapphorain (talk) 21:43, 1 May 2020 (UTC)

Introduction into R

Hello,

in the last months I have thought about how I can make clear how I create the descriptions I add. At the moment this is not clear to other users. I gave an introduction in Ulm last year to the way I do it with a spreadsheet. Since then I have learned a bit of R. I don't know much about it, so it would be great to learn something from other users who use it. I know how to do a VLOOKUP-like operation in R, how to filter data, and how to create subsets and write them to a new file. In PAWS, about which you can find more information at Wikidata:PAWS, it is possible to use R. Is there somebody here who uses R to prepare data before uploading and can tell me something about it? --Hogü-456 (talk) 19:39, 28 April 2020 (UTC)

Most probably use Python or Wolfram Mathematica languages. Nevertheless I saw a couple of diagrams built with R. --Infovarius (talk) 19:34, 1 May 2020 (UTC)

Need help with merging

Could someone help me with the merging of two items? (i've never merged items, so i'm a bit hesitant) I did find two wikidata items that both concern the Formula 1 activities of Alfa Romeo (so this is incorrect as i see it).

Therefore i think that this wikidata item, "Alfa Romeo" (Q622489), needs to be merged into Alfa Romeo in Formula One (Q65960697).

The last item is the newer one, however i still think this description "Alfa Romeo in Formula One" is a better/more specific description and avoids confusion with the car brand "Alfa Romeo" (Q26921).

Kind regards Saschaporsche (talk) 07:34, 30 April 2020 (UTC)

Help:Merge explains how to merge items covering the same topic. However, the scope of the two:

seems to be different (which is why they link to each other). --- Jura 07:39, 30 April 2020 (UTC)

Sorry, may be i don't get it, but both items cover the same subject: "Formula 1 activities of the brand Alfa Romeo" so why are they separated? kind regards Saschaporsche (talk) 07:44, 30 April 2020 (UTC)

Looking through the edit history, it does seem odd. I suppose one could conceive having several items, but it's not clear to me why the sitelinks were split merely based on the format of page titles at Wikipedia. --- Jura 07:58, 30 April 2020 (UTC)

So where do we go from here? What "sitelinks" are you talking about? (i'm only a beginner on wikidata). Kind regards Saschaporsche (talk) 08:28, 30 April 2020 (UTC)

Both items link to several Wikipedias (sitelinks). Maybe I should look at each page, but it appears that the difference between what is one item and what is on the other items merely depends on whether that Wikipedia uses a naming convention for their article title that is "Alfa Romeo in Formula One" or "Alfa Romeo (Formula One)". If that is correct, then all the sitelinks should be on the same item. --- Jura 08:34, 30 April 2020 (UTC)

I see this was discussed on English Wikipedia at w:Wikipedia_talk:WikiProject_Formula_One/Archive_53#Wikidata_item_merging. It seems Eurohunter disagrees with the other parties. Stryn (talk) 15:27, 30 April 2020 (UTC)

Thanks for all your comments, i've started to do some work; i moved a couple of sitelinks to Alfa Romeo (Q622489). Now "Alfa Romeo in Formula One (Q65960697)" is empty. Can i now suggest this wikidata item for deletion? Kind regards Saschaporsche (talk) 18:44, 30 April 2020 (UTC)

The problem with the discussion on enwiki is that it isn't really relevant for Wikidata contributors. --- Jura 19:57, 30 April 2020 (UTC)

Wikidata can have an item for the current Alfa Romeo Racing / Alfa Romeo Racing Orlen, founded in 2019, even if the English Wikipedia decides to cover it in a page with all of Alfa Romeo's historical Formula One activities. Ghouston (talk) 02:17, 1 May 2020 (UTC)

They can't be merged; as I said, there can be different articles about Alfa Romeo, like the F1 team (even in different periods), the engine supplier and the whole activity of Alfa Romeo in F1 (team and engine supplier). Eurohunter (talk) 16:45, 1 May 2020 (UTC)

So how can we change things, because now we have two wikidata items that don't quite interconnect in my opinion. Saschaporsche (talk) 19:36, 1 May 2020 (UTC)

How to link to multilingual Wikisource

I would like to link Template:Error (Q5400225) to https://wikisource.org/wiki/Template:Error. However, I do not know what to enter for the language field. I thought perhaps I could use oldwikisource based on w:en:Help:Interwiki linking#Project titles and shortcuts, but was unsuccessful. How can this be accomplished? Daask (talk) 13:44, 1 May 2020 (UTC)

As far as I know it's not yet possible to link to that particular wiki. There's a ticket at phab:T138332 - Nikki (talk) 18:29, 1 May 2020 (UTC)

Bug with symmetrical properties

Tracked in Phabricator
Task T252078

Hello! I have found this bug. You may help with it. Thanks! -Theklan (talk) 21:59, 6 May 2020 (UTC)

This section was archived on a request by: --- Jura 10:06, 7 May 2020 (UTC)

Special:MathWikibase

The above special page displays label, description and formula from an item,

e.g. Special:MathWikibase/Q205692 with Poisson distribution (Q205692).

Does it do anything else? Is it documented somewhere? I don't think it's on w:Help:Displaying_a_formula or mw:Extension:Math. --- Jura 12:11, 27 April 2020 (UTC)

@Jura1: I also don't know where the documentation is; I think I saw it announced on the project chat a few months ago, though I can't find it in the archives. There is at least one more feature, seen in Special:MathWikibase/Q35875: using has part(s) (P527) with quantity symbol (string) (P416) as a qualifier you can have it display components of the formula. ―Vahurzpu (talk) 13:24, 27 April 2020 (UTC)

@Vahurzpu: thanks, I don't recall that either. Interesting feature though. Looks like it's using the incorrect properties though. calculated from (P4934) and in defining formula (P7235) should be used. @Lea_Lacroix_(WMDE): would you know where it was announced and could you arrange for the configuration to be adjusted? --- Jura 13:34, 27 April 2020 (UTC)

I don't have any specific information about this page, it was not developed by the Wikidata team. I only found this ticket that may be related. Maybe @Physikerwelt: has more information? Lea Lacroix (WMDE) (talk) 14:55, 27 April 2020 (UTC)

Thanks. By testing, I found it also supports defining formula (P2534) as qualifier (in addition to quantity symbol (string) (P416)).

If a "part" property is to be used, shouldn't it have been has part(s) of the class (P2670)?

The sample given in the ticket shows that it also links to articles with used on enwiki (e.g. on Special:MathWikibase&qid=Q1899432). Cool!

@Andreg-p:: can you adjust it to use calculated from (P4934) and in defining formula (P7235) (qualifier and main statement) ? --- Jura 15:16, 27 April 2020 (UTC)

@Lea_Lacroix_(WMDE): Thank you for bringing this issue to my attention. The math extension currently uses the following properties ([2]):

Changing this behavior requires a two-step process. First, a concrete proposal needs to be established and then approved by the math community. To start this process, someone needs to volunteer to interact with the community and @Andreg-p: for a time span of a few months before this is implemented and rolled out in production. Without a fixed contact person putting work into the implementation is too frustrating, cf. phab:T208758. --Physikerwelt (talk) 10:39, 29 April 2020 (UTC)

No worries. There is just a risk that eventually your feature stops working as the Wikidata community cleans up its data. Curious who advised you to use has part(s) (P527) instead of has part(s) of the class (P2670) or calculated from (P4934). --- Jura 12:11, 29 April 2020 (UTC)
- I don't remember; it was in 2016 when we wanted to create new properties for mathematics. I personally think that calculated from is misleading. For instance, in differential equations there is not one quantity that is calculated from another. I would very much appreciate data cleaning efforts. Is there a discussion on how to model mathematical formulae? Physikerwelt (talk) 08:17, 30 April 2020 (UTC)
  - The feature seems to have been written earlier this year, not in 2016. Beyond the ones you already know of, there is Help:Basic_membership_properties with samples at the bottom of the page about the difference between P527 and P2670. P2670 might have been created after your forgotten discussion of 2016. --- Jura 06:21, 2 May 2020 (UTC)

Property:P2313 and P2312 for minimum and maximum values outside property constraints

Notified participants of WikiProject property constraints @Ivan A. Krestinin:

Originally these two properties were created for property constraints (see Help:Property constraints portal/Range).

At some point the English labels were changed to remove "(property constraint)", but not the description ("qualifier to define a property constraint in combination with P2302"). Some uses are now on items.

Should these be used outside property constraints, used for regular statements, or should we create an additional pair of properties? --- Jura 05:15, 28 April 2020 (UTC)

Yes, a new property is needed, just as maximum date (property constraint) (P2311) as to latest date (P1326).--GZWDer (talk) 10:20, 2 May 2020 (UTC)

Swiss municipalities with two coordinates

Hello! Lots of municipality of Switzerland (Q70208) items have two coordinates, which breaks automatic transclusions. It seems that many of them were imported from Cebuano Wikipedia. Would it be possible to delete those, as they are redundant (and sometimes even inexact: Eschenbach (Q7092))? -Theklan (talk) 16:47, 1 May 2020 (UTC)

They violate a single-value constraint, so yes, it seems it would be not just possible, but desirable to delete the redundant ones! (One challenge, of course, is picking which one is "better".) —Scs (talk) 17:33, 1 May 2020 (UTC)
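As a rough illustration of how the redundant coordinates might be shortlisted, the sketch below flags items with more than one coordinate and proposes only those attributed to a suspect source. The input layout (item ID mapped to `(coordinate, source)` pairs), the `"cebwiki"` source label and the placeholder item IDs are assumptions made for the example; a real run would need to read the "imported from" references on each statement.

```python
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]


def redundant_coordinates(
    items: Dict[str, List[Tuple[Coordinate, str]]],
    suspect_source: str = "cebwiki",
) -> Dict[str, List[Coordinate]]:
    """Return, per item, the coordinates that look safe to remove."""
    report: Dict[str, List[Coordinate]] = {}
    for item_id, coords in items.items():
        if len(coords) < 2:
            continue  # single-value constraint is already satisfied
        suspects = [c for c, source in coords if source == suspect_source]
        # Propose removal only if a coordinate from another source would remain.
        if suspects and len(suspects) < len(coords):
            report[item_id] = suspects
    return report
```

Items where every coordinate comes from the suspect source are deliberately left out of the report, since picking the "better" one there still needs a human.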

If they are imported from another source, would it be better to deprecate one? Deleting it will just lead to eventual reinsertion. From Hill To Shore (talk) 17:38, 1 May 2020 (UTC)

I think it's fine to remove them. Bots don't usually add more coordinates when an item already has some. The ones from the Cebuano Wikipedia are almost certainly caused by merges anyway. - Nikki (talk) 18:12, 1 May 2020 (UTC)

All information imported from Cebuano Wikipedia can safely be removed. I would go as far as to allow a bot removal.--Ymblanter (talk) 18:34, 1 May 2020 (UTC)

@Ymblanter: Can you clarify that statement, please? I pay a fair amount of attention to geodata, and I have observed that:

there are many places in the world that have an article only in cebwiki, but
the information in cebwiki seems to be accurate enough, because
it seems to have been bot-imported (into cebwiki, that is) from some amazingly comprehensive geographic database that I've never heard of and that does not appear to be represented in any of the other wikis.

So, yes, if a particular piece of cebwiki-imported data seems to be inferior to another piece of data we've got, by all means, supersede the cebwiki data with the better data by deleting the cebwiki data. But no, I would never say we should delete all cebwiki data, because much of it appears to be high-quality. —Scs (talk) 19:32, 1 May 2020 (UTC)

Disclaimer: I have no connection with cebwiki; I barely even know what language "ceb" represents.

Indeed, all information in Cebuano wiki is bot imported from elsewhere, and this is exactly the reason why we should not have it here. If we want this information, we should be importing it (presumably, by bot) from the primary sources which that bot used. Btw I have come across really wrong info on the Cebuano Wikipedia, though I will not be able to recollect now where exactly it was.--Ymblanter (talk) 19:38, 1 May 2020 (UTC)

I'm fairly sure the "amazingly comprehensive geographic database" User:Scs is talking about is GeoNames (Q830106). It's probably better to keep non-GeoNames data just because it's more likely to have been entered by a human and thus sanity-checked. Vahurzpu (talk) 20:06, 1 May 2020 (UTC)

I see. Someone created hundreds of stubs on Kazakhstan localities based on GeoNames on the English Wikipedia and did not take into account that the localities were renamed. As a result, we often have two copies of the articles (which presumably made it into Wikidata as well) and it is virtually impossible to figure out that they are the same.--Ymblanter (talk) 20:11, 1 May 2020 (UTC)

The worst data introduced from ceb are the elevation values for the geographic objects - especially for mountains/hills they are often totally bogus. And to top it off, the bot which imported it here often did not even add the reference, so it is impossible to notice that it's a wildly guessed number. Ahoerstemeier (talk) 21:36, 1 May 2020 (UTC)

Accidentally, we just today got an example of an item which was created in duplicate on ceb.wp and then the error propagated here, and we can not do anything about it: Wikidata:Bureaucrats' noticeboard#permanent duplicated item (P2959).--Ymblanter (talk) 11:11, 2 May 2020 (UTC)

date of first performance (P1191)

Applies to works in progress/unfinished/demos? Eurohunter (talk) 23:23, 1 May 2020 (UTC)

I would say that it does not. Iwan.Aucamp (talk) 09:43, 2 May 2020 (UTC)
- @Iwan.Aucamp: In a Wikipedia article we would add a sentence like "on 24 April 2020 part of song was played in BBC Sounds during the interview" and I wonder if this data could be added to Wikidata. Eurohunter (talk) 13:14, 2 May 2020 (UTC)
  - @Eurohunter: Just my view, I won't make a fuss if you use it but IMO it would be better to have a "performed on" property or something and use that with qualification. The problem with using this property is that when the first performance of the completed work happens it will be a bit of a conflict as to which date is appropriate. Iwan.Aucamp (talk) 15:04, 2 May 2020 (UTC)

How to find which bots are scraping a specific site or database?

I would like to know if there are any bots operating on https://dblp.uni-trier.de/ - but I'm not aware of a simple and straight forward way to check this.

If each bot had an item associated with it and the bot items had ?bot uses (P2283) dblp computer science bibliography (Q1224715) then it would be significantly easier to find which bots use what. Ideally each bot should say:

What it uses for sources (e.g. dblp)
What items it will create (e.g. instance of human)
What properties it will set

Maybe the right solution is to have:

Bot item
Bot task item

And then some of this goes on the Bot task item?

Maybe there is already some way to figure this out. Any input would be appreciated. Iwan.Aucamp (talk) 23:43, 1 May 2020 (UTC)
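To make the proposed structure concrete, here is a small sketch of the bot item / bot task item split as plain data classes, with a lookup by source. The class names, field names and sample values are illustrative assumptions; nothing here reflects an existing Wikidata data model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BotTask:
    """Hypothetical 'bot task item': what one task reads and writes."""
    name: str
    sources: Set[str] = field(default_factory=set)               # e.g. {"Q1224715"} for dblp
    creates_instances_of: Set[str] = field(default_factory=set)  # e.g. {"Q5"} for human
    sets_properties: Set[str] = field(default_factory=set)       # e.g. {"P2456"}


@dataclass
class Bot:
    """Hypothetical 'bot item' that groups its tasks."""
    name: str
    tasks: List[BotTask] = field(default_factory=list)


def bots_using(bots: List[Bot], source: str) -> List[str]:
    """Names of bots with at least one task that uses the given source."""
    return [b.name for b in bots if any(source in t.sources for t in b.tasks)]
```

Keeping sources and properties on the task rather than on the bot means a bot that retires one task and starts another only needs its task list updated.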

Each item in an external database should have an identifier which is imported and has its own property. What would be needed is a tool that shows which users have the most edits with this property. --SCIdude (talk) 08:32, 2 May 2020 (UTC)

For example the tool Navel Gazer ~~which however only shows changes not creations~~ (it says "Data derived from database dump wikidatawiki-stub-meta-history.xml" so this is not a normal SPARQL query). For DBLP the associated property would be DBLP author ID (P2456) and the user with most edits ~~(not creations but still)~~ would be Florian.Reitz. --SCIdude (talk) 08:49, 2 May 2020 (UTC)
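The "most edits with this property" idea can be approximated with a simple count over edit records. The sketch below assumes a list of `(user, edit summary)` pairs has already been extracted from somewhere, and assumes that summaries mention the property as `[[Property:P…]]`; both the input format and the sample user names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def top_editors(
    edits: Iterable[Tuple[str, str]],
    property_id: str,
    limit: int = 5,
) -> List[Tuple[str, int]]:
    """Rank users by the number of edits whose summary mentions the property."""
    # Match the full link so that P2456 does not also match P24561.
    pattern = re.compile(r"\[\[Property:" + re.escape(property_id) + r"\]\]")
    counts = Counter(user for user, summary in edits if pattern.search(summary))
    return counts.most_common(limit)
```

As noted in the thread, the top account found this way is not necessarily a bot, so the result is a starting point for investigation rather than an answer.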

While this may be a way to find the information I want in some cases there are many cases for which this would not actually yield the information I want even if everything worked. For one, in this case what you found was not actually a bot. Further if a bot goes offline and a new one comes online it will also not work to find the new one. I think the right answer is to have structured data for this, as items. It is really not that difficult and it is a lot cleaner. Iwan.Aucamp (talk) 09:38, 2 May 2020 (UTC)

Scooby-Doo duplication

https://www.wikidata.org/wiki/Q936279 and https://www.wikidata.org/wiki/Q205683 are duplicates, but when I try to merge them I either get errors or nothing happens. Any idea of what's going on?

They have separate articles on enwiki (and probably others, I didn't check).

Q936279 is w:Scooby-Doo, Where Are You! and is basically just for the original animated cartoons, 1969-1970.
Q205683 is w:Scooby-Doo and is for the whole "franchise": those cartoons, the movie, the video games, etc.

So that's why they are (and have to be) separate entities here, too.

Q936279 is an instance of animated series (Q581714), and Q205683 is an instance of media franchise (Q196600).

—Scs (talk) 14:13, 2 May 2020 (UTC)

Should I merge Macclesfield Bank (Q20050783) and Q14592080 or not?

Both look to be exactly the same de jure. --Liuxinyu970226 (talk) 01:00, 2 May 2020 (UTC)

@Liuxinyu970226: They don't look the same to me, what am I missing? Iwan.Aucamp (talk) 07:43, 2 May 2020 (UTC)

Perhaps the same, or perhaps 中沙大环礁 "Zhongsha Great Atoll" is just part of 中沙岛礁 "Zhongsha Island Reef". Ghouston (talk) 02:03, 3 May 2020 (UTC)

Two people conflated

Nicolaes van Bambeeck (Q4625873) has two people conflated, is there an easy way to tease them apart? They are 100 years apart, the Wikipedia article is about the earlier man, but all the positions held are about the later man. Usually conflation has just one or two values that need to be pulled apart. --RAN (talk) 19:04, 2 May 2020 (UTC)

In these situations I create a duplicate item, make sure the descriptions are disambiguated, and then selectively remove incorrect statements from each one. - PKM (talk) 19:15, 2 May 2020 (UTC)

Two items had been merged; I undid the merge and restored Nicolaas van Bambeeck (Q57151023). Peter James (talk) 22:36, 2 May 2020 (UTC)

A likely explanation is that the Commons categories for Nicolaes use the spelling "Nicolaas"; not sure if it should be changed or if that is an alternative spelling (I checked the identifiers and they don't mention it). Peter James (talk) 22:45, 2 May 2020 (UTC)

Q92453624

I am curious as to why Q92453624 was deleted. What admin rights do we need to see deleted items? --RAN (talk) 02:43, 2 May 2020 (UTC)

Unless you are a member of the Wikidata staff team, you should be an administrator. I think you need the delete right to view deleted revisions. Ahmad^talk 05:00, 2 May 2020 (UTC)

@Ahmad252:Can you clarify why visibility of deletions is restricted? Iwan.Aucamp (talk) 20:31, 3 May 2020 (UTC)

Can we know why the label was suppressed on deletion? --- Jura 07:54, 2 May 2020 (UTC)

@Jura1: The labels have been suppressed in the majority of deleted items for quite some time now. It's quite a pain. --Trade (talk) 19:02, 2 May 2020 (UTC)

@Iwan.Aucamp: Sometimes, there are legal reasons. For example, especially on wikis like Wikipedia or Commons, there are thousands of files and revisions deleted because of being copyright violations. Making them visible to the public will practically violate the copyright and can therefore lead to legal problems for the Wikimedia Foundation. I think this is of less importance on Wikidata, given that items essentially can't contain copyright violations (can they?). Other reasons can be privacy issues, biography of living person (BLP) issues, or a variety of other issues. Generally, the idea is that a page is deleted because it contained something inappropriate, so the only ones who can still see it should be trusted. The thing is that it is an all-or-nothing. To my knowledge, if you give someone the right to view deleted revisions, you will give them the full right to see all deleted revisions. There is no option to limit this access. Ahmad^talk 22:18, 3 May 2020 (UTC)

The entry was for a child of another entry, I was just curious what is contained, that led to the deletion, so I can avoid having entries I create deleted. A child of another human entry appears to have a structural need. I once asked if there were restrictions on how many generations of a family can be added and was told there were no restrictions. For instance there are ten generations present for most US presidents and 20 or more for noble families. I am also curious about why some Q-entries go through a consensus deletion process, and others can be deleted by a single editor with deletion rights. Also, why can't we have an expanded list of editors that can view deleted entries? Do we have an appeals process to reverse deletions? --RAN (talk) 08:44, 2 May 2020 (UTC)

Why do you need admin rights or special rights to see deletions? it makes very little sense to me. Iwan.Aucamp (talk) 09:41, 2 May 2020 (UTC)

It would also be nice if a message was automatically left on the creator's talk page when an entry they created is deleted. You shouldn't have to find out when you go looking for the entry, and find it missing. --RAN (talk) 18:41, 2 May 2020 (UTC)

Human lexeme

Just found this lexeme that seems to be not a lexeme at all but a human created as a lexeme. What's weird is that it's found in a query

select * {
  ?item wdt:P31 wd:Q5 .
} limit 10000


which means a lexeme can very well be found in a regular query about items, is this a bug or something ?

Apart from that, do we have a procedure to find/handle such mistakes ? author TomT0m / talk page 16:15, 2 May 2020 (UTC)
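One possible way to spot such cases in query results is to classify each returned entity URI by the letter that starts its ID. The sketch below assumes results arrive as plain URI strings under the `http://www.wikidata.org/entity/` prefix; the helper names are made up for the example, and `L1` is used purely as a placeholder lexeme ID.

```python
import re
from typing import Dict, Iterable, List

# Entity IDs start with Q (item), P (property) or L (lexeme).
ENTITY_URI = re.compile(r"^http://www\.wikidata\.org/entity/([QPL])\d+$")
KINDS = {"Q": "item", "P": "property", "L": "lexeme"}


def entity_kind(uri: str) -> str:
    """Classify an entity URI as item, property, lexeme or other."""
    match = ENTITY_URI.match(uri)
    return KINDS[match.group(1)] if match else "other"


def group_by_kind(uris: Iterable[str]) -> Dict[str, List[str]]:
    """Group query results so that non-item hits stand out."""
    groups: Dict[str, List[str]] = {}
    for uri in uris:
        groups.setdefault(entity_kind(uri), []).append(uri)
    return groups
```

Running `group_by_kind` over the `?item` column of an "instance of human" query and looking at the `"lexeme"` group would list the entries that need to be checked by hand.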

The lexeme can be deleted as an item Jurgi Kintana Goiriena (Q57659657) already exists. Peter James (talk) 16:33, 2 May 2020 (UTC)

I have done so. —MisterSynergy (talk) 14:48, 3 May 2020 (UTC)

UserWarning script

For some time now, I've been looking for an easy way to warn users. I couldn't find any UserWarning script, so I began localizing one myself. It can be found here. Given that we don't have many uw templates, I chose a rather simple script. A short documentation is available here. If you have any suggestions, please let me know. Ahmad^talk 05:40, 3 May 2020 (UTC)

Nice work but we also have User:Bene*/userwarn.js. Yeah we don't have so many warn templates but we need to develop more Wikidata-centric ones. ‐‐1997kB (talk) 05:54, 3 May 2020 (UTC)
Oh, I must've missed that. Wish I'd seen it sooner; it's a nice one (and is also translatable, an important feature for Wikidata). Thanks. Ahmad^talk 06:32, 3 May 2020 (UTC)
What exactly do you mean when you say that you want to warn users? If you are talking about a standardized way to send messages to new users when they make certain errors, those messages should likely be templates given that templates can be read by users in their own language and you might not know which language a new user is fluent in when you want to send them a message.

See also https://www.wikidata.org/wiki/Wikidata:WikiProject_Welcome/Automated_Bot_messages for a potential way to list a bunch of templates for common errors. ChristianKl ❪✉❫ 07:18, 3 May 2020 (UTC)

@ChristianKl: Yes, I think that is the idea. I agree about the templates, and this tool does the same thing (the messages aren't built-in, I only specified which templates it should use). I got a list of available templates from Category:User warning templates (the list certainly needs a review, though. I will try to do it at least for some templates. Many aren't translatable, and icons differ from one template to another). I like the bot idea, but this tool is a little bit different: it also covers templates that aren't likely to be sent by a bot (e.g. warnings about vandalism, test edits etc.). It is my understanding that Wikidata:WikiProject Welcome/Automated Bot messages covers more auto-detectable issues, right? Ahmad^talk 08:04, 3 May 2020 (UTC)

Merge request

Hiya yesterday I created an EN wikipedia page for Bose Ogulu. I added Wikidata information but somehow managed to create a new item instead of editing the existing one. Would it be possible to merge or delete Q71976779 since Q92994095 (the new one) has much more info on it now? Thanks for any help. Mujinga (talk) 10:39, 3 May 2020 (UTC)

We would rather merge more relevant information into the older-created item. So the deletion is not necessary; Q92994095 should be merged into Q71976779. --Wolverène (talk) 10:44, 3 May 2020 (UTC)

Done--Ymblanter (talk) 10:46, 3 May 2020 (UTC)

Thanks to both for the fast response, of course it indeed makes sense to merge to the older one, I should have thought of that. Cheers! Mujinga (talk) 10:49, 3 May 2020 (UTC)

are there any differences between Grace Frankland (Q88824282) & Grace Frankland (Q4794922) ?

can we consider one of them as duplicate ? Leela52452 (talk) 12:46, 3 May 2020 (UTC)

Done @Peter_James: thanks for fixing so quickly Leela52452 (talk) 13:20, 3 May 2020 (UTC)

image copyright policy?

Does Wikidata have a specific policy on the copyright policy of the images it hosts? I couldn't find anything at Wikidata:List of policies and guidelines.

I know that, for example, Commons is stricter than Wikipedia. I'm getting the impression that Wikidata is probably closer to Commons.

(The example I came across that got me thinking about this is Leela (Q121841). She has a low-res, copyrighted, fair-use picture on Wikipedia. Here, and on Commons, there's only a fan dressed up as Leela, which seems wrong, although I suppose someone felt it was better than nothing. At Bender (Q750023) we've similarly got a fan in costume. At Mickey Mouse (Q11934) we've got a "real" image of the character, but it's an old one, claimed to be out of copyright. And at Bart Simpson (Q5480) there's no image at all.)

Not saying there's anything wrong here, just wondering if it's written down anywhere. —Scs (talk) 14:13, 3 May 2020 (UTC)

Hi Scs, Uploading images on Wikidata has been deactivated. This means that locally uploading images is not possible, and that only images that are stored on Wikimedia Commons can be used. Basically this means that Wikidata has the same policy on images and files as ~~Wikidata~~Wikimedia Commons.

Fair use is not allowed on Wikimedia Commons and is only allowed on some Wikipedias where the community has arranged a fair use exception with WMF. Romaine (talk) 14:30, 3 May 2020 (UTC)

@Romaine: Thanks for confirming. That's about what I figured. (I assume you meant "Wikidata has the same policy as Commons".)

Anybody know if this is written down anywhere, or is it basically the default across all Wikimedia projects, with narrow, project-specific exceptions as Romaine mentions? —Scs (talk) 14:37, 3 May 2020 (UTC)

To my knowledge, there is no text regarding image use. As User:Romaine mentioned, Wikidata does not host any files (see here). It is worth mentioning, however, that in data items, files are not really used; they are just *linked* (but displayed in the web UI for convenience). Technically one can only link files hosted at Wikimedia Commons, which in fact means that all of their files can be linked from here, and nothing else. —MisterSynergy (talk) 14:45, 3 May 2020 (UTC)

Whoops, yes. Fixed! It is a Wikimedia wide thing, something I would search for on Meta. I quickly also came across m:Non-free content + wmf:Resolution:Licensing policy - Romaine (talk) 14:47, 3 May 2020 (UTC)

@Romaine, MisterSynergy: Thanks, all. —Scs (talk) 23:40, 3 May 2020 (UTC)

From my point of view, it is not clear to infrequent users of Wikidata that the files found in items are not hosted on Wikidata. I think that the text about the license of content in Wikidata should be modified. It is also not clear to everyone what counts as structured data and what does not. This is the sentence that specifies the license of Wikidata: "All structured data from the main, Property, Lexeme, and EntitySchema namespaces is available under the Creative Commons CC0 License." It does not mention that there are properties in the main space that embed data, for example maps, pictures and other media files. --Hogü-456 (talk) 17:50, 3 May 2020 (UTC)

Well, since Wikidata started embedding images on its item pages, these item pages are no longer necessarily public domain or CC-zero, but should at least be freely useable according to the licences on Commons. The underlying data properties only contain the file names and are assumed to be not under copyright (although it's unclear if this is really the case for all long file names). Ghouston (talk) 00:39, 4 May 2020 (UTC)

Q29637965

I propose that we undelete Q29637965, which was deleted after discussion archived here, even though there was little support for deletion, and a case was made that it meets our notability criteria; as is still the case Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:09, 3 May 2020 (UTC)

Property proposal: seats

Hello. After this discussion Wikidata:Project chat/Archive/2020/04#legislative election I proposed Wikidata:Property proposal/Organization#parliament seats. We need a way to add the seats a political party won in elections (for parliament, for municipal council etc), the seats a party holds in a body like a parliament, or how many seats a constituency has. A user suggested that I use number of seats in assembly (P1410), but that property should not be used as a qualifier. Please read the property proposal and give your opinion there. I just need a way to add the seats each party won, either with a property we already have or with a new one. Xaris333 (talk) 22:18, 3 May 2020 (UTC)

Android smartphone model

I think that an item like Q91804224 (recently created) is undesirable, since it's just an intersection of concepts. The operating system installed on a device can be better recorded with operating system (P306). Whenever something is done in two ways, it makes it harder to write queries, since you have to check both methods. Also, it may be only a matter of time before somebody decides to subclass it further with other properties, such as brand, making "Samsung android smartphone model", etc. Items like smartphone model (Q19723451) were already unnecessary, in my opinion, since it's just the combination of smartphone (Q22645) and product model (Q10929058). @MProperLawAndOrder:. Ghouston (talk) 03:54, 25 April 2020 (UTC)

I tend to agree these items are generally undesirable for use as instance of (P31) but it shouldn't make much difference if properly subclassed. Vojtěch Dostál (talk) 09:51, 25 April 2020 (UTC)

The harm is firstly the usual problems you get with redundancy in databases: if there are two ways of expressing the same information, you don't know which is correct when they disagree. Secondly, you get an ever-growing forest of intersection items, see Commons categories as an example. Ghouston (talk) 11:17, 25 April 2020 (UTC)

Their username is @MrProperLawAndOrder:. I wouldn't have created Q91804224, Q19723451 looks like it isn't currently necessary, although it could be useful, but these are similar to how I'm using A road (Q18019452) and B road (Q89021600) in instance of (P31) which could also be undesirable for the same reason (it looks like transport network (P16) would also allow these, if there's an item for "classified road" or "numbered road"). What's more important is consistency with similar items. Peter James (talk) 13:07, 25 April 2020 (UTC)

I agree with Ghouston: such items are intersection items and should be neither created nor used. I can’t think of anything that item does that cannot be solved by a tiny bit of SPARQL. Commons 'had to' implement such categories because we had no "good way" of doing intersections.

The issue with the existence of these items is that editors will use them − why wouldn’t they: after all, if it exists, it must be for a good reason.

Jean-Fred (talk) 20:19, 25 April 2020 (UTC)
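A minimal sketch of the "tiny bit of SPARQL" mentioned above, written as a small Python helper that only builds the query text and does not send it anywhere. The property and item IDs P31, P279, P306, Q10929058 and Q22645 are the ones named in this thread; the Android item ID `Q94` is an assumption that should be checked before use, and the function name `intersection_query` is hypothetical.

```python
def intersection_query(product_class: str, os_item: str) -> str:
    """Build a SPARQL query for product models of one class that run one OS.

    The intersection is expressed with three plain statements instead of a
    dedicated "Android smartphone model" item.
    """
    return "\n".join([
        "SELECT ?item WHERE {",
        "  ?item wdt:P31 wd:Q10929058 .  # instance of: product model",
        f"  ?item wdt:P279* wd:{product_class} .  # subclass of, followed transitively",
        f"  ?item wdt:P306 wd:{os_item} .  # operating system",
        "}",
    ])


ANDROID = "Q94"  # assumption: item ID for Android, verify before running
query = intersection_query("Q22645", ANDROID)  # Q22645 = smartphone
print(query)
```

The printed text can be pasted into the query service. Swapping the two arguments gives other intersections (another product class, another operating system) without creating any new item.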

@Ghouston, Peter James, Jean-Frédéric: I agree this should be changed, but smartphone model isn't any better. Regarding the reasoning above: The values in instanceOf and subclassOf are by nature intersections of something. That's the whole point of classification.

The reason why I created Android smartphone model was that smartphone model seemed too weakly defined. When is a mobile/cell phone smart?

What can be done? Looking at manufactured goods:

automobile model (Q3231690)
  1. instance of: first-order metaclass (Q24017414) + type of manufactured good (Q22811462)
  2. subclass of: Ford SPORT Wagon 3 puertas año 1998
Android smartphone model (Q91804224)
  1. subclass of: smartphone model
  2. operating system: Android
smartphone model (Q19723451)
  1. instance of: first-order metaclass (Q24017414)
  2. subclass of: cell phone model, computer model
cell phone model (Q19723444)
  1. instance of: first-order metaclass (Q24017414) + type of manufactured good (Q22811462)
  2. subclass of: electronic device model, tangible good, telephone model
telephone model (Q41622600)
  1. instance of: first-order metaclass (Q24017414)
  2. subclass of: model (Q10929058)
electronic device model (Q62008942)
  1. instance of: first-order metaclass (Q24017414)
  2. subclass of: model (Q10929058)
computer model (Q55990535)
  1. instance of: first-order metaclass (Q24017414)
  2. subclass of: electronic device model (Q62008942)
model (Q10929058)
  1. instance of: first-order metaclass (Q24017414)
  2. subclass of: type of manufactured good (Q22811462) !! here the value is in subclass, above it was in instance of
  3. has quality: brand - really? No model without a brand?
  4. equivalent class: http://schema.org/ProductModel

What do you think about only using model (Q10929058)? What type of model it is can be inferred from subclassOf and/or other properties. MrProperLawAndOrder (talk) 02:02, 26 April 2020 (UTC)

There are items in Wikidata to represent concepts, like microprocessor (Q5297), which is a kind of manufactured product, and then items like Motorola 68020 (Q916240) which represents a particular product of this type. That's expressed using the subclass statement (since a lot of these processors have been manufactured, not just one), and the item also declares it to be an instance of product model (Q10929058). As far as I can see, there's nothing to gain in this case by creating another item to represent "model of microprocessor", and it would be redundant. We shouldn't really need to create an item "model of X" for every product type X. I think a bigger problem is working out exactly what qualifies as a "model" in the first place: is Motorola 68020 (Q916240) really a model? Maybe in this case it is, but in other cases, there are microprocessors that come in different clock speeds and cache sizes, and there are cars that have different engine and trim options, and phones that have different variants sold in different countries: at what point is something a model series (Q811701) instead? I suppose at least at the point where the different variants have their own Wikidata items. Whether mobile phone (Q17517) and smartphone (Q22645) can be meaningfully distinguished or not is really an unrelated question, since the OS can be expressed in operating system (P306), if that helps. Creating an item "Android smartphone" would have been better than creating "Android smartphone model", so it can be used in the subclass statement, but I still think it would be an undesirable redundancy. Ghouston (talk) 03:51, 26 April 2020 (UTC)

@Ghouston: agree, additionally creation of subclasses should also be discussed. More model items:

commercial weapon model (Q22809622)
commercial object model (Q22809624)
belt sander model (Q23811260)
circular saw model (Q23811261)
drill model (Q23811262)
grinder model (Q23811263)
hammer drill model (Q23811264)
impact wrench model (Q23811265)
jig saw model (Q23811266)
miter saw model (Q23811267)
orbital sander model (Q23811268)
reciprocating saw model (Q23811269)
screwdriver model (Q23811270)
scooter model (Q23867828)
moped model (Q23868001)

NOTE: Android smartphone model has been replaced with smartphone model on the ~120 items that used it. Before doing so, I made sure each item has a value for OS (it had none in ~25 cases, where I added Android) and is a subclass of smartphone. But I would prefer that this be changed to product model. And then people can argue about whether something is a model, a model series, or a variant of a model. In the end, if any property differs, it is a new class of products. MrProperLawAndOrder (talk) 04:31, 26 April 2020 (UTC)

Thanks. Yes, there has been quite a bit done already in unnecessarily creating "X model" for every product type "X". If we aren't careful, we end up with three items for each concept: microprocessor, microprocessor model, and microprocessor model series, and after that, random intersections with various attributes that can be better expressed as properties. Ghouston (talk) 05:45, 26 April 2020 (UTC)
I’ll also plead guilty on this, since I created video game console model (Q56682555) and computer model (Q55990535)… Why? Because as I was not sure how to model these concepts, I turned for inspiration to… smartphones, cars (automobile model (Q3231690)) and cameras (camera model (Q20888659)) and concluded this was the proper way of doing things ^_^' Meanwhile, as I worked on the constraints of old-computers.com ID (P5936), I could hardly help but feel that I was essentially saying the same thing twice Jean-Fred (talk) 07:52, 27 April 2020 (UTC)

tldr, what would be used in P31 if not these? --- Jura 08:02, 27 April 2020 (UTC)
- product model (Q10929058), like on Motorola 68020 (Q916240). Ghouston (talk) 08:17, 27 April 2020 (UTC)
I see. In general, it's indeed preferable to use few different P31 values. However, in some of the samples above, the main advantage of using a different P31 is that it makes it easier to determine other properties that should be on the same item. This shouldn't impact negatively queries (e.g. instances of Q91804224 should still have operating system = Android and P279 would remain unchanged). --- Jura 08:28, 27 April 2020 (UTC)
- Couldn't you do that just as well from the subclass of (P279) value? Ghouston (talk) 10:38, 27 April 2020 (UTC)

product model (Q10929058) vs smartphone model (Q19723451), electronic device model (Q62008942) & others

User:MrProperLawAndOrder replaced a lot of "instance of electronic device model" and "instance of smartphone model" statements with just "instance of model". My impression was that it is better to sort the items into a subclass if that has a narrower topic and fits the item. What is now the correct way of tagging? --D-Kuru (talk) 07:24, 2 May 2020 (UTC)

@D-Kuru, Ghouston, Jean-Frédéric, The RedBurn: First I added "subclass of X" to many of the "instance of X model" items - they were completely outside the regular classification tree via subclass of. I only replaced instance of X model with model if the item was already a subclass of X - and did so to remove redundancy. MrProperLawAndOrder (talk) 07:37, 2 May 2020 (UTC)

instance of smartphone model

Huawei U2801 (Q5926380) ... subclass of feature phone - but feature phone is not a subclass of smartphone, in its English alternative label it says "dumbphone". So, what is it? Redundancy is bad, contradictions worse. MrProperLawAndOrder (talk) 08:13, 2 May 2020 (UTC)
[3] ... subclass of old age? MrProperLawAndOrder (talk) 08:20, 2 May 2020 (UTC)
[4] ... subclass of mobile phone form factor MrProperLawAndOrder (talk) 08:25, 2 May 2020 (UTC)
[5] this one has instance of cell phone model, smartphone model, video game console model but subclass of handheld game console MrProperLawAndOrder (talk) 08:45, 2 May 2020 (UTC)

As discussed in the section above, "[specific kind of object] model" items used in instance of (P31) don't seem to add anything compared to product model (Q10929058) used in instance of (P31) + "[Specific kind of object]" item used in subclass of (P279). The latter also avoids having to create (and use) a "[specific kind of object] model" item for each and every type and subtype of object, avoids redundancy and allows to have a more direct link with Wikipedia. The RedBurn (ϕ) 09:54, 2 May 2020 (UTC)

Yes, although there are a few items like automobile model (Q3231690) that have Wikipedia articles, so that we are stuck with them. There are also some pointless subclasses of model series (Q811701), such as automobile model series (Q59773381) and computer model series (Q60484681). Ghouston (talk) 10:11, 2 May 2020 (UTC)

I agree with User:D-Kuru that smartphone model seems far more suitable than just model for P31. Germartin1 (talk) 09:24, 3 May 2020 (UTC)
Germartin1, D-Kuru didn't say that, please read again, read his question and the answers. MrProperLawAndOrder (talk) 05:41, 4 May 2020 (UTC)

Subclass of model

Below are some of the subclasses of product model (Q10929058). SPARQL ?item wdt:P279* wd:Q10929058 https://w.wiki/PrS returned 455 until I removed "family car subclass of automobile model" [6], now it returns 151.

Electronic devices

handheld game console model (Q67387549)
video game console model (Q56682555)
Android smartphone model (Q91804224)
smartphone model (Q19723451)
cell phone model (Q19723444)
telephone model (Q41622600)
electronic device model (Q62008942)
computer model (Q55990535)
model of calculator (Q19799634)
smartwatch model (Q19799938)
digital camera model (Q20741022)

Tools

belt sander model (Q23811260)
circular saw model (Q23811261)
drill model (Q23811262)
grinder model (Q23811263)
hammer drill model (Q23811264)
impact wrench model (Q23811265)
jig saw model (Q23811266)
miter saw model (Q23811267)
orbital sander model (Q23811268)
reciprocating saw model (Q23811269)
screwdriver model (Q23811270)

Vehicle

Q29048322 vehicle model
Q18758641 watercraft class
Q23867828 scooter model
Q23868001 moped model
Q5411847 Typhoon variant
Q5531853 F-16 Fighting Falcon model
Q5684964 Hunter variant
Q5684968 Hurricane variant
Q6665639 P-3 Orion model
Q7643937 Spitfire model
Q7925343 Viscount model
Q15126161 prototype aircraft model
Q16986328 Constellation variant
...

Other

Q20888659 camera model
Q22809622 commercial weapon model
Q22809624 commercial object model
Q29889276 model of earth-moving machine
Q29982117 musical instrument model
Q42314054 ammunition model

MrProperLawAndOrder (talk) 06:26, 4 May 2020 (UTC)
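The drop from 455 to 151 results after removing a single statement follows from how a transitive path such as `wdt:P279*` works: everything below the removed edge falls out of the result at once. The following is a small in-memory sketch of that effect, not a reproduction of the actual query. The edge list is toy data, and the two classes placed under "family car" are hypothetical; unlike `P279*`, the helper does not count the root class itself.

```python
from collections import defaultdict


def transitive_subclasses(edges, root):
    """Return every class that reaches `root` via one or more subclass-of edges."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:  # guards against cycles in the hierarchy
                seen.add(child)
                stack.append(child)
    return seen


# Toy (child, parent) pairs; "hatchback" and "station wagon" are hypothetical
edges = [
    ("cell phone model", "model"),
    ("smartphone model", "cell phone model"),
    ("automobile model", "model"),
    ("family car", "automobile model"),  # the kind of statement that was removed
    ("hatchback", "family car"),
    ("station wagon", "family car"),
]

before = transitive_subclasses(edges, "model")
pruned = [e for e in edges if e != ("family car", "automobile model")]
after = transitive_subclasses(pruned, "model")
print(len(before), len(after))
```

Removing one edge here takes three classes out of the result, which is the same mechanism, on a much smaller scale, as the change reported above.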

Items missing sitelinks in schema:about?

Tracked in Phabricator
Task T251387

There are some items that miss some sitelinks in the schema:about property. See example:

# Pages whose schema:about points at Charli XCX (Q5084390), i.e. its sitelinks
SELECT ?item {
  ?item schema:about wd:Q5084390 .
}

Charli XCX (Q5084390) has links to 39 wikis, but the query returns 37 and misses enwiki. Am I getting something wrong about schema:about or is there a problem with these items? --MarioGom (talk) 09:42, 29 April 2020 (UTC)

Indeed it's missing the sitelinks for the en and pt Wikipedias. It is still unclear what could have happened; I filed a Phabricator task to investigate further. DCausse (WMF) (talk) 10:23, 29 April 2020 (UTC)

There was a duplicate, Charli XCX (Q89621390), which had the two missing sitelinks (English and Portuguese). Before I merged them on 13 April the query returned both items. I have now edited Q5084390; not sure if the query service will be updated by this. Peter James (talk) 11:39, 29 April 2020 (UTC)

I think it's a leftover from the database problem earlier this month. Wikidata:Project_chat#Major_issue. --- Jura 12:08, 29 April 2020 (UTC)
You can find some more cases if you look for blue links at en:Wikipedia:WikiProject Women in Red/Number of links. For example: Cristin Milioti (Q5186320), Marion Rung (Q243057), Sara Sampaio (Q10368401), Masih Alinejad (Q6783266), Leah Kirchmann (Q16225885). --MarioGom (talk) 08:24, 30 April 2020 (UTC)

Thanks, I think the best approach is to do a full reload of the wdqs servers once all the cleanups related to the wb_items_per_site incident are done, in the meantime (I know that this is far from ideal) doing a null edit on the item should restore the missing sitelinks. I'll keep the linked phabricator task updated. DCausse (WMF) (talk) 09:22, 30 April 2020 (UTC)

DCausse (WMF) Thank you for the update. It's no big deal for me. Now that I know that my queries are correct, I'll just wait until the fixes eventually propagate. --MarioGom (talk) 19:19, 30 April 2020 (UTC)

No links are missing for Q243057, but the query service says most of the links are http, instead of https; the item page has all https links. Peter James (talk) 21:31, 30 April 2020 (UTC)

I'm a bit puzzled by this one, I've checked on the test server that has been reloaded recently and they appear as https. Hopefully the reload of all the servers will fix these discrepancies. DCausse (WMF) (talk) 09:28, 4 May 2020 (UTC)
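To check an item for this kind of discrepancy, the sitelinks shown on the item page can be compared with what the schema:about query returns. The sketch below assumes both lists have already been collected by some other means; it does no network access. The function names and the sample URLs are illustrative assumptions. Because the query service was reported above to return http links for some items, the comparison ignores the URL scheme.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Drop the scheme so the http and https forms of one page compare equal."""
    parts = urlsplit(url)
    return parts.netloc + parts.path


def missing_sitelinks(item_sitelinks, query_results):
    """Return sitelinks present on the item page but absent from the query results."""
    have = {normalize(u) for u in query_results}
    return sorted(u for u in item_sitelinks if normalize(u) not in have)


# Illustrative sample data, not a real export
item_sitelinks = [
    "https://en.wikipedia.org/wiki/Charli_XCX",
    "https://pt.wikipedia.org/wiki/Charli_XCX",
    "https://de.wikipedia.org/wiki/Charli_XCX",
]
query_results = [
    "http://de.wikipedia.org/wiki/Charli_XCX",  # http instead of https still matches
]

missing = missing_sitelinks(item_sitelinks, query_results)
print(missing)
```

An item with a non-empty result is a candidate for the null edit suggested above.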

Other varieties of English than British and Canadian?

I've been working on New Zealand place names as part of Task force tohutō. In New Zealand English (Q44661), Māori loanwords are now generally written with the macron that indicates long vowels, but this isn't true for other varieties of English. The official New Zealand Gazetteer names of many places were recently changed to use macrons, and the Wikipedia en-NZ article-naming conventions were changed to follow suit (please don't debate this with me without reading the 33,000 word RfC which took two years to resolve...)

So we have a situation where I'd like to represent the official name (P1448) of Taupō (Q2397257) as being in NZ English. But my options are En, En-gb, and En-ca. There's no NZ English option, and incidentally no US English? Or Australian English? Very odd. Can somebody point me to the discussion that set British and Canadian as the only English variants in Wikidata; and how would I go about adding NZ English as a pop-up option? —Giantflightlessbirds (talk) 07:42, 30 April 2020 (UTC)

I don't know where en-gb and en-ca came from, they have been in Wikidata for as long as I remember. Perhaps they were just inherited from the MediaWiki software at inception. There are obviously a lot of dialects of English, see en:List of dialects of English. I suspect that allowing all these variants to be specified in Wikidata would be harmful, on balance, because it would encourage massive duplication of data (and database bloat). What happens at present is that sometimes an en label is copied to an en-gb label, pointlessly since the values are identical, but then the en-gb label tends not to get updated when the en label is changed. Couldn't we assume anyway that an official name of an item with country (P17) = New Zealand is in the NZ English dialect, so far as names even have a dialect? Ghouston (talk) 08:23, 30 April 2020 (UTC)

The problem is that New Zealand English has macrons in much of its official naming and the macrons are likely to be removed when names are not designated as being in NZ English. It is not just a dialectal matter: the spelling includes diacritics which have no place in ordinary English. Hence, there is an imperative for the recognition of en-nz when designating Maori names with macrons in Wikidata. MargaretRDonald (talk) 10:43, 30 April 2020 (UTC)

I don't see any reason why labels of NZ place names would be changed from official NZ versions, any more than the en:Taupō article is likely to be renamed in Wikipedia. Ghouston (talk) 10:51, 30 April 2020 (UTC)

There has in fact been regular Wikidata vandalism going on with macrons being removed from names, and there was quite a bit of resistance and edit warring over renaming Wikipedia articles to the official NZ place name until the RfC was finally closed a month ago. Adding en-NZ might help with this; but I can understand the assumption that a NZ official placename at least is in NZ English. —Giantflightlessbirds (talk) 21:12, 30 April 2020 (UTC)

@Giantflightlessbirds, Ghouston: The process for adding new language codes in this context is controlled by the language committee, which has stalled or given unclear answers on adding language variants before (e.g. en-IN, en-US). Based on the existence of en-GB and en-CA it could be appropriate to add all of these, but I don't know how this would be done. I don't think there's an established process for this, since there haven't recently been any successful requests to add new variants for major languages, although phab:T180771 and phab:T195816 seem to be advancing slightly faster. Jc86035 (talk) 11:51, 30 April 2020 (UTC)

@Lea Lacroix (WMDE): Has there been any progress related to Wikidata:Identify problems with adding new languages into Wikidata? Jc86035 (talk) 11:57, 30 April 2020 (UTC)

I would say there's also a strong case for deprecating & removing en-gb & en-ca rather than adding more en-xx variants (given every other WM project manages okay without them). Maybe we need a broader RFC on this to decide which way to go, given that the current situation is a bit unsatisfactory from all directions? Andrew Gray (talk) 14:51, 1 May 2020 (UTC)

Nothing specific as far as I know. The next step that was suggested on the input page was to create a new process within the Wikidata community, but that's not something we can trigger from the development team. Lea Lacroix (WMDE) (talk) 09:24, 4 May 2020 (UTC)

@Giantflightlessbirds: I think what you've done on Taupō (Q2397257) looks exactly right. en="English" (not en-us) should respect the variant that officially administrates the place, so in fact the macrons are "normal English" for NZ places. Just like the en-Wikipedia style guide about strong national ties to a topic. So, even without en-nz, I think you can proceed comfortably with the macrons. --99of9 (talk) 00:05, 1 May 2020 (UTC)

Yes, I think it's better this way than for example having an en-nz label using a macron and an en label on the same item without. Although the macron would basically disappear in this setup, since only users who request en-nz specifically would see it, if that's the goal, to protect the rest of the English-speaking world from the macron. I don't think that's necessary or desirable, however. Ghouston (talk) 01:30, 1 May 2020 (UTC)

There's currently an en-gb label on the item, without the macron. Does it even make sense to have a specific en-gb label on a New Zealand place name? Ghouston (talk) 02:00, 1 May 2020 (UTC)

Nope, we certainly don't use en-gb! Thanks, everyone, for your help with this. —Giantflightlessbirds (talk) 04:11, 1 May 2020 (UTC)

@Giantflightlessbirds: If the orthography changes, Wikipedia tends to overwrite all previous information. Please avoid doing the same on Wikidata. New statements with relevant ranks should be added and nothing deleted. --- Jura 04:49, 1 May 2020 (UTC)

OK, I think I have it right. name (P2561) (English): Taupo (this is not and has never been an official name); official name (P1448) (English): Taupō, as of 21 June 2019; and native label (P1705) (Maori, sic): Taupō-nui-a-Tia. Someone had put "Taupo" as a native label, so I deleted that. Incidentally, how do we go about editing the name of the Māori language, Te Reo Māori, so it's spelled correctly with a macron? It's very jarring to see the form "Maori" used everywhere in Wikidata. —Giantflightlessbirds (talk) 08:59, 1 May 2020 (UTC)

This depends on the context. The item Māori (Q36451) currently has the macron in the en label, but in the P1705 the value is Reo Māori (Maori). I don't know where the label (Maori) comes from: coded in MediaWiki software perhaps. Ghouston (talk) 10:25, 1 May 2020 (UTC)
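The advice above to add new statements with suitable ranks rather than delete old ones relies on how ranks are read: preferred statements win when present, normal ones are used otherwise, and deprecated ones are never treated as current. A minimal sketch of that selection rule follows. The `Statement` class and `best_values` function are hypothetical simplifications, not the Wikibase data model; the rank given to the Taupō statement and the "Old Name"/"New Name" values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    prop: str
    value: str
    rank: str = "normal"  # one of "preferred", "normal", "deprecated"


def best_values(statements, prop):
    """Return preferred values for `prop` if any exist, else the normal ones."""
    candidates = [s for s in statements if s.prop == prop and s.rank != "deprecated"]
    preferred = [s.value for s in candidates if s.rank == "preferred"]
    return preferred or [s.value for s in candidates]


# Names from the thread; the "preferred" rank here is an assumption
taupo = [
    Statement("P2561", "Taupo"),               # name
    Statement("P1448", "Taupō", "preferred"),  # official name
    Statement("P1705", "Taupō-nui-a-Tia"),     # native label
]

# Hypothetical renaming: the old value stays on the item, the new one is preferred
renamed = [
    Statement("P1448", "Old Name"),
    Statement("P1448", "New Name", "preferred"),
    Statement("P1448", "Mistyped Name", "deprecated"),
]

print(best_values(taupo, "P1448"), best_values(renamed, "P1448"))
```

In the second list nothing has been deleted, yet a consumer asking for the best value sees only the current name.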

Merging "Foreign Office" and "Foreign and Commonwealth Office"?

Q358834 and Q58211956 seem to refer to the same entity. The Wiki page for "Foreign and Commonwealth Office" says: "The Foreign and Commonwealth Office (FCO), commonly called the Foreign Office". Q58211956 doesn't reference any Wiki pages at all. In my opinion, Q58211956 should be merged into Q358834 (which is clearly older). What do you think? Hdfan2 (talk) 11:52, 3 May 2020 (UTC)

I think Q58211956 explains what it's about and how that is different. --- Jura 11:55, 3 May 2020 (UTC)
To expand on Jura1's point, one is the successor of the other, as indicated by replaced by (P1366) and replaces (P1365). They should not be merged. Having them separate is very useful when another part of the database calls on the item. For example, if someone made a statement that a person born in 1800 worked for Foreign, Commonwealth and Development Office (Q358834) a constraint error would be triggered as the person would have died before the entity existed. Instead, they can make a call to Foreign Office (Q58211956) which is our item for that time period. From Hill To Shore (talk) 15:44, 3 May 2020 (UTC)
- I understand. But there's a problem. Q58211956 is used to specify a person's employer (Russian translation Веллингтон, Артур Уэлсли of "Arthur Wellesley, 1st Duke of Wellington", Q131691 in my case). The Wikipedia article on this person uses infobox template that shows this field (see second line in "Место работы" field). And the link to corresponding Wikipedia article was red, because we can't use one Wikipedia article in two Wikimedia entities. I tried to create a page that redirects from this red link to existing page (that Q358834 references: [7]), but then this link became black (and unclickable). Is there a way to fix this problem without merging two entities, other than manually editing infobox data in Wikipedia article? Hdfan2 (talk)
  - I think you'd need to create a page for the historical Foreign Office on the Russian Wikipedia (blank), then add it as a sitelink on Foreign Office (Q58211956), then convert the page to a redirect to ru:Форин-офис. Ghouston (talk) 01:07, 4 May 2020 (UTC)
    - Thank you! That finally solved the problem! Hdfan2 (talk) 04:13, 4 May 2020 (UTC)

Could not save due to an error. The save has failed.

Hello, I run into a problem when I edit Wikidata pages: the following message appears: Could not save due to an error. The save has failed. What is the solution? --Omar Ghrida (talk) 00:46, 4 May 2020 (UTC)

Is that the only message you get, without any details? And are you still experiencing it? Sometimes errors come and go for temporary technical issues. Ahmad^talk 06:56, 4 May 2020 (UTC)

@Ahmad252: The problem has been resolved, thank you for your reply --Omar Ghrida (talk) 07:49, 4 May 2020 (UTC)

How to display the {{Wikidata Infobox}} on two different pages on Commons?

I see {{Wikidata Infobox}} displays the same on c:London and c:Category:London. But that template doesn't work the same on c:Ambigram (only c:Category:Ambigrams displays it). Why? -- Basile Morin (talk) 05:38, 4 May 2020 (UTC)

The gallery was specifying the qid manually, but didn't do it right. I fixed it. There's another method, which would involve creating a new Wikidata category item so that there could be two sitelinks to Commons, but it's probably good enough as it is. Ghouston (talk) 06:07, 4 May 2020 (UTC)

Thanks -- Basile Morin (talk) 06:30, 4 May 2020 (UTC)

I created Category:Ambigrams (Q93217145), so both the gallery and category are now sitelinked, and the infobox works without the manual QID. Thanks. Mike Peel (talk) 08:37, 4 May 2020 (UTC)

How to add google book as a reference in a better way

I was trying to add [8] as a reference for the height of Mahavira (Q9422), but adding the link seems inefficient. Do we have a Wikidata item for Google Books, or can we simply add WzEzXDk0v6sC and the page number and have the rest of the data (title, ISBN, author, etc.) fetched automatically?

@Capankajsmilyo: Create a new item and add Google Books ID (P675) to the item for the edition (not the book itself!) - see Help:Source#Books.--GZWDer (talk) 08:35, 4 May 2020 (UTC)

Thanks for the response. Google books seem like a database. Can't we automate Q creation of books just like it was done for other databases like IMDB? Capankajsmilyo (talk) 09:51, 4 May 2020 (UTC)
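The approach described above has two parts: an item for the edition that carries Google Books ID (P675), and a reference on the height statement that points at that item. A rough sketch of the two pieces of data follows, using plain dictionaries rather than the real Wikibase JSON format. The helper names, the title, the page number and the edition item ID are placeholders, since none of them are given in this thread; P248 (stated in) and P304 (page(s)) are the properties commonly used for such references.

```python
def edition_item(google_books_id: str, title: str) -> dict:
    """Describe the item for one edition, identified by its Google Books ID."""
    return {"label": title, "claims": {"P675": google_books_id}}


def book_reference(edition_qid: str, page: str) -> dict:
    """Describe a reference: stated in (P248) the edition item, page(s) (P304)."""
    return {"P248": edition_qid, "P304": page}


# Placeholders: the title, page and edition QID are not known from the thread
edition = edition_item("WzEzXDk0v6sC", "PLACEHOLDER TITLE")
reference = book_reference("Q_EDITION_PLACEHOLDER", "PLACEHOLDER PAGE")
print(edition, reference)
```

Once the edition item exists, the same item can be reused in "stated in" for every statement sourced from that book, with only the page changing.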

suggest step to update orcid reference available on caltech library

I want to update "stated in" & "Caltech Library". What else should be updated in the reference? Is updating the above correct, or should I update just "reference URL"? The ORCID is given on the Caltech thesis page. Leela52452 (talk) 13:52, 4 May 2020 (UTC)

Public genealogical data

Hello,

I'm a member of my local genealogical society and I'm wondering: is the Wikidata community interested in the genealogical data of ordinary people?

If so, I can suggest to my local society that they transfer their genealogical data to Wikidata. We have the birth and death dates of tens of thousands of people.

Where I live, genealogical data on people born 100 years ago or more is in the public domain.

We keep this data in a huge computerized database, but there is certainly a way to retrieve it. Is there a file format or an API that would allow us to easily upload it to Wikidata?

Thank you,

--Milano-2018-10-16 (talk) 20:26, 28 April 2020 (UTC)

Hi Milano-2018-10-16, thanks for your nice idea. Before uploading a huge dataset, be sure to check the following:

data input into Wikidata should be notable enough. I am not sure that "ordinary people" are notable enough. Personally, I would be interested in and in favor of a genealogy dataset, but others may object. I've read the notability guideline and discovered that the genealogy topic hasn't yet been settled. Personally, I have always wondered why there hasn't been any genealogy Wikipedia. Curious what others will say.
new data should first be checked (some of the people might already have items) and reconciled with existing items (for instance "Pierre-Paul JEAN" should be linked to Pierre-Paul (Q20727006) and Jean (Q12657412), and so on). Bouzinac (talk) 20:52, 28 April 2020 (UTC)
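The reconciliation step in the second point can be sketched as follows: split each name into given and family parts, then look each part up among existing name items. This is a minimal illustration that assumes the family name is written in capitals, as in the example; the two item IDs are the ones quoted above, while the function names and the lookup table are hypothetical stand-ins for a real search against Wikidata.

```python
def split_name(full_name: str):
    """Split 'Pierre-Paul JEAN' into a given name and an upper-case family name."""
    tokens = full_name.split()
    family = [t for t in tokens if t.isupper()]
    given = [t for t in tokens if not t.isupper()]
    # str.title() is a rough normalization and mishandles names like MCDONALD
    return " ".join(given), " ".join(family).title()


# Hypothetical local lookup tables standing in for a search of existing items
GIVEN_NAME_ITEMS = {"Pierre-Paul": "Q20727006"}
FAMILY_NAME_ITEMS = {"Jean": "Q12657412"}


def reconcile(full_name: str) -> dict:
    """Return the matching name items, or None where nothing is found."""
    given, family = split_name(full_name)
    return {
        "given name": GIVEN_NAME_ITEMS.get(given),
        "family name": FAMILY_NAME_ITEMS.get(family),
    }


result = reconcile("Pierre-Paul JEAN")
print(result)
```

Entries that come back as None would need manual review or a new name item before the person record is created.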

@Milano-2018-10-16: How many entries it will have? Please provide a link to the dataset so community may evaluate the data.--GZWDer (talk) 22:05, 28 April 2020 (UTC)

+1. (Is it a French-speaking one, as induced by your mother tongue?) Nomen ad hoc (talk) 22:07, 28 April 2020 (UTC).

And, did the dataset have any sources? (ordinary GEDCOM should not be imported to Wikidata, as it is usually unsourced; but if the dataset is peer reviewed, it may be OK.)--GZWDer (talk) 22:17, 28 April 2020 (UTC)

@Bouzinac, GZWDer: Thank you for your quick answers!

The database I'm thinking of is only available by subscription, but the data from 1620 to 1920 is in the public domain. My idea is to make this portion more easily accessible by transferring it to Wikidata. But, before trying to convince the board of directors of my genealogy association, I want to make sure that the Wikidatians are interested.

I'm referring to the Connolly file, which is an index of Catholic and Protestant baptisms, marriages and burials coming mainly from Quebec. The file contains more than 6,000,000 entries. There are some duplications and inaccuracies in the file, but the data remains on the whole fairly reliable. You are right that the file was compiled by francophones like me, but the data (names of persons and dates) is not in any particular language. The database contains no explicit reference to other sources. However, once a date and place are provided, it is possible to infer which handwritten register the data comes from. To be honest, though, I am no longer certain that the data included in this database meets Wikidata's standards. You can find more info on the Connolly file here.

That said, there are other high quality databases. For example, it would be great to incorporate into Wikidata the free portion of the PRDH database, which is little known but of excellent quality (it is what you might call "peer reviewed"). You can access an English version of the interface here. Please note that scans of the original Quebec registers are, for the most part, made available online by the national library (here).

A few years ago, I suggested to a Wikimedia Canada volunteer that the foundation create WikiGenealogy, a Wikimedia sister project dedicated to genealogy. There was a lot of excitement about the idea, but it seems that no one was able to make it happen. At the time, I was told that the first step might be to record enough genealogical data in Wikidata to lay the foundation for a WikiGenealogy.

So far, I observe that the same datasets are transcribed separately by several associations and genealogists. Why not federate all efforts around a single open platform like Wikidata?

Can I suggest that you organize a vote or create a reference page that would give guidelines in relation to the genealogical data of lesser-known people on Wikidata? If I put together the substance of what has been said here, we would already have a few guidelines to vote on:

Data must be from a peer-reviewed data set or be referenced.
New data concerning people already in Wikidata must be incorporated (avoid generating duplicates with what is already in Wikidata).
New data should be reconciled with objects (thus, first names, surnames, etc., are linked).
No genealogical data on people born less than 100 years ago, except for people of notoriety who have already made this information publicly known or whose information has been published in a public source such as a catalogue of authority notices.

--Milano-2018-10-16 (talk) 00:20, 29 April 2020 (UTC)

I think one of the worries we had when we were deciding about uploading all the people in Findagrave was how to disambiguate all the ordinary people, and whether adding several hundred John Smiths would make it difficult to find the John Smith that most people are looking for. I am already having problems with all the ORCID entries of scientists and disambiguating them. But I am sure we can work something out. For the ORCID people we described them as ORCID=143357 or something like that, so we know they are minimally described people who may be duplicated in the database. --RAN (talk) 03:01, 29 April 2020 (UTC)
@Milano-2018-10-16: I think putting this data on the internet would be great. While Wikidata may not be the right place for it, you could consider WikiTree. Iwan.Aucamp (talk) 07:17, 29 April 2020 (UTC)

It looks like there have been attempts at a "Wikidata genealogy", but they do not appear to be active: https://tools.wmflabs.org/genealogy/wiki/Joseph_Don_Carlos_Young_(1855-1938) . A nice FAQ can be read here: https://www.wikidata.org/wiki/Help:FAQ/Genealogy Bouzinac (talk) 08:01, 29 April 2020 (UTC)

A general overview of Wikimedia and genealogy is at meta:Wikimedia Genealogy Project, WikiProject Genealogy (Q19817878). —Sam Wilson 08:51, 29 April 2020 (UTC)

My 2 cents: I think that specialized genealogy sites are better suited to large genealogy datasets than Wikidata is. Those sites will have a built-in standardized way of prominently displaying the person's birth/death dates and relatives, so you can see at a glance who you are talking about. Whereas with Wikidata the only "identify-at-a-glance" info is in the description, and that has to be done by hand and isn't consistent by any means or always used. There are already too many confusing entries for humans (don't get me started on how scrappy the huge import from "The Peerage" is) and too many duplicates. Again, specialized genealogy sites are likely to have tools for finding duplicates, and for linking with relatives, etc. So, I think the best thing for Wikidata is to only have selected people but make sure to link to resources like Wikitree for exploring their whole family. Thanks to the OP for the idea, though! I am sure we are greatly lacking in information from Quebec in many ways, and the dataset would be useful to Wikidata whether it's hosted here or somewhere else. Levana Taylor (talk) 13:40, 29 April 2020 (UTC)

Hi, thanks to all of you for your research, which allows us to take the discussion further!

I can see that I am not the first nor the last to suggest the creation of a genealogy project on Wikimedia. But reading the answers leads me to ask myself: wouldn't it be possible to use a pre-existing site such as WikiTree as a basis for a project hosted by Wikimedia?

I wonder how projects like Wikidata and Wikispecies got started. Is there a known way to repeat a similar process with a future WikiGenealogy? Who is capable of carrying out such a project?

For sure, it is always possible to list pre-existing external projects, but creating a dedicated and federating Wikimedia project still seems relevant. --Milano-2018-10-16 (talk) 20:17, 29 April 2020 (UTC)

Note: 1. WikiTree does not allow mass imports, and 2. data in WikiTree is not under a free license.--GZWDer (talk) 04:03, 30 April 2020 (UTC)

A specialized genealogy database is likely to work better than MediaWiki software as suggested by User:Levana Taylor, but this doesn't remove the complaint that no existing genealogy database seems to be freely licensed. The Wiki model (of anyone can edit without registration) may not be a good match for a site that's constructed from bulk import of data sources such as old censuses or graveyards. Allowing unchecked random edits would corrupt the data, but checking every edit may be too much work. Ghouston (talk) 02:32, 1 May 2020 (UTC)
There is the Incubator wiki for new projects, mostly used for new language versions. Here is a link [9]. I don't know what the strategy of the Wikimedia Foundation is, or whether it is possible to get such a project hosted by the Wikimedia Foundation. I don't know much about genealogy data, and I think it is better placed in a database other than Wikidata. A database for that topic that is licensed under a free license is a good idea. --Hogü-456 (talk) 16:00, 1 May 2020 (UTC)

Apparently, there has been a trial of a wiki-genealogy: https://meta.wikimedia.org/wiki/WikiTree Bouzinac (talk) 20:48, 1 May 2020 (UTC)

I believe that importing high quality curated datasets is good. Specific software for genealogy has some advantages but it also has disadvantages. Genealogy software only stores people. It doesn't store anything which those people interacted with. Wikidata can also store information like books a person has written, patents they have been granted and companies the person worked in. ChristianKl ❪✉❫ 11:21, 4 May 2020 (UTC)
- I do not support importing arbitrary GEDCOM to Wikidata, as GEDCOM is usually unsourced.--GZWDer (talk) 00:11, 5 May 2020 (UTC)

Cypriot municipal elections

Hello. Every 5 years there are Cypriot municipal elections (Q64918845). On the day of the elections, each municipality holds one mayoral election (voters vote for persons) and one municipal council election (voters vote for political parties and persons). There are 39 municipalities. I tried to create a structure for Cypriot municipal elections. In my examples, I used only two municipalities (Limassol Municipality (Q28870916) and Nicosia Municipality (Q56037497)) and two different elections (2011, 2016). I am not sure about the structure and I don't want to apply it to all 39 municipalities and all elections without being sure. Please tell me your opinion. If you know a good example from another country, please let me know. The structure is:

Cypriot municipal elections (Q64918845)
- Cypriot Mayors Elections (Q92282917)
  - Mayor of Limassol Municipality Elections (Q92282907)
  - Mayor of Nicosia Municipality Elections (Q92312582)
- Cypriot Municipal Councils Elections (Q92282921)
  - Municipal Council of Limassol Municipality Elections (Q92282909)
  - Municipal Council of Nicosia Municipality Elections (Q92313829)

Cypriot municipal elections (Q64918845)
- 2011 Cypriot municipal elections (Q28035577)
  - 2011 Cypriot Mayors Elections (Q92320908)
  - 2011 Cypriot Municipal Councils Elections (Q92321112)
- 2016 Cypriot municipal elections (Q64995666)
  - 2016 Cypriot Mayors Elections (Q92320983)
  - 2016 Cypriot Municipal Councils Elections (Q92321117)

Cypriot municipal elections (Q64918845)
- Limassol Municipality municipal elections (Q92282911)
  - Mayor of Limassol Municipality Elections (Q92282907)
  - Municipal Council of Limassol Municipality Elections (Q92282909)
- Nicosia Municipality municipal elections (Q92312633)
  - Mayor of Nicosia Municipality Elections (Q92312582)
  - Municipal Council of Nicosia Municipality Elections (Q92313829)

Xaris333 (talk) 22:44, 28 April 2020 (UTC)

Just curious about the layout of your comment: Is there a tool that generates these trees? --- Jura 12:16, 29 April 2020 (UTC)

No. Xaris333 (talk) 13:17, 29 April 2020 (UTC)

The structure of the trees looks good to me but you have to be careful what property you use. Items in the lowest row of the tree should be connected with the second lowest row by instance of (P31). For all other connections use subclass of (P279). Nowhere should part of (P361) be used. --Pasleim (talk) 15:11, 29 April 2020 (UTC)
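As an illustration of the rule described in the comment above, here is a minimal Python sketch that walks one of the proposed trees and lists which property would connect each item to its parent. The nested-dict layout and the function name are hypothetical, and the sketch assumes that "lowest row" can be read as "items with no children", which holds for the trees in this thread.

```python
# Property IDs taken from the comment above.
INSTANCE_OF = "P31"
SUBCLASS_OF = "P279"

# Hypothetical nested representation: each key is an item, each value its children.
TREE = {
    "Q64918845": {            # Cypriot municipal elections
        "Q92282917": {        # Cypriot Mayors Elections
            "Q92282907": {},  # Mayor of Limassol Municipality Elections
            "Q92312582": {},  # Mayor of Nicosia Municipality Elections
        },
        "Q92282921": {        # Cypriot Municipal Councils Elections
            "Q92282909": {},  # Municipal Council of Limassol Municipality Elections
            "Q92313829": {},  # Municipal Council of Nicosia Municipality Elections
        },
    }
}


def statements(tree, parent=None):
    """Return (child, property, parent) triples for every edge in the tree.

    Assumption: an item without children sits in the lowest row and is
    linked with P31; every other edge uses P279. P361 is never emitted.
    """
    triples = []
    for item, children in tree.items():
        if parent is not None:
            prop = SUBCLASS_OF if children else INSTANCE_OF
            triples.append((item, prop, parent))
        triples.extend(statements(children, item))
    return triples


if __name__ == "__main__":
    for child, prop, parent in statements(TREE):
        print(child, prop, parent)
```

The output is only a checklist of intended statements; it does not edit Wikidata.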

Thank you. I corrected all the items. Xaris333 (talk) 16:54, 29 April 2020 (UTC)

@Xaris333: I don't think it makes sense to have 2011 Cypriot Municipal Councils Elections (Q92321112) or 2011 Cypriot municipal elections (Q28035577) as classes with instances. Assuming the elections are all happening roughly at the same time, it should be considered as a single group election, an event with a distinct start/end time, divided into parts, like the 2019 European Parliament election (Q16999180). The rest of the structure I agree with. --Yair rand (talk) 19:53, 4 May 2020 (UTC)
- @Yair rand: Do you mean 2011 Cypriot Municipal Councils Elections (Q92321112) or 2011 Cypriot Mayors Elections (Q92320908)? Xaris333 (talk) 21:19, 4 May 2020 (UTC)
  - @Xaris333: Both, really. Assuming they happen in a coordinated manner as a group (at roughly the same time), they should be modeled as events with parts, rather than as classes with instances. --Yair rand (talk) 21:23, 4 May 2020 (UTC)
    - @Yair rand: You wrote 2011 Cypriot Municipal Councils Elections (Q92321112) or 2011 Cypriot municipal elections (Q28035577). Did you mean 2011 Cypriot Municipal Councils Elections (Q92321112) or 2011 Cypriot Mayors Elections (Q92320908)? I want to understand why you think 2011 Cypriot municipal elections (Q28035577) don't make sense as classes with instances. Xaris333 (talk) 22:15, 4 May 2020 (UTC)

What is the ruling of Wikimedians themselves having entries in Wikidata?

What is the ruling of Wikimedians themselves having entries in Wikidata? Do they really exist, and can they be described with reliable sources? What is the current status of that? --RAN (talk) 02:18, 2 May 2020 (UTC)

It depends a lot on the sources that exist for the particular person, we don't have a general ruling. ChristianKl ❪✉❫ 11:34, 2 May 2020 (UTC)
- Taking a look at the policy page, we do have a ruling, although there is some room for ambiguity. See Wikidata:Autobiography. The page isn't tagged as a policy or guideline but it is included in the list of approved policies and guidelines. @Pigsonthewing: You were the initial creator of Wikidata:Autobiography but have disputed the description of it as a guideline.[10] Can you please clarify the status of the page? Is it an approved policy or guideline (in which case it should be tagged correctly) or is it some form of essay or informal guide (in which case it should be removed from the list of approved policies & guidelines). From Hill To Shore (talk) 22:22, 3 May 2020 (UTC)
  - I think I've expressed my views clearly on its talk page. Unfortunately, others - some more hung up on bureaucracy than helping its target audience - have tried to make the page something it was never intended to be, and it is now a mess. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:23, 3 May 2020 (UTC)
  - @From Hill To Shore: This isn't a list of approved guidelines. It's a list of "proposed and approved policies and guidelines". ChristianKl ❪✉❫ 11:37, 4 May 2020 (UTC)
    - @ChristianKl: The page has "proposed and approved" but has a dedicated section for proposals. It may not be the intention of the page author, but I read the page to mean anything listed before the proposal section has been approved. From Hill To Shore (talk) 14:21, 4 May 2020 (UTC)
      - That's a fair point. As there wasn't a discussion that found consensus to make the autobiography page a policy I moved it to the proposed section. ChristianKl ❪✉❫ 19:47, 4 May 2020 (UTC)
User:Multichill/Questionable notability Wikimedians is the list you may be looking for. Thierry Caro (talk) 14:15, 2 May 2020 (UTC)
- That page is a gross misrepresentation of our notability policy, and should be deleted. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:23, 3 May 2020 (UTC)
I was wondering why my entry was deleted without going through the process of consensus. Is the plan to delete the entire list at User:Multichill/Questionable notability Wikimedians? Can someone direct me to where this was debated so I can see what the current state of consensus is? It would be helpful if we had clear-cut rules to avoid ad hoc deletion by an individual editor who has deletion rights. --RAN (talk) 15:38, 2 May 2020 (UTC)
- I hope not, many of those people are published authors of books and papers, and if almost everyone with an ORCID is being mass imported into Wikidata, certainly those Wikimedians meet the notability bar too. Gamaliel (talk) 17:30, 2 May 2020 (UTC)
- Do we have a process to appeal a deletion, like the one in Wikipedia and Wikimedia Commons? Can we also stop deleting the entries until there is an objective policy, so that a single individual does not determine who stays and who goes? We should create a policy and then have a bot do the deletions, so that an individual editor's bias is not involved. When one individual chooses who stays and who goes, it may come down to just not liking the photo of the person. --RAN (talk) 17:43, 2 May 2020 (UTC)
  - We do not have any page dedicated to appealing deleted items like WP does.--Trade (talk) 19:09, 2 May 2020 (UTC)
    - Wikidata:Requests for deletions points to Wikidata:Administrators' noticeboard as the place to request undeletion. The QID here seems to be Richard Arthur Norton (Q73707267). Thanks. Mike Peel (talk) 19:30, 2 May 2020 (UTC)
      - Admins are not very enthusiastic about undeletion requests. This request did not get many comments from admins. Though I still think Q22959230 should be undeleted. (I am not targeting any particular admins.)--GZWDer (talk) 22:50, 2 May 2020 (UTC)
        Charlotte Gerson (Q22959230) restored (but see also Wikidata:Requests_for_deletions/Archive/2017/07/30#Q22959230). Please add your sources. I think we just have very few active admins, and we're all volunteers here. A lot of discussions seem to just peter out without resolution. Bovlb (talk) 23:22, 2 May 2020 (UTC)
        And by the same token, I discourage the creation of new noticeboards unless there is some plan for how to staff them. Bovlb (talk) 03:26, 3 May 2020 (UTC)
        
        "I think we just have very few active admins" Then we should appoint more; and make it more attractive to do such work. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:30, 3 May 2020 (UTC)
        @Pigsonthewing: "we should appoint more" - WD:RFA is thataway. It appears that most requests are successful. "make it more attractive to do such work" - Any suggestions? Bovlb (talk) 18:08, 4 May 2020 (UTC)
WD:N. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:25, 3 May 2020 (UTC)

Essay on principles of automated editing

Recently we have had a few discussions where there have been disputes between users of automated processes (bots, scripts or tools) and other users. Often these disputes seem to be partially caused by confusion over the responsibilities and expectations of different users. For example, we have had some cases where an editor has complained that a bot has made a mistake but the bot operator has replied to say that the bot has correctly imported the source information. There has also been a suggestion that we are forming into factions over those who support all actions by users of automated tools, those that challenge the edits of users of automated tools and those who ignore the problem.
I've put together a userspace essay to explain some of the principles of using automated processes and the methods to resolve disputes. The aim here is not to place blame on anyone (which would increase the perceived split into factions) but to explain some common sense standards of behaviour and how the existing processes can be used to resolve problems.
Does this seem useful and is the content set at the right level? From Hill To Shore (talk) 18:20, 2 May 2020 (UTC)

It looks good to me. Thanks for writing. —Scs (talk) 11:00, 3 May 2020 (UTC)

I haven't read it in detail, but it doesn't seem to encourage contributors to use more tools instead of manually making the same edit again and again (e.g. multiple reverts of the same bot edit).

A problem we encounter is that non-contributors to Wikidata seem to be offended that Wikidata is edited by bots. Maybe it should be made clear that data users' interests are considered, but are generally not a factor in determining which tool should be used.

Besides, I don't think any Wikidata contributor has a duty to fix any error introduced by somebody else. Obviously, it's a good and helpful thing to do. --- Jura 11:42, 3 May 2020 (UTC)

If you feel that "duty" is too strong a word then I will be happy to replace it. I'm not sure of your other points though, especially with the caveat that you haven't read the page in detail. Your point about editors constantly reverting the same bot edit (and by implication, a bot operator constantly reverting a manual edit) is covered by the guide - we must direct those users into a discussion to find a consensus and then enforce that consensus if either side continues to edit war. Secondly, I make the point quite clearly that automated edits make a positive contribution to Wikidata and note the advantages of automated processes. However, this is not intended as promotional material for one type of editing over another. The key purpose of my essay is to clarify the misunderstandings between the two groups that you note in your comment and I don't think straying away from a neutral presentation of facts and process will be conducive to either audience.

However, if you want to write your own tool promotion essay then I will be happy to link to it with a "see also" section at the bottom of my essay. We don't have to give out all messages on a single page after all. From Hill To Shore (talk) 15:59, 3 May 2020 (UTC)

I feel like we need some idea of what kind of error rates we are willing to tolerate when using automated tools. We can't be right all of the time and doing all editing manually isn't feasible (and still will result in errors). Realistically the acceptable error rate is a function of what kind of information is being imported. BrokenSegue (talk) 06:13, 4 May 2020 (UTC)
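One way to make the idea of a tolerated error rate concrete is to check a random sample of a bot's edits by hand and compare a cautious estimate of the error rate with a limit chosen per kind of data. The Python sketch below does this with the upper bound of a Wilson score interval. The threshold values and function names are made up for illustration; nothing in this thread fixes actual limits.

```python
import math

# Hypothetical per-type limits; the discussion does not define any real values.
MAX_ERROR_RATE = {
    "date of birth": 0.01,
    "image": 0.05,
}


def wilson_upper(errors, sample_size, z=1.96):
    """Upper bound of the Wilson score interval for an error proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = errors / sample_size
    denom = 1 + z * z / sample_size
    center = (p + z * z / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)
    ) / denom
    return center + half


def acceptable(errors, sample_size, kind):
    """True if even the pessimistic estimate stays under the limit for this kind."""
    return wilson_upper(errors, sample_size) <= MAX_ERROR_RATE[kind]


if __name__ == "__main__":
    # 5 wrong edits found in a hand-checked sample of 100 image additions.
    print(round(wilson_upper(5, 100), 3), acceptable(5, 100, "image"))
```

Using the upper bound rather than the raw proportion means a small sample cannot by itself justify a large import.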

According to Asimov's laws, robots may never do harm to humans. I would consider mass edits by bots above a minimal error rate to be psychological harm to people (editors here who try to fix errors, users of Wikidata expecting correct data, etc.): bots (their operators) should be responsible for cleaning up such errors. --Herzi Pinki (talk) 19:19, 4 May 2020 (UTC)

What happened to automatic data updation based on infobox

When I was here a few years back, a bot was running which fetched details of father, mother, date of birth, etc. from enwiki and added them to their respective Qs. Another bot was running which added data to the other end. By other end I mean father on son's page and vice versa, based on either Q. Both these activities seem to have stopped. Is there any new policy or something? Capankajsmilyo (talk) 07:08, 4 May 2020 (UTC)

@Capankajsmilyo: 1. Importing such data to Wikidata is usually not a fully automatic process, but is performed with tools like HarvestTemplates or a similar script in Pywikibot. Some properties (dates) are imported by different users occasionally. I do not recommend importing fathers and mothers in this way, to avoid introducing errors. 2. Data are fetched (transcluded) from Wikidata to Wikipedia. They should not be mass copied by bot. Instead, infoboxes should be adapted so that the Wikidata value may be shown.--GZWDer (talk) 08:43, 4 May 2020 (UTC)

Thanks again for the response. I am not talking about data from Wikidata to Wikipedia but vice versa. Wikipedia communities seem to have a strong feeling against transclusion of Wikidata. However, doing the opposite seems a better option: whatever users edit on Wikipedia is updated on Wikidata (with some quality checks if data already exist in Wikidata). I'm not sure it's the same, but some TemplateData thing existed on Wikipedias before. To elaborate, suppose a new actor / politician / person page is created on Wikipedia; his/her data gets created on Wikidata too. The editors update the page with details like date of birth, hometown, place of birth, etc., but that data doesn't get updated on Wikidata. Capankajsmilyo (talk) 09:59, 4 May 2020 (UTC)

I'm not sure about biographic infoboxes, but on Commons, Template:Infobox artwork (Q6064255) can still be used to populate missing properties (artist, date, collection, etc.) on respective artwork items, provided appropriate templates or text-strings are in the template. -Animalparty (talk) 21:01, 4 May 2020 (UTC)

Duplicate with link to gom-wiki

Hello! Could someone help me decide if Gregory of Nazianzus (Q25690037) is a duplicate of Gregory of Nazianzus (Q44011) or of Gregory of Nazianzus the Elder (Q935447)? I don't speak the language tag, and because it is transliterated I don't know how to translate it. Thanks. --Jahl de Vautban (talk) 10:06, 4 May 2020 (UTC)

Jahl de Vautban: According to the image links, it was a duplicate of Gregory of Nazianzus (Q44011). Esteban16 (talk) 16:56, 4 May 2020 (UTC)

@Esteban16:, yes, but the dates seemed to point more toward Gregory of Nazianzus the Elder (Q935447), that's why I was confused. Anyway, it's done; it shouldn't be too hard to relink in case it's wrong. Thanks! --Jahl de Vautban (talk) 17:08, 4 May 2020 (UTC)

Wikidata weekly summary #414

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- New request for comments: A meta item namespace (Mxxx) for structured data about Wikidata

Events
- Competition for the International Museum Day is about improving data about museums in Austria, France, Germany, Italy and Switzerland on Wikidata, from 3 May 2020 to 18 May 2020, more information on Museum Day 2020/Wikidata Competition
- Upcoming: Next Linked Data for Libraries LD4 Wikidata Affinity Group call: Alex Jung on Wikidata and Wikipedia Infoboxes, 05 May. Agenda
- Upcoming: live SPARQL queries in French by Vigneron, Tuesday May 5 at 20:00 CEST (UTC+2)

Press, articles, blog posts, videos
- Wikipedia Weekly Network - LIVE Wikidata editing: YouTube
- Art+Feminism Office Hours: Introduction to Editing Wikidata: YouTube
- Webinar on why and how to contribute to Wikipedia and Wikidata? (in French): YouTube

Tool of the week
- ProWD explores completeness for entities and classes.

Other Noteworthy Stuff
- MachtSinn, a tool that allows you to easily add Senses to Lexemes, has lately been improved significantly
- 33,000 values of British Museum person or institution ID (d:Property:P1711) now have a useful target again. After their being mostly inoperative for several years, the new British Museum website now has information pages matching these values, with links to related objects. (Example). To work around bug T112081, a script by Andrew Gray is going through the items making null edits to update the relevant URLs.
- Outdated copy of wb_terms has been dropped on April 29th

Did you know?

Development
- Add monolingual language codes rm-rumgr, rm-surmiran, rm-sursilv, rm-sutsilv, rm-vallader, rm-puter (phab:T222426)
- More work on the test system for federated properties
- More work on displaying statements with federated properties (phab:T246606)
- Bridge: more work on generic error screens (phab:T241126)
- Bridge: warn the user that they are about to edit anonymously (phab:T246676)
- Work on a Wikidata distributed game for automated finding of references
- Automated finding references: build a scraper, simple value matching, item analyzer
- Have the focus on field when adding new element on Sense (phab:T203461)
- Add more properties to the PageImages list (phab:T249811)
- Use strict types everywhere in Wikibase (phab:T251382)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 15:33, 4 May 2020 (UTC)

Issue on Commons template drawing from Wikidata

commons:Commons:Village pump#More poorly curated WD content injected everywhere. Someone will probably want to follow up. - Jmabel (talk) 15:40, 4 May 2020 (UTC)

[11][12][13], and this was not even vandalism. Well, I do not particularly care about the Commons community discussions at the moment, but it is hard not to accept that he has a point.--Ymblanter (talk) 15:52, 4 May 2020 (UTC)

Did you mean to include mystery-meat links to revert pages? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:48, 4 May 2020 (UTC)

Yes, indeed, thanks. Will correct now.--Ymblanter (talk) 18:53, 4 May 2020 (UTC)

How to model graves?

A discussion was started in German about how to model graves: Wikidata:Forum#Wie werden Gräber modelliert? As the contributions are only by two people, we both wish for more involvement here to get a modelling guideline for graves. At the moment there are at least 3 ways to model a grave:

Hugo von Hofmannsthal's grave (Q91013436) as a separate item (about 830 items, about 150 separate graves assigned to persons as place of burial (P119))
Hugo von Hofmannsthal (Q51513) with the person (double modelling to see the difference for the same grave, in productive modelling either one or the other) (about 8050 items with coordinates, timelimit without coordinates)
grave of Walter Behrens (Q2543587) along with the cemetery Cemetery Atzgersdorf (Q758337)

Any application accessing graves needs to know the modelling scheme; without such a scheme it is difficult to query all the graves with SPARQL. Forces:

If a grave is used by more than one notable person, modelling an extra object to use it as place of burial (P119) for all the persons buried there seems to be a good idea.
If a grave is also considered to be a monument (e.g. Taj Mahal (Q9141)), it seems to be a good idea to model it separately.
Splitting a person from its place of burial (P119) information might create the need to access two items instead of just one, which has performance implications (# expensive calls).
Transition issues to a common scheme.

This is just one particular scheme question. How, in general, should we deal with such modelling issues? best --Herzi Pinki (talk) 09:50, 4 May 2020 (UTC)

I'm not sure if the item Hugo von Hofmannsthal's grave (Q91013436) was created solely because of the category on Commons that predates it, but not all graves are equal, and most don't warrant a separate item. Grant's Tomb (Q1025105) is notable. Arlington National Cemetery (Q216344) is notable, and contains many notable people, but not every headstone warrants a separate item (even if someone on Commons makes a category like Category:headstones of my great-great granddad photographed in 1998 by an orphan named Sven). All humans have heads, and most have arms, but we don't have an item for Barack Obama's head or William Shakespeare's left arm. For most people whose grave is known, I think place of burial (P119) should simply be the cemetery (or park, mausoleum, etc.), with relevant structured data with respect to the specific grave simply added as a qualifier to P119. -Animalparty (talk) 04:54, 5 May 2020 (UTC)

If there happens to be an item for a grave, you could link that with statement is subject of (P805) from the place of burial (P119) statement (sample). --- Jura 05:01, 5 May 2020 (UTC)
Jura, this would just duplicate redundant information (like exact location (grave#), coords, ref). If we use place of burial (P119) to set just a cemetery (or, in case of individual graves somewhere not in a cemetery, e.g. the Taj Mahal), it would be fine. But if we create an individual item for the grave (Hugo von Hofmannsthal's grave (Q91013436)) and set place of burial (P119) to that individual item, it would not help to reduce redundancy. place of burial (P119) allows both interpretations (added by @Marcus Cyron: [14]). --Herzi Pinki (talk) 08:32, 5 May 2020 (UTC)

The problem is the definition of "Ort". So "Wohnort" can be a city/village, but also a house. Usually we should go to the lowest possible item in the system. If it is a single grave - the grave. If it is a graveyard/cemetery/church/whatever, use this. If it is just a city/village/place - go for this. -- Marcus Cyron (talk) 10:41, 5 May 2020 (UTC)

In most cases, I don't think an item for the grave is useful (with grave, I mean graves like the ones for people on Wikidata:Lists/cemetery/Ireland/Glasnevin_Cemetery#location_of_burial, not a pyramid or Taj Mahal). For consistency's sake across a cemetery, I'd use statement is subject of (P805). I think the app for Père Lachaise Cemetery (Q311) is based on the use of the cemetery item as value for P119.

I'm not sure if there is a need for an item about Hugo von Hofmannsthal's grave (Q91013436). For the ones I created at Wikidata:Lists/cemetery/Ireland/Glasnevin_Cemetery#location, I thought having items was useful. OTOH, for many, I don't think we even have items about the person buried there. --- Jura 08:58, 5 May 2020 (UTC)

I was thinking about modelling the graves of honor in Vienna cemeteries (>1000 graves of honor) to find those with missing images of the gravestone e.g. for the next competition. Also to get lists of notable persons with missing items. I was looking for some guideline more than for individual advice and for transition from one model to the other. I would have modelled the location with the person and as qualifiers to the cemetery in place of burial (P119), while @D-Kuru: preferred the more atomic approach (give everything a WD-item like you would give everything an IP-address in the Internet of Things). At the moment I'm stuck in confusion. --Herzi Pinki (talk) 10:27, 5 May 2020 (UTC)

As a sample, this is what is available for Père Lachaise Cemetery (Q311). Maybe I can dig up the App, if you want to adapt it. --- Jura 10:42, 5 May 2020 (UTC)

Property to list elections' results on an area's item?

Hi! Are there properties so that, on an item for an area (a town, state, etc.), results for various elections within it can be listed? If not, this is useful data (that is pretty hard to find sometimes in the case of old elections) which could be used to get sets to make forecasts more easily. Does such a property exist/should it? DemonDays64 | Talk to me 01:50, 5 May 2020 (UTC) (please ping on reply)

You mean state governor or US president election results on every single town item? The data seems a bit too granular for Wikidata. Maybe Commons? --- Jura 06:08, 5 May 2020 (UTC)

statistic request

I was wondering whether I could monitor the number of changes to a single property in a certain context (let's say a country (P17)) and in a timeframe by individual users? E.g. to get an idea how many images have been added via image (P18) by each of them. Eventually to assign credits in competitions. best --Herzi Pinki (talk) 10:32, 5 May 2020 (UTC)

Items for countries are getting too large

My work with using Wikidata information in Commons infoboxes has run into several problems recently because including information from country items causes time-outs. Looking at France (Q142) as an example, it's now 1.6MB of data - which seems excessive. Looking through the item doesn't point to a single cause - there are multiple properties with many values that are adding up to cause issues. I don't have any solutions to suggest here, I'm just flagging the issue - does anyone have any thoughts on how to change things here? Thanks. Mike Peel (talk) 20:26, 24 April 2020 (UTC)

Could we move some of the statistics to a new item? For example, create item:Life expectancy of France, move all the data on life expectancy (P2250) to the new item and then link life expectancy (P2250) to the new item. From Hill To Shore (talk) 20:31, 24 April 2020 (UTC)

At Wikidata:Property_proposal/birth_rate, I suggested the use of "demographics of Norway" instead of Norway. Commons categories for some fields already display data from the country item, e.g. c:Category:Economy of Norway. Obviously, economy of Norway (Q1379783) could store that directly. --- Jura 20:43, 24 April 2020 (UTC)

I suppose it's a defect of the Wikidata API, that you are needing to retrieve every statement which has the particular country on the left hand side, when you probably only want a subset. At the same time, there's no way in the API to retrieve statements where the country appears on the right hand side, where it's more obvious that it wouldn't always be possible to return all matches in a single response. Ghouston (talk) 04:12, 25 April 2020 (UTC)

demonym (P1549) seems to me worthy of deletion once we have moved the relevant information over to lexemes. Maybe we can model public holiday (P832) in the other direction? ChristianKl ❪✉❫ 07:23, 26 April 2020 (UTC)
- Can Wikipedia fetch that data from lexemes though? Last I heard, it wasn't possible to fetch data that links back to an item, only data that the item itself links to. - Nikki (talk) 07:40, 26 April 2020 (UTC)

I'm not sure whether it's presently in the functionality of the plugins but it certainly should be sooner or later. ChristianKl ❪✉❫ 12:58, 30 April 2020 (UTC)

From Wikidata:Request_a_query#How_many_triples_for_Q30, it appears that most triples on country items come from quantity properties, item properties and (surprisingly) sitelinks. Monolingual strings don't seem to have much of an impact. This holds even if the approach there slightly overestimates quantity properties and references aren't fully factored in. --- Jura 05:41, 1 May 2020 (UTC)
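For anyone who wants a first look at where the bulk of a country item sits, the following is a rough sketch for WDQS, using France (Q142) from the example above. It is not the query from the linked request page. It only counts triples that have the item itself as subject, so qualifiers and references (which hang off statement nodes) and sitelinks (which point at the item from the article side) are not included.

```sparql
# Count triples per predicate with France (Q142) as subject.
SELECT ?predicate (COUNT(*) AS ?triples)
WHERE {
  wd:Q142 ?predicate ?object .
}
GROUP BY ?predicate
ORDER BY DESC(?triples)
```

Predicates in the p: namespace correspond to full statements and those in the wdt: namespace to their "truthy" simple values, so the same property can appear in both forms in the result.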

@Mike Peel: Have you reported this problem in Phabricator? It would be good to have some engineers look at it from the infrastructure side as well. Kaldari (talk) 20:09, 5 May 2020 (UTC)

@Kaldari: I haven't posted it on phabricator, as I can't point to a single cause (it seems to be a cumulative effect), or otherwise figure out a good way to report it. If you can think of a good way to report it, please go ahead and I'll comment where I can. Also, it's not as high priority as other issues I've posted on phabricator that are still waiting for a response (e.g., phab:T232927)... Thanks. Mike Peel (talk) 20:34, 5 May 2020 (UTC)

@Mike Peel: Do you have an API query that is reliably failing? Kaldari (talk) 21:01, 5 May 2020 (UTC)

>7,000,000 people

Wikidata now surpassed 7 million items about people (7,135,860 as of now [15]). I wonder if there is a way to break this down into groups where they come from. I imagine:

people in Wikipedia (initial group, continuous additions)
Peerage (ca. 600,000 Oct 2019)
Orcid
authors of scholarly articles (ca. ?)
etc.

Maybe this has already been done. --- Jura 09:27, 27 April 2020 (UTC)

A search for the most frequent occupations (in WDQS) finds:

politician (Q82955) 607249
researcher (Q1650915) 490418
association football player (Q937857) 260810
writer (Q36180) 246723
actor (Q33999) 222087
painter (Q1028181) 146924
journalist (Q1930187) 118402
university teacher (Q1622272) 103804

which seems a pretty healthy mix. Though a lot of people are missing that occupation (P106) statement too. ArthurPSmith (talk) 14:49, 27 April 2020 (UTC)
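A ranking like the one above could be produced with something along these lines. This is only a sketch and not necessarily the query that was run. It counts all occupation (P106) statements without checking that the subject is a human, on the assumption that P106 is used almost exclusively on people, and it may come close to the WDQS timeout.

```sparql
# Most frequent values of occupation (P106).
SELECT ?occupation (COUNT(?person) AS ?count)
WHERE {
  ?person wdt:P106 ?occupation .
}
GROUP BY ?occupation
ORDER BY DESC(?count)
LIMIT 10
```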

Yes, only 4.5 million have P106. Also, just 2 million have VIAF ID (P214). --- Jura 05:22, 28 April 2020 (UTC)
- Many of the entries may not be notable in themselves. They are just there to add structural connections between notable individuals. For example, if we have a notable grandfather and notable grandson, we will probably have a record for a non-notable parent to connect the two of them. The non-notable parent may not have sufficient source material to ever have occupation (P106) recorded and is unlikely to justify a VIAF ID (P214). From Hill To Shore (talk) 07:50, 28 April 2020 (UTC)
  - Wikidata:WikiProject Parenthood/reports/completeness attempts to assess that (some queries might be broken). Maybe 18% have a parent/child/spouse. --- Jura 12:20, 29 April 2020 (UTC)
  - Hundreds, maybe thousands, of Peerage-imported items are children who died in infancy or childhood, thus will likely never have an occupation (I suppose no value could be used for the sake of completionism). Similarly, many spouses of more notable persons may have no verifiable occupation other than housewife (Q38126150), homemaker (Q1934684) or socialite (Q512314). -Animalparty (talk) 20:33, 1 May 2020 (UTC)
    - There are 2.5 million items without P106, but just 600,000 from that import. --- Jura 00:48, 6 May 2020 (UTC)

Property creator icon

Hello everybody,

I think current property creator icon isn't very descriptive of user-right, so I have gone ahead and created one: (File:Wikidata property creator.svg)

The P in it stands for Property and the barcode is the Morse code of the + sign, so it is a descriptive form of P+, which can be easily recognised as Property creator. I have also created two other versions: [16] and [17], but from a short discussion with fellow admins over IRC, the choice is the one I uploaded to Commons.

As this is a big change, I would like to know what the community thinks of it. Regards. ‐‐1997kB (talk) 07:35, 28 April 2020 (UTC)

Hello! For me your proposition is visually more attractive than the current one. Good work! --Jahl de Vautban (talk) 08:11, 28 April 2020 (UTC)

I like the idea of a new icon as the old one is pretty ambiguous. I'm unsure about your proposal as the only way you'd visually know it's for a property creator would be knowing the meaning of the barcode. So I suppose it depends whether the icon is intended to be meaningful without prior context or just to act as a visual identifier. In terms of visual/artistic feedback I also find the very full square appearance of the barcode visually unbalances the icon (and a more minor thing, but I like having the blue in there somewhere). I like your thinking though and nice work, for the balance thing perhaps experiment with clipping the barcode into the shape of a + or a letter C? --SilentSpike (talk) 10:41, 28 April 2020 (UTC)

@SilentSpike: Well.. the idea of barcode representing + icon is from Wikidata logo in which barcode represents word WIKI, so I think it's as meaningful as site logo. In terms of artistic feedback I tried to keep it as simple as possible.

But with your feedback I have also tried something different:

PC with wikidata logo morse code [18]
P+ with wikidata logo morse code [19]
PC where C filled with morse of C [20]
P+ where + filled with morse of + [21]

IMO the following two of the above are the simplest and most meaningful: File:Wikidata property creator.svg and PC with wikidata logo morse code [22]. ‐‐1997kB (talk) 13:22, 28 April 2020 (UTC)

@1997kB: I'd agree with your two picks. I also had a thought that really as an international project the symbol should avoid English letters (P can be seen as a symbol for the property ID, but C is really for the English word "creator"). So I mocked up:

P with a magic wand [23]

as a more symbolic design idea. --SilentSpike (talk) 15:57, 28 April 2020 (UTC)

@SilentSpike: I like the idea of having a magic wand there, but I do not think a magic wand is something that can be associated with creators. Also, the more complex the shape gets, the harder it will be to use in topicons and userboxes, since complex shapes are hard to recognize at reduced size. ‐‐1997kB (talk) 03:29, 29 April 2020 (UTC)

Also I have tried adding blue in File:Wikidata property creator.svg. See [24] where P is filled with morse of wikidata logo. ‐‐1997kB (talk) 03:52, 29 April 2020 (UTC)

Perhaps your original design is best purely for following the KISS principle. Though I do feel like we could be missing a trick by not having something more immediately symbolic of creation than a barcode that needs to be decoded into the + sign. It somewhat goes against the nature of an icon. --SilentSpike (talk) 09:24, 29 April 2020 (UTC)

P in darkred (color change 8b0000) with the last 7 bars of the logo and a little + above (darkred). —Eihel (talk) 12:17, 29 April 2020 (UTC)

Yeah, IMO File:Wikidata property creator.svg and [25] are the best ones for now. But let's keep this thread open for at least a week and in the meantime I will also try some other designs. ‐‐1997kB (talk) 15:26, 29 April 2020 (UTC)

Good work everyone. If it's time for choices: 1st preference : red P and barcode by 1997kB, it has style. 2nd choice : P with a magic wand by SilentSpike, it's cute. Last choice : P+ where + filled with morse of + by 1997kB, because it looks like my first choice. The C is just english. —Eihel (talk) 17:19, 28 April 2020 (UTC)
@Eihel: How about File:Wikidata property creator.svg? It's similar to your first choice, but the Morse code is that of the + sign; the Morse code of + is symmetrical about the y-axis, while that of C is not. ‐‐1997kB (talk) 03:29, 29 April 2020 (UTC)
corrected —Eihel (talk) 09:03, 29 April 2020 (UTC)

Since it's been a week, I have gone ahead and added icon to the templates. ‐‐1997kB (talk) 02:58, 6 May 2020 (UTC)

Elections with one candidate

Election with one candidate. So no voting took place and the person was elected. How can I show that in Wikidata? 2006 Mayor of Kato Polemidia Municipality Elections (Q93160832). Xaris333 (talk) 01:05, 4 May 2020 (UTC)

This is a so called tacit election. You could add instance of (P31)=tacit election (Q1760295) --Pasleim (talk) 08:33, 5 May 2020 (UTC)

@Pasleim: why not uncontested election (Q85811908)? Are they different things? Xaris333 (talk) 16:44, 5 May 2020 (UTC)

uncontested election (Q85811908) is slightly more general than tacit election (Q1760295). In an uncontested election (Q85811908) it is possible that voting still takes place even though there is only one candidate. In some election systems, even if there is only one candidate, they can fail because a minimum voter turnout is required. In a tacit election (Q1760295) no voting at all takes place. --Pasleim (talk) 16:54, 5 May 2020 (UTC)
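To see how the two classes are being used in practice, a small WDQS sketch such as the following lists elections typed directly as either one. It assumes the classification is stated with instance of (P31) and does not follow subclass relations.

```sparql
# Elections marked as tacit (Q1760295) or uncontested (Q85811908).
SELECT ?election ?electionLabel ?type ?typeLabel
WHERE {
  VALUES ?type { wd:Q1760295 wd:Q85811908 }
  ?election wdt:P31 ?type .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
```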

Differentiate between part of constellation and/or asterism

The constellation Draco contains all stars within the yellow borders while the asterism Draco consists only of the stars connected by the green lines.

The star HD 139357 (Q2044186) is part of the constellation Draco (Q8675), but not part of the asterism, while Thuban (Q15714) is part of both the constellation and the asterism.

Describing a star as being part of a constellation seems to be expressed with HD 139357 (Q2044186) constellation (P59) Draco (Q8675), and a constellation having stars with Draco (Q8675) has part(s) (P527) HD 139357 (Q2044186). The constellation can be described with instance of (P31) constellation (Q8928).

How best to describe the difference between an asterism and a constellation? Should there be an asterism Draco and a constellation Draco? Is there a property to make a star part of an asterism?

Looking forward to reading your ideas. regards Ogmios ^(Tratsch) 11:40, 5 May 2020 (UTC)

For instance, Big Dipper (Q10460) has part(s) (P527) Alpha Ursae Majoris (Q13084), Alpha Ursae Majoris (Q13084) part of (P361) Big Dipper (Q10460) Ghuron (talk) 18:03, 5 May 2020 (UTC)
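Under that modelling, asterism membership and constellation membership can be read side by side. The query below is a sketch for WDQS, assuming asterism members carry part of (P361) pointing at the asterism item and that constellation (P59) is stated separately on the star.

```sparql
# Stars that are part of the Big Dipper (Q10460), with their constellation.
SELECT ?star ?starLabel ?constellation ?constellationLabel
WHERE {
  ?star wdt:P361 wd:Q10460 .                   # part of: the asterism
  OPTIONAL { ?star wdt:P59 ?constellation . }  # constellation, if stated
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

The same pattern would apply to Draco if a separate asterism item were created alongside the constellation item.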

How long does it take for a change to be synced?

Hi, I changed a value in Wikidata about an hour ago (mobile tagline in a company's page) and it still doesn't seem to be synced. What can I do?

Hi, you can purge the page by adding ?action=purge to the end of its URL, and your changes should appear. Modeum (talk) 18:17, 5 May 2020 (UTC)

P410: military rank

I'd like to propose to change this property a bit to include ranks of other services, e.g. police, border guard, gendarmerie etc. Right now military or police rank is described as military rank in English and has Wikidata item of this property (P1629) military rank (Q56019) only. I think it should be renamed to service rank with an alias military rank. Inclusion of police rank (Q19476593) was proposed in 2018 by Andreasmperu with no response, in 2019 [26] Lord Yeager added police rank (Q19476593) as allowed value type. Wostr (talk) 20:00, 5 May 2020 (UTC)

Could an admin please look at this property proposal

Hi all

Could an admin please look at Wikidata:Property proposal/Included in curricula, it has been open for almost a month with 11 supports and only one oppose (the two opposes at the top just wanted the inverse property which has now been done).

Thanks very much

--John Cummings (talk) 21:55, 5 May 2020 (UTC)

I wonder if you actually read participants' comments. --- Jura 22:21, 5 May 2020 (UTC)

Mix'n'Match disappointment

I set up Mix'n'Match catalogue 3536 today. It has 24.5K public bodies in the United Kingdom, but only ~700 were matched automatically. I'm wondering whether I could have done something to improve the matches, or whether the issue is that too many of the organisations are not marked up as such in Wikidata? I've only checked a small sample, but Antrim Borough Council (Q16970896), for example, is said to be an "instance of administrative territorial entity of Northern Ireland". Note, though, that it failed to match the exact string "Bromsgrove District Council" to the identically-labelled (in English) Bromsgrove District Council (Q73072550), which is, indirectly, an instance of a subclass of organisation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:06, 5 May 2020 (UTC)

Please, merge

Please, merge Q4293437 and Milicz (Q11781234). 217.117.125.72 08:56, 6 May 2020 (UTC)

Can be merged but please note that "Милич" is the cyrillization of Milicz, Milić, Milich... It may be ambiguous. --Wolverène (talk) 09:00, 6 May 2020 (UTC)

Discussion about ISIL (P791) and formatter URL (P1630)

I have started a discussion about removing all currently deprecated values for property ISIL (P791) listed at property formatter URL (P1630). Details can be found at this link, please join if you're interested. --Sannita - not just another it.wiki sysop 10:30, 6 May 2020 (UTC)

Wikidata:Requests for permissions/CheckUser#BRPever

In accordance with the policy, here is a notice of my candidacy for CheckUser.-BRP^ever 01:26, 7 May 2020 (UTC)

The Peerage and gender

I'm doing some work on fixing misgendering and stumbled upon a huge amount of items that, apparently, have incorrect sex or gender (P21) sourced from The Peerage person ID (P4638). For example, the source states that Sir Joseph Fuller (Q75931107) is a female (F) and uses the pronoun "she", but also that Fuller is "son of" and appointed "Knight Grand Cross". This is highly unlikely. There are other cases where The Peerage states that a male took a married family name from a female. Here's a query to find suspicious cases:

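SELECT ?item
WHERE {
  ?item wdt:P4638 ?thepeerageid .          # has a The Peerage person ID
  {
    ?item wdt:P21 wd:Q6581097 .            # sex or gender: male
    ?item wdt:P735/wdt:P31 wd:Q11879590 .  # given name is a female given name
  }
  UNION
  {
    ?item wdt:P21 wd:Q6581072 .            # sex or gender: female
    ?item wdt:P735/wdt:P31 wd:Q12308941 .  # given name is a male given name
  }
}
LIMIT 100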

How should we handle this? --MarioGom (talk) 23:16, 30 April 2020 (UTC)

Fix it! The Peerage has no shortage of errors, and the mass import has caused no shortage of headaches. Yay! -Animalparty (talk) 23:47, 30 April 2020 (UTC)

I fixed some low-hanging fruit, but this is going to require some semi-automated heuristics with quite a lot of refinement. I'm going to move on to other tasks at the moment, but here's a better query to detect possible errors:

SELECT ?item
WHERE {
  hint:Query hint:optimizer "None" .
  ?item wdt:P4638 ?thepeerageid .
  {
    ?item wdt:P21 wd:Q6581097 .
    ?item wdt:P735/wdt:P31 wd:Q11879590 .
  }
  UNION
  {
    ?item wdt:P21 wd:Q6581072 .
    ?item wdt:P735/wdt:P31 wd:Q12308941 .
  }
  
  # honorific prefix: Sir
  #?item wdt:P511 wd:Q209690 .
  
  ?item wdt:P735 ?given .
  FILTER (?given NOT IN (
    wd:Q16652258, # Joan (female)
    wd:Q1484457, # Joan (unisex)
    wd:Q18001597 # Christian (male)
  ))
  
  ?item p:P21 ?sexstatement .
  ?sexstatement prov:wasDerivedFrom ?sexref .
  ?sexref pr:P248 wd:Q21401824 .
  
}
LIMIT 100

and here's a couple of examples of more detailed references I used to denote the deviation: Lord Esmé Gordon (Q75281890), Sir Thomas George Wilson (Q76216976). --MarioGom (talk) 10:17, 1 May 2020 (UTC)

Does the Peerage website correct errors that we point out in a reasonable time frame? It would be much easier to have them correct errors as we detect them, rather than have two values here, with the incorrect one deprecated. We used to have a contact at VIAF that corrected errors within a month, but that appears to have stopped last year. --RAN (talk) 02:22, 2 May 2020 (UTC)

Weeeelll ... here's what the introductory page at The Peerage, written by Daryl Lundy, says. "The site is the result of around 17 years of work by one (somewhat eccentric) person collating information on the British Peers (and some European royals), and then entering it into a range of various genealogy programs... NOTE: this site is a work in progress, due to new reference sources becoming available for these families as well as new births, deaths and marriages. It is possible a few errors have crept in, so please pay attention to the credibility of each of the citations given when evaluating the quality and accuracy of this data. Your help in finding, reporting and fixing any errors is hugely appreciated. I hope you enjoy the information I have collected and presented here. Please contact me via email [email protected] with any corrections or updates you might have. I will do my very best to continue to expand and evolve this site on a regular basis." Levana Taylor (talk) 03:29, 2 May 2020 (UTC)

I've said it before, and I'll say it again: The Peerage, impressive though it is, is literally the work of one random guy. It should not be taken as anything but an amateur project (just like Wikidata, but less readily changeable). Errors, discrepancies, and redundancies abound, some of which are inherent to historical compilation, while others are simple errors of omission or fact. The fact that it's easily accessible online doesn't make it an authoritative source. Many of the apparent mis-genderings appear to be reciprocal, e.g. the "female" Donald Finlay (Q75910034) is married to the "male" Isabel Heathfield Eliott (Q75910031), which suggests a simple coding error, and which might make bot-assisted corrections easier. Unfortunately, the misgenderings also affect values of mother (P25) and father (P22), which would also need to be swapped for all children of misgendered items. – The preceding unsigned comment was added by Animalparty (talk • contribs) at 04:11, 2 May 2020 UTC (UTC).
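The reciprocal pattern described here can be turned into a narrower candidate list. The following is a sketch only, reusing the item IDs from the queries earlier in this thread and assuming the couple is linked with spouse (P26); the given-name check is a heuristic, so results still need review by hand before any swap of P21, P22 or P25.

```sparql
# Couples with Peerage IDs where the "female" spouse has a male given name
# and the "male" spouse has a female given name.
SELECT ?a ?aLabel ?b ?bLabel
WHERE {
  ?a wdt:P4638 ?idA ;
     wdt:P21 wd:Q6581072 ;              # stated as female
     wdt:P26 ?b .                       # spouse
  ?b wdt:P4638 ?idB ;
     wdt:P21 wd:Q6581097 .              # stated as male
  ?a wdt:P735/wdt:P31 wd:Q12308941 .    # a has a male given name
  ?b wdt:P735/wdt:P31 wd:Q11879590 .    # b has a female given name
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
```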

I wouldn't rely on given names only to determine sex or gender (P21). Also, if "stated in" is present, it should be removed or the value deprecated otherwise the reference becomes incorrect. --- Jura 06:10, 2 May 2020 (UTC)

Why don't we accumulate all errors here: Wikidata:WikiProject Authority control/The Peerage errors, and let Darryl@thepeerage know that we store them all in one place? We should also keep a query of all the entries with duplicate values, like we do at the VIAF error page. I looked over my old emails and Darryl responded the same day and made corrections I pointed out to him. I just emailed Darryl and sent him the link to the new error page. Can we migrate the gender errors detected there? No data set is error free, it is just whether the errors can be corrected or not. Also note that we are creating errors on our end by incorrectly merging his data to the wrong person using the Mix-N-Match program. I have been correcting people who died before they were born, and people over 120 years old, errors that we caused by merging people of the same name, but the wrong people. --RAN (talk) 17:10, 2 May 2020 (UTC)

Hmmm... I've corrected dozens of errors without keeping track of them. Is there some way of searching my edit history for items with Peerage IDs? (Not that I think going back and hunting for this stuff is a particularly good use of time, but if it was easy, I might do it.) Levana Taylor (talk) 19:34, 2 May 2020 (UTC)

I have added a section with the latest version of the query: Wikidata:WikiProject Authority control/The Peerage errors#Incorrect gender. --MarioGom (talk) 09:31, 7 May 2020 (UTC)

Wikidata user growth is increasing

In May 2016 we had 7882 active editors and in April 2018 we had 9734, which is roughly an added 10% per year. In March 2020 we had 14163 active editors, which suggests we grew 20% per year over the last two years, which is double the user growth in the years before.
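As a rough check of these rates, the implied compound annual growth can be worked out from the three editor counts, treating each interval as about two years (an approximation, since each gap is actually 23 months):

```latex
\[
\left(\frac{9734}{7882}\right)^{1/2} - 1 \approx 0.111
\qquad\text{and}\qquad
\left(\frac{14163}{9734}\right)^{1/2} - 1 \approx 0.206
\]
```

That is about 11% and 21% per year respectively, in line with the "roughly 10%" and "20%" figures and with the observation that growth has approximately doubled.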

I'm very happy that we succeeded in improving on the metric I consider the most important for Wikidata. We still have plenty of issues on Wikidata but having more people contributing on our project means more people fixing errors. ChristianKl ❪✉❫ 10:03, 5 May 2020 (UTC)

If I understand the metric correctly, it includes users who moved or deleted pages at Wikipedia. Do we know how the users who actually edited Wikidata evolved? --- Jura 10:15, 5 May 2020 (UTC)

It's not a perfect measurement. It includes users that move or delete more than 5 pages per month. Given relatively steady user counts at the individual Wikipedias, it would surprise me if there were suddenly hundreds more of those. ChristianKl ❪✉❫ 14:41, 5 May 2020 (UTC)
- Given that the absolute number is probably overestimated, the increase would be even larger if edits at Wikipedia are stable. --- Jura 10:09, 7 May 2020 (UTC)

Mainpage:news

Hi. wikidata:news is full of Wikidata milestone news and shows no interesting data for those who visit main page. Please remove it from main page. دوستدار ایران بزرگ (talk) 08:29, 6 May 2020 (UTC)

What replacement do you suggest? --- Jura 11:16, 6 May 2020 (UTC)

You can use many things. I propose Wikidata:Showcase_items or newly created properties introduced in Wikidata:Status updates or Wikidata:Map data. دوستدار ایران بزرگ (talk) 18:50, 6 May 2020 (UTC)

Yeah, it's pretty un-newsworthy. There's really no special status at all to being the x millionth piece of data, (but maybe it makes the robot who probably created it happy). It would be nice to see how people in the real world are actually using any of the data. Has Wikidata had any transformative effects on society? Is it increasing the pace of medicinal research and drug discovery? Has it powered the spread or translation of knowledge into new communities? If so, it would be nice to know. It'd be nice to see some of the fruits of the labor of us bots and humans, and would probably help persuade more people to donate or improve data. -Animalparty (talk) 02:51, 7 May 2020 (UTC)

@Doostdar: Wikidata:Showcase_items isn't really updated. Maybe we could try to select recent items that go beyond a given threshold in terms of statements. --- Jura 06:33, 7 May 2020 (UTC)

Office and office holder

Is it possible to have one item for the office and another item for the office holder, even if one of them has no article in the Wikipedias, or do they have to be merged into one? I see that there are for minister and ministry, but, for instance, only for praetor and not for praetorship. I think they are different because the office holder refers to the people in office and the office to the office itself. --Romulanus (talk) 10:08, 6 May 2020 (UTC)

Generally the value for position held (P39) is the item for the office held ("praetor").

For other uses, I suppose you could create a property for "office of the praetor", if there are some statements that could be made about it. In general, I'd just use "praetor".

It doesn't really matter if Wikipedia has articles about any of them. --- Jura 11:17, 6 May 2020 (UTC)

To contextualize Romulanus' question, the discussion arose from this merge and this one. --Jahl de Vautban (talk) 11:45, 6 May 2020 (UTC)

I do not argue with the value to be used with position held (P39), but if possible the existence of both items on their own. The merges Jahl de Vautban mentions are what led me to ask. I think they're not the same. --Romulanus (talk) 11:55, 6 May 2020 (UTC)

I'm the one who split one into "decemviri" and "decemvir" ;) It's a bit like council of ministers and minister.

Q663395 (Roman magistrature) and Q20778343 (Roman magistrate) are different as both are classes. I'd keep them separate as well. --- Jura 12:02, 6 May 2020 (UTC)

@Jura1:, I think I more or less get it for the decemviri and the decemvir. To put it in other words, the body of the decemviri is composed of ten decemvir, just like a body of tresuiri is composed of three triumvir, correct? If that's the case, I agree that my merge should be undone.

However, I fail to understand the difference between a magistrature and a magistrate. Could you clarify it? --Jahl de Vautban (talk) 12:29, 6 May 2020 (UTC)

Maybe "the government" and "high-ranking government official" could compare? magistrature could link magistrat with has part(s) of the class (P2670). --- Jura 06:35, 7 May 2020 (UTC)

@Jura1: well, I've tried to think about it but I still can't understand what would be the justification for having "Roman magistratures" and "Roman magistrates", even when I compare with modern-day countries. Anyway, I'm in the minority here and I won't fight over it. @Romulanus: you can roll back my merges. --Jahl de Vautban (talk) 08:09, 7 May 2020 (UTC)

Both would be in singular. --- Jura 10:05, 7 May 2020 (UTC)

Mediawiki Infobox gene

Hi!

On a MediaWiki installation that is not part of the WMF project cluster, I've been trying to use v:Module:Infobox gene so that the information displayed in each gene box on e.g. Wikipedia or Wikiversity shows up, but it doesn't work. There are statements in w:Module:Infobox gene that call the information from Wikidata. How should these be modified, if possible, to call the data from Wikidata? --Marshallsumter (talk) 04:21, 7 May 2020 (UTC)

Indicating uni and department as employer?

Is there an inverse of of (P642) that can be used when stating an employer? Most academics that list an employer just list the university, but the department would also be very useful. Is there a 'more specifically' qualifier to use, as in: person has employer = university ABC, qualifier 'more specifically' = department of XYZ? (E.g. Axel Zeitler (Q60531603) has the employer=University of Cambridge (Q35794), more specifically=Cambridge University Department of Chemical Engineering and Biotechnology (Q5025585).) Any thoughts? T.Shafee(evo&evo) (talk) 02:00, 6 May 2020 (UTC)

You're not employed by your department, you're employed by the larger organization. But perhaps affiliation (P1416) would be appropriate here, either as a top-level statement or as a qualifier on the employer (P108) statement. —Scs (talk) 12:41, 6 May 2020 (UTC)

@Scs: Good idea. Currently affiliation (P1416) can't be used as a qualifier of employer (P108), but it might be a logical location. T.Shafee(evo&evo) (talk) 05:12, 7 May 2020 (UTC)

It looks like part of (P361) is currently used for that, with over 7000 instances, e.g., Jan Harm Tuntler (Q21552450). Another alternative would be member of (P463), if there's a desire to change it. Ghouston (talk) 00:20, 8 May 2020 (UTC)
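The existing usage mentioned here can be inspected with a query along these lines. It is a sketch for WDQS, assuming the department is recorded as a part of (P361) qualifier on the employer (P108) statement, with University of Cambridge (Q35794) from the example above as the employer.

```sparql
# People employed by the University of Cambridge (Q35794) whose employer
# statement carries a part of (P361) qualifier naming a department or unit.
SELECT ?person ?personLabel ?unit ?unitLabel
WHERE {
  ?person p:P108 ?employment .
  ?employment ps:P108 wd:Q35794 ;   # main value: the university
              pq:P361 ?unit .       # qualifier: the more specific unit
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
```

Swapping pq:P361 for another qualifier property, such as pq:P1416 or pq:P463, would show whether the alternatives discussed in this thread are already in use.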

Should labels of WikiProjects include "Wikipedia:" or not?

Recently @Sawol:'s edit on WikiProject Korea (Q8503515) raised this issue: for many WikiProjects, at least in "en", the labels omit the "Wikipedia:" namespace. Should they keep omitting it, or should the "Wikipedia:" namespace be restored? --Liuxinyu970226 (talk) 06:47, 6 May 2020 (UTC)

Needs help from users who usually edit those WikiProjects' items: @Harej, Bjung, Ricordisamoa, WhisperToMe: --Liuxinyu970226 (talk) 06:49, 6 May 2020 (UTC)

@Sawol: Pardon?! --Liuxinyu970226 (talk) 07:00, 6 May 2020 (UTC)

@Sawol: Please do not re-add "Wikipedia:" without consensus here. --Liuxinyu970226 (talk) 02:20, 7 May 2020 (UTC)

Those items should not use the namespace prefix "Wikipedia:", as labels are given per language, not per project and there may be non-Wikipedia WikiProjects linked to the item. ---MisterSynergy (talk) 07:17, 6 May 2020 (UTC)

In MediaWiki the default name of that namespace is "Project:". I guess that works? 62 etc (talk) 08:41, 6 May 2020 (UTC)

I would keep the prefix, depending on the language label, if it is meant to have interwiki links to project pages. Otherwise, it gets confusing with projects that have article space articles. See WikiProject Women in Red (Q23875215) and Women in Red (Q43653733). --MarioGom (talk) 08:45, 6 May 2020 (UTC)

@MarioGom: I'm afraid that saying articles as "WikiProjects" are logically wrong, maybe the description should be modified, suggest to say "a movement that..." --Liuxinyu970226 (talk) 02:20, 7 May 2020 (UTC)

@Liuxinyu970226: I can't make any sense of that last thing you said. I can't even parse it. Could you reword? - Jmabel (talk) 15:38, 7 May 2020 (UTC)

Images for countries

User:Victor Knox raises an interesting problem: should we have image (P18) for countries, and if yes, what type? I see several alternatives, let's discuss (and probably vote):

1. Location map (but we have specific location map (P1943))
2. Topographic map
3. Satellite image
4. The best-known sight
5. Collage of different sights (but we have montage image (P2716))
6. Some other variant?

Personally I would prefer 2 or 3. --Infovarius (talk) 22:13, 6 May 2020 (UTC)

IMHO satellite image. strakhov (talk) 04:05, 7 May 2020 (UTC)

Satellite image won't show borders, and for landlocked countries will just be a rectangle of terrain. Or am I missing something? - Jmabel (talk) 15:42, 7 May 2020 (UTC)

Verifiability and notability

As a followup to Wikidata:Administrators' noticeboard#Please restore Q73707267 and Q92453624: it's beyond time that we have a serious conversation about clarifying the second (and arguably third) criteria of Wikidata:Notability, especially in connection with living persons' items. We also need to draw a line on what sources are acceptable; too often I see items created by folks on themselves with nothing but e.g. Instagram links. We need to ensure we send the message that we are not an indiscriminate collection of information, and that "serious and publicly available" does not just mean someone's own social media, or even their biography in an "About Us" page.

I'm open to suggestions, but we need clarity, and pretty urgently, given how many of these items are created each day.--Jasper Deng (talk) 00:46, 4 May 2020 (UTC)

In my opinion, items about entrepreneurs should be deleted if the companies themselves aren't notable, or if the item doesn't mention any companies at all. --Trade (talk) 01:17, 4 May 2020 (UTC)

Specific guidelines like that could be useful, but I'm looking more for general rules applicable to anything, not just entrepreneurs. This is especially important as many outsiders see WD as a place for SEO, given its use by Google.--Jasper Deng (talk) 04:09, 4 May 2020 (UTC)

"aren't notable" - but each company, no matter how small it is, may be mentioned or described in numerous primary sources, and sometimes in aggregations thereof. There is a Chinese website listing most of the companies (more than 200 million) registered in China (it is an aggregation source, which does not indicate any notability in Wikipedia), along with their managers and stakeholders. This means basically all Chinese companies can have an item. I am personally not opposed to such an idea, as some other projects (such as OpenStreetMap) allow them (and we may make Wikidata data useful in OSM), but I am afraid that most users do not want Wikidata to be a yellow pages.--GZWDer (talk) 08:22, 4 May 2020 (UTC)

I don't mind WD becoming a Yellow Pages. What I do mind, however, are non-communicative SEO users dumping hordes of terribly made self-promoting items, constantly recreating them through an army of sock puppets and various IP addresses while expecting WD to clean up after them. These people are not just annoying, they are actively malicious. @GZWDer: --Trade (talk) 11:15, 4 May 2020 (UTC)

Also, although I oppose exclusion of primary sources (as they are still serious public data, see Wikidata:Requests for comment/Handle genealogical information; and some kinds of subjects, like paintings, are mainly described in primary sources), in the extreme, allowing them means we can create items for more than 200 million dead people in the United States (from various US censuses), plus 300 million living ones (mentioned in at least one public record - one may even argue that we can import all such records to Wikisource and link them in Wikidata, though this has not yet happened).--GZWDer (talk) 08:32, 4 May 2020 (UTC)

Primary sources cannot be the only source of information on living people; if e.g. one living person is needed as a parent of another, notable one, then we can apply criterion 3 instead, though ideally, we should only use secondary sources.--Jasper Deng (talk) 08:49, 4 May 2020 (UTC)

But this may still mean we can create items for all companies appearing in primary sources (or even yellow pages, which may be secondary) and their managers and stakeholders (for private companies there will only be few ones) will be notable per #3.--GZWDer (talk) 09:01, 4 May 2020 (UTC)

In principle, we can require for living people that all statements are cited to reliable sources, to start with, but this still has two problems: (i) what sources are reliable - for example, the English Wikipedia community recently decided that the Daily Mail is not reliable; should we follow suit? Are databases reliable sources? (ii) if we only keep the statements which are referenced by reliable sources, and the number of these statements is greater than zero - is the item notable? I would argue it is still not necessarily notable (at the end of the day, a telephone book is a reliable source and contains all people, or at least all people with a telephone), and notability according to criterion 2 should be something like: one should be able to create a project page (for example, a Wikipedia article in any language) which still conforms to the policies of the project. But I do not have the slightest idea how to formulate this.--Ymblanter (talk) 10:52, 4 May 2020 (UTC)

"one should be able to create a project page" - this will exclude many things such as scientific articles and paintings. Even limited to people (or living ones), should we include individuals described in a larger article like w:Family_of_Barack_Obama#Malia_Obama_and_Sasha_Obama or w:Lupton family? and which level of "description" is needed for #2?--GZWDer (talk) 11:01, 4 May 2020 (UTC)

I think these are arguably notable according to criterion 3, Structural need.--Ymblanter (talk) 11:29, 4 May 2020 (UTC)

One of the reasons I wanted a Wikidata entry as a Wikimedian is that despite my images being released under a creative commons license, or even ones I loaded as public domain, I am still contacted by the organization that wants to publish them, and asked to sign a release, or give permission in an email. See for example this image. I don't want my entry removed based on the malice, whim, or bias, of an individual editor with deletion rights. Read how Isaac Newton deleted his rival Robert Hooke. --RAN (talk) 16:44, 4 May 2020 (UTC)
Do we want to discuss the status of Wikidata:Verifiability also? It has been barely touched in years; it would need some significant changes to match current practice here I think... ArthurPSmith (talk) 17:34, 4 May 2020 (UTC)
- @ArthurPSmith: We do need to revisit that too, sorry for not making it clear from my initial comment. This is the primary reason why some projects like the English Wikipedia are distrustful of our data, and it is the primary reason why notability and living persons cannot be enforced as well as they could be. Also, for those of you who want to cry "WD:UCS": UCS is useful only in cases where consensus is obvious and clear, which is quite clearly not the case here.--Jasper Deng (talk) 20:04, 4 May 2020 (UTC)
  - Not only verifiability should be revisited, but also every issue related to content, e.g. NPOV (m:Neutral point of view says some have it and some do not, how should it be properly settled in Wikidata?), original research (Wikidata de facto allows some original research, such as mass-adding main subject (P921)), and also deletion policy (should we set up some speedy deletion criteria and require discussion for deletions outside such criteria?), etc.--GZWDer (talk) 00:04, 5 May 2020 (UTC)
    - I am not sure how you could have NPOV on Wikidata. On a Wikipedia it is fairly straightforward; you consider the balance of sourced material and ensure text is phrased to put more weight on the opinion accepted by the majority of sources, with a smaller note on the alternate view. How do we set weighting on Wikidata beyond uploading multiple versions of a statement and setting preferred rank on the one with the most reliable sources? If we have a controversial statement only mentioned in a minority of sources, do we include it because it is a statement verifiable to at least one source or exclude it because it is not a neutral balance of source material? From Hill To Shore (talk) 00:19, 5 May 2020 (UTC)
      - @From Hill To Shore: NPOV is inherently a smaller problem on WD than on other wikis. The descriptions of items are usually fairly mechanical, and more watered down than Wikipedia lead sentences; if we have doubts, NPOV can be applied in the usual way to the description. The trickiest case is, as you hinted at, conflicting sources on the same claim. Ranking of statements should reflect NPOV in that case. Otherwise, the usual concept of due weight is not as applicable; for example, personal data like height that usually only goes in the infobox gets similar prominence to the person's occupation, which usually takes the majority of a Wikipedia person article. I'm thus not too worried about NPOV. But verifiability in the first place is a more fundamental problem that needs to be addressed urgently.--Jasper Deng (talk) 08:59, 5 May 2020 (UTC)
        In my experience, the area where we run into the most NPOV issues in Wikidata is in determining what country places are in. This query shows disputed claims marked with statement disputed by (P1310).
        
        And field of work (P101), occupation (P106) where not absolutely clearcut. - Jmabel (talk) 15:00, 5 May 2020 (UTC)
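As a rough illustration of the kind of query mentioned above (the original link is not preserved here), the sketch below builds a SPARQL string that looks for country statements carrying a statement disputed by (P1310) qualifier. The choice of country (P17) as the main property, the helper name, and the LIMIT are assumptions for illustration, not the query that was actually linked.

```python
def build_disputed_country_query(limit: int = 100) -> str:
    """Return a SPARQL query listing country (P17) statements that carry
    a 'statement disputed by' (P1310) qualifier.

    Assumption: P17 is the property under dispute; the query linked in
    the discussion may have looked different.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"""
SELECT ?place ?placeLabel ?country ?countryLabel ?disputedBy ?disputedByLabel WHERE {{
  ?place p:P17 ?statement .            # full statement node, not just the truthy value
  ?statement ps:P17 ?country ;         # the claimed country
             pq:P1310 ?disputedBy .    # qualifier: who disputes the claim
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Print the query so it can be pasted into a SPARQL endpoint by hand.
    print(build_disputed_country_query(10))
```

The function only produces text; running it against a query service is left to the reader.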
        For disputed places, Wikidata:Property proposal/Recognition, recognized by, not recognized by, jurisdiction status was proposed as a solution.--GZWDer (talk) 00:40, 6 May 2020 (UTC)
        There is also a fair number of issues with descriptions, because that's what is displayed in the Android Wikipedia application. WD:D is quite clear, and yet saying "far right politician" in a description is different from "politician", but this is done all the time. Even the name of the item does sometimes matter (for example, the insistence of masculinists on using Marc Lépine (Q575023)'s birth name, because it sounds more Arabic). That seems a bit more problematic than having Joe Random maybe getting some SEO, especially since there are much better ways than using Wikidata (like having a regular website that would just appear first and where no one else has control). --Misc (talk) 14:26, 8 May 2020 (UTC)
If it were up to me, I'd insist all information be verifiable to reliable sources, and for things that are not trivial like height and weight, ideally something independent of the subject. Note that the Wikipedia standard of using secondary sources is relaxed here. I don't want this discussion to peter out like many others of its kind have, because I think we do need to take this seriously.--Jasper Deng (talk) 10:10, 8 May 2020 (UTC)

It would be less of a problem if adding sources was less cumbersome. --Trade (talk) 13:00, 8 May 2020 (UTC)

This basically means Wikidata:Requests_for_comment/Verifiability_and_living_persons#General_guideline plus secondary sources (or w:WP:ABOUTSELF) should be used for living people. But this does not solve the question of what is a reliable source.--GZWDer (talk) 15:57, 8 May 2020 (UTC)

Notability of Corona deaths

Hoi, I notice that many people who died of COVID-19 are included in Wikidata who are not notable. They are stated to be policemen, inmates, even relatives of celebrities. The level of notability in Wikidata is not high, but this is imho pushing it too far. What do you think? Thanks, GerardM (talk) 01:52, 7 May 2020 (UTC)

I say give up. They are dead, and they are data. One day we will all be in Wikidata. Robots will curate us when we're gone. All Watched Over by Machines of Loving Grace (Q4729854). -Animalparty (talk) 02:53, 7 May 2020 (UTC)

"One day we will all be in Wikidata[citation needed]"? How? Nomen ad hoc (talk) 06:09, 7 May 2020 (UTC).

Someone will invent a structural need, or simply mass import public birth and death records, probably. -Animalparty (talk) 22:50, 7 May 2020 (UTC)

Well, that would sound concerning for me, Animalparty. Nomen ad hoc (talk) 14:14, 8 May 2020 (UTC).

I am going to San Junipero (Q27877321) when my Wikidata entry gets deleted again. --RAN (talk) 05:43, 7 May 2020 (UTC)

We might look to en:Wikipedia:September 11 victims for guidance, but of course there are far more victims of COVID-19, and Wikidata's notability criteria are weaker. Bovlb (talk) 03:01, 7 May 2020 (UTC)

Not as useful, perhaps, but for completeness see also en:https://en.wikipedia.org/wiki/List_of_deaths_due_to_coronavirus_disease_2019, en:Wikipedia:Articles for deletion/List of deaths from the 2019–20 coronavirus pandemic, and en:Wikipedia:Articles for deletion/List of deaths from the 2019–20 coronavirus pandemic (2nd nomination). Bovlb (talk) 03:43, 7 May 2020 (UTC)

I think you are trying to apply English Wikipedia rules to Wikidata. If they actually exist, and can be referenced, they can be included, if someone is willing to do the work of entering them. --RAN (talk) 04:01, 7 May 2020 (UTC)
- I'm not trying to apply English Wikipedia rules to Wikidata. I am seeking guidance where it is available. They have dealt with similar issues in the past. We are not bound by their rules, but that doesn't mean we can't learn from them. Bovlb (talk) 04:11, 7 May 2020 (UTC)

If they have an obituary we can link to, then they are Wikidata notable. If we want to come up with a scheme to rank people by notability, I am sure we could think of a way to do that. --RAN (talk) 05:54, 7 May 2020 (UTC)

"If they have an obituary", they are surely notable. But in case they haven't? Nomen ad hoc (talk) 06:11, 7 May 2020 (UTC).

I think we essentially dropped notability criteria for people when the Peerage was imported. Is it correct that about 20% of them are living people? --- Jura 06:25, 7 May 2020 (UTC)

Wikidata:Notability is clear and has been the same for 7 years: if there are sources (criterion 2) and/or a structural need (criterion 3) then it's notable for Wikidata. No reason to treat COVID-19 deaths differently than the rest of the items. Cheers, VIGNERON (talk) 06:43, 7 May 2020 (UTC)

So what's the structural need to copy a phone book? --- Jura 07:16, 7 May 2020 (UTC)
- But a phone book is a source.--GZWDer (talk) 11:09, 7 May 2020 (UTC)

We have always discussed the phonebook quandary in discussions, and generally the answer has been that there is not much usefulness/utility in importing 10,000 entries labelled only John Smith with just a telephone number, especially since telephone numbers are not permanent. We want to import high density information, databases with multiple fields, so people can be properly disambiguated. --RAN (talk) 18:27, 9 May 2020 (UTC)

- - And my phonebook does not have a way to run SPARQL queries. There are researchers out there who would surely want to find various deaths based on location, profession, etc. Studying the impact of a pandemic would surely be easier with something like Wikidata that does have such a list. --Misc (talk) 10:30, 8 May 2020 (UTC)

- - - - I think the consensus is that phone books shouldn't be imported into Wikidata. So even if there are sources (criteria 2), this is not thought to be sufficient. --- Jura 10:36, 8 May 2020 (UTC)

I don't think notability is that much of an issue for Wikidata so long as reliable primary sources exist (which is a much lower threshold for inclusion than secondary sources). I'm personally much more concerned with the large import of personal data of non-famous persons, especially as Wikidata is increasingly reused by large databases (typically the Knowledge Graph), which gives a large exposure to potentially sensitive information. I believe that at some point the project will need to use different inclusion criteria for living or recently deceased persons. Alexander Doria (talk) 11:14, 7 May 2020 (UTC)

What are you referring to? Come with some examples. @Alexander Doria:--Trade (talk) 22:19, 7 May 2020 (UTC)

Duplicated park

I walked to Hells Kitchen Park (Q49499734) to snap a picture and later discovered that there is also a Hell's Kitchen Park (Q5706428). The apostrophe as far as I see is the correct punctuation, and otherwise as far as I see both entries have correct information even though the coordinates are for a different part of the park. Am I handling it correctly by mentioning it here? Jim.henderson (talk) 12:56, 7 May 2020 (UTC)

Thanks for the picture.

It's reasonably easy (though occasionally tricky) to merge duplicate items like this one.

I usually use Special:MergeItems (and I've just used it to merge Q49499734 into Q5706428).

See also Help:Merge; supposedly there's a "merge gadget" described there that might be easier. —Scs (talk) 13:10, 7 May 2020 (UTC)

Yes, the gadget is easy to use. You open the Q item you are merging from, open the merge tool from the menu and insert the Q item you are merging to. From Hill To Shore (talk) 09:43, 8 May 2020 (UTC)

You can use this to merge in either direction, by selecting the appropriate option. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:51, 8 May 2020 (UTC)
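For scripted merges, the same operation is exposed through the Wikibase API. The sketch below only assembles the request parameters and sends nothing; it assumes the `wbmergeentities` module with `fromid`, `toid`, `summary` and `token` parameters, and the helper name is made up for illustration. A real call would need a logged-in session and a CSRF token.

```python
import re
from typing import Dict

# Item IDs look like "Q" followed by a positive integer.
_QID = re.compile(r"Q[1-9]\d*")


def build_merge_params(from_id: str, to_id: str, csrf_token: str,
                       summary: str = "") -> Dict[str, str]:
    """Assemble POST parameters for a merge of from_id into to_id.

    Hypothetical helper: it validates the IDs and returns a dict, but
    performs no network request itself.
    """
    for qid in (from_id, to_id):
        if not _QID.fullmatch(qid):
            raise ValueError(f"not an item ID: {qid!r}")
    if from_id == to_id:
        raise ValueError("cannot merge an item into itself")
    return {
        "action": "wbmergeentities",
        "fromid": from_id,    # item that is emptied and redirected
        "toid": to_id,        # item that receives the data
        "summary": summary,
        "token": csrf_token,
        "format": "json",
    }


if __name__ == "__main__":
    # The two park items from this thread; the token is a placeholder.
    print(build_merge_params("Q49499734", "Q5706428", "PLACEHOLDER-TOKEN"))
```

As with the gadget, the direction matters: the first ID is the one that ends up as a redirect.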

Merging two items : Vietnamese boat people and boat people

Hello to all,

I typically contribute to Wikipedia, but while doing so I noticed that pages in different languages are not showing up: specifically, Vietnamese boat people Q6449297 and boat people Q494303 should probably be the same item, but I can't seem to be able to merge them. Could someone who knows how to do that resolve the issue?

Thanks in advance, from Wikipedia. --Marteil2003 (talk) 08:58, 8 May 2020 (UTC)

@Marteil2003: It looks like one of these items is for refugees travelling by boat, and the other is for Vietnamese boat people specifically. It's possible that some of the site links are on the wrong item. Ghouston (talk) 09:20, 8 May 2020 (UTC)

@Marteil2003: The markup {{Q|Q6449297}} and {{Q|Q494303}} renders as Vietnamese boat people (Q6449297) and boat people (Q494303) (the labels being shown in the viewer's preferred language). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:54, 8 May 2020 (UTC)

@Ghouston, Pigsonthewing: Ok I think I solved the issue, but could someone go over it ? I have never used Wikidata. Thanks, Marteil2003 (talk) 11:05, 8 May 2020 (UTC)

Yes, it's fine. Ghouston (talk) 11:31, 8 May 2020 (UTC)

Thank you all, this is great. And unsurprisingly it is the Vietnamese Wikipedia that has a minimal pair between these two items. Deryck Chan (talk) 16:03, 9 May 2020 (UTC)

Sort sequence

The Commons category:Aaron Jones sorts on "Aaron", not "Jones" as it should. The only lines in the category description are four parent categories and Template:Wikidata Infobox. If I add DEFAULTSORT, I get an error message saying that there is a preceding default, which is correctly shown as "Jones, Aaron". The Wikidata entry correctly shows his given and family names, so anything coming out of Wikidata should be correct. So, where is the incorrect sort coming from? Jameslwoodward (talk) 14:36, 8 May 2020 (UTC)

Probably phab:T252079 as some of the labels were missing. I purged Commons:Category:Aaron Jones but there seems to be a delay in updating it in the parent categories. Peter James (talk) 18:08, 8 May 2020 (UTC)

P94 = P4004 ?

Hello! Please explain the difference between the properties coat of arms image (P94) and escutcheon image (P4004)! Doc Taxon (talk) 17:10, 8 May 2020 (UTC)

It looks like it was created for "shield image" in w:Template:Infobox_settlement, but apparently that is (currently) meant to hold the coat of arms (description is "Can be used for a place with a coat of arms. ").

@Xaris333, ChristianKl: please double-check. --- Jura 17:22, 8 May 2020 (UTC)

@Jura1: I don't have the domain knowledge about flags to give good guidance here. ChristianKl ❪✉❫ 23:11, 8 May 2020 (UTC)

coat of arms (Q14659) is different from escutcheon (Q331357). Xaris333 (talk) 18:38, 8 May 2020 (UTC)

I think I know the difference now, thank you for your help Doc Taxon (talk) 14:02, 9 May 2020 (UTC)

@Xaris333: If so, would you update the property documentation at Property talk:P4004 (the enwiki template probably shouldn't be mentioned) and sort out current uses [27]? --- Jura 15:12, 9 May 2020 (UTC)

Unusual number of statements for a person

What is happening at Prem Raj Pushpakaran (Q61656939)? Is this vandalism? gobonobo ⁺ ^c 01:50, 9 May 2020 (UTC)

It looks well intentioned but a bunch of the statements don't belong there. Especially uncited ones like religion. ChristianKl ❪✉❫ 08:06, 9 May 2020 (UTC)

Not gonna lie, when I saw the title of this talk page section I thought it was a jab at item #383541. Mahir256 (talk) 10:12, 9 May 2020 (UTC)

Requests for permissions/Bot/নকীব বট

~~According to the policy,~~ I want to draw your kind attention to the Request for Permission page. - Regards Nokib Sarkar (talk) 02:18, 9 May 2020 (UTC)

While I see no problem with drawing attention in this way, I don't think we have a policy that requires it the same way we do have one that requires it for CheckUsers. ChristianKl ❪✉❫ 07:42, 9 May 2020 (UTC)
- @ChristianKl:, Thanks for at least noticing that. The request is hanging for about two days. Regards - Nokib Sarkar (talk) 13:23, 9 May 2020 (UTC)

Merging two items : Q93734821 and Q65336472

Hello to all, I typically contribute to Wikipedia, but while doing so I noticed that pages in different languages are not showing up: specifically, အောင်စိုးသာ Q65336472 and Aung Soe Tha Q93734821 are the same person. I can't link the enwiki page and the mywiki page, so please merge these two items. Thanks Cape Diamond MM (talk) 16:51, 9 May 2020 (UTC)

@Cape Diamond MM: I just merged them, and the interlanguage links now work. If you want to do this yourself in the future, follow the instructions at Help:Merge. Just be careful that the two pages refer to exactly the same concept. Vahurzpu (talk) 19:34, 9 May 2020 (UTC)

Gay villages

A category was imported from English Wikipedia that is almost devoid of references at English Wikipedia. I can see if referenced, it would be a legitimate categorization, but it looks like it is just someone's personal opinion, sprinkled with some referenced ones, and some iconic ones. Can I remove them when I come across them here at Wikidata, if unreferenced? I came across it while filling in information at Asbury Park (Q201127) --RAN (talk) 17:50, 9 May 2020 (UTC)

It would be better to find references than make deletions. - Jmabel (talk) 19:08, 9 May 2020 (UTC)

@Richard Arthur Norton (1958- ): There's a 'gay community' history subsection on the Asbury Park article. It even has a reference. I think this term usually applies to a neighborhood and not an entire city. Unless there's an item for a neighborhood that this would apply to, I'd use has part(s) (P527) --> gay village (Q748198), sourced with that reference. gobonobo ⁺ ^c 01:12, 10 May 2020 (UTC)

Excellent, thanks!

How to model different versions of scientific articles?

What is open peer review? A systematic review (Q29649956) and What is open peer review? A systematic review (Q30491890) represent two different versions of one scientific article. I don't know what's the correct property to model that.--GZWDer (talk) 02:16, 7 May 2020 (UTC)

I am wondering about the point of having two items for that. Isn't one enough? Usually, different versions of a given paper fix only spelling and minor issues; they do not change the meaning or the results of the paper. Pamputt (talk) 07:42, 7 May 2020 (UTC)

(edit conflict) Hmm, usually identifiers have the version at the end, indeed, the DOIs differ there, one has ".1", the other ".2". And usually one can get the newest version by just removing the suffix. But this does not work with DOI, does it? How can you get the newest version of that article without knowing which? Anyway I don't think any author wants old versions being referred to. --SCIdude (talk) 07:44, 7 May 2020 (UTC)
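The version-suffix idea can be sketched in a few lines. This assumes, as in the comment above, that a trailing ".N" on a DOI is a version number; that holds only for some publishers, so the rule is a heuristic, and the DOIs in the usage are hypothetical placeholders rather than the ones on the two items.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Heuristic: treat a final ".<digits>" segment as a version suffix.
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)\.(?P<version>\d+)$")


def split_versioned_doi(doi: str) -> Tuple[str, Optional[int]]:
    """Split a DOI into (base, version); version is None if no suffix."""
    match = _VERSION_SUFFIX.match(doi)
    if match is None:
        return doi, None
    return match.group("base"), int(match.group("version"))


def newest_versions(dois: Iterable[str]) -> Dict[str, str]:
    """Map each base DOI to the full DOI with the highest version seen.

    DOIs without a recognisable suffix are skipped.
    """
    newest: Dict[str, Tuple[int, str]] = {}
    for doi in dois:
        base, version = split_versioned_doi(doi)
        if version is None:
            continue
        if base not in newest or version > newest[base][0]:
            newest[base] = (version, doi)
    return {base: doi for base, (_, doi) in newest.items()}


if __name__ == "__main__":
    # Hypothetical DOIs, for illustration only.
    print(newest_versions(["10.9999/example.100.1", "10.9999/example.100.2"]))
```

Note that this only picks the newest among DOIs already known; it does not answer the question of how to discover the latest version from the resolver.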

What about using only one item and add the several DOI values with possibly a qualifier such as applies to part, aspect, or form (P518) with an item equivalent to pre-release version (Q51930650)? Pamputt (talk) 08:20, 7 May 2020 (UTC)

It would make sense to merge the items, set the newest DOI with preferred rank and add qualifiers if they apply, as Pamputt proposed. --MarioGom (talk) 09:26, 7 May 2020 (UTC)

For what it's worth, we do sometimes have distinct entities for different editions of books. One example I found is Gray's Anatomy (Q200306) and Gray's Anatomy (20th edition) (Q19558994). —Scs (talk) 13:22, 7 May 2020 (UTC)

WikiProject Wikidata for research has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Pamputt (talk) 09:55, 7 May 2020 (UTC)

I just merged the two items, keeping both DOIs. I marked the older DOI as deprecated. Pamputt (talk) 21:47, 7 May 2020 (UTC)
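How the ranks play out for data consumers can be sketched as follows. The selection rule (preferred statements win; otherwise normal ones; deprecated ones are never returned) mirrors how Wikidata's "best rank" values are commonly described; the statement dictionaries and DOI strings are simplified, hypothetical stand-ins rather than the real JSON of the merged item.

```python
from typing import Dict, List

Statement = Dict[str, str]  # simplified: {"value": ..., "rank": ...}


def best_rank_values(statements: List[Statement]) -> List[str]:
    """Return the values a 'best rank' consumer would see.

    Preferred statements hide normal ones; deprecated statements are
    never returned, even if nothing else is available.
    """
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["value"] for s in statements if s["rank"] == "normal"]


if __name__ == "__main__":
    # Hypothetical DOI statements on a single merged item.
    doi_statements = [
        {"value": "10.9999/example.100.1", "rank": "deprecated"},
        {"value": "10.9999/example.100.2", "rank": "normal"},
    ]
    print(best_rank_values(doi_statements))
```

Either approach suggested in this thread (newest DOI preferred, or older DOI deprecated) leads a best-rank consumer to the newest DOI; they differ in whether the older value is treated as superseded or as wrong.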

Also in these, any suggestions what to do with Q83567432? It seems to have been a regularly updated newsfeed. --- Jura 10:04, 7 May 2020 (UTC)

Similarly, how do we treat a preprint (Q580922) of an article and its final publication? E.g. Q61228422 was posted to bioRxiv (Q19835482) 6 months before publication of Q92219560 in Fungal Ecology (Q15817063). Do we merge items and qualify the identifiers, dates, and source, or keep separate and link via properties (and if so, which properties)? -Animalparty (talk) 22:13, 10 May 2020 (UTC)

Duplicate items: ZooBank vs ORCID

There are a lot of authors duplicated from ZooBank and ORCID imports. I have created a list of high confidence potential duplicates here: User:MarioGom/Potential duplicates/ZooBank vs ORCID. Duplicates can be verified with the links to ZooBank and ORCID and finding duplicate publications. I have queries with wider criteria, but more false positives. Feel free to ping me if you find false positives in the list, or if it should be replenished with more potential duplicates. --MarioGom (talk) 09:23, 7 May 2020 (UTC)

Notified participants of WikiProject Biology

Nice work. I've confirmed and merged one, and will do a few more. (Is there a button to push to regenerate your list, to avoid duplication of effort?)

Would someone scold me if I suggested that this might indicate additional disambiguation criteria that the ORCID and/or ZooBank bots could use while importing? —Scs (talk) 11:17, 7 May 2020 (UTC)

A bot could indeed look at ORCID and ZooBank duplicates for matches. Scs: you can regenerate the list by clicking on Manually update this list and waiting up to 1 minute for the success page. However, note that changes take time to propagate, so clicking the link may have no effect for some time. I'm not sure how long it takes, but updating within 10 minutes or so usually has no effect. In the absence of manual updates, the list updates automatically every day. --MarioGom (talk) 11:45, 7 May 2020 (UTC)

I wasn't thinking of a separate bot (that's more or less what you've written) -- I was thinking that the bot that originally created, say, Enrique Macpherson (Q30512371) could have checked first and noticed that Enrique Macpherson (Q21340373) already existed. —Scs (talk) 11:52, 7 May 2020 (UTC)

I noticed that several of the dups I just merged had similar qids (Q54522789, Q54537815, Q54539266), and indeed they were all created by QuickStatementsBot in late May 2018, with the edit summary "Created a new Item: #quickstatements; invoked by SourceMD:CreateFromWikispeciesDOIs‎". I have no idea how to figure out who created that batch, and it's obviously long finished, and creation of such batches might be done more carefully today, but it's the sort of thing I'm thinking of. —Scs (talk) 13:03, 7 May 2020 (UTC)

The query returns at least one false positive. I have added reciprocal different from (P1889) statements to the items concerned. @MarioGom: Please can you adjust your query to exclude such items? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:49, 8 May 2020 (UTC)

Andy Mabbett: Done, thank you! --MarioGom (talk) 11:44, 8 May 2020 (UTC)
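The kind of exclusion requested above might be sketched like this. The function drops any candidate pair that already has a different from (P1889) link in either direction; the data structures and the item IDs in the usage are placeholders for illustration, not MarioGom's actual query.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def exclude_known_distinct(candidates: Iterable[Pair],
                           different_from: Iterable[Pair]) -> List[Pair]:
    """Remove candidate duplicate pairs already marked as distinct.

    Pairs are compared without regard to order, so a single
    'different from' statement in either direction is enough.
    """
    distinct: Set[FrozenSet[str]] = {frozenset(pair) for pair in different_from}
    return [pair for pair in candidates if frozenset(pair) not in distinct]


if __name__ == "__main__":
    # Placeholder IDs; they do not refer to the items discussed here.
    candidates = [("Q900000001", "Q900000002"), ("Q900000003", "Q900000004")]
    marked_distinct = [("Q900000002", "Q900000001")]
    print(exclude_known_distinct(candidates, marked_distinct))
```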

The false positive was probably because there was a link from Q24206732 to Helen Ward (Q43193512); I changed it to link to Helen Ward (Q25450864). Q59846853 also linked to the wrong person, and I'm not certain that the Helen Ward linked in Q47201908, Q47288737, Q48124447, Q50655465, Q51033516, Q80260479, Q87705425, and Q90234880 (and possibly others) is the same person as Helen Ward (Q43193512). Looks like several different people. Peter James (talk) 12:49, 8 May 2020 (UTC)

Duplication because of private ORCID additions

I have noticed many many duplicate items for people because ORCID identifiers are added for people whose information is private at ORCID. The result is that there is no sane way to merge them. Non free ORCID items are really problematic. Thanks, GerardM (talk) 07:33, 8 May 2020 (UTC)

I'm not sure that somebody who only has a name and an ORCID can be assumed to be notable. Ghouston (talk) 09:25, 8 May 2020 (UTC)

Personally, I mostly ignore these two-statement ORCID only items. --- Jura 09:38, 8 May 2020 (UTC)

There are often other sources confirming the use of the ORCID iD by the named individual referred to in the item. I can't say whether that's true in the cases referred to here, because they are not identified. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:46, 8 May 2020 (UTC)

Not when we only have a name and an ORCiD identifier that is not informative.. They have been added by a bot.. When the ORCiD id is NOT private it is easy and obvious to link them to papers. In this case that will not work because of "false friends". Thanks, GerardM (talk) 12:50, 8 May 2020 (UTC)

If it is used in some PubMed publications, you can search for it in EuropePMC, for example (explicit link to EuropePMC author profile). At least you can get some information about the author. They exist for a reason. Also, even if a profile is private, it may provide links to profiles on other websites (e.g. ResearcherID), which may be imported to Wikidata.--GZWDer (talk) 15:46, 8 May 2020 (UTC)

Most of the private ORCID profiles are on Wikidata because they are linked to some publication. Using "what links here", you can usually trace it back to a publication and see the field of research and the author's affiliation (usually on the paper itself) which will, most of the time, uniquely identify the author. Vahurzpu (talk) 19:22, 8 May 2020 (UTC)

How do you ensure that a paper is INDEED linked to that author.. Still the number of duplications is huge. Merging is to be done by hand.. and that is not you ? Thanks, GerardM (talk) 20:00, 8 May 2020 (UTC)

On the contrary; what I said applies in many cases of duplicate items, where on one of them we only have a name and an ORCID identifier. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:42, 9 May 2020 (UTC)

They cannot be merged because you do not know if it is a false friend. Thanks, GerardM (talk) 05:34, 10 May 2020 (UTC)

The cases I describe can and should be merged, because there is another source confirming the use of the ORCID iD by the named individual referred to. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:24, 10 May 2020 (UTC)
