Property talk:P2427
Documentation
institutional identifier from the GRID.ac global research identifier database
Format: "grid\.\d{4,6}\.[0-9a-f]{1,2}": value must be formatted using this pattern (PCRE syntax).
Violations of each constraint are listed at Database reports/Constraint violations/P2427 under #Type (Q43229, Q13226383, Q811430, Q732577, Q1298668, Q56061), #Unique value, #single best value, and #Entity types.
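The format pattern in the documentation above can be checked locally before a value is added. This is a minimal Python sketch, assuming the pattern has to match the whole value; the helper name `is_valid_grid_id` is hypothetical and not part of any Wikidata tooling.

```python
import re

# Pattern copied from the format constraint documented for this property.
GRID_ID_PATTERN = re.compile(r"grid\.\d{4,6}\.[0-9a-f]{1,2}")


def is_valid_grid_id(value: str) -> bool:
    """Return True if the whole string matches the GRID ID pattern.

    Assumption: the constraint is applied to the full value, so
    fullmatch() is used rather than search().
    """
    return GRID_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["grid.481551.c", "grid.420437.6", "GRID.1234.ab", "grid.12.zz"]:
        print(candidate, is_valid_grid_id(candidate))
```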
Unique constraint
User:Ivan A. Krestinin has already removed the mandatory flag from the single value constraint. But looking at the constraint violations, it seems that GRID does not even try to have unique IDs. Therefore I suggest we remove the constraint completely. --Srittau (talk) 12:00, 10 January 2016 (UTC)
- 102 cases out of 7845 items. That rate does not look very high. These could be errors in the GRID DB, so the constraint can be useful for flagging them. Maybe somebody wants to contact the GRID DB maintainers. -- Ivan A. Krestinin (talk) 12:16, 10 January 2016 (UTC)
- I'm already talking with them. This came from an initial import of their data dump based on matching up Wikipedia URLs; some of them obviously were not quite right, so cleanup is needed. It is INTENDED to be single-valued. ArthurPSmith (talk) 21:26, 11 January 2016 (UTC)
- Ok, sounds good to me. Let's keep the constraint then, non-mandatory for now. --Srittau (talk) 22:28, 11 January 2016 (UTC)
- After consultation with WikiProject Companies, I've been doing some work to split items of multinational corporations into items for national branches, migrating the GRID ids accordingly. This should help fix some of the constraint violations. − Pintoch (talk) 18:41, 19 March 2017 (UTC)
- @Pintoch: I suspect most of the duplicate issues on the constraints page are as you describe, but some of them are inappropriate duplicates on the GRID side. I have reported several dozen of these to them, so hopefully they will get fixed; if you spot more, please go ahead and report them on the GRID page for one of the IDs if you can. ArthurPSmith (talk) 20:22, 20 March 2017 (UTC)
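The two constraints discussed in this thread are different checks: "unique value" flags a GRID ID that appears on more than one item, while "single value" flags an item that carries more than one GRID ID. A minimal Python sketch of both follows, assuming the statements are already available as (item, GRID ID) pairs; the function name and the sample pairs are made up for illustration.

```python
from collections import defaultdict


def find_violations(statements):
    """Group (item, grid_id) pairs and report both kinds of violation."""
    items_by_id = defaultdict(set)
    ids_by_item = defaultdict(set)
    for item, grid_id in statements:
        items_by_id[grid_id].add(item)
        ids_by_item[item].add(grid_id)
    # Unique value: one GRID ID is used on several items.
    shared_ids = {g: sorted(i) for g, i in items_by_id.items() if len(i) > 1}
    # Single value: one item carries several GRID IDs.
    multi_id_items = {i: sorted(g) for i, g in ids_by_item.items() if len(g) > 1}
    return shared_ids, multi_id_items


if __name__ == "__main__":
    # Hypothetical sample data, not taken from Wikidata or GRID.
    sample = [
        ("item-A", "grid.1111.1"),
        ("item-B", "grid.1111.1"),
        ("item-C", "grid.2222.2"),
        ("item-C", "grid.3333.3"),
    ]
    shared, multi = find_violations(sample)
    print("IDs on several items:", shared)
    print("Items with several IDs:", multi)
```

A multinational split into national branches, as described above, would show up in the first group until each branch item gets its own ID.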
After cleaning up most of the multinationals, I have emailed GRID to show them the remaining list of identifiers, and here is their reply:
We have looked through the 229 potential merge candidates you have submitted and we identified 60 duplicates and have redirected them to the correct entry. In addition we have also added missing Wikidata ids to the corresponding GRID entry. Unfortunately all of these changes and corrections will not be visible to you until our next data release. Again, thank you for this input, it has been very helpful and we appreciate the time and effort you put into improving our data very much.
I think we could use this to split the remaining items with duplicate ids into separate items, now that we know that they have been confirmed by GRID. (We seem to have the same notion of organization). Any idea how to proceed? − Pintoch (talk) 10:57, 18 April 2017 (UTC)
- Did they tell you which 60 were agreed on as duplicates? Otherwise it sounds like we have to wait for the next release (early May?) ArthurPSmith (talk) 13:57, 18 April 2017 (UTC)
- Right, as they said they had "redirected" them I thought HTTP redirections were already in place on grid.ac, but indeed we'll have to wait a bit. − Pintoch (talk) 16:57, 18 April 2017 (UTC)
- @Pintoch: the new release is out (somehow dated a week ago, I'm sure it wasn't up then though?)!! I hope to be looking at it the next few days and see if I can figure out how many of those dups have been fixed. ArthurPSmith (talk) 20:23, 31 May 2017 (UTC)
- @ArthurPSmith: Great! Is there any better way to detect merges than simply checking for HTTP redirects on the web interface? − Pintoch (talk) 21:00, 31 May 2017 (UTC)
- The grid.json dump file lists merged IDs with no name and a "redirect" field. For example, for grid.481551.c it has the entry:
{"id":"grid.481551.c","status":"redirected","redirect":"grid.410484.d"}
- There are about 400 new ones with this release (2374 redirected in the May release, 1963 in the April release), so definitely many updates there! ArthurPSmith (talk) 12:25, 1 June 2017 (UTC)
- By the way, I've started some of the cleanup on this (basically by hand; there are a bit over 200 cases to look at). Some of the new redirects seem to be GRID deciding not to include some smaller entities and to point only to the larger ones. An example is grid.420437.6, National Oceanographic Data Center (Q6974619), which has been redirected to grid.454206.1, National Centers for Environmental Information (Q21015842). I'm leaving that alone as the old ID was previously valid, but the formatter URL now takes you to the parent organization. Maybe we should delete these GRID IDs though? But in cases where there were two GRID IDs on one Wikidata item, or only the old ID that has been updated, I'm removing the old one and ensuring the new one is there. ArthurPSmith (talk) 19:03, 1 June 2017 (UTC)
- FYI, I've finished what I thought was needed on cleaning up the redirected IDs. I believe all the remaining redirected GRID IDs in Wikidata are cases where GRID has merged two or more records that we probably don't want to merge; none of them should have more than one ID per Wikidata item any longer. ArthurPSmith (talk)
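The redirect entries described in this thread can be collected from the dump and used to spot outdated IDs. This is a minimal Python sketch, assuming each record is a dict shaped like the entry quoted above (`id`, `status`, `redirect`). How the records are nested inside grid.json is not shown in this discussion, so the functions take an already-extracted list, and all function names are hypothetical.

```python
import json


def build_redirects(records):
    """Map each redirected GRID ID to the ID it points to."""
    return {
        r["id"]: r["redirect"]
        for r in records
        if r.get("status") == "redirected" and "redirect" in r
    }


def resolve(grid_id, redirects):
    """Follow redirects until an ID with no further redirect is reached."""
    seen = set()
    # The 'seen' set guards against a redirect loop in the data.
    while grid_id in redirects and grid_id not in seen:
        seen.add(grid_id)
        grid_id = redirects[grid_id]
    return grid_id


# The entry quoted in the thread, plus one made-up record for contrast.
records = [
    json.loads('{"id":"grid.481551.c","status":"redirected","redirect":"grid.410484.d"}'),
    {"id": "grid.410484.d", "status": "active"},  # hypothetical shape
]
redirects = build_redirects(records)
print(resolve("grid.481551.c", redirects))  # grid.410484.d
```

Whether an old ID should actually be replaced still needs a human check, since some redirects merge records that are better kept as separate Wikidata items.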
Bad claims imported from DB
Saw several, here is one example:
85.180.90.183 18:16, 18 June 2018 (UTC)
- if it's an accurate representation of what is stated in the source, then the statement should be marked deprecated with an appropriate annotation, not removed. You can also contact the source database here (GRID has a ticket system for tracking such requests) to get anything you feel is wrong corrected. ArthurPSmith (talk) 19:02, 18 June 2018 (UTC)
Establishment Dates with Wrong Calendar Label
Dates of establishment are available for many Iranian organizations listed in GRID data sets. However, the dates are in the Solar Hijri calendar, not the Gregorian one. I suggest that these references be removed, or bulk recalculated and reimported. Examples: Q30265263 (on GRID.ac), Q33122101 (on GRID.ac). I have reported this issue to the database maintainers as well. Shervinafshar (talk) 04:10, 13 April 2020 (UTC)
- @Shervinafshar: Oops! Yes indeed. Do you have a guess how many are a problem? ArthurPSmith (talk) 18:00, 13 April 2020 (UTC)
- @ArthurPSmith: It turns out that not many of them are incorrect: https://w.wiki/Mkg . I don't know how to exclude those without a GRID reference in SPARQL, so there are a few in the result set of this query which are legitimate, e.g. Q47690 or Q273874. Shervinafshar (talk) 00:41, 14 April 2020 (UTC)
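For the bulk recalculation suggested above, a year-level conversion is a rough starting point. A Solar Hijri year begins at Nowruz (around 20 or 21 March), so Solar Hijri year Y overlaps the Gregorian years Y + 621 and Y + 622. This is a minimal Python sketch, assuming only the year is known; the function name is hypothetical, and an exact date would need a proper calendar library.

```python
from typing import Tuple


def solar_hijri_year_to_gregorian(sh_year: int) -> Tuple[int, int]:
    """Return the two Gregorian years that a Solar Hijri year overlaps.

    The Solar Hijri year starts in March of the first returned year
    and ends in March of the second.
    """
    return (sh_year + 621, sh_year + 622)


if __name__ == "__main__":
    # Solar Hijri 1399 ran from March 2020 to March 2021.
    print(solar_hijri_year_to_gregorian(1399))  # (2020, 2021)
```

Because a year alone maps to two Gregorian years, a converted date is best stored with year precision and checked against the source.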