Shortcuts: WD:RAQ, w.wiki/LX

Wikidata:Request a query


Missing LIS terms in Hungarian


Hi, I would need some help. I would like to retrieve library and information science (Q13420675) terms that are not translated into Hungarian, so we can translate the missing terms all at once. I expect no more than 1-200 terms. Example: document retrieval (Q1638872).

The properties I would need: QID, EN preferred label, EN description, EN alternative label(s), DE preferred label, DE alternative label(s). Thank you in advance! Bencemac (talk) 20:19, 12 November 2024 (UTC)

@Bencemac: Hello, I gave it a try. It is not easy to find all items for library and information science (Q13420675), so I took a different approach: I check for "part of" (P361).
PREFIX schema: <http://schema.org/>
#defaultView:Table
SELECT ?item ?label_en ?desc_en ?label_de ?desc_de ?label_hu ?desc_hu
WHERE {
  
  ?item wdt:P31/wdt:P361* wd:Q199655 .   # part of library science (244 items)
  #?item wdt:P31/wdt:P361* wd:Q1235196.   # part of documentation science (only 1 item)
  #?item wdt:P31/wdt:P361* wd:Q16387.      # part of information science (978 items)

  OPTIONAL {?item rdfs:label        ?label_en. filter(lang(?label_en)="en"). }  
  OPTIONAL {?item schema:description ?desc_en. filter(lang( ?desc_en)="en"). }  
  
  OPTIONAL {?item rdfs:label        ?label_de. filter(lang(?label_de)="de"). }  
  OPTIONAL {?item schema:description ?desc_de. filter(lang( ?desc_de)="de"). }  
  
  OPTIONAL {?item rdfs:label        ?label_hu. filter(lang(?label_hu)="hu"). }  
  OPTIONAL {?item schema:description ?desc_hu. filter(lang( ?desc_hu)="hu"). }  

  FILTER(!BOUND(?label_hu))  # no label_hu 
  #FILTER(!BOUND( ?desc_hu))  # no desc_hu

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de,en,hu". }
}
#limit 50
I hope this first SPARQL query helps. You can choose the field you want in lines 6-8, and in lines 19 and 20 you can filter for a missing label or a missing description. Best regards. --sk (talk) 10:01, 17 November 2024 (UTC)
Thank you very much! :) Bencemac (talk) 14:19, 17 November 2024 (UTC)
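The request also asked for alternative labels, which the query above does not return. A minimal sketch of one way to add them, assuming the aliases are read from skos:altLabel and that the query runs on the Wikidata Query Service with its predefined prefixes. The " | " separator and the ?aliases_en / ?aliases_de variable names are illustrative choices, not part of the original answer:

```sparql
SELECT ?item ?label_en ?desc_en
       (GROUP_CONCAT(DISTINCT ?alt_en; separator=" | ") AS ?aliases_en)
       ?label_de
       (GROUP_CONCAT(DISTINCT ?alt_de; separator=" | ") AS ?aliases_de)
WHERE {
  ?item wdt:P31/wdt:P361* wd:Q199655 .   # same starting pattern as the query above

  # keep only items without a Hungarian label
  FILTER NOT EXISTS { ?item rdfs:label ?label_hu . FILTER(LANG(?label_hu) = "hu") }

  OPTIONAL { ?item rdfs:label         ?label_en . FILTER(LANG(?label_en) = "en") }
  OPTIONAL { ?item schema:description ?desc_en  . FILTER(LANG(?desc_en)  = "en") }
  OPTIONAL { ?item skos:altLabel      ?alt_en   . FILTER(LANG(?alt_en)   = "en") }
  OPTIONAL { ?item rdfs:label         ?label_de . FILTER(LANG(?label_de) = "de") }
  OPTIONAL { ?item skos:altLabel      ?alt_de   . FILTER(LANG(?alt_de)   = "de") }
}
GROUP BY ?item ?label_en ?desc_en ?label_de
```

Grouping by the single-valued columns and concatenating the aliases keeps one row per item even when an item has several aliases in a language.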

railway platforms from Spanish Wikipedia


Hi! I could use a query listing all items with a number of platform tracks (P1103) and/or number of platform faces (P5595) statement, but only if that statement is imported from Wikimedia project (P143) Spanish Wikipedia (Q8449) (and is not deprecated). I would love another column showing country (P17) of each item (besides columns with the P1103 and P5595 values). Geogast 🤲 (talk) 15:53, 20 November 2024 (UTC)

@Geogast: Here is my solution:
#defaultView:Map
SELECT ?item ?itemLabel ?itemDescription ?npt ?npf ?countryLabel ?coordinate
WHERE {
  ?item wdt:P1103 ?npt.                       # number of platform tracks

  ?item p:P1103 ?statnode .
  ?statnode prov:wasDerivedFrom ?refnode . 
  ?refnode pr:P143 wd:Q8449.
  
  OPTIONAL {?item wdt:P1103 ?npf}    # number of platform faces
  OPTIONAL {?item wdt:P17 ?country.}
  OPTIONAL {?item wdt:P625 ?coordinate.}
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]"}
}
I did not include the "and is not deprecated" condition, because I am not sure what you meant: that the platform is deprecated, or that the statement with imported from Wikimedia project (P143) is deprecated? Still, I hope my query helps. Best regards --sk (talk) 14:53, 21 November 2024 (UTC)
Oh, thanks a lot! It helped me a lot to get an overview. However, something is weird: the query gives wrong numbers. Conceição (Q5158299) has the statements: P5595=2; P1103=4 – but the query gives 4 and 4. Same with Botujuru train station (Q11681623). General Gutiérrez station (Q5842425) only has a P1103 statement (not P5595), but the query gives 2 and 2. The "not deprecated" issue seems fine (so I don't need to explain what I meant, right now). Geogast 🤲 (talk) 22:54, 22 November 2024 (UTC)
Now I see that in line 4, it says P1103 and "number of platform tracks" – and in line 10: P1103 again – but "number of platform faces". Is it possible that the issue is there? In fact, the query should give waaay more results where P1103 and P5595 values differ. Thanks for your work!! Geogast 🤲 (talk) 23:06, 22 November 2024 (UTC)
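The observation about line 10 looks right: the second OPTIONAL reuses P1103, so ?npf just repeats the track count. A possible corrected version is sketched below. It is not from the original answer, and it rests on a few assumptions: that only statements referenced with imported from Wikimedia project (P143) Spanish Wikipedia (Q8449) should contribute values, that "not deprecated" refers to the rank of the statement, and that the query runs on the Wikidata Query Service with its predefined prefixes. The variable names ?anchor, ?trackStatement and ?faceStatement are illustrative.

```sparql
SELECT DISTINCT ?item ?itemLabel ?npt ?npf ?countryLabel
WHERE {
  # Anchor: at least one non-deprecated P1103 or P5595 statement sourced from Spanish Wikipedia
  { ?item p:P1103 ?anchor . } UNION { ?item p:P5595 ?anchor . }
  ?anchor prov:wasDerivedFrom/pr:P143 wd:Q8449 .
  FILTER NOT EXISTS { ?anchor wikibase:rank wikibase:DeprecatedRank . }

  OPTIONAL {   # number of platform tracks, read from the statement itself
    ?item p:P1103 ?trackStatement .
    ?trackStatement ps:P1103 ?npt ;
                    prov:wasDerivedFrom/pr:P143 wd:Q8449 .
    FILTER NOT EXISTS { ?trackStatement wikibase:rank wikibase:DeprecatedRank . }
  }
  OPTIONAL {   # number of platform faces (P5595, not P1103)
    ?item p:P5595 ?faceStatement .
    ?faceStatement ps:P5595 ?npf ;
                   prov:wasDerivedFrom/pr:P143 wd:Q8449 .
    FILTER NOT EXISTS { ?faceStatement wikibase:rank wikibase:DeprecatedRank . }
  }
  OPTIONAL { ?item wdt:P17 ?country . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```

Reading the values through p:/ps: rather than wdt: ties each number to the statement that carries the Spanish Wikipedia reference, instead of mixing a referenced statement with an unrelated truthy value.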

P155 labeled

SELECT ?item ?itemLabel ?value
WHERE {
  ?item wdt:P155 ?value .
  ?value rdfs:label ?label .
  FILTER(LANG(?label) = "de" && STRSTARTS(?label, "Nekrolog"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de,en". }
}

Hi! This query times out. I want to list all items that have the property P155 with a value whose German label starts with "Nekrolog". Can anyone help me? Thanks a lot, Doc Taxon (talk) 04:34, 21 November 2024 (UTC)

@Stefan Kühn: Could you perhaps help me here? Doc Taxon (talk) 16:07, 21 November 2024 (UTC)
SELECT ?item ?itemLabel ?value
WHERE {
  ?item wdt:P360 wd:Q5 ;
        wdt:P155 ?value .
  ?value rdfs:label ?label .
  FILTER(LANG(?label) = "de" && STRSTARTS(?label, "Nekrolog"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de,en". }
}
This version does not time out, but it obviously misses items without is a list of (P360) human (Q5). Does that help? Or was the aim to add that statement where it is missing, or what is the use case? --Marsupium (talk) 16:27, 21 November 2024 (UTC)
@Doc Taxon: Challenge accepted. :-)
#######################################
# Search by name and then filter
#######################################
#defaultView:Table
SELECT ?item ?itemLabel ?itemDescription ?value ?de_label 
WHERE
{
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "search" .
    bd:serviceParam mwapi:gsrsearch "inlabel:'Nekrolog'" .      # Filter for all label in all languages
    bd:serviceParam mwapi:gsrlimit "max" .
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P155 ?value .
  ?item rdfs:label ?de_label .
  FILTER(LANG(?de_label) = "de" && STRSTARTS(?de_label, "Nekrolog"))  
  #FILTER(REGEX(STR(?de_label), "Nekrolog"))
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
order by ?de_label
I hope this query is helpful. If you use the commented-out REGEX filter instead, you will also find items like Q1684432. --sk (talk) 16:33, 21 November 2024 (UTC)
@Stefan Kühn: this one shows me the de_label of the P155 value in each case, whereas yours shows de_label = itemLabel, although apart from the omitted comments I cannot see any difference. How can that be? Doc Taxon (talk) 17:29, 21 November 2024 (UTC)
SELECT ?item ?itemLabel ?itemDescription ?value ?de_label 
WHERE
{
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "search" .
    bd:serviceParam mwapi:gsrsearch "inlabel:'Nekrolog'" .
    bd:serviceParam mwapi:gsrlimit "max" .
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P155 ?value .
  ?value rdfs:label ?de_label .
  FILTER(LANG(?de_label) = "de" && STRSTARTS(?de_label, "Nekrolog"))
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
ORDER BY ?de_label
@Doc Taxon: Your line
?value rdfs:label ?de_label 
picks up the label of the predecessor. In my query I ask for
?item rdfs:label ?de_label .
instead, that is, the label of the item itself. --sk (talk) 17:42, 21 November 2024 (UTC)
Oh nooo, after reading it 100 times I overlooked it 100%. Thank you very much, Doc Taxon (talk) 17:56, 21 November 2024 (UTC)
@Stefan Kühn: Could you please explain in more detail what the SERVICE wikibase:mwapi {} block above does? Doc Taxon (talk) 18:38, 21 November 2024 (UTC)
@Doc Taxon: You will find the relevant information at Wikidata:SPARQL_query_service/query_optimization#Searching_labels. I cannot explain it any better than that page does. In short: Wikidata offers a few special services that handle certain specialized tasks faster than ordinary SPARQL queries. For downloading all person data from Wikidata, for example, there is a service that serves the bulk data in slices of, e.g., 25,000 (see the example at User:Stefan_Kühn/Persondata#Daten-Download). Best regards -- sk (talk) 19:17, 21 November 2024 (UTC)
I did understand it, though. Many thanks, Doc Taxon (talk) 21:13, 21 November 2024 (UTC)

Withdrawn Union List of Artist Names ID (P245)s and their replacements


I am trying to clean up withdrawn IDs of Union List of Artist Names ID (P245), and I have a federated query that works overall. However, it would be helpful to know whether the replacement ID is already present in the Wikidata item. But for some reason the combination BIND(SUBSTR(STR(?replacement), 29) AS ?replacementID) BIND(EXISTS { ?item wdt:P245 ?replacementID . } AS ?replacementExists) always gives true, regardless of whether the replacement ID exists.

SELECT DISTINCT
?itemLabel
?gettyIDatItem ?gettySubject ?replacement ?replacementID
?replacementExists
  WITH { SELECT ?item ?gettyHumanURI ?gettySubjectTerm ?gettyID
  WHERE {
    ?item wdt:P245 ?gettyID .
  }
        ORDER BY xsd:integer(SUBSTR(STR(?item), 33))
        LIMIT 5000
        OFFSET 50000
       } AS %items
  WHERE { INCLUDE %items
  BIND(URI(CONCAT("http://vocab.getty.edu/ulan/", ?gettyID)) AS ?gettySubject)
  SERVICE <http://vocab.getty.edu/sparql.json> {
    ?gettySubject dct:isReplacedBy ?replacement .
    BIND(SUBSTR(STR(?replacement), 29) AS ?replacementID)
  }
  BIND(URI(CONCAT(STR(?item), "#P245")) AS ?gettyIDatItem)
  BIND(EXISTS { ?item wdt:P245 ?replacementID . } AS ?replacementExists)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],mul,en". }
}

Thanks in advance for any help! --Marsupium (talk) 13:34, 21 November 2024 (UTC), 16:33, 21 November 2024 (UTC)

Does this help?

SELECT DISTINCT ?itemLabel ?gettyIDatItem ?gettySubject ?replacement ?replacementID
(IF(BOUND(?existingReplacement), true, false) AS ?replacementExists)
WITH {
  SELECT ?item ?gettyID
  WHERE {
    ?item wdt:P245 ?gettyID .
  }
  ORDER BY xsd:integer(SUBSTR(STR(?item), 33))
  LIMIT 5000
  OFFSET 50000
} AS %items
WHERE {
  INCLUDE %items
  BIND(URI(CONCAT("http://vocab.getty.edu/ulan/", ?gettyID)) AS ?gettySubject)
  SERVICE <http://vocab.getty.edu/sparql.json> {
    ?gettySubject dct:isReplacedBy ?replacement .
  }
  BIND(SUBSTR(STR(?replacement), 29) AS ?replacementID)
  BIND(URI(CONCAT(STR(?item), "#P245")) AS ?gettyIDatItem)
  OPTIONAL {
    ?item wdt:P245 ?replacementID .
    BIND(?replacementID AS ?existingReplacement)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],mul,en". }
}

Could you perhaps help me with my problem further up? Doc Taxon (talk) 16:02, 21 November 2024 (UTC)

Yep, that shows what I was looking for. Part of why I posted here was that I'm curious why BIND(EXISTS { ?item wdt:P245 ?replacementID . } AS ?replacementExists) does not work. But it is already good to have another approach that works. Many thanks for that! --Marsupium (talk) 16:33, 21 November 2024 (UTC)
@Marsupium: if I'm not mistaken:
EXISTS is typically used to check for the presence of a pattern without binding variables. In your case you needed to check a specific combination of ?item and ?replacementID, which is better handled with the OPTIONAL approach.
I haven't tried it, but with EXISTS this would be one approach: BIND(EXISTS { VALUES (?i ?r) { (BIND(?item AS ?i) BIND(?replacementID AS ?r)) } ?i wdt:P245 ?r . } AS ?replacementExists)
This binds the specific ?item and ?replacementID values inside the EXISTS clause and ensures that it checks exactly the match you are looking for. Doc Taxon (talk) 18:09, 21 November 2024 (UTC)
No, my approach doesn't work that way either. Doc Taxon (talk) 18:26, 21 November 2024 (UTC)

COUNT giving unexpected values


I'm trying to get a list of the parent classes of an item, sorted by how many intermediate classes lie between the item and that class. The counts should be consistent with the subclass tree of the item, but they are not. This query lists all the intermediate classes instead of counting them, and produces the expected result - e.g., there are two classes between house cat (Q146) and Carnivora (Q25306). This query, however, says there are 14 classes between those two. The only difference in the queries is that in the latter one, I'm grouping by all but intermediate class, and returning (COUNT(?intermediate) AS ?intermediate_classes). How is it getting 14?! Swpb (talk) 18:11, 26 November 2024 (UTC)

@Swpb not sure what's going on, but "count(distinct ?var)" seems to have fixed it. Maybe the global "distinct" was removing the duplicates, but it is applied only after the aggregation. To say that you do not want to count every path matched by the property path, you have to put "distinct" inside the count, so that duplicates are removed per group.
The order seems to be "query => grouping => count => remove duplicates" in your query; with "count distinct" it seems to be "query => remove duplicates of the counted variable within each group => count => remove duplicates globally". author TomT0m / talk page 11:25, 27 November 2024 (UTC)
Thank you!!! I had tried "distinct", but I had the syntax wrong (outside the innermost parentheses), so I didn't know it could be used inside "count" like that. Swpb (talk) 15:50, 27 November 2024 (UTC)
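For readers who hit the same syntax question, here is a minimal sketch of where DISTINCT goes inside the aggregate. The two original queries are not reproduced in this thread, so the graph pattern below is an assumption for illustration only: it walks subclass of (P279) upward from house cat (Q146), which may not be the property path the original queries used.

```sparql
SELECT ?class (COUNT(DISTINCT ?intermediate) AS ?intermediate_classes)
WHERE {
  wd:Q146 wdt:P279+ ?intermediate .     # every class above the start item
  ?intermediate wdt:P279+ ?class .      # every class above that intermediate
}
GROUP BY ?class
ORDER BY ?intermediate_classes
```

DISTINCT sits inside the parentheses of COUNT, directly before the variable, so each intermediate class is counted once per ?class no matter how many paths lead to it. Note that in this sketch, classes with no intermediate class at all (direct parents only) do not appear in the result.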

Uniqueness problem


I have the following query to create a map of mountains in a range to be used by kartographer. So the outer SELECT must return defined columns like ?id ?geo or ?description.

#defaultView:Map
#title:Mountains of range
SELECT DISTINCT ?id ?geo
  (concat(?article, (SAMPLE(?h)), '\\n', '[[File:Wikidata-logo S.svg|16px|link=d:', substr(str(?id),32,13), ']]', (SAMPLE(?idBild))) as ?description)
  ('mountain' as ?marker_symbol) ('small' as ?marker_size)

WITH {
  SELECT ?id ?geo ?img
  #SELECT ?id ?geo (SAMPLE(?img) as ?img_sample )
    WHERE { #(SAMPLE(?img) as ?img_sample)
    ?id wdt:P4552* wd:Q592910.
    ?id wdt:P31 wd:Q8502.
    ?id wdt:P625 [].
    ?id wdt:P625 ?geo.
    OPTIONAL {
      ?id wdt:P18 ?img.
    }
  }
} AS %sub 

WHERE {
  INCLUDE %sub.
  OPTIONAL {
    ?id          p:P2044                     ?stmnode.    # elevation
    ?stmnode       psv:P2044                   ?valuenode.
    ?valuenode     wikibase:quantityAmount     ?height.
    ?valuenode     wikibase:quantityUnit       ?unit.
    # conversion to SI unit
    ?unit          p:P2370                 ?unitstmnode.   # conversion to SI unit
    ?unitstmnode   psv:P2370               ?unitvaluenode. 
    ?unitvaluenode wikibase:quantityAmount ?conversion.
    ?unitvaluenode wikibase:quantityUnit   wd:Q11573.      # meter
  }
      
  #OPTIONAL { ?id wdt:P18 ?img.}
  BIND(IF(BOUND(?img), concat('\\n', '[[File:', substr(str(?img), 52, 400), '|250px]]'), '') AS ?idBild) .
  BIND(IF(BOUND(?height), concat(' (', str(?height * ?conversion),' m)'), '') AS ?h).

  OPTIONAL {
  ?link schema:about ?id .
  ?link schema:isPartOf <https://de.wikipedia.org/> .
    }
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'de' .
                           ?id rdfs:label ?idLabel . }
  BIND(IF(BOUND(?link), concat('[[:de:', substr(str(?link),31,100), '|', ?idLabel, ']]'), ?idLabel) AS ?article).
}
GROUP BY ?id ?geo ?article ?h ?idBild

The problem now is that some matching mountains have two image (P18) statements, like Kitzbüheler Horn (Q671193), or two elevation above sea level (P2044) statements, like Hochetzkogel (Q21877978). In those cases I just want no duplicate mountains. I did not get SAMPLE to work as expected. Help is appreciated. --Herzi Pinki (talk) 21:31, 28 November 2024 (UTC)

@Herzi_Pinki:
#defaultView:Map
#title:Mountains of range


SELECT DISTINCT ?id ?geo
  (
    concat(
      ?article, 
      coalesce(SAMPLE(?h),""), 
      '\\n', 
      '[[File:Wikidata-logo S.svg|16px|link=d:', 
       substr(str(?id),32,13),
      ']]', 
      coalesce(SAMPLE(?idBild),"")
    ) as ?description
  )
  ('mountain' as ?marker_symbol) 
  ('small' as ?marker_size)

WITH {
  SELECT ?id ?geo ?idBild
    WHERE { #(SAMPLE(?img) as ?img_sample)
    ?id wdt:P4552* wd:Q592910.
    ?id wdt:P31 wd:Q8502.
    ?id wdt:P625 [].
    ?id wdt:P625 ?geo.
    OPTIONAL {
      ?id wdt:P18 ?img.
      bind(concat('\\n', '[[File:', substr(str(?img), 52, 400), '|250px]]') as ?idBild)
    }
     # values ?id {  wd:Q21877978 wd:Q671193} 
  }
} AS %sub 

WHERE {
  INCLUDE %sub.
  OPTIONAL {
    ?id p:P2044/psn:P2044/wikibase:quantityAmount ?height # using normalized quantity in meter
    bind(concat(' (', str(?height),' m)') AS ?h)
  }
      
  #OPTIONAL { ?id wdt:P18 ?img.}

  OPTIONAL {
    ?link schema:about ?id ;
          schema:isPartOf <https://de.wikipedia.org/> ;
          schema:name ?article_name . 
    bind(concat('[[:de:', ?article_name, '|', ?idLabel, ']]') AS ?article_with_name)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'de' .
                           ?id rdfs:label ?idLabel . }
  bind(coalesce(?article_with_name, ?idLabel, '') as ?article)

}
GROUP BY ?id ?geo ?article
I think this should work. I changed a few things: I used the normalized form of the quantity that now exists in the export, and I removed some of the "if bound" checks by moving things into the "optional" clauses they belong to. I removed ?h from the "group by" of the main query because it made it impossible to aggregate the height. I used "coalesce", which is a simple way to avoid "if(bound(...))": the value it returns is the first bound one in its list of parameters. It is also a nice way to make sure an unbound variable gets a value anyway, for example by assigning it the empty string, because if you apply a function to an unbound value in SPARQL the result is empty. author TomT0m / talk page 13:37, 29 November 2024 (UTC)
Oh, I also used "schema:name" to get the title of the article; this avoids parsing the URL. author TomT0m / talk page 13:42, 29 November 2024 (UTC)
@TomT0m: thanks, that makes it easier. Thanks also for explaining. The duplicates are gone now. But the height is gone too (I cannot figure out where ?conversion comes from), and so are the links back to WP (if any). --Herzi Pinki (talk) 14:33, 29 November 2024 (UTC)
@Herzi Pinki: Sorry, I did not pay enough attention. The "?conversion" is indeed a bug: it was left over from your own query and I forgot to remove it. I'm looking into the rest. author TomT0m / talk page 15:10, 29 November 2024 (UTC)
@Herzi_Pinki: This one (seems to) work as intended :
#defaultView:Map
#title:Mountains of range


SELECT DISTINCT ?id ?geo
  (
    concat(
      ?article, 
      coalesce(SAMPLE(?h),""), 
      '\\n', 
      '[[File:Wikidata-logo S.svg|16px|link=d:', 
       substr(str(?id),32,13),
      ']]', 
      coalesce(SAMPLE(?idBild),"")
    ) as ?description
  )
  ('mountain' as ?marker_symbol) 
  ('small' as ?marker_size)
  ?article 
WITH {
  SELECT ?id ?geo ?idBild
    WHERE { #(SAMPLE(?img) as ?img_sample)
    ?id wdt:P4552* wd:Q592910.
    ?id wdt:P31 wd:Q8502.
    ?id wdt:P625 [].
    ?id wdt:P625 ?geo.
    OPTIONAL {
      ?id wdt:P18 ?img.
      bind(concat('\\n', '[[File:', substr(str(?img), 52, 400), '|250px]]') as ?idBild)
    }
     # values ?id {  wd:Q21877978 wd:Q671193} 
  }
} AS %sub 

WHERE {
  INCLUDE %sub.
  OPTIONAL {
    ?id p:P2044/psn:P2044/wikibase:quantityAmount ?height # using normalized quantity in meter
    bind(concat(' (', str(?height),' m)') AS ?h)
  }
      
  #OPTIONAL { ?id wdt:P18 ?img.}

  OPTIONAL {
    ?link schema:about ?id ;
          schema:isPartOf <https://de.wikipedia.org/> ;
          schema:name ?article_name . 
  }
  #optional {
  #  ?id rdfs:label ?idLabel filter (langmatches(lang(?idLabel), "de")) .
  #}
  SERVICE wikibase:label { 
    bd:serviceParam wikibase:language "de".
    ?id rdfs:label ?idLabel
  }
  bind(
    coalesce(
      concat('[[:de:', ?article_name, '|', ?idLabel, ']]'),      # we have both ?article_name and ?idLabel
      concat('[[:de:', ?article_name, '|', ?article_name, ']]'), # we have only ?article_name
      ?idLabel,                                                  #         only ?idLabel
      ""                                                         #         none
    )
    as ?article
  )
  

}
GROUP BY ?id ?geo ?article ?article_name ?article_with_name
We have to pay attention to which variables are bound inside the "optional" clauses, especially when using the label service. It is possible to avoid the service by using the commented-out optional instead, especially if you only need one language (it is customary to use English as a default, or even "mul" now). But I think this form is not too bad :) author TomT0m / talk page 15:37, 29 November 2024 (UTC)
Thanks a lot, works like a charm now. Best --Herzi Pinki (talk) 20:39, 29 November 2024 (UTC)

P549 duplication


Hi guys !

It's been a while since I last bugged you with my requests.

I've been associating items of astronomers with their scientific articles items for a while now, and I noticed that several Mathematics Genealogy Project ID (P549) items are duplicates of existing ones (example: 1 and 2). I've been trying to identify these duplicates automatically for some time but, of course, my query attempts time out (see below), with or without limits, with or without more identifiers for the non-P549 items, etc.

Do you have any ideas on what could be done?

SELECT DISTINCT ?item1 ?item2 ?l1 ?l2
WHERE
{
  ?item1 wdt:P31 wd:Q5 ;
         wdt:P549 [] ;
         wikibase:identifiers 1 ;
         rdfs:label ?l1 .
  ?item2 wdt:P31 wd:Q5 ;
         wdt:P496 [] ;
         rdfs:label ?l2 .
  FILTER(LANG(?l1) IN ("en")) .
  FILTER(LANG(?l2) IN ("en")) .
  MINUS { ?item2 wdt:P549 [] . } .
  FILTER(?l1 = ?l2) .
}

Simon Villeneuve (talk) 12:07, 5 December 2024 (UTC)

@Simon Villeneuve: It is simpler to proceed in two steps: we can start by listing the items that have homonyms, and then find the homonyms for each of them. This yields about 8,000 names, some with dozens of homonyms:
SELECT distinct ?item1 ?l1 WHERE {
  ?item1 wdt:P31 wd:Q5;
         wdt:P549 _:b101;
         wikibase:identifiers 1 ;
         rdfs:label ?l1.

    ?itemHomonym wdt:P31 wd:Q5;
           wdt:P496 _:b102;
           rdfs:label ?l1.
    FILTER((STR(?itemHomonym)) > (STR(?item1)))
    MINUS { ?itemHomonym wdt:P549 _:b103. }

  FILTER(LANG(?l1) IN("en"))
}
There is no need to filter on "?l1=?l2"; we can simply reuse "?l1" in both graph patterns.
I am preparing a notebook to help with the next steps if needed. author TomT0m / talk page 15:26, 5 December 2024 (UTC)
Hi,
Thanks!
This is the first time I have seen a notation like _:b101. Why do you use it rather than square brackets ([])?
Also, by adding ?itemHomonym to the selected variables, I get 55,000 results.
Do you think these possible homonyms deserve to be listed somewhere? If so, where? Simon Villeneuve (talk) 16:24, 5 December 2024 (UTC)
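On the notation question, a short illustration may help. In a SPARQL graph pattern, both [] and a labelled blank node such as _:b101 behave like a variable that matches any value but cannot be selected, so the two forms should be interchangeable in the queries above. The sketch below only illustrates that equivalence and assumes the Wikidata Query Service prefixes; it does not explain why the labelled form was chosen in the reply.

```sparql
SELECT ?item WHERE {
  ?item wdt:P549 [] .          # anonymous blank node: "has some P549 value"
  # ?item wdt:P549 _:b101 .    # equivalent pattern using a labelled blank node
}
LIMIT 10
```

One difference to keep in mind: a blank node label may not be reused across different basic graph patterns in the same query, which is presumably why the reply uses three distinct labels (_:b101, _:b102, _:b103).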