Shortcuts: WD:RAQ, w.wiki/LX

Wikidata:Request a query

From Wikidata
Jump to navigation Jump to search

Query for extracting tokens from a date string in various languages

[edit]

I want to extract dates from a string in any language and for that I need a dictionary of words that represent days years or months. It currently has this format:

{
	string: 'September',
	type: 'month',
	ordinal: 9,
	lang: 'en',
},
{
	string: 'October',
	type: 'month',
	ordinal: 10,
	lang: 'en',
},
{
	string: 'Sept',
	type: 'month',
	ordinal: 9,
	lang: 'en',
},
{
	string: 'Septembers',
	type: 'month',
	ordinal: 9,
	lang: 'en',
},
{
	string: '1st',
	type: 'day',
	ordinal: 1,
	lang: 'en',
},
{
	string: '2nd',
	type: 'day',
	ordinal: 2,
	lang: 'en',
},
{
	string: '3rd',
	type: 'day',
	ordinal: 3,
	lang: 'en',
},
{
	string: 'Dezember',
	type: 'month',
	ordinal: 12,
	lang: 'de',
},
{
	string: 'Dez.',
	type: 'month',
	ordinal: 12,
	lang: 'de',
},
{
	string: '10月',
	type: 'month',
	ordinal: 10,
	lang: 'ja',
},
…

I imagined that each entry is a form of a lexeme that has a sense pointing to a day of the month or month. I have an idea on how to get the forms but I don't know how to get all the sense-items (12 month names) plus day of a month (Q116600668). I'm aware that these may not have a lexeme yet. –Shisma (talk) 10:37, 27 October 2024 (UTC)[reply]

For better performance i decided to limit the query to specific languages. Do you see more potential for optimisation? –Shisma (talk) 14:04, 27 October 2024 (UTC)[reply]

Also I don't need plurals I think –Shisma (talk) 14:12, 27 October 2024 (UTC)[reply]

SELECT DISTINCT ?ordinal ?type ?string ?lang  WHERE {
  FILTER (?lang in ('en', 'de', 'ja'))
  {
    ?thing wdt:P31 wd:Q47018901.
    ?thing wdt:P361 wd:Q12138.
    BIND ('month' as ?type)
  } UNION {
    ?thing wdt:P31 wd:Q116600668.
    BIND ('day' as ?type)
  }
  
  ?thing wdt:P1545 ?ordinal.
  
  ?sense wdt:P5137 ?thing.
  ?lexeme ontolex:sense ?sense.
  ?lexeme ontolex:lexicalForm ?form.
  ?lexeme dct:language ?language.
  ?language wdt:P218 ?lang.
  
  ?form ontolex:representation ?string.
}
Try it!

Note to self: Due to recent changes by @Mullanur: this has to be changed to:

SELECT DISTINCT ?thing ?ordinal ?type ?string ?lang  WHERE {
  FILTER (?lang in ('en', 'ja'))
  {
    ?thing wdt:P31 wd:Q47018901.
    ?thing wdt:P361 wd:Q12138.
    BIND ('month' as ?type)
  } UNION {
    ?thing wdt:P31 wd:Q116600668.
    BIND ('day' as ?type)
  }

  FILTER NOT EXISTS { ?form wikibase:grammaticalFeature wd:Q146786 }

  ?thing p:P31 ?instanceOf.
  ?instanceOf pq:P1545 ?ordinal.

  ?sense wdt:P5137 ?thing.
  ?lexeme ontolex:sense ?sense.
  ?lexeme ontolex:lexicalForm ?form.
  ?lexeme dct:language ?language.
  ?language wdt:P218 ?lang.

  ?form ontolex:representation ?string.
}
Try it!

*the series ordinal was moved to be a qualifier of the instance of statement –Shisma (talk) 10:12, 5 November 2024 (UTC)[reply]

Howto dynamically create a Point() ?

[edit]

I have a federated query, and I'm stuck at the point where I have latitude and longitude data from an external service. Next, I would like to calculate the distance from these latitude and longitude coordinates to a specific Wikidata item. However, I am unsure how to convert them into a Point() format that can be used with geo:distance(). A simplified example of the query I'm trying to get working is shown below. Do you know how this should be done?

SELECT ?place ?location ?distance WHERE {
    ?place wdt:P625 ?location .
    BIND(-122.402251 as ?lat) .
    BIND(37.789246 as ?lon) .
#    BIND("Point(?lat ?lon )"^^geo:wktLiteral as ?location2) "DOESNT WORK
    BIND("Point(-122.402251 37.789246)"^^geo:wktLiteral as ?location2) # WORKS
    BIND(geof:distance(?location, ?location2) AS ?distance)
} LIMIT 3
Try it!

--Zache (talk) 11:59, 29 October 2024 (UTC)[reply]

Try:
SELECT ?place ?location ?distance WHERE {
    ?place wdt:P625 ?location .
    BIND(-122.402251 as ?lat) .
    BIND(37.789246 as ?lon) .
    BIND(STRDT(CONCAT( 'Point(', str(?lat), ' ', str(?lon), ')' ), geo:wktLiteral) as ?location2) # NOW WORKs
 #   BIND("Point(-122.402251 37.789246)"^^geo:wktLiteral as ?location2) # WORKS
    BIND(geof:distance(?location, ?location2) AS ?distance)
} LIMIT 3
Try it!
To unpack that BIND() line, firstly ?lat and ?lon need to appear as variables in the expression, not as literal strings. But we need to use str() to turn them from numeric values to string values. Then we need to use CONCAT() to assemble them into the Point(...) string. And then I have used STRDT() to promote the string to the right data-type, which is like the ^^ that you had, but works with expressions rather than just literal strings.
Hope this helps, Jheald (talk) 12:23, 29 October 2024 (UTC)[reply]

Fetching subclasses

[edit]

Hi. I'm trying to write a query which fetches all items that link to the WWI item which also have a statement that is, or is subclass of, Surrey (historic county and county). The closest ive got is this but I cannot figure out why it's not fetching the subclasses of items. Can anyone help? Cheers Jason.nlw (talk) 09:36, 1 November 2024 (UTC)[reply]

@Jason.nlw: I think your wdt:P31|wdt:P279 is the problem. You has to write wdt:P31/wdt:P279. When I change this, then your query run into "OutOfMemory"-Error. Can you write your query more specific? Like only people or only places. Best regards. --sk (talk) 08:45, 2 November 2024 (UTC)[reply]
@Jason.nlw Being a subclass of a country does not make sense. A class is (usually) a type of real world object you can regroup according to some criteria. Real world objects, specific examples, are said to be "instances of" (P31) a class. For example Surrey is an instance of county, an as it is a real world example it cannot have instances.
You might want things that are "part of" (part of (P361)) or located in (location (P276) View with SQID) Surrey? author  TomT0m / talk page 10:38, 2 November 2024 (UTC)[reply]
Thanks @Stefan Kühn and @TomT0m. Yeah I had the whole 'subclass of Surrey' thing confused. I rewrote those bits with P131* and the query now works as required. Thanks for your help! Jason.nlw (talk) 08:18, 4 November 2024 (UTC)[reply]

Get all pages edited in the last month in all subcategories of mathematics

[edit]

Hi. How can I get all pages edited in the last month in all subcategories of mathematics. Thanks Andresv.63 (talk) 06:09, 4 November 2024 (UTC)[reply]

There isn't many items that are classified as mathematics, but here's a query for items edited last month that are related to mathematics: https://w.wiki/Bp$Y . If you meant pages on Wikipedia (the english one in this example) just use Petscan for that: https://petscan.wmcloud.org/?psid=29725855 Infrastruktur (talk) 11:45, 4 November 2024 (UTC)[reply]
@Infrastruktur
Thank you. Petscan gives almost the solution I want. In any case, I want a query to then add things: user, not my pages, pages written in many languages ​​but not mine, etc...
I would like to get that just like Petescan searches in 5 subcategories below mathematics to be able to do that in a Wikidata Query Service query
Andresv.63 (talk) 15:56, 4 November 2024 (UTC)[reply]

Movies taken in a city

[edit]

Hi, I have Property:P915 which denotes the place where a film was shot. This may be a city, a part of a city, a park, a building... Almost anything. How can I query all the films that were shot e.g. in Paris? Bináris (talk) 09:05, 4 November 2024 (UTC)[reply]

@Bináris: something like that :
select distinct ?film ?filmLabel  {
  ?film wdt:P915/(wdt:P276| wdt:P131 )* wd:Q90 

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],mul,en". }
  
}
Try it!
author  TomT0m / talk page 11:23, 4 November 2024 (UTC)[reply]
@Bináris
#defaultView:Map,Table
SELECT distinct ?item ?itemLabel ?itemDescription ?regionLabel ?coordinate
WITH {
   SELECT distinct ?region                   # get all subregions
   WHERE {
       hint:Query hint:optimizer "None" .         
       BIND(wd:Q90 AS ?state).               # of a region (Paris)
       ?region wdt:P131* ?state.            
       #MINUS { ?region wdt:P576 _:b0. }      # no dissolved regions
   }
} AS %region

WHERE {
  INCLUDE %region.
  ?item wdt:P915 ?region.   # filming location
  OPTIONAL {?region wdt:P625 ?coordinate.}    # coordinate
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
order by ?itemLabel
Try it!
With this query you get also all subregions of Paris. So you get 1317 films. Also as map or table. Best regards --sk (talk) 17:20, 4 November 2024 (UTC)[reply]
Great, thank you! Now I have to guess how to transform this to Pywikibot... Bináris (talk) 10:30, 5 November 2024 (UTC)[reply]
@Stefan Kühn This seems to be the same as my query, so you get the same films, except you have duplicate results because I don't get the place and they are likely to be filmed in several locations in Paris. 1317 films is not correct, I think the number of film is still 965.
@Bináris What do you want to do with those ? If you like we can get the titles of the movie page on a Wikipedia for easier use with pywikibot. (something like including ?article schema:isPartOf <https://en.wikipedia.org> ; schema:name ?title; schema:about ?film.
that would get
select distinct ?name  { ?film wdt:P915/(wdt:P276| wdt:P131 )* wd:Q90 . ?article schema:isPartOf <https://en.wikipedia.org/> ; schema:name ?name; schema:about ?film. }
Try it!
 : 640 results on enwiki. author  TomT0m / talk page 12:25, 5 November 2024 (UTC)[reply]
Thank you. Duplications are no problem, and the exact places are an advantage. Whereever I travel in the world, I want to know where are these locations and what films are related to that very place. Wikipedias where they have articles are not important, that's why I want to use Wikidata. Bináris (talk) 08:02, 6 November 2024 (UTC)[reply]
Pywikibot makes sense to get the results for a journey automatically. Bináris (talk) 08:03, 6 November 2024 (UTC)[reply]
@Bináris I don't see how you need pywikibot for this ?
If you want which film is "related" you might also want to use narrative location (P840) View with SQID as well, as a a film may not be shot at the location the action is supposed to take place. author  TomT0m / talk page 09:30, 7 November 2024 (UTC)[reply]
Yes, I know, thank you. I want to create lists automatically for a bunch of sities. Pywikibot is much more comfortable for me, than run a query each time manually. Bináris (talk) 11:45, 8 November 2024 (UTC)[reply]

Annual energy output by countries?

[edit]

Hi, Is there a query to obtain annual electricity production for countries? There is a property P4131 for annual energy output, but it is determined for power plants. Summing up all power plants gives a very approximate result.

Is it possible to get annual electricity generation by countries? Maybe by referencing external data sources? TVBig (talk) 17:42, 4 November 2024 (UTC)[reply]

@User:TVBig
#defaultView:Table
SELECT #?item ?itemLabel ?itemDescription 
?countryLabel (sum(?output) as ?sum_output)
{  
  ?item wdt:P4131 ?output.   
  ?item wdt:P17 ?country.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
group by ?countryLabel
Try it!
This is my solution. But, the result is not correct. Because you have also plants out of service with a former energy output. This had to be excluded. And some countries have strange sums. If I look at all items with P4131:
#defaultView:Table
SELECT ?item ?itemLabel ?itemDescription 
?countryLabel ?output
{  
  ?item wdt:P4131 ?output.   
  ?item wdt:P17 ?country.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
Try it!
You see different values. Also negative values. I think we have to normalize also the values KW, MW, GW. But I dont found a way to do this. For areas you see a solution here Wikidata:Request_a_query#prefecture_level_cities_sorted_by_area. --sk (talk) 20:37, 4 November 2024 (UTC)[reply]


@sk: To get the normalized quantity it's useful to use the full statement form and not the truthy one. For standards quantities, Wikidata has a database of normalization conversion factors it uses to output, see mw:Wikibase/Indexing/RDF_Dump_Format#Normalised_quantity This gives something like this :

#defaultView:Table
SELECT ?item ?itemLabel ?itemDescription 
?countryLabel ?output ?unitLabel
{  
  ?item p:P4131 [ 
    psn:P4131 [
      wikibase:quantityAmount ?output ; wikibase:quantityUnit ?unit 
    ]
  ].   
  ?item wdt:P17 ?country.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
Try it!

author  TomT0m / talk page 11:50, 5 November 2024 (UTC)[reply]

Thanks @TomT0m. I will read it. Here is a new query for @TVBig. It is the Output in TeraJoule per Country.
:#defaultView:Table
:SELECT ?countryLabel (sum(?output/1000/1000/1000/1000) as ?output_TJ)
:{  
:  ?item p:P4131 [ 
:    psn:P4131 [
:      wikibase:quantityAmount ?output ; wikibase:quantityUnit ?unit 
:    ]
:  ].   
:  ?item wdt:P17 ?country.
:  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
:}
:group by ?countryLabel
:order by desc(?output_TJ)
:
Try it!
But I think we have an error in one data. France make 1000 times more then China. I search for the biggest output in France and found Centrale hydroélectrique de Romanche Gavet (Q30740209) with 503.659.561 GWh. Please check this data. --sk (talk) 23:42, 7 November 2024 (UTC)[reply]

Most recently edited lexemes for a language in WDQS

[edit]

A question was raised at the Search Platform Team hour: Most recently edited lexemes for a language and URL to the API would be possible: https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&formatversion=2&srsearch=linksto%3AQ9035&srnamespace=146&srlimit=500&srsort=last_edit_desc

I am wondering if this can be changed to a WDQS query? I have been trying versions of the following SPARQL query and have not found a version working:

SELECT ?item ?itemLabel WHERE {
 {
   SELECT ?item WHERE {
    SERVICE wikibase:mwapi {
      bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                      wikibase:api "Generator" ;
                      mwapi:generator "search" ;
                      mwapi:gsrsearch "linksto:Q9035" ;
                      # mwapi:gsrnamespace "146" ; 
                      # mwapi:gsrsort "last_edit_desc" ; 
                      mwapi:gsrlimit "100" .
      ?item wikibase:apiOutputItem mwapi:item .
    }
  } LIMIT 100
 }
 hint:Prior hint:runFirst "true".
 SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
Try it!

Finn Årup Nielsen (fnielsen) (talk) 17:38, 6 November 2024 (UTC)[reply]

I manage to find the the output should be mwapi:title and not mwapi:item:
SELECT
  ?lexeme ?lemma
WHERE {
  {
    SELECT ?lexeme WHERE {
      SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                        wikibase:api "Generator" ;
                        mwapi:generator "search" ;
                        mwapi:gsrsearch "linksto:Q9035" ;
                        mwapi:gsrnamespace "146" ; 
                        mwapi:gsrsort "last_edit_desc" ; 
                        mwapi:gsrlimit 100 .
        ?item_ wikibase:apiOutputItem mwapi:title .
      }
      BIND(IRI(CONCAT("http://www.wikidata.org/entity/", SUBSTR(STR(?item_), 39))) AS ?lexeme)
    }
    LIMIT 100
  }
  hint:Prior hint:runFirst "true".
  ?lexeme wikibase:lemma ?lemma .
}
Try it!
Finn Årup Nielsen (fnielsen) (talk) 22:45, 6 November 2024 (UTC)[reply]
Thanks to having the modification date available to us, we can do this in pure SPARQL: https://w.wiki/Bsmj Infrastruktur (talk) 16:15, 7 November 2024 (UTC)[reply]
Thanks! Interestingly the mwapi version may be faster than the non-mwapi version. Finn Årup Nielsen (fnielsen) (talk) 16:59, 7 November 2024 (UTC)[reply]

Missing LIS terms in Hungarian

[edit]

Hi, I would need some help. I would like to retrieve library and information science (Q13420675) terms that are not translated into Hungarian, so we can translate the missing terms all at once. I expect no more than 1-200 terms. Example: document retrieval (Q1638872).

The properties I would need: QID, EN preferred label, EN description, EN alternative label(s), DE preferred label, DE alternative label(s). Thank you in advance! Bencemac (talk) 20:19, 12 November 2024 (UTC)[reply]