Property talk:P3198

From Wikidata

Documentation

JewishGen Locality ID
identifier of a town in The JewishGen Communities Database
Associated item: JewishGen (Q820610)
Data type: External identifier
Corresponding template: no label (Q14406229)
Domain:
  According to this template: places, geographic location (Q2221906)
  According to statements in the property: geographic location (Q2221906), geographical feature (Q618123) or pogrom (Q177716)
  When possible, data should only be stored as statements
Allowed values: [1-9]\d{0,6}
Example:
  Radzanów (Q3010530): 524980
  Aleppo (Q41183): 2541857
  Barysaw (Q19313): 1941088
Source: en:JewishGen#Databases (note: this information should be moved to a property statement; use property source website for the property (P1896))
Formatter URL: https://www.jewishgen.org/Communities/community.php?usbgn=-$1
See also: GeoNames ID (P1566), Yad Vashem Encyclopedia of the Ghettos ID (P3735)
Current uses:
  Total: 308
  Main statement: 305 (99% of uses)
  Qualifier: 3 (1% of uses)
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P3198#Unique value, SPARQL (every item), SPARQL (by value)
Single value: this property generally contains a single value. (Help)
List of violations of this constraint: Database reports/Constraint violations/P3198#Single value, hourly updated report, SPARQL
Format “[1-9]\d{0,6}”: value must be formatted using this pattern (PCRE syntax). (Help)
List of violations of this constraint: Database reports/Constraint violations/P3198#Format, hourly updated report, SPARQL
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P3198#Entity types
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified ways (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P3198#Scope, SPARQL
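
As a rough illustration of how the format constraint and the formatter URL fit together, the following Python sketch checks a value against the allowed-values pattern and then expands the URL template. The function names are hypothetical; only the pattern and the URL template are taken from the documentation above.

```python
import re

# Pattern and URL template copied from the property documentation.
ALLOWED_VALUES = re.compile(r"[1-9]\d{0,6}")
FORMATTER_URL = "https://www.jewishgen.org/Communities/community.php?usbgn=-$1"


def is_valid_locality_id(value: str) -> bool:
    """Return True if the whole value matches the allowed-values pattern."""
    return ALLOWED_VALUES.fullmatch(value) is not None


def expand_formatter_url(value: str) -> str:
    """Substitute the value for $1 in the formatter URL (hypothetical helper)."""
    if not is_valid_locality_id(value):
        raise ValueError(f"not a valid JewishGen Locality ID: {value!r}")
    return FORMATTER_URL.replace("$1", value)


print(expand_formatter_url("524980"))
```

Note that the minus sign is part of the URL template, not of the stored value, so a value such as "-524980" is rejected by the pattern.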

Constraints

Discussion

I am interested in making the existing JewishGen database more visible. It is mission-critical as a genealogy resource, yet under-recognized and under-utilized both within and outside the genealogy community. The database contains a lot of important information that would be a great addition to Wikidata and would enhance the geographic information available on Wikipedia. BrillLyle (talk) 16:01, 6 September 2016 (UTC)

USBGN vs Geonames

@BrillLyle, YULdigitalpreservation, ChristianKl: JewishGen places use U.S. Board on Geographic Names (USBGN) IDs.

  • Does someone have any comparison of Geonames vs USBGN data? E.g. coverage, number of places, etc.
  • Wikidata has 4.6M geographic locations, see https://tools.wmflabs.org/sqid/#/view?id=Q2221906. But not all of these are places on Earth, e.g. 100k are physico-geographical objects, such as Limbo (where souls reside in Catholic afterlife ;-)
  • Geonames has 11M places. Geonames has historic place names, but doesn't have the historic administrative hierarchy (what I call the Nazi geography).
  • Wikidata has 1.5M links to Geonames, see https://www.wikidata.org/wiki/Property_talk:P1566. I'd say this is about 45-50% of all possible matches.
  • Geonames has 540k links to Wikipedia, see https://gist.github.com/VladimirAlexiev/e2048299bc4f9483bf9b02739a2c9776
  • USBGN (JewishGen Locality ID) has only 4 links. Is USBGN already coreferenced to something, or do we face a huge coreferencing task?

--Vladimir Alexiev (talk) 20:13, 13 March 2017 (UTC)
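
One way to track coreferencing progress is to count the items that carry each identifier, and the items that carry both. The sketch below only builds SPARQL query text for the Wikidata Query Service and does not send it. The helper name is hypothetical, and the `wdt:` prefix is assumed to be predefined by the endpoint.

```python
def count_query(*properties: str) -> str:
    """Build a SPARQL query counting items that have all given properties."""
    if not properties:
        raise ValueError("at least one property ID is required")
    # One triple pattern per property, each with its own value variable.
    patterns = "\n".join(
        f"  ?item wdt:{pid} ?v{i} ." for i, pid in enumerate(properties)
    )
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {\n"
        f"{patterns}\n"
        "}"
    )


print(count_query("P3198"))           # items with a JewishGen Locality ID
print(count_query("P3198", "P1566"))  # items that also have a GeoNames ID
```

Comparing the two counts gives a rough sense of how many JewishGen-linked items could be reached through existing GeoNames links.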

USBGN vs JewishGen Locality ID

I got a doc "USBGN Numbers on JewishGen.docx" that says:

  • For whatever reason, “USBGN Feature Code Numbers” are most often a negative number.
  • “Artificial USBGN Code” starts with the letter “A”, followed by a 4-digit number. For example: “A0001”. As of early 2013, JewishGen has created Artificial USBGN Codes for 45 localities.

But the JewishGen Locality ID is recorded as a positive number (e.g. 524980), makes no allowance for the "Artificial" codes, and the minus sign is hard-coded in the "formatter URL". Does this need to be changed?

I vote NO because:

  • 45 Artificial codes are not worth the change.
  • Negative IDs would confuse most people (including me).

However, if USBGN includes non-negative IDs then we need to change it. --Vladimir Alexiev (talk) 20:15, 13 March 2017 (UTC)

USBGN definitely includes non-negative IDs, and these are used in JewishGen beyond the artificial codes, for example https://www.jewishgen.org/Communities/community.php?usbgn=11522461. So the stored USBGN numbers need to include the negative sign where it is used, and the negative sign needs to be removed from the formatter URL. When looking at geographic locations I don't see USBGN IDs listed, so presumably this has not yet been integrated into Wikidata. Perhaps the first step would be to add the USBGN data to Wikidata? A good script seems necessary for this. USBGN data includes multiple entries for the same entity (including versions of the name in different languages and sometimes historical names). Perhaps there's a way to piggy-back on one of the existing Wikidata collections such as GeoNames ID (P1566), VIAF ID (P214), Library of Congress authority ID (P244), or Who's on First ID (P6766), if we can find a mapping between one of those IDs and the USBGN IDs. --Philip Trauring 11:04, 17 December 2019 (UTC)
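
To make the trade-off in this thread concrete, here is a minimal sketch of how the three kinds of code mentioned above (negative, non-negative, and artificial) fare under the current scheme, where the value is stored unsigned and the formatter URL supplies the minus sign. The function name and the category labels are hypothetical; the pattern for artificial codes follows the description quoted from the JewishGen document.

```python
import re

# Allowed-values pattern of P3198 (the value is stored without a sign).
CURRENT_FORMAT = re.compile(r"[1-9][0-9]{0,6}")
# Artificial codes as described in the quoted document, e.g. "A0001".
ARTIFICIAL = re.compile(r"A[0-9]{4}")


def classify_usbgn(code: str) -> str:
    """Classify a code as it appears in a JewishGen URL (?usbgn=<code>)."""
    if ARTIFICIAL.fullmatch(code):
        return "artificial: not representable"
    if code.startswith("-") and CURRENT_FORMAT.fullmatch(code[1:]):
        # The formatter URL re-adds the minus sign, so only the digits are stored.
        return "negative: store as " + code[1:]
    if re.fullmatch(r"[0-9]+", code):
        return "non-negative: formatter URL would add a wrong minus sign"
    return "unrecognized"


for code in ("-524980", "11522461", "A0001"):
    print(code, "->", classify_usbgn(code))
```

Under these assumptions, 11522461 falls into the non-negative category, and at eight digits it would also exceed the seven-digit limit of the current pattern.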

Example: Szezerzec

Laura Brazzo from CDEC: Does anyone here know anything about the place "Szezerzec"? It is given as the birthplace of two persons in the YV Central Database of Shoah Victims' Names. It is in neither Geonames nor Getty TGN, and I couldn't find the parent country. I've found info in the ancestry.com archive but I'm not sure about the source. Is it a place that changed its name or disappeared?

Erika Herzog: I use the JewishGen locality database: Szezerzec (now in Ukraine, close to Lwiw). This is exactly why I wanted to add the JewishGen ID, and possibly the name variations, to Wikidata: to improve discoverability. A lot of people don't know about this great database.

Vladimir Alexiev (talk) 09:31, 23 July 2017 (UTC): Great catch.

Щирець (Ukr), Shchyrets' (Ukr-Latn), Szczerzec (Pol), שטעריץ / שטשעריץ (Yid), Shtcherzetz (Yid-Latn), Shchirets (Rus-Latn), Shchezhets, Szczyrzec, Scyrec', Shtsherits, Shterits

"Is USBGN already coreferenced to something, or do we face a huge coreferencing task?" To do real matching we need a lot of data work. So:

  1. @BrillLyle: do you have any information on how many places it includes?
  2. Can we get the data in bulk? What is the license? Who should we contact to make an official request?
  3. Is there a place hierarchy? The website even has a historic place hierarchy, which is great.