Wikidata:Property proposal/Library classifications' IDs for topics

Library classifications' IDs for topics


Dewey Decimal Classification


Option A

Originally proposed at Wikidata:Property proposal/Authority control
   Not done
Description: subject classification identifier
Represents: Dewey Decimal Classification (Q48460)
Data type: External identifier
Domain: not written works and editions
Allowed values: \d{3}|\d{3}\.\d+|[12456]--\d+|3[ABC]?--\d+
Example 1: social science (Q34749) → 300
Example 2: art (Q735) → 700
Example 3: radio communications (Q872) → 791.44
Example 4: Finland (Q33) → 2--4897
Source: https://www.oclc.org/en/dewey/resources.html (and others to be copied from Dewey Decimal Classification (P1036))
Planned use: Transfer pertinent values from Dewey Decimal Classification (P1036) (originally designed for topics, not works)
Expected completeness: always incomplete (Q21873886)
Robot and gadget jobs: See planned use
See also: See the other properties in this page, plus Regensburg Classification (P1150), Chinese Library Classification (P1189) and Basisklassifikation (P5748)
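
To make the allowed-values pattern for the Dewey property concrete, here is a minimal Python sketch that checks the four topic examples against it. The pattern is copied from the box above; the function name `classify_ddc` and the two labels it returns are illustrative assumptions, as is the reading that values containing `--` are auxiliary-table numbers. The sketch also assumes the pattern must match the whole value.

```python
import re

# Pattern copied from the "Allowed values" row of the Dewey box.
DDC_PATTERN = re.compile(r"\d{3}|\d{3}\.\d+|[12456]--\d+|3[ABC]?--\d+")


def classify_ddc(value: str) -> str:
    """Return a rough label for a DDC value (labels are illustrative, not official)."""
    if DDC_PATTERN.fullmatch(value) is None:
        return "invalid"
    # Values with "--" are the ones accepted by the last two alternatives.
    return "table number" if "--" in value else "schedule number"


examples = {
    "social science (Q34749)": "300",
    "art (Q735)": "700",
    "radio communications (Q872)": "791.44",
    "Finland (Q33)": "2--4897",
}

for label, value in examples.items():
    print(f"{label}: {value} -> {classify_ddc(value)}")
```

A two-digit value such as `30` is rejected because every alternative without `--` requires exactly three leading digits.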

Option B

Originally proposed at Wikidata:Property proposal/Authority control
Description: DDC number assigned to a publication
Represents: Dewey Decimal Classification (Q48460)
Data type: String
Domain: only written works and editions
Allowed values: \d{3}|\d{3}\.\d+|[12456]--\d+|3[ABC]?--\d+
Example 1: Detto del gatto lupesco (Q19128645) → 851.1
Example 2: America (Paris) (Q19851135) → 306.098
Example 3: Who's who in France (Online) (Q19983442) → 920.044
Example 4: Postcolonial Gothic Fictions from the Caribbean, Canada, Australia and New Zealand (Q20592194) → 823.0872909
Source: https://www.oclc.org/en/dewey/resources.html (and others to be copied from Dewey Decimal Classification (P1036))
Planned use: Transfer pertinent values from Dewey Decimal Classification (P1036) (originally designed for topics, not works)
Expected completeness: always incomplete (Q21873886)
Robot and gadget jobs: See planned use
See also: See the other properties in this page (Option B)

Library of Congress Classification


Option A

Originally proposed at Wikidata:Property proposal/Authority control
   Not done
Description: subject classification identifier used in the Library of Congress Classification system
Represents: Library of Congress Classification (Q621080)
Data type: External identifier
Domain: not written works and editions
Allowed values: [A-Z]{1,3}(\d+(\.\d+)?( *\.[A-Z]{0,3}\d+([ -]\.?[A-Z]{0,3}\d+)?)?( *\d+[a-z]*)?)?
Example 1: social science (Q34749) → H
Example 2: Augustine of Hippo (Q8018) → BR1720.A9
Example 3: Library of Congress Classification (Q621080) → Z696.U4-Z696.U7
Source: https://www.loc.gov/aba/publications/FreeLCC/freelcc.html#About (and others to be copied from Library of Congress Classification (P1149))
Planned use: Transfer pertinent values from Library of Congress Classification (P1149) (originally designed for topics, not works)
Expected completeness: always incomplete (Q21873886)
Robot and gadget jobs: See planned use
See also: See the other properties in this page, plus Regensburg Classification (P1150), Chinese Library Classification (P1189) and Basisklassifikation (P5748)

Option B

Originally proposed at Wikidata:Property proposal/Authority control
Description: Library of Congress Classification number assigned to a publication
Represents: Library of Congress Classification (Q621080)
Data type: String
Domain: only written works and editions
Allowed values: [A-Z]{1,3}(\d+(\.\d+)?( *\.[A-Z]{0,3}\d+([ -]\.?[A-Z]{0,3}\d+)?)?( *\d+[a-z]*)?)?
Example 1: Southern Cross (Q17156005) → Z239.2 R58 1951h
Example 2: Existentialistische Marx-Interpretation (Q17416765) → B945.M2983 E9
Example 3: The Drunkard's Walk (Q18011319) → QA273.M63 2008
Source: https://www.loc.gov/aba/publications/FreeLCC/freelcc.html#About (and others to be copied from Library of Congress Classification (P1149))
Planned use: Transfer pertinent values from Library of Congress Classification (P1149) (originally designed for topics, not works)
Expected completeness: always incomplete (Q21873886)
Robot and gadget jobs: See planned use
See also: See the other properties in this page (Option B)
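
The Library of Congress pattern is harder to read by eye than the Dewey one, so a small checker can help when reviewing example values. The sketch below is only an illustration: the pattern is copied from the boxes above, the helper name `is_valid_lcc` is an assumption, and it assumes the pattern has to match the entire value. It prints a result for each example from both boxes, so any value that falls outside the pattern becomes visible.

```python
import re

# Pattern copied from the "Allowed values" row of the Library of Congress boxes.
LCC_PATTERN = re.compile(
    r"[A-Z]{1,3}(\d+(\.\d+)?( *\.[A-Z]{0,3}\d+([ -]\.?[A-Z]{0,3}\d+)?)?( *\d+[a-z]*)?)?"
)


def is_valid_lcc(value: str) -> bool:
    """True when the whole value matches the proposed pattern."""
    return LCC_PATTERN.fullmatch(value) is not None


examples = [
    "H",                 # social science (Q34749)
    "BR1720.A9",         # Augustine of Hippo (Q8018)
    "Z696.U4-Z696.U7",   # Library of Congress Classification (Q621080)
    "Z239.2 R58 1951h",  # Southern Cross (Q17156005)
    "B945.M2983 E9",     # Existentialistische Marx-Interpretation (Q17416765)
    "QA273.M63 2008",    # The Drunkard's Walk (Q18011319)
]

for value in examples:
    print(f"{value!r}: {'match' if is_valid_lcc(value) else 'no match'}")
```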

Universal Decimal Classification


Option A

Originally proposed at Wikidata:Property proposal/Authority control
   Not done
Description: type of library classification
Represents: Universal Decimal Classification (Q243350)
Data type: External identifier
Domain: not written works and editions
Allowed values: ((\=|\-|\`|\(|\(\=|\*|\+|\:{1,2}|\")?\d{1,3}(\.\d{1,3})*(\)|\")?)+
Example 1: social science (Q34749) → 3
Example 2: sociology of culture (Q1570681) → 316.7
Example 3: politics (Q7163) → 32
Source: http://www.udcc.org/udcsummary/php/index.php (and another to be copied from Universal Decimal Classification (P1190))
Planned use: Transfer pertinent values from Universal Decimal Classification (P1190) (originally designed for topics, not works)
Expected completeness: always incomplete (Q21873886)
Robot and gadget jobs: See planned use
See also: See the other properties in this page, plus Regensburg Classification (P1150), Chinese Library Classification (P1189) and Basisklassifikation (P5748)

Option B

Originally proposed at Wikidata:Property proposal/Authority control
Description: UDC number assigned to a publication
Represents: Universal Decimal Classification (Q243350)
Data type: String
Domain: only written works and editions
Allowed values: ((\=|\-|\`|\(|\(\=|\*|\+|\:{1,2}|\")?\d{1,3}(\.\d{1,3})*(\)|\")?)+
Example 1: Unequal catch : gender and fisheries on the Lake Victoria landing sites in Tanzania (Q55739133) → 001(6)
Example 2: Mis montañas (Q56643531) → 821.134.2(82)-992
Example 3: Introduction to Probability (Q63166569) → 519.2
Source: http://www.udcc.org/udcsummary/php/index.php (and another to be copied from Universal Decimal Classification (P1190))
Planned use: Transfer pertinent values from Universal Decimal Classification (P1190) (originally designed for topics, not works)
Expected completeness: always incomplete (Q21873886)
Robot and gadget jobs: See planned use
See also: See the other properties in this page (Option B)
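
The UDC pattern is a repetition of one building block: an optional connector sign, one to three digits, optional dotted subdivisions, and an optional closing sign. As a hedged illustration of how compound numbers on publications decompose under that pattern, the following Python sketch splits a value into its components. The building block is copied from the boxes above; the names `UDC_COMPONENT` and `split_udc` are assumptions made for this example.

```python
import re

# One repetition of the pattern in the "Allowed values" row of the UDC boxes.
UDC_COMPONENT = r'(\=|\-|\`|\(|\(\=|\*|\+|\:{1,2}|\")?\d{1,3}(\.\d{1,3})*(\)|\")?'
# The full pattern is that component repeated one or more times.
UDC_PATTERN = re.compile("(" + UDC_COMPONENT + ")+")


def split_udc(value: str) -> list:
    """Return the components of a valid UDC value, or an empty list if invalid."""
    if UDC_PATTERN.fullmatch(value) is None:
        return []
    return [match.group(0) for match in re.finditer(UDC_COMPONENT, value)]


for value in ["3", "316.7", "001(6)", "821.134.2(82)-992", "519.2"]:
    print(f"{value} -> {split_udc(value)}")
```

Simple topic values such as `316.7` come back as a single component, while the publication examples split into a main number followed by auxiliaries.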

Regensburg Classification


Option B

Originally proposed at Wikidata:Property proposal/Authority control

Description: RVK number assigned to a publication
Represents: Regensburg Classification (Q2137453)
Data type: String
Domain: only written works and editions
Allowed values: (LD,)?[A-Z]([A-Z]( [0-9]+([a-z]|\.[0-9]+)?( [A-Z][0-9]*)?)?)?( - [A-Z]([A-Z]( [0-9]+([a-z]|\.[0-9]+)?( [A-Z][0-9]*)?)?)?)?
Example 1: Postcolonial Gothic Fictions from the Caribbean, Canada, Australia and New Zealand (Q20592194) → HP 1145
Example 2: Southern cross (Q95961758) → HU 9800
Example 3: Existentialistische Marx-Interpretation (Q17416765) → CI 3882, MC 811, CG 5357, MS 4715, CC 7910 and CI 3884
Example 4: Wittgenstein. Tractatus (comments, 2001) (Q95949110) → CI 5017 and CI 5015
Source: http://rvk.uni-regensburg.de
Planned use: Slow manual addition (see comments of Nstrc)
Expected completeness: always incomplete (Q21873886)
See also: See the other properties in this page (Option B)
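
Because a single publication can carry several RVK notations, each notation would be stored as its own value and checked individually. The Python sketch below illustrates this under stated assumptions: the pattern is copied from the box above, the helper `split_notations` is hypothetical, and the splitting on commas and "and" only reflects how the examples are written on this page (it would not handle notations with the `LD,` prefix).

```python
import re

# Pattern copied from the "Allowed values" row of the RVK box.
RVK_PATTERN = re.compile(
    r"(LD,)?[A-Z]([A-Z]( [0-9]+([a-z]|\.[0-9]+)?( [A-Z][0-9]*)?)?)?"
    r"( - [A-Z]([A-Z]( [0-9]+([a-z]|\.[0-9]+)?( [A-Z][0-9]*)?)?)?)?"
)


def split_notations(text: str) -> list:
    """Split a list written as 'A, B and C' into single notations."""
    parts = re.split(r",\s*|\s+and\s+", text)
    return [part.strip() for part in parts if part.strip()]


def is_valid_rvk(value: str) -> bool:
    """True when the whole value matches the proposed pattern."""
    return RVK_PATTERN.fullmatch(value) is not None


listed = "CI 3882, MC 811, CG 5357, MS 4715, CC 7910 and CI 3884"
for notation in split_notations(listed):
    print(f"{notation}: {'match' if is_valid_rvk(notation) else 'no match'}")

# The pattern expects a space between the letters and the digits.
print(is_valid_rvk("HP 1145"), is_valid_rvk("HP1145"))
```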

Motivation


Previous discussions: Wikidata:Identifier migration/1#good to convert; Property talk:P1036#Distinct value constraint and conversion to external-id datatype

So, this is a major problem:

  • The three existing properties were originally designed to be used on topic items (as their proposals and examples demonstrate) as external identifiers, in order to show that one topic is assigned a single identifier in each classification (e.g. social science (Q34749) is designated 300, H or 3). Accordingly, these properties have a unique value constraint. That they should have the external-id datatype (because they are external identifiers) and a unique value constraint is confirmed by the fact that Chinese Library Classification (P1189) and Basisklassifikation (P5748) have both characteristics, as do Regensburg Classification (P1150) (ready to be converted) and Colon Classification (ready to be created).
  • However, these three properties have been widely used not to indicate the correspondence between a topic-item on Wikidata and a topic in these classifications, but to classify individual works and editions. This is the reason for the high number of unique value constraint violations, which have prevented their conversion from the string datatype to the external-id datatype.

Now, given that:

  • the external-id datatype and a unique value constraint are appropriate for properties that match a topic-item to a topic in a classification
  • the string datatype and no unique value constraint are appropriate for properties that assign a classification to a single work-item or edition-item

it is evident that three new properties are needed.
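
To illustrate why classifying works with a property that carries a unique value constraint produces violations, here is a minimal Python sketch. The data is a small in-memory sample: the first three pairs come from the examples on this page, while `WORK-X` is a hypothetical second work invented for the illustration. The function name `unique_value_violations` is likewise an assumption, not an existing tool.

```python
from collections import defaultdict


def unique_value_violations(statements):
    """Map each value used on more than one item to the sorted list of those items."""
    items_by_value = defaultdict(set)
    for item, value in statements:
        items_by_value[value].add(item)
    return {
        value: sorted(items)
        for value, items in items_by_value.items()
        if len(items) > 1
    }


statements = [
    ("Q34749", "300"),       # social science, a topic item
    ("Q735", "700"),         # art, a topic item
    ("Q19128645", "851.1"),  # Detto del gatto lupesco, a work
    ("WORK-X", "851.1"),     # hypothetical second work in the same class
]

print(unique_value_violations(statements))
```

Topics map one-to-one onto class numbers and never collide, whereas any two works in the same class share a value and are reported.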

Given that the old properties are widely used in work-items and edition-items,

  • it is probably more convenient (option A, according to which I compiled the boxes above) to move the topic values to the three new properties proposed here, so that the old properties keep the string datatype while the new properties take the external-id datatype and inherit the unique value constraint, which will be removed from the old ones;
  • alternatively, if for some reason you think it better to keep topics in the old properties, the opposite is possible (option B): creating the new properties with the string datatype and no unique value constraint, moving the work and edition values to them, and then migrating the old properties to the external-id datatype.
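
The two options differ only in which kind of value moves and which property ends up with which datatype. The following Python sketch restates that as a lookup, purely as a reading aid; the function names `plan` and `route`, the labels "old property" and "new property", and the item kinds "topic" and "work" are assumptions introduced for the example.

```python
def plan(option: str) -> dict:
    """Summarise the outcome of option A or B as described in the two bullets."""
    if option == "A":
        return {
            "topic": "new property",
            "work": "old property",
            "old property datatype": "string",
            "new property datatype": "external-id",
            "unique value constraint on": "new property",
        }
    if option == "B":
        return {
            "topic": "old property",
            "work": "new property",
            "old property datatype": "external-id",
            "new property datatype": "string",
            "unique value constraint on": "old property",
        }
    raise ValueError(f"unknown option: {option!r}")


def route(item_kind: str, option: str) -> str:
    """Return where a value on a 'topic' or 'work' item ends up."""
    if item_kind not in ("topic", "work"):
        raise ValueError(f"unknown item kind: {item_kind!r}")
    return plan(option)[item_kind]


for option in ("A", "B"):
    print(option, "topic ->", route("topic", option), "| work ->", route("work", option))
```

Here "work" stands for both works and editions. In either option the property that holds topic values is the one with the external-id datatype and the unique value constraint.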

Please vote explicitly for option A or B; if no option is specified, the vote will be counted for A, according to which I compiled the boxes above. Epìdosis 22:21, 18 May 2020 (UTC)

  Notified participants of WikiProject Libraries --Epìdosis 22:26, 18 May 2020 (UTC)[reply]

  Notified participants of WikiProject Authority control --Epìdosis 22:26, 18 May 2020 (UTC)[reply]


@Jura1: suggested a very interesting comparison between these three cases and Iconclass notation (P1256)/depicts Iconclass notation (P1257), which is worth mentioning here. --Epìdosis 17:26, 19 May 2020 (UTC)[reply]

Discussion

  • Oppose all. If existing properties are being misused, the incorrect uses should be removed (or migrated to new properties; by a bot if necessary). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:49, 18 May 2020 (UTC)
    @Pigsonthewing: In fact option B is actually proposing that the incorrect uses should be "migrated to new properties". What is the problem? --Epìdosis 22:52, 18 May 2020 (UTC)
  • It took me some minutes to read carefully about this issue, which I did not know of, and I   Support the effort. I trust the first procedural option if it is easier.--Alexmar983 (talk) 23:27, 18 May 2020 (UTC)
  • In general, it seems like it is more important to maintain the principle that properties' meanings should not change, than to change them (and break all of the entity-matching that had been done using the intended property meaning) just because there are misuses. If at all practicable, Option B actually seems like the better solution to me, even if it is more work for Wikidata. Dominic (talk) 23:46, 18 May 2020 (UTC)
  • Hard question, thanks Epìdosis for looking into that. If I understood everything correctly, I think I prefer option B, which is a bit harder but cleaner. Cheers, VIGNERON (talk) 07:10, 19 May 2020 (UTC)
  • Very important issue to be fixed as soon as possible. Thank you Epìdosis. I would prefer Option B, even if it will take more effort to develop. --Carlobia (talk) 09:56, 19 May 2020 (UTC)
  •   Question these three properties have been widely used not to indicate the correspondence between a topic-item on Wikidata and a topic in these classifications
    Do we really *need* to represent those relationships? If Wikidata already knows that a topic is a subtopic of another one, through an existing hierarchical property like subclass of (P279) or something similar, and if algebra is a subtopic of mathematics, and mathematics is, say, 500 in the Dewey classification, then algebra could naturally be classified as 500. Aren't most if not all of those statements just a variant of this argument, hence redundant or, worse, harmful? Could we not instead write a few queries to retrieve the same information and build a better Wikidata-style topic classification, justified (sourced) by the topic relationships in external classifications? author  TomT0m / talk page 11:02, 19 May 2020 (UTC)
    @TomT0m: I substantially agree, but I don't understand what consequence you are drawing from this reasoning: algebra (Q3968) has Dewey Decimal Classification (P1036) "512", which cannot be inferred from mathematics (Q395) having Dewey Decimal Classification (P1036) "500". Are you perhaps saying that we can delete all library-classification statements from work items, leaving them only on topic items and linking to those with main subject (P921)? This would seem reasonable to me, but it cannot be done in a simple way, so it would be better first to migrate the IDs on work items to a new property and then gradually replace these new properties with main subject (P921) whenever possible. --Epìdosis 11:38, 19 May 2020 (UTC)
  •   Support I slightly prefer option (A) but one way or another I do believe this needs to be sorted out. ArthurPSmith (talk) 17:28, 19 May 2020 (UTC)[reply]
  •   Support. I vote for option B. -- Bargioni 🗣 18:00, 19 May 2020 (UTC)[reply]
  •   Support. I agree with ArthurPSmith. Option A is a better choice. --Csisc (talk) 01:52, 21 May 2020 (UTC)[reply]
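
The exchange between TomT0m and Epìdosis above turns on whether a class number can be derived from a topic hierarchy. The Python sketch below is a toy illustration of that point, not a description of how Wikidata works: the parent link from algebra to mathematics is the hypothetical one used in the discussion, and `inherited_class` is an invented helper that walks up the hierarchy to the nearest classified ancestor.

```python
def inherited_class(topic, parent_of, class_of):
    """Return the class of the topic or of its nearest classified ancestor, else None."""
    seen = set()
    current = topic
    while current is not None and current not in seen:
        if current in class_of:
            return class_of[current]
        seen.add(current)  # guard against cycles in the hierarchy
        current = parent_of.get(current)
    return None


# Hypothetical hierarchy taken from the example in the discussion.
parent_of = {"algebra (Q3968)": "mathematics (Q395)"}

# With only the broader topic classified, algebra inherits the broader class.
only_broad = {"mathematics (Q395)": "500"}
print(inherited_class("algebra (Q3968)", parent_of, only_broad))

# With an explicit statement, the more specific class is available.
with_explicit = {"mathematics (Q395)": "500", "algebra (Q3968)": "512"}
print(inherited_class("algebra (Q3968)", parent_of, with_explicit))
```

Inheritance alone can only yield the broader class, which is why the explicit statement on the narrower topic carries information that the hierarchy does not.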

Let's summarize: as of now, there are 5 votes for option B (Dominic, VIGNERON, Carlobia, Bargioni and I, since I've been convinced by VIGNERON), 6 with the unclear vote of Pigsonthewing (who said "migrated to new properties"), against 3 votes for option A (Alexmar983, ArthurPSmith, Csisc). If there is no other vote, I think we can proceed on Tuesday 26th enacting option B. --Epìdosis 17:11, 24 May 2020 (UTC)[reply]

  Support the conversion (either A or B) but with some clarifications (@Epìdosis: please consider):

Library classifications and main subject (P921)


Sorry, I'm too late for the party, otherwise I might have   Oppose. The existing library classification properties such as Dewey Decimal Classification (P1036) are to be used on topics. They should be migrated from datatype string to datatype external-identifier (A, but no new properties). Does proposal B imply creating a second twin property for each library classification property? Do you know there are hundreds of library classifications out there? What makes the three above so special? I'd prefer to use main subject (P921) instead. -- JakobVoss (talk) 21:22, 30 May 2020 (UTC)

@JakobVoss: Yes, proposal B, substantially approved with no contrary votes, implies creating a second twin property for these three library classification properties. I understand your objection: using main subject (P921) would be better. For this reason, it is our intention not to allow other library classification properties to be used on publications. Unfortunately, these three have been widely used on publications, so we think it's better to "save" these regrettable uses by transferring them to new IDs, which (I think) can be gradually phased out by replacing them with appropriate main subject (P921) statements. Proposal B can be viewed just as a transition towards the elimination of library-classification properties on publications, in order to supersede them with main subject (P921). I hope you can agree with this clarification. Good night,
--Epìdosis 21:44, 30 May 2020 (UTC)
Library classifications do not (necessarily) refer only to the subjects of works. E.g.
* LCC B2 is philosophical "Periodicals. Serials. French and Belgian"
* LCC B21 is philosophical "Collected works (nonserial). English and American"
* B41 is philosophical "Dictionaries. English and American"
etc.
On the one hand, it would be misleading to transform such classifications into values for main subject (P921).
On the other hand, it would be a pity to lose such information and search options (i.e. options for search strategies).
I would agree that "LC Subjects" could be used as values for main subject (P921); but the classification of a certain book is information in its own right, because the class is part of a systematic (inter alia: hierarchical) structure.
--Nstrc (talk) 12:00, 31 May 2020 (UTC)
@Nstrc: Thank you for the clarification! So, I can reformulate: "transition towards the elimination of library-classification properties on publications, in order to supersede them with main subject (P921) whenever appropriate; in the other cases library-classification properties will be retained on publications". --Epìdosis 15:44, 31 May 2020 (UTC)
But that would ruin the system of the library classifications. However, I'm convinced that there is a general difference between an indefinite[1] and unhierarchical quantity of key words (tags) [which could be used with main subject (P921)] and a definite list of classes.
Key words may be better for a precise description of the book at issue, whereas a definite list of classes is better for grouping books and finding similar books.
Cf. also Regensburger Verbundklassifikation (P1150) and Classification of edition items?. (Sorry for the scattered statements. I learnt only step by step that these are intertwined problems.)
--Nstrc (talk) 16:45, 31 May 2020 (UTC)
@Nstrc: OK, maybe I'm not totally understanding your position. Are you suggesting that the same property (e.g. Dewey Decimal Classification (P1036)) should be used both for topics and for publications? --Epìdosis 16:50, 31 May 2020 (UTC)[reply]
I would agree with the distinction between the two kinds of usages (however, I do not really understand what the sense and purpose of the usage for topics is). My point is, that I would like to use library-classifications - and namely: as well Regensburg Classification (P1150) - for publications.
They are designed for that; they are well established and they contain information that should not be lost and that should be used for Wikidata as well.
--Nstrc (talk) 17:08, 31 May 2020 (UTC)

---

@Nstrc: You write "I would agree with the distinction between the two kinds of usages", so in fact you support option B. You say you want "to use library-classifications - and namely: as well Regensburg Classification (P1150) - for publications", so you need a twin property of Regensburg Classification for publications: I've added the proposal above, please add 4 examples. --Epìdosis 17:43, 31 May 2020 (UTC)[reply]

@Epìdosis: Thank you. Should I create hypothetical examples? Or should I search for items where Regensburg Classification (P1150) is already used for a certain publication, although this is not allowed [i.e. searching for constraint violations regarding Regensburg Classification (P1150)]?
--Nstrc (talk) 17:53, 31 May 2020 (UTC)
@Nstrc: As you prefer, both types of examples are fine. --Epìdosis 19:19, 31 May 2020 (UTC)[reply]


@Epìdosis:

Examples for RVK

it would be:

  • HP 1145: Anglistik. Amerikanistik / Englischsprachige Literatur außerhalb der Britischen Inseln und der USA / Commonwealthliteratur, Postkoloniale Literatur, Neue Englischsprachige Literatur / Gattungsgeschichte / Epik, Erzählende Prosa

(source: https://kxp.k10plus.de/DB=2.1/PPNSET?PPN=602029244)


  • HU 9800: Anglistik. Amerikanistik / Amerikanische Literatur / 20. Jahrhundert / Literaturgeschichte / Sonstige

(source: https://kxp.k10plus.de/DB=2.1/PPNSET?PPN=336804237)


  • CI 3882: Philosophie / Geschichte der Philosophie / Geschichte der Philosophie des Abendlandes von Antike bis 20. Jahrhundert / Philosophie des 20. Jahrhunderts / Deutschland und deutschsprachige Länder / Autoren / Autoren M / Marcuse, Herbert / Teilsammlungen
  • MC 8113: Politologie / Geschichte der politischen Philosophie und der Ideologien / Marxismus / Allgemeines / Marx, Karl / Einzelprobleme
  • CG 5357: Philosophie / Geschichte der Philosophie / Geschichte der Philosophie des Abendlandes von Antike bis 20. Jahrhundert / Philosophie des 19. Jahrhunderts / Deutschland und deutschsprachige Gebiete / Autoren / Autoren M / Marx, Karl / Abhandlungen, Studien
  • MS 4715: Soziologie / Spezielle Soziologien / Politische Soziologie / Marxismus, Materialismus
  • CC 7910: Philosophie / Systematische Philosophie / Rechts-, Gesellschafts- und Staatsphilosophie / Marxismus, Marxismus-Leninismus, Kommunismus, Neomarxismus
  • CI 3884: Philosophie / Geschichte der Philosophie / Geschichte der Philosophie des Abendlandes von Antike bis 20. Jahrhundert / Philosophie des 20. Jahrhunderts / Deutschland und deutschsprachige Länder / Autoren / Autoren M / Marcuse, Herbert / Einzelschriften

(source: https://kxp.k10plus.de/DB=2.1/PPNSET?PPN=1628486317)


  • CI 5015: Philosophie / Geschichte der Philosophie / Geschichte der Philosophie des Abendlandes von Antike bis 20. Jahrhundert / Philosophie des 20. Jahrhunderts / Deutschland und deutschsprachige Länder / Autoren / Autoren W / Wittgenstein, Ludwig / Kommentare zu einzelnen Werken

(source: https://kxp.k10plus.de/DB=2.1/PPNSET?PPN=321082443)

--Nstrc (talk) 19:31, 31 May 2020 (UTC)[reply]

Class names


Let's follow Servizio Bibliotecario Nazionale

In particular, I would appreciate it if in future we did it similarly to SBN, as for the book Hegel et la pensée moderne:

  • "Dewey · 193 (23.) FILOSOFIA OCCIDENTALE MODERNA. GERMANIA E AUSTRIA"

I.e. mentioning not only the class number but also the class name, and linking the class number to a list of publications belonging to the same class.

And the same regarding the other library-classifications.

--Nstrc (talk) 17:22, 31 May 2020 (UTC)[reply]

Further example for the same 'style', GVG for the book Alice doesn't:
* Basisklassifikation: 24.31 (Systematische Filmwissenschaft)
--Nstrc (talk) 18:16, 31 May 2020 (UTC)[reply]
Connecting library classification identifiers with labels and links to publications is a matter of queries and user interface design. It's enough to have class identifiers so the rest can be looked up via API depending on your needs (languages, type of catalog, publication etc.). For instance look up labels of Basisklassifikation 24.31. Try out https://coli-conc.gbv.de/cocoda/ for more APIs to library classification data. To query lists of books there are APIs as well. -- JakobVoss (talk) 13:50, 2 June 2020 (UTC)[reply]
@Nstrc: See this edit done via Cocoda. Select a Wikidata item on the left and a RVK class on the right, make sure to log in via Wikidata and to select Wikidata as mapping registry, so you can add RVK mappings to Wikidata while browsing both classification systems. I'd be very happy if you give it a try! -- JakobVoss (talk) 14:01, 2 June 2020 (UTC)[reply]

Conclusion


New summary: 6 votes for option B (Dominic, VIGNERON, Carlobia, Bargioni, Nstrc and I, since I've been convinced by VIGNERON), 7 with the unclear vote of Pigsonthewing (who said "migrated to new properties") and 8 with the vote of JakobVoss ("The existing library classification properties such as Dewey Decimal Classification (P1036) are to be used on topics. They should be migrated from datatype string to datatype external-identifier (A, but no new properties).", which means that he supports the use of old properties as in proposal B, not A, while he opposes the creation of new properties at all), against 3 votes for option A (Alexmar983, ArthurPSmith, Csisc). As for consensus reached by option B, old properties have just been converted from "string" datatype to "external-id" datatype; so, for property creators, the 4 properties above are ready for creation, as a fulfillment of option B. --Epìdosis 22:36, 3 June 2020 (UTC)[reply]

@Epìdosis: so if I understand correctly, we should create a new property corresponding to option B (Regensburg Classification). We already have Regensburg Classification (P1150), so I do not understand exactly what the difference is here. Pamputt (talk) 05:37, 19 June 2020 (UTC)
@Pamputt: The properties to be created are the four above marked as ready. The difference is that the four existing properties (having datatype external-id) should be used on items which are topics (there is a one-to-one correspondence between topic and ID), while the new to-be-created properties (having datatype string) should be used on items which are works (they indicate the topic treated by the work, so there isn't a one-to-one correspondence, since several works can treat the same topic). If you have other questions, please ask :) --Epìdosis 07:34, 19 June 2020 (UTC)
@Epìdosis: I've created Dewey Decimal Classification (works and editions) (P8359). I am not sure I fully understand the difference between Dewey Decimal Classification (works and editions) (P8359) and Dewey Decimal Classification (P1036), so I'll let you fill it in so that I can copy it for the other properties. Pamputt (talk) 17:08, 19 June 2020 (UTC)
@Pamputt: Compiled ;-) --Epìdosis 18:50, 19 June 2020 (UTC)[reply]

Dewey Decimal Classification (works and editions) (P8359), Library of Congress Classification (works and editions) (P8360), Universal Decimal Classification (works and editions) (P8361) and Regensburg Classification (works and editions) (P8362) have been created. Pamputt (talk) 19:03, 19 June 2020 (UTC)[reply]

Notes

  1. As long as they exist as Wikidata items. However, the quantity of Wikidata items is indefinite as well.