Wikidata:WikiProject Chemistry/Proposal:Models
Chemistry is an interesting field and has many concepts. We have pure chemical compound (Q11173), chemical substance (Q79529), mixture (Q169336), ion (Q36496), chemical element (Q11344), and much, much more. Even chemical compounds are hard: Wikipedia (Q52) has entries for fully stereochemically defined compounds, for racemic mixtures, for compound classes (like fatty acid (Q61476)), and for compounds with undefined stereochemistry.
The WikiProject Chemistry only defines guidance for pure chemical compounds: "each pure chemical substance (i.e. not mixtures or solutions) the property instance of (P31) with the value chemical compound (Q11173)". However, it does not provide guidance on how to model many of the other topics, leading to questions:
- Fats and fatty acids (though confounded with the chemical difference between fat and fatty acid)
- Related compounds
- Radioactive element
- Mixtures
- A specific chemical element (Q11344) is a class?
- Hydrates
- Elements and periods
- Following a specific chemical ontology?
- Problem of concept definition for some ions
(and many more)
On this page I am proposing some models (and I will want to use ShEx for this, see these WikidataCon slides) to start a discussion to formalize the approaches we want to use. This matters (lol) to us, as we have a lot of compound classes (like fatty acid (Q61476)) in WikiPathways (Q7999828). For this reason I have started aspects for Scholia (Q45340488) to visualize a specific chemical and (still a pull request) chemical classes. However, as Scholia uses the Wikidata SPARQL endpoint, it benefits from consistent structuring of data.
Chemical Compounds
All chemicals having the following properties have to be considered as instance of (P31): chemical compound (Q11173) or as an instance of a subclass of chemical compound (Q11173):
- constant chemical composition
- composed of several elements
- no global electric charge
- fully defined stereochemistry (i.e. cis/trans or E/Z configurations, o-, p-, m- configurations, D/L configurations, R/S configuration, endo/exo configurations)
- atoms or groups of atoms are linked by covalent bond, ionic bond, metallic bond or Coordinate covalent bond
- can be isolated in pure form and is stable enough to allow measurement of chemical and physical properties like melting point,... → this is controversial: hypothetical compounds and compounds that cannot be isolated in pure form, but are known from their derivatives would be excluded. Cf. en:Category:Hypothetical chemical compounds. Wostr (talk) 21:39, 9 January 2018 (UTC)
This definition includes neutral salts, stable radicals like nitric oxide (Q207843), some hydrates and some coordination complexes
Examples:
Particular cases
A) Chemicals with incompletely defined stereochemistry have to be classified as subclass of (P279): chemical compound (Q11173) and not as instance of (P31): chemical compound (Q11173)
Examples:
B) Chemicals composed of only one chemical element have to be classified as instance of (P31): simple substance (Q2512777) and not as instance of (P31): chemical compound (Q11173)
Examples:
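The criteria above, together with particular cases A and B, can be sketched as a small decision function. This is only an illustration of the proposal, not an agreed implementation: the `Chemical` record and its field names are hypothetical, and the bond-type and isolability criteria are left out because they are hard to express as simple flags.

```python
from dataclasses import dataclass

P31, P279 = "P31", "P279"            # instance of, subclass of
CHEMICAL_COMPOUND = "Q11173"
SIMPLE_SUBSTANCE = "Q2512777"


@dataclass
class Chemical:
    """Hypothetical record; the fields are illustrative, not Wikidata properties."""
    element_count: int                   # number of distinct chemical elements
    net_charge: int                      # global electric charge
    constant_composition: bool
    stereochemistry_fully_defined: bool


def propose_statement(c: Chemical):
    """Return a (property, value) pair, or None when this section gives no rule."""
    if c.net_charge != 0 or not c.constant_composition:
        return None                      # e.g. ions and mixtures, modelled elsewhere
    if c.element_count == 1:
        return (P31, SIMPLE_SUBSTANCE)   # particular case B
    if not c.stereochemistry_fully_defined:
        return (P279, CHEMICAL_COMPOUND)  # particular case A
    return (P31, CHEMICAL_COMPOUND)      # general rule
```

For instance, `propose_statement(Chemical(2, 0, True, False))` yields the subclass of (P279) statement from case A.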
Propositions by Wostr
- Treat all chemical compounds as portions of matter and classify under chemical compound (Q11173) (which would be subclass of (P279) = chemical substance (Q79529)). Thus, all chemical compounds would be classes. Chemical compound = chemical substance composed of electrically neutral molecular entities made of atoms of at least two chemical elements.
- chemical entity (Q43460564) (?)
- chemical substance (Q79529)
- chemical compound (Q11173)
- compound classes like heterocyclic compounds, oxides etc. (the whole classification tree has to be discussed)
- chemical compounds items (subclass of (P279) to this point)
- in rare cases, where item about 'molecule of chemical compound' exists; e.g. 'water molecule' instance of (P31) = water (Q283).
- Comment: Incorrect. Water is made of water molecules, so the right relation here is « part of ». Instance of is inappropriate; an instance of « water » is « the water I drank an hour ago ». author TomT0m / talk page 15:52, 13 February 2018 (UTC)
- chemical element (Q11344)
- here would be all items about chemical elements, but also items like hydrogen atom (Q6643508) or dihydrogen (Q3027893)
- mixture (Q169336)
- racemic mixtures here
- molecular entity (Q2393187)
- chemical group (functional group (Q170409) ?)
- ion (Q36496)
- but also items like 'water molecule' and hydrogen atom (Q6643508)
- However, there are some inconsistencies in this approach, but IMHO it is closer to what we are trying to collect in WD (substances with their properties and their uses [e.g. by classification into drug classes], and not just entities).
- A more consistent approach is to adopt the ChEBI classification. As every chemical compound would be treated as a molecule (entity), not as a portion of matter (substance), the definition would be: electrically neutral molecular entity made of atoms of at least two chemical elements. 'Water molecule' would be equal to 'water', so only one item would be necessary. But: in chemical compound items we would have many properties that pertain only to chemical substances (like surface tension, vapor pressure, safety classification and many others). And the second, bigger BUT: chemical elements cannot be treated as 'entities' because 'chemical element' includes all the isotopes, all the allotropic forms etc.
(Racemic) Mixtures
Because the mixture does not generally define the ratio of the amount of the components, I opt for defining it as a class.
Examples:
- rac-tilidine (Q45024802) (racemic mixture)
Model
- subclass of (P279) mixture (Q169336) (**)
- two or more has part(s) (P527) something
(**) Which would also indicate that the compound is a subclass of (P279) chemical substance (Q79529)
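The two rules of this model can be checked mechanically. The following is a minimal sketch on an in-memory dictionary of statements, as a stand-in until a real ShEx shape exists; the dictionary layout and the function name are my own assumptions, and the part IDs for rac-tilidine (Q45024802) are placeholders, not real items.

```python
P279, P527 = "P279", "P527"   # subclass of, has part(s)
MIXTURE = "Q169336"


def conforms_to_mixture_model(statements):
    """statements: dict mapping a property ID to a list of value item IDs."""
    is_mixture_subclass = MIXTURE in statements.get(P279, [])
    distinct_parts = set(statements.get(P527, []))
    # The model asks for "two or more has part(s)".
    return is_mixture_subclass and len(distinct_parts) >= 2


# Hypothetical statements for rac-tilidine (Q45024802); part IDs are placeholders.
rac_tilidine = {
    P279: [MIXTURE],
    P527: ["Q_ENANTIOMER_1", "Q_ENANTIOMER_2"],
}
```

Counting distinct parts rather than raw statements is a design choice of this sketch, so that a duplicated has part(s) (P527) statement does not make an item pass.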
Compound Classes
Wikidata supports properties for two important databases that define compound classes: ChEBI (Q902623) (ChEBI ID (P683)) and LIPID MAPS (Q20968889) (LIPID MAPS ID (P2063)).
Examples:
Model
- subclass of (P279)* chemical compound (Q11173): any compound class is a subclass of chemical compound (Q11173)
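The `*` in this model means that the subclass of (P279) link may be indirect. A minimal sketch of that transitive check on an in-memory graph follows; the dictionary layout and the intermediate item ID in the example are assumptions for illustration only, not actual Wikidata content.

```python
CHEMICAL_COMPOUND = "Q11173"


def is_compound_class(item, subclass_of):
    """True if `item` reaches chemical compound (Q11173) via one or more P279 steps.

    subclass_of: dict mapping an item ID to the list of its direct P279 values.
    """
    seen = set()
    stack = list(subclass_of.get(item, []))
    while stack:
        current = stack.pop()
        if current == CHEMICAL_COMPOUND:
            return True
        if current not in seen:          # guard against cycles in the hierarchy
            seen.add(current)
            stack.extend(subclass_of.get(current, []))
    return False


# Hypothetical hierarchy: fatty acid (Q61476) under a placeholder intermediate class.
example_graph = {
    "Q61476": ["Q_INTERMEDIATE_CLASS"],
    "Q_INTERMEDIATE_CLASS": [CHEMICAL_COMPOUND],
}
```

Note that a SPARQL property path written with `*` also matches zero steps, so it would return chemical compound (Q11173) itself; this sketch requires at least one step.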
Proposition by Wostr
Metaclasses:
- group or class of chemical substances (Q17339814) (formerly with 'chemical compound class' alias but chemical substance ≠ chemical compound)
- structural class of chemical entities (Q47154513) – for chemical compound classes where the distinction is made on a structural basis (per the IUPAC definition available in this item); unlimited number of compounds in the class (i.e. every compound that matches the conditions of a class is a member of this class)
- classes like azole (Q419639), aromatic amine (Q584276), phenol (Q407142) and many, many others would have instance of (P31) with structural class of chemical entities (Q47154513).
- equiv. of 'open classes' in ChEBI
- Q? group of chemical compounds – for groups of compounds that have a limited number of members (and structural similarity is not needed)
Every item about compound class should have instance of (P31) with any metaclass from above.
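This rule lends itself to a simple consistency report. The sketch below lists class items that lack a metaclass; the function name and the input layout are hypothetical, and the proposed 'group of chemical compounds' metaclass is left out because it has no item ID yet.

```python
P31 = "P31"  # instance of
METACLASSES = {
    "Q17339814",  # group or class of chemical substances
    "Q47154513",  # structural class of chemical entities
}


def missing_metaclass(items):
    """Return the sorted IDs of class items with no P31 value among METACLASSES.

    items: dict mapping an item ID to a dict of property ID -> list of values.
    """
    return sorted(
        qid
        for qid, statements in items.items()
        if not METACLASSES.intersection(statements.get(P31, []))
    )
```

Called on a dictionary of compound class items, it returns those that still need an instance of (P31) statement under this proposition.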
Chemical Elements
See Wikidata:Elements WikiProject
Model
- instance of (P31) chemical element (Q11344)
- no properties related to pure substances or compounds made from this element, like molecular oxygen (oxygen (Q629)) and C60