Property talk:P828

Documentation

has cause
underlying cause, entity that ultimately resulted in this effect

Description

underlying cause, thing that ultimately resulted in this effect. See Help:Modeling causes for usage notes. Inverse of has effect (P1542).

Represents

cause (Q2574811)

Data type

Item

Template parameter

new infobox from French medicine project would be interested in this information (link?)

Domain

term (note: this should be moved to the property statements)

Allowed values

all types of causes (note: this should be moved to the property statements)

Example

mercury poisoning (Q408089) → mercury (Q925)
malaria (Q12156) → Plasmodium malariae (Q133969)

Source

reliable sources (note: this information should be moved to a property statement; use property source website for the property (P1896))

Tracking: usage

Category:Pages using Wikidata property P828 (Q26250078)

Tracking: local yes, WD no

no label (Q101364578)

does not have cause (P9353)

Remove 'possible'?

This property was mentioned at Wikidata:Property_proposal/Person#medical_condition. Even in the context of items about medical conditions and procedures, I think 'possible' is not needed in this property name, as multiple 'causes' are presumed to be alternatives. For example, on bone fracture (Q68833) each use of 'possible causes' should have a source in the medical literature; multiple values are different possible causes, and they would have date qualifiers if the cause is no longer accepted by the medical profession. John Vandenberg (talk) 05:11, 3 December 2013 (UTC)

Done by user:Tobias1984 - thank you! John Vandenberg (talk) 04:56, 12 December 2013 (UTC)

I still see 'possible' as an alternative label. I concur with the above and suggest that it also be removed. It's an open world: anything is possible, and that is not what we want to encourage with this property. --Genewiki123 (talk) 17:04, 9 September 2014 (UTC)

constraints

The "One of" constraint is not suitable, as we can't enumerate all possible causes of illness. I'll try to add a constraint on the type of item instead. Infovarius (talk) 13:00, 27 December 2013 (UTC)

A better way to model causation

Tobias1984, WS, Matthiassamwald, Danrok, John Vandenberg, others, let's improve the way we model causes. For a tl;dr of what I'm talking about, see the proposal and examples, and let me know what you think.

This property is currently labeled "medical causes", which I think should be broader. We should have a generic property to structure data about causes. This would allow us to not only say "bone fracture cause stress fracture", but also:

2007–2008 financial crisis (Q896666) cause United States housing bubble (Q2928006) (and more)
Cretaceous–Paleogene extinction event (Q55811) cause Chicxulub impactor (Q5096573)
American Civil War (Q8676) cause slavery in the United States (Q118382)
rainbow (Q1052) cause reflection (Q165939), refraction (Q72277)
evolution (Q1063) cause natural selection (Q43478), genetic drift (Q486420)

Such "cause" statements are clearly incomplete. The values in the statements above are only one link in a chain of events that lead to the subject. There are more immediate causes and more underlying causes. For many effects, there are also significant contributing factors that are not directly related to the cause of the effect.

Death is a model system for structuring statements about cause. Our mortality cause properties cause of death (P509) and manner of death (P1196) use the controlled vocabulary of the World Health Organization and the U.S. Standard Death Certificate. A typical example would be "cause of death: gunshot wound; manner of death: homicide".

Causes are often more complex. Consider the death of James A. Garfield, modeled in https://www.wikidata.org/wiki/Q34597#P509. As Wikipedia explains, Garfield was shot twice -- once in the arm and once in the back -- on July 2, 1881. Then, on "Monday, September 19, 1881, at 10:20 p.m. President Garfield suffered a massive heart attack and a ruptured splenic artery aneurysm, following blood poisoning and bronchial pneumonia." Here there are several other linked causes: heart attack, splenic artery aneurysm, blood poisoning and bronchial pneumonia. But those immediate issues were ultimately caused by Garfield's gunshot wounds. Garfield's death was accelerated by his doctors' dirty, pre-modern medical technique.

Ask Google "james garfield cause of death" and you'll get "Sepsis, Pneumonia, Heart attack". The most important cause is missing! The underlying cause of death for James Garfield was gunshot wounds sustained two months before his death. Also, the misguided treatment by Garfield's doctors actually poisoned him, and was thus a significant contributing factor of his death. The complexity of Garfield's cause of death is uncommon but not very rare. Other effects like the American Civil War and the 2008 financial crisis have causes that are at least as complex.

These effects all have a chain of direct causes. The American Civil War had slavery as its underlying cause, but that led to a more immediate cause, the Battle of Fort Sumter (Q543165). The same holds for the 2008 financial collapse and the United States housing bubble (Q2928006), which led to an immediate cause, the United States subprime mortgage crisis (Q844541). And so on.

A proposal:

Change the label of this property from "medical causes" to simply "cause". Allow it to be applied beyond medicine.
Cause would have the alias "underlying cause". Multiple values would be allowed, but fewer would be better.
Create a new property immediate cause. This would often be used in conjunction with cause, and would apply when the immediate cause directly followed any of the values in cause. Multiple values would be allowed.
Create a new property "contributing factor" to represent any significant circumstances that led to the effect, but did not directly result in the underlying cause.
Create properties "immediate cause of death" and "contributing factor of death" for mortality properties.

Examples:

James Garfield (see here for context)

cause of death: gunshot wound

number: 2
applies to part: arm, back
point in time: July 2, 1881

immediate cause of death: heart attack, splenic aneurysm, blood poisoning, pneumonia
contributing factor of death: septic technique, starvation

Cretaceous–Paleogene extinction event

cause: Chicxulub impactor
immediate cause: starvation, impact winter

American Civil War (see here for context)

cause: slavery
immediate cause: Battle of Fort Sumter, United States presidential election of 1860, ...
contributing factor: States' rights, caning of Charles Sumner, Dred Scott v. Sandford, Bleeding Kansas, ...
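
The three tiers proposed above can be sketched as a small data structure. This is only an illustration in Python, not anything Wikidata provides: the class name `CausalModel`, its field names and the `statements` helper are all hypothetical, and the values are copied from the American Civil War example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CausalModel:
    """Hypothetical container for the three proposed tiers of causation."""
    effect: str
    cause: List[str] = field(default_factory=list)              # underlying cause(s)
    immediate_cause: List[str] = field(default_factory=list)
    contributing_factor: List[str] = field(default_factory=list)

    def statements(self) -> List[Tuple[str, str, str]]:
        """Flatten the model into (subject, property, value) triples."""
        triples = []
        for prop in ("cause", "immediate_cause", "contributing_factor"):
            for value in getattr(self, prop):
                triples.append((self.effect, prop.replace("_", " "), value))
        return triples


civil_war = CausalModel(
    effect="American Civil War",
    cause=["slavery"],
    immediate_cause=[
        "Battle of Fort Sumter",
        "United States presidential election of 1860",
    ],
    contributing_factor=[
        "States' rights",
        "caning of Charles Sumner",
        "Dred Scott v. Sandford",
        "Bleeding Kansas",
    ],
)

for triple in civil_war.statements():
    print(triple)
```

Keeping each tier as its own list mirrors the idea of separate properties rather than qualifiers, so a consumer can ask for one tier without inspecting the others.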

Analyzing causation in terms of underlying and immediate cause has precedent not only in the standardized vocabulary of demographic/epidemiological documents like death certificates, but also in work in history, anthropology and philosophy. We even have a Wikipedia article on it: Proximate and ultimate causation (proximate: immediate, ultimate: underlying). Separating proximate and ultimate causes is a key feature in books like Jared Diamond's excellent Guns, Germs and Steel; see here.

I think the proposed approach strikes a good balance between simplicity and comprehensiveness. Modeling this data as property values and not qualifier values also eases querying; recall how qualifiers will not be queryable in the foreseeable future. Making our modeling of causes broader and more robust should enable some interesting applications.

What do you think? Emw (talk) 02:54, 9 September 2014 (UTC)

I think this is more or less related to processes. Evolution is a process, something rather abstract. If we try to tell the story of plant evolution, we could see it as a sequence of events, like "speciation event" maybe? (If a speciation is an event, speciation can be seen as a class of events, or as a process.) In French, natural selection does not really strike me as a cause but as a mechanism involved in the evolution process. Death does seem to have causes, as a particular death is an event. This bothers me in the same way as the fact that the part of property can be applied to both classes and instances.

As for the use of qualifiers, we need to find a compromise between their future use and the level of detail we want to achieve. I don't think it is time yet to try to be as expressive with Wikidata statements as we could be in a Wikipedia article. The chain of events that leads to an event is of course interesting to have, but maybe we should limit ourselves while waiting for an app that can actually use and display the data in a meaningful way?

In general, causes form a graph: https://www.google.com/search?q=causal+graph&tbm=isch Each node of the graph can be seen as an event, and maybe the arrows can be seen as processes that lead from an event or a set of events to another event. It's easy to model that in Wikidata, of course. I think a causal graph necessarily has a granularity: if you zoom in on an edge, you can probably build another, more detailed graph of what is involved in that process, with another set of edges. The heart attack process could perhaps be described in more biological terms involving the cells of the heart. So essentially we need to ask: what level of detail do we want to go to? An immediate cause might not be that immediate if we want more detail. And the more detail we want, the more items we might need, if we take one item per node in the causal graph.

I think what your model actually lacks is a pattern of significant processes that we might want to allow or consider at the same level. The treatment of people's deaths in the analysis of historical events might not always need to be as detailed as a forensic analysis. For example, at the level of plate tectonics (Q7950), the significant events are continent splits and mountain chain creation, but not something like galaxy creation or a volcanic eruption, which are too broad or too narrow. Do we need a notion of process that links the causes and consequences at a somewhat acceptable granularity level? For example, if we have a set of plausible processes in speciation analysis in biology and we want to tell the story of a specific speciation, we might use these processes as the edges of the graph. I don't know, something like population separation speciation as a speciation subtype. An acceptable type of cause for population separation speciation might be continent separation. TomT0m ^(talk) 11:41, 9 September 2014 (UTC)

TomT0m, I agree that the ideal model for causation would be some weighted directed graph (including cycles as needed for positive feedback, etc.), but I don't see a straightforward way to currently achieve that on Wikidata. I also agree that our statements about causes need to be reasonably scoped. For example, yes President Garfield's heart attack was caused by a blockage or rupture of a particular blood vessel or chamber of his heart, but that's likely too much information. Similarly, a typhoon in India might be caused in part by the beating of a butterfly's wings in Mexico, but that would also be too much information, with too negligible and distant a cause. Same with the causal link between mountain range creation and galaxy creation: yes, one led to another, but linking them in a Wikidata item would be rather silly. We agree on that. I don't foresee many users making such claims.

I do not think we need to restrict cause or its related properties to only have values that are events or processes. Things get murky at the top of an ontology, and I don't think we should need to worry users with whether States' rights or Dred Scott v. Sandford are processes. Emw (talk) 12:57, 9 September 2014 (UTC)
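
The graph view and the scoping concern discussed above can be illustrated with a plain, unweighted directed graph in Python. This is a rough sketch under stated assumptions: the edge list is a simplified reading of the Garfield example, "unsanitary treatment" is a made-up label for the doctors' technique, and the function name `upstream_causes` and its `max_depth` cut-off are hypothetical stand-ins for "reasonable scope".

```python
from collections import deque

# Edges point from cause to effect. Simplified and illustrative only.
causal_edges = [
    ("gunshot wound", "blood poisoning"),
    ("unsanitary treatment", "blood poisoning"),
    ("blood poisoning", "heart attack"),
    ("heart attack", "death"),
]


def upstream_causes(effect, edges, max_depth):
    """Return {cause: hops} for causes reachable within max_depth hops of effect."""
    parents = {}
    for cause, eff in edges:
        parents.setdefault(eff, []).append(cause)

    found = {}
    queue = deque([(effect, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not look further back than the chosen scope
        for cause in parents.get(node, []):
            # The visited check also keeps feedback cycles from looping forever.
            if cause != effect and cause not in found:
                found[cause] = depth + 1
                queue.append((cause, depth + 1))
    return found


print(upstream_causes("death", causal_edges, max_depth=1))
print(upstream_causes("death", causal_edges, max_depth=3))
```

With a depth of 1 only the most immediate cause is reported; raising the limit walks further back toward underlying causes, which is one crude way to express granularity.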

I'd support this proposal. I don't see a good reason to have a "medical cause" unless it was somehow modeled as a sub-property of "cause" that included additional medically relevant constraints. I also think the division between cause, immediate cause, and contributing factor strikes a reasonable balance between expressivity and usability. I think more sophisticated structures, such as the one TomT0m describes, will need to be tied to specific contexts and can emerge as new or subproperties as the modeling needs and the data arise.--Genewiki123 (talk) 17:35, 9 September 2014 (UTC)

I have to say though that your example does make me a little uncomfortable. The connection between the concept of slavery and the American Civil War is very hard to compare to that between a gunshot wound and a man's death; it's certainly not equivalent. I would model the concept of slavery as a contributing factor, related to but not subsuming states' rights conflicts. It almost makes me think you should just take out "cause" and divide things up between "immediate cause" and "contributing factor". That might result in less arguing downstream...--Genewiki123 (talk) 17:35, 9 September 2014 (UTC)

Genewiki123: it was almost an edit conflict, I also wanted to suggest using only "contributing factor". :) Just imagine describing underlying causes of War in Donbas (Q16335075)... there would be many sources, I guess... Or maybe we can call it "significant contributing factor"? Danneks (talk) 19:02, 9 September 2014 (UTC)

Genewiki123, Danneks, reliable sources disagree on the underlying causes of many wars, especially ongoing ones like the War in Donbass. For these cases, sources aligned with each participant will often make statements about underlying causes, immediate causes and contributing factors that contradict each other. But they will often talk in those terms: underlying and immediate causes, and contributing factors. In my opinion we could use this three-tiered approach and also capture knowledge diversity. Just don't mark any of the statements as "preferred" rank, and make avid use of qualifiers like statement disputed by (P1310).

For some wars, there is more consensus in reliable sources on causes and their weight. For the American Civil War, the consensus among modern historians is that slavery was the primary cause and that states' rights was not a primary cause but a means to achieve the end of slavery at best, and a simple pretext at worst. See here ("'State's rights, or sovereignty, was always more a means than an end, an instrument to achieve a certain goal more than a principle.'"), here ("States' rights was entirely a matter of protection of slavery."), etc.

Upon thinking about it more, it seems this might be a better way to model things:

American Civil War (context here)

cause: slavery in the United States (preferred rank), states' rights in the United States (deprecated rank, statement disputed by (P1310) (list of modern historians and P387 (P387)))
immediate cause: secession of Southern United States, Battle of Fort Sumter, United States Presidential election of 1860
contributing factor: caning of Charles Sumner, Dred Scott v. Sandford, Bleeding Kansas, etc.
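
To show how ranks could interact with such statements, here is a minimal Python sketch. It assumes the usual Wikidata convention that preferred-rank statements win when present, normal-rank ones are used otherwise, and deprecated ones are never returned. The `Statement` class and `best_statements` function are hypothetical names, and the qualifier value is a placeholder rather than a real item.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    prop: str
    value: str
    rank: str = "normal"  # "preferred", "normal" or "deprecated"
    qualifiers: Dict[str, List[str]] = field(default_factory=dict)


def best_statements(statements, prop):
    """Preferred-rank statements if any exist, else normal-rank; never deprecated."""
    candidates = [s for s in statements if s.prop == prop and s.rank != "deprecated"]
    preferred = [s for s in candidates if s.rank == "preferred"]
    return preferred or candidates


civil_war = [
    Statement("cause", "slavery in the United States", rank="preferred"),
    Statement(
        "cause",
        "states' rights in the United States",
        rank="deprecated",
        # Placeholder value; the real qualifier would point at specific items.
        qualifiers={"statement disputed by": ["(placeholder: modern historians)"]},
    ),
    Statement("immediate cause", "secession of Southern United States"),
    Statement("immediate cause", "Battle of Fort Sumter"),
    Statement("immediate cause", "United States Presidential election of 1860"),
]

print([s.value for s in best_statements(civil_war, "cause")])
print([s.value for s in best_statements(civil_war, "immediate cause")])
```

The deprecated statement stays in the data with its qualifier, so the disagreement is still recorded even though it is not returned as a best value.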

We can refine such statements over time. Others might assign things differently, but I think the framework of a three-tiered approach to statements about causation lets us capture richer statements. Or maybe I'm just off base and a two-tiered approach would work better for modeling causation in conflicts where there is much disagreement among sources. I think different content groups could work out what works best for them.

Let's consider a clearer-cut, less divisive example where a three-tier model of causation shines.

Space Shuttle Challenger disaster (see here, especially here, also here for context):

cause: faulty design of O-rings
immediate cause: O-ring seal failure, structural failure of external tank
contributing factor: cold weather, failures in communication, mismanagement

There often will not be enough accessible source material to support such detailed modeling of causes. Or perhaps users just want to make simple statements about cause; maybe they just want to make a basic "cause" statement and be done with it. Causation models for a subject can grow over time. As they do, having an expressive framework that's still easily comprehensible would help. Emw (talk) 04:04, 10 September 2014 (UTC)

@Emw: No, I don't think this is a clear-cut example :) Just one more source: 1. I guess it is possible to find a source which describes all people involved as causes of this disaster... So the description of causes definitely requires some restrictions on the domain of discourse... not sure it is feasible with only one generic property. Danneks (talk) 12:44, 10 September 2014 (UTC)

@Emw: If we take a generic cause path, I think we should think twice about what we consider a cause, or else we will run into the philosophical questions :) A weighted graph makes sense only if we can give meaning to the weights, which seems incompatible with a generic concept. That's why a model defined at a more concrete application level might be better: the granularity and the set of causes are more easily identifiable. The concept of cause is ... not so easy. TomT0m ^(talk) 20:54, 9 September 2014 (UTC)

@Emw: The slavery to American Civil War issue brings up another aspect of this model that is maybe worth considering. To me the concept of slavery, Q8463, should not have the claim "caused" "American Civil War" hanging off of it. Concepts like Q8463 are too vague to be applied like that. The war was not caused by the abstract notion of human ownership. It was caused by a disagreement about how that concept (and others) should be regarded by the law and about how decisions about laws should be made. If there had been no disagreement between the states, there would have been no war. To reiterate my earlier comment, I think the three properties you propose are a good step and that they are a significantly more useful construct than the 'medical cause' that they would subsume. My question at this point is whether or not it is possible to define a Domain and/or a Range for these properties that would help yield useful data. For example, I would exclude abstract ideas from the Domain of causes and include observable events. I don't know if these kinds of upper ontology based constraints are realistic here or not. --Genewiki123 (talk) 17:12, 10 September 2014 (UTC)

Genewiki123, I agree that the statements "American Civil War cause slavery" (preferred) and "American Civil War cause states' rights" (deprecated/normal) omit information. But humans easily intuit and machines can be engineered to infer that in instances of conflict, when people talk about causes like "slavery" or "states' rights", "disagreement over slavery" and "disagreement over states' rights" are implied. Similarly for immediate causes like "United States Presidential election of 1860" and contributing factors like "Dred Scott v. Sandford" and "caning of Charles Sumner". Those entities themselves did not cause the war, disagreements over them did.

I'm wary of domain and range constraints for cause (alias: underlying cause), immediate cause and contributing factor. Excluding abstract ideas like "evolution" from the domain of cause would diminish the property. And while setting the range of cause to event or process or some such would indeed be intuitive, and probably more rigorous, I think it would also significantly raise the barrier to entry. Are states' rights and Dred Scott v. Sandford events? Probably not. But I don't want users to have to create boilerplate items like "disagreement over states' rights" and "disagreement over Dred Scott v. Sandford" to use properties like cause, immediate cause or contributing factor. Also, our upper ontology is messy and currently of limited use. I don't want to get into long discussions about the ontological nature of abstract ideas or events. I want to enable the Wikidata community to easily structure data about causes.

I also agree that some guidelines (at least) would help. It seems the biggest concern of TomT0m and Danneks is about granularity. For example, don't make statements like "continent split cause galaxy creation" or statements like "Challenger disaster cause (list of every person involved in Challenger)". Also, per Danneks's comment about the ongoing War in Donbass, for controversial events that do not yet have consensus among reliable sources about causes, do not set the preferred rank for any cause statement unless there is consensus for that particular statement. Emw (talk) 02:00, 11 September 2014 (UTC)

@Emw: OK, maybe we can write in this guideline something like "items connected by cause property should be homogeneous and of comparable influence" to avoid hilarious statements like "American Civil War cause Harriet Beecher Stowe <reference: Lincoln>". There definitely should be a way to delete such statements, even if they are referenced :) Danneks (talk) 13:13, 11 September 2014 (UTC) Or maybe I'm exaggerating the future difficulties and WD:COMMON is enough. Danneks (talk) 05:31, 12 September 2014 (UTC)

@Emw: "Excluding abstract ideas like 'evolution' from the domain of cause would diminish the property." I am not saying that I would not want a statement expressing that some facts are part of the evolution process; I just don't think that cause is philosophically the right property for that. This highlights the fact that we don't really have a good idea of what a cause is, which will not help in using this property. You provide a model, but no real applicable rules for using it. A question about abstract causes, for example: how do we distinguish abstract from concrete causes? For example, the animal/plant separation can be thought of as a special case of speciation, and speciation is one of the mechanisms involved in evolution. But aren't we fine if we say that evolution is a sequence of events which separates organisms into kinds? Is speciation really a cause of evolution, or evolution a cause of speciation? I think we're fine if we say that continent splitting can cause a lot of speciations. Aren't we fine if we say that DNA mutation is a process involved in evolution (or speciation)? Does that mean that a specific DNA mutation can be a cause of a specific speciation event? Or, in the case of speciation preceded by a continent split, does this mean that, as individuals will not mate across continents again, it is inevitable that they will diverge enough to separate into different species?

To answer only the quote: couldn't we create a property can be a cause of, which would be distinct from the cause property? Though a more direct cause of speciation would be population separation, a separation of continents causes a lot of separations of populations, which in turn can (and eventually will) lead to speciations. But I would tend to think that this is not a cause but a common pattern. A lot of specific speciation events follow this pattern, but there are other patterns. A good way to guide the use of cause would be to define patterns like that. We should be able to distinguish easily between causal patterns and causes of a specific event. TomT0m ^(talk) 10:58, 15 September 2014 (UTC)

Support this proposal. Filceolaire (talk) 18:42, 10 September 2014 (UTC)

@Emw: What about using follows (P155) / followed by (P156)? I don't see much difference between a causal series and a sequence series. In any case, I support the creation of properties for primary causes, secondary causes, supporting factors, and outcome/yield. --Micru (talk) 00:47, 11 September 2014 (UTC)

Micru, great question about follows/followed by. One subtle but important difference I see between those and the proposed properties cause, immediate cause and contributing factor is that the latter three address different scopes (underlying vs. immediate) and degrees of cause (underlying and immediate causes vs. contributing factor). They could also be considered subproperties of followed by, since all causes precede their effects, but not all things that precede an effect are causes.

Your comment also reveals that we would probably benefit from having inverse properties for each of these:

property: has cause (alias: underlying cause, effect of, caused by), inverse property: cause of (alias: underlying cause of, has effect)
property: has immediate cause, inverse property: immediate cause of
property: has contributing factor, inverse property: contributing factor of

The standard caveat would apply to those inverse properties: many accidents may have been immediately caused by explosion (Q179057), but we should obviously avoid putting thousands of immediate cause of statements in 'explosion'. A few inverse property statements, like immediate cause of statements in 'explosion', would likely be useful for exploration / property discovery by new users, though. Emw (talk) 02:56, 11 September 2014 (UTC)
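
The property/inverse pairs listed above, together with the caveat about generic items, can be sketched as follows. This is a hedged illustration in Python: the `INVERSE` table simply restates the three proposed pairs, while the `inverse_statements` function, its `max_per_subject` cap and the "accident A/B/C" subjects are hypothetical.

```python
# Proposed property labels mapped to their proposed inverses.
INVERSE = {
    "has cause": "cause of",
    "has immediate cause": "immediate cause of",
    "has contributing factor": "contributing factor of",
}


def inverse_statements(triples, max_per_subject=None):
    """Derive inverse triples, optionally capping how many land on one item."""
    grouped = {}
    for subject, prop, value in triples:
        if prop not in INVERSE:
            continue  # ignore properties without a known inverse
        grouped.setdefault(value, []).append((value, INVERSE[prop], subject))

    result = []
    for inverted in grouped.values():
        if max_per_subject is not None:
            inverted = inverted[:max_per_subject]
        result.extend(inverted)
    return result


triples = [
    ("accident A", "has immediate cause", "explosion"),  # placeholder subjects
    ("accident B", "has immediate cause", "explosion"),
    ("accident C", "has immediate cause", "explosion"),
    ("bone fracture", "has cause", "stress fracture"),
]

for triple in inverse_statements(triples, max_per_subject=2):
    print(triple)
```

The cap stands in for editorial judgment: a generic item such as 'explosion' keeps a few illustrative inverse statements instead of one per accident.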

@Emw: I support them all because they are needed for modelling systems (especially systems that emerge over time). I don't see a problem limiting these properties to notable causes. However, that is more a question of how accurate users want to be with their representation.--Micru (talk) 12:02, 11 September 2014 (UTC)

Big difference. Night follows the day, but day does not cause the night. Chapter 2 follows chapter 1, etc. follows != cause / caused by --Genewiki123 (talk) 16:15, 11 September 2014 (UTC)

immediate cause of = cause of

Merge? --Fractaler (talk) 07:05, 26 October 2016 (UTC)

No, see the discussion here. --The Erinaceous One 🦔 01:42, 19 September 2022 (UTC)
