Wikidata:WikiProject Events and Role Frames
The primary aims of WikiProject Events and Role Frames are
- to define a set of properties that consistently model eventualities (states, processes and events) and their participants (Pustejovsky, 2021), where eventualities are things that occur or happen, as simple as throwing a ball or mowing the lawn, or as complex as one country invading another country or a town being flattened by a tornado;
- to fill gaps in Wikidata regarding items for these states/processes/events/actions; and
- to encourage use of the proposed model and newly introduced items across Wikidata.
Pustejovsky, James (2021). The Role of Event-Based Representations and Reasoning in Language. In Caselli T, Hovy E, Palmer M, Vossen P, eds. Computational Analysis of Storylines: Making Sense of Events. Cambridge University Press.
Motivation
One of the known weaknesses of Wikidata is the spotty coverage of eventualities (processes, states, events) and their prototypical participant structures. Let’s look at the spotty coverage issue first.
Spotty State/Process/Event/Action Coverage
One of the most common verbs in most languages is “to bring”, e.g., “I brought flowers to my mother”, “J'ai apporté des fleurs à ma mère”, “Я принёс цветы маме”, “Ich habe meiner Mutter Blumen mitgebracht”. Until we added "bringing (Q124457329)" in February 2024, there was no such concept in Wikidata. We examined over 11,500 rolesets contained in PropBank (Q7250039) that describe English predicating expressions (mostly verbs) and identified over 7500 potentially missing Wikidata items. Each of the semantic Q item “gaps” needs further examination to determine if it warrants a new item and a corresponding lexeme, but the list gives us a starting point. We will also be carefully comparing the PropBank entries to the over 7500 English verbs already defined as Wikidata lexemes. Ideally we will be able to expand these lexemes with additional sense distinctions and predicate argument structures as well as filling in missing entries. We want to emphasize that although we start with English, the Q item gaps are semantic, not lexical, and we should use multiple languages, including Chinese, Hindi, Arabic, and Russian, to determine appropriate fillers. We also plan to rely heavily on the careful curation of Czech, German, English and Spanish common predicate argument structures to be found in SynSem to ensure cross-lingual compatibility of our item mappings. We welcome input from other languages as well.
State/Process/Event/Action Role Structures
All states/processes/events/actions have core semantic roles - "eating" has the "eater" and the "eaten", "throwing" has the "thrower", the "target" and the "projectile". These roles are not optional. Every act of "eating" has an "eater" and the "eaten" independently of how and in which language it is expressed. Most of the existing items for such classes do not mention these roles. For example, "throwing (Q12898216)", defined as “launching of a ballistic projectile by hand” does not have any statements that indicate the existence of the thrower, the target, or the projectile, let alone the specifications of the kinds of entities these attributes are likely to be.
Some Wikidata items for event/action concepts include statements for some of the semantic roles. For example, "eating (Q213449)" uses the "practiced by (P3095)" property whose object is "eater (Q20984678)". Although "practiced by (P3095)" is defined as “type of agents that study this subject or work in this field”, it is often used to indicate an agent of an action. We can expand the definition of "practiced by (P3095)" to encompass the semantic role use as well, or alternatively we could replace it with a new "has semantic role" property. In either event, we would want to qualify it with the type of role, in this case an Agent. In "eating (Q213449)" there is also a property "uses (P2283)" that points to "food (Q2095)" to indicate what is "eaten". This property also has many other uses. In the absence of a Wikidata property that can be used exclusively to indicate the value of a semantic role, using existing properties would require adding qualifiers such as "object of statement has role (P3831)" to indicate that the value is a semantic role.
Caveats
This project does not address the problem of ontological consistency of Wikidata items. But, as we examine Wikidata events, we might also fill in the gaps in some of the “subclass of” inheritance relations. For example, departure (Q21171241) is not currently a subclass of going (Q19279529). The item execution (Q3966286) defined as “homicide as capital punishment” does not seem to be connected to capital punishment (Q8454).
[(arw) There are other semantic issues beyond subclass of (P279) - such as skos:altLabel ... for example, bringing (Q124457329) is a subclass of moving (Q115095261) which has an altLabel of "renaming". This is simply highlighted as an additional item that may be corrected. (map) Mahir has addressed this as he describes in the Discussion page.]
Proposal
The proposal outlines a step-by-step procedure for expanding Wikidata state/process/event/action coverage. It has four steps:
1. Adding missing Wikidata state/process/event/action classes as Q items, as well as English lexeme L items, and lexeme sense S items. We will use item for this sense (P5137) or predicate for (P9970) to tie the senses to the concepts.
2. Adding missing state/process/event/action roles to both concepts and lexemes. For English lexeme senses we can use the already defined has semantic argument (P9971). For concepts denoted by Q items our preference is a new "has semantic role" property, as explained below.
3. Specifying selectional preferences for state/process/event/action roles
4. Adding role specifications to the state/process/event/action instances
Step 1. Adding missing Wikidata state/process/event/action classes
We propose to go systematically over the PropBank RoleSets (see below for examples of PropBank Frame Files). For each RoleSet, we look for an existing Q item class that represents the same concept. For example, when we examine PropBank's "see.01" defined as "to perceive an object with one's eyes", we find two relevant Q items: "visual perception (Q162668)" and "seeing (Q25374341)". The former is described as "ability to interpret the surrounding environment using light in the visible spectrum" and the latter as "the event of perceiving something using eyesight" which looks like a better candidate. The other 5 PropBank senses of "see", entered as L items, would map to quite different Q items, if available. The mappings can be specified by creating a new identifier, "PropBank ID".
If no such item is found, we create one. For example, we could not find a Q item equivalent to PropBank's "bring.01" defined as "carry along with, move literally or metaphorically". We created a new Q item "bringing (Q124457329)" described as "transporting something toward somebody/somewhere". We also added translation (P5972) for Russian, French, Spanish, Chinese and Punjabi and made it a "subclass of" "moving (Q115095261)".
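The lookup-or-create pass described in this step can be sketched in Python. This is only an illustration under stated assumptions, not project tooling: `RoleSet`, `find_gaps`, and the hand-built `known` mapping are hypothetical names introduced here, and the mapping mirrors the situation described in the text before "bringing (Q124457329)" was created.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RoleSet:
    """A PropBank roleset reduced to the two fields this sketch needs."""
    roleset_id: str   # e.g. "bring.01"
    definition: str   # the PropBank gloss


def find_gaps(rolesets: List[RoleSet],
              mapping: Dict[str, str]) -> Tuple[Dict[str, str], List[RoleSet]]:
    """Split rolesets into those already mapped to a Q item and those that are gaps."""
    mapped: Dict[str, str] = {}
    gaps: List[RoleSet] = []
    for roleset in rolesets:
        qid = mapping.get(roleset.roleset_id)
        if qid is None:
            gaps.append(roleset)          # candidate for a new Q item
        else:
            mapped[roleset.roleset_id] = qid
    return mapped, gaps


rolesets = [
    RoleSet("see.01", "to perceive an object with one's eyes"),
    RoleSet("bring.01", "carry along with, move literally or metaphorically"),
]
# Hypothetical mapping table: see.01 -> seeing (Q25374341); bring.01 not yet mapped.
known = {"see.01": "Q25374341"}

mapped, gaps = find_gaps(rolesets, known)
print(mapped)
print([g.roleset_id for g in gaps])
```

Each entry in `gaps` still needs the manual examination described above before a new item is created; the sketch only produces the candidate list.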
Step 2. Adding missing event/action roles
When we find or create an appropriate state/process/event/action Q item, we will go over the roles of the RoleSet. In the "eat.01" example, there are two roles: the "consumer, eater" and the "meal". For each role, we look for a Q item statement that describes the role.
eating (Q213449) practiced by (P3095) eater (Q20984678)
describes the "consumer, eater" role. RoleSet "eat.01" indicates that this role is a "PAG" - a Prototypical Agent or Actor. Since the "practiced by (P3095)" property may have uses other than designating a semantic frame role, we add a qualifier:
eating (Q213449) practiced by (P3095) eater (Q20984678), with an object of statement has role (P3831) qualifier
We will also add an English Lexeme for eat, as Lxxxx-eat with Sense Lxxxx-eat-S1 (defined using the property, ontolex:sense). Then, Lxxxx-eat-S1 can use has semantic argument (P9971) to reference eater (Q20984678), qualified with object of statement has role (P3831) actor (Q23894381).
PropBank's "meal" role of "eat.01" is a PPT (Prototypical Patient). The statement that best describes it is:
eating (Q213449) uses (P2283) food (Q2095)
We can add a qualifier to this statement specifying the role type:
eating (Q213449) uses (P2283) food (Q2095), with an object of statement has role (P3831) qualifier
The Lexeme can be treated similarly.
We provide a mapping between PropBank's prototypical roles such as "PAG" and "PPT" and the corresponding Q items in the table below in the Semantic Roles section.
If we cannot find a suitable qualifier value for object of statement has role (P3831), we can use "semantic role (Q117747915)" as a "back off". Ideally the Q items in our table would all be declared as subclasses of "semantic role (Q117747915)".
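The back-off rule can be illustrated with a short Python sketch. The dictionary copies a few rows from the Semantic Roles table on this page; the helper name `role_qualifier` and the example role "Beneficiary" (which is not in the table) are assumptions made only for illustration.

```python
from typing import Tuple

# Back-off item named in the text: semantic role (Q117747915).
SEMANTIC_ROLE_BACKOFF = "Q117747915"

# A few rows copied from the Semantic Roles table (UMR role -> Wikidata item).
UMR_ROLE_ITEMS = {
    "Actor": "Q23894381",
    "Force": "Q126009669",
    "Causer": "Q392648",
    "Undergoer": "Q111335542",
    "Patient": "Q170212",
    "Theme": "Q118826633",
}


def role_qualifier(umr_role: str) -> Tuple[str, str]:
    """Return the (qualifier property, value) pair for a role.

    Hypothetical helper: uses object of statement has role (P3831) and
    falls back to semantic role (Q117747915) when the role has no item.
    """
    return ("P3831", UMR_ROLE_ITEMS.get(umr_role, SEMANTIC_ROLE_BACKOFF))


print(role_qualifier("Actor"))        # role with a dedicated item
print(role_qualifier("Beneficiary"))  # role without one: back-off applies
```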
When an event/action class does not have a statement describing a core semantic role, we look for an existing Q item that most closely describes that role. For example, "creation (Q11398090)" (process during which something comes into being and gains its characteristics) corresponds to PropBank's "create.01". It has a statement for the "creator" role but no statements for "Arg1-PPT thing created". The item that best describes the object of creation is "artificial object (Q3619132)". To complete the frame, we add the following statement:
creation (Q11398090) has characteristic (P1552) artificial object (Q3619132)
We used the very generic "has characteristic (P1552)" property because we could not find a more specific existing one. The qualifier provides the interpretation.
Another example is "offensive (Q2001676)" which does not have statements describing the attacker and the defender. The item "attacker (Q31924059)" seems appropriate for the attacker role and "defender (Q111729140)" for the defender role:
offensive (Q2001676) has characteristic (P1552) attacker (Q31924059)
offensive (Q2001676) has characteristic (P1552) defender (Q111729140)
The agent (Q392648) and theme (Q118826633) would more appropriately be defined at a higher level, perhaps for the Q item aggression (Q191797), and inherited by offensive (Q2001676).
When no role Q item is found, we need to create one.
Step 3. Specifying selectional preferences for event/action roles
Each role in an event/action frame typically describes the classes of entities that would normally be expected to play that role in that frame's instances. For example, we normally expect that the "eater" in an "eating" instance would be an organism. Because these expectations could be violated, we call them selectional preferences, not restrictions. Unfortunately, Wikidata does not have an existing property to specify selectional preferences, and we have to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object of statement has role (P3831)" qualifier:
eater (Q20984678) has characteristic (P1552) organism (Q7239)
It may be that the Q Items that instantiate the roles/arguments may provide sufficient information. For instance, eater (Q20984678) is defined as "human or other live being who eats something" and uses has characteristic (P1552) to reference organism (Q7239). Exactly how selectional preferences should be included requires more discussion.
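The difference between a preference and a restriction can be made concrete with a small Python sketch. Everything in it is hypothetical: the toy `TOY_PARENTS` hierarchy uses plain labels rather than real Wikidata subclass data, and `check_preference` is a name introduced here. The point is only that a filler outside the preferred class is flagged as unusual instead of being rejected.

```python
from typing import Dict, List

# Toy "subclass of" hierarchy using labels only (an assumption, not Wikidata data).
TOY_PARENTS: Dict[str, List[str]] = {
    "human": ["organism"],
    "cat": ["organism"],
    "robot": ["machine"],
}


def is_subclass_of(cls: str, ancestor: str, parents: Dict[str, List[str]]) -> bool:
    """Walk the parent links upward and report whether `ancestor` is reached."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def check_preference(filler_class: str, preferred_class: str,
                     parents: Dict[str, List[str]]) -> str:
    """Return 'expected' or 'unusual'; a violated preference is never an error."""
    if is_subclass_of(filler_class, preferred_class, parents):
        return "expected"
    return "unusual"


# The "eater" role prefers organisms, but a robot eater is still allowed.
print(check_preference("human", "organism", TOY_PARENTS))
print(check_preference("robot", "organism", TOY_PARENTS))
```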
Step 4. Adding role specifications to the event/action instances
When a new event/action instance is created, ideally, the creator should consult the class of the instance and make sure that the semantic roles of the class are instantiated. For example, suppose we want to enter the event of the creation of Mickey Mouse by Walt Disney on 18 November 1928. Let's call the ID for this event Q_mm_creation. Wikidata uses over 300 properties to indicate event/action instance roles. We can pick "creator (P170)" for the creator role, "has effect (P1542)" for the created artifact role and "point in time (P585)" for time. We add the following 3 statements:
Q_mm_creation creator (P170) Walt Disney (Q8704)
Q_mm_creation has effect (P1542) Mickey Mouse (Q11934)
Q_mm_creation point in time (P585) 18 November 1928
We are using the "object of statement has role (P3831)" qualifier to specify the role played by the object. In the case of event/action classes, we used high-level semantic role items such as "agent (Q392648)" or "theme (Q118826633)" as the objects of "object of statement has role (P3831)". In the case of event/action instances we use the actual role items such as "creator (Q2500638)" or "attacker (Q31924059)".
Since we are planning to add semantic roles to companion lexemes as well, an alternative approach would be to follow the path from the Q item to the lexeme, perhaps using item for this sense (P5137) or predicate for (P9970), and retrieve the semantic roles for event instantiations from the lexeme. The more languages we have represented the more desirable this would be. That also might require one or more additional properties. In the short term, associating the roles with the Q items would seem to make them more generally accessible.
Also note that we do not propose to attach the "default" roles such as "location", "start", "end", "point in time" to the event/action classes since all events/actions take place in defined times and places. The instances, though, should specify them (if known). See Semantic Roles below for more details.
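The consultation step described above (check the class frame, then check the default roles on the instance) could look roughly like the following Python sketch. The frame table, the role labels, and the function `missing_roles` are hypothetical; the role labels for "creation" follow PropBank's create.01, and the instance data is the Mickey Mouse example from the text.

```python
from typing import Dict, List

# Hypothetical frame table: core roles attached to the event class.
CLASS_CORE_ROLES: Dict[str, List[str]] = {
    "creation": ["creator", "thing created"],
}
# Default roles are not attached to classes, but instances should specify them if known.
DEFAULT_INSTANCE_ROLES = ["point in time", "location"]


def missing_roles(event_class: str, filled: Dict[str, str]) -> List[str]:
    """List the roles an instance has not filled, core roles first."""
    expected = CLASS_CORE_ROLES.get(event_class, []) + DEFAULT_INSTANCE_ROLES
    return [role for role in expected if role not in filled]


# Q_mm_creation from the text: creator, created artifact and time are given.
mickey_creation = {
    "creator": "Q8704",          # Walt Disney
    "thing created": "Q11934",   # Mickey Mouse
    "point in time": "1928-11-18",
}

print(missing_roles("creation", mickey_creation))  # ['location']
```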
Wikidata contains a very large number of event/action instances. For example, "Petsamo–Kirkenes Offensive (Q705222)" is one of many instances of "offensive (Q2001676)". Currently, it has the following statements for the attacker and the defender roles:
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Soviet Union (Q15180)
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Nazi Germany (Q7318)
These statements do not specify who was the attacker and who was the defender. Ideally, we should add the "object of statement has role (P3831)" qualifier to indicate the role:
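Petsamo–Kirkenes Offensive (Q705222) participant (P710) Soviet Union (Q15180), with qualifier object of statement has role (P3831) attacker (Q31924059)
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Nazi Germany (Q7318), with qualifier object of statement has role (P3831) defender (Q111729140)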
Alternatively, we could replace the current use of participant (P710) and its qualifier in the above statement with target (P533), thus avoiding the object of statement has role (P3831) qualifier.
Obviously, we cannot inspect all existing event/action instances and "fix" them but, at least, we now have a method for doing it.
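As one possible way to apply the method in bulk, the qualified statements could be generated as tab-separated lines. The sketch below assumes the QuickStatements version 1 layout (item, property, value, then qualifier property and qualifier value); that layout, and how the tool treats statements that already exist, should be checked against the QuickStatements documentation before any batch is run. The function name `role_statements` is hypothetical, and the role assignment assumes the Soviet Union was the attacking side in this offensive.

```python
from typing import List, Tuple


def role_statements(event: str, prop: str,
                    fillers: List[Tuple[str, str]]) -> List[str]:
    """Build one tab-separated line per participant.

    Each line is: item, property, value, P3831, role item, where
    P3831 is "object of statement has role".
    """
    return ["\t".join([event, prop, value, "P3831", role])
            for value, role in fillers]


lines = role_statements(
    "Q705222",                      # Petsamo–Kirkenes Offensive
    "P710",                         # participant
    [
        ("Q15180", "Q31924059"),    # Soviet Union as attacker
        ("Q7318", "Q111729140"),    # Nazi Germany as defender
    ],
)
print("\n".join(lines))
```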
--Anatole Gershman (talk) 21:33, 12 June 2024 (UTC)
Semantic Roles
Semantic roles, originally termed cases, are often also referred to as predicate arguments, slots, thematic relations (VerbNet, LIRICS), frame elements (FrameNet), etc. Thematic relations traditionally only refer to the core arguments of the predicating element, and do not include more adjunct-like information found in temporal and locative modifiers. The latter can be applied quite generally and are considered more peripheral. Defining adjuncts precisely has remained a persistent challenge for the linguistics community, making it difficult to distinguish consistently between core and peripheral arguments. The term "semantic roles" can encompass both. Time and place are critical elements of useful descriptions of event instances.
Our stated goal is a mapping between Wikidata items and PropBank semantic roles. The original aim of PropBank was to add semantic role information to the syntactic structures in the Penn Treebank. Since there is no one-to-one mapping between syntactic constituents and semantic roles, annotators were asked to examine every clause in the Penn Treebank featuring a specific lexical item, such as "throw" as a predicating element, and assign the most suitable semantic role label to each one. A PropBank Frame File, listing the different coarse-grained senses of the lexical item and appropriate argument structures for each one, was referred to during this process. For example, the frame for "throw", as listed below, indicates an ARG0-PAG, a prototypical agent (Dowty, 1991), an ARG1-PPT, a prototypical patient or theme, and an ARG2-GOL (the goal or destination of the entity being thrown). There can be up to six numbered core arguments, and a dozen additional peripheral ARGM's, marked individually with function tags such as manner (MNR), locative (LOC), direction (DIR), comitative/accompanier (COM), etc. There are also several more syntactic function tags to mark modals (MOD), negations (NEG), discourse markers (DIS), etc. The full list with their definitions can be found in the PropBank Guidelines, available at the PropBank GitHub site linked below. The example frame files referred to above are provided below in the PropBank Frame File Examples subsection. After the original 50K sentence Penn Treebank was PropBanked, funding was provided to expand the number of genres and now almost 2M tokens of English have been PropBanked, as well as several other languages including Chinese, Arabic, Korean, Hindi, Urdu, German, French, Russian, Spanish, etc. English PropBank has also been mapped to VerbNet and FrameNet as part of SemLink: Mapping together PropBank/VerbNet/FrameNet, and one can browse a combined representation of those three resources at the Unified Verb Index.
PropBank's coverage has also been extended to provide support for Abstract Meaning Representation (AMR) annotation (which uses PropBank Frame Files), unifying PropBank rolesets across different parts of speech.
Below is a table listing our recommended semantic role labels for Wikidata that are mapped to PropBank labels and are adopted from the Uniform Meaning Representation (UMR) project. They have been carefully reviewed to ensure that they accommodate cross-linguistic typological variation (Bonial et al. 2011 A Hierarchical Unification of LIRICS and VerbNet Semantic Roles (Q118174236), Van Gysel et al, 2021 Designing a Uniform Meaning Representation for Natural Language Processing (Q115519832)). For the most part we are relying on existing Wikidata has semantic argument (P9971) definitions to realize our PropBank semantic roles. At this point they also include Start, Temporal and Place.
UMR Semantic Role | Wikidata item | PropBank Function Tag | Description | Example |
---|---|---|---|---|
Actor | actor (Q23894381) | PAG | An animate entity who performs an action | "The chef prepared the meal." |
Force | force (Q126009669) | PAG | An event or inanimate entity that acts upon an undergoer in a way that is usually spontaneous, forceful, and direct | "The wind blew the door open." |
Causer | agent (Q392648) | PAG, CAU | An animate entity who acts on another actor to cause them to engage in the action | "My grandmother made me eat liver." |
Undergoer | undergoer (Q111335542) | PPT | The entity that undergoes the action when it is not clearly a Patient or Theme. | "The kitten licked her fingers." |
Patient | patient (Q170212) | PPT | Subclass of undergoer. The patient is an undergoer in an event that is usually structurally changed, for instance by experiencing a change of state or condition; is often acted upon by an agent; is causally involved or directly affected by other participants; and exists independently of the event. | "The chef prepared the meal." "The roommates painted the walls." |
Theme | theme (Q118826633) | PPT | Subclass of undergoer. The theme is an undergoer that is central to an event or state that does not have control over the way the event occurs, is not structurally changed by the event, and/or is characterized as being in a certain position or condition throughout the state. Often in motion. | "She packed her suitcase for the trip." |
Recipient | recipient (Q20820253), addressee (Q19720921) | GOL | The entity that receives something. | "The librarian handed me a book." |
Experiencer | experiencer (Q1242505) | PPT, PAG | The entity that directly experiences a sensation or emotion | "Many tourists saw the accident." "He felt a sense of relief." |
Stimulus | stimulus (Q109566760) | PAG, CAU | The entity that causes an emotional or mental state | "The loud noise startled the cat." |
Instrument | instrument (Q6535309) | MNR | An inanimate entity used to perform an action or event | "The rock broke the window." "She cut the paper with scissors." |
Instrument | instrument (Q6535309) | MNR | An inanimate entity used to perform an action or event | "The rock broke the window." "She cut the paper with scissors.” |
Start | origin (Q3885844) | DIR | The entity from which an action originates or the starting point of an action or event | "I flew from Heathrow." "The bidding opened at $5." |
Goal | goal (Q109405570) | GOL | Where an action is directed. In motion verbs, the final destination | "He ran to the store." |
Companion | companion (Q106645134) | COM | An animate entity that accompanies another entity or entities and is presented as an oblique argument; who an action was done with | "I went to the movies with friends." |
Material/Source | material (Q214609), source (Q31464082) | DIR | The location, entity, or material from which an action or event originates | "Water flowed from the faucet." "I milked the cow." "The shirt is made of cotton." |
Place | location (Q109377685) | LOC | The place where an event or action occurs | "The party will be at the park." |
Affectee | affectee (Q125995757) | PPT | Entity positively or negatively affected by the circumstances of an event or action without being the primary undergoer | "The movie made her cry." |
Cause | cause (Q2574811) | CAU | Why an event or inanimate entity brings about an action or event | "The pool was closed because of lightning." |
Temporal | duration (Q2199864), time (Q12322185), Frequency (Q125995799) | TMP | When an action took place. This includes all temporal referents, such as dates, duration, frequency, order, repetition, etc. | "He went to the store yesterday." "I've been reading email for three hours." "They cleaned the kitchen first." "She lost her keys again." |
Extent | extent (Q125953445) | EXT | The degree or amount to which something happens | "He ran five miles." "The price increased by 5%." |
Manner | means (Q12774177) | MNR | The way in which something is performed | "He worked quickly and mechanically." |
Reason | cause (Q2574811) | PRP | The reason, explanation or justification for an event or action | "I went to the store because we were out of milk." "He left early because he had another meeting." |
Purpose | cause (Q2574811) | PRP | The purpose or intended objective of an event or action | "I went to the store to buy milk." "He left early to get to another meeting." |
Attribute | attribute (Q109674924) | PRD | The quality or characteristic ascribed to an entity | "The house is big." |
Result | result (Q2995644) | PRD | The entity described by a secondary predicate | "She kicked the door shut." "You scared me to death." "He painted the door red." |
Direction | relative direction (Q2151613) | DIR | Motion along a specified (literal or figurative) path | "I walked down the street." "I turned left." |
--Anatole Gershman (talk) 22:04, 21 June 2024 (UTC)
Examples of PropBank Frames
Here are the complete PropBank frames referenced above.
Bring
bring.01 - carry along with, move literally or metaphorically
bring (v.)
Role Label | Role Description |
---|---|
ARG0-PAG | bringer |
ARG1-PPT | thing brought |
ARG2-GOL | benefactive or destination; brought-for, brought-to |
ARG3-PRD | attribute, state after bringing, secondary action |
ARG4-DIR | ablative, brought-from |
active, benefactive:
She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT]
Eat
eat.01 - consume, consuming
Aliases: eat (v.) eating (n.)
Role Label | Role Description |
---|---|
ARG0-PAG | consumer, eater |
ARG1-PPT | meal |
Arg0, 1: His [ARG0-PAG] eating [REL] carrots [ARG1-PPT] constantly [ARGM-TMP] has tinted his skin a suspiciously bright orange hue.
Throw
throw.01 - throw, sending through the air, manually, projection of an object through space
Aliases: throw (v.) throwing (n.) throw (n.)
Role Label | Role Description |
---|---|
ARG0-PAG | thrower |
ARG1-PPT | thing thrown |
ARG2-GOL | thrown at, to, over, etc. |
See
see.01 - view
Aliases: see (v.) seeing (n.) sight (v.) sight (n.)
Role Label | Role Description |
---|---|
ARG0-PAG | viewer |
ARG1-PPT | thing viewed |
ARG2-GOL | attribute of arg1, further description |
sight-n: both args:
The climax is his visit to the dead man 's house and his [ARG0-PAG] sight [REL] of the body [ARG1-PPT].
Create
create.01 - create
Aliases: create (v.) creation (n.)
Role Label | Role Description |
---|---|
ARG0-PAG | creator |
ARG1-PPT | thing created |
ARG2-VSP | materials used |
ARG3-GOL | benefactive |
ARG4-PRD | attribute of ARG1 |
Creation [REL] of a new , realistic U.S. policy [ARG1-PPT]
Attack
attack.01 - to make an attack, criticize strongly
Aliases: attacking (n.) attack (n.) attack (v.)
Role Label | Role Description |
---|---|
ARG0-PAG | attacker |
ARG1-PPT | entity attacked |
ARG2-PRD | attribute |
Metaphorical attack, illness:
The new medication has reduced Sally 's [ARG1-PPT] asthma [ARG0-PAG] attacks [REL] .
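The bracketed annotations used in the frame examples above are regular enough to be read mechanically. The Python sketch below is one way to do so; the function names `parse_annotation` and `split_label` are hypothetical, and the pattern only handles the simple "text [LABEL]" layout shown on this page, not the full PropBank file format.

```python
import re
from typing import List, Tuple

# A labelled span is some text followed by a bracketed label such as [ARG0-PAG] or [REL].
LABELLED_SPAN = re.compile(r"\s*(.+?)\s*\[([A-Z0-9-]+)\]")


def parse_annotation(example: str) -> List[Tuple[str, str]]:
    """Return (text, label) pairs in the order they appear in the example."""
    return LABELLED_SPAN.findall(example)


def split_label(label: str) -> Tuple[str, str]:
    """Split 'ARG0-PAG' into ('ARG0', 'PAG'); a bare 'REL' gives ('REL', '')."""
    head, _, function_tag = label.partition("-")
    return head, function_tag


example = "She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT]"
for text, label in parse_annotation(example):
    print(text, split_label(label))
```

Text that is not followed by a label is attached to the next labelled span, so longer sentences would need extra handling.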
Potential New Properties
a) has semantic role
As we mentioned above, few of the existing event/action classes have fully specified semantic roles. For example, "creation (Q11398090)" does not have a statement for the "created". We used "has characteristic (P1552)" - a very generic property, with an "object of statement has role (P3831)" qualifier to indicate the role function:
creation (Q11398090) has characteristic (P1552) artificial object (Q3619132)
It seems desirable to have a more specific property, "P_has semantic_role" for this purpose. It would be a sub-property of "has characteristic (P1552)" and used when no existing property such as "practiced by (P3095)" could be found to indicate a semantic role. We would still use the qualifier to indicate the role function.
The current version of this property proposal can be found here: [[1]]
b) has selectional preference
We also mentioned that Wikidata does not have an existing property to specify selectional preferences and that we had to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object of statement has role (P3831)" qualifier:
eater (Q20984678) has characteristic (P1552) organism (Q7239)
Here again, it seems desirable to have a more specific property "P_has_selectional_preference". With this dedicated property, there will be no need for the "object of statement has role (P3831)" qualifier.
--Anatole Gershman (talk) 15:46, 24 June 2024 (UTC)
Statistics
(to be filled in later)
Queries
(to be filled in later)
Current tasks
(to be filled in later)
Participants
The participants listed below can be notified using the following template in discussions: {{Ping project|Events and Role Frames}}
Related links
- PropBank
- FrameNet
- VerbNet
- UVI
- The DARPA Wikidata overlay: Wikidata as an ontology for natural language processing (Q119958789)
- PDT-Vallex
- ENG-Vallex
- NomVallex
- AnCorpaVerb
- ADESSE: Base de datos de Verbos, Alternancias de Diátesis y Esquemas Sintáctico-Semánticos del Español
- Pattern Dictionary of English Verbs
- Mapping Czech Verbal Valency to PropBank Argument Labels
- A Visual Dictionary of Tibetan Verb Valency
- ValPal