Week 8 Semantics Lab
a) Use cosine similarity (the normalized dot product) and Euclidean distance to determine
which of the words ‘cherry’ and ‘digital’ has a vector closer to that of ‘information’. Discuss
the strengths and weaknesses of this way of calculating word similarity.
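A minimal sketch of both measures, assuming small toy count vectors — the context dimensions and counts below are illustrative placeholders, not the lab's actual data:

```python
import math

def cosine(u, v):
    """Normalized dot product of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Placeholder co-occurrence count vectors -- substitute the counts given
# in the lab; the three context dimensions here are assumed.
cherry      = [442, 8, 2]
digital     = [5, 1683, 1670]
information = [5, 3982, 3325]

print("cosine(cherry, information):   ", cosine(cherry, information))
print("cosine(digital, information):  ", cosine(digital, information))
print("euclidean(cherry, information):", euclidean(cherry, information))
print("euclidean(digital, information):", euclidean(digital, information))
```

Note that cosine ignores vector length (only direction matters), while Euclidean distance is dominated by large raw counts — a point worth raising in the discussion.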
b) One of the most important concepts in NLP is Pointwise Mutual Information (PMI). It is
a measure of how often two events x and y occur together, compared with what we would
expect if they occurred independently. The pointwise mutual information between a target
word w and a context word c is defined as:
PMI(w, c) = log2 [ P(w, c) / (P(w) P(c)) ]
Use the PMI values to recalculate cosine similarity and Euclidean distance.
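One way to set this up is to compute a PMI value for every (word, context) cell of the co-occurrence table and then reuse the similarity measures on the PMI vectors. A sketch, with a hypothetical toy count table in place of the lab's data:

```python
import math

def pmi_table(counts):
    """counts: dict mapping (word, context) -> co-occurrence count.
    Returns a dict mapping (word, context) -> PMI(w, c),
    where PMI(w, c) = log2( P(w, c) / (P(w) * P(c)) )."""
    total = sum(counts.values())
    word_totals, ctx_totals = {}, {}
    for (w, c), n in counts.items():
        word_totals[w] = word_totals.get(w, 0) + n
        ctx_totals[c] = ctx_totals.get(c, 0) + n
    pmi = {}
    for (w, c), n in counts.items():
        p_wc = n / total
        p_w = word_totals[w] / total
        p_c = ctx_totals[c] / total
        pmi[(w, c)] = math.log2(p_wc / (p_w * p_c))
    return pmi

# Toy counts (hypothetical) -- replace with the lab's co-occurrence table,
# then feed the resulting PMI vectors to the cosine/Euclidean functions.
counts = {
    ("cherry", "pie"): 442, ("cherry", "computer"): 8,
    ("digital", "pie"): 5, ("digital", "computer"): 1683,
}
print(pmi_table(counts))
```

PMI is positive when w and c co-occur more often than chance, negative when less often, and zero when they are independent; zero counts must be handled separately, since log2(0) is undefined.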
I absolutely love this place. The 360 degree glass windows with the Yerba buena
garden view transports you to what feels like a different zen zone within the city.
[...]
a) Using WordNet, determine how many senses each of the open-class words in each
sentence has. How many distinct combinations of senses are there for each sentence?
How does this number vary with sentence length? What pre-processing steps are
necessary for this kind of sense assignment?
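The combination count is just the product of the per-word sense counts, so it grows multiplicatively with every open-class word added. A sketch, using placeholder sense counts — in practice these would come from WordNet, e.g. `len(nltk.corpus.wordnet.synsets(word))`:

```python
from math import prod

# Placeholder sense counts per open-class word (illustrative values only;
# look up the real counts with NLTK's WordNet interface).
sense_counts = {"love": 10, "place": 16, "degree": 8, "glass": 9}

def sense_combinations(open_class_words, counts):
    """Number of distinct sense assignments for a sentence: the product of
    each open-class word's sense count. Because each word multiplies the
    total, the count rises roughly exponentially with sentence length."""
    return prod(counts[w] for w in open_class_words)

print(sense_combinations(["love", "place"], sense_counts))
```

Running this on each sentence of the corpus makes the length effect in the question directly observable.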
b) Using WordNet, tag each open-class word in your corpus with its correct sense. Was
choosing the correct sense always a straightforward task? Report on any difficulties you
encountered.
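A classic automatic baseline for comparison with your manual choices is the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the sentence context. The two-sense gloss inventory below is a hypothetical toy example; with NLTK one would use the real glosses via `synset.definition()`:

```python
def simplified_lesk(context, glosses):
    """Return the sense whose gloss overlaps most with the context words.
    glosses: dict mapping sense name -> gloss string (toy inventory here)."""
    ctx = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses.items():
        overlap = len(ctx & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical miniature sense inventory for "game"
game_glosses = {
    "game.n.01": "a contest with rules played for amusement",
    "game.n.04": "animal hunted for food or sport",
}
print(simplified_lesk("this crap game is played with dice", game_glosses))
```

Its failure cases (short glosses, little overlap, ties) tend to coincide with exactly the words that are hard to tag by hand, which is useful material for the report.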
Word    NN   JJ   DT   PRP   VBZ
This     5    0    5     5     0
crap     7    8    0     0     0
game    10    5    0     0     0
is       0    0    0     0    15
over     3    3    3     3     3
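Per-POS counts like those in the table can be gathered by querying WordNet with a POS restriction, which first requires mapping Penn Treebank tags onto WordNet's four categories. A sketch of that standard mapping (the lookup call in the comment assumes NLTK):

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter
    ('n', 'v', 'a', 'r'), or None for tags WordNet does not cover."""
    if tag.startswith("NN"):
        return "n"   # nouns
    if tag.startswith("VB"):
        return "v"   # verbs
    if tag.startswith("JJ"):
        return "a"   # adjectives
    if tag.startswith("RB"):
        return "r"   # adverbs
    return None      # closed-class tags (DT, PRP, ...) have no WordNet entries

# With NLTK, the per-POS sense count would then be, e.g.:
#   len(nltk.corpus.wordnet.synsets(word, pos=penn_to_wordnet(tag)))
print(penn_to_wordnet("NNS"), penn_to_wordnet("VBZ"), penn_to_wordnet("DT"))
```

Note that closed-class tags map to None, so any sense counts reported under DT or PRP must come from POS-ambiguous surface forms rather than from WordNet entries for those tags.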