MLRD 1
Simone Teufel
This course: Machine Learning and Real-world Data (MLRD)
Three Topics:
Classification according to sentiment (7 sessions)
Sequence analysis of proteins (4 sessions)
Network analysis of social networks (5 sessions)
Plenty of data:
thousands of movie reviews
hundreds of amino acid sequences
thousands of users and links between them
Computer Science as an empirical subject
... He’s incredible in fights. ... Also his relationship with Irons,
who plays Alfred, is just wonderful in general. Irons was
exceptional in the role.
A bad review
incredible positive
wonderful positive
exceptional positive
Sentiment lexicon words in the bad review
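The lookup shown on this slide can be sketched in a few lines of Python. This is only an illustration, not the course's reference code: the lexicon here is a toy dictionary holding just the three entries listed above, and the names LEXICON, tokenize and lexicon_hits are hypothetical. The real sentiment lexicon and its file format are not shown on the slides.

```python
import re

# Toy lexicon: only the three entries shown on the slide (assumption:
# the real lexicon is much larger and stored in a file).
LEXICON = {
    "incredible": "positive",
    "wonderful": "positive",
    "exceptional": "positive",
}

# The review excerpt from the slide, elisions kept as they appear.
REVIEW = (
    "... He’s incredible in fights. ... Also his relationship with Irons, "
    "who plays Alfred, is just wonderful in general. Irons was "
    "exceptional in the role."
)


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def lexicon_hits(text, lexicon):
    """Return (word, polarity) pairs for every token found in the lexicon."""
    return [(tok, lexicon[tok]) for tok in tokenize(text) if tok in lexicon]


hits = lexicon_hits(REVIEW, LEXICON)
for word, polarity in hits:
    print(word, polarity)
```

Every hit in this excerpt is positive even though the slide labels the review as bad, which is why the lexicon approach needs to be tested against the data.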
Task 1:
explore the review data (1800 documents)
judge the sentiment of 4 reviews yourself
explore the sentiment lexicon
guess 10 sentiment-indicating words
write a program that tests the sentiment lexicon approach
write a program that uses the star ratings to evaluate how
well your lexicon program is doing
and always keep a record of what you do
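The two programming steps of Task 1 might look roughly like the sketch below. It rests on several assumptions that the slides do not confirm: the lexicon is a dictionary from word to "positive" or "negative", a review is classified by counting positive hits minus negative hits, ties go to "positive", and the gold labels have already been derived from the star ratings. The toy lexicon, the toy reviews and all function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, lexicon):
    """Label a review by counting lexicon hits (assumed decision rule)."""
    score = 0
    for tok in tokenize(text):
        polarity = lexicon.get(tok)
        if polarity == "positive":
            score += 1
        elif polarity == "negative":
            score -= 1
    # Assumption: a tie (score == 0) is resolved as "positive".
    return "positive" if score >= 0 else "negative"


def accuracy(predicted, gold):
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty data set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy data for illustration only; not taken from the course data set.
TOY_LEXICON = {
    "wonderful": "positive",
    "exceptional": "positive",
    "dull": "negative",
    "awful": "negative",
}
# Each pair is (review text, gold label assumed to come from a star rating).
TOY_REVIEWS = [
    ("A wonderful, exceptional film.", "positive"),
    ("Dull plot and awful acting.", "negative"),
    ("Wonderful cast, but the film is bad.", "negative"),
]

predictions = [classify(text, TOY_LEXICON) for text, _ in TOY_REVIEWS]
toy_accuracy = accuracy(predictions, [label for _, label in TOY_REVIEWS])
print(predictions, toy_accuracy)
```

The third toy review is misclassified because "bad" is missing from the toy lexicon, so the accuracy is two out of three. Recording such errors is one way to follow the advice to keep a record of what you do.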