Corpus Linguistics: Lecture 1


Introduction to Corpus Linguistics

• The word corpus comes from the Latin word “corpus”, meaning “body”.
• A corpus is a body, or large collection, of real-world text (books,
novels, magazines, speeches, journals).
➢ Definition of a Corpus
“A large collection of machine-readable text is known
as a corpus (plural: corpora). It can be spoken or written.”
or
“A large collection of language text in electronic form,
selected to be representative of a particular language
for linguistic research.” (Sinclair, 2005)
➢ What is Corpus Linguistics?
• “The systematic study of language through large
collections of machine-readable text (corpora) is known
as corpus linguistics.”
or
• “The study of language that involves the analysis of large
collections of real-world language data is known as
corpus linguistics.”
• The term corpus linguistics first appeared only in the early
1980s.
• John Sinclair and J. R. Firth are leading linguists associated
with this field; Firth's work on collocation predates the term itself.
➢ Examples of Corpora
I. The Brown Corpus of Standard American English:
The first modern, electronically readable corpus
was the Brown Corpus, prepared by W. Nelson Francis and
Henry Kučera in the early 1960s. This corpus consists
of about one million words of American English text.
II. The BNC (British National Corpus): a 100-million-word
corpus of British English compiled in the early 1990s.
III. The Kolhapur Corpus (Indian English).
IV. The London Corpus of Spoken British English.
➢ Types of Corpus
1. General: a corpus of texts that do not belong to a
single text type or subject field.
2. Monolingual: a single-language corpus, often used
to obtain lexical and grammatical information about a
specific language.
3. Bilingual: a corpus containing texts from two
languages, enabling comparative analysis.
4. Parallel: a corpus containing texts in one language
together with their translations into one or more other
languages, useful for studying translation strategies.
➢ Features of Corpus Linguistics
I. Using authentic language data
II. Evidence-based study
III. Investigating qualitative as well as
quantitative aspects of language
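
To illustrate the quantitative side of corpus work, here is a minimal Python sketch that counts word frequencies in a text. The function name word_frequencies, the simple regular-expression tokenizer, and the two-sentence sample are assumptions made for illustration; they are not part of the lecture, and a real study would use a full corpus and a more careful tokenizer.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercase word token to its count."""
    # Assumption: a token is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Tiny made-up sample standing in for a real corpus.
sample = "The cat sat on the mat. The dog sat on the log."
freq = word_frequencies(sample)

# The most frequent words, highest count first.
top = freq.most_common(3)
print(top)
```

Frequency lists like this are usually the starting point; the qualitative step is then reading the actual sentences in which the frequent words occur.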
➢ Uses
I. Lexicography: creating dictionaries
II. Discourse analysis
III. Language Acquisition
IV. Analysis of Political Discourse
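
Lexicographers and discourse analysts often inspect a word through a concordance, that is, every occurrence of the word shown with its surrounding context (keyword in context, or KWIC). The sketch below is one simple way to build such a listing in Python. The function name kwic, the width parameter, and the sample sentences are hypothetical and chosen only for illustration.

```python
import re


def kwic(text, keyword, width=2):
    """Return (left context, keyword, right context) for each hit.

    `width` is the number of words of context kept on each side.
    """
    # Assumption: a token is any run of letters or apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Made-up sample in which "bank" appears in two different senses.
sample = "The bank raised rates. She sat on the river bank today."
hits = kwic(sample, "bank")
for left, word, right in hits:
    print(f"{left:>12} [{word}] {right}")
```

Reading the two hits side by side shows the financial and the riverside senses of "bank", which is the kind of evidence a dictionary entry can be built on.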
