Indexing and Abstracting Services


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/346178324


Chapter · January 2013


1 author:

Dr Edeama Onwuchekwa


All content following this page was uploaded by Dr Edeama Onwuchekwa on 24 November 2020.



Indexing and Abstracting Services

Edeama O. ONWUCHEKWA

National Open University of Nigeria, Lagos.

Citation Details: Onwuchekwa, E. O. (2013). Indexing and Abstracting Services. In: Issa, A., Igwe, K. N. and Uzuegbu, C. P. (eds.). Provision of Library and Information Services to Users in the Era of Globalization. Lagos: Waltodanny Visual Concept. pp. 203-221

INTRODUCTION

Indexing is a superior technique for retrieving relevant information contained in documents stored in the library. In indexing, documents are analyzed in order to bring out the subject terms that have been sufficiently treated, and these terms serve as access points. For each subject term chosen as an access point, the bibliographic details of the document are provided, so that users who are interested in the different subject areas covered will be able to locate the same document.

Abstracting adds value to the document being sought: apart from providing the full
bibliographic details of the document, it also provides a summary of the
document. This enables the user to determine whether the document will be useful to him/her
when it is finally retrieved. Abstracting and indexing databases have been found to still be
both relevant and necessary (Rabe 2002).

Abstracts, index entries, title listings, and other forms of document representation are
highly organized and detailed guides that lead the user to the originals that libraries are
expected to furnish. In addition to acting as guides, document representations also provide
the user with a means of appraising the value of the available literature, its relevance to his
area of interest, and his need for the original. Rarely do data contained in secondary
publications serve as substitutes for the originals. Without surrogates such as indexes and
abstracts, searching through the accumulated literature would be impossible. This chapter
considers the concept and practicalities of indexing and abstracting services in
information organizations.

CONCEPT OF INDEXING

According to Salton (1989), cited in Chowdhury (2004), the process of constructing document surrogates by assigning identifiers to text items is known as indexing. When the task of indexing is based on the conceptual analysis of the subject of the document, it is called subject indexing.

According to Taylor (2009), indexing is the process by which the content of an information
resource is analyzed, and the “aboutness” of that item is determined and expressed in a
concise manner. Indexing is also concerned with describing the information resource in
such a way that users are aware of the basic attributes of a document, such as author, title,
length, and the location of the content. Indexing typically concerns textual items only,
although image indexing is a growing area of practice.

One of the functions of an information retrieval system is to match the contents of documents with users’ queries. The content of each input document in a collection is to be analyzed and represented in such a way that it becomes convenient for matching. The systems personnel have to prepare a surrogate for every document, and the surrogates must be maintained in an organized manner.

The technique of producing an index is called indexing. Indexing is the process of providing
a guide to the intellectual content of a document or a collection of documents. The result of
this process is an index, which will serve as a pointer to the intellectual content in a
document. It is able to perform this role through the descriptors that are used in describing
the intellectual content of documents. The reader who is interested in a document will use
the descriptors assigned to the document by the indexer. The ultimate objective of the
index is to reduce the efforts a user expends in accessing a topic of interest in a particular
document or a set of documents which have been stored in a collection.
An index is an important tool for retrieving information contained in documents stored in
the library, documentation or information centre. It provides a means of locating the
information relevant to a request.

THEORIES OF INDEXING

Some theories explaining the process of indexing do exist, although information
scientists differ in accepting some of these views. Fugmann (1993), for one, proposed
a theory of indexing based on five general axioms:

1) The axiom of definability proposes that compiling information relevant to a topic
can only be accomplished to the degree to which a topic can be defined.

2) The axiom of order suggests that any compilation of information relevant to a topic
is an order creation process.

3) The axiom of sufficient degree of order posits that the demands made on the degree
of order increase as the size of a collection and the frequency of searches increase.

4) The axiom of predictability says that the success of any directed search for relevant
information hinges on how readily predictable are the modes of expression for concepts
and statements in the search file.

5) The axiom of fidelity equates the success of any directed search for relevant
information with the fidelity with which concepts and statements are expressed in the
search file.

FUNCTIONS OF AN INDEX
As Wellisch (1994) puts it, the indexing of all verbal texts whether in print or electronic
form must fulfil certain functions if the resulting index is to retrieve or find a particular
name, term or passage in a text that the user has either read before, or that is presumed to
contain the desired information.

The American, British and International Standards (National Information Standards Organization 1993; International Organization for Standardization 1994), which are largely worded the same way on this issue and express the considered judgment of experienced indexers, stipulate that the basic function of an index is to provide users with an efficient and systematic means for locating documents or parts of documents that may address their information needs or requests. An index should:

• identify and locate potentially relevant information in the document or collection being indexed;

• discriminate between information on a topic and passing mentions of a topic;

• exclude passing mentions of topics that offer nothing significant to the potential user;

• analyse concepts treated in a document so as to produce suitable index headings based on its terminology;

• indicate relationships among topics;

• group together information on topics scattered by the arrangement of the document or collection;

• direct users seeking information under terms not chosen as index headings to terms that have been chosen, by means of see references;

• suggest that users of a topic also look up related topics, by means of see also references;

• arrange entries into a systematic and helpful order.

TYPES OF SUBJECT INDEXES


According to Aina (2004), there are various types of indexes, depending on what is being
used as access points. Subject indexes are the most widely used in the library because
most users approach the library through the subject. The different types of
subject indexes are:
1) COORDINATE INDEXES
One of the earliest types of subject index is the coordinate index, which involves combining
two or more single terms to create a phrase that will meet the request of the
user. It is particularly used in specialized libraries, where readers request information
using multiple terms rather than the single terms which are generally assigned to
documents. For example, a reader who is interested in documents that deal with “Female
Librarians in Ghana” is not interested in documents that deal with females or
librarians or Ghana only. Rather, he/she is interested in documents that contain the three
Uniterms together. This type of indexing is commonly associated with post-coordinate
indexing because coordination is done at the time of searching, and it is also called
manipulative indexing or computerized indexing, where Boolean logic operators are used.
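
The coordination of Uniterms at search time can be illustrated with a minimal sketch. It assumes a hypothetical uniterm file in which each single term is posted with a set of invented document numbers; the function name coordinate_search is illustrative only, and the Boolean AND is implemented as a set intersection.

```python
# Hypothetical uniterm file: each single term maps to the set of document
# numbers posted under it (all numbers are invented for illustration).
uniterm_file = {
    "females": {1, 2, 5, 8},
    "librarians": {2, 3, 5, 9},
    "ghana": {2, 4, 5, 7},
}


def coordinate_search(index, terms):
    """Return the documents posted under ALL the given terms (Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    # Coordination happens here, at search time, not at indexing time.
    return set.intersection(*postings)


result = coordinate_search(uniterm_file, ["Females", "Librarians", "Ghana"])
print(sorted(result))  # documents indexed under all three Uniterms
```

Only documents 2 and 5 carry all three terms in this invented file, so only they satisfy the request for “Female Librarians in Ghana”.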

2) PERMUTED TITLE INDEXES

This is a very common type of subject index that is based on the keywords in the title of the
document. It rests on the assumption that the title correctly
reflects the content of the document, so that the keywords in the title correspond to the
subject terms of the document. This type of index employs natural indexing language. It is
worth noting that this assumption does not necessarily hold true, since some titles
are not informative enough because they do not contain
information-conveying words.
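
A permuted title index can be sketched as follows. This is a simplified illustration rather than the layout of any particular published index: the stop-word list, the " / " marker showing where the original title wraps around, and the function name permuted_title_index are all assumptions made for the example.

```python
# Assumed stop-word list: words that convey no subject information.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def permuted_title_index(titles):
    """Create one entry per significant title word, with that word leading."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # stop words never become access points
            if i == 0:
                rotated = title
            else:
                # Rotate the title so the keyword leads; "/" marks the wrap.
                rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((word.lower(), rotated))
    return sorted(entries)


for keyword, entry in permuted_title_index(["Indexing of Periodicals in Ghana"]):
    print(f"{keyword:12} {entry}")
```

Each significant word of the title becomes an access point, which is why an uninformative title yields a poor index.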
3) PRECIS (Preserved Context Index System) Indexes
This is a special type of index which is very useful to readers in that it provides a summary
of the subject content of a document. Each index term is presented in the context in which
the author of the document has used it. There is a lead
term as well as a summary of the document in the context of the other index terms used in
the document. The other displayed index terms are then arranged from the specific to the
general. Each index entry has a lead term and other terms that together show a summary of the
document. Any of the displayed terms can also appear as the lead term. Thus access to a
document is possible through any of the index terms used in the document.
4) CITATION INDEX
At the end of journal articles, conference papers, books, etc., authors usually provide a list of
references that they have consulted in preparing the work; each citation in the
list (a cited paper) is therefore relevant to the author's document. There are also other
papers that will have cited each citation in the list; these other papers are referred to as
citing articles. It is therefore possible to list all the citing articles under each cited paper.
A citation index is a special type of index that does exactly this: for
each article cited, it lists all the other articles that have cited it (the citing
articles).
The citation index is the basis of the Science Citation Index, the Social Sciences Citation Index and the Arts
and Humanities Citation Index, which are produced at the Institute for Scientific Information
in Philadelphia, USA.
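
The inversion from reference lists to a citation index can be shown with a small sketch. The paper names and the function name build_citation_index are hypothetical; the input maps each citing paper to the works in its reference list, and the output lists the citing articles under each cited paper.

```python
from collections import defaultdict

# Hypothetical reference lists: citing paper -> papers it cites.
reference_lists = {
    "Paper A": ["Paper X", "Paper Y"],
    "Paper B": ["Paper X"],
    "Paper C": ["Paper Y", "Paper Z"],
}


def build_citation_index(references):
    """Invert the reference lists: cited paper -> sorted list of citing papers."""
    citation_index = defaultdict(list)
    for citing, cited_papers in references.items():
        for cited in cited_papers:
            citation_index[cited].append(citing)
    return {cited: sorted(citing) for cited, citing in sorted(citation_index.items())}


for cited, citing in build_citation_index(reference_lists).items():
    print(cited, "<-", ", ".join(citing))
```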

BOOK INDEXES AND PERIODICAL INDEXES


The procedures for producing indexes to books and periodicals are basically the same,
except that the terms used by the author of a book are generally used as index terms,
unlike periodical indexes, where there is always a need for a controlled language. Another
difference is that the indexing of a book is a single operation that begins and ends with one
indexer. On the other hand, the indexing of periodicals is a continuous project that will involve
several indexers.
BACK OF THE BOOK INDEX
Every standard book is expected to have a back-of-the-book index; indeed, librarians
now assess the quality of a book during acquisition partly by the presence
of such an index. The back-of-the-book index usually contains the important topics
described in the document, names of personalities and corporate bodies, geographical
names, and the pages on which they are located. This index makes the topics treated in
the book more accessible to the user, thereby saving the stress of scanning or going through
the whole book in search of a topic or issue.
In traditional back-of-the-book indexing, the index is a list of terms or term phrases
arranged alphabetically with locator references that make it possible for the user to
retrieve the desired content. The language of the indexing terms is typically derived from
the language of the text; thus the kind of indexing done in this context is referred to as derived
indexing.
In back-of-the-book indexing, there are two methods for creating multiple entry points:
1) Double-posts, whereby two or more index entries of the same meaning are added to the
index, with none designated as preferable to the other, and all have the same locators
pointing to the same points of text. (This is the preferred method when the entries do not
have subentries.)
2) See cross-references, whereby additional index entries of the same meaning are added
to the index, but they point to a single favoured entry, which is the only one with the
locators. (This is the preferred method when the entry has subentries.)
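
The difference between the two methods can be made concrete with a short sketch. The headings, page locators and function names below are invented for illustration; the index is modelled simply as a dictionary from heading to either a list of locators or a "see" reference.

```python
def add_double_posts(index, headings, locators):
    """Double-post: every synonymous heading carries the same locators."""
    for heading in headings:
        index[heading] = sorted(locators)


def add_see_reference(index, preferred, variants, locators):
    """See reference: only the preferred heading carries locators."""
    index[preferred] = sorted(locators)
    for variant in variants:
        index[variant] = f"see {preferred}"


index = {}
# Hypothetical headings and page numbers.
add_double_posts(index, ["cars", "automobiles"], [12, 45])
add_see_reference(index, "libraries, academic", ["university libraries"], [203, 210])

for heading in sorted(index):
    print(heading, index[heading])
```

With double-posting the user finds the locators immediately under either heading; with a see reference the locators, and any subentries, are maintained in one place only.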

PERIODICAL INDEX
Periodicals play a critical role in information centres since they convey the most up-to-date
information on developments in the users' areas of expertise. Universities that are strong
in research spend colossal sums of money on periodicals. More importantly, periodicals
constitute the heart of academic research in universities. The importance of periodicals in
university libraries cannot therefore be overemphasized.
According to Matanji (2012), there are two types of periodical indexes: indexes
to a single journal and indexes to several journals. Very often, the editors of a journal
will issue an index at the end of the volume. This is generally an index of the authors and
subjects included in all the issues for a particular year. The terms are selected from the title
of each article in the journal and there is no need for controlled language.

INDEXING LANGUAGE
Indexing language is made up of the words or descriptors that are used to describe the intellectual
contents of documents. These terms are expected to be used by the searcher to search for
documents in a collection. There is always a need for an artificial language to be used by the
indexer and the searcher to describe a document, since the terms or concepts identified in a
book are represented by words or phrases. Two types of indexing language are
described below with their distinguishing characteristics.
Natural Indexing Language: The indexer uses the exact words and phrases used by the
author of the document. This is very easy for the indexer and the searcher to use, but the
major problem is that there is no control over synonyms, semantics,
homographs, or singular and plural forms. This type of indexing tends to scatter documents on the
same subject where the authors have used different terms. The function of this type of
language is to ensure that the indexer and the searcher operate at the same level by using
the same language, which facilitates the retrieval of relevant information from the
collection of the library. This language appears in a variety of forms. In natural language
indexing, any term that appears in the title, abstract or text of a document record may be an
index term. There is no mechanism to control the use of terms for such indexing. Similarly,
the searcher is not expected to use any controlled list of terms. Natural indexing language is
used mainly in back-of-the-book indexes and in computerized indexes such as Keyword in
Context (KWIC) and Keyword out of Context (KWOC) indexes.

Controlled Indexing Language: The terms used to represent the subjects assigned
to particular documents are controlled by the indexer. The indexer exercises
some control over the terms that are to be used as index terms because the indexer assigns
only terms that have been listed as possible index terms. There is generally a preconceived
standard list of terms to be used for a particular system; this list is sometimes called an
authority list. Thus, when an indexer has
identified terms that represent the document, he/she will consult this standard list to
ensure that the terms used are consistent. There are two types of standard list.
The first type is the alphabetical controlled list, in
which the terms are arranged alphabetically; the two common examples are subject
headings lists and thesauri. The second type is the classification scheme, which assigns
notation to subject terms. The searcher is expected to consult the same controlled list
when formulating a search strategy.
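
The way an authority list brings the indexer and the searcher to the same term can be sketched as below. The list itself, its entry terms and the function name translate_terms are hypothetical; a real subject headings list or thesaurus would also record broader, narrower and related terms, which this sketch omits.

```python
# Hypothetical authority list: entry term (lower case) -> preferred index term.
AUTHORITY_LIST = {
    "cars": "Automobiles",
    "motor cars": "Automobiles",
    "automobiles": "Automobiles",
    "libraries": "Libraries",
}


def translate_terms(terms, authority):
    """Map free terms to preferred terms; report terms absent from the list."""
    assigned, rejected = [], []
    for term in terms:
        preferred = authority.get(term.strip().lower())
        if preferred is None:
            rejected.append(term)  # not an authorized term
        elif preferred not in assigned:
            assigned.append(preferred)  # synonyms collapse to one heading
    return assigned, rejected


assigned, rejected = translate_terms(
    ["Cars", "Motor cars", "Libraries", "Lorries"], AUTHORITY_LIST
)
print(assigned)
print(rejected)
```

Because both indexer and searcher pass their terms through the same list, documents on the same subject are gathered under one heading instead of being scattered across synonyms.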

LEVELS OF INDEXING

Some researchers, for example Quinn (1994) and Fugmann (1993), have postulated that
there are five levels in the process of indexing:
• The first level is known as the concordance, which consists of all references to all
words in the original text arranged in alphabetical order.
• The second level is the information theoretic level, which calculates the likelihood of
a word being chosen for indexing based on its frequency of occurrence in a given
text document.
• The third level is the linguistic level, which attempts to explain how meaningful
words are extracted from large units of text. (Some indexers have proposed that
opening paragraphs, chapters, etc. are good sources for choosing indexing terms.)
• The fourth level is the textual or skeletal framework. Here the text is prepared
by the author in an organized manner and held together by a skeletal structure. The
onus therefore lies on the indexer to identify the skeleton and the markers that will
determine the content of the given text.
• The fifth level of indexing theory is the inferential level. An indexer should be able to
make inferences about the relationships between words and phrases by
understanding the sentence structure.
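
The first two levels lend themselves to a simple sketch. The tokenizer (lower-cased runs of letters), the sample sentence and the function names are assumptions made for illustration; the relative frequency computed here is only a crude stand-in for the likelihood calculation of the information theoretic level.

```python
import re
from collections import defaultdict


def concordance(text):
    """Level 1: every word, in alphabetical order, with its word positions."""
    positions = defaultdict(list)
    words = re.findall(r"[a-z]+", text.lower())
    for position, word in enumerate(words, start=1):
        positions[word].append(position)
    return dict(sorted(positions.items()))


def relative_frequencies(text):
    """Level 2 (simplified): share of all word occurrences taken by each word."""
    conc = concordance(text)
    total = sum(len(p) for p in conc.values())
    return {word: len(p) / total for word, p in conc.items()}


sample = "An index is a guide. An index points to content."
for word, where in concordance(sample).items():
    print(word, where)
```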

PRE-COORDINATE AND POST-COORDINATE SYSTEMS

Subject indexing systems have been classified broadly as pre-coordinate and post-
coordinate systems. It has already been mentioned that the major objective of any indexing
system is to represent the contents of documents through keywords or descriptors.

In post-coordinate systems, one entry is prepared for each keyword selected to represent
the subject of a given document, and all the entries are organized in a file. Each keyword serves
as a lead term to the document. When a user puts forward a query, it is analyzed and some
keywords are selected that are representative of the user’s query. These query terms are
then matched against the file of index terms and relevant documents are retrieved.
Uniterm, Peek-a-boo, etc., are examples of post-coordinate systems.

In pre-coordinate systems, as the name implies, keywords chosen at the subject analysis
stage are coordinated at the indexing stage, and thus each entry represents the full content
of the document concerned. Because the coordination is done before searching by the user,
this type of indexing is called “pre-coordinate indexing.” PRECIS, POPSI, Chain procedure,
Relational Indexing, NEPHIS, etc., are examples of pre-coordinate indexing systems. Thus,
for a document discussing the application of computational linguistics to the indexing of
periodicals, entries prepared according to any of the pre-coordinate systems will represent
the full context in which the entry word occurs, whereas in a post-coordinate system
each term is generated without any context; that is, unless all the corresponding entries are
found, the content of the document cannot be learnt.

TECHNIQUES FOR INDEXING DOCUMENTS

Indexing is an art that involves a number of stages. The first stage in indexing a document is
to gain a general idea of the document by going through the title, preface, foreword,
contents pages and possibly the introduction. One can also flip through the text and do some
spot reading. This gives the indexer sufficient familiarity with the document; hence
this stage is called the familiarization stage. The indexer wants to know what the document
is about by identifying the concepts that are conveyed by words and phrases in the document,
examining the title, abstract, preface, introduction, chapter headings, major headings, sub-
headings, etc. It is important that the indexer takes into account the needs of the users.

The next stage, which is the analysis, involves the indexer using his intellectual judgment
by identifying the concepts the book has treated. Sometimes the indexer may use the exact
term used by the author or he might formulate an appropriate term. These terms are
intended to accurately describe the whole document. The indexer at this stage is doing
what is referred to as subject analysis or concept analysis.

This is where the subject background of the indexer comes into play, especially if he/she
has sufficient background in the subject of the document. At this stage, the terms identified by
the indexer are what he/she judges to be the terms that represent the totality of the
document. In a situation where the use of terminology is controlled, the indexer cannot use
these terms directly as index terms or access points. Rather, the terms identified have to be
translated into the indexing language used by the system, which is the language used by both
the indexer and the searcher in an information storage and retrieval process. This language
exercises some control over which terms may be used as index terms.
During this stage, the indexer assigns subject descriptors chosen from the controlled
language that the users of the discipline are familiar with. This stage is called the
translation stage. However, in a setting where there is no need to exercise control over the
terminology of the system, such as a back-of-the-book index or computerized indexes, this
last stage may not be necessary.

However, in determining indexing policies, certain features of indexing have to be explained. These include depth of indexing, specificity, exhaustivity and weighting. Depth of indexing involves selecting a large number of topics from a document, that is, making as many of the important topics treated in a document as possible into index terms for the document. Specificity involves selecting only terms that are specific to the document, that is, terms that entirely cover the document.

EVALUATION OF AN INDEX

All types of indexes that are produced should be evaluated to determine their effectiveness
in terms of how many documents that contain a particular term can be retrieved from a
system and the relevance of the documents retrieved. The effectiveness of an indexing
system is controlled by two parameters, called indexing exhaustivity and term specificity.
Exhaustivity means the degree to which the subject matter of a given document has
been reflected through the index entries. An exhaustive indexing system is thus supposed
to represent the contents of the input documents fully.

However, to attain this objective the system has to select as many keywords as possible to
represent the ideas put forward in the document. In a non-exhaustive system only a few
keywords are chosen, which give a broad representation of the subject.

Term specificity refers to how broad or how specific are the terms or keywords chosen
under a given situation. The more specific the terms, the better is the representation of the
subject through the index entry.

Recall Ratio: This is a quantitative measure used to determine the ability of an index as an
aid to retrieving documents containing information on a particular request from a
collection of documents present in a library or an information centre. It refers to the
proportion of relevant materials retrieved by a system, and can be represented thus:

$$\text{Recall} = \frac{\text{Number of relevant documents retrieved}}{\text{Number of relevant documents in the collection}}$$

Precision Ratio is a quantitative ratio of the number of relevant documents retrieved to the
total number of documents retrieved.

$$\text{Precision} = \frac{\text{Number of relevant documents retrieved}}{\text{Total number of documents retrieved}}$$

These parameters are expressed in percentage terms, which means that both recall and
precision may vary between 0 and 100%.
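
As a worked illustration of the two ratios, the sketch below computes them for one invented search. The document numbers and the function name recall_precision are hypothetical; both ratios are returned as percentages, following the definitions given above.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) as percentages for one search."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    recall = 100 * hits / len(relevant) if relevant else 0.0
    precision = 100 * hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical search: 10 relevant documents exist in the collection
# (numbers 1-10); the search retrieves 14 documents (numbers 4-17).
relevant_docs = range(1, 11)
retrieved_docs = range(4, 18)

recall, precision = recall_precision(retrieved_docs, relevant_docs)
print(f"Recall: {recall:.0f}%  Precision: {precision:.0f}%")
```

Seven of the ten relevant documents are retrieved, giving a recall of 70%, while only seven of the fourteen retrieved documents are relevant, giving a precision of 50%.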

In practice there tends to be an inverse relationship between recall ratio and
precision ratio: the higher the recall ratio, the lower the precision ratio, and vice versa. The
more documents that are recalled, the less precise the indexing system will be, and the
fewer documents that are recalled, the more precise the indexing system is. Thus, the indexer
must ensure a fair balance of recall ratio and precision ratio. One may therefore expect a
recall ratio of about 70% and a precision ratio of about 60%.

It should be noted that specificity and exhaustivity influence the recall and precision
ratios. When the indexing policy of a library or an indexing agency is to support
exhaustivity, the result is a high recall of documents and a low precision; that is,
many of the documents recalled will not be relevant. By increasing the number of
keywords during a search process, we may choose subjects that are
only very narrowly discussed in the given documents. In other words, the system will retrieve a
large number of non-relevant documents, thereby increasing recall but reducing
precision.

On the other hand, when an indexing agency supports specificity, the recall of
documents will be low, but the precision will be high, as only documents that are
relevant to the user will have been recalled. Specificity promotes low recall and high
precision, while exhaustivity promotes high recall and low precision. For example, if we are
looking for information on “Internet”, we can also use related terms such as “net”, “information
superhighway”, “World Wide Web”, etc. for the search process. By doing this, we may
retrieve more information and increase the possibility of a higher number of relevant
items. Thus, by making the search more exhaustive, we tend to get a higher
recall.
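
The effect of broadening a search with related terms can be sketched with invented figures. The postings, the set of relevant documents and the function names below are all hypothetical; the broad search retrieves a document under any of the terms (Boolean OR), and the ratios are computed as defined earlier in this section.

```python
# Hypothetical postings: term -> documents indexed under it.
postings = {
    "internet": {1, 2, 3},
    "net": {3, 4, 9},  # document 9 is about fishing nets, not the Internet
    "world wide web": {5, 6},
}
relevant = {1, 2, 3, 4, 5, 6}  # assumed relevant documents in the collection


def search_any(index, terms):
    """Retrieve documents posted under ANY of the terms (Boolean OR)."""
    found = set()
    for term in terms:
        found |= index.get(term, set())
    return found


def recall_precision(retrieved, relevant_docs):
    """Return (recall, precision) as percentages."""
    hits = len(retrieved & relevant_docs)
    recall = 100 * hits / len(relevant_docs) if relevant_docs else 0.0
    precision = 100 * hits / len(retrieved) if retrieved else 0.0
    return recall, precision


narrow = search_any(postings, ["internet"])
broad = search_any(postings, ["internet", "net", "world wide web"])

print("narrow:", recall_precision(narrow, relevant))
print("broad: ", recall_precision(broad, relevant))
```

In this invented case the single-term search finds half of the relevant documents with perfect precision, whereas the broader search finds all of them at the cost of also retrieving a non-relevant document.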

Time: The main function of an index is to reduce the time it takes a user to retrieve
documents in a collection. Thus, a good index is one which takes a minimum of time to
retrieve documents that are relevant and precise to the query. However, the time it takes a
user to retrieve relevant documents does not depend on the index alone; the ability
of the user to select terms that appropriately describe the query is also a factor in
quickly retrieving relevant documents from a collection. Thus, if an index is good but the user
has not used the appropriate descriptors, the user will take a longer time to retrieve
relevant documents; but all things being equal, a good index should enable a reader to take a
short time to get relevant documents from the collection.

Cost: A good index should be able to serve its purpose at minimum cost. Thus, a good
index should be affordable to an average library. No matter how efficient an index is, if it is
costly, it might not be available to an average library, archive or
information centre. When it is available, a reader might have to subsidize the cost, which
many readers might not be able to afford.

Indexing operations have been performed intellectually by human indexers for quite a long
time now. Automatic systems, in which text
analysis and indexing are performed by computers, have been developed comparatively recently. However, the basic task involved in
indexing is the same, which is to analyze the content of a given document and present the
analysis by means of content identifiers or keywords. In subject indexing, the basic objective is
to match the contents of documents with users’ queries, and thus the product of the
conceptual analysis is presented in a natural language form. A number of systems (Chain,
PRECIS, POPSI, Relational Indexing) have been developed over the years for preparing
subject index entries of documents.

AUTOMATIC INDEXING

Salton lucidly defines automatic indexing as follows: ‘when the assignment of
the content identifiers is carried out with the aid of modern computing equipment the
operation becomes automatic indexing’. Borko and Bernier suggest that the subject of a
document can be derived by a mechanical analysis of the words in a document and by their
arrangement in a text. In fact, all attempts at automatic indexing depend in some way or
other on the original document texts, or document surrogates. The words occurring in each
document are listed and certain statistical measurements are made, like word frequency
calculation, total collection frequency, or frequency distribution across the documents of
the collection.
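
The statistical measurements mentioned above can be illustrated with a minimal sketch. The two short documents, the tokenizer and the function names are assumptions made for the example; it computes the frequency of each word within a document, its total frequency in the collection, and the number of documents in which it occurs.

```python
import re
from collections import Counter

# Hypothetical two-document collection.
documents = {
    "d1": "indexing of documents and indexing of books",
    "d2": "abstracting of documents",
}


def tokenize(text):
    """Split text into lower-case words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_statistics(docs):
    """Return per-document, collection-wide and document frequencies."""
    term_freq = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    collection_freq = Counter()  # total occurrences across the collection
    document_freq = Counter()    # number of documents containing the word
    for counts in term_freq.values():
        collection_freq.update(counts)
        document_freq.update(counts.keys())
    return term_freq, collection_freq, document_freq


term_freq, collection_freq, document_freq = frequency_statistics(documents)
print(term_freq["d1"]["indexing"], collection_freq["of"], document_freq["documents"])
```

A word such as "of", which is frequent everywhere, discriminates poorly between documents, whereas "indexing", which is frequent in one document only, is a more promising candidate index term for that document.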

There are numerous software packages available for performing parts of the indexing
process. Concordance-generators like AntConc (Anthony, 2006) use cluster analysis to
study the relationships between words and surrounding text, but require the user to
provide specific words to search. Another cluster analysis application, Grokker (Groxis,
2006), aids in text searches by creating visual maps of related terms. Many indexers use
products like CINDEX (Indexing, 2006), MACREX (Macrex, 2005), and SKY Index (Sky,
2006), which function like databases, and can be integrated with Microsoft Access and
dBASE.

These indexing tools automate the mundane aspects of indexing. They provide features
such as formatting type, sorting terms, arranging entries on the index page, search and
replace, spell-checking, and the ability to import/export index files. None determines which
terms are correct to index, nor do they attempt to capture context-sensitive information.
Thus, our prototype is unique in its ability to perform these “intelligent” tasks, as well as
the mundane ones. This will result in a greater time savings than can be realized by the
current state-of-the-art.

An early attempt at automated indexing, Indexicon, utilized the indexing and tagging
features of commercial word processors, but failed to live up to product claims. According
to Mulvany (1994), Indexicon allowed the user to determine the “level of sensitivity” of
terms to include, and “exclusion zones” of terms to exclude, but failed to generate cross-
references, and omitted many words that should have been included. Furthermore,
Indexicon was “incapable of recognizing … terms that do not appear verbatim in the text.”
Our system uses mathematical techniques to disambiguate homographs, and to identify
synonyms in order to provide cross-references.

CONCEPT OF ABSTRACTING

Abstracting is a process that consists of analyzing the content of an information resource and writing a succinct summary or synopsis of that work. An abstract is not a review of the work, nor does it evaluate or interpret the work that is being abstracted. Although it contains keywords and concepts found in the larger text, the abstract is an original document rather than an excerpted passage.

Users needing to stay abreast of information in their field can do so by reviewing
abstracts published in that area. Abstracts aid in deciding which articles need to be
read in full and which can be skimmed or skipped altogether. Librarians and other
information professionals find that the use of abstracts improves the speed and utility of users’
searches.

A lot of authors have defined the abstract from different points of view. Lancaster (2003)
defines an abstract as a brief but accurate representation of the contents of a document and
he opines that an abstract is different from an extract, an annotation or summary.

Rowley (1996) defines an abstract as a concise and accurate representation of the content
of a document in a style similar to that of the original document. She adds that an abstract
covers all the main points made in the original document and usually follows the style and
the arrangement of the parent document. Abstracts as documentary products always take
the form of short texts either accompanying the original document or included in its
surrogate.

An abstract is different from an extract, an annotation or a summary. An extract is an
abbreviated version of a document created by drawing sentences from the document itself,
whereas an abstract, though it may include words occurring in a document, is a piece of
text created by the abstractor rather than a direct quotation from the author. An annotation
is a note added to a title or other bibliographic element of a document by way of comment
or explanation, and a summary is a restatement within a document (usually at the end) of
the document’s salient findings and conclusions.

An abstract can also be defined as a “summary” of the essential contents of a work, usually
an article in a periodical, prepared usually by a professional other than the author and
accompanied by the specification of its original. Ashworth defines the term abstract as a
précis of information, which in its narrower sense now usually refers to the information
contained in an article in a periodical, short pamphlet, or serial publication.

Lancaster (2003) further defines an abstract as an accurate representation of a document
without added interpretation or criticism and without distinction as to who wrote the
abstract.

USES OF ABSTRACTS

Abstracting is not a simple art. It requires certain skills: good literacy skills, especially
writing skills; knowledge of the subject field; and an awareness of the characteristics of
the potential users.

An abstract is different from an annotation, an extract and a summary. While an abstract
provides the skeletal representation of a document, an annotation provides a brief
description of the document. An extract lifts one or two paragraphs of a document
verbatim, which would represent the whole document, and a summary is just a brief
restatement of the major findings reported in the document. The following are some of the
major uses of abstracts:

1. They promote current awareness: abstracts repackage the information contained in
the original document into a more condensed form and are therefore less time-consuming
to read, which makes it easier to keep up to date.

2. They save reading time: abstracts are much smaller in size in comparison to the
original document, and yet can provide as much information as the user needs without
going into the full text.

3. They facilitate selection: in an information retrieval environment one may retrieve a
large number of items, reading the full texts of which may be impossible or very
time-consuming. In such cases the user may consult the abstracts of the retrieved items in
order to be selective.

4. They help overcome the language barrier: most abstracting journals cover more than
one language, and therefore the user can find out what studies and research have been
published in languages that he or she cannot read.

5. They facilitate literature searches: without indexed abstracts, searches of open and
classified literature would be impossible due to the huge volume of material available.

6. They improve indexing efficiency, as they can be indexed much more rapidly than
can the original documents. The rate of indexing can be improved by a factor of two to
four, and the cost of preparing the index is reduced with little or no reduction in quality.

7. They aid in the preparation of reviews and can be of much help in the preparation of
bibliographies and so on.

TYPES OF ABSTRACTS
Different authors have made several attempts to classify abstracts into different
categories. Some have suggested that abstracts should be distinguished by their length,
amount of detail, inclusion of judgment or critical analysis, and the language used. The
most common ways of categorizing abstracts are described below.

Abstracts by writer

Abstracts may be written by authors, by subject experts, or by professional abstractors, and
may accordingly be categorized as author-prepared abstracts, expert-prepared abstracts,
and professional-prepared abstracts. Expert abstractors are usually the choice of
abstracting journals. They are trained in abstracting and are also expert in the subject field.
Thus, their abstracts should be accurate, comprehensive, lucid, and terse. These abstracts
are usually prompt and well written, though they may sometimes be expensive.
Professional abstractors abstract for a living, and may be employed to handle work in more
than one language.

Abstracts by Purpose

Abstracts are written with certain purposes in mind, and therefore there may be different
abstracts to serve different purposes. Borko and Bernier (1978) have identified four
different types of abstracts: the indicative abstract, the informative abstract, the critical
abstract and the special purpose abstract.

An informative abstract is intended to provide readers with quantitative and qualitative
information as presented in the parent document. Informative abstracts may include
information on purpose, scope, and methods, as well as the results and findings. Informative
abstracts are often longer and are difficult to write, but they often save the user the time
necessary to consult the original work. Informative abstracts are generally used for
documents pertaining to experimental investigations, inquiries, or surveys.

While most abstracts describing experimental work can be constructed in this sequence,
the optimum sequence may depend on the audience for whom the abstract is primarily
intended. For example, a results-oriented arrangement, in which the most important
results and conclusions are placed first, may be useful to some audiences.

An indicative abstract simply indicates what the parent document is about. Indicative
abstracts are also called descriptive abstracts, because they usually describe what can be
found in the original document. Indicative abstracts may contain information on purpose,
scope or methodology, but not on results, conclusions or recommendations. Thus, an
indicative abstract is unlikely to serve as a substitute for the original document.

Indicative abstracts are best used for less-structured documents, such as editorials, essays,
opinions or descriptions; or for lengthy documents, such as books, conference proceedings,
directories, bibliographies, lists, and annual reports. Indicative abstracts are usually
written for documents that do not contain information relating to methodology or results.
The abstract should, however, describe the purpose or scope of the discussion or
descriptions in the document. Also, it may describe essential background material, the
approaches used, and/or arguments presented in the text.

A critical abstract contains some kind of critical comments or review by the abstractor. For
indicative and informative abstracts, abstractors usually function as impartial reporters,
whereas in a critical abstract, the abstractor deliberately includes his/her own opinion and
interpretations. Preparation of critical abstracts requires subject expertise and is a time-
consuming job.

Some abstracts may have been written to serve a special purpose or with a specific
category of users in mind. Such abstracts are called slanted or special purpose abstracts.
Depending on the nature of the target user group, an abstractor may stress some part of the
abstract (with more emphasis on informativeness) at the expense of some other part(s)
(leading to an indicative abstract for that part). Some abstracts may have a slant towards
some part of the subject dealt with in the original document; these are particularly useful
for mission-oriented works rather than discipline-oriented works.

Lancaster (2001) suggests that another category of abstract can be identified in this group,
called the modular abstract. Here an abstractor is expected to prepare different
kinds of abstracts – indicative, informative, critical, and so on – any one of which may be
used depending on the requirement of the abstracting agency. In fact, the abstractor writes
various modules of abstract at the same time. Modular abstracts are intended as full
content descriptions of current documents in five parts: a citation, an annotation, an
indicative abstract, an informative abstract, and a critical abstract. The prime purpose of
modular abstracts is to eliminate the duplication and waste of intellectual effort in the
independent abstracting of the same documents by several services, without any attempt
to force ‘standardized’ abstracts on services whose requirements may vary considerably as
to form and subject slant.

Abstracts by form

Three other kinds of abstract can also be identified: the structured abstract, the
mini-abstract, and the telegraphic abstract. A structured abstract may have a frame and
slots that are to be filled in with information taken from the original document. This type
of abstract is valuable in the compilation of handbooks summarizing a large number of
studies performed in a particular field. The term mini-abstract refers to a highly structured
abstract designed primarily for searching by computer. In fact, it is a cross between an
abstract and an index and can be called a ‘machine-readable-index-abstract’. The term
telegraphic abstract refers to an abstract that contains brief statements (as opposed to
complete sentences), and thus the resulting abstract looks like telegraphic text.
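The frame-and-slots idea behind a structured abstract can be illustrated with a short Python sketch. It is only an illustration: the class name `StructuredAbstract`, the five slot names and the sample values are assumptions made for this example and are not taken from any abstracting standard or from the sources cited here.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StructuredAbstract:
    """A frame whose slots are filled from the original document.

    The slot names are illustrative assumptions, not a standard.
    """
    purpose: Optional[str] = None
    scope: Optional[str] = None
    methods: Optional[str] = None
    results: Optional[str] = None
    conclusions: Optional[str] = None

    def empty_slots(self) -> List[str]:
        """Names of the slots the abstractor has not filled yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def render(self) -> str:
        """Lay out the filled slots as labelled lines, one per slot."""
        return "\n".join(
            f"{f.name.capitalize()}: {getattr(self, f.name)}"
            for f in fields(self)
            if getattr(self, f.name) is not None
        )


# Invented sample values, used only to show the frame being filled in.
abstract = StructuredAbstract(
    purpose="Compare manual and automatic indexing of journal articles.",
    methods="The same set of articles was indexed by both approaches.",
)
print(abstract.render())
print("Still to fill:", abstract.empty_slots())
```

Because every abstract built on the same frame has the same slots, a compiler of a handbook could line up the entries for many studies slot by slot, which is the use the paragraph above describes.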

QUALITIES OF ABSTRACTS

An abstract must be brief and accurate and it must be presented in a format designed to
facilitate the skimming of a large number of abstracts in a search for relevant material.
Guinchat and Menou in Chowdhury (2004) suggest that an abstract should possess the
following qualities.

1. Concision: however long the abstract is, care should be taken to avoid expressions
or circumlocutions that can be replaced by single words, but this should not be done at the
expense of precision.

2. Precision: One should use expressions that are exact and as specific as possible
without exceeding the abstract’s requested length.

3. Self-sufficiency: The description of the document should be complete in itself and
fully understandable without reference to any other document.

4. Objectivity: There must not be any personal interpretation or value judgment on
the part of the abstractor (obviously this does not apply to critical abstracts).

PROCEDURES FOR ABSTRACTING

According to Aina (2004), when an abstractor has decided to abstract a document, the
following steps must be followed:

Step 1: The reference of the document must be accurately recorded. It is essential that the
author, title, date of publication, place of publication and publisher (if it is a book or
monograph), title of the journal, volume number and issue number (if it is a journal
article), pagination, affiliation of the author, etc. be recorded accurately. This is
necessary because if the bibliographic details are wrong, it will be very difficult to locate or
access the material.

Step 2: The next stage involves becoming familiar with the document and gaining a good
grasp of what the author has put down in it. This involves going through the preface,
the introduction, the different parts of the document, etc. In some cases it might be
necessary to read the whole document, but in many situations skim reading the document
will give the abstractor an adequate understanding of it.

Step 3: Having become familiar with the document, the abstractor is now in a position to
make notes, in his/her own words, of what he/she considers to be the essential contents of
the document, ensuring that these words convey information. This is done in a
narrative form. The subject analysis involves summarizing the essential points of the
document, noting important areas such as the scope, the objectives of the study, the
significant contributions, the special features of the document, the trends of the book, etc.
When the abstract is completed it should be edited to ensure that there are no grammatical,
spelling or punctuation errors. The reference can be rechecked at this stage to
ensure accuracy.
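The completeness check implied by Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Reference` class, its field names, and the split into "book" and "journal" requirements are hypothetical names chosen for this example, following the details listed in Step 1. The sample entry is taken from the reference list of this chapter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reference:
    """Bibliographic details recorded in Step 1 (field names are assumptions)."""
    doc_type: str            # "book" or "journal"
    author: str = ""
    title: str = ""
    year: str = ""
    pagination: str = ""
    affiliation: str = ""
    place: str = ""          # books and monographs
    publisher: str = ""      # books and monographs
    journal_title: str = ""  # journal articles
    volume: str = ""         # journal articles
    issue: str = ""          # journal articles


# Details needed for every document, then those that depend on its type.
COMMON = ["author", "title", "year", "pagination", "affiliation"]
BY_TYPE = {
    "book": ["place", "publisher"],
    "journal": ["journal_title", "volume", "issue"],
}


def missing_fields(ref: Reference) -> List[str]:
    """Return the required details that are still blank for this document type."""
    required = COMMON + BY_TYPE.get(ref.doc_type, [])
    return [name for name in required if not getattr(ref, name).strip()]


ref = Reference(
    doc_type="journal",
    author="Rabe, D.",
    title="Are abstracting and indexing databases still relevant",
    year="2002",
    pagination="80-82",
    journal_title="The Indexer",
    volume="23",
)
print(missing_fields(ref))  # the affiliation and issue number were not recorded
```

Running such a check again after editing the abstract corresponds to the final recheck of the reference mentioned in Step 3.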
AUTOMATIC ABSTRACTING

With the rapid increase in the availability of full text and multimedia information in digital
form, the need for automatic abstracts or summaries as a filtering tool is becoming extremely
important. Craven (2000) proposes a hybrid abstracting system in which some
tasks are performed by human abstractors and others by abstractor's assistance software
called Textnet.

Automatic abstracting and text summarization are now used synonymously to describe
systems that generate abstracts or summaries of texts. In simple abstracting or
summarization systems, parts of texts (sentences or paragraphs) are selected automatically
based on some linguistic or statistical criteria to produce the abstract. More sophisticated
systems may merge two or more sentences, or parts thereof, to generate one coherent
sentence, or may generate simple summaries from discrete items of data.
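The simplest kind of system described above, one that selects sentences by a statistical criterion, can be sketched in Python. This is a minimal illustration and not the method of any system cited in this chapter: the criterion used (average frequency of a sentence's content words within the document), the small stop list, and the function names are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import List

# A tiny stop list; a real system would use a much larger one (assumption).
STOP_WORDS = {"a", "an", "and", "are", "as", "be", "by", "for", "in", "is",
              "it", "of", "on", "or", "that", "the", "to", "with"}


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence: str) -> List[str]:
    """Lower-cased words of the sentence with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def extract(text: str, n: int = 2) -> List[str]:
    """Pick the n highest-scoring sentences and return them in document order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence: str) -> float:
        words = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]


sample = ("Abstracts save time. Abstracts help users select documents. "
          "The cat sat quietly. Users read abstracts before documents.")
for sentence in extract(sample, n=2):
    print(sentence)
```

Strictly speaking, the output of such a program is an extract rather than an abstract in the sense defined earlier, since every sentence is lifted verbatim from the document. Merging or rewriting sentences, as the more sophisticated systems do, is beyond this sketch.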

Interest in automatic abstracting and text summarization is reflected by the large number
of research papers that appeared at international conferences and workshops at the
beginning of the 21st century. As the volume of full text and multimedia information in
digital form continues to grow, so does the need for automatic abstracts or summaries as
a filtering tool.

CONCLUSION

The existence of machine-readable databases, coupled with the ability to access these
resources online, has created a virtual revolution in information services. Online systems
have offered several dramatic benefits to libraries. First and foremost, they have allowed
libraries that formerly had no strong traditions in literature searching to offer literature
searching services of a high quality. Academic libraries, hospital libraries, small libraries in
general, and, more recently, some public libraries have all benefited in this way.

The major positive force affecting abstracting and indexing (A&I) services has obviously
been the use of computer technology to generate machine-readable databases from which
printed publications could be produced. While this development may have kept the cost of
the printed product from rising even more rapidly than it has, it has produced an even
greater benefit: the ability to use the same database to provide other information
services (group and personalized SDI and retrospective searching on demand) and to
generate more specialized publications.

In summary, the growth and development of abstracting and indexing services have several
important implications for the information organization. They demonstrate some virtues of
the book format. They accentuate the primary importance of subject cataloging. They call
into question the fundamental principles of descriptive cataloging as it is now commonly
practiced. They serve as pathfinders in the accelerating drive toward finding suitable
machine solutions to the most important library problem.

REFERENCES

ANSI/NISO Z39.14-1997. Guidelines for Abstracts. Bethesda: NISO Press.

Aina, L.O. (2004) Library and Information Science Text for Africa. Ibadan: Third World
Information Service, pp.204-206. ISBN 978- 32836-1-8

Anthony, L. (2006). AntConc 3.1.2 Concordance Generation Software,
http://www.antlab.sci.waseda.ac.jp/. Accessed 8th June 2013

Borko H. & Bernier C (1978) Concepts and Methods. New York, Academic Press.

Chan, L.M. (2005) Library of Congress Subject Headings: Principles and Application. 4th
Ed. Westport: Libraries Unlimited.

Chowdhury, G.G. (2004) Introduction to Modern Information Retrieval. 2nd Ed. London:
Facet Publishing.

Cleverdon, C.W. (1984) Optimizing Convenient Online Access to Bibliographic Database.
Information Services and Use. 4, pp. 37-42.

Craven, T. (1986) String Indexing. Orlando, Florida: Academic Press.

Foskett, A. (1996) Subject Approach to Information. 5th Ed. London: Library Association
Publishing.

Fugmann, R. (1980) On the Practice of Indexing and its Theoretical Foundation.
International Classification. 7(1) 13-20.

Fugmann, R. (1993) Subject Analysis and Indexing: Theoretical Foundation and Practical
Advice. Frankfurt: Indeks Verlag.

Goldstein, J., Mittal, V. & Carbonell, J. (1998) Summarizing Text Documents: Sentence
Selection & Evaluation Metrics. In Proceedings of the 23rd Annual International
Conference on Research and Development in Information Retrieval. ACM. pp. 121-128.

Groxis, Inc. Grokker software, http://www.groxis.com. San Francisco: Groxis, Inc.

http://dspace.hil.unb.ca:8080/bitstream/handle/1882/1021/Shelly-Lukon.pdf?sequence=1
Accessed on the 6th of June 2013

Indexing Research. CINDEX Indexing Software, New York, NY, http://www.indexres.com.
New York: Indexing Research.

International Organization for Standardization. 1994. Guidelines for the content,
organization and presentation of indexes. (ISO 199-1994). Geneva: ISO.

Lancaster, F.W. (2003) Indexing and Abstracting in Theory and Practice. 3rd Ed. London:
Facet Publishing.

Lancaster, F. & Warner, A. (2001) Intelligent Technologies in Library and Information
Service Application. ASIST Monograph Series. Medford, NJ: Information Today Inc.

Lancaster, W. (1979) Information Retrieval Systems: Characteristics, Testing and
Evaluation. 2nd Ed. New York: John Wiley & Sons.

Macleod, I. (1990) Storage and Retrieval of Structural Approach. Information Processing
and Management. 26(2) pp. 197-205.

Macrex Indexing Services. (2005). Macrex Indexing Program, http://www.macrex.com.
Daly City, CA: MACREX. Accessed 8th of June 2013

Matanji, P. (2012) In-house indexing of periodical literature: a study of University Libraries
in Kenya. Master’s thesis submitted to the University of South Africa.

Mulvany, N., Milstead, J. (1994) “Indexicon, The Only Fully Automatic Indexer: A Review,”
http://www.bayside-indexing.com/idxcon.htm. Accessed 8th of June 2013

National Information Standards Organization. 1993. Proposed American National Standard.
Guidelines for indexers and related information retrieval devices. Bethesda: NISO.

Poux, M. & Ledocay, V. (2000). Understanding of Medico-Technical report: Artificial
Intelligence in Medicine. pp. 149-172.

Quinn, B. (1994) Recent Theoretical Approaches in Classification and Indexing. Knowledge
Organization. 21(3) 140-147.

Rabe, D. 2002. Are abstracting and indexing databases still relevant. The Indexer
23(2):80-82.

Rowley, J. (1994). ‘Aspects of a Library Systems Methodology’, Journal of Information
Science, 20 (1), 41-45.

Rowley, J. (1996). The basics of information systems, 2nd ed, London, Library Association
Publishing.

Rowley, J., (1998). The electronic library, fourth edition of Computers for Libraries, London,
Library Association Publishing.

Salton, G. (1989). Automatic Text Processing: the Transformation, Analysis and Retrieval of
Information by Computer, Reading, MA, Addison-Wesley.

Schwartz, C. & Eisenmann, L. (1986) Subject Analysis. Annual Review of Information Science
and Technology, 21, pp. 37-61.

Schwartz, C. (2001). Sorting out the web: Approaches to Subject Access, Westport, Ablex
Publishing.

Silber, H. & McCoy, K. (2000) Efficient Text Summarization using Lexical Chains. In H.
Liebermann. User Interfaces. New Orleans, A.I, New York.

SKY Software. SKY Index 6.0 Professional Edition, http://www.sky-software.com. Stephens
City, VA: SKY Software.

Soergel, D. 1994. Indexing and retrieval performance: the logical evidence. Journal of the
American Society for Information Science 45(8):589-599.

Taylor, A. & Joudrey, D. (2009) The Organization of Information. 3rd Ed. Westport:
Libraries Unlimited.

Tiamiyu, M. (2003) Organization of Data in Information Systems: A Synthesis for the
Information Professions. Ibadan: Stirling-Horden Publishers. ISBN 978-032-077-6

Wellisch, HH. 1994. Book and periodical indexing. Journal of the American Society for
Information Science 45(8):620-627.

Xin Lu (1990) Document Retrieval: A Structural Approach. Information and Management.
26(2) 209-218.
