In text mining, finding the significance of each term in a text is an important problem. In a two-class setting, relevance frequency is used to compute the discriminating power of a term for text categorization. Relevance frequency has shown better performance than other term weighting methods on a number of datasets. However, relevance frequency is based on intuitive considerations. In this paper, we develop a probabilistic model to explain relevance frequency. The model provides a theoretical foundation for relevance frequency.
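As a rough illustration of the weighting scheme this abstract refers to, the sketch below computes relevance frequency in the form commonly given in the term-weighting literature, rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive-class and negative-class documents containing the term. The abstract itself does not state the formula, so this form and the function and parameter names are assumptions made for illustration, not the paper's own notation.

```python
import math


def relevance_frequency(pos_docs_with_term: int, neg_docs_with_term: int) -> float:
    """Relevance frequency of a term in a two-class setting.

    Assumed form: rf = log2(2 + a / max(1, c)), where
    a = positive-class documents containing the term and
    c = negative-class documents containing the term.
    """
    # max(1, c) avoids division by zero when the term never occurs
    # in the negative class.
    return math.log2(2 + pos_docs_with_term / max(1, neg_docs_with_term))


if __name__ == "__main__":
    # A term concentrated in the positive class gets a higher weight
    # than a term that is absent from it.
    print(relevance_frequency(14, 0))
    print(relevance_frequency(0, 5))
```

A term that appears only in negative-class documents receives the minimum weight of 1, while the weight grows with the ratio of positive to negative occurrences.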
Question Answering Systems (QASs) are tools for retrieving precise answers to user questions from a large set of text documents. Researchers from the information retrieval and natural language processing communities have put tremendous effort into improving the performance of QASs across several languages. However, Hindi, the fourth most spoken language, has not seen enough development in question answering for information seekers to accept QASs as a good alternative to search engines. In this chapter, a pipelined architecture for the development of QASs is explained in the context of the English and Hindi languages. The chapter also reviews the developments taking place in Hindi QASs and explains the challenges faced by researchers in developing them. To encourage and support new researchers in conducting research on Hindi QASs, the techniques, tools, and linguistic resources required to implement the components of a QAS are described in a simple and persuasive manner. Finally, future directions for research in Hindi QASs are proposed.
In the past decade, we have witnessed a tremendous rise in the use of electronic devices in education. From nursery classes at the preschool level to postgraduate programs at universities, electronic devices are being used extensively to enhance and facilitate the quality of education. Although the use of computer networks is an inherent feature of online learning, traditional schools and universities are also making extensive use of network-connected electronic devices such as mobile phones, tablets, and computers. However, it is humanly impossible to analyze the enormous volume of data generated by the active usage of devices connected through a large network. Educators and academic administrators can learn from their counterparts in business and service industries, where a complex system of methods and techniques, usually referred to as data analytics or data mining, is used to analyze a large influx of real-time data for decision-making. Researchers have started paying attention to the application of data mining and data analytics to handle the big data generated in the educational sector. In the context of education, these techniques are specifically referred to as Educational Data Mining (EDM) and Learning Analytics (LA). Generally, EDM looks for new patterns in data and develops new algorithms and/or new models, while LA applies known predictive models in instructional systems. This chapter starts by describing the major EDM and LA techniques used in handling big data in commercial and other activities. It also provides a brief description of how EDM and LA are affecting the typical stakeholders of a higher education institution. Furthermore, the chapter provides a detailed account of how these techniques are used to analyze the learning process of students, assess their performance, and provide them with detailed feedback in real time. The technologies can also assist in planning administrative strategies to provide quality services to all stakeholders of an educational institution. Not all the stakeholders involved in providing education are experts in big data. However, to meet their analytical requirements, researchers have developed easy-to-use data mining and visualization tools. In this chapter, we provide the necessary details of some of these tools. Institutions and governments across the world are adopting EDM and LA to frame strategic policies, to understand students' learning behaviors, and so on. As case studies, we also discuss some implementations of EDM and LA techniques in universities in different countries.
Question Answering Systems (QASs) have emerged as a good alternative for information seekers to retrieve precise information over the Internet. A good amount of research has been done to improve the performance of QASs across several languages, including European and Asian languages. However, Arabic, a morphologically rich Semitic language spoken by over 422 million people, has not seen similar development in the field of question answering. This article reviews the developments taking place in Arabic QASs as well as the challenges faced by researchers in developing them. After an extensive literature survey of a number of English and Arabic QASs, the article classifies them according to several criteria. The most commonly used architecture for the development of an Arabic QAS, known as the pipeline architecture, is presented. To encourage and support new researchers and scholars in conducting research on Arabic QASs, the techniques, tools, and computational linguistic resources required to implement the components of the presented pipeline architecture are described in a simple and persuasive manner. Finally, a gap analysis between the research in Arabic and English QASs is performed and, accordingly, some future directions for research in Arabic QASs are proposed.
Social media is gaining a lot of importance among business houses, academicians, medical practitioners, politicians, and others, due to its role in creating awareness about products, services, and socio-political views. The end users of these products, services, and views provide their feedback in the form of comments. An accurate determination of the sentiments of end users is crucial in designing future policies and plans for products and services. As the processing power and storage capacities of computers have increased severalfold, researchers can focus more on the accuracy of sentiment detection than on the consumption of computational resources. In this paper, we apply a set of heuristics to analyse sentiments using freely available dictionary resources and open source tools. We have tested these heuristics on a large data set collected from standard sources. The experimental results are promising and open new research directions in dictionary-based sentiment analysis.
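To make the idea of dictionary-based sentiment analysis concrete, here is a minimal sketch of the general technique. The abstract does not list its heuristics or name its dictionary resources, so the toy lexicon, the negator list, and the single negation heuristic below are all hypothetical stand-ins rather than the paper's method.

```python
import re

# Hypothetical toy lexicon; a real system would load a published
# sentiment dictionary instead.
LEXICON = {"good": 1, "great": 2, "excellent": 2,
           "bad": -1, "poor": -1, "terrible": -2}
# Hypothetical negator list used by the one heuristic shown here.
NEGATORS = {"not", "no", "never"}


def sentiment_score(comment: str) -> int:
    """Sum the lexicon polarities of the words in a comment.

    Heuristic: a negator flips the polarity of the next
    sentiment-bearing word that follows it.
    """
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return score


def sentiment_label(comment: str) -> str:
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `sentiment_label("The service was not good")` is classified as negative because the negator flips the polarity of "good". Additional heuristics, such as intensifiers or emoticon handling, would slot into the same loop.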
Social media has become an integral part of modern life, as it has become one of the main instruments for sharing information, helping people, and getting in touch with other people. Billions of people use social networking websites such as Facebook, Twitter, and LinkedIn and generate huge amounts of data every day. Teaching, particularly in higher education, has become more student-centered, triggering the need to employ social media technologies through which students can be linked to their institutions and can access teaching material as well as news and announcements. Analyzing the behavior of users on social media is a complex task. In this paper, we present an empirical analysis of some important aspects of user behavior on social networks, highlighting many interesting features of this behavior. This paper is the first part of a series of research work investigating the implementation of mobile learning through the use of social media technologies in teaching and learning in the higher education sector.
Question Answering Systems have emerged as a good alternative to search engines, as they produce the desired information very precisely and in real time. However, one serious concern is that, despite having the answers to questions in the knowledge base, these systems may fail to retrieve them because of a mismatch between the words used by users and those used by content creators. There has been a lot of research on Question Answering Systems for English and some European languages to handle this issue. However, Arabic Question Answering Systems have not kept pace, due to some inherent difficulties of the language itself as well as the lack of tools available to assist researchers. In this paper, we present a method that adds semantically equivalent keywords to questions by using semantic resources. The experiments suggest that the proposed method can deliver highly accurate answers for Arabic questions.
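The idea of adding semantically equivalent keywords to a question can be sketched as follows. The abstract does not name its semantic resources or describe the expansion procedure, so the synonym table, the stopword list, and the function name here are hypothetical, and English words stand in for Arabic ones purely for readability.

```python
# Hypothetical synonym table standing in for a real semantic resource.
SYNONYMS = {
    "author": ["writer"],
    "capital": ["seat of government"],
}
# Hypothetical stopword list; a real system would use a language-specific one.
STOPWORDS = {"who", "what", "is", "the", "of"}


def expand_question(question: str,
                    synonyms: dict[str, list[str]] = SYNONYMS,
                    stopwords: set[str] = STOPWORDS) -> list[str]:
    """Return the question keywords followed by their equivalents.

    Order is preserved and duplicates are dropped, so the original
    keyword always comes before the terms added for it.
    """
    words = question.lower().rstrip("?").split()
    keywords = [w for w in words if w not in stopwords]
    expanded: list[str] = []
    for keyword in keywords:
        for term in [keyword] + synonyms.get(keyword, []):
            if term not in expanded:
                expanded.append(term)
    return expanded
```

With this table, "Who is the author of Hamlet?" expands to the terms author, writer, and hamlet, so a passage that uses "writer" rather than "author" can still be matched.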
The International Conference on Information and Communication Technology Research (ICTRC), 2015
Due to the very fast growth of information in the last few decades, getting precise information in real time is becoming increasingly difficult. Search engines such as Google and Yahoo help in finding information, but they return it in the form of documents, which consumes a lot of the user's time. Question Answering Systems have emerged as a good alternative to search engines, as they produce the desired information very precisely and in real time. This saves the user a lot of time. There has been a lot of research on Question Answering Systems for English and some European languages. However, Arabic Question Answering Systems have not kept pace, due to some inherent difficulties of the language itself as well as the lack of tools available to assist researchers. Question classification is a very important module of a Question Answering System. In this paper, we present a method to accurately classify Arabic questions in order to retrieve precise answers. The proposed method gives promising results.
This paper presents a new non-iterative algorithm to optimize the generation schedule of thermal power plants in a power system. The operating cost of the different generators in the plant is modeled as an exponential function. A generalized formula for n thermal units that takes transmission losses into account is proposed to find the optimum generation schedule. To demonstrate the effectiveness of the proposed algorithm, a sample system consisting of six thermal generators is considered. The performance of the proposed algorithm is compared with the quick method developed for quadratic cost functions. It is observed that the results obtained by the proposed algorithm are more accurate and that the algorithm takes less processor time to run. The proposed algorithm is therefore suitable for real-time applications where accuracy and time are the two main factors.
To increase web search effectiveness, meta-search engines were invented to combine the results of multiple search engines and thereby cover a larger portion of the indexed web. A meta-search engine is a system that lets internet users take advantage of multiple search engines when searching for information. Recently, several approaches have been developed using ontology and ranking measures. Accordingly, a meta-search engine is developed here using ontology and a semantic similarity measure. To bring semantics into keyword matching, a semantic similarity measure (SSM) is developed. Every concept set is matched with the title sets using the SSM, which considers the hypernyms and hyponyms of the keywords present in the title sets. In addition, three different ranking measures, based on contents, title sets, and the ranking value given by the standard search engines, are combined to improve effectiveness. Finally, experiments are carried out using different sets of queries, and the performance of the meta-search engine is evaluated using the TREC-style average precision (TSAP) measure. The proposed semantic meta-search engine achieves 80% TSAP, which is high compared with the existing search engine and meta-search engine.
Image segmentation is an important part of image recognition systems and has been successfully used in various fields such as medical imaging and finger ridge, retina, and face recognition. In this paper, we propose a novel hybrid method for image segmentation that segments all constituent objects of the image under consideration using a combination of fuzzy c-means (FCM) and a boundary tracking mathematical modeling technique named the level set method (LSM). In the proposed method, a contour obtained by the FCM method serves as the initial contour for an improved LSM. Finally, experimental results validate the effectiveness of the proposed combined method for image segmentation.
Image segmentation is a growing field and has been successfully applied in various areas such as medical imaging and face recognition. In this paper, we propose a method for image segmentation that combines a region-based artificial intelligence technique named fuzzy c-means (FCM) with a boundary-based mathematical modeling technique, the level set method (LSM). In the proposed method, the contour of the image is obtained by the FCM method and serves as the initial contour for the LSM. The final segmentation is achieved using the LSM, which uses a signed pressure force (SPF) function for active control of the contour.
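The FCM stage mentioned in this abstract can be illustrated with the standard fuzzy c-means updates applied to one-dimensional pixel intensities. This is a minimal sketch of textbook FCM only: the function name, the evenly spaced initialisation, and the toy data are assumptions, and the level set stage that the abstract applies afterwards is not shown.

```python
def fcm_1d(values, n_clusters=2, m=2.0, n_iter=50):
    """Standard fuzzy c-means on 1-D values (requires n_clusters >= 2).

    Returns (centers, memberships); memberships[j][i] is the degree to
    which values[j] belongs to cluster i, as computed in the last pass.
    """
    lo, hi = min(values), max(values)
    # Assumed deterministic initialisation: centres spread evenly over the range.
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # A point lying exactly on a centre belongs fully to it.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                       for d_i in dists]
            memberships.append(row)
        # Centre update: mean of the values weighted by membership^m.
        centers = [
            sum((u[i] ** m) * x for u, x in zip(memberships, values))
            / sum(u[i] ** m for u in memberships)
            for i in range(n_clusters)
        ]
    return centers, memberships


if __name__ == "__main__":
    # Toy intensities: a dark region and a bright region.
    pixels = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
    centers, memberships = fcm_1d(pixels)
    print(centers)
```

In a segmentation pipeline of the kind described, thresholding the membership map of one cluster (for instance at 0.5) would give a region whose boundary could serve as the initial contour handed to the level set stage.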
An open domain question answering system is one of the emerging information retrieval systems available on the World Wide Web, and such systems are becoming more popular day by day as a way to get succinct and relevant answers in response to users' questions. Validating the correctness of the answer is an important issue in the field of question answering. In this paper, we propose a World Wide Web based solution for answer validation in which answers returned by open domain Question Answering Systems are validated using online resources such as Wikipedia and Google. We applied several heuristics to the answer validation task and tested them against some popular World Wide Web based open domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. We found that the proposed method yields promising results for the automatic answer validation task.
Social networking is becoming a necessity for the current generation because it is useful in several ways, such as finding people around the world who share the user's interests, gathering information on different topics, and many other purposes. In a social network, abundant information is available on different domains from a variety of users, but it is very difficult to find information that matches a user's preferences. It is also quite possible that relevant information is available in different forms from other users connected in the same network. In this paper, we propose a computationally efficient rough set based method for ranking documents. The proposed method first expands the user query using WordNet and domain ontologies and then retrieves documents containing relevant information. The distinctive point of the proposed algorithm is that, to retrieve relevant information, it places more emphasis on concept combination, based on the presence and position of concepts, than on term frequencies. We experimented over a set of standard questions collected from TREC, Wordbook, and WorldFactBook and retrieved documents using Google and our proposed method. We observed a significant improvement in the ranking of retrieved documents.
Social networking portals such as Twitter, Facebook, and LinkedIn are becoming more popular day by day among the user community, and many more such portals are attracting users' attention by catering to their specific needs. Users write blogs on these social networking websites on a variety of topics according to their own and other users' interests. Through social networking blogs, a large amount of interesting information is scattered across the Web, and it could be structured in a meaningful way to provide better services. The objective of this paper is to categorize blog content into ten in-demand themes (Technology, Entertainment, News, Business, Health, Sports, Tourism, Widgets, Vehicles, and Products) for effective retrieval of information from the categorized content. Further, a user can also search by entering a specific query to retrieve information from blogs. In this paper, we propose a blog content theme expansion approach based on WordNet and multiple ontologies, together with a concept combination based ranking algorithm, for a blog content based recommendation framework that takes the original themes of blog content as input and recommends conceptually related expanded themes. The distinctive point of this research is its use of a concept combination approach based on rough sets to categorize retrieved results for the in-demand themes as well as for user-specific preferences. This kind of blog content categorization would be very effective for retrieving meaningful and conceptually related blog information written by a large number of users using different vocabularies. We experimented with the contents of top blogs related to each theme and obtained very good results.
Question Answering Systems play a significant role in retrieving exact answers to users' specific questions. In the answer retrieval process, they employ query expansion methods, which play a major role in expanding the scope of the original questions in the correct sense. In this paper, we carry out an extensive survey of a few popular web-based open domain Question Answering Systems and critically evaluate their performance on a set of 300 questions from 30 different domains collected from standard resources including TREC. On the basis of the findings, we suggest an efficient query expansion framework that uses multiple ontologies retrieved from a semantic web search engine such as Swoogle and combines them with WordNet to disambiguate the context. The proposed approach successfully constructs a conceptual query for users' questions to retrieve relevant answers. We experimented on a set of 300 questions to judge the effectiveness of the proposed approach.
Question Answering Systems, unlike search engines, provide answers to users' questions in succinct form, which requires prior knowledge of the user's expectations. The question classification module of a Question Answering System plays a very important role in determining these expectations. In the literature, incorrect question classification has been cited as one of the major factors behind the poor performance of Question Answering Systems, which emphasizes the importance of designing the question classification module well. In this article, we propose a question classification method that exploits the powerful semantic features of WordNet and the vast knowledge repository of Wikipedia to describe informative terms explicitly. We trained our system over a standard set of 5500 questions (by UIUC) and then tested it over five TREC question collections. We compared our results with some standard results reported in the literature and observed a significant improvement in the accuracy of question classification. The question classification accuracy suggests the effectiveness of the method, which is promising in the field of open-domain question classification. Judging the correctness of the answer is an important issue in the field of question answering. In this article, we extend question classification as one of the heuristics for answer validation. We propose a World Wide Web based solution for answer validation in which answers returned by open-domain Question Answering Systems are validated using online resources such as Wikipedia and Google. We applied several heuristics to the answer validation task and tested them against some popular web based open-domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. The proposed method appears promising for the automatic answer validation task.
On the World Wide Web, the open domain Question Answering System is one of the emerging information retrieval systems that are becoming more popular day by day as a way to get succinct, relevant answers in response to users' questions. In this paper, we address a rough set based method for document ranking, which is one of the major tasks in presenting retrieved results and directly contributes to the accuracy of a retrieval system. Rough sets are widely used for document categorization, vocabulary reduction, and other information retrieval problems. We propose a computationally efficient rough set based method for ranking documents. The distinctive point of the proposed algorithm is that it places more emphasis on the presence and position of the concept combination than on term frequencies. We experimented over a set of standard questions collected from TREC, Wordbook, and WorldFactBook using Google and our proposed method. We found a 16% improvement in document ranking performance. Further, we compared our method with the online Question Answering System AnswerBus and observed a 38% improvement in placing relevant documents in the top ranks. We conducted further experiments to judge the effectiveness of the information retrieval system and found satisfactory performance results.
Query expansion plays a very important role in enhancing the performance of Question Answering Systems. Several methods have been proposed for query expansion, and the use of ontologies has been the latest and most popular choice among researchers because of its effectiveness in building a conceptual query. However, most of these query expansion methods have used either a single domain-specific ontology or WordNet, and no research work has reported the use of ontologies and WordNet together. In the context of World Wide Web based Question Answering Systems, the use of a single ontology or WordNet alone has not proved sufficient to retrieve a wide variety of heterogeneous information. In this paper, we propose an efficient query expansion method that uses multiple ontologies retrieved from a semantic web search engine such as Swoogle and combines them with WordNet to disambiguate the context. We experimented on a set of 300 questions collected from TREC and other resources to judge the accuracy of the proposed method. We report results using Google as well as a few existing popular web-based Question Answering Systems such as START, AnswerBus, BrainBoost, and Inferret.
— In text mining finding the significance of each term is a text is an important problem. In a tw... more — In text mining finding the significance of each term is a text is an important problem. In a two-class setting, relevance frequency is used to compute the discriminating power of a term for text categorization. Relevance frequency has shown better performance than other term weighting methods on number of datasets. However, relevance frequency is based on intuitive considerations. In this paper, we develop a probabilistic model to explain relevance frequency. The model provides a theoretical foundation for relevance frequency.
Question Answering Systems (QAS) are tools to retrieve precise answers for user questions from a ... more Question Answering Systems (QAS) are tools to retrieve precise answers for user questions from a large set of text documents. Researchers from information retrieval and natural language processing community have put tremendous efforts to improve the performance of QASs across several languages. However, Hindi, the fourth most spoken language has not seen a proportional development in the field of question answering to an extent that information seekers accept QASs as a good alternative of search engines. In this chapter, a pipelined architecture for the development of QASs has been explained in the context of English and Hindi languages. This chapter also reviews the developments taking place in Hindi QASs while explaining the challenges faced by researchers in developing Hindi QASs. To encourage and support the new researchers in conducting researches in Hindi QASs, a list of techniques, tools and linguistic resources required to implement the components of a QAS are described in this chapter in a simple and persuasive manner. Finally, the future directions for research in Hindi QASs have been proposed.
In the past decade, we have witnessed a tremendous rise in the use of electronic devices in educa... more In the past decade, we have witnessed a tremendous rise in the use of electronic devices in education. Starting from nursery classes at the preschool level to the postgraduate programs at the universities, electronic devices are being used extensively to enhance and facilitate quality of education. Although the use of computer networks is an inherent feature of online learning, the traditional schools and universities are also making extensive use of network-connected electronic devices such as mobile phones, tablets, and computers. However, it is humanly impossible to analyze enormous volume of data generated from the active usage of devices connected through a large network. The educators and academic administrators can benefit from their counterparts in business and service industries where a complex system of methods and techniques, usually referred as data analytics or data mining, are being used to analyze a large influx of real-time data in decision-making. Researchers have started paying attention to the application of data mining and data analytics to handle big data generated in the educational sector. In the context of education, these techniques are specifically referred to as Educational Data Mining (EDM) and Learning Analytics (LA). Generally, EDM looks for new patterns in data and develops new algorithms and/or new models, while LA applies known predictive models in instructional systems. This chapter starts by describing major EDM and LA techniques used in handling big data in commercial and other activities. It will also provide a brief description of how EDM and LA are affecting the typical stakeholders of a higher education institution. 
Furthermore, the chapter will provide a detailed account of how these techniques are used to analyze the learning process of students, assessing their performance and providing them with detailed feedback in real-time. The technologies can also assist in planning administrative strategies to provide quality services to all stakeholders of an educational institution. Not all the stakeholders involved in providing education are experts of big data. However, in order to meet their analytical requirements, the researchers have developed easy-to-use data mining and visualization tools. In this chapter, we have provided the necessary details of some of these tools. The institutions/governments across the world are adopting EDM/LA to frame strategic policies, to understand the students learning behaviors etc. As case studies, we have also discussed some implementation of EDM and LA techniques in universities in different countries.
Question Answering Systems (QASs) have emerged as a good alternative for information seekers to r... more Question Answering Systems (QASs) have emerged as a good alternative for information seekers to retrieve precise information over the Internet. A good amount of research has been done to improve the performance of QASs across several languages, including European and Asian languages. However, Arabic, a morphologically rich Semitic language spoken by over 422 million people, has not seen similar development in the field of question answering. This article reviews the developments taking place in Arabic QASs as well as the challenges faced by researchers in developing Arabic QASs. After conducting an extensive literature survey of a number of English and Arabic QASs, this article classifies them according to several criteria. The most commonly used architecture for the development of an Arabic QAS, known as pipeline architecture, has been presented. In order to encourage and support the new researchers and scholars in conducting research in Arabic QASs, a list of techniques, tools, and computational linguistic resources, required to implement the components of the presented pipelined architecture, are described in this article in a simple and persuasive manner. Finally, the gap analysis between the research in Arabic and English QASs has been performed and accordingly, some future directions for research in Arabic QASs have been proposed.
—The social media is gaining a lot of importance among businesshouses, academicians, medical prac... more —The social media is gaining a lot of importance among businesshouses, academicians, medical practitioners, politicians, among others, due to its role in creating awareness about products, services, and socio-political views. The end users of these products, services, and views provide their feedbacks in the form of comments. An accurate determination of the sentiments of end users is crucial in designing policies and plans for products and services in future. As the processing power and storage capacities of computers have increased several folds, researchers can focus more on the accuracy of sentiment detection than consumption of computational resources. In this paper, we are applying a set of heuristics to analyse sentiments using freely available dictionary resources and open source tools. We have tested these heuristics over a large data set collected from standard sources. The experimental results are promising and opening new research directions in dictionary-based sentiment analysis.
Social media has become an integral part of modern life, as it is now one of the main instruments for sharing information, helping people, and getting in touch with others. Billions of people use social networking websites such as Facebook, Twitter, and LinkedIn and generate huge amounts of data every day. Teaching, particularly in higher education, has become more student-centered, triggering the need to employ social media technologies through which students can be linked to their institutions and can access teaching material as well as news and announcements. Analyzing the behavior of users on social media is a complex task. In this paper, we present an empirical analysis of some important aspects of user behavior on social networking sites. This paper is the first part of a series of research work investigating the implementation of mobile learning through the use of social media technologies in teaching and learning in the higher education sector.
Question Answering Systems have emerged as a good alternative to search engines because they produce the desired information very precisely and in real time. However, one serious concern is that, even when the answers to questions are present in the knowledge base, these systems may fail to retrieve them because of a mismatch between the words used by users and those used by content creators. There has been a lot of research on English and some European-language Question Answering Systems to handle this issue. However, Arabic Question Answering Systems have not kept pace because of some inherent difficulties with the language itself as well as the lack of tools available to assist researchers. In this paper, we present a method that adds semantically equivalent keywords to the questions by using semantic resources. The experiments suggest that the proposed method can deliver highly accurate answers for Arabic questions.
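The idea of adding semantically equivalent keywords to a question can be illustrated with a minimal sketch. The abstract does not describe the actual method or the semantic resources used, so everything below is an assumption made for illustration: the `SYNONYMS` table is a hypothetical stand-in for a real lexical resource, the function name `expand_question` is invented, and English words are used only for readability.

```python
from typing import Dict, List

# Hypothetical synonym table standing in for a real semantic resource;
# the entries are illustrative only and not taken from the paper.
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "buy": ["purchase"],
}


def expand_question(question: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the question keywords, each followed by its equivalent terms."""
    expanded: List[str] = []
    for token in question.lower().split():
        word = token.strip("?.,!")  # drop surrounding punctuation
        if not word:
            continue
        # Add the original keyword first, then any equivalents from the table.
        for term in [word] + synonyms.get(word, []):
            if term not in expanded:  # keep first occurrence, preserve order
                expanded.append(term)
    return expanded
```

Calling `expand_question("Where can I buy a car?", SYNONYMS)` yields the original keywords with `purchase`, `automobile`, and `vehicle` added, so a document that uses any of those words can still match the question.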
The International Conference on Information and Communication Technology Research (ICTRC), 2015
Because of the very fast growth of information in the last few decades, getting precise information in real time is becoming increasingly difficult. Search engines such as Google and Yahoo help in finding information, but they return it in the form of documents, which takes up a lot of the user's time. Question Answering Systems have emerged as a good alternative to search engines because they produce the desired information very precisely and in real time, which saves the user a lot of time. There has been a lot of research on English and some European-language Question Answering Systems. However, Arabic Question Answering Systems have not kept pace because of some inherent difficulties with the language itself as well as the lack of tools available to assist researchers. Question classification is a very important module of a Question Answering System. In this paper, we present a method to accurately classify Arabic questions in order to retrieve precise answers. The proposed method gives promising results.
This paper presents a new non-iterative algorithm to optimize the generation schedule of thermal power plants in a power system. The operating cost of the different generators in the plant is modeled as an exponential function. A generalized formula for n thermal units that considers transmission losses is proposed to find the optimum generation schedule. To demonstrate the effectiveness of the proposed algorithm, a sample system consisting of six thermal generators is considered. The performance of the proposed algorithm is compared with the quick method developed for quadratic cost functions. It is observed that the results obtained by the proposed algorithm are more accurate and that the processor takes less time to run the algorithm. Therefore, the proposed algorithm is suitable for real-time applications where accuracy and time are the two main factors.
To increase web search effectiveness, meta-search engines were invented to combine the results of multiple search engines and thereby cover a larger part of the indexed web. A meta-search engine is a system that lets internet users take advantage of multiple search engines when searching for information. Recently, several approaches have been developed using ontology and ranking measures. Accordingly, a meta-search engine is developed here using ontology and a semantic similarity measure. To bring semantics into keyword matching, a semantic similarity measure (SSM) is developed. Here, every concept set is matched with the title sets using the SSM, which considers the hypernyms and hyponyms of the keywords present in the title sets. In addition, three different ranking measures, based on the contents, the title sets, and the ranking value given by the standard search engines, are combined to improve effectiveness. Finally, experiments are carried out using different sets of queries, and the performance of the meta-search engine is evaluated using the TREC-style average precision (TSAP) measure. The proposed semantic meta-search engine achieves 80% TSAP, which is high compared with existing search engines and meta-search engines.
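The abstract does not spell out how TSAP is computed. A minimal sketch follows, assuming the definition commonly used in the meta-search literature: TSAP@N is the sum of r_i over the top N results divided by N, where r_i = 1/i if the result at rank i is relevant and 0 otherwise. The function name `tsap_at_n` and the boolean relevance list are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import Sequence


def tsap_at_n(relevance: Sequence[bool], n: int) -> float:
    """TSAP@N under the assumed definition: (sum of r_i for i = 1..N) / N,
    with r_i = 1/i when the result at rank i is relevant and 0 otherwise."""
    if n <= 0:
        raise ValueError("n must be positive")
    score = 0.0
    # Ranks start at 1; results beyond rank n are ignored.
    for rank, is_relevant in enumerate(relevance[:n], start=1):
        if is_relevant:
            score += 1.0 / rank
    return score / n
```

Under this definition a relevant result contributes more the higher it is ranked, so the measure rewards engines that place relevant documents near the top of the list.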
Image segmentation is an important part of image recognition systems, and it has been successfully used in various fields such as medical imaging and finger ridge, retina, and face recognition. In this paper, we propose a novel hybrid method that segments all constituent objects of the image under consideration using a combination of fuzzy c-means (FCM) and a boundary-tracking mathematical modeling technique named the level set method (LSM). In the proposed method, a contour is obtained by the FCM method and serves as the initial contour for an improved LSM. Finally, experimental results validate the effectiveness of the proposed combined method for image segmentation.
Image segmentation is a growing field, and it has been successfully applied in areas such as medical imaging and face recognition. In this paper, we propose a method for image segmentation that combines a region-based artificial intelligence technique named fuzzy c-means (FCM) with a boundary-based mathematical modeling technique, the level set method (LSM). In the proposed method, the contour of the image is obtained by the FCM method and serves as the initial contour for the LSM. The final segmentation is achieved using the LSM, which uses a signed pressure force (spf) function for active control of the contour.
An open domain question answering system is one of the emerging information retrieval systems available on the World Wide Web, and such systems are becoming increasingly popular for obtaining succinct and relevant answers to users’ questions. Validating the correctness of an answer is an important issue in the field of question answering. In this paper, we propose a World Wide Web based solution for answer validation in which answers returned by open domain Question Answering Systems are validated using online resources such as Wikipedia and Google. We applied several heuristics to the answer validation task and tested them against some popular World Wide Web based open domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. We found that the proposed method yields promising results for the automatic answer validation task.
Social networking is becoming a necessity for the current generation because it is useful in several ways, such as finding people around the world who share the user’s interests, gathering information on different topics, and many other purposes. In a social network, abundant information on different domains is available from a variety of users, but it is very difficult to find information that matches the user’s preferences. It is also quite possible that relevant information is available in different forms from other users connected to the same network. In this paper, we propose a computationally efficient rough set based method for ranking documents. The proposed method first expands the user query using WordNet and domain ontologies and then retrieves documents containing relevant information. The distinctive point of the proposed algorithm is that, to retrieve relevant information, it places more emphasis on concept combination, based on the presence and position of concepts, than on term frequencies. We experimented on a set of standard questions collected from TREC, Wordbook, and WorldFactBook and retrieved documents using Google and our proposed method. We observed a significant improvement in the ranking of retrieved documents.
Social networking portals such as Twitter, Facebook, and LinkedIn are becoming more popular among users every day, and many more such portals are attracting users’ attention by catering to their specific needs. Users write blogs on these social networking websites on a variety of topics according to their own and other users’ interests. Through social networking blogs, a large amount of interesting information is scattered across the Web that could be structured in a meaningful way to provide better services. The objective of this paper is to categorize blog content into ten in-demand themes (Technology, Entertainment, News, Business, Health, Sports, Tourism, Widgets, Vehicles, and Products) for effective retrieval of information from categorized blog content. Further, a user can also search by entering a specific query to retrieve information from blogs. In this paper, we propose a WordNet and multiple-ontology based approach to expanding blog content themes, together with a concept combination based ranking algorithm, for a blog content based recommendation framework that takes the original themes of blog content as input and recommends conceptually related expanded themes. The distinctive point of this research is the use of a concept combination approach based on rough sets to categorize retrieved results for in-demand themes as well as for user-specific preferences. This kind of blog content categorization would be very effective for retrieving meaningful and conceptually related blog information written by a large number of users using different vocabularies. We experimented with the contents of top blogs related to each theme and obtained very good results.
Question Answering Systems play a significant role in retrieving exact answers to users’ specific questions. In the answer retrieval process, they employ query expansion methods, which play a major role in expanding the scope of the original questions in the correct sense. In this paper, we carry out an extensive survey of a few popular web-based open domain Question Answering Systems and critically evaluate their performance on a set of 300 questions from 30 different domains collected from standard resources including TREC. On the basis of the findings, we suggest an efficient query expansion framework that uses multiple ontologies retrieved from a semantic web search engine such as Swoogle and combines them with WordNet to disambiguate the context. The proposed approach successfully constructs a conceptual query from the user’s question to retrieve relevant answers. We experimented on a set of 300 questions to judge the effectiveness of the proposed approach.
Question Answering Systems, unlike search engines, provide answers to users’ questions in succinct form, which requires prior knowledge of the user’s expectations. The question classification module of a Question Answering System plays a very important role in determining those expectations. In the literature, incorrect question classification has been cited as one of the major factors in the poor performance of Question Answering Systems, which underlines the importance of designing the question classification module well. In this article, we propose a question classification method that exploits the powerful semantic features of WordNet and the vast knowledge repository of Wikipedia to describe informative terms explicitly. We trained our system on a standard set of 5500 questions (by UIUC) and then tested it on five TREC question collections. We compared our results with some standard results reported in the literature and observed a significant improvement in the accuracy of question classification. The accuracy suggests that the method is effective and promising for open-domain question classification. Judging the correctness of an answer is an important issue in the field of question answering. In this article, we extend question classification as one of the heuristics for answer validation. We propose a World Wide Web based solution for answer validation in which answers returned by open-domain Question Answering Systems are validated using online resources such as Wikipedia and Google. We applied several heuristics to the answer validation task and tested them against some popular web-based open-domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. The proposed method appears promising for the automatic answer validation task.
On the World Wide Web, the open domain Question Answering System is one of the emerging information retrieval systems, and such systems are becoming increasingly popular for obtaining succinct, relevant answers to users’ questions. In this paper, we address a rough set based method for document ranking, which is one of the major tasks in presenting retrieved results and contributes directly to the accuracy of a retrieval system. Rough sets are widely used for document categorization, vocabulary reduction, and other information retrieval problems. We propose a computationally efficient rough set based method for ranking documents. The distinctive point of the proposed algorithm is that it places more emphasis on the presence and position of the concept combination than on term frequencies. We experimented on a set of standard questions collected from TREC, Wordbook, and WorldFactBook using Google and our proposed method. We found a 16% improvement in document ranking performance. Further, we compared our method with the online Question Answering System AnswerBus and observed a 38% improvement in placing relevant documents in the top ranks. We conducted further experiments to judge the effectiveness of the information retrieval system and found satisfactory performance results.
Query expansion plays a very important role in enhancing the performance of Question Answering Systems. Several methods have been proposed for query expansion, and the use of ontologies has been the latest and most popular choice among researchers because of its effectiveness in building conceptual queries. However, most of these query expansion methods have used either a single domain-specific ontology or WordNet, and no research work has reported the use of ontologies and WordNet together. In the context of World Wide Web based Question Answering Systems, the use of a single ontology or WordNet has not proved sufficient to retrieve a wide variety of heterogeneous information. In this paper, we propose an efficient query expansion method that uses multiple ontologies retrieved from a semantic web search engine such as Swoogle and combines them with WordNet to disambiguate the context. We experimented on a set of 300 questions collected from TREC and other resources to judge the accuracy of the proposed method. We show results using Google as well as with respect to a few existing popular web-based Question Answering Systems such as START, AnswerBus, BrainBoost, and Inferret.
Papers by Santosh K Ray
information over the Internet. A good amount of research has been done to improve the performance of QASs across several
languages, including European and Asian languages. However, Arabic, a morphologically rich Semitic language spoken by over 422
million people, has not seen similar development in the field of question answering. This article reviews the developments taking place
in Arabic QASs as well as the challenges faced by researchers in developing Arabic QASs. After conducting an extensive literature
survey of a number of English and Arabic QASs, this article classifies them according to several criteria. The most commonly used
architecture for the development of an Arabic QAS, known as pipeline architecture, has been presented. In order to encourage and
support the new researchers and scholars in conducting research in Arabic QASs, a list of techniques, tools, and computational
linguistic resources, required to implement the components of the presented pipelined architecture, are described in this article in a
simple and persuasive manner. Finally, the gap analysis between the research in Arabic and English QASs has been performed and
accordingly, some future directions for research in Arabic QASs have been proposed.
plants in a power system. The operating cost for different generators in the plant is modeled as
exponential function. A generalized formula for n-thermal units considering transmission losses is
proposed to find the optimum generation schedule. To demonstrate the effectiveness of the proposed
algorithm, a sample system consisting of six thermal generators is considered. Performance of the
proposed algorithm is compared with the quick method developed for quadratic cost functions. It is
observed that the proposed algorithm gives more accurate results and takes less processor time
to run. Therefore, the proposed algorithm is suitable for real-time applications
where accuracy and time are the two main factors.
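The equal-incremental-cost idea behind such a schedule can be sketched in a few lines. This is a minimal illustration only, not the formula proposed in the paper: it assumes a hypothetical cost form C_i(P) = a_i * exp(b_i * P) for each unit, and it ignores transmission losses and generator limits, which the paper does consider. Under those assumptions, setting every marginal cost a_i * b_i * exp(b_i * P_i) equal to a common value lambda and requiring the outputs to sum to the demand gives a closed form for lambda.

```python
import math

def dispatch_exponential(units, demand):
    """Lossless economic dispatch for units with cost a * exp(b * P).

    units  : list of (a, b) pairs, a > 0 and b > 0 (hypothetical coefficients)
    demand : total power to be supplied
    Returns (lam, outputs), where lam is the common incremental cost.
    Assumption: no transmission losses and no generator limits.
    """
    inv_b_sum = sum(1.0 / b for _, b in units)
    weighted_log = sum(math.log(a * b) / b for a, b in units)
    # From sum_i (ln(lam) - ln(a_i * b_i)) / b_i = demand
    log_lam = (demand + weighted_log) / inv_b_sum
    outputs = [(log_lam - math.log(a * b)) / b for a, b in units]
    return math.exp(log_lam), outputs

# Illustrative coefficients, not data from the paper.
sample_units = [(100.0, 0.010), (120.0, 0.008), (90.0, 0.012)]
lam, outputs = dispatch_exponential(sample_units, 300.0)
print(lam, outputs)
```

Adding losses or unit limits would generally require an iterative solution rather than this closed form.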
multiple search engines as a result of larger coverage of the indexed web. A meta search engine is a
system that lets internet users take advantage of multiple search engines when searching for
information. Recently, several approaches have been developed using ontologies and ranking measures.
Accordingly, a meta search engine is developed here using an ontology and a semantic similarity measure. In
order to bring semantics into keyword matching, a semantic similarity measure (SSM) is developed. Here,
every concept set is matched with the title sets using the SSM, which considers the hypernyms and hyponyms of
the keywords present in the title sets. Three different ranking measures, relevant to the contents, the
title sets, and the ranking value given by the standard search engines, are combined to improve
effectiveness. Finally, experiments are carried out using different sets of queries, and the performance
of the meta-search engine is evaluated using the TREC-style average precision (TSAP) measure. The proposed
semantic meta-search engine achieves 80% TSAP, which is high compared with existing search engines and
meta-search engines.
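For readers unfamiliar with the evaluation measure, one definition of TSAP at cutoff N used in the metasearch literature is the sum of r_i over the top N results divided by N, where r_i = 1/i if the i-th ranked result is relevant and 0 otherwise. The sketch below assumes that definition; whether the paper uses exactly this normalization is not stated in the abstract, and the function name and the sample judgments are hypothetical.

```python
def tsap_at_n(relevance, n):
    """TREC-style average precision over the top-n ranked results.

    relevance : list of booleans, relevance[i] is True if the result at
                rank i + 1 is relevant (illustrative judgments)
    n         : cutoff rank N
    Assumed definition: (sum of 1/rank over relevant results in top n) / n.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = relevance[:n]
    return sum(1.0 / rank for rank, rel in enumerate(top, start=1) if rel) / n

# Relevant results at ranks 1, 3 and 4 out of the top 5.
judgments = [True, False, True, True, False]
print(tsap_at_n(judgments, 5))
```

Because relevant results at higher ranks contribute more, the measure rewards a merged ranking that places relevant documents near the top.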
image recognition systems and it has been successfully used in
various fields such as medical imaging, finger ridge, retina,
and face recognition, etc. In this paper, we propose a
novel hybrid method for image segmentation that segments all
constituent objects of the image under consideration using a
combination of fuzzy c-means (FCM) and a boundary-tracking
mathematical modeling technique named the level set method
(LSM). In the proposed method, a contour is obtained by the FCM
method, which serves as the initial contour for the improved LSM.
Finally, experimental results validate the
effectiveness of the proposed combined method for image
segmentation.
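The FCM half of such a pipeline can be sketched with the standard fuzzy c-means updates. The example below is a minimal pure-Python illustration on one-dimensional pixel intensities, not the authors' implementation; the function name, the deterministic initialization, and the sample intensities are assumptions, and the level set refinement is not shown. Thresholding the resulting memberships (for example at 0.5) would give a region whose boundary could serve as an initial contour.

```python
def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=50, eps=1e-9):
    """Standard fuzzy c-means on scalar values (e.g. pixel intensities).

    Returns (centers, memberships); memberships[j][k] is the degree to
    which values[j] belongs to cluster k, and each row sums to 1.
    The memberships are those computed in the final iteration.
    """
    lo, hi = min(values), max(values)
    # Assumption: spread the initial centers evenly over the value range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_jk = 1 / sum_l (d_jk / d_jl)^(2 / (m - 1))
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]
            memberships.append(
                [1.0 / sum((d_k / d_l) ** exponent for d_l in dists) for d_k in dists]
            )
        # Center update: membership-weighted mean with weights u_jk^m
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(n_clusters)
        ]
    return centers, memberships

# Illustrative intensities: a dark object and a bright object.
pixels = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means_1d(pixels)
print(sorted(centers))
```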
of the emerging information retrieval systems
available on the World Wide Web that is becoming
popular day by day to get succinct and relevant
answers in response to users’ questions. The
validation of the correctness of the answer is an
important issue in the field of question answering. In
this paper, we propose a World Wide Web
based solution for answer validation in which answers
returned by open domain Question Answering
Systems are validated using online resources such
as Wikipedia and Google. We have applied several
heuristics to the answer validation task and tested them
against some popular World Wide Web based open
domain Question Answering Systems over a
collection of 500 questions collected from standard
sources such as TREC, the Worldbook, and the
Worldfactbook. We found that the proposed method
yields promising results for the automatic answer
validation task.
more purposes. In a social network, abundant information on different domains is contributed by a variety of users, but it is very difficult to find information that matches a user's preferences. It is also quite possible that relevant
information is available in different forms with other users connected to the same network. In this paper, we propose a computationally efficient rough set based method for ranking documents. The proposed method first
expands the user query using WordNet and domain ontologies and then retrieves documents containing relevant information. The distinctive point of the proposed algorithm is that it places more emphasis on concept combinations, based on concept
presence and position, than on term frequencies to retrieve relevant information. We have experimented over a set of standard questions collected from TREC, Wordbook, WorldFactBook and retrieved documents using Google
and our proposed method. We observed significant improvement in the ranking of retrieved documents.
LinkedIn etc. are getting more popular day by day among the user
community, and many more such portals are attracting users'
attention by catering to their specific needs. Users write blogs on
these social networking websites on a variety of topics according to
their own and other users’ interests. By means of social networking
blogs, a large amount of interesting information is scattered on
the Web which could be structured in a meaningful way for
better services. The objective of this paper is to focus on
categorization of blog content into ten demanding themes like
Technology, Entertainment, News, Business, Health, Sports,
Tourism, Widgets, Vehicles, and Products for effective
retrieval of information from categorized blog content.
Further, a user can also search by entering a specific query to
retrieve information from blogs.
In this paper, we propose a WordNet and multiple
Ontologies based blog content theme expansion approach and
a concept combination based ranking algorithm for blog
content based recommendation framework that considers
original themes of blog content as an input and recommends
conceptually related expanded themes of blog content. The
distinctive point of this research is to use concept combination
approach based on rough sets to categorize retrieved results
for demanding themes as well as for user specific preferences.
This kind of blog content categorization approach would be
very effective to retrieve meaningful and conceptually related
blog information written by a large number of users using
different vocabularies. We have experimented with the
contents of top blogs related to each theme and obtained very good
results.
major role in expanding the scope of original questions in the correct sense. In this paper, we have carried out an extensive survey of a few popular web-based open domain Question Answering Systems and critically evaluated their performance on a set of 300 questions from 30 different domains collected from standard resources including TREC. On the basis of the findings, we suggest an efficient query expansion framework that uses multiple ontologies
retrieved from a semantic web search engine such as Swoogle and combines them with WordNet to disambiguate the context. The proposed approach successfully constructs a conceptual query for users’ questions to retrieve relevant answers. We have experimented on a set of 300 questions to judge the effectiveness of the proposed approach.
form, which requires prior knowledge of the user's expectations. The question classification module of a Question Answering System plays a very important role in determining the expectations of the user. In the literature, incorrect question classification has been cited as one of the major factors in the poor performance of Question Answering Systems, which underlines the importance
of designing the question classification module well. In this article, we propose a question classification method that exploits the powerful semantic features of WordNet and the vast knowledge repository of Wikipedia to describe informative terms explicitly. We have trained our system over a standard set of 5500 questions (by UIUC) and then tested it over five TREC question collections. We have compared our results with some standard results reported in the literature and observed a significant improvement in the accuracy of question classification. The question classification accuracy suggests that the method is effective and promising in the field of open-domain question classification. Judging the correctness of the answer is an important issue in the field of question answering. In this article, we extend question classification as one of the heuristics for answer validation. We propose a World Wide Web based solution for answer validation in which answers returned by open-domain
Question Answering Systems are validated using online resources such as Wikipedia and Google. We have applied several heuristics to the answer validation task and tested them against some popular web based open-domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. The proposed method appears promising for the automatic answer validation task.
categorization, vocabulary reduction, and other information retrieval problems. We propose a computationally efficient rough set based method for ranking documents. The distinctive point of the proposed algorithm is that it places more emphasis on the presence and position of concept combinations than on term frequencies. We have experimented over a set of standard questions collected from TREC, Wordbook, WorldFactBook using Google and our proposed method. We found a 16% improvement in document ranking performance. Further, we have compared our method with the online Question Answering System AnswerBus and observed a 38% improvement in ranking relevant documents at the top ranks. We conducted more experiments to judge the effectiveness of the information retrieval system and found satisfactory performance results.
performance of the Question Answering Systems. Several methods have been proposed for query expansion, and the use of ontologies has been a recent and popular choice among researchers because of its effectiveness in building
conceptual queries. However, most of these query expansion methods have used
either a single domain-specific ontology or WordNet, and no research work has reported the use of ontologies and WordNet together. In the context of World Wide Web based Question Answering Systems, the use of a single ontology or WordNet has not proved sufficient to retrieve a wide variety of heterogeneous information. In this paper, we propose an efficient query expansion method that uses multiple ontologies retrieved from a semantic web search engine such as Swoogle and combines them with WordNet to disambiguate the context. We have experimented on a set of 300 questions
collected from TREC and other resources to judge the accuracy of the proposed method. We report results using Google as well as a few existing popular web-based Question Answering Systems such as START, AnswerBus, BrainBoost, and Inferret.