Papers by Faisal Alkhateeb
Computer Science Journal of Moldova, 2014
Artificial Bee Colony (ABC) is a swarm-based metaheuristic for continuous optimization. Recent work has hybridized this algorithm with other metaheuristics in order to improve performance. This paper experimentally evaluates the use of different mutation operators with the ABC algorithm. The introduced operator is activated according to a predetermined probability called the mutation rate (MR). The results on standard benchmark functions suggest that the use of this operator improves performance in terms of convergence speed and quality of the final solution obtained. Power and Polynomial mutations give the best results. The fastest convergence was obtained with a mutation rate of MR = 0.2.
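As an illustration of an operator that fires with probability MR, here is a minimal sketch. It is not the paper's implementation: the function names, the distribution index `eta`, the single shared bound pair, and the choice to mutate one random coordinate are all assumptions, and the mutation shown is the basic textbook form of polynomial mutation.

```python
import random

def polynomial_mutation(x, lower, upper, eta=20.0, rng=random):
    """Basic polynomial mutation of one coordinate (eta is an assumed setting)."""
    u = rng.random()
    if u < 0.5:
        delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
    else:
        delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
    # Scale the perturbation by the variable range, then clip to the bounds.
    return min(upper, max(lower, x + delta * (upper - lower)))

def maybe_mutate(solution, lower, upper, mutation_rate=0.2, rng=random):
    """With probability mutation_rate, mutate one randomly chosen coordinate."""
    candidate = list(solution)
    if rng.random() < mutation_rate:
        j = rng.randrange(len(candidate))
        candidate[j] = polynomial_mutation(candidate[j], lower, upper, rng=rng)
    return candidate
```

In an ABC loop, a step like `maybe_mutate` would be applied to a candidate food source before the usual greedy comparison; where exactly the paper places it is not stated in the abstract.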
Lack of data/input validation is a critical challenge because it might cause failures in the software and can also break the security of web applications, for example by allowing unauthorized access to data. Traditional security solutions do not properly address this problem. For example, firewalls and SSL are not adequate security for a web application. One of the most common web application vulnerabilities known to date is SQL injection. In this paper, a new data validation service based on semantic web technologies has been developed to prevent security vulnerabilities at the application level and to secure the web system even if the input validation modules are bypassed. Our semantic architecture consists of the following components: RDFa annotation for elements of web pages, interceptor, RDF extractor, RDF parser, and data validator. Principally, the prototype of the proposed service indicates that the proposed data validation service might provide a detection, an...
This paper presents a new framework for adding semantics to e-learning systems. The proposed approach relies on two principles. The first principle is the automatic addition of semantic information when creating the mathematical contents. The second principle is the collaborative tagging and annotation of the e-learning contents and the use of an ontology to categorize the e-learning contents. The proposed system encodes the mathematical contents using presentation MathML with RDFa annotations. The system allows students to highlight and annotate specific parts of the e-learning contents. The objective is to add meaning to the e-learning contents, to add relationships between contents, and to create a framework that facilitates searching the contents. This semantic information can be used to answer semantic queries (e.g., SPARQL) to retrieve the information requested by a user. This work is implemented as embedded code in the Moodle e-learning system. Keywords- Semantic Web; Mat...
So far we have considered expressing information within the XML language. XML is not a very expressive language: it basically provides a structure which stores information, and this information has to be extracted in order to be accessed and exploited. It does not allow expressing knowledge in terms of descriptions which will be used for guiding this interpretation. XML provides a database point of view on knowledge representation, in which the power of what can be done with the language depends on the power of the query language (XPath, XQuery, etc.) and the representation is a store. In the context of the semantic web, this structure is provided with a semantics specifying how to interpret the structure. This allows the definition of general statements which will apply to the relevant data whatever its syntactic form. Doing so requires the definition of a logic, specifying the semantics of statements (or formulas) and identifying the possible models of a document or a set of doc...
RDF is a knowledge representation language dedicated to the annotation of resources within the framework of the semantic web. Among the query languages for querying an RDF knowledge base, some, such as SPARQL, are based on the formal semantics of RDF and the concept of semantic consequence; others, inspired by work in databases, use regular expressions that make it possible to search for paths in the graph associated with the knowledge base. In order to combine the expressivity of these two approaches, we define a mixed language, called PRDF (for "Paths RDF"), in which the arcs of a graph can be labeled by regular expressions. We define the syntax and the semantics of these objects, and propose a correct and complete algorithm which, by a kind of homomorphism, calculates the semantic consequence between an RDF graph and a PRDF graph. This algorithm is the core of the extension of the SPARQL query language which we propose and have implemented: a PSPARQL query allows to query...
This paper describes the notion of query answering in a distributed knowledge-based system, and gives methods for computing these answers in certain cases. More precisely, given a distributed system (DS) of ontologies and ontology mappings (or bridge rules) written in Distributed Description Logics (DDL), distributed answers are defined for queries written in terms of one particular ontology. These answers may contain individuals from different ABoxes. To compute these answers, the paper provides an algorithm that reduces the problem of distributed query answering to local query answering. This algorithm is proved correct but not complete in the general case.
Pattern Recognition and Image Analysis, 2017
This paper provides a thorough evaluation of a set of six important Arabic OCR systems available in the market, namely Abbyy FineReader, Leadtools, Readiris, Sakhr, Tesseract and NovoVerus. We tes...
Criminals could break the client-side input validation modules. Bypassing input validation is a serious challenge because it might cause failures in the software and can also break the security of web applications, for example by allowing unauthorized access to data. Even if criminals cannot bypass the client and/or server input validation, web application flaws, such as cross-site scripting or SQL injection, now account for more than two thirds of the reported web security vulnerabilities. In this paper, we present a new data validation service based on semantic web technologies to prevent security vulnerabilities at the application level and to secure the web system even if the input validation modules are bypassed. The architecture of the service consists of the following components: RDFa annotation for elements of web pages, interceptor, RDFa extractor, RDF parser, and data validator.
RDF is a knowledge representation language dedicated to the annotation of resources within the Semantic Web. Though RDF itself can be used as a query language for an RDF knowledge base (using RDF semantic consequence), the need for added expressivity in queries has led to the definition of the SPARQL query language. SPARQL queries are defined on top of graph patterns that are basically RDF graphs with variables. SPARQL queries remain limited as they do not allow queries with unbounded sequences of relations (e.g. "does there exist a trip from town A to town B using only trains or buses?"). We show that it is possible to extend the RDF syntax and semantics, defining the PRDF language (for Path RDF), such that SPARQL can overcome this limitation by simply replacing the basic graph patterns with PRDF graphs, effectively mixing RDF reasoning with database-inspired regular paths. We further extend PRDF to CPRDF (for Constrained Path RDF) to allow expressing constraints on the nodes of trave...
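The "trains or buses" question in the abstract above amounts to matching the regular path (train | bus)+ between two nodes. The sketch below shows only that reachability idea on a plain list of triples; it is not the PRDF/PSPARQL algorithm, it performs no RDF entailment, and the function name and sample data are hypothetical.

```python
from collections import deque

def reachable_via(triples, start, goal, allowed_predicates):
    """Return True if goal is reachable from start through one or more allowed edges."""
    # Keep only the edges whose predicate is in the allowed set.
    adjacency = {}
    for s, p, o in triples:
        if p in allowed_predicates:
            adjacency.setdefault(s, []).append(o)
    # Breadth-first search, seeded with the direct successors of start so that
    # at least one edge must be traversed (the "+" in the path expression).
    seen = set()
    queue = deque(adjacency.get(start, []))
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        queue.extend(adjacency.get(node, []))
    return False

# Hypothetical data: towns linked by different means of transport.
trips = [
    ("A", "train", "C"),
    ("C", "bus", "B"),
    ("A", "plane", "D"),
    ("D", "train", "E"),
]
print(reachable_via(trips, "A", "B", {"train", "bus"}))
```

A fixed-length basic graph pattern cannot express this query because the number of intermediate towns is not known in advance, which is the limitation the abstract refers to.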
The Journal of Supercomputing
International Journal of Computer Applications in Technology
Proceedings of the 12th International Conference on Web Information Systems and Technologies, 2016
In this paper, we extend the SPARQL triple patterns to include two operators (the negation and the wild-card). We define the syntax and the semantics of these operators, in particular when they are used in the predicate position of SPARQL triple patterns. The use of the negation and wild-card operators, and thus their semantics, differ from those in the literature. Then, we show that these two operators could be used to enhance the evaluation performance of some SPARQL queries and to add extra expressiveness.
International Journal on Document Analysis and Recognition (IJDAR)
Optical character recognition (OCR) is the process of recognizing characters automatically from scanned documents for editing, indexing, searching, and reducing storage space. The text resulting from OCR usually does not match the text in the original document. In order to minimize the number of incorrect words in the obtained text, OCR post-processing approaches can be used. Correcting OCR errors is more complicated for the Arabic language because of its complexity: letters are connected, different letters may have the same shape, and the same letter may have different forms. This paper provides a statistical Arabic language model and post-processing techniques based on hybridizing the error model approach with the context approach. The proposed model is language independent and is not constrained by string length. To the best of our knowledge, this is the first end-to-end OCR post-processing model applied to the Arabic language. In order to train the proposed model, we build an Arabic OCR context database which contains 9000 images of Arabic text. Also, the evaluation of the OCR post-processing results is automated using our novel alignment technique, called fast automatic hashing text alignment. Our experimental results show that the rule-based system improves the word error rate from 24.02% to 20.26% using a training data set of 1000 images. After this training, we apply the rule-based system to 500 images as a testing data set, and the word error rate improves from 14.95% to 14.53%. The proposed hybrid OCR post-processing system, using the same 1000 training images, improves the word error rate from 24.02% to 18.96%. After training the hybrid system, we used 500 images for testing, and the results show that the word error rate improved from 14.95% to 14.42%. The obtained results show that the proposed hybrid system outperforms the rule-based system.
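To compare the reported gains on a common footing, the word error rates quoted above can be turned into absolute and relative reductions. The sketch below only does that arithmetic on the figures from the abstract; the function and variable names are illustrative.

```python
def wer_reduction(before, after):
    """Return (absolute reduction in points, relative reduction as a fraction)."""
    absolute = before - after
    relative = absolute / before
    return absolute, relative

# (before %, after %) pairs as quoted in the abstract.
reported = {
    "rule-based, training set": (24.02, 20.26),
    "rule-based, test set": (14.95, 14.53),
    "hybrid, training set": (24.02, 18.96),
    "hybrid, test set": (14.95, 14.42),
}

for name, (before, after) in reported.items():
    absolute, relative = wer_reduction(before, after)
    print(f"{name}: -{absolute:.2f} points ({relative:.1%} relative)")
```

On both data sets the hybrid system's reduction is larger than the rule-based one, and the gains on the test set are well under one percentage point for either system.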
Journal of King Saud University - Computer and Information Sciences
One of the major problems usually associated with any optimization algorithm, including the Cuckoo Search (CS) algorithm, is premature convergence to suboptimal solutions. This problem normally occurs when the optimization operators of CS are not able to maintain the diversity of the solutions over multiple generations. One possible solution to the problem of premature convergence of CS is to hybridize it with other search techniques to reduce the likelihood of premature convergence. However, hybrid CS algorithms normally require more computations than the original CS algorithm. The β-hill climbing algorithm, a variation of the hill climbing algorithm, is capable of reaching better solutions in a shorter time than many popular local search algorithms. This paper proposes a new hybrid CS algorithm (CSBHC) that intelligently combines the CS algorithm with the β-hill climbing algorithm. In order to balance the computational time and effectiveness of CSBHC, the β-hill climbing algorithm is called at each iteration of CSBHC based on an exponentially decreasing probability (i.e., the probability function used in Simulated Annealing). The proposed algorithm was evaluated and compared to popular hybrid CS algorithms using 16 standard benchmark functions. The experimental results suggest that the proposed algorithm produces more accurate results in a shorter running time compared to the original CS and other approaches.
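The abstract does not give the exact probability function, so the following is only a sketch of the general idea of invoking a local search with an exponentially decreasing probability. The form `initial * exp(-decay * iteration)` and both parameter values are assumptions, not the schedule used in CSBHC.

```python
import math
import random

def local_search_probability(iteration, initial=1.0, decay=0.01):
    """Assumed schedule: the probability shrinks exponentially with the iteration count."""
    return initial * math.exp(-decay * iteration)

def should_run_local_search(iteration, rng=random, initial=1.0, decay=0.01):
    """Decide whether to call the (more expensive) local search at this iteration."""
    return rng.random() < local_search_probability(iteration, initial, decay)

# Count how often the local search would fire in early versus late iterations.
rng = random.Random(42)
early = sum(should_run_local_search(t, rng) for t in range(0, 100))
late = sum(should_run_local_search(t, rng) for t in range(900, 1000))
print(early, late)
```

Under a schedule of this shape, the extra cost of the local search is concentrated in the early iterations, which matches the stated aim of balancing running time against effectiveness.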
Journal of Intelligent Systems
Simulated annealing (SA) has proved its success as a single-state optimization search algorithm for both discrete and continuous problems. In contrast, cuckoo search (CS) is one of the well-known population-based search algorithms that can be used for optimizing some problems with continuous domains. This paper provides a hybrid algorithm using the CS and SA algorithms. The main goal behind our hybridization is to improve the solutions generated by CS using SA to explore the search space in an efficient manner. More precisely, we introduce four variations of the proposed hybrid algorithm. The proposed variations, together with the original CS and SA algorithms, were evaluated and compared using 10 well-known benchmark functions. The experimental results show that three variations of the proposed algorithm provide a major performance enhancement in terms of best solutions and running time when compared to CS and SA as stand-alone algorithms, whereas the other variation provides a min...
Arabian Journal for Science and Engineering
The selection process is an important part of any optimization algorithm. Usually, an efficient selection process should balance exploration of the search space against exploitation of the current knowledge about the best solutions. Cuckoo search (CS) is a simple yet powerful optimization algorithm inspired by the parasitic reproduction behavior of some cuckoo species. At each iteration of the original CS algorithm, the selection process is triggered in three places: (i) cuckoo selection, where a cuckoo is selected from the population of n nests (stored solutions) based on a uniformly random function, (ii) host selection, where a nest is chosen randomly from the n nests, and (iii) greedy selection of a portion of the n nests for replacement with new randomly generated solutions. This paper proposes several variations of the CS algorithm by replacing the uniformly random selection method (used in step i) with existing randomized selection schemes, namely greedy, proportional, exponential, ε-greedy, softmax and reinforcement learning selection schemes. The proposed variations were evaluated and compared using twenty well-known benchmark functions (12 test functions from CEC 2005). The experimental results show that the proposed variations outperform the original CS algorithm.
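Two of the selection schemes named above, ε-greedy and softmax, can be sketched in their generic form as follows. This is not the paper's code: the function names, the default `epsilon` and `temperature`, and the convention that lower fitness is better (minimization) are assumptions.

```python
import math
import random

def epsilon_greedy_index(fitness, epsilon=0.1, rng=random):
    """With probability epsilon pick a nest uniformly at random, otherwise pick the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(fitness))
    return min(range(len(fitness)), key=fitness.__getitem__)

def softmax_index(fitness, temperature=1.0, rng=random):
    """Sample a nest with probability proportional to exp(-fitness / temperature)."""
    best = min(fitness)
    # Subtracting the best value keeps the exponent at or below zero for stability.
    weights = [math.exp(-(f - best) / temperature) for f in fitness]
    return rng.choices(range(len(fitness)), weights=weights, k=1)[0]

nest_fitness = [3.2, 0.7, 1.5, 4.1]  # hypothetical objective values, lower is better
rng = random.Random(7)
print(epsilon_greedy_index(nest_fitness, epsilon=0.1, rng=rng))
print(softmax_index(nest_fitness, temperature=0.5, rng=rng))
```

Both schemes bias step (i) toward better nests while keeping some randomness: `epsilon` and `temperature` control how much exploration is retained compared with the uniform selection of the original CS.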
International Journal of Computer Applications in Technology
A good library is one that has all of its resources accessible to all kinds of people, e.g. people with print disabilities. For this purpose, librarians try to provide books in several formats to accommodate different users. For example, e-books and digital talking books (DTBs) are now available and can be used by a wider spectrum of users. Several systems exist for transforming a book into a DTB for people whose native language is English. Such systems transform a book into a DTB by encoding the document in the DAISY format. However, Arabic DTBs are sparse since the work done so far in this domain is not well understood. In fact, the Arabic language is very hard to process, and its characteristics add another dimension of complexity to a system that transforms an Arabic book into a DTB. In this paper, we propose a framework for an Arabic DTB that uses the DAISY format. The proposed system includes an image-to-text converter, a context injector, a text-to-audio generator, and finally a DAISY generator. The implemented system was tested with ten users whose native language is Arabic. The results were promising, and we also provide a set of recommendations that can enhance the quality of the system.