Data mining, Information Extraction, Deep Web
Recent papers in Data mining, Information Extraction, Deep Web
DLP is a data security technology that detects and prevents data breach incidents by monitoring data in use, in motion, and at rest. It has been widely applied for regulatory compliance, data privacy and intellectual property...
When computers first appeared, they promised to serve as a repository of knowledge and wisdom, but instead they sent a huge volume of data our way. Web mining is the process of discovering information and knowledge from web data. In web mining, these data, from the server side, the client...
Deep Web is the data on the internet that is not accessible by popular search engines. It is much greater than the Surface Web we use. Deep Web grants anonymity, and with it, come the horrors of underground misuse. It is a shady (and...
A brief explanation and analysis of the Deep Web and its contents, with reference to how TOR and Hidden Services work.
Abstract: This document was written to present the conceptual side, the contents, and the means of access to that enormous part hidden beneath the surface of the iceberg called "information" that exists on the Net...
Abstract: The Internet is undoubtedly a revolutionary invention in human history, and its development is still ongoing. Most people ... the Internet for communication, social media, shopping, following the political and social agenda, and more...
Objective: The paper analyzes money laundering through crypto-assets and offers a legal perspective on how this new technology can be used to commit these felonies. The study intends to shed light on the matter, helping to visualize how...
This study aimed to propose a seven-phase process for mining social media content to support the management of tourist destinations, developed on the basis of the methodologies proposed by Neves (2013),...
The Internet as a whole is a network of multiple computer networks and their massive infrastructure. The part of the web made up of websites accessible through search engines such as Google is known as the Surface Web. The...
General principles of searching for information on the internet - Ways of accessing Deep Web resources - Scholarly resources in the Deep Web - Obtaining data, publications, and content (including scholarly content) from...
This article describes our rule-based Arabic Named Entity Recognition (NER) and classification system. It is based on lists of classified proper names (PN) and particularly on syntactico-semantic patterns resulting in fine...
Cyber and its related technologies, such as the Internet, were introduced to the world only in the late 1980s, and today it is hard to imagine life without these all-pervasive technologies. Despite being ubiquitous around the world, cyber...
When we type the usual www acronym in an electronic device (computer, smartphone, tablet, among others) and then, the address of a webpage, in a matter of seconds we have access to all the information which, without the revolution of...
Software project estimation is important for allocating resources and planning a reasonable work schedule. Estimation models are typically built using data from completed projects. While organizations have their historical data...
Arguably the biggest challenge in analyzing English tense is to account for the double access interpretation, which arises when a present-tensed verb is embedded under a past attitude, e.g. "John said that Mary is pregnant"....
Coreference resolution plays an important role in Information Extraction. This paper covers the investigation of two strategies based on a mention-pair resolver using a Decision Tree classifier on structured and unstructured datasets,...
In this paper, we outline our work on developing a disk-based infrastructure for efficient visualization and graph exploration operations over very large graphs. The proposed platform, called graphVizdb, is based on a novel technique for...
ABSTRACT Market situation information provides valuable benefits for increasing the productivity of selling a product, both conventionally and online. Indonesian e-commerce map data for the second quarter shows that the increase in...
Semantically, objects in an unstructured document are related to each other and form certain entity relations, such as drug-drug interaction through their compounds, buyer-seller relationship through the goods or...
With the rapid growth of users in social networking services, data is generated in thousands of terabytes every day. Practical frameworks for data extraction from social networking sites have not been well investigated yet. In this paper,...
ABSTRACT According to the site wearesocial.com, data on the growth of internet users shows 4.437 billion users, along with 75% purchasing products online, 82% searching for products, and 92% visiting online retail sites, which then triggered a...
The Islamic State of Iraq and Syria (ISIS) has made great use of the Internet and online social media sites to spread its message and encourage others, particularly young people, to support the organization; to travel to the Middle East...
Due to the enormous amount of data stored in databases and in several other information resource warehouses, there is an increasing need for new technology to extract the hidden valuable knowledge from this data. This knowledge became...
In Natural Language Processing, Parts-of-Speech tagging plays a vital role in text processing for any sort of language processing and understanding by machine. In each of the areas of machine translation, information retrieval or speech...
What many know popularly as the Internet is characterized socioculturally as cyberspace and has its own territoriality, as well as its own cultural practices, identified as cyberculture. Given that,...
Natural Language Processing is a programmed approach to analyze text that is based on both a set of theories and a set of technologies. This forum aims to bring together researchers who have designed and built software that will analyze,...
The article addresses the oscillation between the free circulation of information and privacy versus security and control, which arises from the strong interests at play at both poles. Regarding the case of the Deep Web, because of its...
The Web is a vast, diverse, and dynamic environment in which different users publish their documents. Web mining is one of the applications of data mining, in which web patterns are explored. Studies on web mining can be categorized into three...
The Internet may be free, but the service providers indispensable for accessing services are not, to the extent that as the complexity and weight of sites increase, it is becoming more and more expensive to surf the net. Blocking access...
In this paper, we present a system for personality recognition that exploits linguistic cues and does not require supervision for evaluation. We run the system on a dataset sampled from a popular Social Network: FriendFeed. We adopted the...
The enormous growth of information on the Internet makes discovering information challenging and time-consuming. We are surrounded by a plethora of data in the form of blogs, papers, reviews, and comments on different websites. Recommender...
Attended Software Freedom Kosova 2016 conference. Held in Pristina, Kosovo (October 2016). Received a full speaker grant from the conference organizers and presented a lecture titled “Using public library computers anonymously in order to...
The World Wide Web organizes information in semi-structured HTML documents. For a template-based web page that contains a list of items, information schema can be implied and structured data can be extracted with a query, i.e. a (web)...
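The idea in the abstract above, that a template-based page listing items implies a schema that a query can extract, can be illustrated with a minimal sketch. It is not the paper's method: the class `ListItemExtractor`, the function `extract_items`, the sample markup in `SAMPLE_PAGE`, and the " - " field separator are all assumptions made up for illustration, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class ListItemExtractor(HTMLParser):
    """Hypothetical wrapper: collects the text content of every <li> element."""

    def __init__(self):
        super().__init__()
        self.items = []        # extracted text, one entry per list item
        self._in_item = False  # True while the parser is inside an <li>
        self._buffer = []      # text fragments of the current item

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            self.items.append("".join(self._buffer).strip())
            self._in_item = False


def extract_items(html):
    """Return the text of each list item found in the given HTML string."""
    parser = ListItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Assumed template: every item is rendered as "<b>title</b> - year".
SAMPLE_PAGE = (
    "<ul>"
    "<li><b>Paper A</b> - 2015</li>"
    "<li><b>Paper B</b> - 2016</li>"
    "</ul>"
)

# The repeated template implies a two-field schema (title, year).
records = [
    tuple(part.strip() for part in item.split(" - "))
    for item in extract_items(SAMPLE_PAGE)
]
```

Because every item follows the same template, splitting on the separator recovers one (title, year) tuple per list item. A real page would need a more robust query than this fixed tag name and separator.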
We present an architecture for the integration of shallow and deep NLP components which is aimed at flexible combination of different language technologies for a range of practical current and future applications. In particular, we...
This article presents an analysis of data collected on the Deep Web, discussing its expressions through the concepts of Discipline (FOUCAULT, 2010) and Dialectic of Enlightenment (ADORNO; HORKHEIMER,...
There are several methods and available tools for terminology extraction, but the quality of the extracted terms is not always high. Hence, an important consideration in terminology extraction is to assess the quality of the extracted...
Popular personalities are referred to by multiple name aliases in different documents on the web. Exact textual identification of a person on the web is useful in information retrieval, sentiment analysis, relation extraction and name...