Weather Prediction Using CPT+ Algorithm: Proposed Scheme
Introduction
Weather forecasting is a vital application in meteorology and has been one of the most
scientifically and technologically challenging problems around the world in the last century.
Weather forecasting entails predicting how the present state of the atmosphere will change.
Present weather conditions are obtained by ground observations, observations from ships and
aircraft, radiosondes, Doppler radar, and satellites.
This information is sent to meteorological centers where the data are collected, analyzed, and
made into a variety of charts, maps, and graphs. Modern high-speed computers transfer the many
thousands of observations onto surface and upper-air maps. Computers draw the lines on the
maps with help from meteorologists, who correct for any errors. A final map is called an
analysis. Computers not only draw the maps but predict how the maps will look sometime in the
future. The forecasting of weather by computer is known as numerical weather prediction.
Climate is the long-term effect of the sun's radiation on the rotating earth's varied surface and
atmosphere. Day-to-day variations in a given area constitute the weather, whereas climate is
the long-term synthesis of such variations. Weather is measured by thermometers, rain gauges,
barometers, and other instruments, but the study of climate relies on statistics. Nowadays, such
statistics are handled efficiently by computers. A simple, long-term summary of weather changes,
however, is still not a true picture of climate. Obtaining that picture requires the analysis of
daily, monthly, and yearly patterns.
Climate change is a significant and lasting change in the statistical distribution of weather
patterns over periods ranging from decades to millions of years. It may be a change in average
weather conditions or the distribution of events around that average (e.g., more or fewer extreme
weather events). The term is sometimes used to refer specifically to climate change caused by
human activity, as opposed to changes in climate that may have resulted from Earth's natural
processes. In everyday usage, climate change is often treated as synonymous with anthropogenic
global warming. Within scientific journals, however, global warming refers to surface
temperature increases, while climate change includes global warming and everything else that
increasing greenhouse gas amounts will affect.
Proposed Scheme
In this section we present CPT+, a lossless model for weather prediction. Given a set of
training sequences, the problem of sequence prediction consists of finding the next element of a
target sequence by observing only its previous items. This problem has many applications,
including web page pre-fetching, consumer product recommendation, weather forecasting, and
stock market prediction. The literature on the subject is extensive and there are many different
approaches. Two of the most popular are PPM (Prediction by Partial Matching) and DG
(Dependency Graph). Over the years, these models have been greatly improved in terms of time
and memory efficiency, but their prediction accuracy has remained more or less the same. Markov
chains are also widely used for sequence prediction. However, they assume that sequences are
Markovian, that is, that the next item depends only on the most recent item or items. Other
approaches exist, such as neural networks and association rules. All of these approaches build
lossy prediction models from training sequences, so they do not use all the information
available in the training sequences when making predictions. In this paper, we propose a novel
approach for sequence prediction that uses all the information in the training sequences to
perform predictions. The hypothesis is that doing so increases prediction accuracy.
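To make the notion of a lossy model concrete, the following is a minimal Java sketch of a first-order Markov predictor. It is an illustration only, not part of the proposed scheme: the class name MarkovPredictor, its methods, and the weather symbols are assumptions introduced here. The model keeps only transition counts, so everything before the last observed item is discarded at prediction time.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative first-order Markov predictor (hypothetical class, not from the report). */
class MarkovPredictor {
    // current item -> (next item -> number of times that transition was seen)
    private final Map<String, Map<String, Integer>> transitions = new HashMap<>();

    /** Counts every adjacent pair of items; the sequence itself is not stored. */
    void train(List<String> sequence) {
        for (int i = 0; i + 1 < sequence.size(); i++) {
            transitions.computeIfAbsent(sequence.get(i), k -> new HashMap<>())
                       .merge(sequence.get(i + 1), 1, Integer::sum);
        }
    }

    /** Returns the most frequent successor of the last observed item, or null if unseen. */
    String predict(List<String> observed) {
        if (observed.isEmpty()) return null;
        Map<String, Integer> next = transitions.get(observed.get(observed.size() - 1));
        if (next == null) return null;
        String best = null;
        for (Map.Entry<String, Integer> e : next.entrySet()) {
            if (best == null || e.getValue() > next.get(best)) best = e.getKey();
        }
        return best;
    }

    public static void main(String[] args) {
        MarkovPredictor model = new MarkovPredictor();
        // Made-up training sequences of daily weather symbols.
        model.train(Arrays.asList("sunny", "cloudy", "rain"));
        model.train(Arrays.asList("fog", "cloudy", "rain"));
        model.train(Arrays.asList("sunny", "cloudy", "sunny"));
        // Only the last item ("cloudy") influences the result.
        System.out.println(model.predict(Arrays.asList("fog", "cloudy"))); // prints rain
    }
}
```

Because the prediction depends only on the last item, the queries (sunny, cloudy) and (fog, cloudy) always receive the same answer, which is the kind of information loss the proposed approach aims to avoid.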
A Decision Tree
A Decision Tree is a flow-chart-like tree structure. Each internal node denotes a test on an
attribute. Each branch represents an outcome of the test. Leaf nodes represent class distribution.
The decision tree structure provides an explicit set of if-then rules (rather than abstract
mathematical equations), making the results easy to interpret. In the tree structures, leaves
represent classifications and branches represent conjunctions of features that lead to those
classifications. In decision analysis, a decision tree can be used visually and explicitly to
represent decisions and decision making. The concept of information gain is used to decide the
splitting value at an internal node. The splitting value that would provide the most information
gain is chosen. Formally, information gain is defined in terms of entropy. In order to improve the
accuracy and generalization of classification and regression trees, various techniques such as
boosting and pruning were introduced.
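The entropy-based split criterion described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class name InformationGain, the method names, and the sample counts (14 days split on a hypothetical "humidity" attribute) are not taken from the project and serve only to show the calculation.

```java
import java.util.Arrays;

/** Illustrative entropy and information gain calculation (hypothetical helper class). */
class InformationGain {

    /** Shannon entropy, in bits, of a class distribution given as counts. */
    static double entropy(int[] classCounts) {
        int total = Arrays.stream(classCounts).sum();
        double h = 0.0;
        for (int count : classCounts) {
            if (count == 0) continue;            // 0 * log(0) is treated as 0
            double p = (double) count / total;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    /**
     * Information gain = entropy(parent) - weighted entropy of the branches.
     * Each row of branchCounts holds the class counts of one branch of the split.
     */
    static double gain(int[] parentCounts, int[][] branchCounts) {
        int total = Arrays.stream(parentCounts).sum();
        double weighted = 0.0;
        for (int[] branch : branchCounts) {
            int size = Arrays.stream(branch).sum();
            if (size == 0) continue;             // an empty branch contributes nothing
            weighted += ((double) size / total) * entropy(branch);
        }
        return entropy(parentCounts) - weighted;
    }

    public static void main(String[] args) {
        // Made-up data: 14 days, 9 labelled "rain" and 5 labelled "no rain",
        // split on a hypothetical humidity attribute (high / normal).
        int[] parent = {9, 5};
        int[][] byHumidity = {{3, 4}, {6, 1}};
        System.out.printf("gain(humidity) = %.3f%n", gain(parent, byHumidity));
    }
}
```

A tree builder would evaluate gain for every candidate split at a node and keep the split with the largest value.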
Compact Prediction Tree
The Compact Prediction Tree (CPT) is a recently proposed prediction model [5]. Its main
distinctive characteristics with respect to other prediction models are that (1) CPT stores a
compressed representation of training sequences with no loss or a small loss and (2) CPT
measures the similarity of a sequence to the training sequences to perform a prediction. The
similarity measure is noise tolerant and thus allows CPT to predict the next items of
subsequences that have not been previously seen in training sequences, whereas other proposed
models such as PPM and All-K-Order Markov cannot perform prediction in such cases. The
training process of CPT takes as input a set of training sequences and generates three distinct
structures: (1) a Prediction Tree (PT), (2) a Lookup Table (LT) and (3) an Inverted Index. During
training, sequences are considered one by one to incrementally build these three structures.
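The incremental construction of the three structures can be sketched in Java as shown below. This is a simplified illustration, not the project's implementation: the class and member names are assumptions, no compression is applied, and the predict method uses a deliberately basic similarity rule (a training sequence is similar if it contains every query item, in any order) rather than the full noise-tolerant measure of CPT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified CPT sketch: prediction tree, lookup table, inverted index (hypothetical names). */
class CompactPredictionTree {

    /** One node of the prediction tree, a trie over the training sequences. */
    static final class Node {
        final String item;
        final Node parent;
        final Map<String, Node> children = new HashMap<>();
        Node(String item, Node parent) { this.item = item; this.parent = parent; }
    }

    final Node root = new Node(null, null);
    final List<Node> lookupTable = new ArrayList<>();           // sequence id -> last node
    final Map<String, BitSet> invertedIndex = new HashMap<>();  // item -> ids of sequences containing it

    /** Inserts one training sequence and updates all three structures. */
    void train(List<String> sequence) {
        int id = lookupTable.size();
        Node current = root;
        for (String item : sequence) {
            final Node parent = current;
            current = parent.children.computeIfAbsent(item, k -> new Node(k, parent));
            invertedIndex.computeIfAbsent(item, k -> new BitSet()).set(id);
        }
        lookupTable.add(current);
    }

    /** Rebuilds a training sequence by walking from its last node up to the root. */
    List<String> sequence(int id) {
        List<String> items = new ArrayList<>();
        for (Node n = lookupTable.get(id); n.parent != null; n = n.parent) {
            items.add(0, n.item);
        }
        return items;
    }

    /** Predicts the next item, or null when no training sequence is similar. */
    String predict(List<String> query) {
        // Similar sequences: those containing every query item (simplified rule).
        BitSet similar = new BitSet();
        similar.set(0, lookupTable.size());
        for (String item : query) {
            BitSet ids = invertedIndex.get(item);
            if (ids == null) return null;
            similar.and(ids);
        }
        Set<String> seen = new HashSet<>(query);
        Map<String, Integer> votes = new HashMap<>();
        for (int id = similar.nextSetBit(0); id >= 0; id = similar.nextSetBit(id + 1)) {
            List<String> s = sequence(id);
            int last = -1;                       // position of the last query item in s
            for (int i = 0; i < s.size(); i++) {
                if (seen.contains(s.get(i))) last = i;
            }
            // Every item after that position votes as a candidate next item.
            for (String item : s.subList(last + 1, s.size())) votes.merge(item, 1, Integer::sum);
        }
        String best = null;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (best == null || e.getValue() > votes.get(best)) best = e.getKey();
        }
        return best;
    }

    public static void main(String[] args) {
        CompactPredictionTree cpt = new CompactPredictionTree();
        // Made-up training sequences of daily weather symbols.
        cpt.train(Arrays.asList("sunny", "cloudy", "rain"));
        cpt.train(Arrays.asList("cloudy", "sunny", "rain"));
        cpt.train(Arrays.asList("sunny", "cloudy", "storm"));
        cpt.train(Arrays.asList("fog", "sunny", "sunny"));
        System.out.println(cpt.predict(Arrays.asList("sunny", "cloudy"))); // prints rain
    }
}
```

In this sketch the lookup table and the parent links allow any training sequence to be rebuilt exactly, which is what makes the stored representation lossless, while the inverted index lets the similar sequences be found with a few bitwise AND operations instead of a scan over all training data.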
SOFTWARE ENVIRONMENT
Java
Java is a general-purpose computer programming language that is concurrent, class-based,
object-oriented, and specifically designed to have as few implementation dependencies as
possible. It is intended to let application developers "write once, run anywhere" (WORA),
meaning that compiled Java code can run on all platforms that support Java without the need for
recompilation. Java applications are typically compiled to bytecode that can run on any Java
virtual machine (JVM) regardless of computer architecture. As of 2016, Java was one of the most
popular programming languages in use, particularly for client-server web applications, with a
reported 9 million developers. Java was originally developed by James Gosling at Sun
Microsystems (which has since been acquired by Oracle Corporation) and released in 1995 as a
core component of Sun Microsystems' Java platform. The language derives much of its syntax
from C and C++, but it has fewer low-level facilities than either of them.
The original and reference implementations of the Java compilers, virtual machines, and class
libraries were originally released by Sun under proprietary licences. As of May 2007, in
compliance with the specifications of the Java Community Process, Sun relicensed most of its
Java technologies under the GNU General Public License. Others have also developed
alternative implementations of these Sun technologies, such as the GNU Compiler for Java
(bytecode compiler), GNU Classpath (standard libraries), and IcedTea-Web (browser plugin for
applets).
At the time of writing, the latest version is Java 8, which is the only version supported for
free by Oracle, although earlier versions are supported both by Oracle and other companies on a
commercial basis.
Eclipse
Eclipse is an integrated development environment (IDE) used in computer programming,
and is the most widely used Java IDE. It contains a base workspace and an extensible plug-in
system for customizing the environment. Eclipse is written mostly in Java and its primary use is
for developing Java applications, but it may also be used to develop applications in other
programming languages through the use of plugins, including: Ada, ABAP, C, C++, COBOL, D,
Fortran, Haskell, JavaScript, Julia, Lasso, Lua, NATURAL, Perl, PHP, Prolog, Python, R, Ruby
(including Ruby on Rails framework), Rust, Scala, Clojure, Groovy, Scheme, and Erlang. It can
also be used to develop documents with LaTeX (through the use of the TeXlipse plugin) and
packages for the software Mathematica. Development environments include the Eclipse Java
development tools (JDT) for Java and Scala, Eclipse CDT for C/C++ and Eclipse PDT for PHP,
among others.
The initial codebase originated from IBM VisualAge. The Eclipse software development kit
(SDK), which includes the Java development tools, is meant for Java developers. Users can
extend its abilities by installing plug-ins written for the Eclipse Platform, such as development
toolkits for other programming languages, and can write and contribute their own plug-in
modules. Since the introduction of Equinox, plug-ins can be plugged in and stopped dynamically
and are known as OSGi bundles.
CloudSim Simulation Framework
2. A self-contained platform for modeling clouds, service brokers, provisioning and
allocation policies.
3. Support for simulation of network connections among the simulated system elements.
6. Flexibility to switch between space-shared and time-shared allocation of processing cores
to virtualized services.
MySQL
MySQL is written in C and C++. Its SQL parser is written in yacc, but it uses a home-brewed
lexical analyzer. MySQL works on many system platforms, including AIX, BSDi,
FreeBSD, HP-UX, eComStation, i5/OS, IRIX, Linux, macOS, Microsoft Windows, NetBSD,
Novell NetWare, OpenBSD, OpenSolaris, OS/2 Warp, QNX, Oracle Solaris, Symbian, SunOS,
SCO OpenServer, SCO UnixWare, Sanos and Tru64. A port of MySQL to OpenVMS also exists.
The MySQL server software itself and the client libraries use dual-licensing distribution.
They are offered either under GPL version 2, beginning from 28 June 2000 (extended in 2009
with a FLOSS License Exception), or under a proprietary license.
INPUT DESIGN
Input design is one of the most important phases of system design. It is the process in
which the inputs received by the system are planned and designed so as to get the necessary
information from the user while eliminating information that is not required. The aim of input
design is to ensure the maximum possible level of accuracy and to ensure that the input is
accessible to and understood by the user. Input design is a part of the overall system design
that requires very careful attention. If the data going into the system is incorrect, then the
processing and output will magnify the errors.
In the admin login form, labels display the text and text boxes accept the
username and password. The admin has a unique username and password. If the
username and password are correct, the admin can access the website.
In this module, the admin adds an employee and the employee's details, such as name,
department, and phone number. The admin can also edit the employee details.
In this form, the admin can perform weather prediction by applying the
decision tree algorithm.
In this form, the admin can perform weather prediction by applying the CPT+ algorithm.
The output of the system is presented either on screen or as hard copy. Output design aims
at communicating the results of processing to the users. The reports are generated to suit the
needs of the users and must be produced at appropriate levels of detail. In our project,
outputs are generated by ASP as HTML pages. Because this is a web application, the output is
designed to be user-friendly and is delivered on screen most of the time.
The overall objective in the development of database technology has been to treat data as
an organizational resource and as an integrated whole. A DBMS allows data to be protected and
organized separately from other resources. A database is an integrated collection of data. The
most significant form of data as seen by programmers is data as stored on direct access storage
devices. This is the difference between logical and physical data. Database files are the key
source of information for the system, and database design is the process of designing these
files. The files should be properly designed and planned for the collection, accumulation,
editing, and retrieval of the required information.
[Figure: data flow diagram notation showing the symbols for data flow, process, and storage]
Testing Methodologies
Testing
The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, sub-assemblies, assemblies, and/or a finished product. It is the
process of exercising software with the intent of ensuring that the
software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a specific
testing requirement.
Types Of Tests
Unit testing
In this system, unit testing is performed by separating the whole project into units such
as functions, blocks, and classes. This involves the design of test cases that validate that the
internal program logic is functioning properly and that program inputs produce valid outputs. All
decision branches and internal code flow should be validated. It is the testing of individual
software units of the application. It is done after the completion of an individual unit and
before integration.
This is structural testing that relies on knowledge of the unit's construction and is invasive.
Unit tests perform basic tests at the component level and test a specific business process,
application, and/or system configuration. Unit tests ensure that each unique path of a business
process performs accurately to the documented specifications and contains clearly defined inputs
and expected results.
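As an illustration of a unit test with clearly defined inputs and expected results, the following Java sketch tests a small credential check in isolation. Everything in it is an assumption made for the example: the LoginValidator class, its isValid method, and the credentials are hypothetical and are not taken from the project's code. Plain checks are used so that the example runs without a test framework; a framework such as JUnit would normally take their place.

```java
/** Plain-Java unit test sketch for a hypothetical login validation unit. */
class LoginValidatorTest {

    /** Hypothetical unit under test: compares input against the expected credentials. */
    static final class LoginValidator {
        private final String expectedUser;
        private final String expectedPassword;

        LoginValidator(String expectedUser, String expectedPassword) {
            this.expectedUser = expectedUser;
            this.expectedPassword = expectedPassword;
        }

        /** Returns true only when both values match; null input is rejected. */
        boolean isValid(String user, String password) {
            return expectedUser.equals(user) && expectedPassword.equals(password);
        }
    }

    /** Fails loudly with the name of the test case that did not hold. */
    static void check(boolean condition, String name) {
        if (!condition) throw new AssertionError("failed: " + name);
    }

    /** Runs every test case and returns how many were executed. */
    static int runAll() {
        // Illustrative credentials only; a real system should store password hashes.
        LoginValidator validator = new LoginValidator("admin", "secret");
        check(validator.isValid("admin", "secret"), "valid user name and password");
        check(!validator.isValid("maha", "secret"), "invalid user name");
        check(!validator.isValid("admin", "hhg"), "invalid password");
        check(!validator.isValid(null, null), "missing input");
        return 4;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " unit tests passed");
    }
}
```

Each check covers one decision branch of the unit: the accepting path, a wrong user name, a wrong password, and missing input.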
Integration testing
Integration tests are performed to test the integration between forms, i.e., whether the
forms work together correctly. For example, the login page passes control to the next
form once validation is complete. This test is designed to test integrated software components
to determine whether they actually run as one program.
Testing is event driven and is more concerned with the basic outcome of screens or fields.
Integration tests demonstrate that although the components were individually satisfactory, as
shown by successful unit testing, the combination of components is also correct and consistent.
Integration testing is specifically aimed at exposing the problems that arise from the
combination of components.
TEST CASES:
Test case | Description                         | Test step                      | Test data | Expected result                          | Actual result                      | Status
TC01      | Admin should enter the user name    | Admin enters valid user name   | admin     | System should accept the data            | System accepts the data            | Pass
          | Admin should enter the user name    | Admin enters invalid user name | maha      | System should not accept the data        | System does not accept the data    | Pass
TC02      | Admin should enter the password     | Admin enters valid password    | admin     | System should accept the data            | System accepts the data            | Pass
          | Admin should enter the password     | Admin enters invalid password  | hhg       | System should not accept the data        | System does not accept the data    | Pass
TC03      | Admin should click the login button | Admin clicks the login button  | Sign up   | System should redirect to the home page. | System redirects to the home page. | Pass
Screen Name: new user Registration