Predicting Hourly Boarding Demand of Bus Passengers
ABSTRACT
Tap-on smart-card data provide a valuable source for learning passengers' boarding
behaviour and predicting future travel demand. However, when the smart-card records
(or instances) are examined by time of day and by boarding stop, positive instances
(i.e. boarding at a specific bus stop at a specific time) are rare compared to negative
instances (not boarding at that bus stop at that time). Imbalanced data have been shown
to significantly reduce the accuracy of machine learning models deployed to predict
hourly boarding numbers at a particular location. This paper addresses the data
imbalance issue in smart-card data before applying it to predict bus boarding demand.
We propose a deep generative adversarial network (Deep-GAN) to generate dummy
travelling instances and build a synthetic training dataset with a more balanced ratio
of travelling to non-travelling instances. The synthetic dataset is then used to train a
deep neural network (DNN) to predict travelling and non-travelling instances at a
particular stop in a given time window. The results show that addressing the data
imbalance issue can significantly improve the predictive model's performance and yield
a better fit to the actual profile of ridership. Comparing the performance of Deep-GAN
with traditional resampling methods shows that the proposed method produces a synthetic
training dataset with higher similarity and diversity and, thus, stronger prediction
power. The paper highlights the significance of, and provides practical guidance on,
improving data quality and model performance for travel behaviour prediction and
individual travel behaviour analysis.
INTRODUCTION
Smart card data have emerged in recent years and provide a comprehensive and
inexpensive source of information for planning and managing public transport systems.
This paper presents a multi-stage machine learning framework to predict passengers'
boarding stops using smart card data.
The framework addresses the challenges arising from the imbalanced nature of the
data (e.g. many non-travelling data) and the ‘many-class’ issues (e.g. many possible
boarding stops) by decomposing the prediction of hourly ridership into three stages:
whether to travel or not in that one-hour time slot, which bus line to use, and at
which stop to board. A simple neural network architecture, fully connected networks
(FCN), and two deep learning architectures, recurrent neural networks (RNN) and
long short-term memory networks (LSTM), are implemented. The proposed approach
is applied to a real-life bus network.
We show that data imbalance has a profound impact on the accuracy of prediction at
the individual level. At the aggregated level, while FCN is able to accurately predict
the ridership at individual stops, it is poor at capturing the temporal distribution of
ridership. RNN and LSTM capture the temporal distribution well but lack the ability
to capture the spatial distribution across bus lines.
Disadvantages
• The data generated by SMOTE and ADASYN are susceptible to outliers. These methods
may generate data inside the majority-class data space because of minority outlier
instances (usually noisy data), blurring the classification borderline and making
learning more difficult for the classification model.
• Under-sampling methods usually pay the price of losing part of the information in
the majority class, because they have to remove a portion of the data. Although Easy
Ensemble and Balance Cascade try to address this loss of information, they multiply
the number of models tens of times, significantly increasing the computational burden.
• Few studies have examined the losses caused by the data imbalance issue in public
transport systems, and no research has validated the efficiency of the existing
resampling methods on imbalanced data in the boarding prediction task. (A sketch of
the over-sampling methods discussed above follows this list.)
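For reference, the two over-sampling methods criticized above are available in the
imbalanced-learn library. A minimal sketch on a toy dataset; the sample sizes and
class weights are illustrative assumptions, not values from this study:

    from collections import Counter
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import SMOTE, ADASYN

    # Toy imbalanced problem standing in for travel / non-travel records.
    X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
    print(Counter(y))                    # minority class is rare, e.g. ~5%

    # SMOTE interpolates between minority neighbours; ADASYN additionally
    # focuses on hard-to-learn minority points. Both can amplify outliers.
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
    print(Counter(y_res))                # classes now balanced
    X_ada, y_ada = ADASYN(random_state=0).fit_resample(X, y)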
Proposed System
• The data imbalance issue in the public transport system has received little attention,
and this study is the first to focus on this issue and propose a deep learning approach,
Deep-GAN, to solve it.
• This study compared the differences in similarity and diversity between the real and
synthetic travelling instances generated by Deep-GAN and other over-sampling methods.
It also compared different resampling methods in terms of the improvement in data
quality, by evaluating the performance of the subsequent travel behaviour prediction
model. This is the first validation and evaluation of the performance of different data
resampling methods based on real data in the public transport system.
• This paper innovatively modelled individual boarding behaviour, which is uncommon
in other travel demand prediction tasks. Compared to the popular aggregated
prediction, this individual-based model is able to provide more details on the
passengers' behaviour, and the results will benefit the analysis of similarities and
heterogeneities among passengers. (A minimal GAN sketch follows this list.)
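The paper's exact Deep-GAN architecture is not reproduced here. The following is a
minimal, illustrative GAN sketch in PyTorch showing the mechanics the bullets
describe: a generator maps random noise to synthetic travelling-instance feature
vectors, while a discriminator learns to tell them from real ones. All layer sizes,
learning rates and the feature dimension are placeholder assumptions:

    import torch
    import torch.nn as nn

    LATENT, FEATURES = 16, 8          # placeholder dimensions

    G = nn.Sequential(                 # generator: noise -> synthetic instance
        nn.Linear(LATENT, 32), nn.ReLU(),
        nn.Linear(32, FEATURES), nn.Tanh(),   # features scaled to [-1, 1]
    )
    D = nn.Sequential(                 # discriminator: instance -> P(real)
        nn.Linear(FEATURES, 32), nn.LeakyReLU(0.2),
        nn.Linear(32, 1), nn.Sigmoid(),
    )

    bce = nn.BCELoss()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

    def train_step(real_batch):
        n = real_batch.size(0)
        ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)
        # 1) discriminator: push real towards 1, generated towards 0
        fake = G(torch.randn(n, LATENT)).detach()
        loss_d = bce(D(real_batch), ones) + bce(D(fake), zeros)
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # 2) generator: try to make the discriminator output 1 on fakes
        fake = G(torch.randn(n, LATENT))
        loss_g = bce(D(fake), ones)
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
        return loss_d.item(), loss_g.item()

    # After training, G(torch.randn(k, LATENT)) yields k synthetic travelling
    # instances to append to the minority class of the training set.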
SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
Front-End : Python.
Back-End : Django-ORM
Service Provider
In this module, the Service Provider has to log in using a valid user name and
password. After a successful login, the provider can perform operations such as:
Browse and Train & Test Data Sets, View Trained and Tested Accuracy in Bar Chart,
View Trained and Tested Accuracy Results, View Prediction of Hourly Boarding
Demand Type, View Hourly Boarding Demand Type Ratio, Download Trained Data Sets,
View Hourly Boarding Demand Type Ratio Results, and View All Remote Users.
Remote User
In this module, there are n users present. A user should register before performing
any operations. Once a user registers, their details are stored in the database.
After successful registration, the user has to log in using the authorized user name
and password. After a successful login, the user can perform operations such as
Register and Login, Predicting Hourly Boarding Demand Type, and View Your Profile.
Decision tree classifiers
Decision tree classifiers are used successfully in many diverse areas. Their most
important feature is the capability of capturing descriptive decision making
knowledge from the supplied data. Decision trees can be generated from training
sets. The procedure for such generation, based on a set of objects S, each
belonging to one of the classes C1, C2, ..., Ck, is as follows:
Step 1. If all the objects in S belong to the same class, for example Ci, the
decision tree for S consists of a leaf labeled with this class.
Step 2. Otherwise, let T be some test with possible outcomes O1, O2, ..., On.
Each object in S has one outcome for T, so the test partitions S into subsets S1,
S2, ..., Sn, where each object in Si has outcome Oi for T. T becomes the root of
the decision tree, and for each outcome Oi we build a subsidiary decision tree by
invoking the same procedure recursively on the set Si.
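A minimal sketch of this recursive procedure in Python. The test-selection rule
(choose_test) is deliberately left abstract, since the text above does not fix one
(information gain is a common choice), and the majority-class fallback is an added
safeguard, not part of the quoted procedure:

    from collections import Counter

    def build_tree(objects, labels, choose_test):
        # Step 1: if every object in S belongs to the same class, return a leaf.
        if len(set(labels)) == 1:
            return {"leaf": labels[0]}
        # Step 2: pick a test T and partition S by each object's outcome for T.
        test = choose_test(objects, labels)
        parts = {}
        for obj, lab in zip(objects, labels):
            parts.setdefault(test(obj), []).append((obj, lab))
        if len(parts) == 1:  # T failed to split S; stop with the majority class
            return {"leaf": Counter(labels).most_common(1)[0][0]}
        return {"test": test,
                "branches": {outcome: build_tree([o for o, _ in part],
                                                 [l for _, l in part],
                                                 choose_test)
                             for outcome, part in parts.items()}}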
Gradient boosting
Gradient boosting is a machine learning technique used in regression and
classification tasks, among others. It gives a prediction model in the form of an
ensemble of weak prediction models, which are typically decision trees.[1][2]
When a decision tree is the weak learner, the resulting algorithm is called
gradient-boosted trees; it usually outperforms random forest. A gradient-boosted
trees model is built in a stage-wise fashion, as in other boosting methods, but it
generalizes them by allowing optimization of an arbitrary differentiable loss
function.
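A hedged sketch of gradient-boosted trees with scikit-learn; the dataset and
hyperparameters are illustrative only, not values used in this project:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Each stage fits a shallow tree to the gradient of the loss of the current
    # ensemble; learning_rate shrinks every stage's contribution (stage-wise fit).
    clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                     max_depth=3, random_state=0)
    clf.fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))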
Logistic Regression
This program computes binary logistic regression and multinomial logistic regression
on both numeric and categorical independent variables. It reports on the regression
equation as well as the goodness of fit, odds ratios, confidence limits, likelihood, and
deviance. It performs a comprehensive residual analysis including diagnostic residual
reports and plots. It can perform an independent variable subset selection search,
looking for the best regression model with the fewest independent variables. It
provides confidence intervals on predicted values and provides ROC curves to help
determine the best cutoff point for classification. It allows you to validate your
results by automatically classifying rows that are not used during the analysis.
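A minimal scikit-learn sketch of the same kind of workflow: binary logistic
regression plus an ROC-based cutoff search. The data and the distance-to-corner
cutoff rule are illustrative assumptions, not the program described above:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    probs = model.predict_proba(X_te)[:, 1]       # P(class = 1)
    fpr, tpr, thresholds = roc_curve(y_te, probs)
    # One simple rule: pick the threshold closest to the ROC top-left corner.
    best = ((fpr ** 2 + (1 - tpr) ** 2) ** 0.5).argmin()
    print("suggested cutoff:", thresholds[best])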
Naïve Bayes
While the naive Bayes classifier is widely used in the research world, it is not
widespread among practitioners who want to obtain usable results. On the one hand,
researchers find it very easy to program and implement, its parameters are easy to
estimate, learning is very fast even on very large databases, and its accuracy is
reasonably good in comparison with other approaches. On the other hand, end users
do not obtain a model that is easy to interpret and deploy, so they do not see the
benefit of such a technique.
Thus, we introduce a new presentation of the results of the learning process. The
classifier becomes easier to understand, and its deployment is also made easier. In
the first part of this tutorial, we present some theoretical aspects of the naive
Bayes classifier. Then, we implement the approach on a dataset with Tanagra. We
compare the obtained results (the parameters of the model) with those obtained with
other linear approaches such as logistic regression, linear discriminant analysis
and the linear SVM. We note that the results are highly consistent, which largely
explains the good performance of the method in comparison with others. In the
second part, we use various tools on the same dataset (Weka 3.6.0, R 2.9.2,
Knime 2.1.1, Orange 2.0b and RapidMiner 4.6.0). We try above all to understand the
obtained results.
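A minimal sketch of a naive Bayes classifier with scikit-learn (Gaussian variant,
illustrative dataset); it shows how quickly the model trains and how directly its
parameters can be inspected, which is the point made above:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Parameters are just per-class feature means and variances.
    model = GaussianNB().fit(X_tr, y_tr)
    print("accuracy:", model.score(X_te, y_te))
    print("per-class feature means:", model.theta_)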
Random Forest
Random forests or random decision forests are an ensemble learning method for
classification, regression and other tasks that operates by constructing a multitude of
decision trees at training time. For classification tasks, the output of the random
forest is the class selected by most trees. For regression tasks, the mean or average
prediction of the individual trees is returned. Random decision forests correct for
decision trees' habit of overfitting to their training set. Random forests generally
outperform decision trees, but their accuracy is lower than that of gradient-boosted
trees. However, data characteristics can affect their performance.
The first algorithm for random decision forests was created in 1995 by Tin Kam
Ho[1] using the random subspace method, which, in Ho's formulation, is a way to
implement the "stochastic discrimination" approach to classification proposed by
Eugene Kleinberg.
An extension of the algorithm was developed by Leo Breiman and Adele Cutler, who
registered "Random Forests" as a trademark in 2006 (as of 2019, owned by Minitab,
Inc.). The extension combines Breiman's "bagging" idea and random selection of
features, introduced first by Ho[1] and later independently by Amit and Geman[13],
in order to construct a collection of decision trees with controlled variance.
Random forests are frequently used as "blackbox" models in businesses, as they
generate reasonable predictions across a wide range of data while requiring little
configuration.
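A minimal scikit-learn sketch; n_estimators and the dataset are illustrative
defaults, with max_features="sqrt" enabling the random feature selection at each
split described above:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Bagging plus random feature selection at each split controls variance.
    clf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                                 random_state=0)
    clf.fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))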
SVM
Support vector machines (SVMs) are supervised learning models that construct a
hyperplane (or a set of hyperplanes) separating the classes with the largest
possible margin; kernel functions allow them to fit non-linear decision boundaries.
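A minimal scikit-learn sketch, with an assumed RBF kernel and default
regularization; dataset and parameters are illustrative only:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # C trades margin width against training errors; the RBF kernel makes
    # the decision boundary non-linear in the original feature space.
    clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))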
Flow Chart: Service Provider (diagram) — Start → Login → Status check (Yes/No) →
Register → Predicting Hourly Boarding Demand Type → Logout.
Flow Chart (diagram) — Start → Login → Status check (Yes/No) → View Trained and
Tested Accuracy Results → Log Out.
Data Flow: Service Provider — Login; Browse and Train & Test Data Sets; View
Trained and Tested Accuracy in Bar Chart; View Trained and Tested Accuracy Results;
View Prediction of Hourly Boarding Demand Type; View Hourly Boarding Demand Type
Ratio; Download Trained Data Sets; View Hourly Boarding Demand Type Ratio Results;
View All Remote Users. Data fields: Fid, Trip ID, Route ID, Stop ID, Stop Name,
Week Beginning, Number of Members Boardings, Prediction.
Class Diagram (recovered labels): Remote User — methods Login(), Register(),
Reset(); attributes User Name, Password, E-mail, Mobile, Address, DOB, Gender,
Pin code, Image. Server — data fields Fid, Trip ID, Route ID, Stop ID, Stop Name,
Week Beginning, Number of Members Boardings, Prediction.
Use Case Diagram (recovered labels): Service Provider — Browse and Train & Test
Data Sets; Remote User — Predicting Hourly Boarding Demand Type.
Python is Interpreted: Python is processed at runtime by the interpreter. You do not need
to compile your program before executing it. This is similar to PERL and PHP.
Python is Interactive: You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
Python is Object-Oriented: Python supports Object-Oriented style or technique of
programming that encapsulates code within objects.
Python is a Beginner's Language: Python is a great language for the beginner-level
programmers and supports the development of a wide range of applications from simple
text processing to WWW browsers to games.
Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.
Python is copyrighted. Its source code is available under the Python Software
Foundation License, which is GPL-compatible.
Python is now maintained by a core development team, although Guido van Rossum
still holds a vital role in directing its progress.
Easy-to-learn: Python has few keywords, simple structure, and a clearly defined syntax.
This allows the student to pick up the language quickly.
Easy-to-read: Python code is more clearly defined and visible to the eyes.
Easy-to-maintain: Python's source code is fairly easy-to-maintain.
A broad standard library: Python's bulk of the library is very portable and cross-platform
compatible on UNIX, Windows, and Macintosh.
Interactive Mode: Python has support for an interactive mode which allows interactive
testing and debugging of snippets of code.
Portable: Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
Extendable: You can add low-level modules to the Python interpreter. These modules
enable programmers to add to or customize their tools to be more efficient.
Databases: Python provides interfaces to all major commercial databases.
GUI Programming: Python supports GUI applications that can be created and ported to
many system calls, libraries and windows systems, such as Windows MFC, Macintosh,
and the X Window system of Unix.
Scalable: Python provides a better structure and support for large programs than shell
scripting.
Arithmetic operators (the examples assume a = 10, b = 20, consistent with the
results shown):
- (Subtraction): subtracts the right-hand operand from the left-hand operand.
a - b = -10
% (Modulus): divides the left-hand operand by the right-hand operand and returns
the remainder. b % a = 0
2.2 BITWISE, LOGICAL AND MEMBERSHIP OPERATORS
Bitwise operators (the examples assume a = 60, i.e. 0011 1100, and b = 13,
i.e. 0000 1101, consistent with the binary values shown):
& (Binary AND): copies a bit to the result if it exists in both operands.
(a & b) = 12 (0000 1100)
^ (Binary XOR): copies the bit if it is set in one operand but not both.
(a ^ b) = 49 (0011 0001)
~ (Binary Ones Complement): unary; has the effect of 'flipping' bits.
(~a) = -61 (1100 0011 in 2's complement form, as a signed binary number)
<< (Binary Left Shift): the left operand's value is moved left by the number of
bits specified by the right operand. a << 2 = 240 (1111 0000)
>> (Binary Right Shift): the left operand's value is moved right by the number of
bits specified by the right operand. a >> 2 = 15 (0000 1111)
Logical operators:
and (Logical AND): if both operands are true, the condition becomes true.
(a and b) is true.
not (Logical NOT): reverses the logical state of its operand. not(a and b) is false.
not in: evaluates to true if it does not find a variable in the specified sequence,
and false otherwise. x not in y returns 1 if x is not a member of sequence y.
Operator precedence (highest level shown):
~ + - : complement, unary plus and minus (method names for the last two are
+@ and -@)
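The values in the rows above can be checked directly at the Python prompt:

    a, b = 10, 20           # values used in the arithmetic rows
    print(a - b)            # -10
    print(b % a)            # 0

    a, b = 60, 13           # 0011 1100 and 0000 1101, as in the bitwise rows
    print(a & b)            # 12
    print(a ^ b)            # 49
    print(~a)               # -61
    print(a << 2)           # 240
    print(a >> 2)           # 15
    print(bool(a and b))    # True
    print(not (a and b))    # False
    print(3 not in [1, 2])  # True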
3.1 LIST
The list is the most versatile data type available in Python; it can be written as a
list of comma-separated values (items) between square brackets. The important thing
about a list is that its items need not be of the same type.
Creating a list is as simple as putting different comma-separated values between square brackets.
For example −
list2 = [1, 2, 3, 4, 5 ];
1. cmp(list1, list2): Compares elements of both lists (Python 2 only).
2. len(list): Gives the total length of the list.
3. max(list): Returns the item from the list with the maximum value.
4. min(list): Returns the item from the list with the minimum value.
5. list(seq): Converts a tuple into a list.
1. list.append(obj): Appends object obj to the list.
2. list.count(obj): Returns a count of how many times obj occurs in the list.
3. list.extend(seq): Appends the contents of seq to the list.
4. list.index(obj): Returns the lowest index in the list at which obj appears.
5. list.insert(index, obj): Inserts obj into the list at offset index.
6. list.pop(obj=list[-1]): Removes and returns the last object (or obj) from the list.
7. list.remove(obj): Removes obj from the list.
8. list.reverse(): Reverses the objects of the list in place.
9. list.sort([func]): Sorts the objects of the list, using a comparison func if given.
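A short illustrative session combining several of the methods above:

    list1 = ['physics', 'chemistry', 1997, 2000]
    print(len(list1))         # 4
    list1.append(2001)        # add at the end
    list1.insert(0, 'math')   # add at a given index
    list1.remove(1997)        # remove by value
    list1.reverse()           # reverse in place
    print(list1)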
3.2 TUPLES
A tuple is a sequence of immutable Python objects. Tuples are sequences, just like
lists. The differences between tuples and lists are that tuples cannot be changed,
unlike lists, and tuples use parentheses, whereas lists use square brackets.
Creating a tuple is as simple as putting different comma-separated values. Optionally we can put
these comma-separated values between parentheses also. For example −
tup1 = ();
To write a tuple containing a single value you have to include a comma, even though there is only
one value −
tup1 = (50,);
Like string indices, tuple indices start at 0, and tuples can be sliced,
concatenated, and so on. For example, with tup1 = ('physics', 'chemistry', 1997,
2000) and tup2 = (1, 2, 3, 4, 5, 6, 7):
tup1[0]: physics
tup2[1:5]: (2, 3, 4, 5)
Updating Tuples:
Tuples are immutable, which means you cannot update or change the values of tuple
elements. You can, however, take portions of existing tuples to create new tuples.
To explicitly remove an entire tuple, just use the del statement. For example:
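A minimal sketch of both operations (building a new tuple from existing ones, then
deleting it):

    tup1 = (12, 34.56)
    tup2 = ('abc', 'xyz')
    tup3 = tup1 + tup2   # create a new tuple from existing ones
    print(tup3)          # (12, 34.56, 'abc', 'xyz')
    del tup3             # removes the tuple object entirely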
1. cmp(tuple1, tuple2): Compares elements of both tuples (Python 2 only).
2. len(tuple): Gives the total length of the tuple.
3. max(tuple): Returns the item from the tuple with the maximum value.
4. min(tuple): Returns the item from the tuple with the minimum value.
5. tuple(seq): Converts a list into a tuple.
3.3 DICTIONARY
Each key is separated from its value by a colon (:), the items are separated by commas, and the
whole thing is enclosed in curly braces. An empty dictionary without any items is written with just
two curly braces, like this: {}.
Keys are unique within a dictionary while values may not be. The values of a dictionary can be of
any type, but the keys must be of an immutable data type such as strings, numbers, or tuples.
For example, with dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}:
dict['Name']: Zara
dict['Age']: 7
Updating Dictionary
We can update a dictionary by adding a new entry or key-value pair, modifying an
existing entry, or deleting an existing entry, as in the simple example below:
dict['Age'] = 8            # update an existing entry
dict['School'] = "DPS School"   # add a new entry
Result −
dict['Age']: 8
dict['School']: DPS School
To explicitly remove an entire dictionary, just use the del statement (del dict).
1. cmp(dict1, dict2): Compares elements of both dictionaries (Python 2 only).
2. len(dict): Gives the total length of the dictionary, i.e. the number of items
in the dictionary.
3. str(dict): Produces a printable string representation of the dictionary.
4. type(variable): Returns the type of the passed variable. If the passed variable
is a dictionary, it returns a dictionary type.
Among the dictionary methods:
dict.fromkeys(seq[, value]): Creates a new dictionary with keys from seq and
values set to value.
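A short illustrative session with the functions and method above:

    dict1 = {'Name': 'Zara', 'Age': 7}
    print(len(dict1))                     # 2
    print(str(dict1))                     # printable string form
    dict2 = dict.fromkeys(['a', 'b'], 0)  # {'a': 0, 'b': 0}
    print(type(dict2))                    # a dictionary type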
A function is a block of organized, reusable code that is used to perform a single, related action.
Functions provide better modularity for your application and a high degree of code
reuse. Python gives you many built-in functions like print(), etc., but you can also
create your own functions. These functions are called user-defined functions.
Defining a Function
Simple rules to define a function in Python.
Function blocks begin with the keyword def followed by the function name and parentheses
( ( ) ).
Any input parameters or arguments should be placed within these parentheses.
The first statement of a function can be an optional statement - the documentation string of
the function or docstring.
The code block within every function starts with a colon (:) and is indented.
The statement return [expression] exits a function, optionally passing back an expression to
the caller. A return statement with no arguments is the same as return None.
Calling a Function
Defining a function only gives it a name, specifies the parameters that are to be
included in the function, and structures the blocks of code. Once the basic
structure of a function is finalized, you can execute it by calling it from another
function or directly from the Python prompt. Following is an example that defines
and then calls a printme() function:
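The printme() example itself did not survive extraction; the following is a minimal
reconstruction consistent with the rules above (docstring, indented body, return):

    def printme(str):
        "This prints a passed string into this function"
        print(str)
        return

    printme("I'm first call to user defined function!")
    printme("Again second call to the same function")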
Function Arguments
You can call a function by using the following types of formal arguments (a short
sketch covering all four follows this list):
Required arguments
Keyword arguments
Default arguments
Variable-length arguments
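A compact, illustrative sketch covering all four argument types in one function
(the function and parameter names are hypothetical):

    def describe(name, greeting="Hello", *hobbies, **extra):
        # name: required; greeting: default; *hobbies: variable-length;
        # **extra collects additional keyword arguments.
        print("%s, %s! hobbies=%s extra=%s" % (greeting, name, hobbies, extra))

    describe("Zara")                           # required argument only
    describe(name="Zara", greeting="Hi")       # keyword arguments
    describe("Zara", "Hey", "cycling", age=7)  # default overridden + varargs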
Scope of Variables
All variables in a program may not be accessible at all locations in that program. This depends on
where you have declared a variable.
The scope of a variable determines the portion of the program where you can access a
particular identifier. There are two basic scopes of variables in Python: global
variables and local variables. This means that local variables can be accessed only
inside the function in which they are declared, whereas global variables can be
accessed throughout the program body by all functions.
When you call a function, the variables declared inside it are brought into scope. Following is a
simple example −
total = 0  # global variable
def sum(arg1, arg2):
    total = arg1 + arg2  # local variable; shadows the global one
    print("Inside the function local total : %s" % total)
    return total
sum(10, 20)
print("Outside the function global total : %s" % total)
Result −
Inside the function local total : 30
Outside the function global total : 0
A module allows you to logically organize your Python code. Grouping related code into a module
makes the code easier to understand and use. A module is a Python object with arbitrarily named
attributes that you can bind and reference.Simply, a module is a file consisting of Python code. A
module can define functions, classes and variables. A module can also include runnable code.
Example:
The Python code for a module named aname normally resides in a file named aname.py.
Here's an example of a simple module, support.py:
def print_func(par):
    print("Hello : %s" % par)
    return
When the interpreter encounters an import statement, it imports the module if the
module is present in the search path. A search path is a list of directories that
the interpreter searches before importing a module. For example, to import the
module support.py, you need to put the following command at the top of the script:
import support
support.print_func("Zara")
A module is loaded only once, regardless of the number of times it is imported. This prevents the
module execution from happening over and over again if multiple imports occur.
Packages in Python
A package is a hierarchical file directory structure that defines a single Python application
environment that consists of modules and sub packages and sub-sub packages.
Consider a file Pots.py available in the Phone directory. This file has the
following lines of source code:
def Pots():
    print("I'm Pots Phone")
In a similar way, we have another two files, Isdn.py and G3.py, containing functions
with the same names as their files:
Phone/__init__.py
To make all of your functions available when you've imported Phone, put explicit
import statements in __init__.py as follows:
from Pots import Pots
from Isdn import Isdn
from G3 import G3
After you add these lines to __init__.py, you have all of these classes available when you import
the Phone package.
import Phone
Phone.Pots()
Phone.Isdn()
Phone.G3()
RESULT:
I'm Pots Phone
I'm ISDN Phone
I'm 3G Phone
In the above example, we have taken the example of a single function in each file,
but you can keep multiple functions in your files. You can also define different
Python classes in those files and then create your packages out of those classes.
This chapter covers all the basic I/O functions available in Python.
The simplest way to produce output is using the print statement, where you can pass
zero or more expressions separated by commas. It converts the expressions you pass
into a string and writes the result to standard output.
Python provides two built-in functions to read a line of text from standard input, which by default
comes from the keyboard. These functions are −
raw_input
input
The raw_input([prompt]) function reads one line from standard input and returns it as a string
(removing the trailing newline).
str = raw_input("Enter your input: ");
print "Received input is : ", str
This prompts you to enter any string, and it displays the same string on the screen.
For example, after typing "Hello Python!", the script prints:
Received input is : Hello Python!
The input([prompt]) function is equivalent to raw_input, except that it assumes the
input is a valid Python expression and returns the evaluated result.
Until now, you have been reading and writing to the standard input and output. Now, we will see
how to use actual data files.
Python provides basic functions and methods necessary to manipulate files by default. You can do
most of the file manipulation using a file object.
Before you can read or write a file, you have to open it using Python's built-in open() function.
This function creates a file object, which would be utilized to call other support methods
associated with it.
Syntax
file object = open(file_name [, access_mode][, buffering])
file_name: The file_name argument is a string value that contains the name of the file that
you want to access.
access_mode: The access_mode determines the mode in which the file has to be opened,
i.e., read, write, append, etc. A complete list of possible values is given in the
table below. This is an optional parameter, and the default file access mode is
read (r).
buffering: If the buffering value is set to 0, no buffering takes place. If the buffering value
is 1, line buffering is performed while accessing a file. If you specify the buffering value as
an integer greater than 1, then buffering action is performed with the indicated buffer size.
If negative, the buffer size is the system default(default behavior).
Modes and Description:
r: Opens a file for reading only. The file pointer is placed at the beginning of
the file. This is the default mode.
rb: Opens a file for reading only in binary format. The file pointer is placed at
the beginning of the file.
r+: Opens a file for both reading and writing. The file pointer is placed at the
beginning of the file.
rb+: Opens a file for both reading and writing in binary format. The file pointer
is placed at the beginning of the file.
w: Opens a file for writing only. Overwrites the file if it exists; if it does not
exist, creates a new file for writing.
wb: Opens a file for writing only in binary format. Overwrites the file if it
exists; if it does not exist, creates a new file for writing.
w+: Opens a file for both writing and reading. Overwrites the existing file if it
exists; if it does not exist, creates a new file for reading and writing.
wb+: Opens a file for both writing and reading in binary format. Overwrites the
existing file if it exists; if it does not exist, creates a new file for reading
and writing.
a: Opens a file for appending. The file pointer is at the end of the file if it
exists; if it does not exist, a new file is created for writing.
ab: Opens a file for appending in binary format. The file pointer is at the end of
the file if it exists; if it does not exist, a new file is created for writing.
a+: Opens a file for both appending and reading. The file pointer is at the end of
the file if it exists; if it does not exist, a new file is created for reading and
writing.
ab+: Opens a file for both appending and reading in binary format. The file pointer
is at the end of the file if it exists; if it does not exist, a new file is created
for reading and writing.
Once a file is opened and you have a file object, you can get various information
related to that file:
file.closed: Returns true if the file is closed, false otherwise.
file.mode: Returns the access mode with which the file was opened.
file.name: Returns the name of the file.
file.softspace: Returns false if space explicitly required with print, true otherwise.
Example
# Open a file
fo = open("foo.txt", "wb")
print "Name of the file: ", fo.name
print "Closed or not : ", fo.closed
print "Opening mode : ", fo.mode
print "Softspace flag : ", fo.softspace
The close() method of a file object flushes any unwritten information and closes the
file object, after which no more writing can be done. Python automatically closes a
file when the reference object of a file is reassigned to another file, but it is a
good practice to use the close() method to close a file.
Syntax
fileObject.close();
Example
# Open a file
fo = open("foo.txt", "wb")
print "Name of the file: ", fo.name
# Close opened file
fo.close()
Result −
Name of the file:  foo.txt
The file object provides a set of access methods to make our lives easier. We would see how to
use read() and write() methods to read and write files.
The write() method writes any string to an open file. It is important to note that
Python strings can have binary data and not just text. The write() method does not
add a newline character ('\n') to the end of the string.
Syntax
fileObject.write(string);
Here, passed parameter is the content to be written into the opened file. Example
# Open a file
fo = open("foo.txt", "wb")
fo.write( "Python is a great language.\nYeah its great!!\n");
# Close opened file
fo.close()
The above method would create foo.txt file and would write given content in that file and finally it
would close that file. If you would open this file, it would have following content.
The read() method reads a string from an open file. It is important to note that
Python strings can have binary data, apart from text data.
Syntax
fileObject.read([count]);
Here, passed parameter is the number of bytes to be read from the opened file. This method starts
reading from the beginning of the file and if count is missing, then it tries to read as much as
possible, maybe until the end of file.
Example
# Open a file
fo = open("foo.txt", "r+")
str = fo.read(10);
print "Read String is : ", str
# Close opened file
fo.close()
File Positions
The tell() method tells you the current position within the file; in other words, the next read or
write will occur at that many bytes from the beginning of the file.
The seek(offset[, from]) method changes the current file position. The offset argument indicates
the number of bytes to be moved. The from argument specifies the reference position from where
the bytes are to be moved.
If from is set to 0, it means use the beginning of the file as the reference position and 1 means use
the current position as the reference position and if it is set to 2 then the end of the file would be
taken as the reference position.
Example
# Open a file
fo = open("foo.txt", "r+")
str = fo.read(10)
print "Read String is : ", str
print "Current file position : ", fo.tell()   # tell() reports the offset
fo.seek(0, 0)                                  # rewind to the beginning
print "Again read String is : ", fo.read(10)
fo.close()
Python os module provides methods that help you perform file-processing operations, such as
renaming and deleting files.
To use this module you need to import it first and then you can call any related functions.
The rename() Method
The rename() method takes two arguments, the current filename and the new filename.
Syntax
os.rename(current_file_name, new_file_name)
Example
import os
# Rename a file from test1.txt to test2.txt (illustrative file names).
os.rename("test1.txt", "test2.txt")
You can use the remove() method to delete files by supplying the name of the file to be deleted as
the argument.
Syntax
os.remove(file_name)
Example
#!/usr/bin/python
import os
# Delete an existing file test2.txt (illustrative file name).
os.remove("test2.txt")
Directories in Python
All files are contained within various directories, and Python has no problem handling these too.
The os module has several methods that help you create, remove, and change directories.
The mkdir() Method
You can use the mkdir() method of the os module to create directories in the current directory.
You need to supply an argument to this method which contains the name of the directory to be
created.
Syntax
os.mkdir("newdir")
Example
#!/usr/bin/python
import os
# Create a directory "test" in the current directory.
os.mkdir("test")
You can use the chdir() method to change the current directory. The chdir() method takes an
argument, which is the name of the directory that you want to make the current directory.
Syntax
os.chdir("newdir")
Example
#!/usr/bin/python
import os
# Change the current working directory to /home/newdir (illustrative path).
os.chdir("/home/newdir")
The getcwd() method displays the current working directory.
Example
import os
print(os.getcwd())
The rmdir() method deletes the directory, which is passed as an argument in the method.
Syntax:
os.rmdir('dirname')
Example
Following is the example to remove "/tmp/test" directory. It is required to give fully qualified
name of the directory, otherwise it would search for that directory in the current directory.
import os
# This would remove "/tmp/test" directory.
os.rmdir( "/tmp/test" )
There are two important sources which provide a wide range of utility methods to
handle and manipulate files and directories on Windows and Unix operating systems:
File Object Methods: the file object provides functions to manipulate files.
OS Object Methods: the os module provides methods to process files as well as
directories.
Python provides two very important features to handle any unexpected error in your
Python programs and to add debugging capabilities to them:
Exception Handling: covered in this tutorial; a list of the standard exceptions
available in Python is given below.
Assertions: covered in Assertions in Python.
EXCEPTION NAME — DESCRIPTION
StopIteration: Raised when the next() method of an iterator does not point to any
object.
StandardError: Base class for all built-in exceptions except StopIteration and
SystemExit.
ArithmeticError: Base class for all errors that occur for numeric calculation.
ZeroDivisionError: Raised when division or modulo by zero takes place for all
numeric types.
EOFError: Raised when there is no input from either the raw_input() or input()
function and the end of file is reached.
KeyError: Raised when the specified key is not found in the dictionary.
IOError: Raised when an input/output operation fails, such as the print statement
or the open() function when trying to open a file that does not exist.
SystemError: Raised when the interpreter finds an internal problem; when this error
is encountered, the Python interpreter does not exit.
ValueError: Raised when a built-in function for a data type receives arguments of
the valid type but with invalid values.
RuntimeError: Raised when a generated error does not fall into any category.
What is Exception?
An exception is an event, which occurs during the execution of a program that
disrupts the normal flow of the program's instructions. In general, when a Python
script encounters a situation that it cannot cope with, it raises an exception. An
exception is a Python object that represents an error.
When a Python script raises an exception, it must either handle the exception
immediately or it terminates and quits.
Handling an exception
If you have some suspicious code that may raise an exception, you can defend your
program by placing the suspicious code in a try: block. After the try: block, include
an except: statement, followed by a block of code which handles the problem as
elegantly as possible.
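A minimal sketch of this pattern ("testfile" is a hypothetical file name):

    try:
        fh = open("testfile", "r")   # may raise IOError if the file is absent
        line = fh.readline()
    except IOError:
        print("Error: can't find file or read data")
    else:
        print("Read line: " + line)
        fh.close()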
The Python standard for database interfaces is the Python DB-API. Most Python database
interfaces adhere to this standard.
You can choose the right database for your application. Python Database API supports a wide
range of database servers such as −
GadFly
mSQL
MySQL
PostgreSQL
Microsoft SQL Server 2000
Informix
Interbase
Oracle
Sybase
The DB API provides a minimal standard for working with databases using Python
structures and syntax wherever possible. This API includes importing the API module,
acquiring a connection with the database, issuing SQL statements and stored
procedures, and closing the connection. A minimal sketch follows.
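A minimal DB-API sketch using the standard-library sqlite3 module (which follows
the DB-API); the table and values are illustrative, loosely themed on this
project's data fields:

    import sqlite3

    conn = sqlite3.connect(":memory:")     # acquire a connection
    cur = conn.cursor()
    cur.execute("CREATE TABLE boarding (stop TEXT, hour INTEGER, count INTEGER)")
    cur.execute("INSERT INTO boarding VALUES (?, ?, ?)", ("Stop A", 8, 42))
    conn.commit()                          # persist the statement
    cur.execute("SELECT * FROM boarding WHERE hour = ?", (8,))
    print(cur.fetchall())                  # [('Stop A', 8, 42)]
    conn.close()                           # release the connection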
TESTING METHODOLOGIES
o Unit Testing.
o Integration Testing.
o User Acceptance Testing.
o Output Testing.
o Validation Testing.
Unit Testing
Unit testing focuses verification effort on the smallest unit of software design,
that is, the module. Unit testing exercises specific paths in a module's control
structure to ensure complete coverage and maximum error detection. This test focuses
on each module individually, ensuring that it functions properly as a unit; hence
the name unit testing.
During this testing, each module is tested individually, and the module interfaces
are verified for consistency with the design specification. All important processing
paths are tested for the expected results, and all error handling paths are also
tested.
Integration Testing
Integration testing addresses the issues associated with the dual problems of verification
and program construction. After the software has been integrated, a set of
high-order tests is conducted. The main objective of this testing process is to take
unit-tested modules and build a program structure that has been dictated by design.
The following are the types of Integration Testing:
1. Top-down Integration
This method is an incremental approach in which modules are integrated by moving
downward through the control hierarchy, beginning with the main control module.
2. Bottom-up Integration
This method begins the construction and testing with the modules at the lowest level in the
program structure. Since the modules are integrated from the bottom up, processing required for
modules subordinate to a given level is always available and the need for stubs is eliminated. The
bottom up integration strategy may be implemented with the following steps:
The low-level modules are combined into clusters that perform a specific software
sub-function.
A driver (i.e., a control program for testing) is written to coordinate test case
input and output.
The cluster is tested.
Drivers are removed and clusters are combined, moving upward in the program
structure.
The bottom-up approach tests each module individually, and then each module is
integrated with a main module and tested for functionality.
User Acceptance Testing
User acceptance of a system is the key factor for the success of any system. The
system under consideration was tested for user acceptance by constantly keeping in
touch with prospective system users at the time of development and making changes
wherever required. The system developed provides a friendly user interface that can
easily be understood even by a person who is new to the system.
Output Testing
After performing validation testing, the next step is output testing of the proposed
system, since no system can be useful if it does not produce the required output in
the specified format. The outputs generated or displayed by the system under
consideration are tested by asking the users about the format they require. The
output format is considered in two ways: one on screen and the other in printed
format.
Validation Testing
Text Field:
The text field can contain only a number of characters less than or equal to its
size. The text fields are alphanumeric in some tables and alphabetic in other
tables. An incorrect entry always flashes an error message.
Numeric Field:
The numeric field can contain only numbers from 0 to 9. An entry of any other
character flashes an error message. The individual modules are checked for accuracy
and for what they have to perform. Each module is subjected to a test run along with
sample data. The individually tested modules are then integrated into a single
system. Testing involves executing the program with real data; the existence of any
program defect is inferred from the output. The testing should be planned so that
all the requirements are individually tested.
A successful test is one that brings out the defects for inappropriate data and
produces output revealing the errors in the system.
Preparation of Test Data
The above testing is done by taking various kinds of test data. Preparation of test
data plays a vital role in system testing. After preparing the test data, the system
under study is tested using that test data. While testing the system with test data,
errors are again uncovered and corrected by using the above testing steps, and the
corrections are also noted for future use.
Live test data are those that are actually extracted from organization files. After a system is
partially constructed, programmers or analysts often ask users to key in a set of data from their
normal activities. Then, the systems person uses this data as a way to partially test the system. In
other instances, programmers or analysts extract a set of live data from the files and have them
entered themselves.
It is difficult to obtain live data in sufficient amounts to conduct extensive testing. And,
although it is realistic data that will show how the system will perform for the typical processing
requirement, assuming that the live data entered are in fact typical, such data generally will not test
all combinations or formats that can enter the system. This bias toward typical values then does not
provide a true systems test and in fact ignores the cases most likely to cause system failure.
Artificial test data are created solely for test purposes, since they can be
generated to test all combinations of formats and values. In other words, the
artificial data, which can quickly be prepared by a data generating utility program
in the information systems department, make possible the testing of all logic and
control paths through the program.
The most effective test programs use artificial test data generated by persons other than
those who wrote the programs. Often, an independent team of testers formulates a testing plan,
using the systems specifications.
The developed package has satisfied all the requirements specified in the software
requirement specification and was accepted.
Whenever a new system is developed, user training is required to educate them about the
working of the system so that it can be put to efficient use by those for whom the system has been
primarily designed. For this purpose the normal working of the project was demonstrated to the
prospective users. Its working is easily understandable and since the expected users are people who
have good knowledge of computers, the use of this system is very easy.
7.3 MAINTENANCE
This covers a wide range of activities including correcting code and design errors. To
reduce the need for maintenance in the long run, we have more accurately defined the user’s
requirements during the process of system development. Depending on the requirements, this
system has been developed to satisfy the needs to the largest possible extent. With development in
technology, it may be possible to add many more features based on the requirements in future. The
coding and designing is simple and easy to understand which will make maintenance easier.
TESTING STRATEGY :
A strategy for system testing integrates system test cases and design techniques
into a well-planned series of steps that results in the successful construction of
software. The testing strategy must incorporate test planning, test case design,
test execution, and the resultant data collection and evaluation. A strategy for
software testing must accommodate low-level tests that are necessary to verify that
a small source code segment has been correctly implemented, as well as high-level
tests that validate major system functions against user requirements.
Software testing is a critical element of software quality assurance and represents
the ultimate review of specification, design and coding. Testing presents an
interesting anomaly for the software engineer. Thus, a series of tests are performed
on the proposed system before the system is ready for user acceptance testing.
SYSTEM TESTING:
Software, once validated, must be combined with other system elements (e.g.
hardware, people, databases). System testing verifies that all the elements mesh
properly and that overall system function and performance are achieved. It also
tests to find discrepancies between the system and its original objective, current
specifications and system documentation.
UNIT TESTING:
In unit testing, different modules are tested against the specifications produced
during the design of the modules. Unit testing is essential for verification of the
code produced during the coding phase, and hence the goal is to test the internal
logic of the modules. Using the detailed design description as a guide, important
control paths are tested to uncover errors within the boundary of the modules. This
testing is carried out during the programming stage itself. In this type of testing
step, each module was found to be working satisfactorily as regards the expected
output from the module.
In due course, the latest technology advancements will be taken into consideration.
As part of the technical build-up, many components of the networking system will be
generic in nature so that future projects can either use or interact with them. The
future holds a lot to offer to the development and refinement of this project.
Implementation:
This project is built with Django, so for implementation run the development-server
command inside the project directory (see below), then paste the served URL into the
browser to reach the user interface.
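The command itself is missing from the text; the standard Django development-server
command is:

    python manage.py runserver
    # then open the address it prints, typically http://127.0.0.1:8000/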
CONCLUSION
This study was motivated by the challenge of imbalanced data that we faced when
using real-world bus smart-card data to predict the boarding behavior of passengers
in a given time window. In this research, we proposed a Deep-GAN to over-sample the
travelling instances and re-balance the ratio of travelling to non-travelling
instances in the smart-card dataset, in order to improve a DNN-based prediction
model of individual boarding behavior. The performance of Deep-GAN was evaluated by
applying the models to real-world smart-card data collected from seven bus lines in
the city of Changsha, China. Comparing different imbalance ratios in the training
dataset, we found that, in general, the performance of the model improves as the
imbalance in the training data is reduced, with the most significant improvement
coming at a 1:5 ratio between positive and negative instances. From the perspective
of the prediction accuracy of the hourly distribution of bus ridership, a high rate
of imbalance produces misleading load profiles, while absolutely balanced data may
over-predict the ridership during peak hours. Comparison of different resampling
methods reveals that both over-sampling and under-sampling benefit the performance
of the model. Deep-GAN has the best recall score, and its precision score is the
best among the over-sampling methods. Although the performance of the predictive
model trained on the Deep-GAN data is not significantly beyond that of the other
resampling methods, Deep-GAN presents a powerful ability to improve the quality of
the training dataset and the performance of predictive models, especially when
under-sampling is not suitable for the data.
The contributions of this study are:
• The data imbalance issue in the public transport system has received little attention,
and this study is the first to focus on this issue and propose a deep learning approach,
Deep-GAN, to solve it.
• This study compared the differences in similarity and diversity between the real
and synthetic travelling instances generated by Deep-GAN and other over-sampling
methods. It also compared different resampling methods in terms of the improvement
in data quality, by evaluating the performance of the subsequent travel behavior
prediction model. This is the first validation and evaluation of the performance of
different data resampling methods based on real data in the public transport system.
• This paper innovatively modeled individual boarding behavior, which is uncommon in
other travel demand prediction tasks. Compared to the popular aggregated prediction,
this individual-based model is able to provide more details on the passengers'
behavior, and the results will benefit the analysis of similarities and
heterogeneities among passengers.