Social Media Tourism Project


SOCIAL MEDIA TOURISM PROJECT

JASMEEN KAUR
PROBLEM STATEMENT
An aviation company that offers domestic and international trips wants to take a targeted approach instead of
reaching out to every customer, and this time it wants to do so digitally rather than by telecalling. It has
therefore collaborated with a social networking platform to learn the digital and social behavior of customers and
to show digital advertisements on the user pages of targeted customers who have a high propensity to take up the
product. The propensity to buy tickets differs by login device, so two separate models are required, one for Laptop
and one for Mobile. [Anything that is not a laptop can be considered mobile phone usage.] Advertisements on the
digital platform are fairly expensive, so the models need to be very accurate.

UNDERSTANDING AND NEED OF THE STUDY/PROJECT


In our day-to-day life we spend much of our time on social media, browsing websites, shopping online and
visiting entertainment pages, which suggests that we like to spend most of our free time on the internet.
Brands promote their products on social media because it has become part of nearly everyone’s daily routine
and no other platform is growing as fast.
Predictive analytics is likely to become increasingly popular: by analyzing big data from social media,
companies can identify the hallmarks of a customer who is about to stop doing business with them and take
steps to prevent it. They can also identify behaviors shared by different customers to see what makes them
more likely to become power users or to fully embrace the company’s ideals, and then use this information to
convert social media users into customers.
Our problem statement concerns the aviation industry. We will analyze the given data using Python and
perform univariate and bivariate analysis as part of EDA. In this project we will try to predict and analyze the
digital and social behavior of customers so that digital advertisements can be shown on the user pages of
targeted customers who have a high propensity to take up the product.
Since the propensity to buy tickets differs by login device, we will build two separate models, one for Laptop and one for Mobile.

DATA REPORT

TOTAL DATA
(11760, 17), i.e. 11,760 rows and 17 columns

From the data we can observe that some columns are of object type, which means they contain stray
characters. This is bad data that we have to clean and convert to int/float. Some features have missing
values, which we have to treat as well. The data also contains "*" entries, which we can either convert to
missing values or replace with the mode. We will also drop the user id feature.
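
The checks described above can be sketched in pandas as follows. This is a minimal sketch on toy data, not the code used for the report: the column name UserID is an assumption, and the values are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the real data. "UserID" is an assumed column name;
# the values are invented.
df = pd.DataFrame({
    "UserID": [1, 2, 3, 4],
    "yearly_avg_Outstation_checkins": ["1", "*", "24", None],
    "preferred_location_type": ["Beach", "Financial", None, "Beach"],
})

# Columns that are not numeric are candidates for cleaning.
non_numeric = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]

# errors="coerce" turns "*" (and any other non-numeric entry) into NaN.
col = "yearly_avg_Outstation_checkins"
df[col] = pd.to_numeric(df[col], errors="coerce")

# Share of missing values per column, in percent.
missing_pct = df.isna().mean().mul(100).round(1)

# Drop the identifier, which is not used as a model feature.
df = df.drop(columns=["UserID"])
```

Note that coercion also hides any other unexpected characters, so it is worth inspecting the raw values before relying on it.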

DATA PRE-PROCESSING
In the column “preferred_location_type”, the category "Tours and Travel" also appears under the slightly
different label "Tours Travel"; we have to clean this and merge the two into a single category.
In the column “yearly_avg_Outstation_checkins” we have
"*" entries, which we can either convert to missing values or replace with the mode.
Such features need data cleansing.
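
A minimal sketch of both cleaning steps, assuming the duplicate label is spelled exactly "Tours Travel" and choosing the mode option for "*". The values are toy data.

```python
import pandas as pd

# Toy values; the real column has more categories.
locations = pd.Series(
    ["Tours and Travel", "Tours Travel", "Beach", "Financial"],
    name="preferred_location_type",
)
# Merge the variant spelling into a single category.
locations = locations.replace({"Tours Travel": "Tours and Travel"})

checkins = pd.Series(["1", "*", "24", "1", "7"],
                     name="yearly_avg_Outstation_checkins")
# Mode of the valid entries, ignoring the "*" placeholder.
mode_value = checkins[checkins != "*"].mode().iloc[0]
# Replace "*" with the mode, then convert the column to numbers.
checkins = pd.to_numeric(checkins.replace("*", mode_value))
```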
AFTER PRE-PROCESSING THE DATA

preferred_location_type: BEFORE and AFTER [figures not reproduced]

yearly_avg_Outstation_checkins [figure not reproduced]

Since the propensity to buy tickets differs by login device, we will build two separate models, one for Laptop and one for Mobile.

Converting all device values other than Laptop to Mobile
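
One way to express this conversion and the resulting split is sketched below. The column names preferred_device and Taken_product and the device labels are assumptions for illustration.

```python
import pandas as pd

# Assumed column names and device labels; toy data only.
df = pd.DataFrame({
    "preferred_device": ["Laptop", "iOS", "Android", "Tab", "Laptop"],
    "Taken_product": ["No", "Yes", "No", "No", "Yes"],
})

# Anything that is not a laptop counts as mobile.
is_laptop = df["preferred_device"].eq("Laptop")
df["device_group"] = is_laptop.map({True: "Laptop", False: "Mobile"})

# One subset per model.
laptop_df = df[df["device_group"] == "Laptop"]
mobile_df = df[df["device_group"] == "Mobile"]
```

With this rule, a missing device value would also be labelled Mobile, so the order of this step relative to missing-value treatment matters.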

TREATING MISSING VALUES
The imputation technique depends on the percentage of missing values. If the share of missing values is small, we can
use a simple imputer such as the mean, median or mode. If it is larger, we need a more advanced technique such as
KNN imputation.

As the maximum share of missing values is below 5%, simple imputation is sufficient. Our dataset has missing values in
4 float and 3 object columns; we impute the float columns with the median and the object columns with the mode.
Replacing NULL values in numerical columns using the median.
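
The median/mode rule can be sketched as a short loop. This is an illustrative sketch on toy data, so the missing share here is far above the 5% seen in the real dataset.

```python
import numpy as np
import pandas as pd

# Toy data with one missing value per column.
df = pd.DataFrame({
    "yearly_avg_Outstation_checkins": [1.0, np.nan, 24.0, 3.0],
    "preferred_location_type": ["Beach", None, "Beach", "Financial"],
})

for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        # Numeric column: fill with the median, which is robust to outliers.
        df[col] = df[col].fillna(df[col].median())
    else:
        # Categorical column: fill with the most frequent value.
        df[col] = df[col].fillna(df[col].mode().iloc[0])
```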

AFTER TREATING MISSING VALUES

CHECKING OUTLIERS

Here we can observe that 2 features contain outliers, so we will use the interquartile range (IQR) to treat them.
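
The report does not state whether outliers were capped or dropped. The sketch below assumes capping at the usual 1.5 × IQR fences; the helper name cap_outliers_iqr is hypothetical.

```python
import pandas as pd

def cap_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values lying more than k * IQR outside the quartiles."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

# Toy values with one obvious outlier.
values = pd.Series([10, 12, 11, 13, 12, 100])
capped = cap_outliers_iqr(values)
```

Here the quartiles are 11.25 and 12.75, so the upper fence is 15.0 and the value 100 is pulled down to it.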

AFTER TREATING OUTLIERS

Checking skewness
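
As a rough illustration of this check, pandas can report the skewness of every numeric column. The column names and values below are invented.

```python
import pandas as pd

# Invented columns: one symmetric, one with a long right tail.
df = pd.DataFrame({
    "symmetric": [1, 2, 3, 4, 5],
    "right_skewed": [1, 1, 2, 2, 50],
})

# Values near 0 indicate symmetry; positive values a right tail.
skewness = df.skew(numeric_only=True)
```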

DATA VISUALIZATION: UNIVARIATE ANALYSIS
Numeric Data

Categorical Data
For categorical variables we are interested in the frequency of each level. We examine these
frequencies with seaborn count plots, which show the number of observations in each category.
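
The numbers behind a count plot can also be tabulated directly. The sketch below uses pandas only; the column name following_company_page and its values are assumptions.

```python
import pandas as pd

# Assumed column name; toy values.
following = pd.Series(["No", "No", "Yes", "No", "Yes", "No"],
                      name="following_company_page")

# Absolute frequency of each level (what a count plot draws as bar heights).
counts = following.value_counts()

# Relative frequency of each level.
shares = following.value_counts(normalize=True).round(2)
```

Passing the same series to seaborn's countplot (for example `sns.countplot(x=following)`) would draw these counts as bars.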

• We can see here that the probability of buying a ticket for the next month is low.

• We can see here that most people prefer booking from mobile phones.

• We can observe here that the most visited location types are beach and financial, and the least visited is hill station.

• We can observe that users mostly travel with 3 or 4 family members.

• We can observe here that most users do not follow the company page.

BIVARIATE ANALYSIS
• Here we can observe that people who don’t follow the company page have a higher average number of views on
the company page than people who follow it.

• Here we can observe that users who have travelled out since their last outstation trip have a higher probability of taking the product.
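
Comparisons of this kind can be computed with a group-by. The sketch below uses invented numbers and assumed column names, so it does not reproduce the report's findings; week_since_last_outstation_checkin is one possible reading of the recency variable.

```python
import pandas as pd

# Assumed column names; invented values.
df = pd.DataFrame({
    "following_company_page": ["No", "No", "Yes", "Yes"],
    "Yearly_avg_view_on_travel_page": [300, 280, 250, 230],
    "week_since_last_outstation_checkin": [1, 1, 5, 5],
    "Taken_product": ["Yes", "No", "No", "No"],
})

# Average page views for followers vs. non-followers.
avg_views = (
    df.groupby("following_company_page")["Yearly_avg_view_on_travel_page"].mean()
)

# Share of users who took the product, by weeks since the last check-in.
took_product = df["Taken_product"].eq("Yes")
take_rate = took_product.groupby(df["week_since_last_outstation_checkin"]).mean()
```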

PAIRPLOT

CORRELATION HEATMAP
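
The matrix behind a correlation heatmap can be computed with pandas. The columns a, b and c below are placeholders, not features from the dataset.

```python
import pandas as pd

# Placeholder columns: b rises with a, c mostly falls as a rises.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 3, 4, 1, 2],
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr(numeric_only=True)
```

The resulting matrix can then be passed to a plotting function such as seaborn's heatmap (for example `sns.heatmap(corr, annot=True)`).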

BUSINESS INSIGHTS
• We observed that users mostly travel in groups of 3 or 4, so I would recommend that the company create
offers for users travelling in groups of 3 or 4 so that we can retain most customers.
• The most visited location types are beach and financial, and the least visited is hill station, so the
company should provide offers and discounts based on the most common locations.
• The data is heavily imbalanced, so we will use SMOTE to treat it.
• People who don’t follow the company page have a higher average number of views on the company page than
people who follow it. This suggests that our social media team is not effective at building an online
presence, so I would recommend running social media campaigns to capture the attention of social media
users, as this clearly impacts the business.
• Since the probability of buying a ticket online in the next month is low, the company should advertise more on
social media, choosing the platforms that the public uses most.
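
To illustrate the idea behind SMOTE mentioned in the insights above, here is a simplified NumPy sketch. It is not the imbalanced-learn implementation (in practice a ready-made class such as `imblearn.over_sampling.SMOTE` would typically be used), and the function name smote_like_oversample is hypothetical. New minority rows are created by interpolating between a minority row and one of its nearest minority neighbours.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows by neighbour interpolation."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a row is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    # Pick a random base row and one of its k nearest neighbours.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    # Place the new row at a random point on the segment between them.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Toy minority class with three rows and two features.
X_minority = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]])
synthetic = smote_like_oversample(X_minority, n_new=5)
```

Oversampling of this kind is normally applied to the training split only, so that the test data keeps the original class balance.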
