

1. Now, read in the ab_data.csv data. Store it in df. Use your dataframe to answer the questions in Quiz 1 of the classroom.

a. Read in the dataset and take a look at the top few rows here:
In [3]:
# take a look at the first five rows of the data
df = pd.read_csv('/home/kesci/input/ab_testing_data9749/ab_data.csv')
df.head()
Out[3]:

   user_id                   timestamp      group landing_page  converted
0   851104  2017-01-21 22:11:48.556739    control     old_page          0
1   804228  2017-01-12 08:01:45.159739    control     old_page          0
2   661590  2017-01-11 16:55:06.154213  treatment     new_page          0
3   853541  2017-01-08 18:28:03.143765  treatment     new_page          0
4   864975  2017-01-21 01:52:26.210827    control     old_page          1

b. Use the cell below to find the number of rows in the dataset.
In [4]:
# take a look at the shape of the data
df.shape
Out[4]:
(294478, 5)
In [5]:
# take a look at the data types of the columns
df.dtypes
Out[5]:
user_id int64
timestamp object
group object
landing_page object
converted int64
dtype: object
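
The timestamp column is reported as object, which means pandas is holding the values as plain strings. If date arithmetic or time-based filtering is needed later, the column would have to be parsed first. A minimal sketch, assuming the strings follow the format seen in this dataset (the frame `toy` is a hypothetical stand-in, not the full df):

```python
import pandas as pd

# Hypothetical two-row frame; the strings use the same format as the dataset's timestamps.
toy = pd.DataFrame({
    "timestamp": ["2017-01-21 22:11:48.556739", "2017-01-12 08:01:45.159739"],
})

# pd.to_datetime parses the strings into a datetime column.
parsed = pd.to_datetime(toy["timestamp"])

print(parsed.dtype)   # a datetime64 dtype instead of object
print(parsed.min())   # earliest timestamp in the toy frame
```

On the real data, the same call could be applied as `df['timestamp'] = pd.to_datetime(df['timestamp'])`.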

c. The number of unique users in the dataset.


In [6]:
# the number of unique users
# len(df['user_id'].unique())
df['user_id'].nunique()
Out[6]:
290584
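
The dataset has 294478 rows but only 290584 unique user_id values, so 294478 - 290584 = 3894 rows belong to a user who already appears elsewhere. One way to inspect such repeats is sketched below on made-up data (`toy` and its values are hypothetical, not rows from ab_data.csv):

```python
import pandas as pd

# Hypothetical toy data: user 2 appears twice.
toy = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "converted": [0, 1, 0, 0],
})

n_rows = len(toy)                    # total rows
n_unique = toy["user_id"].nunique()  # distinct users
n_extra = n_rows - n_unique          # rows beyond the first per user

# keep=False marks every occurrence of a repeated user_id, not just the later ones.
repeated = toy[toy["user_id"].duplicated(keep=False)]

print(n_rows, n_unique, n_extra)
print(repeated)
```

On the real data, `df[df['user_id'].duplicated(keep=False)]` would list the affected rows.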

In [7]:
# take a look at the description of the data
# df.describe()

d. The proportion of users converted.


In [8]:
# the proportion of users converted (unique converted users / unique users)
df[df['converted'] == 1]['user_id'].nunique() / df['user_id'].nunique()
Out[8]:
0.12104245244060237
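
This expression counts each user once, and a user counts as converted if any of their rows has converted == 1. Because some user_id values appear in more than one row, the result can differ from the row-level mean of the converted column. The sketch below shows the difference on made-up data (`toy` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical toy data: user 2 has one converted row and one non-converted row.
toy = pd.DataFrame({
    "user_id":   [1, 2, 2, 3, 4],
    "converted": [0, 1, 0, 0, 1],
})

# User-level proportion: same form as the notebook cell above.
user_level = toy[toy["converted"] == 1]["user_id"].nunique() / toy["user_id"].nunique()

# Row-level proportion: every row counts, so repeated users are counted more than once.
row_level = toy["converted"].mean()

print(user_level)
print(row_level)
```

Here two of the four distinct users converted, while two of the five rows are conversions, so the two measures disagree. Which one is appropriate depends on whether the question is about users or about page views.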
