Lab04 Discrete Distributions - Ipynb
[],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python
3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":
["**Objective:**\n","The objective of this notebook is to see how we can generate data
from the binomial and Poisson distributions and how to visualize them.\n","\n","In addition, we will implement some examples from the class slides.\n","\n","We
are also going to introduce another plotting library,
[seaborn](https://seaborn.pydata.org/)."],"metadata":{"id":"geVz25xgzrau"}},
{"cell_type":"markdown","source":["# Introduction to Scipy"],"metadata":
{"id":"fGg8sz8x59C4"}},{"cell_type":"markdown","source":["SciPy is an open-source
library used for solving mathematical, scientific, engineering, and technical
problems. We are mainly going to use `scipy.stats`, which focuses on probability
distributions and statistical functions."],"metadata":{"id":"VjmGWNJvl4q-"}},
{"cell_type":"markdown","source":["## **Importing Important
Libraries**"],"metadata":{"id":"rBeDPm-1bJDw"}},
{"cell_type":"code","execution_count":null,"metadata":{"id":"WDgR778k-
fjh"},"outputs":[],"source":["import numpy as np\n","import matplotlib.pyplot as
plt\n","import seaborn as sns"]},{"cell_type":"code","source":["from scipy.stats
import binom\n","n = 20\n","p = 0.12\n","binom.sf(2, n, p)  # P(X > 2)\n","# binom.cdf(2, n, p)\n",
"# binom.cdf(k, n, p) - cumulative distribution function: P(X <= k)\n",
"# binom.pmf(k, n, p) - probability mass function: P(X = k), e.g. exactly two defects for k = 2\n",
"# binom.sf(k, n, p) - survival function: P(X > k), equal to 1 - cdf\n",
"# binom.mean(n, p) - mean of the distribution\n","# binom.var(n, p) - variance of the distribution\n",
"# binom.std(n, p) - standard deviation of the
distribution"],"metadata":{"id":"Q7wVR1Ka5_Aj"},"execution_count":null,"outputs":
[]},{"cell_type":"markdown","source":["# **Discrete Distribution
Functions**"],"metadata":{"id":"yVfm9654-hxi"}},{"cell_type":"markdown","source":
["## **Binomial Distribution**"],"metadata":{"id":"0IepIqxj-tEZ"}},
{"cell_type":"markdown","source":["We will first start with creating random
samples with NumPy and then we will use `scipy.stats.binom`."],"metadata":{"id":"CODApjZEmPlw"}},
{"cell_type":"markdown","source":["### Random"],"metadata":{"id":"sUUliq0tmdaP"}},
{"cell_type":"code","source":["np.random.binomial(1, 0.5) # Flipping a single
unbiased coin"],"metadata":{"id":"unXr-g6N4vMr"},"execution_count":null,"outputs":
[]},{"cell_type":"code","source":["np.random.binomial(2, 0.5)"],"metadata":
{"id":"Qp1_70cd4yXy"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["two_coins = np.random.binomial(2, 0.5,
1000)"],"metadata":{"id":"S1hETeID40vx"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["# np.random.seed(555)   # Setting a seed before sampling makes each run
produce the same numbers\n","sns.countplot(x=two_coins)"],"metadata":
{"id":"lonLFXfn42NT"},"execution_count":null,"outputs":[]},
{"cell_type":"markdown","source":["### Exercise: Try to do the same with 10
coins:\n"],"metadata":{"id":"etdi_-6S5Jkb"}},{"cell_type":"code","source":["# n =
10, probability = 0.5 \n"],"metadata":
{"id":"hRtfvACI5RNa"},"execution_count":null,"outputs":[]},
{"cell_type":"markdown","source":["### Stats (binom)\n"," "],"metadata":
{"id":"LyGapMtQ8WAy"}},{"cell_type":"code","source":["n = 20\n","p = 0.12\n",
"binom.mean(n, p)  # expected number of successes, n * p\n","x = np.arange(0, n + 1)  # possible outcomes 0..n\n",
"y = binom.pmf(x, n, p)"],"metadata":{"id":"fKtDHVpK-r5d"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["sns.set_theme(style=\"whitegrid\")\
n","sns.barplot(x=x, y=y)"],"metadata":
{"id":"4uwAHiCA6q_N"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["n = 20\n","p = 0.12\n","y_cumul = binom.cdf(x, n,
p)\n","\n","sns.set_theme(style=\"whitegrid\")\n","sns.barplot(x=x,
y=y_cumul)"],"metadata":{"id":"MZUQi3RI6uJ3"},"execution_count":null,"outputs":[]},
{"cell_type":"markdown","source":["### Example\n","\n"],"metadata":
{"id":"EjfomftPbW3G"}},{"cell_type":"code","source":["# setting the values of n and p\n","n = 5\n","p = 0.1\n",
"# all possible numbers of successes: 0, 1, ..., n\n","r_values = np.arange(n + 1)\n",
"# pmf evaluated at every r value in one vectorized call\n","dist = binom.pmf(r_values, n, p)\n",
"# plotting the graph\n","sns.barplot(x=r_values, y=dist)"],"metadata":
{"id":"ZfBdO9iybWgW"},"execution_count":null,"outputs":[]},
{"cell_type":"markdown","source":["### Exercise: A manufacturer has a 12% defect
rate in production. The buyer decides to test 20 random pieces and will accept the
supplier if there are 2 or fewer defective pieces. What is the probability that the supplier is
accepted? "],"metadata":{"id":"XAGp2cba5hJw"}},{"cell_type":"code","source":["#use
size = 10\n"],"metadata":{"id":"uHieOx1U5qcJ"},"execution_count":null,"outputs":
[]},{"cell_type":"markdown","source":["## **Poisson Distribution**"],"metadata":
{"id":"nIPqWjPt-y74"}},{"cell_type":"markdown","source":["### Using Random from
NumPy"],"metadata":{"id":"Y1tLrJxR2s3F"}},{"cell_type":"code","source":
["np.random.poisson(3.6)  # with lambda = 3.6"],"metadata":
{"id":"apqFq09063Un"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["np.random.poisson(3.6, 10)  # 10 samples with
lambda = 3.6"],"metadata":{"id":"q1jHVnNY64CL"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["q_size = np.random.poisson(6, 1000)"],"metadata":
{"id":"sP4qRZIS68zL"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["sns.set_theme(style=\"whitegrid\")\
n","sns.countplot(x=q_size)\n","# We calculated the probability of getting 7 people
in the queue as 0.0424 (a value that corresponds to lambda = 3.6; q_size above uses lambda = 6)"],"metadata":{"id":"St6KW-3I6-
Xh"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["###
Using Poisson from Stats (from Scipy)\n","\n","**Different Views of the Poisson
Distribution**"],"metadata":{"id":"8eMsw9Mt-5NR"}},{"cell_type":"code","source":
["from scipy.stats import poisson\n","k = np.arange(30)\n","plt.plot(k,
poisson.cdf(k, 6)) # mean = 6\n","plt.title('Poisson distribution - CDF')\n",
"plt.xlabel('k')\n","plt.ylabel('P(X <= k)')"],"metadata":{"id":"Jv-ihZ9C-
57h"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["#
Exercise 2\n","\n","The manager of an online shopping website has determined that
an average of 5 customers per minute make a purchase on Saturdays.\n","\n","What is
the probability that during a one-minute interval on Saturday exactly 8 purchases
will be made?\n"],"metadata":{"id":"xbWdRuuC6j3Z"}},{"cell_type":"code","source":
["mean = 5\n","# P(x = 8)"],"metadata":
{"id":"uB4blIQ_68yX"},"execution_count":null,"outputs":[]},
{"cell_type":"markdown","source":["What is the probability that more than 2
purchases will be made during a one-minute interval on Saturday?"],"metadata":
{"id":"Pvfiz0tJ8YPx"}},{"cell_type":"code","source":["mean = 5\n","# P(x > 2) = 1 -
P( x <= 2)"],"metadata":{"id":"5UyYQ-Bk8hNS"},"execution_count":null,"outputs":[]},
{"cell_type":"markdown","source":["https://github.com/thomas-haslwanter/statsintro_python/blob/master/ipynb/6_distDiscrete.ipynb"],"metadata":
{"id":"IHjUy00Ya63s"}}]}
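
The exercises in this notebook can be checked with `scipy.stats`. The block below is a minimal sketch rather than an official solution: it assumes the binomial exercise uses n = 20 and p = 0.12 as stated in its text, and the variable names (`p_accept`, `p_exactly_8`, `p_more_than_2`) are made up for illustration.

```python
from scipy.stats import binom, poisson

# Binomial exercise: 20 pieces tested, 12% defect rate,
# the supplier is accepted if at most 2 pieces are defective.
n, p = 20, 0.12
p_accept = binom.cdf(2, n, p)  # P(X <= 2)

# Poisson exercise 2: on average 5 purchases per minute.
mu = 5
p_exactly_8 = poisson.pmf(8, mu)   # P(X = 8)
p_more_than_2 = poisson.sf(2, mu)  # P(X > 2) = 1 - P(X <= 2)

print(f"P(accepted)        = {p_accept:.4f}")
print(f"P(exactly 8)       = {p_exactly_8:.4f}")
print(f"P(more than 2)     = {p_more_than_2:.4f}")
```

Using `sf` instead of `1 - cdf` gives the same quantity and tends to be more accurate when the tail probability is very small.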