BIO401 Best File for Mid Term by Jawad Masroor (J Biology)


By Jawad Masroor (Subscribe to J Biology YouTube Channel and Learn Biology with Concepts and Tricks)

BIO401 Best File For MID Term Preparation

Course Name : Biostatistics


Course Code: BIO401
Total Lectures : 88
Mid Term Syllabus : Lecture No 01 to 44

Final Term Syllabus : Lecture No 45 to 88

Define biostatistics:
Biostatistics is the use of statistical methods to solve biological problems. It is also called biometry.

Population and Sample:


A population is the entire group of individuals that share a common characteristic and are
the focus of a study at a particular time. A measurable quantity of a population is called a
parameter.
A sample is a subset of the population that is selected for the actual analysis. Sampling is
often done when it is impossible to study the entire population. A measurable quantity of a
sample is called a statistic.

Finite Population (Countable):


• The finite population is also known as a countable population in which the population can be
counted.
• For statistical analysis, the finite population is more advantageous.
• Example: employees of a company.

Infinite Population (Uncountable):


• An infinite population is one in which the units are so numerous that counting them is not
possible.
• It is less advantageous in statistical analysis.
• Example: the number of germs in a patient’s body.

Parameter:
A parameter is a numerical characteristic of a population. It is a fixed value that describes some
aspect of the population. For example, the average blood pressure of all adults in a specific region
is a parameter.

Statistic:
A statistic is a numerical characteristic of a sample. It is used to estimate or infer the
corresponding parameter of the population. For instance, the average blood pressure of a sample
of 100 adults from the same region is a statistic.

Data:
Data refers to information collected, observed, or measured from a population or sample during
the study. This information can take various forms, including numerical values, categorical labels,
or descriptive details, and it serves as the foundation for statistical analysis. Biostatisticians use
data to draw conclusions and derive meaningful insights about biological phenomena, health
outcomes, or medical interventions.
There are two types of data:

1. Primary Data:
• This refers to data that is collected firsthand directly from the source.
• Common methods of collecting primary data include surveys, interviews, experiments, and
observations.
• Primary data is original, providing a high degree of relevance and specificity to the research
question.
• It is expensive and takes more time to collect.

2. Secondary Data:
• This refers to data that has already been collected by someone else for a different purpose.
• Researchers utilize existing sources, such as published studies, government reports, or datasets,
to obtain secondary data.
• Secondary data is less specific to the researcher's needs and may not perfectly align with the
study objectives.
• It is an inexpensive and time-saving way to access a large volume of information.

Variable:
A variable is a characteristic that can be measured and that can assume different values.
There are two types of variables:

1. Qualitative variable:
Qualitative variables are non-numeric and describe qualities or characteristics. Examples include
gender, blood type, place of birth, ethnic group, type of drug etc.

2. Quantitative variable:
A quantitative variable is one that can be measured and expressed numerically.
Quantitative variables are of two further types:

a) Discrete Variable: Discrete variables are numeric variables that can only take specific, distinct
values. They are usually counted in whole numbers and cannot be subdivided into smaller units. Examples
include the number of patients in a clinic and the number of red blood cells in a sample.

b) Continuous Variable: Continuous variables are numeric variables that can take any value within
a range and can be measured with a high level of precision. These variables can be further
subdivided into smaller and smaller units, and they are often associated with measurements
rather than counts. Examples include height, weight, blood pressure, or temperature.

Sampling:
Sampling is the process of selecting a subset from a larger population for study. There are two types:

1. Probability sampling:
In this type, selection is random and every element of the population has a known chance of being
part of the selected sample (an equal chance in simple random sampling). There are many ways to do this:
a) Simple Random Sampling:
Every individual in the population has an equal chance of being selected, and each selection is
independent of others.

b) Stratified Random Sampling:


The population is divided into subgroups or strata based on certain characteristics, and then
random samples are taken from each stratum.

c) Cluster Sampling:
The population is divided into clusters, and then a random sample of clusters is selected. All
members within the chosen clusters are included in the sample.
• Unlike stratified sampling, which involves selecting units from every subgroup, cluster sampling involves
selecting entire groups.

d) Systematic Sampling:
A random starting point is selected, and then every kth element in the population is chosen for
the sample.

e) Multistage Sampling:
This involves multiple stages of sampling. It often combines various sampling methods, such as cluster
sampling followed by simple random sampling within selected clusters.
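
The probability sampling methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 patient IDs, the odd/even strata, the clusters of 10 and the sample sizes are all hypothetical and are not part of the course material.

```python
import random

# Hypothetical population: patient IDs 1..100 (illustrative only).
population = list(range(1, 101))
rng = random.Random(42)  # fixed seed so the sketch is reproducible

# a) Simple random sampling: every unit has the same chance of selection.
simple_sample = rng.sample(population, 10)

# b) Stratified random sampling: draw a random sample from each stratum.
strata = {
    "odd_ids": [p for p in population if p % 2 == 1],
    "even_ids": [p for p in population if p % 2 == 0],
}
stratified_sample = {name: rng.sample(units, 5) for name, units in strata.items()}

# c) Cluster sampling: pick whole clusters at random and keep every member.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [unit for cluster in chosen_clusters for unit in cluster]

# d) Systematic sampling: random start, then every k-th unit.
k = len(population) // 10      # sampling interval
start = rng.randrange(k)       # random starting index in [0, k)
systematic_sample = population[start::k]
```

Multistage sampling would chain these steps, for example by drawing a simple random sample inside each chosen cluster instead of keeping every member.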

2. Non-Probability Sampling:


The sampling technique where not every member of the population has an equal chance of being
included in the sample.
Some common types of non-probability sampling methods include:
a) Convenience Sampling:
Participants are selected based on availability.

b) Purposive Sampling:
Participants are chosen by the researcher's judgment, based on the purpose of the study.

c) Quota Sampling:
Participants are chosen to fill pre-set quotas, so that the proportions of key groups in the sample are
the same as in the population.

d) Snowball Sampling:
Existing participants recruit future participants. An initial set of participants is selected through
purposeful sampling. These participants are then asked to refer others who might meet the study
criteria.

Mean:
The mean, or average, is calculated by summing up all the values in a dataset and dividing the sum
by the number of values.
Example:
Consider the dataset {12, 15, 18, 22, 25}.
Mean = (12 + 15 + 18 + 22 + 25) / 5 = 92 / 5 = 18.4

Mode:
The mode is the value that appears most frequently in a dataset.
Example:
For the dataset {5, 8, 12, 8, 15, 8, 22}, the mode is 8 because it appears more frequently than any
other value.

Median:
The median is the middle value when a dataset is ordered. If the dataset has an even number of
values, the median is the average of the two middle values.
Example:
For the dataset {10, 15, 18, 22, 25}, the median is 18. For the dataset {10, 15, 18, 22}, the median
is (15 + 18) / 2 = 16.5.

Range:
The range is the difference between the maximum and minimum values in a dataset.
Example:
For the dataset {8, 12, 5, 15, 22}, the range is 22 (maximum) - 5 (minimum) = 17
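
The four worked examples above can be checked with Python's standard `statistics` module. This is a minimal sketch that reuses the datasets from the examples; the variable names are chosen here only for illustration.

```python
import statistics

# Mean: sum of the values divided by the number of values.
mean_value = statistics.mean([12, 15, 18, 22, 25])        # 18.4

# Mode: the most frequent value.
mode_value = statistics.mode([5, 8, 12, 8, 15, 8, 22])    # 8

# Median: middle value (odd count) or average of the two middle values (even count).
median_odd = statistics.median([10, 15, 18, 22, 25])      # 18
median_even = statistics.median([10, 15, 18, 22])         # 16.5

# Range: maximum minus minimum (the minimum of this dataset is 5).
range_data = [8, 12, 5, 15, 22]
range_value = max(range_data) - min(range_data)           # 17
```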

Branches/ Types of Biostatistics:

1. Inferential Statistics:
Inferential statistics go beyond the immediate data and use probability theory
to make inferences or predictions about a population based on a sample of data. This includes
techniques like hypothesis testing, confidence intervals, and regression analysis. Inferential
statistics help draw conclusions and make generalizations about populations by analyzing a subset
of the data.

2. Descriptive Statistics:
Descriptive statistics involve methods for summarizing and presenting data in a meaningful way.
This includes measures of central tendency like the mean, median, and mode, as well as measures
of dispersion like range and standard deviation. Descriptive statistics aim to provide a clear and
concise overview of the main features of a dataset.

• Descriptive statistics is performed using tabular and graphical methods.

i) Tabular Method: It is used to summarize the data in table form. The most frequently used
tabular method is the frequency table.
ii) Graphical Methods: These are visual ways of presenting data using charts and graphs. The most
frequently used methods are the histogram, box and whisker plot, scatter plot etc.

Frequency Distribution:
Frequency refers to the number of times a particular value occurs in a dataset. A frequency
distribution is a method used to organize and summarize the data in tables and give more insight
into the data.
There are three types of frequency distribution.
1. Categorical Frequency Distribution: It is used for qualitative data where categories have no
natural order, such as colour, gender etc.
2. Ungrouped Frequency Distribution: It is used for quantitative data, mainly with
small datasets, and each value is treated separately.
3. Grouped Frequency Distribution: Also used for quantitative data, but deals with larger datasets
where the data are grouped into intervals or classes.

• Relative Frequency = frequency of the class / total number of observations

• Cumulative Relative Frequency = sum of previous relative frequencies + relative frequency of
the current class

In categorical and ungrouped frequency distributions we make a table. Write each distinct value
once in the 1st column, use tally marks for counting in the 2nd column and write the frequency in
the 3rd column. We can also find the relative frequency and cumulative relative frequency in the
4th and 5th columns using the above formulas.
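
As a rough sketch of the table described above, the frequency, relative frequency and cumulative relative frequency columns can be built with `collections.Counter`. The blood-type observations are hypothetical data invented for this illustration.

```python
from collections import Counter

# Hypothetical blood-type observations (illustrative data only).
observations = ["A", "B", "O", "A", "O", "O", "AB", "A", "O", "B"]

counts = Counter(observations)      # frequency of each distinct value
total = sum(counts.values())        # total number of observations

table = []
cumulative = 0.0
for category, frequency in counts.items():
    relative = frequency / total    # relative frequency of the class
    cumulative += relative          # running sum of relative frequencies
    table.append((category, frequency, relative, cumulative))

for category, frequency, relative, cumulative_rel in table:
    print(f"{category:<3} {frequency:>2} {relative:.2f} {cumulative_rel:.2f}")
```

The last value in the cumulative column should always come out as 1, which is a quick check that no observation was missed.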

Steps for Grouped Frequency Distribution:


1. Write the data as an ordered array.
2. Find the smallest value (Xo) and largest value (Xm).
3. Find the range, which is the difference between the largest and smallest values.
4. Find the class width (class interval), denoted by h: divide the range by the chosen number of
classes (often 5 to 15 classes are assumed).
5. Make classes from the smallest value to the largest value based on the class interval.
6. Now draw tally marks and find the frequency of each class.
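
The steps above can be sketched as follows. The measurements and the choice of 5 classes are assumptions made for this illustration. Rounding the width up to a whole number and treating each class as "lower limit included, upper limit excluded" are also conventions picked here, since the notes do not fix them.

```python
import math

# Step 1: hypothetical measurements written as an ordered array.
data = sorted([12, 15, 21, 22, 25, 28, 31, 33, 35, 38, 41, 44, 47, 52, 58])

# Steps 2-3: smallest value, largest value and range.
smallest, largest = data[0], data[-1]
data_range = largest - smallest

# Step 4: class width h = range / number of classes, rounded up.
num_classes = 5
width = math.ceil(data_range / num_classes)

# Steps 5-6: build the classes and count the values falling in each one.
classes = []
lower = smallest
while lower <= largest:
    upper = lower + width
    frequency = sum(1 for x in data if lower <= x < upper)
    classes.append((lower, upper, frequency))
    lower = upper

for lower, upper, frequency in classes:
    print(f"{lower}-{upper}: {frequency}")
```

Plotting the class frequencies as bars would give the frequency histogram.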

Frequency Histogram:
After the above process, if we draw bars for a visual display of the data, the result is called a frequency
histogram. The height of each bar represents the frequency of its class.

Best video for frequency distribution explanation:


https://youtu.be/riYpfCIuOUo?si=EsjjuDC4qWKHF5ek

Box and Whisker Plot Method:


This method is used for graphical presentation of data.
Steps:
1. Arrange the data in ascending order.
2. Find the median (50% of data), lower quartile (25%) and upper quartile (75%) of the data.
3. Draw a number line that includes the smallest and largest values of the data.
4. Draw three vertical lines at the lower quartile, median and upper quartile above the number
line.
5. Join the lines for the lower quartile and upper quartile to form a box.
6. Draw a line (whisker) from the left side of the box to the smallest value and a line from the
right side of the box to the largest value.
7. The difference between the upper and lower quartiles is called the interquartile range (IQR).
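
The five values needed to draw the plot can be computed as in the sketch below. The dataset is hypothetical. Quartile conventions differ between textbooks and software, so this sketch assumes the "median of each half" method. Other methods may give slightly different quartiles.

```python
import statistics

# Step 1: hypothetical data arranged in ascending order.
data = sorted([7, 15, 36, 39, 40, 41, 42, 43, 47, 49])

n = len(data)
half = n // 2
lower_half = data[:half]              # values below the median position
upper_half = data[half + (n % 2):]    # values above it (middle value skipped if n is odd)

# Step 2: median and quartiles.
median_value = statistics.median(data)       # 40.5
q1 = statistics.median(lower_half)           # 36
q3 = statistics.median(upper_half)           # 43

# Steps 3 and 6: the whiskers reach the smallest and largest values.
smallest, largest = data[0], data[-1]

# Step 7: interquartile range.
iqr = q3 - q1                                # 7
```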

Video explanation:
https://youtu.be/7jP3TowC9U4?si=4BJoYzmpV5zHt7KH

Permutations:
A permutation is an arrangement of objects in a specific order.
Case 1:
When all objects are distinct:
Formula: nPr = n!/(n-r)!

Example:
A club consists of four members. How many ways are there of selecting three officers: president,
secretary and treasurer? It is evident that the order, in which 3 officers are to be chosen, is of
significance.
n=4
r=3
Put in formula:
4P3 = 4! / (4 - 3)! = 4! / 1!
= (4 x 3 x 2 x 1) / 1 = 24

Case 2:
When some objects are repeated:
Formula:
P = n! / (n1! x n2! x n3! x ...)
where n1, n2, n3, ... are the numbers of times each repeated object occurs.

Example:
Find the number of permutations of the letters in the word "BOOKKEEPER".
• Total number of letters (n) = 10
• O = 2
• K = 2
• E = 3
P = 10! / (2! x 2! x 3!) = 3628800 / 24
P = 151200
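
Both permutation cases can be verified with Python's `math` module (`math.perm` needs Python 3.8 or later). The helper name `arrangements_with_repeats` is made up for this sketch.

```python
import math
from collections import Counter

# Case 1: all objects distinct, nPr = n! / (n - r)!
officers = math.perm(4, 3)   # 24 ways to choose president, secretary and treasurer


def arrangements_with_repeats(word: str) -> int:
    """Case 2: n! divided by the factorial of each letter's count."""
    result = math.factorial(len(word))
    for count in Counter(word).values():
        result //= math.factorial(count)
    return result


bookkeeper = arrangements_with_repeats("BOOKKEEPER")   # 151200
```

Letters that occur only once contribute 1! = 1, so they do not change the result.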

Combination :
A combination is a selection of elements without considering the order.
Formula:
nCr = n! / (r! x (n - r)!)
Example:
From a group of 5 people, you want to form a committee of 2 people. Find the number of
combinations.
n=5
r=2
5C2 = 5! / (2! x (5 - 2)!) = 120 / (2 x 6)
5C2 = 10 combinations
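
The committee example can be checked with `math.comb`, and the committees themselves can be listed with `itertools.combinations`. The labels A to E for the five people are placeholders chosen for this sketch.

```python
import math
from itertools import combinations

people = ["A", "B", "C", "D", "E"]   # placeholder labels for the 5 people

# Order does not matter: nCr = n! / (r! x (n - r)!)
committee_count = math.comb(5, 2)            # 10
committees = list(combinations(people, 2))   # the 10 distinct pairs

# For comparison, if order mattered the count would be a permutation.
ordered_pairs = math.perm(5, 2)              # 20
```

Each committee of 2 corresponds to 2! = 2 ordered pairs, which is why the permutation count is twice the combination count.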

Video Explanation:
Short video : https://youtu.be/GZATQeI8dJo?si=H_IXfmUA7GrShg6y
Long video complete concept: https://youtu.be/Tr-TVt5JAWY?si=3Ozrz9wWSERRsnxZ

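Mean Deviation:
The mean deviation is the average of the absolute deviations of the values from their mean.
Example: Set of data: [2, 4, 6, 8, 10]. Find the mean deviation.

1. Calculate the Mean:

(2 + 4 + 6 + 8 + 10) / 5 = 30 / 5 = 6

2. Find the Deviation for Each Data Point:

• Deviation for 2: 2 - 6 = -4
• Deviation for 4: 4 - 6 = -2
• Deviation for 6: 6 - 6 = 0
• Deviation for 8: 8 - 6 = 2
• Deviation for 10: 10 - 6 = 4

3. Sum of Deviations:
(-4) + (-2) + 0 + 2 + 4 = 0
The deviations from the mean always sum to zero, so their absolute values are used instead.

4. Mean Deviation:
(4 + 2 + 0 + 2 + 4) / 5 = 12 / 5 = 2.4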

Variance:
1. Square each deviation from the mean, sum the squares and divide by the number of values
(dividing by n gives the population variance).
(16 + 4 + 0 + 4 + 16) / 5 = 40 / 5 = 8

Standard Deviation:
Take the square root of the variance.
√8 ≈ 2.83

Coefficient of variation = (Standard deviation / Mean) x 100

(√8 / 6) x 100 ≈ 47.14
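
The dispersion measures for the dataset [2, 4, 6, 8, 10] can be reproduced step by step as below. This sketch divides by n (population formulas), as in the worked example, and keeps the standard deviation unrounded, so the last ratio comes out as about 47.14.

```python
import math

data = [2, 4, 6, 8, 10]
n = len(data)

mean = sum(data) / n                                    # 6.0
deviations = [x - mean for x in data]                   # [-4, -2, 0, 2, 4], sums to 0

# Mean deviation: average of the absolute deviations.
mean_deviation = sum(abs(d) for d in deviations) / n    # 2.4

# Variance: average of the squared deviations (divided by n).
variance = sum(d ** 2 for d in deviations) / n          # 8.0

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)                           # about 2.83

# Standard deviation relative to the mean, as a percentage.
cv_percent = std_dev / mean * 100                       # about 47.14
```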

Video Explanation:
https://youtu.be/tXv1KVoQrWA?si=y_ZS0-JQSc62x5d_

https://youtu.be/179ce7ZzFA8?si=77AQC7y8qFUGmlqr

Probability = Number of favorable outcomes / Total number of possible outcomes

Example:
Rolling a fair six-sided die. Find the probability of rolling an even number:
Favorable outcomes = 2, 4, 6 (3 outcomes)
Total possible outcomes = 6
P = 3/6 = 1/2
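
The die example can be checked by listing the outcomes and counting. This is a small sketch using `fractions.Fraction` so that the result stays an exact fraction. There are 3 even faces out of 6, so the probability reduces to 1/2.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]                      # all faces of a fair die
favorable = [x for x in sample_space if x % 2 == 0]    # even faces: 2, 4, 6

# Probability = favorable outcomes / total possible outcomes
p_even = Fraction(len(favorable), len(sample_space))   # 1/2
```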

Video Explanation: https://youtube.com/shorts/6oCvo2BULbU?si=JEx8UtCxynvw889l
