Sampling and Estimation

Source: CFA® Program Curriculum Level I Volume I

1. Suppose we take a random sample of 30 companies in an industry with 200 companies. We
calculate the sample mean of the ratio of cash flow to total debt for the prior year. We find that
this ratio is 23 percent. Subsequently, we learn that the population cash flow to total debt ratio
(taking account of all 200 companies) is 26 percent. What is the explanation for the discrepancy
between the sample mean of 23 percent and the population mean of 26 percent?
A. Sampling error.
B. Bias.
C. A lack of consistency.

2. The best approach for creating a stratified random sample of a population involves:
A. drawing an equal number of simple random samples from each subpopulation.
B. selecting every kth member of the population until the desired sample size is reached.
C. drawing simple random samples from each subpopulation in sizes proportional to the relative
size of each subpopulation.

3. A population has a non-normal distribution with mean μ and variance σ². The sampling
distribution of the sample mean computed from samples of large size from that population will
have:
A. the same distribution as the population distribution.
B. its mean approximately equal to the population mean.
C. its variance approximately equal to the population variance.

4. A sample mean is computed from a population with a variance of 2.45. The sample size is 40. The
standard error of the sample mean is closest to:
A. 0.039.
B. 0.247.
C. 0.387.

5. An estimator with an expected value equal to the parameter that it is intended to estimate is
described as:
A. efficient.
B. unbiased.
C. consistent.

6. If an estimator is consistent, an increase in sample size will increase the:
A. accuracy of estimates.
B. efficiency of the estimator.
C. unbiasedness of the estimator.

7. For a two-sided confidence interval, an increase in the degree of confidence will result in:
A. a wider confidence interval.
B. a narrower confidence interval.
C. no change in the width of the confidence interval.

8. As the t-distribution’s degrees of freedom decrease, the t-distribution most likely:
A. exhibits tails that become fatter.
B. approaches a standard normal distribution.
C. becomes asymmetrically distributed around its mean value.

9. For a sample size of 17, with a mean of 116.23 and a variance of 245.55, the width of a 90%
confidence interval using the appropriate t-distribution is closest to:
A. 13.23.
B. 13.27.
C. 13.68.

10. For a sample size of 65 with a mean of 31 taken from a normally distributed population with a
variance of 529, a 99% confidence interval for the population mean will have a lower limit closest
to:
A. 23.64.
B. 25.41.
C. 30.09.

11. An increase in sample size is most likely to result in a:
A. wider confidence interval.
B. decrease in the standard error of the sample mean.
C. lower likelihood of sampling from more than one population.

12. A report on long-term stock returns focused exclusively on all currently publicly traded firms in
an industry is most likely susceptible to:
A. look-ahead bias.
B. survivorship bias.
C. intergenerational data mining.

13. Which sampling bias is most likely investigated with an out-of-sample test?
A. Look-ahead bias
B. Data-mining bias
C. Sample selection bias

14. Which of the following characteristics of an investment study most likely indicates time-period
bias?
A. The study is based on a short time series.
B. Information not available on the test date is used.
C. A structural change occurred prior to the start of the study’s time series.

CFA Website Questions


15. A mutual fund manager wants to create a fund based on a high-grade corporate bond index. She
first distinguishes between utility bonds and industrial bonds; she then, for each segment, defines
maturity intervals of less than 5 years, 5 to 10 years, and greater than 10 years. For each segment
and maturity level, she classifies the bonds as callable or noncallable. She then randomly selects
bonds from each of the subpopulations she has created. For the manager’s sample, which of the
following best describes the sampling approach?
A. Simple random
B. Systematic
C. Stratified random

16. The sampling error is best described as the:
A. sample standard deviation divided by the square root of the sample size.
B. difference between the observed value of a statistic and the quantity it is intended to
estimate.
C. sum of squared deviations from the mean divided by the sample size minus one.

17. An analyst collects data relating to five commonly used measures of leverage and interest
coverage for a randomly chosen sample of 300 firms. The data come from those firms' fiscal year
2012 annual reports. These data are best characterized as:
A. time-series data.
B. cross-sectional data.
C. longitudinal data.

18. A sample of 240 managed portfolios has a mean annual return of 0.11 and a standard deviation
of returns of 0.23. The standard error of the sample mean is closest to:
A. 0.01485.
B. 0.00096.
C. 0.00710.

19. An analyst gathered the following information about a stock index:

Mean net income for all companies in the index                     $2.4 million
Standard deviation of net income for all companies in the index    $3.2 million

If the analyst takes a sample of 36 companies from the index, the standard error of the sample
mean is closest to:
A. $400,000.
B. $533,333.
C. $88,889.

20. In generating an estimate of a population parameter, a larger sample size is most likely to
improve the estimator’s:
A. consistency.
B. unbiasedness.
C. efficiency.

21. All else held constant, the width of a confidence interval for a population mean is most likely to
be smaller if the sample size is:
A. larger and the degree of confidence is lower.
B. larger and the degree of confidence is higher.
C. smaller and the degree of confidence is lower.

22. Which of the following, holding all else constant, will most likely increase the width of the
confidence interval for a parameter estimate?
A. Reduction in the degree of confidence
B. Increase in the sample size
C. Use of the t-distribution rather than the normal distribution to establish the confidence
interval

23. The following information applies to a sample:
• The point estimate of the population mean is 12.5.
• The t-statistic ($t_{\alpha/2}$) used in calculating the 90% confidence interval is 1.67.
• The sample size is 64.
• The sample standard deviation is 5.
The 90% confidence interval for the population mean is closest to:
A. 11.98 to 13.02.
B. 12.37 to 12.63.
C. 11.46 to 13.54.

24. An increase in which of the following items will most likely result in a wider confidence interval for
the population mean?
A. Reliability factor
B. Sample size
C. Degrees of freedom

25. Use the following values from a student’s t-distribution to establish a 95% confidence interval for
the population mean given a sample size of 10, a sample mean of 6.25, and a sample standard
deviation of 12. Assume that the population from which the sample is drawn is normally
distributed and that the population variance is not known.

Degrees of Freedom    p = 0.10    p = 0.05    p = 0.025    p = 0.01
9                     1.383       1.833       2.262        2.821
10                    1.372       1.812       2.228        2.764
11                    1.363       1.796       2.201        2.718

The 95% confidence interval is closest to a:
A. lower bound of −2.20 and an upper bound of 14.70.
B. lower bound of −0.71 and an upper bound of 13.21.
C. lower bound of −2.33 and an upper bound of 14.83.

26. Survivorship bias is most likely an example of which bias?
A. Sample selection
B. Data mining
C. Look-ahead

Solutions
1. A is correct. The discrepancy arises from sampling error. Sampling error exists whenever one fails
to observe every element of the population, because a sample statistic can vary from sample to
sample. As stated in the reading, the sample mean is an unbiased estimator, a consistent
estimator, and an efficient estimator of the population mean. Although the sample mean is an
unbiased estimator of the population mean (the expected value of the sample mean equals the
population mean), because of sampling error we do not expect the sample mean to exactly equal
the population mean in any one sample we may take.

2. C is correct. Stratified random sampling involves dividing a population into subpopulations based
on one or more classification criteria. Then, simple random samples are drawn from each
subpopulation in sizes proportional to the relative size of each subpopulation. These samples are
then pooled to form a stratified random sample.
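
The proportional allocation described in solution 2 can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `stratified_sample`, the two strata, and the member labels are hypothetical and do not come from the curriculum.

```python
import random

def stratified_sample(strata, total_n, seed=0):
    """Pool simple random samples drawn from each stratum.

    Each stratum's sample size is proportional to its share of the
    population. Rounding means the pooled size can differ slightly
    from total_n when the shares do not divide evenly.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    pooled = []
    for members in strata.values():
        k = round(total_n * len(members) / population_size)
        pooled.extend(rng.sample(members, k))  # simple random sample, no replacement
    return pooled

# Hypothetical population of 200 items split 60 / 140 between two strata
strata = {
    "utility": [f"U{i}" for i in range(60)],
    "industrial": [f"I{i}" for i in range(140)],
}
sample = stratified_sample(strata, total_n=30)
n_utility = sum(1 for item in sample if item.startswith("U"))
print(len(sample), n_utility)  # 30 items, 9 of them from the utility stratum
```

With a 30% / 70% split, a sample of 30 takes 9 and 21 members respectively, so each stratum is represented in the sample in the same proportion as in the population.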

3. B is correct. Given a population described by any probability distribution (normal or non-normal)
with finite variance, the central limit theorem states that the sampling distribution of the sample
mean will be approximately normal, with the mean approximately equal to the population mean,
when the sample size is large.
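
A small simulation can make the central limit theorem result in solution 3 concrete. The sketch below assumes an exponential population (mean 1, variance 1), a sample size of 50, and 2,000 repetitions; all of these choices are illustrative and not part of the original question.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_mean(n):
    # Exponential population with rate 1: mean 1, variance 1, right-skewed
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(2000)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)
print(f"mean of sample means:     {mean_of_means:.3f} (population mean is 1)")
print(f"variance of sample means: {var_of_means:.4f} (sigma^2 / n is {1 / n:.4f})")
```

The mean of the sample means lands close to the population mean (choice B), while their variance is close to σ²/n rather than σ², which is why choice C is wrong.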

4. B is correct. Taking the square root of the known population variance to determine the population
standard deviation (σ) results in:
$\sigma = \sqrt{2.45} = 1.565$

The formula for the standard error of the sample mean ($\sigma_{\bar{X}}$), based on a known sample size (n), is:
$\sigma_{\bar{X}} = \sigma / \sqrt{n} = 1.565 / \sqrt{40} = 0.247$
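
The standard error arithmetic in questions 4, 18, and 19 follows the same pattern, so a short Python check may help. The helper name `standard_error` is a hypothetical choice for this sketch.

```python
import math

def standard_error(std_dev, n):
    """Standard error of the sample mean: standard deviation / sqrt(n)."""
    return std_dev / math.sqrt(n)

se_q4 = standard_error(math.sqrt(2.45), 40)   # question 4: variance 2.45, n = 40
se_q18 = standard_error(0.23, 240)            # question 18: s = 0.23, n = 240
se_q19 = standard_error(3_200_000, 36)        # question 19: sigma = $3.2 million, n = 36

print(round(se_q4, 3), round(se_q18, 5), round(se_q19))  # 0.247 0.01485 533333
```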

5. B is correct. An unbiased estimator is one for which the expected value equals the parameter it is
intended to estimate.

6. A is correct. A consistent estimator is one for which the probability of estimates close to the value
of the population parameter increases as sample size increases. More specifically, a consistent
estimator’s sampling distribution becomes concentrated on the value of the parameter it is
intended to estimate as the sample size approaches infinity.

7. A is correct. As the degree of confidence increases (e.g., from 95% to 99%), a given confidence
interval will become wider. A confidence interval is a range for which one can assert with a given
probability 1 – α, called the degree of confidence, that it will contain the parameter it is intended
to estimate.

8. A is correct. A standard normal distribution has tails that approach zero faster than the t-
distribution. As degrees of freedom increase, the tails of the t-distribution become less fat and the
t-distribution begins to look more like a standard normal distribution. But as degrees of freedom
decrease, the tails of the t-distribution become fatter.
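
The fatter tails can be seen by evaluating the t-density far from the mean. The sketch below writes out the standard Student's t density with `math.gamma`; the evaluation point x = 3 and the degrees of freedom shown are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

x = 3.0  # a point well out in the right tail
for df in (2, 10, 30):
    print(f"df = {df:>2}: density at x = 3 is {t_pdf(x, df):.5f}")
print(f"normal : density at x = 3 is {NormalDist().pdf(x):.5f}")
```

The density at x = 3 falls as the degrees of freedom rise and is smallest for the standard normal, consistent with choice A. The density stays symmetric around zero for every df, which rules out choice C.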

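9. B is correct. The confidence interval is calculated using the following equation:
$\bar{X} \pm t_{\alpha/2} \, s/\sqrt{n}$
With n = 17 there are 16 degrees of freedom, so $t_{0.05} = 1.746$, and $s = \sqrt{245.55} = 15.670$:
$116.23 \pm 1.746 \times 15.670/\sqrt{17} = 116.23 \pm 6.6357$
Therefore, the interval spans 109.5943 to 122.8657, meaning its width is equal to approximately
13.271. (The width can alternatively be calculated as 6.6357 × 2.)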
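
The width in question 9 can be reproduced as follows. The critical value 1.746 is an assumed table value (16 degrees of freedom, 5% in each tail); it is not given in the question itself.

```python
import math

n = 17
variance = 245.55
t_crit = 1.746  # assumed t-table value: df = 16, 5% in each tail

half_width = t_crit * math.sqrt(variance / n)  # reliability factor x standard error
width = 2 * half_width
print(f"half-width = {half_width:.4f}, width = {width:.3f}")
```

The sample mean of 116.23 does not enter the calculation, because the width depends only on the reliability factor and the standard error.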

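10. A is correct. To solve, use the structure of Confidence interval = Point estimate ± Reliability factor
× Standard error, which, for a normally distributed population with known variance, is represented
by the following formula:
$\bar{X} \pm z_{\alpha/2} \, \sigma/\sqrt{n} = 31 \pm 2.58 \times 23/\sqrt{65} = 31 \pm 7.3602$
The lower limit is therefore 31 − 7.3602 = 23.6398, or approximately 23.64.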
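
As a rough check of question 10, the lower limit can be computed with both a rounded reliability factor of 2.58 (an assumption about the table value used) and the unrounded quantile from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

mean, variance, n = 31, 529, 65
std_error = math.sqrt(variance / n)     # sigma / sqrt(n) = 23 / sqrt(65)

z_table = 2.58                          # assumed rounded table value
z_exact = NormalDist().inv_cdf(0.995)   # 0.5% in each tail for a 99% interval

lower_table = mean - z_table * std_error
lower_exact = mean - z_exact * std_error
print(f"lower limit with z = 2.58:     {lower_table:.2f}")
print(f"lower limit with z = {z_exact:.4f}: {lower_exact:.2f}")
```

The rounded factor gives 23.64, matching choice A; the unrounded quantile gives about 23.65, which is still closest to A.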

11. B is correct. All else being equal, as the sample size increases, the standard error of the sample
mean decreases and the width of the confidence interval also decreases.

12. B is correct. A report that uses a current list of stocks does not account for firms that failed,
merged, or otherwise disappeared from the public equity market in previous years. As a
consequence, the report is biased. This type of bias is known as survivorship bias.

13. B is correct. An out-of-sample test is used to investigate the presence of data-mining bias. Such a
test uses a sample that does not overlap the time period of the sample on which a variable,
strategy, or model was developed.

14. A is correct. A short time series is likely to give period-specific results that may not reflect a longer
time period.

15. C is correct. In stratified random sampling, one divides the population into subpopulations and
randomly samples from within the subpopulations.

16. B is correct. The sampling error is the difference between the observed value of a statistic and the
quantity it is intended to estimate.

17. B is correct. Data on some characteristics of companies at a single point in time are cross-sectional
data.

18. A is correct. For a sample, the standard error of the mean is $\sigma_{\bar{X}} = s/\sqrt{n}$ (where s is the sample
standard deviation and n is the sample size), which here is $\sigma_{\bar{X}} = 0.23/\sqrt{240} = 0.01485$.

19. B is correct. The standard error of the sample mean is equal to the population standard deviation
(σ) divided by the square root of the number of observations in the sample (n):
$\sigma_{\bar{X}} = \sigma/\sqrt{n} = 3{,}200{,}000/\sqrt{36} = 533{,}333$

20. A is correct. A consistent estimator is one for which the probability of estimates close to the value
of the population parameter increases as the sample size increases. Unbiasedness and efficiency
are properties of an estimator’s sampling distribution that hold for any size sample.

21. A is correct. As the degree of confidence is increased, the confidence interval becomes wider. A
larger sample size decreases the width of a confidence interval.
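
The two effects in solution 21 can be compared side by side. The sketch below assumes a known population standard deviation of 10 and uses normal reliability factors; both are illustrative assumptions, and `ci_width` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Width of a two-sided z-based confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
for n in (25, 100):
    for confidence in (0.90, 0.99):
        print(f"n = {n:>3}, confidence = {confidence:.2f}: width = {ci_width(sigma, n, confidence):.2f}")
```

Of the four combinations, the narrowest interval comes from the larger sample with the lower degree of confidence, which is choice A.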

22. C is correct. Reflecting the uncertainty of the unknown variance, confidence intervals based on
the t-distribution will be wider than those using the normal distribution because t > z for any
finite sample size n. Larger sample sizes and reduced confidence levels,
holding all else constant, both reduce the width of a confidence interval.

23. C is correct. $12.5 \pm 1.67 \times 5/\sqrt{64} = 12.5 \pm 1.04375$, or 11.45625 to 13.54375.

24. A is correct. An increase in the reliability factor (which rises with the degree of confidence) increases the width of
the confidence interval. Increasing the sample size and increasing the degrees of freedom both
shrink the confidence interval.

25. C is correct. With a sample size of 10, there are 9 degrees of freedom. The confidence interval
concept is based on a two-tailed approach. For a 95% confidence interval, 2.5% of the distribution
will be in each tail. Thus, the correct t-statistic to use is 2.262:
$6.25 \pm 2.262 \times 12/\sqrt{10} = 6.25 \pm 8.58369$, or −2.33 to 14.83.
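
The table lookup in question 25 is where mistakes usually happen, so here is a minimal Python sketch of it. The critical values are copied from the table in the question; the names `T_TABLE` and `t_interval` are hypothetical.

```python
import math

# Critical values from the question's table: T_TABLE[df][one-tail probability]
T_TABLE = {
    9:  {0.10: 1.383, 0.05: 1.833, 0.025: 2.262, 0.01: 2.821},
    10: {0.10: 1.372, 0.05: 1.812, 0.025: 2.228, 0.01: 2.764},
    11: {0.10: 1.363, 0.05: 1.796, 0.025: 2.201, 0.01: 2.718},
}

def t_interval(mean, s, n, tail):
    """Two-sided interval: mean +/- t * s / sqrt(n), with df = n - 1."""
    t_crit = T_TABLE[n - 1][tail]
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# 95% confidence leaves 2.5% in each tail; n = 10 gives 9 degrees of freedom
lower, upper = t_interval(mean=6.25, s=12, n=10, tail=0.025)
print(f"{lower:.2f} to {upper:.2f}")  # -2.33 to 14.83
```

The incorrect choices correspond to plausible lookup slips: using the row for 10 degrees of freedom (2.228) reproduces choice A, and using the p = 0.05 column (1.833) reproduces choice B.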

26. A is correct. Sample selection bias often results when a lack of data availability leads to certain
data being excluded from the analysis. Survivorship bias is an example of sample selection bias.
