Team Project Task List N. 2


Team Project

Task list n. 2

Now that you have a clear guideline on what you want to do, you can start working on
your project. In this task list we give you step-by-step instructions on how to
proceed, from collecting the data to analyzing them.

Bear in mind, however, that your presentation does not have to follow this step-by-step
procedure; it should instead be coherent. That is, from all the tasks that you
complete with this task list, you have to extract the most interesting results and
present them as if you were showing them to coworkers at your job or department. It
therefore makes no sense to start your presentation by saying “In task 1 we did...
and in task 2 we did...”; say instead “The product that we are studying is ... and we have
analyzed the following characteristics... and we found that these are the main results...”
Let's start, then, with the different steps that you have to follow.

 Design of the survey

Now that you know which information you need from your subjects, you have to think
about how you will obtain this information. This task is known as survey design.

Task 1: Using everything you know from the course and from your discussions about the theme
of your study, construct a clear and complete design for your survey. Make sure to specify:

 The chosen group of subjects for the product that you are analyzing (students,
professors, citizens in general, ...)
 The number of individuals you are going to interview
HINT: Remember that the larger the sample size, the more precise your
results and the more interesting your study.
 The questions you are going to ask your subjects in relation to the variables you
have chosen. Make sure to specify the units of measure, when necessary.
HINT: Ask your subjects to answer the numerical questions as precisely as
possible. The key to your study of the consumption of a product or a service
is to understand the variability in your subjects' consumer behavior, and this
variability should not be caused by a lack of precision in their
answers.
 Give examples and guide your subjects so that they can better answer the
questions, but without influencing their responses.
HINT: You can give options for the categorical variables, but do not do it for the
numerical variables so as not to influence the results.

 Managing the data


Once you have conducted your survey you should prepare the data in a format
which is suitable to perform your statistical analysis. You should enter your data
in an Excel spreadsheet so that you can later import it to other statistical
packages.

TASK 2: Prepare an Excel spreadsheet for data entry specifying where the different
variables are entered and the codification of your categorical variables.
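
As an illustration of what a codification can look like, here is a minimal sketch in Python (used only for illustration; you will work in Excel, R or Stata). The variable names, codes and labels are invented and are not part of the assignment. The idea is simply that each categorical answer is stored as a number, and a separate codebook records what each number means.

```python
# Hypothetical codebook: variable names, codes and labels are invented for illustration.
CODEBOOK = {
    "occupation": {1: "student", 2: "professor", 3: "other"},
    "buys_online": {0: "no", 1: "yes"},
}


def decode(variable, code):
    """Return the label of a numeric code, or fail loudly if the code is unknown."""
    labels = CODEBOOK[variable]
    if code not in labels:
        raise ValueError(f"unknown code {code!r} for variable {variable!r}")
    return labels[code]


# One row per respondent and one column per variable, as in the spreadsheet.
rows = [
    {"id": 1, "occupation": 1, "buys_online": 1},
    {"id": 2, "occupation": 3, "buys_online": 0},
]

decoded = [decode("occupation", row["occupation"]) for row in rows]
print(decoded)
```

Failing on an unknown code is a deliberate choice: a typing error during data entry is caught immediately instead of silently creating a new category.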

 Random sample collection

It is crucially important that your sample is truly random if you
want it to be useful and meaningful. For instance, it is not advisable
to ask your friends to fill in the survey (do you know why this would give you a
very poor study?). Ideally you should also not conduct your survey only at the UPF,
since you would only interview people related to the UPF and this would be a special selection,
unless this is exactly your target group. Remember that the more you randomize
your sample, the more interesting and representative the results of your study are going to
be.

Consequently we suggest the following procedure:

Step 1: Before you start, read the chapter of the Moore textbook on collecting
random samples. Try especially to understand Table B in the book's appendix.

Step 2: You will interview a minimum of 50 individuals. Choose any line of Table B and
write down the last two digits of the next 100 numbers that you find, reading to the
right until the end of the line and continuing on the following lines. If you reach the bottom
of the table, continue from the first line.

For instance, if you start at line 102, the first 5 numbers are:

76 50 00 27 54 ...

Once you have 100 two-digit numbers, discard duplicates (and also 00 if it appears,
as in the above example) and keep the first 50 two-digit numbers that remain.
If you end up with fewer than 50 different numbers (which is unlikely,
as there are not that many duplicates), continue from where you left off in the table until
you have 50. This procedure leaves you with 50 different numbers
between 1 and 99. Sort them in ascending order (from smallest to largest).

Suppose that you finish with a list that starts with the numbers 04, 05 and 10.
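
The selection in Step 2 can be sketched in Python as follows. This is only an illustration under one assumption: a pseudo-random generator stands in for Table B, which is not the procedure you are asked to follow. The names `pick_positions` and `digit_stream` are hypothetical.

```python
import random


def pick_positions(two_digit_numbers, how_many=50):
    """Keep the first `how_many` distinct numbers, skipping 00, and sort them."""
    chosen = []
    seen = set()
    for number in two_digit_numbers:
        if number == 0 or number in seen:
            continue  # discard 00 and duplicates
        seen.add(number)
        chosen.append(number)
        if len(chosen) == how_many:
            break
    return sorted(chosen)


def digit_stream(seed):
    """Assumption: an endless stream of pseudo-random two-digit numbers replaces Table B."""
    rng = random.Random(seed)
    while True:
        yield rng.randrange(100)  # 0..99, where 0 plays the role of 00


# The start of line 102 quoted in the text: 76 50 00 27 54
print(pick_positions([76, 50, 0, 27, 54], how_many=4))  # [27, 50, 54, 76]

positions = pick_positions(digit_stream(seed=2024), how_many=50)
print(positions[:3])
```

Because the function stops as soon as it has 50 distinct numbers, it also covers the case where the first 100 numbers contain too many duplicates: it simply keeps reading.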

Step 3: Choose a location in Barcelona or any other large city with a steady
flow of people, but not too many, and where they are not all of the same
type (all workers, all students, all from the same firm, all tourists, …), and plan to be there
at a certain time of day. Once you arrive, get ready to interview the first
individual. Start counting the people passing by and use your list of numbers: the first
person you will try to interview is the fourth one (04 is the first number in the list), the
second is the fifth, the third is the 10th, and so on.
Continue the procedure until you complete your 50 interviews.

It is quite likely that a person you try to interview will not want to participate.
This is not a problem. Simply wait for the next person on your list. If you
cannot get 50 responses from the first 99 people who pass by (in other words, if you
exhaust the list of 50 numbers that you extracted), start counting again from 1
to 99, using the list to decide whom to interview, until you complete your 50 cases.
Adapt this procedure to the number of cases that you have decided to collect and to the
group of subjects you want to interview, but make sure your sample is random. For
instance, it is valid to interview only students, but in that case it is not valid to interview
only law students or only female students.

Tell people that you are students doing a term project for your coursework, that
they would help you a lot by participating in the interview, that it will be very short (only 6
or so questions) and that they would be doing you a big favor.

Obviously nobody will check that you collect your sample properly. But believe me, if you create
a fictitious sample, inventing the data or bypassing the procedure proposed here, you
will end up paying for it: afterwards it will be very hard to complete your study, or it
will give absurd results, and this will affect your grade.

I. Data Analysis

Before you start this list of tasks...


 you should have collected the data discussed with the professor
 you should have entered your data organized into variables in a
spreadsheet

The next logical step in your data analysis is to get an overview of the
properties of your data, particularly the distribution of the variables that you
have chosen.

In this task list you will then:

 Represent your data graphically: histograms, boxplots and stemplots for
the numerical data, and pie charts and bar charts for the categorical data.
 Provide numerical summaries for your data.
 Interpret both forms of representation of the data, and understand their
interconnection and the limitations they present.

You will use Excel, R or Stata for your data analysis. Any statistical software that
you use is what we call a black box: we do not see what the
computer is doing with the data that you enter. It is crucial to understand the
concepts behind what the computer is doing and to interpret each graph and each
numerical summary computed in the list of tasks. The grade for your presentation
depends heavily on how well you show that you understand the techniques used
when presenting the results.

Bear in mind also that your presentation will not last very long, since all the groups
have to speak during the seminars of the corresponding week. You will therefore
need to produce visual and numerical results that are clear and that answer
the questions set in the list of tasks. You are not supposed to
go task by task reviewing all the analyses that you have made; you should
instead think about the main results and construct a coherent and
articulated story around them.

You should now have the spreadsheet with your data in front of you. Make sure you keep
this file in a safe place where it cannot be erased, and save a copy of everything you
do on a USB stick or a rewritable CD-ROM. Always making backups of your files is a
golden rule of working with data, and forgetting
it may put your job in danger.

If you have followed the tutorials, you already know how to enter data into
the different programs. In this task you will only need the simplest features
of these programs, and we will analyze each numerical variable and each
categorical variable separately.

Let's go! We are going to start with the categorical variables.

II. Categorical variables

Complete this section for each categorical variable that you have collected.

TASK 1: Construct a frequency table, with absolute and relative frequencies of the
analyzed variable. Interpret this table briefly.

TASK 2: Represent your data in a bar diagram and in a pie chart.

TASK 3: What is the relation between these frequency tables and these graphical
representations? Which one do you find more useful and why?
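
To make the frequency table of TASK 1 concrete, here is a minimal Python sketch (Python is used only for illustration; the answers are made up). It shows that the relative frequency of a category is its absolute frequency divided by the number of respondents, which is also what the bar heights and pie slices represent.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, absolute frequency, relative frequency), most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(category, n, n / total) for category, n in counts.most_common()]


# Made-up answers to a categorical question, for illustration only.
answers = ["student", "student", "worker", "student", "retired", "worker"]

table = frequency_table(answers)
for category, absolute, relative in table:
    print(f"{category:<10}{absolute:>4}{relative:>8.1%}")
```

The absolute frequencies add up to the sample size and the relative frequencies add up to 1, which is a quick check that no answer was lost.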

III. Numerical variables

Complete this section for each one of your three numerical variables.

TASK 4: Construct a frequency table, with absolute and relative frequencies of the
variable analyzed. Interpret this table briefly.
TASK 5: Construct a histogram of the variable. Interpret it briefly. Comment on all the main
features of this histogram:

 Center
 Spread
 Symmetry
 Outliers
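
What the software does when it draws a histogram can be sketched in a few lines of Python (illustration only; the data, the class width and the starting point are assumptions chosen for the example). Each observation is assigned to a class of equal width and the classes are counted, including empty ones.

```python
def histogram_counts(data, bin_width, start):
    """Count observations in classes [start + k*width, start + (k+1)*width).

    Assumes start <= min(data). Empty classes are kept, because gaps matter
    when looking for outliers.
    """
    n_bins = int((max(data) - start) // bin_width) + 1
    counts = [0] * n_bins
    for x in data:
        counts[int((x - start) // bin_width)] += 1
    return [(start + k * bin_width, c) for k, c in enumerate(counts)]


# Made-up weekly spending in euros, for illustration only.
spending = [12, 15, 18, 21, 22, 22, 25, 27, 31, 58]

for lower, count in histogram_counts(spending, bin_width=10, start=10):
    print(f"{lower:>3}-{lower + 10:<3} {'*' * count}")
```

In this made-up example the empty class between 40 and 50 separates the value 58 from the rest of the data, which is the kind of pattern you are asked to comment on.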

TASK 6: Construct a stemplot of the data. Make sure to specify the stems adequately
and the unit of the leaves. Comment on your choices and on the graph itself. How
does the stemplot compare to the histogram that you constructed?
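
The construction of a stemplot can be sketched in Python as follows (illustration only, with made-up data). The sketch assumes non-negative values and truncates each value to a whole number of leaf units rather than rounding it; `leaf_unit` is a hypothetical parameter name for the unit of the leaves.

```python
from collections import defaultdict


def stemplot(data, leaf_unit=1):
    """Return the lines of a stemplot: the leaf is the last digit, the stem is the rest."""
    scaled = sorted(int(x // leaf_unit) for x in data)
    leaves = defaultdict(list)
    for value in scaled:
        leaves[value // 10].append(value % 10)
    lines = []
    # Include every stem between the smallest and the largest, even if it has no leaves.
    for stem in range(scaled[0] // 10, scaled[-1] // 10 + 1):
        digits = "".join(str(d) for d in leaves[stem])
        lines.append(f"{stem:>3} | {digits}")
    return lines


# Made-up weekly spending in euros, for illustration only.
spending = [12, 15, 18, 21, 22, 22, 25, 27, 31, 58]

lines = stemplot(spending)
for line in lines:
    print(line)
```

Turned on its side, this is a histogram with classes of width 10, except that the individual values can still be read off the leaves.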

TASK 7: Construct a table with the numerical summaries, including all of the following:

 Measures of center: mean and median
 Measures of position: minimum, maximum and quartiles
 Measures of spread: standard deviation and interquartile range
 Measures of symmetry: skewness (asymmetry) measure and median/mean comparison
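
The summaries listed above can be computed as in the following Python sketch (illustration only, with made-up data). Two choices here are assumptions: the quartiles use the "inclusive" interpolation rule, and the asymmetry measure is the moment-based skewness coefficient. Different programs and textbooks use slightly different conventions, so your software may give somewhat different quartiles.

```python
import statistics


def summarize(data):
    """Return the numerical summaries of TASK 7 for one numerical variable."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.mean(data)
    pop_sd = statistics.pstdev(data)  # divides by n, used only for the skewness
    # Moment-based skewness: average cubed deviation divided by the cubed sd.
    skewness = sum((x - mean) ** 3 for x in data) / (len(data) * pop_sd ** 3)
    return {
        "mean": mean,
        "median": median,
        "min": min(data),
        "q1": q1,
        "q3": q3,
        "max": max(data),
        "sd": statistics.stdev(data),  # sample standard deviation, divides by n - 1
        "iqr": q3 - q1,
        "skewness": skewness,
    }


# Made-up weekly spending in euros, for illustration only.
spending = [12, 15, 18, 21, 22, 22, 25, 27, 31, 58]

summary = summarize(spending)
for name, value in summary.items():
    print(f"{name:<9}{value:>8.2f}")
```

In this made-up example the mean (25.1) is larger than the median (22) and the skewness is positive, both of which point to a distribution skewed to the right.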

Continue by answering the following questions:

 Interpret your measures of center. It is likely that they are different. What is the
meaning of this difference? What does it imply for the shape of the distribution?
Is this reflected in the shape of the histogram and of the stemplot of each
variable?
 What do the measures of spread of your distribution tell you? Are all these measures
equally adequate for all the numerical variables that you have?
 Interpret your measures of position. How do they relate to the histogram and
the stemplot?
 Is the distribution of your variables symmetrical or skewed (and if skewed, in
which direction)? Can you think of any intuitive reason why your variable is
symmetrical or skewed?
 Are there any outliers in the variable? If so, do you have any explanation for
their presence?
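
For the question about outliers, one common convention flags any observation that lies more than 1.5 times the interquartile range below the first quartile or above the third quartile. This task list does not prescribe a rule, so treat the following Python sketch as one possible approach (illustration only, made-up data, quartiles computed with the "inclusive" rule as an assumption).

```python
import statistics


def flag_outliers(data):
    """Return the observations outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]


# Made-up weekly spending in euros, for illustration only.
spending = [12, 15, 18, 21, 22, 22, 25, 27, 31, 58]

outliers = flag_outliers(spending)
print(outliers)  # [58]
```

A flagged value is only a candidate: you still have to decide whether it is a data entry error or a genuine but unusual respondent.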

Prepare a brief written report based on these tasks (at most 5 pages including graphs
and tables). This report will have to be uploaded to Aula Global (one per team).

Prepare a brief presentation in which each member of the group speaks for a couple of
minutes (everybody has to talk). For this presentation, bear in mind:

 The whole team will have approximately 10 minutes. The seminar instructor will set
the exact time depending on the number of teams that have to talk, and this limit
has to be respected strictly.
 Follow the criteria for presenting in front of an
audience, based on the ideas that you can find in Aula Global.
