User’s Manual
TUM70127A
COPYRIGHT 1997 BY QUANTIME LIMITED
Quantime Limited
Maygrove House
67 Maygrove Road
London
NW6 2EG
England
Contents
List of figures
1 Introduction....................................................................................................1
1.1 What Quantum does ................................................................................................... 1
1.2 Stages in a Quantum run............................................................................................. 1
5 Expressions ...................................................................................................23
5.1 Arithmetic expressions ............................................................................................. 23
Combining arithmetic expressions ........................................................................... 24
Counting the number of codes in a column.............................................................. 25
Generating a random number................................................................................... 27
9 Flow control................................................................................................107
9.1 Statements of condition – if .................................................................................... 107
9.2 Statements of condition – else ................................................................................ 109
9.3 Routing around statements ..................................................................................... 110
9.4 continue................................................................................................................... 111
9.5 Loops ...................................................................................................................... 111
do with individually specified numeric values....................................................... 112
do with numeric ranges .......................................................................................... 113
do with codes.......................................................................................................... 114
Nested loops ........................................................................................................... 115
Routing with loops ................................................................................................. 115
9.6 Rejecting records .................................................................................................... 116
9.7 Jumping to the tabulation section ........................................................................... 117
9.8 Stopping the processing of data by the edit ............................................................ 118
9.9 Canceling the run.................................................................................................... 119
9.10 Going temporarily to the tab section ...................................................................... 120
22 Table texts...................................................................................................365
22.1 Table titles .............................................................................................................. 365
Titles for T-statistics tables only ............................................................................ 368
22.2 Underlining titles .................................................................................................... 368
22.3 Printing text at the foot of a table ........................................................................... 369
22.4 Printing text at the bottom of a page....................................................................... 371
22.5 Table numbers ........................................................................................................ 372
22.6 Page numbers.......................................................................................................... 374
22.7 Controlling justification for individual tables......................................................... 376
25 Weighting....................................................................................................395
25.1 Weighting methods ................................................................................................. 395
30 Z, T and F tests...........................................................................................501
30.1 Z - tests ................................................................................................................... 501
One-sample Z-test on proportions.......................................................................... 501
Two-sample Z-test on proportions ......................................................................... 502
Z-Test on sub-sample proportions.......................................................................... 504
Z-Test on overlapping samples .............................................................................. 506
30.2 T-tests and F-tests ................................................................................................... 507
One-sample and paired T-test ................................................................................ 507
Two-sample T-test.................................................................................................. 511
40.2 Converting Quantum data and programs with qtspss ............................................. 656
How qtspss works................................................................................................... 656
Preparing the Quantum run .................................................................................... 660
Running quantum and qtspss.................................................................................. 660
40.3 Converting Quantum data and programs with nqtspss ........................................... 660
How nqtspss works................................................................................................. 661
Preparing the Quantum run .................................................................................... 666
Running Quantum and nqtspss............................................................................... 666
40.4 Converting Quantum data and programs with qtsas............................................... 669
How qtsas works .................................................................................................... 669
Interface between Quantum and sas ...................................................................... 669
Preparing the Quantum run .................................................................................... 676
Running Quantum and qtsas .................................................................................. 676
40.5 Converting Quantum data and programs with nqtsas............................................. 676
40.6 SPSS output from nqtsas ........................................................................................... 677
Index
List of figures
The Quantum User’s Manual is written for spec. writers and Quanvert database
administrators, or similar people, who will be preparing data for use with Quanvert
Menus, Quanvert Text or Quanvert for Windows.
Important notes and cross references are marked with special symbols so they stand out
from the rest of the text.
Chapters
Chapters 1 to 3 give you an overview of the language and explain the basic concepts of
Quantum spec. writing.
Chapters 4 to 32 describe the keywords which make up the Quantum language and
include many hints and suggestions for writing efficient programs, as well as examples
of how to tackle particular spec. writing problems.
Chapter 33 describes files you may need to create in order to use certain Quantum
facilities.
Chapter 36 discusses many of the files created during a run and draws your attention to
those of particular interest.
Chapters 37 to 39 introduce a number of useful programs which you run outside of the
main Quantum program. In these chapters you’ll learn about creating a table of contents
for a run, converting the standard tabulation output into a file suitable for printing on a
PostScript printer, as well as finding out about a number of utility programs for cleaning
the project directory at the end of a run.
Chapter 40 tells you how to convert tables into a comma-delimited ASCII file, or convert
a Quantum program and data file into a SAS or SPSS data set and description.
Appendix A lists limits built into the Quantum source (some of which may be changed).
Appendix B contains a list of compilation error messages with suggestions as to why you
may see them and how to solve the problem which caused them to appear.
Appendix D explains how you can use Quantum with data that contains characters in the
extended ASCII character set.
Appendix F offers suggestions on how you can check whether a particularly large job will
run on your computer.
Appendix G contains a look-up table for options in the tabulation section showing the
statement(s) on which each keyword may be used.
Words which are keywords in the Quantum language are normally printed in italics in
the text. In the main description of each keyword, the keyword is shown in bold the first
time it is mentioned.
When showing the syntax of a statement, as in the Quick Reference sections, all
keywords are printed in bold. Parameters, such as question texts or responses, whose
values are user-defined are shown in italics. Optional parameters are enclosed in square
brackets, that is, [ ].
The ☞ symbol marks a reference for further reading related to the current topic.
Quantum has been designed with market researchers in mind so its syntax and grammar
are similar to English. Nevertheless, it is still a computer language and as such should be
used with precision and understanding.
• to provide you with enough information about how Quantum works to enable you to
carry out a specific task.
• to help you work out what went wrong when errors occur or when your output is not
what you expected.
• generate tables
Any Quantum run may perform as many or as few of these tasks as you like, but for each
run the basic format is the same.
First, the data is read onto a disk. Data on disk can come from punched cards or tape. It
may also be entered directly via a terminal by a telephone interviewer using Quancept or
a data entry clerk using a data entry package.
Next, the tasks to be performed are defined using the Quantum language, which is
described in chapter 2 to chapter 32.
Then, Quantum translates these tasks into instructions that the computer can understand.
Finally, the computer itself uses this program to run your job.
Quantum comprises two sections – an edit section and a tabulation section. The edit section checks
and validates the data, generates lists and reports, corrects data, produces new data files,
and recodes data and creates new variables. The tabulation section produces tables and
performs statistical calculations.
Quantum reads the records in the data file one at a time and passes them through the
various parts of the Quantum program. As long as there are records remaining in the data
file, the loop of ‘read a record → edit → tabulate’ is repeated; once the last record has
been processed, the tables are ready for printing.
If errors occur at any point in a Quantum run an error message is printed telling you what
is wrong.
2 Your Quantum program
Your Quantum program is the basic requirement for any Quantum run. It tells the
computer what tasks it has to perform. All Quantum programs are written in the Quantum
language which both you and the computer can understand. When writing in this
language you must take care that you say exactly what you mean; otherwise your output
may not be quite what you expect. The computer cannot guess at what you mean it to do;
it only does what you tell it.
All Quantum programs are stored in separate files on the computer. Each file has a unique
name which may be made up of any characters on your keyboard, but you are advised to
use only letters and numbers in your filenames.
*include edit
a;dsp;spechar=–*;decp=0;flush
*include tabs
*include axes
where the file called edit contains editing instructions, the file called tabs contains
statements defining the tables required, and axes contains statements which define the
individual rows and columns each table is to have. The a statement lists characteristics
that all tables are to have.
In the following sections we will explain briefly the types of statements you can use.
Edit statement
Quantum edit statements contain a Quantum keyword and other texts and numbers.
Statements in the edit section may start in any column. A line may contain one or more
statements, as long as each statement is separated by a semicolon.
Edit statements may be preceded by a label number of up to five digits allowing them to
be referenced by other parts of the program, for example:
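total = c56 + c57 + c58
if (total .gt. 8) go to 100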
Here we are adding the number in column 56 to those in columns 57 and 58 and saving
the result in a variable called ‘total’. If this value is greater than eight we go to statement
100, otherwise we continue with the statement immediately after the if line.
Quantum offers you the ability to check and verify your data prior to tabulation. Suppose
your questionnaire contains a series of questions to be answered only by people buying
a specific brand of tea. You may want to check that everyone who didn’t buy tea has a
blank in all columns related to tea. On the other hand, if they did buy a specific brand of
tea, you could check whether the codes in the following columns were within a specific
range.
The statement that you would use for this type of test is require. To perform the test given
as an example, we might write:
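require (c24’1’.and.c(25,30)u$ $).or.(c24n’1’.and.c(25,30)=$ $)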
This says that if column 24 contains a ’1’, then columns 25 to 30 must not be blank,
otherwise, if column 24 does not contain a ’1’, then columns 25 to 30 must all be blank.
More generalized checking facilities exist which enable you to produce frequency
distributions of numeric data (e.g., how many respondents have the number 201 in
columns 13 to 15) or holecounts (marginals) which show the broad pattern of coding
across all columns in the data. Words associated with these are list and count.
When errors are found in the data, you have several courses of action open to you. You
may:
if (c224’5’) write
Incidentally, many of the statements mentioned in this section may be used for other
purposes, rather than just to deal with errors.
Quantum offers you many aids to efficient programming. Repetitive checks may be
specified once with instructions to Quantum to repeat them a given number of times or
until a certain condition is satisfied. The word associated with loops of this kind is do.
There are two sorts of routing: you may either go to another edit statement (go to) or you
may send the record straight on to the tabulation section (return).
Tabulation statements
Tabulation statements tell Quantum which tables are required and how to create them.
They consist of a start letter or keyword to identify the type, and may be followed by
other keywords, numbers or text. They are used to define rows and columns (elements),
the variables that are to be cross-tabulated (axes) and finally, the tables themselves.
There are also statements for weighting your data and for creating tables by manipulating
the contents of tables created previously in the current run or even in other runs.
Writing in the Quantum language is very easy but as with all computer languages it needs
to be done with care and precision to obtain the required results.
The characters and symbols that you may use in Quantum are:
Where symbols have two meanings, the meaning required will become clear in the
context in which the symbol is employed.
Quantum is a ‘free-format’ language which means that within reason you may enter your
program however you like. Statements occupy columns 1 to 200 of successive lines and
may be written in upper or lower case or a combination. Thus:
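errprint 5
ERRPRINT 5
Errprint 5
are all treated as the same statement.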
The exception to this is text in tables, where the text is printed on the tables in the same
case as you write it in your Quantum program. Additionally, you must set up table text
so that it fits on the paper when you print your tables. Therefore, if you want the table
title to be printed on two lines, you must write it on two lines in your program.
Generally, spaces are allowed anywhere in a Quantum program except within Quantum
keywords.
As we mentioned earlier, Quantum has separate edit and tabulation sections which may
or may not be in the same file. If your program contains an edit, it must precede the
tabulation statements and must be enclosed by the words ed and end, each on a separate
line, thus:
ed
.
edit statements
.
end
Errors will occur if either of these words is missing. If there is no edit these statements
are not needed.
3.3 Comments
Comment statements insert comments or information into the Quantum program. They
do not affect the way your program works because they are ignored when the program is
run to produce tables.
It is a good idea to put comment statements in your program in case someone else has to
take over your job or alternatively to remind yourself what you are doing and why. For
example:
3.4 Continuation
Any Quantum statement may be continued over several lines by starting the second and
subsequent lines with + or ++, depending on where the statement is split.
A single plus sign is used when the statement is split between keywords. This assumes
that a semicolon appears at the end of each continued line, whether or not there is actually
one there. Take the statement:
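if (c132’12’.and.t5.gt.50) write $t5 incorrect$; else; write ofil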
This could be split in three places with a single plus sign for a continuation:
if (c132’12’.and.t5.gt.50)
+write $t5 incorrect$
+else
+write ofil
We have omitted the semicolons at the end of each line, but it would not be wrong to
leave them in.
The double plus sign introduces an internal continuation of a long statement over several
lines. Statements may be split between lexics; that is, between keywords, conditions, lists
of numbers, and so on, but not in the middle of any of these. In our previous example, we
could write:
if (c132’12’.and.
++t5.gt.50) write $t5 incorrect$; else; write ofil
A double plus is needed here because we have split an expression in which one parameter
is dependent on the other. The statement on the first line means nothing on its own,
neither does the second line, hence the ++. We could equally well have split the
expression before the .and. or before or after the .gt.. To split it between t and 5, or in any
other similar place, is incorrect because the two characters by themselves do not mean
anything.
Quick Reference
To have possible syntax errors (i.e., ones which Quantum can process even though they
are not quite perfect) treated as fatal, type:
check_
To have them flagged but otherwise ignored (the default), type:
nocheck_
When the Quantum compiler is checking your program and finds an error it flags the
incorrect statement with an explanatory error message and continues with the next
statement. If any of these errors are fatal – that is, Quantum cannot convert your
statement into C code – the run will be terminated.
Sometimes Quantum finds statements which are not quite correct, but which it can still
convert into C. In these cases the compiler flags the statement with the message ‘Possible
syntax error’ and continues as if nothing were wrong. You can choose to have this type
of error treated as fatal and have the run terminated at the end of the compilation by
entering the statement check_ (note the underscore at the end) at the start of your edit.
The statement nocheck_ causes possible syntax errors to be flagged but ignored, and this
is the default.
Quick Reference
To have more or less than the default of 20 error messages displayed on your screen, type:
errprint n
When the Quantum compiler finds errors in your program, it copies them to the
compilation listing file. It also displays the first twenty messages on your screen. You
may increase or decrease this number by placing the statement:
errprint n
at the top of your main program file, before the edit and tabulation sections.
n is the number of messages you want to see on your screen: it must be an integer. Thus:
errprint 5
prints the first five error messages on the screen and in the listing file, and then any others
only in the file.
Constants may be data constants, integer numbers or real numbers.
Individual constants
Quick Reference
To refer to one or more codes in a single column, type:
’codes’
Suppose a questionnaire asks for your favorite color, with the answers
Red 1
Yellow 2
Blue 3
Green 4
Black 5
White 6
coded into one column. If my favorite color is green, this will appear in the data file as a
4 in the appropriate column, just as if your favorite color is red, there will be a 1 in that
column.
To refer to these answers inside your Quantum program (maybe we only want our table
to include those respondents whose favorite color is blue), type in the code enclosed in
single quotes:
’3’
You will also have to tell Quantum which column to look in.
☞ To find out how to refer to columns, see the section entitled "Data variables" later
in this chapter.
Several codes may be combined in the same column and are called multicodes.
Throughout this manual when we talk of multicodes or multicoding we mean two or more
codes in the same column. Suppose the next question asks me to choose three colors from
the same list; I pick yellow, black and white. If these answers were all coded in the same
column (a multicoded column), we would refer to them by typing:
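’256’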
or any other variation of those three codes. Quantum does not care what order you enter
the codes in.
If you have a series of consecutive codes in the order &–01234567890–& you may either
type each code separately or you may enter the first and last codes separated by a slash
(/) meaning ‘through’, as shown below:
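’346’      the codes 3, 4 and 6
’6789’     the codes 6, 7, 8 and 9, entered individually
’6/9’      the codes 6 through 9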
As you can see, the last two examples mean exactly the same thing. However, the
notations ’0/&’ and ’0–&’ are not the same: ’0/&’ means ’01234567890–&’ whereas ’0–
&’ is ’0’, ’–’ and ’&’ only.
Some combinations of codes represent ASCII characters; that is, they represent characters
which you can type on your screen:
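’&1’ to ’&9’    the letters A to I
’-1’ to ’-9’    the letters J to R
’02’ to ’09’    the letters S to Z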
The only time you would use letters rather than codes (i.e., ’A’ rather than ’&1’) is when
the questionnaire tells you that a column should contain a letter.
Sometimes we may need to write a notation for ‘no codes’ – for instance, if my favorite
color does not appear in the list of choices. To do this, we write ’ ’ (i.e., a blank enclosed
in single quotes).
✎ The notation ’ ’ is a special case since blank is not really a code. If you type a blank
inside single quotes with any other characters Quantum will follow its usual rule of
ignoring spaces. This means that references of the form ’1 2’ are read as ’12’.
Quick Reference
To refer to a string of codes in a field of columns, type:
$codes$
When data constants are single-coded or the multicodes correspond to ASCII characters
(e.g., ’A’, ’B’) they may be strung together. Strings of data constants are sometimes
called literals or column fields. Strings are enclosed in dollar signs, with the component
single codes losing their single quotes. For example:
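$12345$     $AB12$     $123 56$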
The first string is five columns long with 1 in the first column, 2 in the second, 3 in the
third, and so on. The third string is six columns wide with the fourth column being blank.
• when the answers to a question are represented by codes of more than 1 digit. For
example, in a car ownership survey the car make and model owned may be
represented by a 3-digit code. To pick up respondents owning a particular type of car
you would need to check whether the relevant columns contained the code for that
car. For instance, to look for owners of Ford Escorts you might ask Quantum to
search for the string $132$ in a particular field of columns.
4.2 Numbers
Whole numbers
Quantum can deal with whole numbers (integers) in the range −2,147,483,647 to
+2,147,483,647.
Your data will contain whole numbers whenever there are questions requiring numeric
responses: for example, the question ‘How many children do you have?’ can only be
answered with a whole number. If the respondent has three children, the number 3 will
appear in the appropriate column in his data record, whereas a respondent with five
children will have a five in that column instead.
Whole numbers are also used if you want to perform arithmetic calculations during the
run, for instance to multiply a field by a number. You can find out more about arithmetic
in Quantum by reading chapter 5.
Real numbers
Real numbers are numbers containing decimal points. To be valid, they must have at least
one digit on either side of the decimal point:
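1.5     0.75     250.0     are valid real numbers
.75     250.               are not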
Quantum deals with real numbers of any size with accuracy up to six significant figures.
Numbers with more than six significant figures have the sixth figure rounded up or down
depending on the value of the remaining figures. Here are some examples of rounding:
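1234.5678      is rounded to 1234.57
2.7182818      is rounded to 2.71828
0.66666666     is rounded to 0.666667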
There are three types of variables – data, integer and real – each used for storing different
types of information. You may create your own variables with names representing the
type of information stored (e.g., the variable called meals might contain a count of the
number of meals eaten during the day) or you may use the ones offered automatically by
Quantum.
Sometimes it is useful for a series of variables to have the same name. Each variable may
then be addressed by its position in the group. This arrangement is known as an array.
Arrays are discussed further in the following sections.
Data variables
Quick Reference
To refer to a single data variable in the C array, type:
cnumber or c(number)
To refer to a field of data variables, type:
c(start_pos,end_pos)
To refer to a data variable or array that you have created yourself, use the same notation but replace the c with the variable’s name.
At the start of every job, Quantum provides you with an array of 1,000 data cells called
C. This array is sometimes referred to as the C matrix. The individual cells are called
C-variables. Each C-variable stores one ‘column’ of data. Quantum reads data from your
data file into this array: we will discuss exactly how it does this in chapter 6. For the time
being, let’s say we have a very small questionnaire which uses 43 columns to store the
data. Quantum will read the data for each respondent into cells 1 to 43 of the C array, one
respondent at a time. The codes from column 1 of the data are copied into cell 1 of the C
array, the codes from column 2 of the data are copied into cell 2, and so on. When
Quantum has finished with that respondent’s data it clears out the cells in the C matrix
and reads the data for the next respondent, placing it in cells 1 to 43 of the array.
We can access this data by defining the columns whose contents we wish to inspect or
change. Let’s take the questions about color that we mentioned earlier. The printed
questionnaire tells us that the respondent’s favorite color will be coded into column 15.
To look at this column we would write:
c15 or c(15)
The C may be in upper or lower case, and the parentheses around the column number are
optional. To refer to column 43 we would write:
c43 or c(43)
Now suppose we want to look at a field of columns such as the questionnaire serial
number in columns 1 to 5. All we have to do is tell Quantum that the serial number is in
a field starting in column 1 and ending in column 5, as follows:
c(1,5)
C-variables are reset to blank before a new respondent’s data is read. Thus, you can be
certain that Quantum never muddles the contents of column 10 for the first respondent
with those of c10 for the second respondent.
As we mentioned above, you may create your own data variables to store specific pieces
of data. For instance, in a shopping survey we may want to store data about visits to
Sainsburys in an array called ‘sains’ and data about visits to Safeways in an array called
‘safe’.
Before we can use these arrays, we must create them. If each array is to contain 100 cells or columns of data, we would write declarations along the following lines (assuming a data keyword analogous to the int and real array declarations shown later in this chapter):
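data sains 100s
data safe 100s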
where the s at the end of each statement causes Quantum to recognize that, for example,
safe1 is the same as safe(1), just as it knows that c15 and c(15) refer to the same column
of data. If you created the arrays without the s, then Quantum would not recognize safe1
as being the same as safe(1).
Data variables which you create remain blank until you copy data into them. If the data
about visits to Sainsburys is stored in columns 30 to 45, then we might copy this into cells
30 to 45 of the array called sains. If we then want to use this data we can write statements
which refer to sains30 to sains45. Unless you subsequently change the data in
sains(30,45), each time you refer to one of those cells it is exactly the same as referring
to c30, c45, and so on, in the C array, and to columns 30, 45, and so on, in the data file.
In this simple example, there is not much to be gained (apart from an immediate
improvement in readability) by using your own data variables. However, when you have
many columns of data per respondent, or a complicated Quantum program, named data
variables can be very useful for improving readability and also for providing simple yet
powerful facilities for data manipulation.
To find out more about creating and using named data variables, read chapter 14. Here
are some further examples:
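sains30          cell 30 of the sains array
sains(30,45)     the field from cell 30 to cell 45 of sains
safe(1)          the first cell of the safe array, also written safe1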
Integer variables
Quick Reference
To define an integer variable or array, type:
int name [number_of_cells[s]]
Integer variables store whole numbers. Strings of integer variables are called integer
arrays, and each cell in the array may store any whole number from −2,147,483,647 to
+2,147,483,647.
At the start of each run, Quantum provides an array of 200 integer variables called T. The
first cell in this array is the integer variable t1 which maystore any value within the given
range; the second cell in the array is the integer variable called t2 which may also store
any value within the given range.
To illustrate the difference between a data variable and an integer variable, let’s suppose
that our data contains the value of the respondent’s car to the nearest whole pound. If the
value is £6,000, this will take up 4 columns in the data (assuming that we are only
concerned with the digits) – that is, four data variables, the first of which will contain the
6, and the other three of which will all contain zeroes.
If we placed this same value in an integer variable, we would only need one variable to
store the whole value because each variable can store values in the range
±2,147,483,647.
We have already mentioned that Quantum provides an integer array of 200 integer
variables. You may create your own arrays using statements similar to those shown
above for data variables. Suppose you have a household survey in which you have
collected the value of each car that the family owns. You want to set up an integer array
in which to store each value, so you write:
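int carval 10s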
This creates an array called carval which contains ten separate integer variables called
carval1 to carval10. Notice that we have followed the array size with the letter s so that
we can omit the parentheses from the individual variable names. We can then copy the
value of the first car into carval1, the value of the second car into carval2, and so on. If a
particular household owns three cars valued at £6,000, £2,500 and £500, then carval1
would have a value of 6,000, carval2 would be 2,500 and carval3 would be 500.
If you create your own integer variables, it is recommended that you name them with
names that reflect their purpose in the run, as we have done in our example.
✎ All integer variables have a value of zero at the start of a run, and they are not reset
between respondents. If you want your integer variables to store information about
the current record only, you must include statements in the edit which reset those
variables to zero when a new record is read. For example, we might write:
carval1 = 0
at the start of the edit to reset the first integer variable of the carval array to zero.
T-variables with non-zero values are printed out at the end of the run.
Real variables
Quick Reference
To define a real variable or array, type:
real name [number_of_cells[s]]
You may define real variables and arrays to store real numbers with accuracy up to six
significant figures. Values with more than six significant figures have the sixth figure
rounded up or down according to the value of the extra figures.
☞ For further explanation of real values, see the section entitled "Real numbers".
As with integer variables, the names of real variables should give some clue to the type
of information they contain. Real arrays are created by statements of the form:
real liters 5s
This example creates a real array called liters which has five real variables named liters1
to liters5. It can store five real values, the first in liters1 and the fifth in liters5.
Quantum also provides a set of 100 real variables named X which you may use.
✎ All real variables start with a value of 0.0 and are not reset to zero between
respondents.
As an example, let’s say that the data contains information on how long, on average, each
person in the household spent watching television during a given week. We want to
manipulate these figures so we create an array of real variables in which to store the
average viewing figures:
real tvwatch 8s
This provides room for up to eight people’s figures. If our household contains four people
with viewing averages of 20.8 hours, 15.75 hours, 9.75 hours and 10.0 hours, then
tvwatch1 will have a value of 20.8, tvwatch2 will have a value of 15.75, tvwatch3 will be
9.75 and tvwatch4 will be 10.0 hours. The rest of the variables in the array have values
of 0.0.
Real variables with non-zero values at the end of the run are not printed out
automatically. If you want to see these values, you will need to write them using a report
statement.
Quick Reference
To read real values from the C array, type:
cx(start_col, end_col)
As we have already said, data from the questionnaire is read into columns for use during
the run. When the data contains real numbers you will have to tell Quantum that the dot
is to be treated as a decimal point rather than as a multicode representing a number of
different answers. The way to do this is to refer to the field as cx:
cx(15,20) cx(131,135)
Here we have two fields containing real numbers: the first is six columns wide including
the decimal place, which means that the number itself contains five digits, whereas the
second is only five columns wide with four digits. Notice that there is no need to tell
Quantum where the decimal point is.
4.4 Subscription
As we have shown above, you may refer to specific variables in integer and real arrays
and cells or columns in data arrays by naming their position in the array.
For example:
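c10     t5     x3     carval2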
Variables within an array may also be referred to using any arithmetic expression. In this
case, parentheses must be used. For example:
c(t1) the column number depends on the value of t1. If t1 has a value of 10,
then the variable is c10; if t1 is 67, the variable is c67.
c(t4,t5) the field delimiters depend on the values of t4 and t5. If t4 has a value
of 12 and t5 has a value of 19, the column field referred to is c(12,19).
t(c4) the variable number depends on the value in c4. If c4 contains a single
code in the range 1 to 9, the integer variable will be one of t1 to t9
depending on the exact value in c4. If c4 is multicoded, then the result
is nonsense.
time(c4*23) the variable number is the result of multiplying the value in c4 by 23.
As in the previous example, c4 must be single-coded in the range 1 to
9 for this example to make sense. Thus, if c4 contains just a 4, the value
of the expression is 92 so the variable referred to is time92.
When variables are referenced in this way, the value of the expression must be positive.
The expression c(t1−5) is acceptable as long as t1 is at least 5. If the expression has a zero
or negative value Quantum will issue an array dimension error when it comes to read the
data during the datapass. Also, if the variable refers to columns, the value of the subscript
must not exceed 32,767.
These are called subscripted variables and they greatly increase the flexibility with
which you can write your edit.
✎ Subscription may be used in repetitive processes to save you writing the same thing
over and over again.
☞ See section 9.5 for an example.
The simplest form of arithmetic expression is a single positive or negative number such
as 10 or −26.5 or an integer or real variable.
Although the C Array is data, columns may also be used in arithmetic when the response
coded into those columns is a numeric response, such as a respondent’s age or the number
of different shops he visited. For example, if columns 243 to 247 contain the codes
4,7,2,6 and 0 respectively the value in c(243,247) could be read as 47,260. Similarly, if
columns 45 to 48 contain 7, 8, a dot and 2 respectively, the value in cx(45,48) would be
78.2.
Blank columns in a field are ignored when the codes in those columns are evaluated.
Thus, if columns 20 to 21 contain the codes 6 and 7 respectively, and column 22 is blank,
the codes in c(20,22) will be evaluated as 67. A similar result is produced if the blank
column appears anywhere else in the field. All the examples of c(20,22) below produce
an arithmetic value of 67:
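column 20    column 21    column 22
   6            7           blank
   6          blank           7
 blank          6             7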
The same applies to multicoded columns. If you use a multicoded column as part of an
arithmetic expression, the multicoded column will be ignored. The exception to this is a
multicode of a digit and a minus sign which creates a negative number: a minus sign
anywhere in a numeric field negates the value in the field as a whole, not just the number
it is multicoded with. For example:
----+----1----+----2
    5 3778          is 5378
        9
        0
2----+----3----+----4
     12-4           is -1234
       3
4----+----5----+----6
    83-             is -83
Quick Reference
To combine arithmetic expressions, type:
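variable operator variable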
where variable is a numeric value or the name of a variable containing a numeric value,
and operator is one of the arithmetic operators +, −, * (multiply) or / (divide).
More often than not you will want to combine numeric expressions to form a larger
expression, for instance to count the number of records read with a given code in a named
column.
Arithmetic expressions are linked with any of the arithmetic operators listed below:
+ (addition)
− (subtraction)
* (multiplication)
/ (division)
Expressions may contain more than one of these operators, for instance:
t5 + c(134,136) / otot
c(150,152) * 10 + 2.5
Expressions are evaluated in the following order:
1. Expressions in parentheses
2. Multiplication and division
3. Addition and subtraction
If you wish to change this order you should enclose the expressions which go together in
parentheses. The first expression in the example above will be evaluated by dividing the
value in columns 134 to 136 by otot and adding the result to t5. If you change the
expression to:
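(t5 + c(134,136)) / otot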
this adds the values of t5 and c(134,136) first and then divides that by otot. Let’s
substitute numbers and compare the results. If t5=10, otot=5 and the value in c(134,136)
is 125 the two versions of the expression would read as follows:
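t5 + c(134,136) / otot    =  10 + (125 / 5)  =  35
(t5 + c(134,136)) / otot  =  (10 + 125) / 5  =  27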
Where two integer expressions are combined, the result is integer (any decimal places are
ignored), but if an expression contains a real then the result will be real. Therefore, if t1=5
and t2=3, then:
t1 + 4 = 9
t1 + 4.0 = 9.0
t1 * t2 = 15
t1 / t2 = 1
t1 * 1.0 = 5.0
t1 * 1.0 / t2 = 1.66
If you use parentheses in expressions which contain both integer and real variables, you
need to take extra care to ensure that your expression is producing the correct results.
Let’s look at an example to illustrate how an expression can look correct but can still
produce unexpected results.
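An expression of the following form (the values here are purely illustrative):
10.0 * 20 / 70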
yields a result of 2.8 (i.e., 200.0/70). The final value will be 2.8 if the result is saved in a
real variable, or 2 if it is saved in an integer variable.
If we use parentheses:
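10.0 * (20 / 70)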
the result is 0.0 (or 0 if saved in an integer variable). The reason for this is as follows.
Because Quantum evaluates expressions in parentheses before it deals with the rest of the
expression, it treats that expression as integer arithmetic. The rules for integer arithmetic
dictate that real results are truncated at the decimal point, so the true result of 0.28
becomes 0. Any multiplication involving zero is always zero, so the final result is zero.
If you find that a run gives unexpected zero results, try looking for expressions of this
type and checking whether the parenthesized part of the expression has been truncated
because the integer division results in a decimal number.
Quick Reference
To count the number of codes in a column or list of columns, type:
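numb(cn1[’codes’],cn2[’codes’],...cnn[’codes’])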
If any columns are followed by a code reference, only those codes will be counted for
those columns.
The function numb is an arithmetic expression which counts the number of codes in a
column or list of columns. Its format is:
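numb(cn1,cn2,...cnn)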
where cn1 to cnn are the columns whose codes are to be counted. So, if we wanted to
count the number of codes in columns 132 to 135 we would type:
numb(c132,c133,c134,c135)
Notice that even though the columns are consecutive, each one is entered separately, with
each column number preceded by a ‘c’. It is incorrect to define only the start and end
columns of a field when using numb. Therefore it is wrong to write numb(c(132,135)) and, if you write statements such as this, Quantum will flag them as errors.
Sometimes you will only be interested in certain codes, for instance you may want to
know how many 1, 2 or 3 codes there are in a group of columns. In this case the function
is entered as:
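numb(cn1’p1’,cn2’p2’,...cnn’pn’)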
where p1 to pn are the codes to be counted. Only the named codes are counted – any
others appearing in the columns are ignored. Let’s say that in our data on card 1, column 115 is multicoded with four codes, column 121 is multicoded ’2/6’ and column 157 is multicoded ’1/7’,
and we want to count the number of codes in column 115 and also the number of codes
in the range ’5/8’ in columns 121 and 157. The expression would be entered as:
numb(c115,c121’5/8’,c157’5/8’)
When Quantum checks these columns and codes, it will tell us that there are 9 codes in
these columns which are within the given ranges. These codes are all four codes in
column 115 (we did not specify which codes to count in that column), codes 5 and 6 in
column 121 (codes 2 to 4 are outside the given range), and codes 5 to 7 in column 157
(codes 1 to 4 are outside the given range).
Quick Reference
To generate a random number in the range 1 to n, type:
random(n)
Quantum can generate random numbers automatically with the random function:
random(n)
where n is the maximum value the random number may take. So, to generate a random
number in the range 1 to 100, the expression would read:
random(100)
The number produced may be saved for later use in an integer variable or column, thus:
rnum=random(32)
c(110,112)=random(156)
When using random with columns, always make sure that the number of columns
allocated to the number is sufficient to store the highest possible number that can be
generated. In our example, we need three columns in order to store numbers up to 156.
✎ random generates a different random value each time it is run, even on reruns of the
same job. If you want to retain the same set of random values between runs, copy
them into the data the first time you run the job.
Logical expressions are used for comparing values, codes and variables.
Comparing values
Quick Reference
To compare the values of two arithmetic expressions, type:
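expression log_operator expression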
where log_operator is one of the operators .eq., .gt., .ge., .lt., .le. or .ne.
Values are compared when you need to check whether an expression has a given value –
for example, did the respondent buy more than 10 pints of milk?
Values are compared by placing arithmetic expressions on either side of one of the
following operators:
.eq. equal to
.gt. greater than
.ge. greater than or equal to
.lt. less than
.le. less than or equal to
.ne. not equal to / unequal to
If the number of pints of milk that the respondent bought is stored in columns 114 and
115, the expression to check whether he bought more than ten pints would be:
c(114,115) .gt. 10
If the number in these columns is greater than ten the expression is true, otherwise it is
false.
In chapter 4 we said that integer variables may take numeric values or the logical values
true and false depending upon whether or not the value is zero. To check whether the
respondent bought any packets of frozen vegetables, we can either write:
fveg .gt. 0
to check the numeric value of the variable fveg, or we can simply say:
fveg
to check whether the logical value of fveg is true. To check whether fveg is false (i.e.
zero), we would write:
.not. fveg
☞ For further information about .not., see the section entitled "Combining logical
expressions" later in this chapter.
In virtually every Quantum run you will want to check which codes occur in which
columns. This is easily done using logical expressions. There are several forms of
expression depending on whether you are checking a column or a field of columns.
Data variables
Quick Reference
To test whether a data variable contains at least one of a list of codes, type:
var_name’codes’
To test whether a data variable contains none of the listed codes, type:
var_namen’codes’
To test whether a data variable contains exactly the given codes and nothing else, type:
var_name = ’codes’
To test whether two data variables contain identical codes, type:
var_name1 = var_name2
To test whether a data variable contains codes other than those listed, type:
var_nameu’codes’
To test whether two data variables do not contain identical codes, type:
var_name1uvar_name2
To check whether a column or data variable contains certain codes, place the codes,
enclosed in single quotes, immediately after the name of the column or data variable:
The expression:
Cn’p’
checks whether a column (n) contains a certain code or codes (p). The expression is true
as long as column n contains at least one of the given codes. It does not matter if there
are other codes present since these are ignored.
For example, to check whether column 6 contains any of the codes 1 through 4 we would
type:
c6’1/4’
The expression is true whenever column 6 contains at least one of the codes 1 through 4. If our data is:
----+----1
     5
     7
     9
     -
the expression is false because column 6 does not contain any of those codes.
In our original example we chose the codes 1 through 4. You can, of course, use any
codes you like and they may be entered in any order.
The next expression is:
cnN’p’
which checks that a column does not contain the given code or codes. The expression is
true as long as the column does not contain any of the listed codes. For example:
c478n’5/7&’
is true as long as column 478 does not contain a 5, 6, 7 or & or any combination of them.
A multicode of ’189’ returns the logical value true, because it does not contain any of the
codes ’5/7&’ whereas a multicode of ’1589’ makes the expression false because it
contains a ’5’.
The ’=’ operator is used to check that the contents of a column are identical to the given
codes. The expression:
c312=’1/46’
is true as long as c312 contains all of the codes 1 through 4 and 6, and nothing else. The
expression:
c142=’ ’
checks that column 142 is blank. The equals sign is optional when checking for blanks,
so we could simply write:
c142’ ’
The ’=’ operator may also be used to compare the contents of two data variables. For
example:
c56=c79
checks whether c56 contains exactly the same codes as c79. If so, the expression is true,
otherwise it is false. If we have
yields the value false because column 79 contains a ’9’ when column 56 does not.
If you have defined your own data variables, you could write a statement of the form:
brand1=c79
to check whether the data variable called brand1 contains the same codes as c79.
The final expression of this type uses the u (unequals) operator:
cnU’p’
This checks whether column n contains something other than just the code ’p’. Suppose
we have two sets of data:
----+-----5      ----+-----5
   1                1
   4                5
   7                9
and we write:
c44u’7’
The expression is true for both sets of data. In the first example, the ’7’ is multicoded with
a ’1’ and a ’4’, while in the second example, column 44 does not contain a ’7’ at all. The
only time this expression is false is when column 44 contains a ’7’ and nothing else.
Quick Reference
To test whether a field contains a given list of codes, type:
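var_name(start,end)=$codes$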
To test whether the codes in one field differ from a given string, type:
var_name(start, end)u$codes$
To test whether the codes in one field differ from those in another, type:
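var_name1(m,n)uvar_name2(m1,n1)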
The contents of data fields must be enclosed in dollar signs with each code in the string
referring to a separate column in the field. For instance, to check whether columns 47 to
50 contain the codes –, 6, 4 and 9 respectively we would type:
c(47,50)=$–649$
If columns 47 to 50 contain:
+----5-----+
  -649
the expression is true. With data such as:
+----5-----+
  -529
  164&
the expression is false because columns 47 to 50 do not contain exactly the codes –, 6, 4 and 9.
All our examples have used columns, but the same rules apply to data variables that you
define yourself. For example:
rating(1,4)=$1234$
checks whether the field rating1 to rating4 contains the codes 1, 2, 3 and 4 in that order.
That is, it checks whether rating1 contains a 1, whether rating2 contains a 2, and so on.
When checking the contents of fields in this way, make sure that you enter as many
columns as there are codes in the string (i.e. five codes require five columns). The
exception to this rule occurs when you are checking for blanks when the expression may
be shortened to:
c(50,80)=$ $
This type of statement may also be used to compare two fields, to check whether the
second field contains exactly the same codes as the first field. When you compare one
field with another, Quantum takes each column in the first field in turn and looks to see
whether the corresponding column in the second field contains exactly the same codes.
For example, if the first column of the first field contains a code 1 and a code 2 and
nothing else, then Quantum will check whether the first column of the second field also
contains a code 1 and a code 2 and nothing else. If all columns of the second field are
identical to their counterparts in the first field, then the expression is true; otherwise it is
false. Here is an example:
c(129,132)=c(356,359)
For this expression to be true, column 129 must contain exactly the same codes as column
356, column 130 must be exactly the same as column 357, and so on. Once again, the two
expressions on either side of the equals sign must be the same length.
✎ Comparisons of one data variable against another are concerned with columns and
codes: they are not concerned with the arithmetic values of the codes in the fields as
a whole.
If we have:
----+----3----+----
   02         2
the expression:
c(24,25)=c(34,35)
is false because the string $02$ is not the same as the string $ 2$. If you want to
compare fields arithmetically (i.e., is 02 the same as 2) then you will need to use the
.eq. operator:
c(24,25).eq.c(34,35)
to test whether the value in c(34,35) was equal to the value in c(24,25).
To check whether the codes in one field do not match a given string or the codes in
another field, we can use the u (unequals) operator:
If codes in the field c(m,n) do not match the given string or the codes in c(m1,n1) then
the expression is true. If the two fields are identical, then the expression is false.
✎ The comparison is of codes in columns, where the columns are compared on a one
to one basis. It is not a comparison of a field with a numeric value, or of the numeric
values in two fields. Numeric comparisons for inequality are written with the .ne.
operator.
☞ Numeric comparisons are described in the section entitled "Comparing values".
For example, the expression:
c(67,69)u$123$
is false only when columns 67 to 69 contain exactly the codes 1, 2 and 3 respectively, as here:
+----7-----+
  123
With any other data the expression is true.
The expression:
c(67,69)uc(77,79)
is true as long as columns 67 to 69 differ by at least one code from columns 77 to 79. If
our data is:
+----7----+----8
  123       256
the expression is true because each of columns 77 to 79 differ from columns 67 to 69.
Also, if we have:
+----7----+----8
  123       123
            5
the expression is true because column 77 is multicoded ’15’. The only time the
expression is false is when columns 67 to 69 are identical to columns 77 to 79.
Quick Reference
To test whether a value in a field is within a specified range, type:
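range(start,end,min,max)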
Blanks at the start of the field cause this statement to give a false result. To ignore leading
blanks, type:
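rangeb(start,end,min,max)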
The logical expression range checks whether the number in a field of columns is within
a given range. If so, the expression is true, otherwise it is false. The format of this
statement is:
range(start,end,min,max)
where start and end are column numbers and min and max are the range delimiters. For
example, the statement:
range(137,139,100,150)
will return the value true if the number in columns 37 to 39 of card 1 is in the range 100
to 150.
✎ It is important to remember that this statement is designed for use with purely
numeric columns. Columns which contain blanks, multicodes or an ampersand (12
punch) automatically cause the statement to be false. The exception to this is a
multicode of a digit and a minus sign (11 code) which converts the whole field to a
negative number.
A variation of range is rangeb which allows columns to the left of the field to be blank
if the number is right-justified in the field. In all other respects it is exactly the same as
range. If our data is:
----+----2
123    6
the expression:
rangeb(17,18,1,10)
will be true because the string $ 6$ will be read as 6. With range the value would be false.
Similarly, the expression:
rangeb(15,18,2000,3000)
is true if the number in columns 15 to 18 is in the range 2000 to 3000, even if the field has leading blanks.
Quick Reference
To combine logical expressions, type:
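logical_expression.and.logical_expression
logical_expression.or.logical_expression
.not.logical_expression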
Two or more logical expressions may be combined into a single expression using the
operators:
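.and.    .or.    .not.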
The .and. operator requires that all the expressions preceding and following the .and. be
true for the whole expression to be true. Thus, the statement:
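int1.eq.9.and.c116’1’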
is true if the integer variable int1 has a value of 9 and column 116 contains a 1. If either
subexpression is false, the whole expression is false too.
By comparison, the .or. operator requires that one expression or the other, or both, be true
in order for the whole expression to be true.
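For example:
c(249,251)=$159$.or.numb(c132,c133,c134,c135).gt.4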
For this expression to be true, columns 249 to 251 must contain nothing but a ’1’, ’5’ and
’9’ respectively or the number of codes in columns 132 to 135 must be greater than 4. It
is also true if both expressions are true. However, if both are false, the overall result is
false.
Expressions are reversed (negated) simply by preceding them with the keyword .not..
Although it is not wrong to use it with a single variable, it is more generally used to
reverse an expression containing the keywords .and. and .or.. Thus, it is not wrong to
write .not.c15’1/5’ but it is much simpler to write this as c15n’1/5’.
✎ Take care when using .not. with the .eq. operator. Statements of the form:
.not. c(1,3) .eq. 100
are incorrect and will not work. They should be written as either:
(.not.(c(1,3).eq.100))
or
(c(1,3).ne.100)
Any of the operators .and., .or. and .not. may appear in a statement more than once, as
long as you use parentheses to define the order of evaluation. For example:
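(c15’1/47’.or.c16’3579’).and.c22’&’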
causes Quantum to check whether the .or. condition is true before dealing with the .and..
Suppose our data is:
----+----2----+
    13     &
    79
The first expression (c15’1/47’) is true because column 15 contains a 1 and a 7 and the
second expression (c16’3579’) is also true since the codes it contains are amongst those
listed as acceptable. Thus, the .or. condition is true. Column 22 contains an ampersand so
the last expression is also true, therefore the expression as a whole is true regardless.
If both expressions in the parentheses were false, the whole expression would be false.
When you use .not. with expressions in parentheses, be very careful that what you write
is what you mean. Let’s take the conditions male and married and forget about columns
and codes for the minute. The condition:
.not.(male.and.married)
refers to unmarried men and all women. This can also be written as:
.not.male.or..not.married
The first .not. collects all the women, the second collects everyone who is not married
(e.g. single, widowed etc), and together they collect everyone who is either female or
unmarried. We use .or. instead of .and. here because the latter will gather unmarried
women but will ignore the unmarried men and married women.
Reversing .or. expressions works in exactly the same way. The expression:
male.or.married
means anyone who is Male, or anyone who is Married, or anyone who is Male and
Married. The opposite of this is:
.not.(male.or.married)
which means anyone who is not Male, is not Married and is not both; that is, anyone who
is a woman and is unmarried. This can be written as:
.not.male.and..not.married
3----+----4----+----5----+----6----+
519 1
9 &
the expression is true because c(135,137) do not contain just the codes 5, 1 and 9 (c135
is multicoded), and c160 does not contain any of the codes 6 through 0. The expression
will only be false if:
a) column 135 contains a 5 only, column 136 contains a 1 only and column 137 contains
a 9 only, and
b) column 160 contains any of the codes 6 through 0, either singly or as a multicode.
We could therefore write the expression as:
Quick Reference
To compare the value of a variable or an arithmetic expression to a list of numbers, type:
variable-name.in.(list)
Ranges of numbers may be entered in the list as start:end. If the item is a reference to a
field containing blanks, enter the values as strings of codes enclosed in dollar signs.
From time to time you may need to check whether a variable or arithmetic expression has
one of a given list of values. For example, if the questionnaire codes brands of frozen
vegetables as 3-digit codes into columns 145 to 147 we might want to check that only
valid codes appeared in this field. This is achieved using the logical expression .in. as
follows:
variable-name.in.(list)
where variable-name is that of the variable to be checked and list is a list of permissible
values. The arithmetic expression is an expression consisting of data or integer variables,
arithmetic operators and integer values as described earlier in this chapter. If the variable
or arithmetic expression has one of the listed values, the expression is true, if not, it is
false.
The left-hand side of the expression may contain integer variables, columns or data
variables containing whole numbers, or expressions using these types of variables. If it is
a data variable, then the list may contain codes enclosed in dollar signs. Quantum will
then compare the codes in the data variable with the codes inside the dollar signs. We
could therefore check that the frozen vegetables have been coded correctly by keying in
a statement which says:
Quantum will flag any records in which c(145,147) does not contain exactly 205, 206,
207, 210, 215 or 220 (i.e. three single-coded columns) as incorrect.
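The statement itself is not reproduced in this extract. As a sketch of the kind of check being described, using only constructs covered in this chapter and with an invented message text, you could write:
if (.not.(c(145,147).in.($205$,$206$,$207$,$210$,$215$,$220$))) write $invalid brand code$
The original example flags incorrect records through Quantum’s error reporting rather than a write statement, so treat this purely as an illustration of .in. used with code strings.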
If the data variable contains a valid positive or negative whole number, then the list may
also contain such values. Ranges of values may be entered in the form min:max, where
min is the lowest acceptable value and max is the highest. Since the frozen vegetables
have numeric codes, we could write the expression as:
c(145,147).in.(205:207,210,215,220)
Any columns in the field which contain non-numeric data (e.g. multicodes) will be
flagged as incorrect, as will any which contain values which do not match the
specification.
Sometimes, though, the codes and numbers will not be interchangeable. If you have
2-digit codes in a 3-column field such as c(206,208), a check which lists the acceptable
values as codes in dollar signs is not equivalent to one which lists them as numbers
unless column 206 is always blank. If the 2-digit codes have been padded on the left with
zeroes instead of blanks (i.e., 010, 011) or if they all start in column 206 (i.e., $10 $,
$11 $), then the first expression will be false, even though the second one will still be
true.
☞ For a fuller explanation of the difference between codes and numbers, see the earlier
sections of this chapter.
Lists may contain up to 247 values or codes, which may be entered in any order. In our
examples, we have always entered them in ascending order, but this is not a requirement
of Quantum. You may enter codes in a list in any order you like. The exception is numeric
ranges which must be entered in the form lowest:highest.
Naming lists
Quick Reference
To assign a name to a list of values, type:
definelist name=(list)
where list is a comma-separated list of numbers, ranges or code strings enclosed in dollar
signs.
If you have a list that is used more than once you may give it a name and refer to it by
that name instead of typing in the complete list each time. To name a list, write:
definelist name=(list)
For example:
definelist fveg=(205:207,210,215,220)
To use a defined list, simply replace the list with its name wherever the list would otherwise appear in the expression.
Quick Reference
To speed up your Quantum program by converting expressions of the form
c(1,4)=$1234$ into C in a more efficient way, type:
inline n
where n is the maximum field width to be converted in this manner. This statement must
appear at the start of the edit.
If you have a large edit, you can speed up the time it takes to run by including the inline
statement in your edit. This instructs the Quantum compiler to convert expressions of the
form c(1,4)=$1234$ into statements in the C programming language differently from
the way it normally does. You need not worry about the details of these conversion
methods, apart from deciding whether or not to use them.
If you want to speed your program up, place a statement of the form:
inline n
at the beginning of the edit section, where n is the maximum field width to be converted
in the special way. For example:
inline 6
Here we are saying that fields of six columns or less should be converted in the special
way rather than in the normal way.
6 How Quantum reads data
In order for the answered questionnaire to be processed, the information contained on the
questionnaire must be read into the computer into a location where Quantum can access
it. This is done by reading the data into the data variable array called C which is supplied
automatically with every Quantum run. You may then access this data by addressing this
array.
Different types of records are read into the C Array in different ways.
Quantum deals with three types of record: ordinary, multicard and multicard with trailer
cards.
Ordinary records
These are strings of codes and numbers, one per respondent, up to a maximum of 32,767
characters per respondent.
Multicard records
When data originates from punched cards and each questionnaire requires more than 80
columns, the data is spread over several cards. So that all cards belonging to a particular
respondent may be easily identified, each questionnaire is assigned a serial number which
is entered as part of the data for each card. Within this, each card has a unique card type
or card number to distinguish it from others in the group. It is important that both the
serial number and card type be in the same relative positions on all cards in the file, since
this is the only way that Quantum can tell which data belongs to which respondent.
If the questionnaire serial number is in columns 1 to 4 of each card and the card type is
in column 5, and we are looking at questionnaire 1005, we will see that it has two cards
whose first five columns are 10051 and 10052 respectively. Quantum can deal with
records that contain up to 327 cards per respondent.
Occasionally you may have multicard records in which each ‘card’ is greater than 80
columns. The notes that follow refer to multicard records of up to 100 columns per card.
☞ For information on how Quantum deals with ‘cards’ of more than 100 columns, see
section 6.10.
Sometimes a record contains very repetitive data which is tabulated over and over again
in the same way. For instance, a shopping survey may ask the respondent a series of
identical questions for each store he visited. In this case, there may be a separate card for
each store.
Processing this type of data is often easier if we treat all cards containing the same
questions as if they were, in fact, one card with one card number. These cards are called
Trailer Cards.
Thus, if the respondent visited five stores, and the questions about these stores are coded
on a card 2, the record for that respondent would contain five cards of type 2. If
demographic details were stored on a card 1, the whole record would be 6 cards in all. In
Quantum, the demographic data would be described as the higher level and the stores as
the lower level.
Another example of data gathered at different levels might be a travel survey in which
respondents are asked about the places they visited and their method of travelling. The
highest level may be demographic information about the respondent, the second level
would be the various trips he made and the third level might be information about the
various modes of transport used. If we were to draw a chart of a record, it would look like
this:
Respondent
|
-----------------------------------------
| | |
Trip1 Trip2 Trip3
| | |
Tran1 Tran2 Tran3 Tran1 Tran2 Tran1 Tran2 Tran3
Here, we have three groups of data at level 2 and eight groups of data at level 3.
Data is read into the C Array automatically, one record at a time. The way data is read
depends upon the record structure. If a record contains carriage return characters
(CTRL-M), those characters are always ignored.
Ordinary records
Ordinary records are read into cell 1 onwards of the array. Therefore, for example, the
50th column is referenced as c50 and the 200th cell as c200.
Multicard records
Records are read into c101 to c200 for card 1, c201 to c300 for card 2, and so on. For
example, 80-column cards are read into c101 to c180 for card 1 and c201 to c280 for card
2. Columns 181-200, 281-300, etc remain blank. In this case, the C Array may be pictured
as ten rows of 100 cells each. Column 50 of card 1 is then accessed by referring to it as
c150, and column 67 of card 8 is referred to as c867.
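As an illustration of how these references appear in edit statements (this is a sketch, not an example from the manual; the column, codes and message text are invented), a check on column 50 of card 1 might read:
if (c150n’1/5’) write $column 50 of card 1 is miscoded$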
It is also possible to read cards into the array sequentially regardless of card type: the first
card goes in c(101,200), the second in c(201,300), the third in c(301,400), and so on.
Each time an ordinary record or set of cards comprising a multicard record is read in, that
data is processed first by the edit section and then by the tabulation section of your
program. The complete record is edited and tabulated in one go. The exception to this is
the trailer card record where processing can take place a number of times within each
record for each lower level.
To ensure that only the part of the edit section applying to a particular level is used, the
edit section is defined separately for each level. Similarly, the table instructions specify
the level at which the table should be incremented.
By using the Levels facility, the user need not know how Quantum deals with trailer card
data internally. However, there are occasions when it may be necessary to edit or tabulate
the data without using levels. To do this, it is necessary to know more about how trailer
cards are processed.
Quantum deals with trailer cards in a number of ‘reads’. Cards are read into the
appropriate rows of the C Array until:
a) a card is located with a card type matching that of the previous card (e.g., two
consecutive card 2’s), or
b) a card is read with a type lower than its predecessor and matching one of the card
types already read in during the current ‘read’ (e.g., a card 2, a card 3, and then
another card 2).
In order to produce useful tables, you will need to know which cards are currently in the
C Array.
Quantum has four reserved variables – thisread, allread, firstread and lastread – which it
uses to keep track of which cards it has read for each respondent.
thisread
The array called thisread is used to check which cards have been read in during the
current read. thisread1 will be true (or 1) if a card type 1 has just been read in; thisread2
will be true if a card 2 has just been read, and so on.
There are nine such variables (thisread1 to thisread9) available unless extra card types
have been specified using the max= option. In this case, these variables will be numbered
1 to max; if there are 13 cards, we will have thisread1 to thisread13.
☞ For further details on max=, see the section entitled "Highest card type number"
later in this chapter.
allread
allread notes which cards have been read in so far for this questionnaire. If cards 1, 2 and
3 have been read so far, allread1, allread2 and allread3 will all be true. Additionally,
each cell of allread will contain the number of cards of the given type read in – for
instance, if two cards of type 3 have been read, allread3 will be true and it will contain
the number 2.
As with thisread, there are nine allread variables available unless extra card types have
been specified with max=.
firstread and lastread
The variables firstread and lastread become true when the first and last cards in a record
have been read in.
You can use these variables in your program to associate specific parts of the edit or
tabulation section with specific types of data.
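For instance, statements along the following lines (an illustrative sketch rather than the manual’s original example; the message texts are invented) tie actions to particular reads:
if (thisread2) write $a card 2 was read on this pass$
if (lastread) write $this is the final read for this respondent$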
Let’s take an example and look at the contents of the C Array and the values of thisread,
allread, firstread and lastread. Suppose the record has five cards: 1, 2, 2, 2 and 3 of 80
columns each. The first ‘read’ places card 1 in c(101,180) and the first card 2 in
c(201,280). The second card 2 is not read into the array yet because it has the same card
type as the previous card. As this is the start of a new respondent, firstread is true (or 1),
and because cards 1 and 2 have been read, thisread1, thisread2, allread1 and allread2 are
also true.
The second ‘read’ deals only with the second card 2 since it is followed by another card
of the same type. thisread2 is true, as are allread1 and allread2. Also, allread2 contains
the value 2 because we have read in 2 card 2s so far. Note that thisread1 is now false (or
0) as no card 1 was read this time.
On the third and final ‘read’ the third card 2 is read into c(201,280) and card 3 is copied
into c(301,380). lastread is true because we have reached the end of the record, thisread2
and thisread3 are true because we have just read cards 2 and 3, and allread1, allread2 and
allread3 are true because this record contains cards 1, 2 and 3. allread2 now contains the
value 3 because there were 3 card 2s altogether.
The chart below summarizes the cards read and the variables which will be true after each
read.
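The chart itself is not reproduced in this extract; based on the walkthrough above it would summarize as follows:
Read   Cards read   Variables true
1      1, 2         firstread, thisread1, thisread2, allread1, allread2 (count 1)
2      2            thisread2, allread1, allread2 (count 2)
3      2, 3         lastread, thisread2, thisread3, allread1, allread2 (count 3), allread3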
If Quantum reads a record in which the repeated cards are out of sequence, it inserts
blanks cards of the appropriate types wherever necessary to force the cards into the
correct sequence. For example, if the record contains the cards 1, 2, 4, 3, 4, 4 in that order,
Quantum will generate a completely blank card 3 when it reads the first card 4. The
record is then processed as if it contained cards 1, 2, 3, 4, 3, 4, 4.
It is sometimes useful to know that in the case of multicard records the first card of the
next record is waiting in columns 1 to 100 of the array. Beware of overwriting these
columns.
In section 6.4 we discussed the reserved variables thisread, which keeps track of which
cards have been read in during the current read, and allread, which keeps track of all
cards read in for the current record. Other reserved variables associated with reading in
data include:
lastrec set to true when the last record in the file has been read or, in the case of
trailer cards, the last read of the last record has occurred.
If part of the C Array remains blank (e.g., when there are fewer than 100 columns in the record), these
spare columns can be used for data manipulation and storing additional information.
Remember, however, that it may be clearer to store this information in named variables
where the name gives some indication of the type of data stored.
In ordinary records you may use the space beyond the end of the record. If the record
length is 120 columns, you may use columns 121 to 1000.
✎ For ordinary records, only columns 1 to reclen are reset to blanks, where reclen is
the maximum record length as defined by the reclen= keyword on the struct
statement.
☞ See the section entitled "Record length" for further information in defining the
record length.
In multicard records you may not use c(1,100). However, you may use any columns
between the end of the card (reclen) and the end of that row of the C Array. For instance,
when reclen=80 you may use c(81,98), c(181,200), c(281,300) and so on. You may also
use full sets of columns in which there is no data: that is, if the record has only four cards
(1, 2, 3 and 4), then c(501,1000) are the spare columns you may use. Additionally, cells
c101 to c(100+reclen), c201 to c(200+reclen), and so on are reset to blanks before the next
record is read in.
Quick Reference
To describe the structure of the data, type:
struct; options
All programs dealing with multicard records must contain a struct statement unless the
data contains trailer cards which will be read and tabulated using the levels facility. In
this case you may choose between using a struct statement or using a levels file. If the
run has no struct statement and no levels file, Quantum assumes that the data contains
ordinary records to be read into c1 onwards of the C array.
☞ Levels and how to describe the levels data structure are discussed in chapter 28.
The struct statement is used to define the type of records, the location of the serial number
and card type in the record and the number of the highest card type if greater than 9. Its
format is:
struct;options
Record type
Quick Reference
To define the record type, type:
struct; read=n
where n is 0 for ordinary records, 2 to read multicard records in sections according to the
card type, or 3 to read multicard records all in one go.
Quantum recognizes two types of record: single card and multicard. The type of record
is defined by the keyword read= on the struct statement:
Ordinary Records
Ordinary records are defined using read=0. Each record is read into c1 onwards of
the array. Since it is the default, you need only use it when other options are
required; for example, when the records contain serial numbers and you wish to
have the serial number printed out as part of the record, or when you are working
with long records of more than 100 columns.
Multicard Records
Multicard records are identified by the keyword read=2. Each card in the record is
read into the row corresponding to the card type of that card – that is, card 1 in
c(101,200), card 2 in c(201,300), and so on.
Record length
Quick Reference
To define the record length of records greater than 100 columns, type:
struct; reclen=n
The keyword reclen=n defines the maximum number of characters to be read into the C
array, the number of cells to be reset to blanks and the number of cells to be written out
by the write statement.
With ordinary records reclen may take any value, but with multicard records the
maximum is reclen=1000. In both cases, the default is reclen=100. When data is being
read into the matrix, any record which is longer than reclen characters is truncated to that
length and a warning message is printed.
When ordinary records are written out with write or split, cells c1 to c(reclen) are copied,
with any trailing blanks being ignored. For instance, if we have:
struct;read=0;reclen=200
and the current record is only 157 characters long, the record written out will be 157
characters long. This length can be overridden by an option on a filedef statement.
When multicard records are written out, columns c101 to c(100+reclen), c201 to
c(200+reclen), and so on will be output. Thus, if we write:
struct;read=2;reclen=70
and we have 2 cards per record, Quantum will write out c(101,170) and c(201,270).
Finally, with ordinary records cells c1 to c(reclen) are reset to blanks between records,
but with multicard records cells c101 to c(100+reclen), c201 to c(200+reclen), and so on
are reset.
Quick Reference
To define the location of the serial number in each record, type:
struct; ser=c(m,n)
The keyword ser=c(m,n) defines the field of columns containing the respondent serial
number. For example, if the serial number is in columns 1 to 5 of an ordinary record we
would write:
struct;read=0;ser=c(1,5)
and if it is in columns 1 to 5 of each card of a multicard record we would write:
struct;read=2;ser=c(1,5)
Notice that even with multicard records we only give the actual column numbers
containing the serial number, rather than card type and column number as is usually the
case when identifying columns in such records. This is because the column numbers refer
to all cards in the data set rather than to a single card in the file.
Quick Reference
To define the location of the card type in the record, type:
struct; crd=cn
Defining the card type location is much the same as defining the position of the serial
number in the record. The keyword is crd=cn for a single digit card type or crd=c(m,n)
for a card type of more than one digit. Once again, m and n are column numbers only,
not card type and column number. For example:
struct;read=2;ser=c(1,4);crd=c5
tells us that we have a multicard record with serial numbers in columns 1 to 4 and the card
type in column 5 of each card. Each card will be read into the row corresponding to its
card number.
Quick Reference
To define cards which must be present in each record, type:
struct; req=card_numbers
Sometimes some cards will be optional and others mandatory. You may define those
cards which must appear in every record by using the keyword req= followed by the
numbers of the cards that each respondent must have. For example:
req=1,2
tells us that cards 1 and 2 must be present in each record for that record to be accepted.
Any other cards are optional. If a record is read without one of these cards, the error
message ‘Card Missing in Set’ and a note of the record’s position in the file are printed
and the record is ignored.
If you have ranges for required card types, you may type the numbers of the lowest and
highest cards separated by a slash (/) or a colon (:) rather than listing each card type
separately. For example, if cards 1 to 4 are all required, you may type:
req=1/4
Quick Reference
To define cards which may appear more than once in a record, type:
struct; rep=card_numbers
If the data contains trailer cards and the Levels facility is not used, you must list their card
types with the keyword rep=. For instance, if card 2 is a trailer card we would write
rep=2. Where there is more than one trailer card, each card type is listed separated by a
comma. If cards 2, 3 and 4 are all trailer cards we could write:
rep=2,3,4
If you have ranges of repeated card types, you may type the numbers of the lowest and
highest cards separated by a slash (/) or a colon (:) rather than listing each card type
separately.
If rep= is not used and a record is read with two or more cards of the same type, the last
card of that type will be accepted and the message ‘Identical duplicate’ or ‘Non-identical
duplicate’ and a note of the record’s position in the file will be printed. For example:
Record structure error: serial 026, card 234 in run, card 234 in dfile
card type 2 – non-identical duplicate
Because rep= refers to trailer cards only, it will be ignored if read=2 and crd= are not
both present on the struct statement.
Quick Reference
To define the highest card type in the record, if there are more than nine cards per record,
type:
struct; max=n
The only time you need to inform Quantum of the highest card type is when you have
records with more than nine cards. This is so that Quantum can allocate sufficient cells
in the C array to store the extra cards. The highest card type is defined with max=n, where
n is the number of the highest card type. Cells 1 to max*reclen are then cleared between
respondents. For example, to read a data set with 11 cards per respondent we might write:
struct;read=2;ser=c(1,4);crd=c5;req=1,2,3,4;max=11
If you forget max=, and a record is read with more than nine cards, the message ‘Too
many cards per record’ is printed and the record is rejected. On the other hand, if a card
is read with a card type higher than that defined with max=, the record is rejected with
the message ‘Card number out of range’.
✎ Since the maximum size of the C Array is 32,767 cells, the maximum value you can
set with max= is 32 cards.
Quick Reference
To define the location in the C array of cards with alphanumeric card types, type:
struct; order=card_types
where card_types is a list of card type numbers and letters in the order they are to appear
in the C array.
From time to time you may need to read in records with alphabetic as well as numeric
card types. This generally happens in a multicard data set containing more than nine cards
per record where only one column has been allocated to the card type.
Quantum can deal with this data but first you will have to say where in the C array the
alphabetic card types should go. This is done with the keyword:
order=n
where n is one or more of the codes ’1234567890-&’ or the letters A to Z (in upper or
lower case) not separated by spaces.
The card type bearing the first number in the list is read into c(101,200), the card bearing
the second code in the list is read into c(201,300) etc. For example, suppose each record
has ten cards – 1 to 9 and A – our struct statement might say:
struct;read=2;ser=c(1,4);crd=c5;max=10;order=123456789A
Data from card A would be read into cells 1001 to 1100 of the C array.
Quick Reference
To define the location of the merge sequence number in trailer cards, type:
struct;seq=cn
When trailer card data is merged during a run with the merge facility, you may wish
trailer cards to be merged in a specific order, according to a sequence number entered as
part of the data. The location of this sequence number can be defined with the keyword
seq=cn for a single column code or seq=c(m,n) for a multicolumn code. For more
information on merging data see the next section.
When we say that Quantum allows you to merge data files, we do not mean that Quantum
takes data from a number of files and merges it to create a new file. Rather, we mean that
data can be read from a series of files during a Quantum run. Of course, the merged data
can then be written out to a new file for future use.
Quantum provides two methods for merging data. The first is designed for studies where
you have different card types in different files; for example, cards 1 and 2 in the file data1
and card 3 in the file data2. In this case, merging is by serial number and, optionally, card
type and trailer card sequence number.
The second method is designed for situations where you want to merge a field of data
from an external file into records from the main data file. For example, you may have a
file of manufacturers’ codes which refer to a number of products. If each record in the
main data file contains the product the respondent preferred, you may wish to merge the
appropriate manufacturer’s code from the external file into the main data in the C array.
In this case, merging is based on finding matching keys in the main record and the records
in the external file.
Data for a study may be spread across a number of files. This is particularly useful with
large surveys because it means that you can put each card type in a different file and
simply merge in the cards required for the current batch of tables. For example, if we
require tables from cards 4 and 5, we need not even read in cards 1, 2, 3 and 6.
Data from up to 16 files may be merged; that is, the main data file and 15 others. It may
be merged on serial number and, within that, on card type. With trailer card data, you also
have the option of merging trailer cards according to a sequence number entered as part
of the data.
In order for the merge to be successful, all files must be sorted in ascending order with
the serial number, card type and sequence number in the same position. Quantum reads
the locations from the keywords ser=, crd= and seq= on the struct statement.
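For example, for a multicard file with the serial number in columns 1 to 4, the card type in column 5 and a trailer card sequence number in column 6, the struct statement could read as follows (this particular layout is illustrative only):
struct;read=2;ser=c(1,4);crd=c5;seq=c6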
To merge data files you must create a file called merges telling Quantum which items to
merge on, and which files to merge. The type of merge is represented by a number:
1 merge on serial number. Cards are read in from each data file according to their
serial number only – the card type and sequence number, if any, are ignored. You
might use this option when you have two files, dat01 containing cards of type 1 and
dat02 containing cards of type 2, and you want the files to be merged so that card
type 1 is read into the C-Array, followed by card type 2.
3 merge on serial number and card type (default). With this option, cards with the
same serial number read from different data files are merged to form a single record
by comparing the serial number and card type. Cards within a record are then sorted
sequentially from 1 so that each card is read into the appropriate cells of the
C-Array. For example, if dat01 contains cards 1 and 3, and dat02 contains cards of
type 2, the merge will produce records containing cards 1, 2 and 3 in that order.
5 merge on serial number, card type and sequence number. This is similar to merge
type 3, except that trailer cards are merged according to their sequence number. For
example, if dat01 contains cards 1 and 2, where card 2 is a trailer card with a
sequence number of 2, and dat02 contains cards 2 and 3, where card 2 is a trailer
cards with a sequence number of 1, the merged record will contain cards 1, 2/1, 2/2,
and 3, in that order.
This is the first item in the merges file, and is followed by the names of the files to be
merged with the main data file named in the Quantum command line. Items may be
entered on separate lines or all on the same line separated by semicolons. For example,
if we want to merge data in files dat02 and dat03 with data in the main file, dat01, by
serial number, card type and sequence number, the merges file would look like this:
5; dat02; dat03
Notice that we have not mentioned dat01 in the merges file because it will be named on
the Quantum command line instead.
✎ This facility is not designed to work with merge files that contain *include or
#include statements to read additional data files into the current data file. All merge
files must be named in the merges file, which accepts pathnames if the data files are
not in the project directory.
Quick Reference
To merge extra data from an external data file into the data currently in the C array, type:
int_variable = mergedata($ex_file$,key_field,key_start,copy_to,data_start)
where
key_field is the location of the key in the main data file, entered using the standard
Quantum notation for columns and fields
key_start is the start column of the key in the external data file.
copy_to is the field in the main data record in which to place the external data. The
field is defined using the standard Quantum notation for columns and fields.
The mergedata statement merges a field of data from an external file with the main data
at the datapass stage of the Quantum run. Merging is by means of a data key present in
both the main records and the records in the external file. If a record in the external file
has a key which matches that of a record in the main data file, the external data will be
merged into a user-defined field of the main record when it is read into the C array.
In order for data to be merged correctly, both the main data file and the external file must
be sorted in ascending order by key value. If the key is the record serial number then the
data file will already be sorted in the correct order (assuming, of course, that the data is
sorted by serial number). If you are using a key that is not the record serial number you
must sort the data file so that it is ordered by key rather than by serial number.
The format of the statement is:
int_variable = mergedata($ex_file$,key_field,key_start,copy_to,data_start)
where
int_variable
is the name of an integer variable in which the function can place its return
value.
ex_file is the name of the file containing the extra data. It must be enclosed in dollar
signs.
key_field is the location of the key in the main data file, entered using the standard
Quantum notation for columns and fields.
key_start is the start column of the key in the external data file, for example, 1 if the
key starts in column 1. The length of the key is taken from the length of
key_field.
copy_to is the field in the main data record in which to place the external data. The
field is defined using the standard Quantum notation for columns and fields.
data_start is the start column of the data to be copied. Quantum copies as many
columns as are defined by copy_to.
For example:
t1 = mergedata($manuf_codes$,c(178,180),15,c(168,175),1)
tells Quantum to compare the key in columns 178 to 180 of the main record with the key
which starts in column 15 of the external records in the file manuf_codes.
Because the key field in the main record is 3 columns long, Quantum reads columns 15
to 17 of each external record to obtain its key. If the keys match, Quantum copies the data
from the external record into columns 168 to 175 of the main record in the C array. The
external data to be copied starts in column 1 and, since the destination field is 8 columns
long, Quantum copies 8 columns starting at that column.
This statement returns a value of 1 if a match was found (i.e., merging took place), or 0
if not.
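You can test this return value in the edit. For example (a sketch based on the statement above, with an invented message text):
t1 = mergedata($manuf_codes$,c(178,180),15,c(168,175),1)
if (t1.eq.0) write $no matching manufacturer code found$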
There is no limit on the number of mergedata statements in a specification, but you may
only merge data from up to nine different files per record.
Occasionally you may have multicard records in which each card contains more than 100
columns. To process this data, Quantum extends the width of the C Array to 10 rows of
1,000 cells each – that is, 10,000 cells in all – when a struct statement with reclen>100
is present. Data is read into c(1001,2000) for card 1, c(2001,3000) for card 2, and so on.
All other points mentioned previously for multicard records apply, but column numbers
refer to the extended rather than the default C Array. For example, in the default C Array
c(1,100) stores the first card of the next record, whereas in the extended C Array this data
is stored in c(1,1000).
If your run contains a mergedata statement and either the main data file or the file of
supplementary data for merging has records with duplicate keys or records that are out of
sequence, Quantum reports an error. In some cases the run is also cancelled after all data has been read, when a
complete error report is available. The table below lists the situations when duplicate or
out of sequence data may occur and shows what happens to your job:
Occasionally you may have to process data which does not come in the standard formats
described in this chapter. For instance, records may be strung out one after the other
without being separated by a new-line character. Quantum provides limited facilities for
reading non-standard data.
☞ See the section entitled "Reading non-standard data files" in chapter 26 for further
details.
There are three ways of writing out your data once it has been read into the C-Array: you
may write it to a print file, copy it to a data file, or write it to a report file.
Data and print files are both accessed by the write statement, but the exact format of the
statement varies according to the type of file and the information being written. Report
files are written to with the report statement.
Print files are printouts of records or parts of records with headings, descriptive texts and
page numbers. They cannot be used as data for subsequent Quantum runs.
Quick Reference
To write a record or part of a record to a print file, type:
The word write by itself prints out a whole record in the form it is when the write
statement is executed, together with a ruler showing which codes fall in which columns,
the line number of the record in the data file and the message ‘write’ indicating that the
record was generated by a write statement. Any multicodes in the record are shown as
asterisks, but you may change this with an option on the filedef statement.
If the record contains more than one card, each card is listed separately beneath the ruler.
For example, the statement:
write
might produce output like the following:
1 in file
----+----1----+----2-- ... --9----+----0
column 1 - 100 are |12345
write
2 in file
----+----1----+----2-- ... --9----+----0
column 1 - 100 are |23456
write
Each write statement will produce a line in the default print file, out2, telling you how
many records were written out, as follows:
2 (1%) write
Which cards are printed from multi-card records depends upon which cards have been
read in so far. Quantum looks at the ‘allread’ variables and writes out cards for those
which are true; so for example, if allread1, allread2 and allread3 are true, cards 1, 2 and
3 will be printed. If you have changed the contents of these variables prior to printing out
the record, you will see the cards for which allread is true rather than those which were
originally read.
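For example, to print only cards 1 and 2 of a record that also contains a card 3, you could reset the corresponding variable before the write (a sketch based on the behavior just described):
allread3=0
write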
The example above was very simple; more often than not your program will contain
several write statements and you will want some way of identifying which records were
printed by which statement and why. If the write is dependent upon some other statement
– for instance, it is part of an if statement – the whole statement is printed underneath each
record, thus:
67 in file
----+----1----+----2-- ... --9----+----0
column 1 - 100 are |0015263-16*735 *837361 ... 79&
if (c14n’1/4’) write
Here, as you can see, we are checking that column 14 contains a 1/4. This record has been
printed out because it contains a ’5’ instead.
Sometimes it is more helpful to have an explanatory text printed instead of the statement
itself. In this case all that is necessary is to follow the word write with the text to be
printed enclosed in dollar signs:
Record 17 51 in file
----+----1----+----2-- ... --9----+----0
column 101 - 200 are |00170116548986131*46*1 ...
column 201 - 300 are |0017026464515 875 ** ...
column 301 - 400 are |0017031929-5897231 ...
C308 incorrect
too many choices
Record 32 94 in file
----+----1----+----2-- ... --9----+----0
column 101 - 200 are |003201837021 **53798 ...
column 201 - 300 are |0032021353452 763736 ...
column 301 - 400 are |003203212 & ...
too many choices
Our first statement writes out all records in which column 308 does not contain any of
the codes 1/5, and the second picks up all records having more than 3 codes in columns
117 to 119.
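The statements themselves are not reproduced in this extract. The first of them can be sketched with constructs shown earlier in this chapter (the exact wording of the original may differ):
if (c308n’1/5’) write $C308 incorrect$
The second would use the facility for counting the number of codes in a field, which is described with the other expressions in chapter 5.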
Normally all output from write goes to the default print file, and whenever the current
record is written to this file, the variable printed_ becomes true. You may change the
output file by following the word write with the name of the file to write to. For example:
All files named on write statements must be defined on a filedef statement before they are
used.
If two or more write statements apply to a single record, the record is printed out once in
the state it was when the first applicable write was read, with all relevant write statements
or texts listed below it. If a record satisfies two or more write statements which write to
different files, Quantum will write the record out once for each statement, in the state it
is when each write is executed.
✎ If you want to write out more than one field at a time, or to print more than one text,
you can define those fields and/or texts on an ident statement. All write statements
from that point on will then print those fields and texts.
☞ To find out more about ident, read section 7.5.
Often you will not want to write out the whole record, especially if it contains several
cards. Therefore Quantum allows you to include a field specification in a write statement
to print only selected portions of an incorrect record. For example:
if (c110’2’.and.c119’2’) write c(110,120) $Married woman$
checks that columns 110 and 119 both contain a 2, and if so prints out columns 110 to
120 in the print file, followed by the text Married woman. If you are writing out less than
ten columns, Quantum does not print a ruler above the codes.
If you are dealing with multi-card records, you may prefer to use this form of write to
have only the card containing the error printed, rather than all cards in the record. If we
take our previous example where we were checking the contents of column 308:
Quick Reference
To write records or fields to a data file, type:
write filename
write may also be used to copy records to a data file. This is useful if you want to separate
a particular card type from the rest of the data, or if you want to correct errors and save
the corrected data in a new file for later tabulation.
The format of the statement is:
write filename
If you use write in a levels job to write data to a new data file, the statement
write datafile at any level will write out data for that level only. Additionally, if the
write statement is inside an if clause, or a return statement is encountered, then only
relevant data is written for that level. To write out data for all levels, you will need one
write statement per level.
In all cases, records are written in the state they are when the write is executed, and all
cards read in with the current read are copied; that is, all cards for which thisread is true.
For instance, if thisread1, thisread2 and thisread3 are true, Quantum will write out cards
1, 2 and 3. To prevent any of these cards being written, you may set the appropriate
variable to false (zero); therefore to print only card 1 of our three cards, we would write:
thisread2=0; thisread3=0
write newdat
Any number of writes to data files are allowed in the edit, and each one may write to a
different file.
Records written by write are normally as long as the record length defined with reclen on
the struct statement. You may change this with len= on the filedef statement. The
exception is where records end with blank columns. In this case Quantum ignores the
blank columns. If you want to create a data file of fixed length records, and your data is
single coded, you can use the reportn statement. If your data is multicoded you can
convert it to single coded first by using the explode statement.
If your data is multicoded and you need to preserve the multicodes, the only way of
writing out fixed length records if the data currently has trailing blank columns is to insert
a dummy code in the last column of those records.
New cards can be created by copying information into spare columns of the C-Array. To
save these as part of a new data file you will have to give each new card the same
respondent serial number as the rest of the data in the array and a card type which may or
may not be unique. In the example below, we are moving some information from card 1
of a 2-card data set into a new card 3. The comments explain what each statement is
doing.
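The example itself is not reproduced in this extract. The sketch below shows the general idea; it assumes that columns and fields can be assigned to with =, and the column layout, card type and filename are invented for illustration:
/* copy the respondent serial number from card 1 into the new card 3
c(301,304)=c(101,104)
/* give the new card a card type of 3
c305=’3’
/* copy the information to be saved into the new card
c(306,315)=c(121,130)
/* mark the new card as read so that write will copy it out
thisread3=1
write newdat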
Quick Reference
To write information to a report file, type:
report filename parameters
Use reportn rather than just report to start a new line each time the statement is executed.
A report file is a special type of print file in which you can print out records, fields or
variables in the format of your choice. To write information in a report file, use the report
statement, as follows:
where filename is the name of the file to be written to, and parameters define exactly what
is to be written.
Lines in a report may be up to 1024 characters long. Report does not start a new line
automatically at the end of each statement, but you may tell it to do so by following the
keyword report with the letter n:
reportn filename parameters
In both cases, the named file must be identified as a report file using a filedef statement,
as mentioned below.
The parameter list defines what is to be printed in the report file. It may contain variables,
texts, and special characters representing tabs and spaces.
Data variables
Quick Reference
To print the contents of a data variable, type:
var_name or var_name(start,end)
var_name:field_width
To print the contents of every column in a field, even if they are multicoded or blank,
type:
start:field_width
where start is the first position in the field. You may also use this notation to print fields
whose contents evaluate to a value greater than the maximum integer value Quantum can
deal with.
All data variables that are single coded are printed using as many positions as there are
columns in the variable. For example, if the data is:
----+----4
511 538253
2
&
the statement:
prints the contents of columns 31, 35 and 40 one after the other, as follows:
153
The statement:
538253
In both examples the last column of the field contained a code. If the last column
or columns of a field are blank Quantum omits those columns when printing the contents
of the field. (You can get round this by entering the field specification as
start:field_width as described later in this section.)
A single data variable that is blank is printed as such, while a single data variable that is
multicoded is printed as an asterisk. The statement:
5 *1
prints a line containing the value 51538253. The value starts in the first print position
available.
✎ If the field you wish to print is very long, its contents may produce an incorrect value
when evaluated as an integer (the maximum integer value which Quantum can deal
with is 2,147,483,647). You can get round this by specifying the first column and
the field width as described below.
If you want to see all columns in a field which contains blanks or multicodes, or you need
to have the correct evaluation of a long field, you will need to deal with each column in
the field separately. You could type each column number separately, but it is quicker just
to specify the start column and the total number of columns you want to print starting at
that column. The format for this type of reference is:
start:field_width
The output from this command would be 51*538253, the same as if you had typed each
column number separately. As before, the data is printed starting in the first print position
available.
You can use this alternative notation with field specifications too. In this instance
Quantum will evaluate the contents of the field as an integer and will print the result
right-justified in a field of the given width. If you type, for example:
Quantum will print the value 51538253 in positions 3 to 10 of a ten-position field. The
first two positions will be blank.
This notation is also useful if you need to create data files with fixed length records, and
some records end with blank columns. Writing records to a data file preserves multicodes
but ignores trailing blank columns. Writing to a report file allows you to create a
single-coded data file with fixed length records. If your data is multicoded you will need
to convert it to single-coded form before writing it out. You can do this by ’exploding’
any multicodes into a field of single codes. You use the explode statement for this.
☞ For information on how to use explode, see the section entitled "Converting
multicoded data to single-coded data" in chapter 13.
Once your data is in single-coded form you can then write the whole record out to a report
file using a reportn statement as follows:
Integer variable
Quick Reference
To print the contents of an integer variable, type:
var_name[:field_width]
If the report statement names a variable by itself, Quantum prints the variable’s value
starting in the first print position available. If the specification includes a field width,
Quantum prints the variable’s value right-justified in a field of the given width. Any extra
columns on the left of the field width are shown as blanks.
var_name[:field_width]
If you type the variable name by itself, without a field width, Quantum prints it
left-justified starting in the first available position on the line. If you would prefer values
to be printed right-justified, follow the variable name with a colon and a field width.
Quantum will then print all values for that variable right-justified in a field of the width
you have given. For example:
codenums:5
prints the values of the variable called codenums right-justified in a field five positions
wide. Values that are shorter than five characters are padded on the left with blanks.
Real variables
Quick Reference
To print the value of a real variable, type:
var_name[:field_width.dec_places]
where field_width is the width of the field in which the values are to be printed (values
are right-justified and padded on the left with blanks if necessary) and dec_places is the
number of decimal places to be shown for each number. If you omit these parameters,
Quantum prints the values starting in the first available print position and with six
decimal places.
var_name[:field_width.dec_places]
If you type the variable name by itself, without a field width and a number of decimal
places, Quantum will print the variable’s value with six decimal places and starting in the
first print position available.
You can control the layout by defining a field width and the number of decimal places
required. For example, by typing:
var_name:6.2
you can create a neat column of figures all with two decimal places and all right-justified
in a field six characters wide.
Quick Reference
To print text, type:
$text$
[number]x
print_post
Most reports require some sort of text or spacing on the line, either on the same line as
the values or on lines by themselves to create titles, column headings, and the like.
To include a text, enclose it in dollar signs:
$text$
To print spaces between the values on a line, you can either use spaces or tabs. To print
a given number of spaces between one value and the next, type:
[number]x
where number is the number of spaces required. The default is one space.
If you are producing tabular or columnar output you’ll probably find tabs are more useful
for creating blank space since they allow you to skip to a particular print position on the
line. For example, typing:
25t
takes you directly to position 25 on the line, regardless of the current print position.
Compare this with 25x which moves you 25 positions on from your current position.
Examples
The first example writes a line of text and the value of an integer variable to a file called summary. Printing starts in position (column) 20 because we start our
parameter list with the keyword 20t. The variable brda is an integer variable whose value
is to be right-justified in a field three columns wide. Notice also how we have inserted
spaces between the texts and the value of brda.
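The statement being described is not reproduced in this extract; one of the same general shape (the file name comes from the text above, the wording of the texts is invented) would be:
reportn summary 20t,$Brand rating: $,brda:3,$ points$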
The statements:
/* only print title if this is the first record in the data file
if (.not. rchk)
+reportn yogurt 30t,$Serial Numbers for Yogurt Buyers$
if (c119’1’) reportn yogurt c(1,4)
.
rchk = 1
produce a report showing the serial numbers of all respondents who buy yogurt. As you
can see, we have given our report a title.
As a final example, let’s look at the difference between printing a field of columns all in
one go and printing them one at a time. If our data is:
+----4----+
18 036
& /
7
the statement:
c(37,43) is 106
c(37,43) is 1* 0*6
✎ You cannot write information to the standard print file (usually called out2) using
report. To do this use the function qfprnt.
☞ To find out about qfprnt, read section 7.6.
Quick Reference
To define a report file, type:
where mpa, mpd and mpe indicate that multipunches should be printed across the page,
down the page, or as an asterisk and then listed below the record.
All files named on write and report statements must be defined by a filedef statement
before they are used. This tells Quantum whether the file is a report, print or data file, and
defines more specifically how the output should be written. So that you can be sure that
all filenames will be recognized, you are advised to place all filedef statements at the
beginning of the edit.
where filename is the name of the report file and report is a mandatory keyword
indicating that the file is a report file.
Quantum normally creates report files in the main project directory. If you want the
report file to be created in a different directory, follow the filename with =pathname. For
example, to declare a report file called repfile1 that is to be created in the directory
/home/ben, you would write:
where filename is the name of the output file and data is a mandatory keyword indicating
that the named file is a data file. As with report files you may use the optional =pathname
parameter to name the directory in which the data file should be created.
All records written to data files are as long as the record length defined with reclen on the
struct statement. If you wish to change this, add the option len=n to the filedef
statement, thus:
This example says that records written to the data file newdat1 must be 80 columns long.
where filename is the name of the print file with an optional pathname, print is a
mandatory keyword indicating that the file is a printout file, and options is a list of
optional keywords defining more specifically how the records should be written.
Filename lengths are as described above for data files. Options are:
len=n length of output record if different from reclen= on the struct statement
$text/$ heading text to be printed at the top of each page
mpa prints the codes in a multicode across the page enclosed in curly brackets. For
example
000401 635495{134}45111
Here, we have a multicode of ’134’. The ruler is of little use when multicodes
are printed in this manner, so you may prefer to suppress it with the option
norule.
mpd prints the codes in a multicode down the page, thus:
----+----1----+----2
000401 635495145111
3
4
mpe prints multicodes as an asterisk, but lists the individual codes within each
multicode beneath the record. For example:
----+----1----+----2
000401 635495*45111
Column 14 contains codes 134
norule turns off the ruler
noser prevents the messages ‘Record nnn’ and ‘n in File’ from being printed
The default output file is a print file called out2, and the default output style is as
described above. To change the output style for this (e.g., to suppress the ruler or print
multicodes in a different format), simply use a filedef statement naming this file and
giving the appropriate options from the list above:
Quick Reference
To define default print parameters for write statements, type:
ident item1,item2,...
Any number of texts, variable names and fields are allowed. Items are printed in the order
they are listed.
To turn off ident defaults and return to the standard write behavior, type:
noident
The ident statement gives you increased control over the content of the print file by
allowing you to print more than one field of columns and one text per write statement.
Each ident statement may contain any number of texts, variable names and columns as
long as each one is separated from the others by a comma. The order in which you define
items with this statement controls the order in which they will be printed. For example,
if you type:
and Quantum finds a record which fails this test, it will print the following:
Notice that the text defined with ident does not replace the text given with write. If you
do not define a message on the write statement, Quantum will print the complete
statement as it usually does.
In this example there is not much difference between using ident and writing the test as:
The real power comes when you want to write out more than one field and/or text per
write statement, or if you want to write out the values of data, integer or real variables.
For example, if you type:
t(1) is 10
t(2) is 15
t(3) is 20
in the print file (the values reported will, of course, be the values of the variables as they
are in your run).
You can combine texts, columns and variable names. The statements:
might print:
You could use this type of output for checking records which may be incorrectly coded
for use with field and bit statements.
When ident writes out data variables, it prints the data according to the specification on
the filedef statement for the file to which you are writing the data. If the filedef statement
includes the keyword norule to suppress the ruler, the data is written out without a ruler,
otherwise the ruler is always printed above the data, as in the previous example.
You can alter this behavior without having to respecify the filedef command by typing a
+ or − sign at the end of the ident keyword. If filedef normally requests a ruler, type:
ident-
to print the listed variables without a ruler. If filedef normally suppresses the ruler, type:
ident+
to print them with one.
To switch off ident and revert to the standard write behavior, type:
noident
Quick Reference
To write data to the standard print file (usually called out2) in a format of your choice,
type:
where format defines the format in which the data is to be written and the data types of
the variables used. variables is a comma-separated list of the variables to be written out.
Variables must be listed in the order they are used in the format statement.
The format string consists of optional text interspersed with references to variables in the
list.
write and report are both powerful statements for writing out data, but they do have
limitations which you may find restrictive in some circumstances. The write statement
lets you write data out to a print file, including the standard print file (usually called out2),
but it always writes the data in a fixed format that you cannot change. The report
statement lets you write out data and text in any format you like, but only to a report file.
You cannot write to a print file with report.
The qfprnt function brings together the functionality of write and report by writing text
and data to the standard print file in a format of your choice. To use it, type:
where format defines the format in which the data is to be written and the data types of
the variables used. variables is a comma-separated list of the variables to be written out.
Variables must be listed in the order they are used in the format statement.
If the respondent tested five products this statement will appear in the standard print file
as:
The underscore character in front of the 5 represents a space and appears as such in the
print file. We’ll explain why we have printed it here shortly. First, let’s look at the qfprnt
statement itself.
The format section of the statement consists of text to be printed exactly as it is written
and references to variables whose values are to be substituted in the text at the given points.
In this example we are writing out the value of the numeric (integer) variable t1. The
variable is named in the variable list section of the statement and is represented by the
characters %2i in the format section.
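As a sketch only, and assuming that the format text is enclosed in $ signs like other
Quantum strings, such a statement might look like this:
qfprnt $Respondent tested%2i products$,t1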
There are three parts to the variable’s reference. The % sign signals to Quantum that it
has reached a variable reference: all references start with a % sign. The i says that the
variable is an integer variable and the 2 says how many print positions to reserve for
printing this variable. In the example two positions are reserved for printing the value of
t1, but since the value of t1 is only 5, Quantum prints the value on the right of the reserved
space and fills the remaining positions with spaces. In the sample output we have used an
underscore to represent this space.
As before, the underscore represents a space used to pad a value to the full field width.
This qfprnt statement produces the correct results because the variables are in the same
order as their references in the format section. This is your responsibility. As long as a
variable has the same type as the reference in the corresponding position in the format
section, Quantum will print its value at that point in the statement. So, if we had written:
As you can see, Quantum does not increase the number of print positions to
accommodate the value it needs to print. Instead, it prints asterisks. In this example, the
asterisks would alert you to the fact that there is something wrong with the qfprnt
specification, but this would not always be so.
More often than not you’ll be printing positive values. If Quantum needs to print a
negative number, it prints the minus sign directly in front of the first digit, just as you
would write it manually.
Besides integer variables, you can also print real variables, columns or fields of columns
and blank strings. You use a reference similar to the one you’ve seen for integer
variables.
To print a real variable, put:
%num_pos.dec_plr
where num_pos is the number of print positions required and dec_pl is the number of
decimal places. As an example, the statement:
prints the value of the real variable called liters in a field 5 positions wide. The value is
printed with two decimal places so, allowing for the decimal point, the maximum value
that can be printed is 99.99:
Quantum can also print the text values of a column, a field of columns or a data variable.
By this we mean that Quantum converts multicodes to letters or other keyboard
characters before printing them. Multicodes that do not correspond to letters or characters
are printed as asterisks. For example, the multicode ’&1’ translates into the letter A and
would be printed as such; the multicode ’&123’ is simply a collection of codes and
would therefore be printed as an asterisk.
To print columns in this way, put:
%numberc
in the format section, where number is the number of print positions required, and the
name of a single column in the corresponding position in the variable list. Quantum will
then print number columns starting at the named column. For example:
might produce:
---+---2
9462&5736
5 1 8
9
To print a string of blank spaces, include the reference:
%numberb
where number is the number of blanks you want. You’ll find this useful if you want to
indent lines or print values in columns.
This chapter describes how to assign values to variables and the statements emit, delete
and priority, all of which may be used to alter the contents of a variable. Emit, delete and
priority are used only with columns whereas assignment statements can deal with
character, integer and real variables.
When we say that these statements change the contents of a column we mean that they
change the contents of that column as it exists during the run: at no time do they change
the corresponding column in the data file.
An assignment statement normally means ‘put the specified information into the given
variable overwriting anything already in that variable’. It can be used with any type of
variable to perform any of the following tasks:
• to replace certain codes in one column with those from a second column.
• to copy codes from groups of columns into another column using the logical
operators and, or and xor.
In spite of the diversity of these functions the basic format of any assignment statement
is:
variable=item
Note the use of a lower-case c: an upper-case C at the start of a line (column 1) marks the
line as a comment. Thus:
c(15,16)=$12$ is correct, but
C(15,16)=$12$ will be read as a comment even though the syntax is correct
Alternatively, you may precede assignment statements with the word set, thus:
set c(15,16)=$12$
Copying codes
Quick Reference
To copy codes into a single data variable, overwriting the variable’s original contents,
type:
variable=’codes’
var_name(start,end)=$codes$
variable1 = variable2
Assignment statements are most commonly used to copy codes into a column or to copy
the contents of one variable into another. For instance:
c121=’159’
c121=c134
In the first example we are copying the codes 1, 5 and 9 into column 121 overwriting
whatever is already there. The second example copies everything in column 134 into
column 121, again overwriting what was originally there. Column 134 remains
unchanged.
You can also copy strings of characters into fields of columns. Let’s say we want to copy
the code 59642 into columns 76 to 80 of card 3; we would write:
c(376,380)=$59642$
Notice that the characters to be copied into the array are enclosed in dollar signs as is the
rule when dealing with strings.
Quantum uses a semicolon to mark the end of a statement, and will issue an error message
if it finds a semicolon by itself in the middle of a string. To include a semicolon in a string,
precede it with a backslash (\;): the backslash tells Quantum to read the next character as
an ordinary character with no special meaning. For example:
c(376,380)=$59\;42$
When characters are being copied into columns, the equals sign may be omitted:
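c(376,380)$59642$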
Just as the contents of a single column may be copied into another, so may the contents
of one field be copied into another field. For example:
c(10,19)=c(70,79) or c(20,22)=c(45,47)
copies the contents of c(70,79) into c(10,19) and the contents of c(45,47) into c(20,22),
in both cases overwriting the original contents of those columns.
Data variables in assignment statements may be subscripted. The following are valid:
c(t1)=c145
c(178,180)=c(t4,t5)
c(t3,t4)=c(t10,t10+2)
When subscripting columns, remember that the current values of the integer variables
will be substituted in the expression before the statement itself is executed. If t3=120 and
t10=240, the statement:
c(t3,t3+2)=c(t10,t10+2)
means:
c(120,122)=c(240,242)
Generally you will know how many characters are required to hold the information they
will receive, but this is not always the case. What if the field on the left of the equals sign
is longer than the string to be copied into it? Quantum always copies the string starting
with its rightmost character, placing it in the rightmost column of the field. It
continues in this way until all characters have been copied, then if there are still columns
left in the field they are reset to blanks. When strings are copied in this way they are called
‘right-justified and blank-padded’. Let’s clarify this with a couple of examples. Suppose
we have:
and we enter:
c(241,245)=c(185,187)
If there are fewer characters than there are columns in the field, the characters are
right-justified in the field with the remaining columns set to blanks. If the reverse is true,
and there are more characters than there are columns in the field, the error message
‘Attempt to set too many columns into too few columns’ is issued.
c(145,150)=c(143,148)
copies the contents of columns 143 to 148 into columns 145 to 150, so:
----+----5 ----+----5
83645902 becomes 83836459
When a field is set to blanks it is never wrong to type in as many blanks (enclosed in
dollar signs) as there are columns in the field, but it is much quicker and more efficient to
type, say:
c(301,380)=$ $
Quick Reference
To replace a code or set of codes in one data variable with a code or set of codes in a
second data variable, type:
variable1’codes1’=variable2’codes2’
codes1 and codes2 must contain the same number of codes, and the codes must be in
superimposable order (e.g., ’123’ and ’456’, but not ’123’ and ’135’).
Assignment statements are also used to replace parts of one column with those of another,
leaving the remaining contents of that column intact. Note that this is the only time that
assignment does not overwrite everything in the recipient variable. Let’s start with a
simple example. Suppose we have:
and we want column 124 to contain a ’1’ only if column 159 contains a ’7’. We would
write:
c124’1’=c159’7’
However, if we wrote:
c124’3’=c159’3’
meaning that c124 should only contain a ’3’ if c159 contains a ’3’, Quantum would give
us:
As you can see, the ’3’ in c124 has been deleted because there is no ’3’ in c159. Both
examples could equally well be written using if, else, emit and delete, but an assignment
statement is much more efficient when you have a set of codes to check for.
☞ For further information on if and else, see section 9.1 and section 9.2.
For further information on emit and delete, see section 8.2 and section 8.3.
c10’123’=c11’456’
+----1----+
14
35
4
+----1----+
14
25
4
Column 10 contains a ’1’ and a ’2’ because c11 contains a ’4’ and a ’5’. The ’3’ that was
originally there has been removed because there was no ’6’ in c11. The ’4’ in column 10
remains untouched because it has no corresponding code in c11.
Partial assignment need not have different column numbers either side of the equals sign.
Quantum accepts statements of the form:
c127’0/3’ = c127’1/4’
which can be used for recoding incorrectly coded data. The example we have used will
recode a ’0’ in column 127 as a ’1’, a ’1’ in column 127 as a ’2’, and so on.
When entering codes with this type of statement, make sure there are the same number
of codes on either side of the equals sign and that they are in the same relative positions
in the order &–0123456789. In the previous example we used ’123’ and ’456’. We could
also have used ’&–1’, ’789’ or ’234’ instead of ’456’, to name but a few alternatives.
The important thing is that the two groups follow the same pattern: if the first set names
alternate codes (e.g., ’1357’) then so must the second (e.g., ’&024’). Here are some more
examples:
c21’&–0’=c92’456’
c21’05’=c86’49’
c56’ 0’=c91’15’
c78’123’=c81’367’
The statement for columns 56 and 91 is incorrect because blank is not a valid code here;
the statement using columns 78 and 81 is wrong because the codes ’367’ cannot be
superimposed on ’123’ (either 345 or 567 would be correct).
Quick Reference
To store the value of an arithmetic expression in a variable, type:
variable = expression
In many of your Quantum programs you will need to save the result of some arithmetic
expression in a variable. The variable may be a column or an integer or real variable and
the arithmetic information may be the contents of a column, integer or real variable, an
integer or real number, or the results of the functions numb or random. It can also include
arithmetic expressions which have been manipulated using the arithmetic operators +, −,
/ and *. Here are some examples to start with:
var1=100
/* Next statement expects that variable ntim is < 10
c135=ntim
/* In next example, if c31’5678’, variable np=4
np=numb(c31)
/* Increment rect (record total) by 1 for each record processed
rect=rect+1
Copying a number into an integer or real variable is easy because the variable has no
predetermined size – that is, Quantum does not say that such variables may only store
numbers of up to, say, three digits. Integer variables can store any whole number in the
range −2,147,483,648 to +2,147,483,647 and real variables may take values of any
magnitude with six digits accuracy.
Suppose our questionnaire tells us how many pints of milk a respondent bought and we
want to save this in an integer variable called npt. Here’s what we might write:
npt=c(125,126)
Similarly, if we know how many miles the respondent travels to work each day, and we
want to convert this to kilometers, we could save the conversion in a real variable called
km:
km=c(213,214) * 1.609
If the respondent travels 5 miles, km will have the value 8.045, but if he travels 9 miles
km would be 14.481.
The main difference between the two examples is the type of variable in which the results
are saved. The number of pints bought will always be a whole number so we save it in an
integer variable, whereas the conversion from miles to kilometers is likely to produce a
real number so we save it in a real variable.
When copying a real value into an integer variable or vice versa, remember that the
accuracy of the result depends upon the type of variable in which the value is saved. Real
values saved in integer variables are truncated at the decimal point (the fractional part is lost), thus:
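/* A sketch with values chosen for illustration: if x1=3.9, then
t1=x1
/* leaves t1 holding the value 3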
but integer values placed in a real variable are saved as reals with decimal places and
accuracy to 6 significant figures:
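/* Again with illustrative values, if t1=3, then
x1=t1
/* leaves x1 holding the value 3.0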
Integer variables are often used to count the number of respondents having a specific
characteristic. For instance, to count the number of respondents holidaying at home and
the number taking holidays abroad we can say,
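if (c113’1’) home=home+1
if (c113’2’) abroad=abroad+1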
☞ This example uses the if statement which is described in greater detail in chapter 9.
Whenever a record is read with c113’1’, the variable home will be incremented by one
and whenever one is read with c113’2’ the variable abroad will be increased by 1.
Let’s say we have five respondents who took the following holidays:
At the start of the run, the variables home and abroad are both zero. After these records
have been processed, home will equal 3 and abroad will be 2. The person unlucky enough
to have no holiday at all will be ignored.
In the example above we were accumulating information about holiday habits for all
respondents together, but on many occasions you will want to store information on a per
respondent basis instead. Normally, integer and real variables are NOT reset between
respondents, but all you need do to overcome this is to enter a statement at the start of
your edit to reset the variable in question to zero each time a new record is read. For
instance:
home=0
☞ We will discuss in more detail the times when you might want to do this when we
describe the do statement in section 9.5.
Columns which contain single codes may be treated as a whole number. For instance, if
our data is:
+----2----+
4922
the statement:
value=c(219,222)
will assign the value 4922 to value. If any of the columns are blank or multicoded in any
way, they are ignored, so:
+----2----+ +----2----+
49 2 and 4912
2
would both assign the value 492 to value.
Columns
Columns may also store arithmetic information, but unlike other variables they have a
predefined size which means they can only store numbers of a certain size. For instance,
c(1,10) can store numbers of up to ten digits whereas c(1,3) only stores numbers of up to
three digits.
If the number is negative Quantum places the minus sign in the column immediately to
the left of the first digit, but if there are no spare columns the first digit will be dropped
and the minus sign placed in the left-hand column. If t5=−278, the statement:
but:
Note that this does not hold true for negative numbers whose length exceeds the field
width by more than one character. Then, the number is copied into the field from the right
and the minus sign and any excess digits are ignored. Thus, if t5=−1278, c(42,44) will
contain the number 278.
If the value to be saved has fewer digits than there are columns in the field, it will be
right-justified in the field and the remaining columns padded with zeros. Here are some
more examples:
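For instance, assuming t1 holds the value 42 (a value chosen purely for illustration), the
statement:
c(50,53)=t1
puts 0042 into columns 50 to 53.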
When copying real numbers into columns, Quantum needs to know how many decimal
places are required. This is done by following the variable with a colon and a digit
defining the number of places. For example, if x5=10.22, the statement:
c(15,19):2=x5
results in:
----+----2----
10.22
If the real number has more decimal places than we have allowed for, say 3 instead of 2,
the extra decimal places will be ignored.
Quick Reference
To copy codes which are present in at least one of a list of columns, type:
To copy codes which are present in only one of a list of columns, type:
If any of these statements includes codes (p), only those codes are checked for. Any
unlisted codes are then ignored.
The final type of assignment is copying codes from a set of columns. The codes copied
depend upon the type of operator used:
where ca, cb, and cc are the columns whose codes are to be compared. Note that even if
you are comparing codes in consecutive columns, each column must be identified
separately, preceded by a c. Suppose we have:
----+----4
111
/22
453
77
and we type:
c181=and(c137,c138,c139)
Notice that even though the codes ’3’ and ’7’ appear in more than one column they are
not copied to c181 because they are not common to ALL columns.
Let’s take the same three columns with the OR operator. We type:
c182=or(c137,c138,c139)
c182 contains a list of all codes present in AT LEAST ONE of the named columns.
c183=xor(c137,c138,c139)
yields:
Here only two codes have been copied because all other codes appear in more than one
column. If one column was blank, this would be ignored if there were other codes unique
to one column. Only if there were no other unique codes would column 183 be blank. For
instance, if we have c11=’ ’,c12=’12’, c13=’13’ and we type:
c14=xor(c11,c12,c13)
we would have c14=’23’, but if c13 were to contain a ’12’ instead, c14 would be blank.
All our examples so far have referred to whole columns, but sometimes you will only be
interested in specific codes in those columns. To write this in Quantum, follow each
column number with the positions to be checked enclosed in single quotes. Any unnamed
codes in those columns are then automatically ignored. Here is an example. Our data is:
----+----4----+----5
1 1 2
/ 3 /
5 5 6
Even though columns 31, 41 and 45 all contain a ’3’ and a ’5’, Quantum only copies the
’3’ because the ’5’ is not part of our specification. We have used the same code
specification for all three columns, but you can use whatever combination you like.
✎ These types of statement are extremely useful for setting up shorthand references to
the codes present in a group of columns. Say, for instance, that you wanted various
statements throughout the edit to be executed only if there was a ’1’ in one or more
of c110, c112, c120 and c125. You can always write out each column and code
separately each time:
if(c110’1’.or.c112’1’.or.c120’1’.or.c125’1’) .....
but it is quicker to set up a shorthand column once:
c181=or(c110,c112,c120,c125)
and then to test that column instead:
if (c181’1’) ...
This is especially worthwhile if you will need to refer to the contents of these columns
again later on in the edit. This facility may also be used to simplify what would otherwise
be complicated filter conditions in the tabulation section.
Quick Reference
To add codes into a column in addition to those that are already there, type:
Emit inserts codes into a column leaving the original contents intact. Its format is:
emit cn’p’
Suppose we have:
----+----7
4
5
&
----+----7
3
4
5
&
More than one column may be entered on each line, provided that each one is separated
by a comma.
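For instance, a single statement can add a ’3’ to column 67 and a ’9’ to column 72 (the
columns here are chosen for illustration):
emit c67’3’,c72’9’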
✎ emit can only be used with single columns; string variables are not valid: emit
c(100,110)$99$ does not work.
Quick Reference
To delete selected codes from a column, type:
The delete statement is the opposite of emit in that it deletes codes from a column leaving
the remainder intact. Its format is:
delete cn’p’
Suppose we have:
+----1----+
5
6
8
9
+----1----+
6
8
9
More than one deletion may be effected with the same delete statement as long as each
column is separated by a comma.
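For instance, to delete a ’0’ from column 115 and an ’&’ from column 116 (illustrative
columns only) you could write:
delete c115’0’,c116’&’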
Quick Reference
To force single-coding of a multicoded column, type:
where a code at the start of the list should be accepted in preference to any later in the list.
Sometimes when you are cleaning your data you will come across a column which is
multicoded when it ought to contain only one code. You can either print out the record
and change the incorrect codes later or you can have Quantum do it for you automatically.
When data is to be corrected automatically, you will need to write a statement saying
which codes should be discarded and which are to be kept. Obviously, there can be no
hard and fast rule since the codes may vary between questionnaires, so what you may do
is assign each code a priority so that when a certain code is found Quantum knows that
all others in that column are to be deleted.
where cn is the column whose codes are to be checked and ’p1’ to ’pn’ are the positions
to check, entered in order of priority, the most important first.
✎ priority checks only the listed positions; if any other codes are present they are
ignored.
Suppose one of the questions in a survey asks respondents to give their overall opinion
of a product, rated on a scale of 1 (Poor) to 5 (Excellent). You have been told that if the
question has accidentally been multicoded you are to assume that the higher rating is
correct and delete the lower rating from the column. You will not know beforehand
exactly what multicodes there are, if any, but you will know the column and the possible
codes it may contain, and also that low codes should be discarded in favor of high ones.
If this question is coded into column 249, you could write:
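priority c249’5’,’4’,’3’,’2’,’1’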
This causes Quantum to scan column 249 to see first whether it contains a ’5’ and, if so,
to delete all subsequent codes in the list. If c249 contains a ’5’ and nothing else, obviously
there will be no extra codes to delete; this does not matter. If there is no ’5’ in c249,
Quantum then checks whether it contains a ’4’; if so, any other codes in the range ’1/3’
are deleted, otherwise the program skips to the next code in the list and checks for that.
If none of the listed codes are found, the column remains unchanged.
If our first record has c249’53’ Quantum will give us c249=’5’, but if the second has
c249’942’ we will end up with c249’94’; the ’9’ has not been removed because it was not
one of the named positions.
You can also use priority to force a field to be single-coded simply by listing the columns
and codes to be checked in order of importance. If a listed code is found in the first
column, any other listed codes will be removed from that column, as will any that appear
in subsequent columns. For example, if our record is:
-----+----6
22
3
5
and we write:
-----+----6
2
However:
-----+----6 -----+----6
22 would become 2&
3
&
In the previous example, we have named two different columns on the same priority
statement because together they form a field which must be single coded overall. If you
want to force two completely separate columns to be single-coded, you must write two
priority statements, one for each column. If our data is:
+----3----+
21
33
6
the statement
priority c129’1’,’2’,’3’,c130’1’,’2’,’3’
results in:
+----3----+
26
but:
results in:
+----3----+
21
6
Quick Reference
To choose a random code from a list of codes, type:
data_var_name=rpunch(’codes’)
data_var_name=rpunch(col_number)
Occasionally you may wish to set a random code into a column, perhaps because the code
in that column is incorrect. To do this, write:
cvar = rpunch(’p’)
where cvar is the column into which one of the codes ’p’ is to go. For example:
c115 = rpunch(’1/5’)
puts one of the codes 1 to 5, chosen at random, into column 115. Similarly, you can name
another column whose codes are to be used:
c115 = rpunch(c120)
Once this statement has been executed, column 115 will contain one of the codes present
in column 120.
Quick Reference
To set up an array based on numeric codes in the data, type:
column_specs are references to the fields containing the numeric codes. code is a
non-numeric code present in those fields and cell_number is the cell of the array which
should be incremented whenever that code is encountered.
Cells in the array are reset to zero at the start of each new record. To prevent this
happening, enter the statement name as fieldadd rather than field. The rest of the
statement is as shown.
On some studies you will find responses which are represented by numbers rather than
codes. There are various methods of checking and tabulating these responses. Which one
you use depends on whether you want to know the number of respondents whose record
contains a given code in a field or group of fields, or the number of times a code appears
in a group of fields.
To illustrate this, let’s suppose the question and response list in the questionnaire are as
follows:
Q6A: Which films did you see on your last three visits to the
cinema?
If you want a table which shows how many people saw each film, one way of tabulating
this data is to use a fld statement in the axis which tells Quantum which columns to read
and which codes represent each film.
Another way is to use a combination of field in the edit and bit in the axis. This is
particularly efficient if, rather than wanting to count the number of people who saw each
film, you want to count the number of times each film was seen.
The field statement counts the number of times a particular code appears in a list of fields
for each respondent. It stores these counts in an integer array that consists of as many
cells as there are fields to count. In the films example, the array will have five cells. Cell
1 will hold the number of times code 01 appears in the fields c(12,13), c(14,15) and
c(16,17). If the respondent saw Green Card then Batman 2 and then Green Card again,
his/her data will be:
----+----1----+----2
           040504
Cell 4 (Green Card) of the array will be set to 2, and cell 5 (Batman 2) of the array will
be set to 1. You can then tabulate the contents of this array using a bit statement in the
axis.
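In outline, the statement takes the form (square brackets mark the optional part):
field output_array = input_specs [, special_specs]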
output_array is the name of the array in which you wish to store the counts of responses.
You can use spare columns in the C array, but you may find your program is easier to
read if you define an integer array of your own with a name which reflects the type of
information it contains. For example, if you want an integer array called films, you might
write:
int films 5s
ed
field films = .....
When you define the integer array, make sure that you request as many cells as there are
codes in the data. In this example there are five films so you define the array as having
five cells. Quantum automatically creates an extra cell (cell 0) which it uses to count
responses for which there is no cell allocated. If there were six films, for example,
Quantum would increment cell 0 each time it found code 06 in the films columns. You
might like to check the value of this cell as a means of reporting on invalid codes:
Negative and zero values also cause cell zero to be incremented. Codes which are shorter
than the field width are accepted as long as they are padded with blanks or zeroes.
The input_specs part of the statement defines the columns to read. You have a number of
choices here. First, you may list each column or field reference one after the other,
separated by commas. The list must be enclosed in parentheses. In our example this
would be:
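field films = (c(12,13),c(14,15),c(16,17))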
Second, if you have sequential fields as you do here, you can type the start columns of
each field followed by the field length. The list of start columns is separated by commas
and enclosed in parentheses, and the field length comes after the closing parenthesis and
starts with a colon. If you use this notation for the film example you would write:
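field films = (c12,c14,c16):2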
If you wish, you can abbreviate this further by typing just the start columns of the first
and last fields, followed by the field length.
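For the films example this would be:
field films = (c12,c16):2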
Third, if the fields are not sequential, you list the start columns and field width of each
group of columns (as shown above) and separate each group with a slash. For example,
to read data from columns 12 to 17 and 52 to 57, with each field being two columns wide,
you would type:
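which, following the notation above, might be written as:
field films = (c12,c16):2/(c52,c56):2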
You can also use this notation for single non-sequential fields. For example:
The special_specs part of the statement is optional. You use it when a field contains
non-numeric codes such as $&&$ for None of these films. If you want to count codings
of this type, you must remember to allocate cells in the array for each code or group of
codes you wish to count. You then include the notation:
code = cell_number
int films 6s
ed
field films = (c12,c14,c16):2, $&&$=6
If you want to count more than one non-numeric code, list each one individually,
separated by commas.
✎ To tabulate data counted by a field statement, you use a bit which names the integer
array you have created and defines the element texts associated with each cell of the
array.
☞ You will find this statement documented in section 18.4.
Quantum normally resets the cells of the integer array to zero at the start of each record.
If you want counts to continue from one record to another, use a fieldadd statement
instead of field. For example:
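fieldadd films = (c12,c14,c16):2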
✎ The advantage of using field or fieldadd is that they automatically count the number
of times a code appears in a list of fields. If you want a table which uses this
information, you just tell Quantum to increment the counts in the table by the values
stored in the appropriate cells of the array.
You can also manipulate the values stored in the cells before you tabulate the data.
For example, if you had codes for Aliens 1, 2 and 3, you might wish to merge them
into a single cell for all Aliens films so that the tabulation spec is easier to write.
Quick Reference
To remove values from variables, type:
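clear var1[,var2,...varn]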
where var1 to varn are any valid Quantum variable or range of variables. For example:
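clear c(101,180),t1,x5
(the variable names here are chosen purely for illustration)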
Data variables are reset to blank, integer variables are reset to 0 and real variables are
reset to 0.0.
Variables can also be cleared using assignment statements (e.g., t1=0), often placed inside
a do loop when a whole array is to be cleared, but there are advantages to using clear
instead. Firstly, clear is much easier to write. Secondly, with clear the compiler checks
that the subscripts are in the correct range (e.g., 1 to 33 if ‘myarray’ has only 33 cells);
this is not possible with the loop method because the subscript is a variable. However, if
you use variables as subscripts with clear (e.g., clear c(t1,t1+5)), subscript checking once
again cannot be done.
Quick Reference
To prevent Quantum from checking array boundaries during a run, type:
nobounds
Quantum normally terminates if it detects that you are writing beyond the end of an array.
For example:
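int number 10s
ed
/* The next statement refers to cell 11 of a 10-cell array (the value assigned is immaterial)
number(11)=1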
Here, we have defined an integer array called ‘number’ as having 10 cells. When
Quantum reads the assignment statement and detects that it refers to ‘number(11)’ it will
terminate because there are only 10 cells in the array, not 11. The same would be true for
statements which referred to, say, t201 when the size of the T array had not been extended
past the default of 200 cells.
The exceptions to this are emit, delete, partial column moves and reads from fetch files.
☞ emit, delete and partial column moves are discussed earlier in this chapter. fetch files
are described in the section entitled "The fetch statement" in chapter 13.
While they may save you time in the long run, these checks do mean that your job will
run slightly slower than it otherwise would.
If you wish to run without these checks, insert a nobounds statement near the start of the
edit.
Quick Reference
To assign a value to a T variable in the data file, type:
*set tn = value
You may use a *set statement in the data file to assign a value to a T variable. Its format
is:
*set tn = value
where n is a number between 1 and 200 (unless you have increased the number of
T-variables).
The statement must start in column 1. You may type ‘set’ in upper or lower case, and may
follow it with any number of spaces. If Quantum reads anything that it cannot interpret
as a T variable, it terminates the run immediately.
This facility is available in all jobs with or without levels (trailer cards). You may use it
as many times as you need throughout the data file to assign different values to the same
T-variable, or to assign different values to a number of T-variables.
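For example, the following line in the data file assigns the value 3 to t5 (the variable and
the value are chosen for illustration); remember that it must start in column 1:
*set t5 = 3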
Statements in the edit section are usually dealt with in the order in which they occur in
the program. Quantum provides statements which may be used to alter this normal order
of execution, for example, by missing out a statement or repeating a group of statements
a number of times.
Quick Reference
To define statements to be executed if a certain condition is true, type:
The if statement has exactly the same meaning as in English; it defines a statement whose
execution depends upon the value of a logical expression. Let’s first take an English
sentence to explain this: we might say ‘If it is raining, I will take my umbrella’. Here, the
statement is ‘I will take my umbrella’ and it depends upon the logical expression ‘It is
raining’. If the expression is true (i.e., it is raining), the statement is executed (I take my
umbrella), if it is false (no rain) it is ignored (I don’t even think about my umbrella).
Now let’s take a Quantum sentence. We have a shopping survey in which respondents
have been asked to name the supermarkets in which they shop at least once a week. These
responses are coded into column 21 of card 1, and we want to keep a count of the number
of respondents shopping in Safeway’s (code 4). Our sentence would say ‘If column 21
contains a 4, increment our counter by 1’.
2. The logical expression whose value controls the action to be taken, enclosed in
parentheses (see section 5.2).
Thus, to translate our sentence into the Quantum language, we would write:
if (c121’4’) safe=safe+1
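A more complex test might be written as follows, assuming the codes of interest are in
columns 10 to 12 and the extra code is to go into column 20:
if (numb(c10)+numb(c11)+numb(c12).gt.3) emit c20’9’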
The logical expression to be tested states that the number of codes in columns 10, 11 and
12 is greater than three. If it is true, and there are, say, 5 codes altogether in those
columns, we will add a 9 into column 20 in addition to what is already there. On the other
hand, if there are 3 or fewer codes in that field we leave column 20 as it is and continue
with the statement on the line immediately after the if. For instance:
+----1----+----2----+ +----1----+----2----+
62- 1 62- 1
0 / yields 0 /
4 4
9
but:
+----1----+----2----+ +----1----+----2----+
2- 1 2- 1
0 / yields 0 /
4 4
Once the emit statement has been executed, Quantum continues with the statement on the
next line.
The statement to be executed if the expression is true may be any Quantum statement,
even another if. For example:
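if (c130’1’) if (c131’9’) c181’19’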
says ‘if c130 contains a ’1’, and then if c131 contains a ’9’, then put the multicode ’19’
in c181’. This statement is not incorrect, but it can be more efficiently written as:
if (c130’1’.and.c131’9’) c181’19’
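Several statements, separated by semicolons, may also follow a single condition, as in
this sketch (reconstructed from the description below):
if (t4.le.5) c235=’45’; emit c567’2’; delete c789’0’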
This says, if the value of t4 is less than or equal to 5, put the multicode ’45’ in column
235 overwriting whatever is there already, then add a ’2’ into column 567 and, finally,
remove the ’0’ from column 789.
if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a
missingincs 0 statement is read. It does not switch on missingincs selectively
for only those records that satisfy the expression defined by the if clause.
☞ For further information about missingincs, see section 12.6.
Quick Reference
To define statements to be executed if a given condition does not exist, type:
In Quantum the keyword else means ‘otherwise’. In English we would say ‘If it’s raining
I’ll take the car, otherwise I’ll walk’; in Quantum we write:
This says, if the expression is true, execute the statements immediately after the if, but if
it is false, execute those following the else. For example:
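if (c76’4’) t3=1; delete c76’3’; else; t3=2; emit c77’2’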
Here, if c76 contains a ’4’, t3 is set to 1 and a ’3’ is deleted from c76. However, if c76
does not contain a ’4’, t3 is set to 2 and a ’2’ is added into c77.
Else may only be used as part of an if statement and must be separated from the if by at
least a semicolon. Statements of the form:
are correct, but since action is only required if the expression is not true, it is more usual
to write:
Sometimes your Quantum program will include statements which refer to certain
respondents only; for instance you will only want to check the data associated with a
particular brand of soap powder if the respondent bought that powder. These statements
may be routed over when the respondent does not buy the powder by using the go to (or
goto) statement, followed by a statement number. The statement:
if (c121n’1’) go to 50
causes Quantum to go immediately to the statement labeled 50 if column 121 does not
contain a ’1’ (e.g., the respondent did not buy Brand A soap powder). Any statements
between this if statement and statement 50 are ignored whenever a record is read where
c121n’1’ is true.
The statement labeled 50 may be any Quantum statement, but many people just write:
50 continue
to gather all respondents together before continuing through the rest of the program. This
statement is described in the next section. All labels must be attached to statements: a
label by itself is an error and Quantum will tell you so.
You may route forwards or backwards in your program, but when routing backwards,
take care that you are not creating a situation from which it is impossible to escape: the
following will go on and on forever if you let it:
10 t1=t1+1
- - other statements - -
go to 10
The only way to avoid situations like this is to make sure that somewhere between
statement 10 and go to is another statement that routes you past the go to at some time,
for example:
10 t1=t1+1
- - other statements - -
if (t1.gt.10) go to 15
go to 10
15 continue
9.4 continue
Quick Reference
To mark a destination for routing or the end of a loop, attach a label to the keyword:
continue
This statement is a dummy statement whose sole purpose is to join various bits of a
program together. It is often used with a statement label as a destination for routing with
go to, or to identify the end of a loop.
☞ To find out more about using continue with loops, read the section entitled "do with
individually specified numeric values" later in this chapter.
9.5 Loops
Quick Reference
To define a set of repetitive statements, type:
do label_number int_variable=value_list
statements
label_number statement
Loops are extremely important structures because they enable the same set of basic
statements to be executed over and over again on a changing series of numbers, columns
or codes. Their use can reduce the work involved in checking data. The statement which
introduces a loop is do which is formatted as follows:
3. An integer variable (for numbers or columns) or a letter (for codes) whose value is
to be used by the statements in the loop.
4. An equals sign.
5. A list of whole numbers, integer variables or codes which are the values the integer
variable or letter is to take. These may be entered in two ways (see below).
Loops should be terminated by any statement other than go to, stop, return, another do or
an if containing any of these words. The main purpose of the terminating statement is to
identify the end of the loop and send the program back to the start of the loop. Go to and
return send the record elsewhere, stop terminates the run and another do indicates the
start of another loop. The statement most often used to terminate a loop is the dummy
statement continue. Any statement that terminates a loop must be preceded by a label
number.
We will now go on to discuss the various ways of defining the values in the value list.
Quick Reference
To define a loop to be repeated for a set of given values, type:
The simplest way to define the values for the loop is to list them individually. In this case,
values must be whole numbers, separated by commas with the whole list enclosed in
parentheses. For example:
do 20 t5 = (125,130,140,145)
if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
20 continue
Before we discuss what this loop is doing, let’s look at the way it has been written. The
do statement tells us three things, namely that the loop is terminated by the statement
labeled 20, the integer variable to be used is t5, and the statements within the loop are to
be repeated four times (there are four values in the list). The statement labeled 20 is
continue which just sends Quantum back to do.
The purpose of this loop is to check whether the contents of four fields are greater than
3000, and if so to reset those columns to blank. The first time through the loop, t5=125.
When substituted into the if statement it yields:
if (c(125,129).gt.3000) c(125,129)=$ $
The next statement is continue which sends us back to the top of the loop. t5 is now
pointing to the second value in the list, 130. The if statement reads:
if (c(130,134).gt.3000) c(130,134)=$ $
This process is repeated until t5 has taken all values in the list. There is no need to include
statements which check the value of t5 and jump out of the loop when the last value is
reached: Quantum keeps a count of how many values there are and it knows that once the
last value has been reached it should continue with the statements following the loop.
Quick Reference
To define a loop which will be executed for a range of values, type:
If the incremental value is 1 and the loop has one range only, the incremental value may
be omitted.
Sometimes there will be a pattern to the numbers in the list: for example, they may
increase in steps of 5. You may list them all individually if you prefer, but it is quicker to
enter them as a range with a start, end and incremental value (in our example, 5) separated
by commas. The start value must be smaller than the end value, and the increment must
be positive. Quantum checks the start and end values and if the start is larger than the end
value, the statements inside the loop will not be executed at all. If the increment is
negative, the loop will be executed for the start value only.
do 20 t5 = 125,145,5
if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
20 continue
This loop is very similar to that used in the previous section. It will be executed for all
values of t5 between t5=125 and t5=145 where the value is incremented by 5 each time.
The loop says:
if (c(125,129).gt.3000) c(125,129)=$ $
if (c(130,134).gt.3000) c(130,134)=$ $
if (c(135,139).gt.3000) c(135,139)=$ $
if (c(140,144).gt.3000) c(140,144)=$ $
if (c(145,149).gt.3000) c(145,149)=$ $
You may enter as many range specifications as you like on one line, as long as each one
is separated by a slash (/):
do 15 t1 = 25,35,2 / 50,62,3
if (numb(c(t1)).gt.1) c(t1)’ ’
15 continue
This loop replaces eleven if statements: t1 will take the values 25, 27, 29, 31, 33, 35, 50,
53, 56, 59 and 62.
If the loop has only one range, and the incremental value is 1, the 1 may be omitted. If
t3=11 and t4=15:
do 15 t2 = t3,t4
if (numb(c(t2)).gt.1) c(t2)’ ’
15 continue
checks that columns 11, 12, 13, 14 and 15 each contain no more than 1 code. If not, the
column is reset to blank.
do with codes
Quick Reference
To repeat a set of statements for all codes in a given range, type:
Sometimes you will want to repeat a statement or set of statements for a given set of
codes, rather than for columns or other types of variable. The way to do this is to write a
do statement which, instead of naming an integer variable and whole numbers, defines a
list of codes and a temporary variable which points to each code in turn. When you want
to refer to the current code, you simply enter the name of the temporary variable and
Quantum will substitute the value of the current code in the statement before it is
executed.
to execute statements for all codes in the range ’p1’ to ’p2’, where the sequence of codes
is &–01234567890–&; or
In both formats, note that the variable name and the codes must all be enclosed in single
quotes. Additionally, you may not use the notation ’ ’ to indicate a blank code, nor may
you use the temporary variable in partial column moves (i.e., in statements of the form
c(1,4)=c(3,6)).
Here is an example which illustrates how to check for certain codes in a series of
columns:
do 10 ’code’ = (’1’,’3’,’5’)
if (c110’code’ .or. c111’code’) emit c180’code’
10 continue
This loop is executed three times, once for each of the three listed codes. The first time
the loop is executed, the statement will read:
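if (c110’1’ .or. c111’1’) emit c180’1’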
Nested loops
Loops may contain other loops: this is called nesting. Loops may be nested up to six
levels deep, but they must not overlap. Also, each loop must have a separate terminating
statement. In other words, they must always take the form:
do 60 t2 =
do 70 t3 =
do 80 t4 =
.
.
80 continue
70 continue
60 continue

or

do 60 t2 =
do 70 t3 =
.
70 continue
do 80 t4 =
.
80 continue
60 continue
It is possible to route from inside a loop to outside, but not from outside to inside. The
following is permissible:
do 150 t1 = 125,145,5
if (numb(c(t1)).eq.1) c189’1’; go to 76; else; c(t1)’ ’
150 continue
76 continue
What we are saying in this loop is that if a given column specified by t1 is single-coded
(i.e., contains one code only) we set a spare column equal to 1 and send the record out of
the loop. If not, we set the column being checked to blank and return to the top of the loop
to get the next value of t1. This process continues until a single-coded column is found,
or until all values of t1 have been tried.
The following, however, is not permissible:
if (c176’3’) go to 76
.
.
do 150 t1 = 125,145,5
76 if (numb(c(t1)).eq.1) c(t1+1)’&’
150 continue
If c176’3’, the program would jump into the middle of the loop and have an unidentified
value for t1. An error message will be printed under the offending statement.
Quick Reference
To reject a record from the rest of the edit, type:
reject [level_name]
In a levels job, include a level name to reject all data at the given level.
Normally all records are passed straight from the edit to the tabulation section regardless
of whether or not they contain errors. Reject tells Quantum to continue editing the record
but not to include it in the tables. The record is also rejected from the weighting and
where split is used, it is rejected from the clean file and may be found in the dirty file. For
instance, we might write:
if (c73’8’) reject
if (c80’1’) t5=t5+1
end
to reject records in which column 73 contains an ’8’ from the tabulations but not from the
rest of the edit. Therefore, even if c73’8’, the record is still checked for a ’1’ in column
80 and if one is found, t5 is incremented.
Whenever a record is rejected the variable rejected_ becomes true. You may use this
variable in your program to deal with rejected records in a different way to accepted
records. For instance, we may wish to write all rejected records out in the file rejfil for
later inspection and correction:
The variables rec_rej and rec_acc count the total number of records rejected and accepted
so far. You may wish to check these variables and terminate the run if too many records
are rejected.
If you are working with hierarchical (levels or trailer card) data, reject at a given level
will reject all data at that level. Additionally, data at a level higher than that currently
being edited may be rejected from tables – for instance, in the edit of data at the item
level, you may reject all data at person level. The syntax for this is:
reject levelname
✎ When used with split, reject at any level rejects the whole record from the clean file.
☞ For more information on levels jobs, see chapter 28.
Quick Reference
To send the record to the tabulation section, type:
return
The word return in Quantum bears no relation to the same word in English. It does not
mean go back to the start of the edit or anything like that, rather it means ‘terminate the
edit immediately and jump to the tabulation section’. Once the record is tabulated
Quantum reads in another record as usual. If there is no tabulation section, the next record
is read in straight away.
Return is very often used with reject to reject a record without finishing the edit. For
example:
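if (c73’8’) reject; return
if (c80’1’) t5=t5+1
end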
Here any records in which c73’8’ are rejected from the tables, but, because reject is
followed by return which sends records to the tabulation section, editing is terminated
immediately. Thus, only records in which c73n’8’ will be tested for a ’1’ in column 80.
Compare this example with the one in section 9.6 above.
✎ Do not put reject after return because it will never be reached. Once the return is
read, the edit is terminated immediately and the record is passed to the tabulation
section without the rest of the statement ever being read:
if (c73’8’) return;reject
Quick Reference
To stop editing records and start tabulating records read so far, type:
stop [num_times_execute]
On some surveys you may want to run test tables on a few records only. This can be done
using the word stop.
Stop tells Quantum to stop the run and print tables once editing has been completed on
the current record. For example, we may want test tables for 50 people who own goldfish,
so we set up a counter and terminate the run when it reaches 50:
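/* Hypothetical coding: assume a ’1’ in column 30 identifies a goldfish owner
if (c30’1’) gold=gold+1
if (gold.eq.50) stop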
If we did not wish to restrict ourselves to goldfish owners, and were satisfied with just
the first 100 respondents, we could use the reserved variable rec_count in our test and
stop when it reached 100:
if (rec_count.eq.100) stop
Alternatively, to be sure that we stop when 100 records have been accepted for tabulation,
we could write:
if (rec_acc.eq.100) stop
When the stop statement is executed, the reserved variable stopped_ becomes true.
A variation of stop is
stop n
where n is the number of times the statement is to be executed. If stop is part of a routing
pattern in the edit, it may be necessary to read in more than the n records to execute the
statement n times. As an example, here is another way of counting goldfish owners:
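/* Again assuming that a ’1’ in column 30 identifies a goldfish owner
if (c30’1’) stop 50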
Here, the stop statement is only executed whenever we find someone who owns a
goldfish. We may need to read data for 72 respondents before we reach our target of 50
goldfish owners.
When either form of stop is used, editing and tabulation is completed for the respondent
at which the condition is fulfilled, and no more records are read. Therefore, if we have to
process 72 respondents in order to find 50 goldfish owners, a holecount requested by the
edit would include 72 records and errors in those 72 records would be included in the
error listings.
Quick Reference
To cancel a run, type:
cancel [num_times_execute]
The word cancel, which is similar in format to stop, terminates the run immediately,
producing tables only for those respondents already passed to the tabulation section. It is
often used to halt a run when too many errors have been detected in the data. For instance,
to cancel the run when more than 100 errors have been found, we might have:
To cancel the run when more than 50 records have been rejected, we could write:
if (rec_rej.gt.50) cancel
Alternatively, cancel may be followed by a number indicating that the run should be
cancelled when the statement has been executed a specific number of times:
cancel 100
cancels the run when this statement has been executed 100 times.
As with stop, holecounts and error listings will only contain information about records
read prior to the cancellation condition being fulfilled. If 400 records are read before 101
errors are found, we will only see errors for those 400 records.
Quick Reference
To send a record temporarily to the tab section, type:
process
Process is an edit statement which is similar to return but must not be confused with it.
When return is executed, the record is sent on to the tabulation section; after the tables
are completed for that record, the program returns to the start of the edit section and the
next record is read in.
When process is executed, the record is also sent immediately to the tabulation section
where it is used in table creation. However, after the record has been tabulated, control is
passed back to the edit section to the statement immediately following the word process.
The record continues through the edit and any statements after process applicable to the
record are executed. At the end of the edit the record is passed through the tabulation
section again.
Process is used when you need to tabulate portions of a record more than once. For
example, if our survey asks shoppers about the brands of bread they purchased the last
four times they visited the shops, our data may be set out as follows:
Suppose we wish to create a table showing the total number of loaves of each brand
bought by all (or selected groups of) respondents during their four trips to the store. The
simplest way to do this is to set up an axis of the form:
l brd;inc=c135
n23Number of Loaves Bought
col 134;Brand A;Brand B;Brand C;Brand D
and then to place the statement:
process
in the edit at the point where you want to tabulate the record for the first brand. To deal
with the second purchase, we then write:
c(134,135)=c(136,137)
process
This overwrites the information about the first purchase with information about the
second purchase, and the record is processed a second time. The total number of loaves
bought on the second trip will be added to the total number of loaves bought on the first
trip.
The third and fourth purchases are dealt with in the same way:
c(134,135)=c(138,139)
process
c(134,135)=c(140,141)
process
When we finish, the total number of loaves of each brand bought by all respondents
during those four visits will be contained in the relevant cells of the axis.
In a situation like this we would probably put the process statements in a loop at the end
of the edit, although this is not strictly necessary. For instance:
do 10 t1 = 134,140,2
c(134,135)=c(t1,t1+1)
process
10 continue
This performs exactly the same task as the list of statements shown earlier; it is just a
more efficient way of writing them.
✎ Be careful if process is the last statement in your edit: the record will be passed to
the tabulation section by process and then again by the end statement. If this is not
what you want, omit the last process.
☞ For another example of process, see the section entitled "Incrementing tables more
than once per respondent" in chapter 18.
There are a number of ways of examining your data once it has been read into the
C-Array. You may:
c) write out specific records and examine them individually, as discussed in chapter 7.
10.1 Holecounts
Holecounts are used to obtain an overall picture of the data before you write your edit
program. For each column they show:
• the density of coding – i.e., how many respondents have 1, 2 or 3 or more codes in
each column
There is an example of a holecount on the next page. The first column tells us the columns
for which codes are being counted; in this case it is columns 1 to 16 of card 1. The
numbers across the top are the individual codes, and the total in the top left-hand corner
is the total number of respondents (records): our data has 196 respondents.
As you can see, there are two numbers in each cell; an absolute figure and a percentage.
The former tells us how many records were found with a specific code in a column and
the latter tells us what percentage of the total data that is.
For example, there are 169 records with a code 1 in column 14 and this is 27.9% of the
total. Similarly, 32 records have a code 4 in column 15 which is 5.3% of the total records.
Notice that when the cell total is zero, no percentage figure is printed: this all makes it
easier to see the pattern of coding in each column.
The four right-hand columns of the holecount show the density of coding in each column.
The column headed Den1 shows the total number of records with only 1 code of any sort
in the column. Den2 is the number of records with 2 codes in the column, and Den3+ tells
us how many records were multicoded with 3 or more codes in that column. The TOTAL
is the total number of codes in that column – that is, the sum of Den1, Den2 and Den3+.
Let’s look at column 108. 47 records have 1 code only in that column; 54 have 2 codes
and 84 have 3 or more codes. The total number of codes in this column is 410, and each
card has an average of 2.09 codes in this column.
The holecount is the starting place in your search for errors. It is often immediately apparent from a holecount that the presence of certain codes indicates an error, or that a column is multicoded when it should contain only one code.
Creating a holecount
Quick Reference
To create a holecount, type:
where text is the heading to be printed at the top of each page. This is optional; if it is omitted the holecount will simply be headed ‘Holecount’. Our example was created by a count statement naming columns 1 to 16 of card 1 as the field to be counted.
Quantum itself accepts double quotes in the holecount heading, but the C compiler which processes the code that Quantum creates from your specification does not: it generally issues an error message referring to a missing ) symbol at the point at which the double quote occurs. To prevent this happening, precede the double quote with a backslash, typing \" rather than the double quote on its own.
You may count as many or as few columns as you like, as long as the columns to be
counted are consecutive: to count, say, columns 135 to 140 and columns 160 to 180 you
will need two statements, one for each field.
Records are counted at the stage they are when the count is read. If you have previously
altered any columns, say, with assignment or emit statements, the count will refer to the
columns as they are after the alterations rather than as they were in the original data file.
Similarly, any changes which are effected after the count are not reflected in the output.
✎ If you place a count statement in a loop, a holecount will be created the first time
the statement is read, and will be ignored thereafter.
Filtered holecounts
A filtered holecount is one in which only records fulfilling a specific condition are
counted. They can be created using the if statement to define the occasions when a record
should be counted.
For example, suppose we wish to include only male respondents in our holecount. We would place the count statement under the control of an if statement that tests the column containing the respondent’s sex.
Normally, trailer cards of a given type are treated as one card and are counted together.
Thus, the number of codes in a column for a particular trailer card contains the sum of all
codes found in that column on all trailer cards of the given type (e.g., all cards 2s).
You may, however, prefer to produce holecounts on such cards based on their relative
position within the group of trailer cards. For example, suppose card 2 is a trailer card
and we wish to make a holecount on the third card 2 of each group. In chapter 6 we said
that the variable allread2 is true when a card 2 has been read in for the current record, and
that it keeps count of the number of card 2s read. So, to produce a holecount for the third card 2, we would make the count statement conditional on allread2 having the value 3.
We can also create filtered holecounts of trailer cards based on characteristics of the individual cards. Suppose we have a trailer card for each store visited, in which the store is identified in c79, and that the trailer card is the 5-card. To count only the cards relating to a particular store, we would test the store code in c79 in the same way.
Multiplied holecounts
Quick Reference
To create a multiplied or weighted holecount, type:
where text is the holecount title and c(m_start,m_end) is the field in the C array
containing the multiplier or weight for each record.
In ordinary holecounts, the cells are simply counts of records: each time a record is read
with a specific code in a given column, the relevant cell in the holecount is incremented
by 1. If 231 records have a 7 in column 79, the figure in that cell will be 231.
Holecounts may also be created by incrementing each cell by the value found in a column
field in the record. This value is the record’s ‘multiplier’. If the multiplier is 15, and the
record has a 6 in column 152, the count for c152’6’ will be incremented by 15 rather than
by 1 for this record. You may hear this type of holecount referred to as a weighted holecount because multiplying a record by a given value is the equivalent of weighting it.
✎ If the multiplier is being calculated during the run, it must be placed in the C array
using wttran before the holecount is requested.
☞ For further details on weighting and wttran, see section 25.9.
where c(m,n) is the field to be counted, text is the optional heading to be printed at the
top of each page, and c(x,y) is the field containing the multiplier for the record. If this
field contains a real number, it must be referenced as cx(x,y) otherwise the decimal point
will be ignored (e.g., 1.5 will be read as 15).
The number labeled TOTAL at the top of each page of output is no longer the total number
of records in the data file, rather it is the number of records after each record has been
multiplied by its multiplier. This is best illustrated by an example. If we are producing a
holecount for C(20,30), and of our 50 respondents, 20 have a multiplier of 2.5, 15 have a
multiplier of 2.6 and 15 have a multiplier of 3.0, the total at the top of the page will be
134 respondents, calculated as follows:
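(20 x 2.5) + (15 x 2.6) + (15 x 3.0) = 50 + 39 + 45 = 134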
Multipliers may be part of the original data file or they may be calculated during the edit.
Both real and integer values are valid, even though the cell counts in the output will
always be shown as whole numbers. This does not mean that you lose accuracy with real
multipliers. Quantum stores the cell counts with as many decimal places as are necessary
until the count is complete, whereupon it rounds all values ending in .49 or less down and
all values ending in .5 or more up.
The figures used to create the multiplied holecount would then be 22.4, 12.7, or 11.9,
depending upon the contents of c104 in each record. Suppose we have 27 home owners
(i.e., 27 people have c104’2’); the count for a ’2’ in column 4 of card 1 would be 612.9
(27 x 22.4) which would appear in the output file as 613.
(i) since we are copying a real number into a field of columns we use the notation cx to
refer to the columns and follow them with the number of decimal places required.
(ii) also, because the word count is written in lower case it may start in column 1. If it
had been written in upper case it would need to start in a column other than 1 to
prevent it being read as a comment.
By default, each distribution has two parts. In the first part, the values in the column field
are sorted in alphabetic or numeric order; in the second, they are sorted in rank order,
according to the number of times each one occurs in the data. Any multicodes in the field
are decoded and the constituent codes are listed. Each distribution shows both absolute
and cumulative figures as well as percentages for both. At the end of the alphabetic sort,
Quantum prints:
a) the number of categories – that is, the number of different values found in the field
b) the number of wholly numeric items found
c) the sum of factors – that is, the sum of all wholly numeric items (values which occur more than once are counted as many times as they occur)
d) the mean for the numeric items listed (i.e., the sum of factors divided by the number of numeric items)
e) the standard deviation of the numeric items
If the field is numeric and the run has missing values processing switched on, fields that
are non-numeric will contain the value missing_. This value is counted as zero by the sum
of factors, mean and standard deviation lines of the report.
Quick Reference
To create a frequency distribution sorted in alphabetic and rank orders, type:
To produce a frequency distribution sorted in alphabetic order only, type lista instead of
list. For a distribution sorted in rank order only, type listr instead of list.
A frequency distribution, as shown in the example on the next page, is created with the
list statement, as follows:
where c(m,n) is the column field whose contents are to be listed and $text$ is the heading
to be printed at the top of each page. If no heading text is given, the heading ‘Frequency
Distribution’ is used instead.
The list statement, as shown above, produces both the alphabetically sorted and the rank-sorted distributions. To request an alphabetic distribution only, use lista in place of list; for a rank-order distribution only, use listr. For example, one statement might produce a frequency distribution of the contents of c(107,108) sorted in numeric order, while another might generate a list of car brands sorted in alphabetic order. Additionally, you may use subscripts to represent the column numbers: if the field is given as c(t1,t1+4) and t1 has a value of 36, Quantum will list the values found in columns 36 to 40.
The rules for double quotes in the text are the same as for holecounts, that is, you must
precede them with a backslash.
The list in the diagram below shows a frequency distribution for the column field c(123,125). It was created by a list statement naming that field and the heading text PRICE PAID.
Since it was run on a data file containing 200 respondents, the total is 200.
Let’s start with the first table – the alphabetical sort. The figures in the column headed
‘string’ are the values found in columns 123 to 125, in this case, the price paid for a bottle
of mineral water. The next column (item) tells us how many times each code occurred in
those columns – that is, how many people paid each price. We can see the actual number
of people and also what percentage of the total sample that is. For instance, 31
respondents paid 111p which is 15.5% of the total (200).
The columns labeled cumulative show accumulated totals and percentages for each value
found. There are 86 respondents who paid between 111p and 114p, and these are 43.0%
of the total respondents.
The second table shows exactly the same information presented in rank order, with the most frequently occurring value first. In the example this is 212: 41 respondents, or 20.5%, paid 212p for a bottle of mineral water.
Unlike count, if list is part of a loop, it will be executed once for each pass through the
loop. All values found will be entered in the same list: Quantum does not create a separate
listing for each pass through the loop.
PRICE PAID
Total = 200 Alphabetical Sort
Number of categories = 14
Number of numeric items = 200
Sum of factors = 32218.00
Mean Value = 161.09
Std deviation = 67.97
PRICE PAID
Total = 200 Rank Sort
Quick Reference
To create a multiplied or weighted frequency distribution, type:
where text is the frequency distribution title and c(m_start,m_end) is the field in the C
array containing the multiplier or weight for each record.
For a distribution sorted in alphabetic or rank order only, type lista or listr as appropriate
instead of list.
As with count, c(m,n) is the column field whose values are to be listed, text is the optional
heading to be printed at the top of the page, and c(x,y) is the field containing the
multiplier. Multipliers may either be part of the original data, or they may be created
during the edit, in which case they must be placed in the C-Array with a wttran statement
before the frequency distribution is requested.
☞ For further details on weighting and wttran, see section 25.9.
Multiplied frequency distributions are generally required when you are producing weighted tables and you want to check that you have the correct number of people in each row of a table.
In earlier chapters we discussed ways of examining the data for a set of records (with
count) or for an individual record (with write). In general, however, we want to check the
validity of the data for individual records by putting in the edit a set of testing sentences
which will tell us not only whether a record contains an error but also what that error is.
There are two types of checking sentence. The first involves checking whether a column
contains the correct type of coding (single-coding/multicoding) and whether the codes
in that column are valid. Take the question on a respondent’s sex which may be Male,
coded c106’1’, or Female, coded c106’2’. c106 must be single-coded since no person
can have two sexes, and the only codes which may appear in that column are 1 and 2.
Any record in which c106 is not single-coded with a 1 or a 2 will be flagged as incorrect.
The second type of checking involves making sure that columns whose contents depend
on the contents of other columns contain the correct codes. For instance, suppose the
questionnaire asks whether the respondent has ever used a particular brand of washing up
liquid. The answer is coded into c125 as ’1’ for Yes and a ’2’ for No. If the answer is Yes,
the next questions concerning price and quality are asked. If c125’2’, indicating that the
respondent has not used that brand of washing up liquid, the following columns must be
blank. Conversely, if c125’1’, the following columns must be coded according to the
codes on the questionnaire.
11.1 require
Both tasks listed above can be carried out using if but sometimes they can become very
complicated and repetitive. Therefore, Quantum has an additional testing statement,
require, specifically designed to increase the efficiency of this checking process.
Column Validation
Tests columns against a given set of characteristics and deals with records not meeting the requirements according to a specified action code.
The actions which are carried out when the stated conditions are violated are determined
by an error action code defined either in the require statement itself or in a global
statement placed at the start of the edit.
☞ The error action code is discussed in the section below entitled "The action code".
The require statement has three forms, depending upon the function it performs, and
these are described in the subsequent sections. Each one must start with the word require
which may be abbreviated to R.
Quick Reference
To validate columns and codes, type:
where code is the error action code, condition is the type of coding required, and col1 and
col2 are the columns or fields to be tested.
For example:
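A statement along these lines (a sketch using the nb code type and the action code notation described later in this chapter) would do this:
r /5/ nb c110,c125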
Our example checks that columns 110 and 125 are not blank (nb). Any records in which
this is not the case are written out to a new file and rejected from any tables that may be
produced (/5/).
Quick Reference
To define a default error action code, type:
rqd number
The action code is a number between 0 and 7 which tells Quantum what to do with
records that do not match the required conditions (e.g., are blank when they should
contain codes). It may either be entered as a parameter on each require statement or, if it
is the same for all statements, on an rqd statement.
0 Print a summary of errors only – records are not listed individually, but a count is kept of the number of records failing each require statement. This is printed out at the end of the run.
1 Reject the record from the tables without printing it.
2 Print the record in the print file, out2, but keep it in the tables.
3 Print the record and reject it from the tables. This is the default.
4 Write the record into the output data file, punchout.q, but keep it in the tables.
5 Write the record into the output data file, punchout.q, and reject it from the tables.
6 Print the record in the print file, out2, and write it into the output data file, punchout.q.
7 Print the record, write it into the output data file, punchout.q, and reject it from the tables.
To write a statement which would print out incorrect records but include them in the
tables, we would write:
r /2/ ....
Similarly, to have all incorrect records printed in the print file, written into the output data
file and rejected from the tables, we would write:
r /7/ ....
In both cases the action code is part of the individual require statement, but where the
same action applies to all requires, it is quicker and more efficient to define the action
code on an rqd statement at the beginning of the edit. For instance, if all erroneous
records are to be written out and rejected we would write:
rqd 5
The default action, used when no code is given on either the require or an rqd statement, is to print the record out and reject it from the tables (the equivalent of rqd 3).
Checking with require can be as simple or complex as you like. In this section, we will
start with the simplest checks and deal with each extra feature in turn. We will assume,
unless otherwise stated, that the error action code is the default Print and Reject (code 3)
and will omit it from most of the examples accordingly.
The most basic form of the require statement simply checks whether the column or field
of columns contains the correct type of code; it does not check the individual codes
themselves. Code types may be:
b Blank
nb Not blank (i.e., single-coded or multicoded)
sp Single-coded (literally, single-punched)
spb Single-coded or blank
One of these types must follow the word require since it tells Quantum what to check for.
All that remains is to say which columns are to be inspected; just list each column or field
of columns at the end of the statement. If more than one column or field is defined, each
one must be separated by a comma.
----+----1----+----2----+----3----+----4----+
002411123481231&- *1927235537*&& 1 1 1
The statement:
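(a sketch, using the nb code type described above)
r nb c10,c(25,35)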
checks that columns 10, and 25 to 35 inclusive are not blank – they may contain any
number of codes. This record satisfies both conditions so it passes on to the next
statement in the edit.
The statement:
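(again a sketch, this time using sp)
r sp c11,c15,c23,c41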
looks to see whether columns 11, 15, 23 and 41 are single-coded. In our record they are,
but if this were not the case (say c11’123’) the record would be printed out and rejected
from any tables that may be produced. Additionally, Quantum would tell us ‘Column 11
is 123’.
✎ Be careful when using field specifications with require: the condition applies to
each column individually, not to the field as a whole. For instance:
r sp c(1,4)
means that each of columns 1, 2, 3 and 4 must contain one code. It does not mean
that the field must contain one code overall. To check that a field contains one code
only, use numb.
☞ numb is described in the section entitled "Comparing data variables and data
constants" in chapter 5.
Very often some columns on the questionnaire are not used, so you might like to check
that all such columns are blank in the data file. In our example, let’s say that columns 51
to 70 are not used. To check that there are no stray codes in these columns we would
write:
r b c(51,70)
Quick Reference
To define a message to be printed when a record fails a test, type:
When incorrect records are printed out, require automatically prints a short text
describing the error. Normally, it tells you what codes were found in the column which
is wrong, but if this is not what you want, you may define your own error text by entering
it enclosed in dollar signs at the end of the statement. This text will then be printed in
place of the default text when errors are found. For example, if c329 is multicoded when
it should be single-coded, the statement:
r sp c329
will print the whole record and tell us which codes were found in that multicode:
Column 329 is 13
Instead of being told which codes the column contains, you may prefer to see a message
linking the error to a question on the questionnaire. In this case you will need to add your
own error text as follows:
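(a sketch; the question reference is illustrative)
r sp c329 $Q45 must be single coded$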
Quick Reference
To check for specific codes in a column, type:
where codes1 are the codes to be tested for in column or field col1, and codes2 are the
codes to be tested for in column or field col2.
Any codes which are present in col1 but are not listed in codes1 are ignored. The same
applies to any other column and code pairs listed.
Sometimes it is not sufficient to check just the type of coding, and you will want to know
whether the codes found are valid for that column. To do this, we use the information
given in the previous section as a base, and add on our first ‘optional extra’.
To check whether a column or field of columns contains specific codes, follow the
column specification with the codes to be checked, enclosed in single quotes. For
example:
r /5/ sp c223’1/5’
tells us that column 223 should be single-coded within the range of codes 1 through 5.
Any other codes in this column are ignored. Thus, a record in which c223’14’ is
incorrect because it contains two of the listed codes, whereas a record in which c223’27’
is correct because it contains only a 2 from the range ’1/5’. Of course, any record which
does not contain a 1, 2, 3, 4 or 5 at all is also incorrect, regardless of whether or not it is
single-coded: c223’9’ is just as wrong as c223’789&’.
Codes may also be defined with all other code types, thus:
r /3/ nb c156’2/6’
If c156 does not contain at least one of the codes 2 through 6 (regardless of anything else
it may contain) the record is printed out. Column 156 may be multicoded as long as at
least one of the codes is within the required range.
Even though it checks for blanks, require b may be followed by columns and codes. You
would do this when you are checking that a column is either blank or, if not blank, that it
does not contain certain codes. Here’s an example to clarify this:
r b c134’1/8’
This statement tells Quantum that column 134 must never contain any of the codes 1
through 8: only ’09-&’ or blank are acceptable. This is the opposite of r sp and r nb, both
of which list valid codes. Any record failing this condition will be printed and rejected
via the default action code 3.
Exclusive codes
Quick Reference
To check that a column or field contains no codes other than those listed, type:
If col1 contains any codes other than those given in codes1, the test is false.
Now that you know how to check codes, the next thing to discuss is how to check that all other code positions are blank. Statements of the form:
r sp ca’p’
accept all records containing only one of the codes ’p’ in column a, regardless of what other codes are also present. To check that a column contains only the listed codes and
nothing else, follow the code specification with the letter O (for only) in upper or lower
case. For example, to indicate that c356 must be single-coded in the range ’1/5’ and that
all other positions (’6/&’) must be blank, you should type:
r sp c356’1/5’o
Any of the following would cause the record to be printed and rejected:
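(some illustrative possibilities)
c356'13'   two of the listed codes are present
c356'17'   a listed code plus a code from the range '6/&'
c356'8'    none of the listed codes is present
a blank column, since it is then not single-coded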
Require may define conditions for more than one column. Just follow each column with
the code positions to be checked and separate each set with a comma:
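(a sketch; the individual code lists are illustrative)
r sp c164'1/5',c165'1/7',c166'1/3',c167'1/4',c168'12' $q10a/e$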
Here the columns to be checked are consecutive but have been listed separately because
they each have different sets of valid codes. If all columns could be single-coded in the
range 1 to 7 we might abbreviate this to:
r sp c(164,168)’1/7’ $q10a/e$
since this notation means that each column in the field must be single-coded within the
given range rather than that the field as a whole may contain only one of those codes.
Quick Reference
To define a correction code to be used as a replacement for codes which fail the required
condition, type:
new_code is the code or codes to be inserted in col1 if it fails the test condition. Any codes
already in that column are overwritten.
As you know, records found to have errors are printed, coded and/or rejected according
to the error action code. When the run is finished you will look at these records and, if
possible, correct the errors by using the on-line edit or correction file facilities.
☞ See chapter 12 for information about on-line editing and the corrections file.
Occasionally you will know in advance what to do with certain types of error; say, for
instance, the respondent’s sex has been miscoded. You may decide or be told to recode
this person as a ’3’ in the appropriate column indicating that the sex was not known. The
way to do all this in one go is to write the normal require statement that checks columns
and codes, and to follow the code specification with a colon (:) and the replacement code
(in this case ’3’) enclosed in single quotes, thus:
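(a sketch of such a statement)
r sp c106'12':'3'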
Any record in which c106 is not single-coded with either a ’1’ or a ’2’ will have the contents of c106 overwritten with a ’3’. Compare this with the longer way of achieving a similar result using if:
if (numb(c106’12’).ne.1) c106’3’;
+write $c106 incorrect$
When working with fields, it is not possible to define replacement strings for the field as
a whole. You should, however, note that if a single replacement code is given for a field
of columns, any incorrect columns in that field will be overwritten with the replacement
code; the correct columns remain untouched. For example, if, say, only the codes 1 to 5 are valid and the replacement code is ’&’, a field that originally contains:
+----4----+
1927
would be rewritten as:
+----4----+
1&2&
because the 9 and the 7 are outside the valid range.
✎ If you use this facility, remember that the replacement code is an alteration to the
data, and as such is operative only as long as each record is in the C array. If you
want to save these modifications you must include a statement in your edit which
will write records to another file. Statements which write out new data files are split
and write. Alternatively, you can use one of the action codes which writes records
to the output data file.
☞ For information about split, see section 12.4.
For information about write, see section 7.1.
Quick Reference
To define defaults for all columns or fields tested, type:
The defaults may be overridden for an individual column by following the column with
the required coding, only flag and replacement code as usual.
By now you will have guessed that require statements can become lengthy things,
especially when specific codes have to be checked, replacement characters defined and
error texts entered. In many cases some, if not all, of these items will be common to the
majority of the columns listed in the statement; for instance, several non-consecutive
columns may have the same set of valid codes.
When this happens you may enter these common items at the beginning of the require statement as defaults for that statement. There are several ways of doing this. For example, a statement that checks whether columns 127, 129, 131 and 133 are single-coded in the range 0 to 9 or blank can either repeat the codes ’0/9’ after each column, or name them once at the start of the statement as a default. In both cases, if the − or & codes appear in any of these columns, or if the columns are multicoded, the offending records will be printed and rejected.
Defaults defined at the start of a require may be overridden for an individual column or field by following that item with the new specification. For example, a statement whose default codes are ’1/5’ but which gives column 20 its own code list ’1/3’ tells us that columns 10, 12 and 15 must be single-coded in the range 1 to 5 while column 20 must be single-coded in the range 1 to 3.
The Only flag may also be given as a default. A statement whose defaults are the codes ’1/5’ and the Only flag checks that columns 10, 12, 15 and 24 are single-coded in the range 1 to 5 and that none of the codes ’6/&’ are present in those columns. Column 20, if it is given its own code specification of ’1/7’, overrides not only the default codes but also the Only operator: Quantum will check that c20 contains only one of the codes 1 to 7, but it will ignore anything it finds in the range ’8/&’.
A default replacement code may be added in exactly the same way. It then applies to all columns named on the require statement, even though column 20 has a different set of valid codes.
Quick Reference
To evaluate a logical expression, type:
Require can be used to evaluate a logical expression. If the expression is false, the record
will be dealt with according to the specified (or default) action code. If the expression is
true, the program continues with the next statement.
This type of require also has four parts, two of which are optional: the word require (or r), an optional error action code, the logical expression enclosed in parentheses, and an optional error text enclosed in dollar signs. For example:
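A sketch of such a statement, assuming the .and. operator and the n (not) notation, is:
r (c133'4'.and.c140n'5')$Cols 33/40 incorrect$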
says that c133 must contain a ’4’ and c140 must not contain a ’5’. If one or other or both
expressions are false, Quantum prints the record out with the message ’Cols 33/40
incorrect’ and rejects it from the tables.
This type of require statement is often used to check the number of codes present in a
column or group of columns. For example, if the questionnaire specifies that the
respondent should name no more than three products in his answer, you might write:
r (numb(c139).le.3)
causing any record in which column 39 is multicoded with more than 3 codes to be
printed and rejected. This statement has no error text, so any records printed will be
followed by the require statement itself.
Quick Reference
To test whether a group of logical expressions all have the same logical value, type:
Require can evaluate groups of expressions and perform given tasks depending on
whether all expressions are true or all are false. When all the expressions have the same
value (i.e., all true or all false) Quantum continues with the next statement in the program,
whereas if some are true and some are false, the record being tested will be dealt with
according to the given (or default) error action code.
This type of statement is generally used to check routing patterns. For example: if a ’2’
in c125 means that the respondent did not try Brand A washing powder, we would expect
columns 126 to 145 which record his opinion of it to be blank. On the other hand, if he
tried the washing powder, we would expect to find his opinions about it coded in columns
126 to 145. This can be written:
r = (c125’2’) (c(126,145)=$ $)
which says that to be accepted, a record must either have a ’2’ in column 125 and blanks
in columns 126 to 145, or something other than a ’2’ in c125 with at least one code
somewhere in c(126,145). Here’s some data to clarify this:
----+----3----+----4----+----5
2 15 is accepted, so is
----+----3----+----4----+----5
15 42674 262&03 37 73 but
9 4 0
----+----3----+----4----+----5
2 6 8 15 is rejected, so is
----+----3----+----4----+----5
3 635
The first example is accepted because both expressions are true; the second is accepted because both expressions are false. The third and fourth examples are rejected because one expression is true and the other is false.
Note that in this example, if column 125 does not contain a ’2’ we are only checking that
columns 126 to 145 contain at least one code; we are not checking whether those codes
are correct.
When Quantum executes a require statement, it sets the variable failed_ to True if the
data fails the require statement or to False if the record passed the requirement. You can
then test whether failed_ is True and take whatever actions you wish. For example, if you
are checking that the respondent’s sex is coded as a ’1’ or a ’2’ only, you may wish to
blank out the column if it contains any other code or codes. You could write this as:
r sp c123’12’
if (failed_) set c123’ ’
The test for failure is made on the last require statement executed for the current record.
This may not always be the most recent require statement in the program, and it may not
be the require statement you intend Quantum to execute. If you write:
r sp c112’1/5’
if (c115’1’) r b c116
if (failed_) set c116’ ’
the test for failure could apply to either of the previous statements. If column 115 does
not contain a ’1’, the second require statement will not be executed and failed_ will be
True if column 112 is not single-coded in the range ’1/5’. If column 115 contains a ’1’,
then failed_ will be True if column 116 is not blank.
You can get around this potential problem by setting failed_ to zero (the equivalent of
False) just before the require statement you wish to test. For instance:
r sp c112’1/5’
failed_ = 0
if (c115’1’) r b c116
if (failed_) set c116’ ’
Require is often part of an if statement saying "If this is true, then that also must be true".
In our previous example with r= we were saying two things:
a) if the respondent didn’t try Brand A, the columns associated with it must be blank, or
b) if he tried Brand A, there must be a code in at least one of the associated columns.
Sometimes this type of test is too stringent and will reject records in which the data is
perfectly correct. For example, the extra questions for people who tried the product may
not contain a specific code for Refused or No Answer, so anyone who tried the product
but refused to answer the extra questions would have blanks in the relevant columns. This
data is perfectly correct but would be rejected by the r= statement which expects at least
one column to contain a code. Therefore, we need to write a statement that will only
check whether columns 126 to 145 are all blank if the respondent didn’t try the product;
if the respondent tried the product we do not care whether he answered the extra
questions or not. The statement for this is:
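A sketch of such a statement, using an if with an else clause, is:
if (c125'2') r b c(126,145);else; r spb c(126,145)'1/7'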
This says that if the respondent did not try Brand A, all columns associated with it must
be blank, but if he tried the product we expect those columns to be single-coded in the
range ’1/7’ or blank.
One can also make require statements apply to smaller sets of data by having records for
which they would be irrelevant go around the statements. Let’s say c112 records whether
there are children in the household. If c112’1’ there are children and c113 and c114 must
contain answers. We could write:
if (c112n’1’) go to 30
r nb c(113,114)
30 continue
This means that all irrelevant records (respondents without children) would not be tested.
✎ This system makes sense when there are several requires and you want to avoid a
whole set of identical if statements. It’s more efficient and it’s easier to follow.
Remember, as well, that you can put in comments to remind yourself what you are
doing and why.
It is always possible to deal with data which has been incorrectly coded and/or entered.
If the errors themselves cannot be corrected because correct codes cannot be determined,
the incorrect data can be collected under some miscellaneous heading in the tabulations.
However, a cleaner data set can be obtained by correcting or removing invalid data
whenever possible.
• Replace the incorrect codes with specific codes using edit forcing statements.
• Write a file of corrections to be merged with the original data when it is read in by a
Quantum program.
Changing the contents of the original data file is not a function of Quantum: you will need
to use the data editing program, ded, for this. If you do need to edit the original data file,
you should always take a copy of it first in case your editing does not have the desired
effect.
This section does not introduce any new keywords; instead it tells you how to combine
the statements that you already know in order to clean your data.
A record which generates too many error messages, or which is clearly incorrect can be
removed, as noted. Suppose its serial number is 2004. Then we have:
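Assuming the serial number is held in columns 101 to 104, the statement might be:
if (c(101,104)=$2004$) reject; return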
This rejects the record from the rest of the edit and the tabulation section as well.
This statement should be at the beginning of the edit to avoid unnecessary editing of a
useless record.
Columns within a record can be removed by blanking them out or setting them to a
common reject code, often a minus or ampersand. For example:
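A sketch of such a statement (it assumes that assigning a short blank string to a field blanks out the whole field) is:
if (c125n'12') set c125'&'; c(126,145)=$ $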
All records in which c125 contains neither a 1 nor a 2 will have the contents of that column replaced with an ampersand, and whatever is in c(126,145) blanked out. As a real-life example, suppose a 1 in c125 means that the respondent visited the market, and a 2 in that column means he did not. Information about purchases made at the market is stored in c(126,145). If column 125 contains neither a 1 nor a 2, we cannot clearly establish whether
or not the respondent visited the market so we set c125 to a special code and blank out
any information about purchases.
Inserting correct data is generally more difficult than removing invalid data, because you
very often don’t know what the correct data is. However, if you do know, you can correct
the data record by record, or make the same correction for any record which is incorrect.
For instance:
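Assuming the serial number is held in columns 101 to 104, we might write:
if (c(101,104)=$2222$) set c112'2'; c(113,114)=$  $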
corrects the record whose serial number is 2222 by setting a 2 into c112 and blanking out
c(113,114).
If you do not know what the correct data is, you may decide to replace the incorrect code
or codes with a valid code chosen at random. For example:
if (c(101,104)=$3625$) c145=rpunch(’1/5’)
replaces whatever was in column 145 with one of the codes 1 through 5 for the record
whose serial number is 3625.
Note that when correcting data on a record-by-record basis, it is more convenient to use
the methods outlined below.
Quick Reference
To allow interactive correction of errors, type:
online [label_number]
at the point at which you want to make corrections. label_number is the label of the
statement to execute when the record is returned to the main edit with an rt command.
The default is to return to the start of the edit.
When an incorrect record is found, the current contents of the C array are written to the
print file, out2, as usual, and a message is displayed on your screen indicating the record’s
position in the data file. Any messages associated with the write or require statement
finding the error are also displayed, and you then have the opportunity to accept the
record as it is, reject it, correct it or re-edit it. The record itself is not displayed unless you
request it.
In its simplest form the statement is just:
online
You may put in as many online statements as you like, but as long as there is one online
statement in the edit, on-line editing will be possible both at the point where the statement
occurs and also at the end of the edit. If there are no errors to be corrected, Quantum
ignores the online statements.
Once an incorrect record has passed through the on-line edit, you may leave it to continue
through the rest of the standard edit until it reaches the end statement or you may return
it to the start of the edit to be retested. If you prefer, you may name a statement to which
records should return simply by giving that statement a label number and following
online with that number. For example:
online 45
✎ Runs containing on-line edits must be run from a terminal rather than in the
background until the edit section is finished; otherwise you will not know when
there is a record awaiting correction.
Any corrections made during on-line editing are effective only during the current
run unless your edit contains one of the commands split or write to create a new data
file. If your program calls the on-line editor but does not contain split or write, a
warning message will be displayed when your program is checked.
Like any other editor, the Quantum on-line editor has its own set of commands, many of
which are similar in appearance and function to statements you would write in a normal
Quantum edit. There are three types of editing command: those which determine what
happens to the record, those which correct errors in the record and those which terminate
on-line editing either for the individual record or for the file as a whole.
Quick Reference
To display the record being edited, type:
di [column(s)]
Sometimes it is easier to see the error if you print out the incorrect column or columns
separately rather than looking at the whole record. To see a column or field only, just
follow the di command with the numbers of the columns you wish to see. For example,
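(two illustrative commands)
di 115
di c(115,130)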
Column fields may be entered as just two column numbers separated by a comma, the
parentheses and the C being optional. Thus, the second example could equally well be
written:
di 115,130
When a single column is displayed, the individual codes comprising a multicode are
shown, but when fields are displayed, a ruler is printed and multicodes appear as asterisks
(*). Here is an example:
-> di 25,35
+--- 3 ---+
613*9 2 144
-> di 28
159
->
In the first example, the asterisk represents a multicode, whereas in the second example
where only one column is displayed, the codes 1, 5 and 9 are a multicode in column 28.
Correcting records
Quick Reference
To overwrite the current contents of a column or field with a new code or string, type:
s column(s) codes
To add codes to a column or field, type:
e column(s) codes
To delete codes from a column or field, type:
de column(s) codes
The words used for correcting records are set, emit and delete which are usually
abbreviated to s, e and de. They work in exactly the same way as their counterparts in the
ordinary edit section: s overwrites the original contents of a column or field with new
information; e appends a single code to the codes that are already in a column and de
removes one or more codes from a column leaving the remainder intact.
There are many variations of these commands, all of which are equally correct. Just
choose the one that you find most convenient. Here are some examples. The first group
are set statements for overwriting the contents of a column or field with the given code
or string of codes.
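They are of this general kind (a sketch only; column 6 and the codes shown for the multicode are illustrative, and the note below explains which punctuation marks are optional):
set c5'7'        s c5'7'        set c5=7        s c5=7
set c6'123'      s c6='123'     s c6=123
set c(123,126)$4567$             s c(123,126)=$4567$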
If you want to overwrite a single column with a single code, use one of the four formats
on the first line. In all cases you may type in the full command word (set) or the
abbreviation (s). All four variations replace whatever is currently in c5 with a code 7.
The examples on the second line are for overwriting a single column with a multicode.
Notice that if you use the = notation, the single quotes enclosing the multicode are
optional.
The last line illustrates how to overwrite a field of columns with a string – in this case to
replace the current contents of columns 123 to 126 with the codes 4, 5, 6 and 7
respectively.
In all on-line set statements you may omit the set or s at the beginning of the command.
When it comes to adding codes to columns, the on-line editor has an option that the ordinary editor does not. Whereas the ordinary emit statement only allows you to specify single columns, the on-line editor also allows you to emit strings of single codes into a field of columns.
The same applies to deleting codes: the on-line edit allows you to delete codes either from a single column or from a field.
✎ In all the examples we have just shown, the c, equals sign, single quotes and dollar
signs are optional as long as the components of each statement are separated by
spaces. Additionally, in assignments, set (or s) is optional.
Whenever you alter columns with set, emit or delete, the on-line edit checks that the
columns you are editing are within the range of the C array for the current job. If you are
using the default array of 1,000 cells, c1001 and above are out of range for editing.
Quick Reference
To accept a record whether or not it has been corrected, type:
ac
To terminate the edit and send the record to the tabulation section, type:
rt
To reject the record from the tables but continue the edit, type:
rj
The following commands may be used to determine a record’s path through the
remainder of the edit section and the tabulation section:
ac (or accept) accepts the record up to the point at which the online statement occurs,
whether or not it has been corrected. The record continues on through the rest of
the edit and will only be re-presented for correction by other online statements or
at the end of the edit if other errors are found. Records accepted in this way are
written to the clean data file if split or write are used.
rt (short for return) terminates the edit for that record: that is, the record is assumed
to have reached the end statement. If split or write has not yet been reached, the
record will not be written to a ‘clean’ data file even though it will be included in
any tables produced by the run.
rj (or reject) rejects the record. The record continues through the edit unless it is
terminated with rt. The record is copied to the dirty data file.
Quick Reference
To add new cards to the current record, type:
ad card_num1 [card_num2 ...]
To remove cards from the current record, type:
rm card_num1 [card_num2 ...]
The add command adds new cards to the output data file and rm removes cards from it.
To add a card type, type add or ad followed by the number of the card type to be added.
If you are adding several different cards at once, separate the card type numbers by
spaces. Quantum will then set the appropriate thisread variable to be true so that the new
card type will be written out with the rest of the data. Thus:
-> ad 3 4
will set thisread3 and thisread4 to be true so that the new cards 3 and 4 will be written
out. Each card will contain as many columns as the record length defined for the current
run. If the C array already contains data for a card 3 or 4, Quantum issues an error
message to this effect.
Removing cards is exactly the same, except that the appropriate thisread variables are
reset to false to prevent the unwanted cards from being written out. It does not alter the
data in your original data file. If you try to delete a card that is not currently in the C array
(i.e., the thisread variable is already false) an error message is displayed.
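For example (mirroring the ad example above):
-> rm 3
resets thisread3 to false so that card 3 is no longer written out.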
Quick Reference
To return the record to the start of the main edit section, type:
ed
The edit command (abbreviation, ed) re-edits the record by sending it back to the start of
the edit or to the statement number given with online. If no more errors occur, the record
is copied to the clean data file.
If you prefer, you may hit the return key instead of typing ed.
Quick Reference
To cancel on-line editing for the rest of the data file, type:
ca
cancel (abbreviation, ca) cancels on-line editing but continues passing records through
the standard edit program. Any errors found subsequently are not displayed on the screen
for correction, but records are still placed in the clean or dirty files as appropriate.
The on-line edit commands we have just described are the defaults which are
programmed into Quantum. If you wish, you may redefine these command names or
translate them into a language other than English, or define your own abbreviations. You
do this in the translatable texts file.
Quick Reference
To write correct records out to a clean data file and incorrect records out to a dirty data
file, type:
split [only]
at the point at which records are to be written out. Type split only if the edit does
not alter the contents of the record and you want to copy records directly from the original
data file rather than from an intermediate file.
Clean and dirty data files are the terms used to refer to files of correct and incorrect or
rejected records created automatically by the edit statement split.
Each time a record is read and reaches split, it is written out to the appropriate file in its
current state. If any changes have been made with assignment statements, emit, delete,
priority, require or the on-line edit, they will be saved in the clean data file if the record
is now correct or in the dirty data file if the record still contains errors or has been
rejected.
Split may occur several times in the edit, but each record will be written out once only. In the example below, the second split is redundant since all records will have been written out by the first one. Suppose that the record being checked has a ’2’ in c234, a ’3’ in c309 and a ’5’ in c146.
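A sketch of the edit in question (compare the corrected version shown below) is:
r sp c234'1/5',c309'1/5-&' :'&'
split
if (c146'12') emit c180'1';else; reject
split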
Let’s suppose that the record has reached the require statement without error. Since
c234’2’ and c309’3’, the record is correct so it is copied to the clean file. However, when
the next statement is read and the contents of c146 are checked, we find that it contains
a ’5’ which means that it must be rejected and should be copied to the dirty file by the
second split. This does not happen because it has already been written out by the previous
split. For this example to place the record in the dirty file instead, it should read:
r sp c234’1/5’,c309’1/5-&’ :’&’
if (c146’12’) emit c180’1’;else; reject
split
Split is often used at the end of an edit after online. This causes all records found in error
by write and require statements to be offered in the on-line edit for correction and then
saved in the clean or dirty file according to the type of on-line commands you use. For
example, if a record is flagged as incorrect and you correct those errors, the record will
be placed in the clean data file. The same is true if you use ac to accept the record even
if you do not make corrections. If you reject the record with rj, the record will be placed
in the dirty data file. By putting both statements at the end of the edit you can be sure of
seeing all erroneous records and of saving all records in their final state.
If some records are rejected from the run using reject;return, these records will not be
included in the clean or dirty files unless the data is split before the records are rejected:
split
if (c132n’1/9’) reject; return
In this example, because split appears in the edit before reject;return, all records will
appear in one or other of the clean or dirty files (depending on whether or not they contain
errors) even though records in which c132 does not contain any of the codes 1 through 9
have their edit terminated and are rejected from the tables.
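Compare this with the same two statements written in the reverse order:
if (c132n'1/9') reject; return
split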
Here, because split appears after reject;return, only records in which c132 contains any of the codes 1 through 9 will appear in the clean or dirty files. Again, which file the records are written to depends on whether or not they contain errors.
☞ Using reject and return separately and together is described in section 9.8.
By default, an intermediate data file is created for splitting. If the run does not contain
statements which alter the data (e.g., recoding with assignment statements, etc., or
creating new columns) then this file will be identical to the original data file. In such
cases, you may save disk space during the run by splitting the original data file instead
with the statement:
split only
When we talk about the original data file, we do not mean that Quantum alters your
original data file in any way; merely that it reads records directly from this file and
allocates them to the clean and dirty files rather than taking a backup copy of this file and
reading records from there.
✎ You may not use split only when the datapass reads input from another program
(e.g., when you use a corrections file to correct records rather than writing a forced
edit or using the on-line edit). Instead, you should run Quantum using the
corrections file only and write all records to a new data file. Then run the datapass
on this new data file.
If you do an on-line edit but forget split or write your changes will not be saved. Also
if you have created new cards and have not made thisread true for the new cards
(e.g., thisread3=1 for a new card 3), they will not be written out.
If you use split on a levels (trailer card) job, splitting is switched on for all levels and
must therefore be part of the top level edit. Additionally, it must appear once only
and must not be part of an if statement. A reject statement at any level rejects the
whole record (i.e., it will be written to the dirty file).
Quick Reference
To correct data using a corrections file, create a file called corrfile containing statements of the form:
serial ; corrections
serial /n ; corrections
The last method of correcting errors is to create a file of corrections which will be merged
with the original data when it is read by a Quantum program. The correction file must
exist in the directory or partition in which you will be running your job.
Corrections are made by comparing the serial number of the record currently in the C
array with the serial number given with each correction in corrfile. Consequently, all
serial numbers in corrfile must be in the same order as those in the data file. The format
for a correction record is:
serial ; corrections
serial /n ; corrections
for records containing trailer cards. In both cases, serial is the record serial number and
corrections are the corrections to be made. The /n in the trailer card format is the read
number defining the trailer card to be corrected; it can be found from the error listing. For
example, if our data contains a card 1, three card 2s and a card 3, and we want to correct
an error on the third card 2, the read number would be /3 because the third card 2 is read
into the C array during the third read. If /n is omitted, the read number is assumed to be 1.
As in the on-line edit, the s and the equals signs may be omitted. If the correction refers
to a field of columns, you may define a string of codes in place of a single code.
Any number of corrections may be specified for a record as long as each correction is
separated by a semicolon. The data to be corrected may be a single column or a field, and
the corrections may be single-codes or multicodes enclosed in single quotes or strings
enclosed in dollar signs. If the data variable is larger than the string it is to contain, the
string will be right-justified and padded with blanks. If the string is longer than the data
variable, a warning message is issued.
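A corrections file matching the description below might look like this (the corrections shown for serial number 123 are illustrative only):
10; c112'1'; e c212'3'; c314'34'; de c115'3'
123/4; c109'2'; c(113,114)=$21$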
The first record to be corrected is that with serial number 10. Column 112 is to be
overwritten with a ’1’, a ’3’ is to be added into column 212, column 314 is to be
overwritten with the multicode ’34’ and the ’3’ in column 115 is to be deleted.
The second correction is to the cards in the C array after the fourth read for serial number
123. Both corrections involve overwriting the original data with new codes.
✎ Correcting data with a corrections file is considerably faster than using a forced edit
of the form:
if (c(101,103)=$123$) c109’2’
Corrections in corrfile are made before the statements in the edit section of your program
are executed. If you are rerunning your previous job to correct errors and you have not
altered the edit in any way, you may save more time by telling Quantum to read the data
but not to recompile and load your program. This is done with the option –r on the
Quantum command line.
☞ For further information on options for Quantum runs, see chapter 34 and chapter 35.
The term missing values refers to data in numeric fields that is either non-numeric or
totally blank. You may find them in data gathered from questions of the type shown
below:
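An illustrative question pair (only the use of columns 9 and 10 for question 2 is taken from the text below) might be:
Q1  Did you buy this product in the last seven days?   Yes....1   No....2
Q2  IF YES: How many packets did you buy?   WRITE IN NUMBER (columns 9 and 10)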
If the respondent replies ‘no’ to question 1 or does not answer it at all, question 2 is not
asked and columns 9 and 10 are left blank. If the respondent replies ‘yes’ to question 1
then question 2 should be coded either with a numeric value or, perhaps, with && for a
don’t know answer. The blank data and && are missing values.
You may also find missing values when a numeric field is incorrectly coded with a
combination of numbers and letters. This is usually the result of mistyping when the data
is entered and can often be corrected by looking at the questionnaire itself and then
cleaning the data within the edit section of the run.
Missing values processing is an optional feature. If you use it, Quantum automatically
detects missing values and provides a variety of facilities for dealing with them in both
the edit and tabulation sections of your run. In the edit section you have:
• Automatic assignment of the special value missing_ to numeric fields that are blank or otherwise non-numeric.
• Manual assignment of the special value missing_ to variables of your choice within the edit.
• The ismissing function for testing whether a variable contains the missing value.
You can use missing values processing in the edit section, in the tabulation section, or
both. To switch it on in the edit section, type:
missingincs 1
To switch it off again, type:
missingincs 0
You may use these statements any number of times in the edit to toggle between using
and not using the missing values features.
✎ The missingincs statement is always executed wherever it appears in the edit. This
means that although the compiler will accept statements of the form:
if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a
missingincs 0 statement is read. It does not switch on missingincs selectively
for only those records that satisfy the expression defined by the if clause.
If a job contains an edit and a tab section and missing values processing is used in the
edit, the setting of missingincs carries forward from the edit to the tab section. If the edit
uses missing values processing but the tab section does not require it, remember to end
the edit with a missingincs 0 statement.
The general rules for non-numeric data variables in arithmetic assignments are as
follows:
• Blanks in an otherwise numeric field are ignored, but totally blank fields are read as
zero.
• &’s in an otherwise numeric field are ignored, but fields full of &’s are read as zero.
• Multicodes in an otherwise numeric field are ignored, but a field in which all columns
are multicoded is read as zero.
If you switch on missing values processing these rules are modified so that any field that
is not totally numeric or a combination of numbers and blanks is counted as missing.
Here is a table showing samples of data in a numeric field and the difference missing
values processing makes to the way that data is interpreted:
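(illustrative values, derived from the rules above, for a three-column numeric field)
Data in field      missingincs 0      missingincs 1
123                123                123
1 3                13                 13
(all blank)        0                  missing_
&&&                0                  missing_
1&3                13                 missing_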
If you print variables whose values are missing_ in a report file or write them out to a data
file, Quantum will show their values as −1,048,576 rather than as the word missing_.
If an arithmetic expression uses a variable whose value is missing, the value of the
expression differs depending on whether or not missing values processing is switched on.
If missing values processing is switched on the value of the expression is always
missing_. If it is switched off, the value of the expression is always zero. For example, if
c(1,3) contains the string ABC:
missingincs 1
t1 = c(1,3) * 100
sets t1 to missing_, whereas:
missingincs 0
t1 = c(1,3) * 100
sets t1 to zero.
If you have other values that you want to replace with the missing value in the edit, you
may do so by typing a standard assignment of the form:
variable_name = missing_
Since missing_ is a special value you cannot use statements of the form:
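(for instance, an equality test such as)
if (t4.eq.missing_)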
to test whether a variable has the special missing value. Instead, use the function:
ismissing(variable_name)
For example:
if (ismissing(t4)) ....
Subroutines can be used to make your program more readable by eliminating the need to
use go tos in certain circumstances. If you use a subroutine with a name describing its
purpose it will be immediately apparent what is to be done, and it will mean you don’t
have to go skipping backwards and forwards in the program in order to understand what
it is doing.
Quick Reference
To call a subroutine, type:
call routine[(arguments)]
To use any subroutine, enter the call statement at the point at which the routine is
required. The call statement simply says:
call routine[(arguments)]
where ‘routine’ is the name of the subroutine to be used and ‘arguments’ are other items
of information required by the routine. These will differ from routine to routine and are
clearly explained in the appropriate section below.
Quantum has its own library of subroutines which you may call from within your
Quantum program.
Quick Reference
To load data from a look-up file, type:
call fetch($file_name$,key_col,put_col)
To load data from a look-up file and generate a report of used and unused keys, type:
where keys is a number whose value determines whether used or unused keys are listed
in the report.
Sometimes you will have additional information available that is not part of each
respondent’s data record but that nevertheless needs to be read into the C array for use in
the analysis. For instance, suppose we did some additional work on a chocolate
purchasing survey and collected information about the cost of various types of chocolate
bars. We can transfer this information to the array in two ways. We can either write an
edit to check which brand has been bought and then copy the appropriate price into the
record using if and an assignment statement, or, in a much simpler operation, we can put
the costs into a look-up file and call them up as required with the fetch statement.
A look-up file is one which contains information to be transferred into the data record at
a given point. Each item in the file has a unique key associated with it; this is very often
the code representing that information in the data. If brands A, B, C and D are represented
by the codes 1 through 4 in the data, the costs for those brands must have the keys 1
through 4 as well. Similarly, if a Ford Escort car is coded 274 in the data, the additional
information for a Ford Escort would be identified by the key 274 in the look-up file.
Data in the look-up file must be sorted in alphabetical order and must be formatted as
follows:
1. The first line must contain exactly two whole numbers anywhere on the line. The first
is the key length, the second is the total record length including the key.
2. All other lines must start with the key which may be followed by any other
information as necessary.
The look-up for our chocolate survey is named costs and is as follows:
1 4
1 14
2 15
3 21
4 17
The first line tells us that the key is 1 character long and that the record length is four
characters long (the space in column 2 is part of that information). The other lines refer
to the individual chocolate bars. Brand A (coded 1) costs 14 pence, Brand B (coded 2)
costs 15 pence, Brand C costs 21 pence, Brand D costs 17 pence.
✎ Fetch files may contain up to 32,767 records. If you require more than this, create
two files and use if statements to select the correct file for each respondent.
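As a sketch only (the file names, columns and condition here are illustrative rather than
taken from any particular survey), the choice between two look-up files might be written:
if (c5'1') call fetch($costs1$,c135,c136); else; call fetch($costs2$,c135,c136)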
To transfer data from the look-up file to the record in the C array, enter the fetch statement
in your edit at the point at which data is to be copied. Fetch is a C routine and is invoked
by typing:
call fetch($file_name$,key_col,put_col)
where file_name is the name of the look-up file, key_col is the start column of the key in
the record, and put_col is the start column of the field into which the data is to be copied.
Data copied from a look-up file does not retain its key at the beginning. If you look at the
example in the previous section, the data transferred for records with key 1 will be $ 14$.
Suppose, in our chocolate survey, that the first brand bought is stored in c135, the second
in c150 and the third in c165. Brands are coded 1 through 4 as noted above, and costs are
to be copied into fields starting in columns 136, 151 and 166 respectively. To deal with
all three purchases we will call fetch three times, once per purchase. For the first purchase
we would write:
call fetch($costs$,c135,c136)
When the first record is read, Quantum inspects c135 and compares its contents with the
first field of the look-up file. If c135 contains a ’1’ (brand A was bought) and a matching key is
found in costs, the information associated with that key is copied into the C array starting
at c136. In our example, brand A chocolate bars cost 14 pence so c(136,138) will contain
$ 14$. If a matching key cannot be found in costs, the destination area c(136,138) will be
blanked out.
Calls for the second and third purchases would be entered as:
call fetch($costs$,c150,c151)
call fetch($costs$,c165,c166)
When you read additional data in from fetch files, Quantum writes a summary of what it
has done to the file out2. The format of the report is as shown here:
This tells you that the run used two fetch files. The first file, cost1, contained seven keys;
five were present in the data and two were not. The file was called 893 times altogether
and 869 times the key in the data was found in the fetch file. The 24 misses refer to keys
that were present in the data but not in cost1.
The second file was called cost2 and contained three keys all of which were present in
the data. The file was called 196 times and every time the key in the data was found in
cost2.
Nine digits are allowed for each column making the maximum count in a column
999,999,999.
If you want a list of which keys were used and unused, use fetchx instead of fetch. fetchx
has the same syntax as fetch except that it has an extra parameter at the end which tells
Quantum which additional information is required. Possible values for this parameter
are:
So, to load data from a fetch file called costs and to see a list of used and unused keys,
you would type:
call fetchx($costs$,c150,c151,3)
If you use fetchx more than once, the key listings are printed after the summary line to
which they refer. If the listing goes over onto a new page the column headings are
repeated at the top of the page.
Quick Reference
To write a multicoded column out as a single-coded field, type:
call explode(mc_start,num_cols,'codes',sc_start)
When Quantum converts multicoded data into single-coded data, it takes the codes in the
multicode and transfers each one to a separate column in the data, thus creating a
single-coded field of columns in addition to the original multicode. You may choose
which codes should be exploded in this manner, and also the start column of the single
coded field. The format of the statement is:
call explode(mc_start,num_cols,'codes',sc_start)
where mc_start is the first multicoded column to be converted, num_cols is the number
of sequential columns to be converted, codes are the codes to be written out as single
codes, and sc_start is the first column in the single-coded field.
Codes are exploded in the order 1234567890–&. If the first code listed in 'codes' is present
in the multicode, that code will be copied into the first column of the single-coded field.
If the code is not present, the column is left blank. For instance, suppose that columns
132 and 133 contain the multicoded data shown below (the codes in each column are shown
one below the other, so c132 contains the codes 1, 2, 4 and 7, and c133 contains 4, 5
and 6):
----+----4
 14
 25
 46
 7
If we write:
call explode(132,2,'12345',140)
the data becomes:
----+----4----+----5
 14      12 4    45
 25
 46
 7
The explode statement says ‘explode codes 1 to 5 in the two columns starting at column
132 into a field starting at column 140’. Quantum copies a ’1’ into c140 because there is
a ’1’ in c132, and a ’2’ into c141 because there is also a ’2’ in c132. Column 142 is blank
because there is not a ’3’ in c132, and so on. Notice that the ’7’ in c132 and the ’6’ in
c133 have been ignored because they are not part of the code specification with explode.
If explode is called for any record in the data file, Quantum prints a map in the print file
listing the contents of the multicoded columns and the columns into which the codes were
transferred. If explode is not called for any record, no map is produced.
Writing subroutines in C
Quick Reference
To write C subroutines, either type them into a file called private.c in the project
directory, or insert them in the Quantum edit as follows:
#c
C statements
#endc
Subroutines written in the C language must be filed in the file private.c in the current
directory so that they will be compiled automatically with the rest of your Quantum
program. If you have already compiled your subroutines before doing your Quantum run,
the compiled version must be stored in the file private.o in the current directory.
Alternatively, you may insert the C code as part of your edit as long as you enclose it
between #c and #endc statements as shown here:
#c
/* C code
#endc
Here are two examples of how to calculate square roots. The first uses C code in the edit
to call the sqrt function from the standard C library:
#endc
cx(181,190):4 = x1
filedef srdata data
write srdata
end
The second method of calculating square roots is to write a private C routine in private.c
which calls the standard square root function from the C library. In private.c you will
type:
#include <stdio.h>
#include <math.h>
/* return the square root of an integer value passed from Quantum */
double
square(val)
int val;
{
double dval;
dval = val;            /* convert to double before calling sqrt */
return(sqrt(dval));
}
real square 1f
ed
cx(181,190):4 = square(cx(1,3))
filedef srdata data
write srdata
end
Quick Reference
To define a subroutine written in the Quantum language, type:
return
subroutine name [(var1 [,var2, ...]) ]
Quantum statements
return
Subroutines written in Quantum must be placed at the end of the edit section, before the
end statement, and preceded by a return, thus:
ed
/* edit statements
return
subroutine name
/* subroutine statements
return
end
Each subroutine starts with a subroutine statement and ends with a return. The format of
the subroutine statement is:
subroutine name [(var1 [,var2, ...])]
where name is the name of the subroutine. If you define more than one subroutine their
names must be unique within the first six characters of the name so, for example, sqroot
and sqrt are acceptable whereas sqroot and sqroot1 are not.
var1, var2, and so on, are variables which the subroutine will use. These variables are
generally referred to as the arguments of the subroutine.
When you use subroutines, Quantum differentiates between variables defined in the
variables file or before the ed statement and those defined after the ed or subroutine
statements.
☞ For more information on the variables file, see chapter 14 and section 33.1.
Variables defined in the variables file or before ed are called external variables and may
be accessed and changed by statements within a subroutine. Variables defined after ed or
inside a subroutine are local variables and cannot be changed by a subroutine. For
example:
real cost 1
int items 1
ed
int nshop 1
/* edit statements
return
/* subroutines
end
The variables cost and items are defined before the ed statement. This means they are
external variables and can have their values changed by a subroutine. The variable nshop
is defined after ed so it is a local variable. This means it cannot have its value changed by
the subroutine, even though its value can be passed to the subroutine for use by it.
Information stored in external variables is always available within a subroutine, and may
be accessed and changed regardless of whether you pass it as an argument to the
subroutine. For example, if we define an integer variable called items in the variables file,
we can read the contents of items and change them in the subroutine even if we do not
include items as part of the call statement. We might write:
call sub1
return
subroutine sub1
if (items.gt.5) emit c134’1’
return
end
This checks, inside the subroutine, whether the value of items is greater than 5 and, if so,
inserts a ’1’ in column 134. We do not pass the value of items to the subroutine because
it is an external variable which is available to the subroutine as a matter of course.
Because items is an external variable we could change its value in the subroutine if we
wished. For instance, we could reset it to zero.
Information stored in local variables which is required in the subroutine must be passed
to the routine as an argument. If the items variable was defined after the ed statement we
would have to name it on the call statement and on the subroutine statement thus:
ed
int items 1
call sub1(items)
return
subroutine sub1(items)
if (items.gt.5) emit c134’1’
return
end
This example performs the same task as the previous one. The difference is that this time
items is a local variable, so we must pass it to the subroutine. Once inside the subroutine,
we cannot change the value of items in any way.
In neither example is it necessary to pass c134 as an argument as all cells in the C array
are external variables.
When you use a subroutine which requires arguments, be sure that you call it with as
many arguments as are listed on the subroutine statement for that subroutine. If you give
too many or too few arguments, errors will occur. For example:
call conv(gallons,liters)
.
subroutine conv(gallons,liters)
is correct because we call the subroutine with the same number of arguments as there are
in its definition, but:
call conv(aa,bb,cc)
.
subroutine conv(aa,bb,cc,dd)
is not, because we are calling conv with one less argument than its definition specifies.
When you return to the edit from a subroutine, any changes made to external variables
will still exist, but values assigned to local variables defined in the subroutine will not be
accessible from the main edit program. For example:
call sub1
return
subroutine sub1
int doneit 1
if (items.gt.5) emit c134’1’
items = 0
doneit = 1
return
end
Once the subroutine has been executed and control has returned to the edit, the value of
items will be zero but doneit will have no value at all.
Arguments
Generally, subroutines only need arguments when you are passing the values of local edit
variables to the subroutine. All arguments on the call statement must have a
corresponding argument of the same type on the subroutine statement. This is because
Quantum does not compare the names of the arguments on the call and subroutine lines.
It simply passes the value of the first argument given with call to the first argument
named with subroutine and so on. For instance, if gallons and liters are local edit
variables and we want to use their values in the subroutine calc, we would write:
ed
int gallons 1s
real liters 1s
call calc(gallons,liters)
.
subroutine calc(input,output)
int input
real output
Here, the value of gallons is passed to input while the value of liters is passed to output.
Input and output are variables used solely within the subroutine so they are defined in the
subroutine.
However, if you have a subroutine that is called more than once with different external
variables, you would represent them with local variables in the subroutine. For instance:
call sub2(c120,total)
.
call sub2(c220,tot2)
.
subroutine sub2(n1,n2)
data n1
int n2
Here, n1 represents c120 or c220 and n2 represents total or tot2. n1 and n2 are local to
the subroutine so they are defined after the subroutine statement.
All local variables named on the subroutine statement must be defined in that subroutine.
Real or integer variables passed to the subroutine must be defined as such in the routine.
For example:
subroutine conv(gallons,liters,price)
/* number of gallons bought
int gallons
/* equivalent in liters
real liters
/* price per gallon
real price
Single data variables (i.e., columns in the C array or user-defined data variables with one
cell only) are passed to a subroutine by naming the variable on a data statement as shown
here:
subroutine chk(flav,prefb)
/* flavors bought
data flav
/* brand preferred
data prefb
A field of columns is passed as a single value and is received by the subroutine as an
integer, so the corresponding argument is defined with an int statement:
subroutine ctyp(car)
/* make of car owned
int car
Any multicodes present in the field are ignored. If you have a multicoded field and you
want to be able to access the codes in each multicode, you must treat the field as a series
of single data variables and pass each one separately, using a data statement, rather than
passing the field as a whole. When variables are passed with call they are written in
exactly the same way as you would write them anywhere else in your edit. For example:
call sub1(c15,gallons,cost,c(20,28))
passes the address of the data variable c15, and the integer values of the variables gallons
and cost and the field c(20,28). Here is a chart summarizing how to define variables for
subroutines:
Notice that in the main definitions the size of the variable is defined, whereas in the
subroutine definition no size is required since all values are passed as integer values or,
in the case of a single data variable, as an address.
We have conducted a survey to test the market for a new TV station which would be
available via the satellite network. When it comes to asking how likely respondents
would be to take this new channel, people who already subscribe to the satellite network
are asked slightly different questions from those who do not. However, the possible
responses to each set of questions are identical.
One way of checking these answers is to write a subroutine and call it up using variables
to define the columns to be checked.
ed
/* c(21,23) is for those already subscribing
/* c(24,26) is for those who don’t subscribe
if (c17’1’) call subchk(21,22,23); else; call subchk(24,25,26)
/* rest of edit
return
subroutine subchk(high,low,dep)
/* high – willingness to take at $20
/* low – willingness to take at $10
/* dep – willingness to pay advance deposit
int high
int low
int dep
r sp ’1/59’ c(high), c(low), c(dep)
return
end
As our comments show, the fields to be checked are c(21,23) for those already
subscribing to the satellite network and c(24,26) for non-subscribers. Both calls to the
subroutine subchk name the columns in the field individually. This is because we want to
look at the codes present in each column. We have not defined the data variables at the
start of the edit because they are read automatically from Quantum’s variables file. This
means that they are external variables and can have their values accessed by the
subroutine.
The subroutine statement uses local variables with names describing the contents of the
variables they represent. High represents c21 and c24 which tell us how likely the
respondent would be to take the new station if it costs him $20 a month. Similarly low
represents c22 and c25 and dep represents c23 and c26. All local variables are defined in
the subroutine as being the name of the variable they represent.
The require statement simply checks whether each column is single-coded in the range
’1/59’.
If you glance back at the example, you’ll notice that although we’re talking about
columns in the data, we’ve actually treated them as integers. The call to the subroutine
simply gives the column numbers without a preceding ‘c’. The subroutine itself defines
its arguments as integers and then uses them as pointers into the C array. There are two
reasons for this.
First, it allows Quantum to report the column numbers correctly if it finds records which
fail the require statement. Passing columns to a subroutine as data variables causes
Quantum always to refer to column 0 in the output from require regardless of the true
column number which is in error.
Second, it enables you, if you wish, to set new codes into the columns used in the
subroutine. Normally, any changes made to the C array inside a subroutine are forgotten
when control passes back to the main program. Referring to the columns as pointers into
the C array, as in this example, causes any changes to the C array to be remembered when
the subroutine finishes.
✎ The notes in this section are for guidance only. Quantime does not own the source
code for functions in the C libraries and therefore cannot support them. If you have
any problems, consult your C compiler reference guide.
The C runtime and maths libraries contain a number of general-purpose functions, some
of which may be useful in Quantum programs. For example, if you want to square a
number or calculate a square root, you will almost certainly find functions that do this in
one of the C libraries.
Before you use a C function in Quantum, read the documentation on that function to find
out what parameters it needs, and of what type. Having done this, you then need to
provide this information in a format Quantum understands. In order to explain how you
do this, we’ll use the pow function which raises a value to a given power.
The Unix documentation for pow( ) states that the function expects two arguments, both
of which are double precision real variables. This means that your Quantum program will
need to hold the value and the power (exponent) in x variables:
x1 = 5
x2 = 2
x3 = pow(x1, x2)
Even if one of the arguments is a constant, as both are in this example, you must assign
the values to variables as Quantum will not accept real constants within the function’s
parentheses.
pow( ) returns a value which you want to use in your Quantum program. In order to do
this, you must define the function in the variables section of your run (that is, in the
variables file or at the top of your program, before the ed statement). The function’s type
must be set to the type of data the function returns. pow( ) returns a double precision
value so we define it as:
real pow 1f
A complete example is:
real pow 1f
ed
x1 = cx(11,14)
x2 = 2.0
x3 = pow(x1, x2)
end
The table below lists the various C return types and shows how to define them in
Quantum:
When looking things up in this table, bear in mind the following points:
1. Quantum uses long integers, so all integer variable types except ’unsigned long’ can
be accommodated.
2. Quantum does not support unsigned values, but this is only a problem with ’unsigned
long’ variables.
If you are not interested in the value the function returns, or the function does not return
a value at all, you can treat it as a subroutine and run it using call, as you would for the
standard Quantum functions. For example:
call printf($Edit section completed$)
Whether you call C library functions as subroutines or functions, you need to specify the
arguments correctly in Quantum so that they are converted to the appropriate C variable
types. In general, the safest option is to store any real or integer arguments in Quantum
real or integer variables, as in the pow( ) example, and then call the function with those
variables as the arguments. This is particularly important when dealing with Quantum
data variables.
You can pass text strings as they are, as you saw for printf, but you cannot pass text held
in data variables.
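For instance (a sketch only; the columns and the exponent are arbitrary), rather than
writing pow(c(30,32), 2.0) you would copy the values into x variables first and then call
the function:
x1 = c(30,32)
x2 = 2.0
x3 = pow(x1, x2)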
In chapter 4 we said that Quantum automatically provides you with an array of 1,000 data
variables in which to store data, 200 integer variables for storing whole numbers and 100
real variables for storing real numbers. We also said that you may create your own data,
integer and real variables with names representing the type of information they contain.
In this chapter we will discuss how to increase the number of variables that Quantum
provides and how to create your own named variables.
Each variable which you create must have a unique name of up to ten characters, the first
of which must be a letter. You may choose any name you like, but you are advised to use
names which have some relevance to the type of data they contain – for instance, totinc
for a variable which contains a respondent’s total income. The name ‘totalspending’ is
invalid because it is too long (it has 13 letters) and ‘2qqq’ is invalid because it starts
with a digit.
Both upper and lower case letters may be used, but do remember that Quantum does not
distinguish between the two: TEMP is the same as temp.
Quick Reference
To define a data variable, type:
data name n
Type s after the variable’s size if you want to be able to omit the parentheses from
references to single cells in the variable.
Before Quantum will recognize named variables in your program, you must say what
type of information the variable is to contain and how many cells it should have. If you
wish to increase the size of the C array, you must indicate how many cells you require.
There are three places that you can declare named variables:
1. In the variables file. Variables declared here are available in the edit and tab sections
of your program and also in subroutines, and may be changed by the edit or by a
subroutine.
2. At the start of your program before the ed statement. Variables declared here are
available in the edit and tab sections of your program and also in subroutines and may
be changed by the edit or by a subroutine.
3. In the edit after the ed statement. Variables declared here are available in the edit
section only and may only be changed there. They are unknown to the tab section and
to subroutines.
Wherever you choose to define a variable, the definition consists of three parts:
1. The variable type: data for data variables, int for integer variables or real for real
variables.
2. The variable name: C, T or X to increase the number of data, integer or real variables
available; any name for a new variable.
3. The variable size. This is generally the number of cells the variable is to have.
For example:
data c 1500
increases the size of the C array to 1500 cells. This provides space for records with up to
14 cards per respondent.
int ntrip 5
creates an integer variable called ntrip which can store up to five whole numbers.
real price 10
creates a real variable called price which can store up to ten real numbers.
When we first talked about variables we said that the individual cells of an array may be
referenced by following the name of the array by the cell number enclosed in parentheses.
Therefore:
price(3)
refers to the third cell of the real variable price.
We also mentioned that you may omit the parentheses when you are referring to a single
cell in the C array so that c100 means the same as c(100).
To make this possible you must follow the variable size in the variables file with the letter
‘s’. This is particularly important when you are increasing the size of the C array as,
without it, any references to, say, c15 will cause errors. For instance, if we write:
data c 1200s
we are increasing the size of the C array to 1200 cells – enough for 11 cards per record.
Because the array size is followed by ‘s’ we can write c1056 when we mean c(1056):
Quantum will substitute the parentheses automatically.
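For instance, a minimal sketch (the column numbers and codes are arbitrary):
data c 1200s
ed
if (c1056'1') emit c1057'5'
end
Without the s you would have to refer to these columns as c(1056) and c(1057).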
The dimension of the C array will be taken automatically from the value of max= on the
struct statement if this is greater than the dimension requested in the variables file. For
example, if you have:
data c 1300s
struct;max=15;ser=c(1,4);crd=c(79,80); ....
in your program, the C array will be increased to 1600 cells to accommodate card type 15.
Note also the difference between the following two definitions:
int brand 1s
int brand 1
The former creates the variable ‘brand’ as an array, and you must refer to it in your
program as brand1. The latter creates a single named variable that can be referred to as
brand.
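For instance (a minimal sketch):
int brand 1s
ed
brand1 = 3
assigns 3 to the variable, whereas with int brand 1 you would write brand = 3 instead.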
If you are not increasing the number of data, integer or real variables or creating new
variables, there is no need to set up a variables file. Quantum will read the default values
from its own variables file, as follows:
data c 1000s
colreal cx c
real x 100s
int t 200s
This gives you the 1000 data variables, 100 real variables and 200 integer variables
mentioned in chapter 4.
The second statement (colreal cx c) informs Quantum that variables referred to as cx are,
in fact, data variables whose contents are to be treated as real numbers.
☞ For further information about external and local variables, see the section entitled
"Passing information between the edit and a subroutine" in chapter 13.
When a record has passed through the edit without being rejected, it is passed to the
tabulation section, if one exists. At this point, data, integer and real variables are available
to create tables. The program deals with one complete record at a time (we’ll ignore
trailer card records for the moment).
The tabulation section consists of a series of statements which determine the contents of
the tables. Each table may be thought of as a matrix of cells. The table shown in Figure
15.1 below is a 3-by-5-cell table. It consists of three columns (Total, Male and Female)
and five rows (Base, Single, Married, Divorced and Widowed) making fifteen cells in all.
Each cell of this table is defined by two conditions, one from the row and one from the
column. In this table the conditions which define each row and column are shown in
parentheses. They are not, of course, printed in ordinary tables of output. The top
left-hand cell contains 200 people. This is everyone in our sample, since the conditions
creating this cell are ‘All Respondents’ and ‘Total Respondents’. The middle cell of the
top row is defined by the conditions ‘All Respondents’ and ‘Male’, which is the condition
that column 106 contains a ’1’. The total number of male respondents is 44.
The third cell in the first column is defined by the two conditions ‘All respondents who
are single’ and ‘All respondents’. A single respondent has a ’1’ in column 109; there are
53 such respondents.
The second cell of the second row has the conditions Male (c106’1’) and Married
(c109’2’). There are 23 married male respondents.
Each time a record passes through the tabulation section, the count in the top left-hand
cell is increased by 1, since this cell is to include all respondents. Each time a record
comes through in which c106 is a ’1’, the count in the middle cell of the top row is also
incremented by 1 since this cell includes all respondents who are males.
✎ Conditions are positive rather than negative; a person is included because he fulfils
the required conditions rather than being excluded because he does not fulfil them.
Many tables contain counts which are created by the existence of more than two
conditions. An entire table may be filtered. This means that no one is considered for
inclusion in the table at all unless he fulfils a condition specified for the table as a whole.
For example, we might wish to look at a table which includes only respondents who live
in Central London. That condition is c121’1’ which is satisfied by 19 people as shown in
Figure 15.2 below.
                  Total       Male     Female
                          (c106’1’)  (c106’2’)
Base                 19          1         18
(All in C. Ldn)
Single                4          0          4
(c109’1’)
Married              13          0         13
(c109’2’)
Divorced              1          0          1
(c109’3’)
Widowed               1          1          0
(c109’4’)
The count for each cell is now defined by three rather than two conditions. The second
cell of the third column, for instance, refers to all respondents living in Central London
(c121’1’) who are female (c106’2’) and who are single (c109’1’). There are four people
in this cell.
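A specification that might produce a table like this could look something like the
following sketch (the axis names are invented, and it assumes that the row axis is named
before the column axis on the tab statement):
tab marital sex;c=c121'1'
l marital
col 109;Base;Single;Married;Divorced;Widowed
l sex
col 106;Base;Male;Female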
Many cells in tables consist of counts created by a series of conditions or filters. As you
can see from the examples above, these conditions are created by columns and codes in
the general form Cn’p’ (e.g., c109’1’).
There are three other kinds of information that can be used to compute cells in a table.
First, the conditions can be set up so that every time a record satisfies an arithmetic or
numeric condition, the count in the cell is increased by one. You will normally do this
when the question on the questionnaire requires a numeric response that will be entered
directly into the data as it stands (e.g., age, number of products tried) rather than a
response that will be represented in the data file by a specific code (e.g., Green=1,
Red=2).
In Figure 15.3 we have set up age ranges so that every time a respondent whose age is,
say, 45 passes through the tabulation section, the count in the relevant cell is incremented
by 1. There are 30 respondents whose ages lie between 45 and 54, six of them men and
24 of them women.
Second, there is arithmetic information itself. In the table in Figure 15.4 the contents of
the cells are not counts of individuals fulfilling conditions; in this case, the base is the
number of loaves of bread bought by all respondents who bought bread over the period
of a month. The figures in the row reading ‘1 – 5 Loaves’ are the total number of loaves
bought by those respondents who purchased between 1 and 5 loaves in that month. 94
loaves were purchased during the month by people who bought between 1 and 5 loaves
altogether.
Tables of this type are generally created when the questionnaire requires the interviewer
to write down the exact number the respondent says rather than circling a code
representing a range of numbers. When the data is entered on the computer, the columns
assigned to this question will contain the exact number the respondent gave – for
instance, if he bought 15 loaves of bread, the number 15 will appear in the data rather
than, say, a ’3’ indicating that he bought between 10 and 15 loaves.
Third, there are statistical functions such as means. At the bottom of Figure 15.4 we show
the mean number of loaves bought per respondent who bought bread.
Table text
The text associated with each table is created at various levels of the tabulation program.
The text of each specific line (e.g., Single, in the first example) is generally written on
the same statement that defines the characteristics a respondent must have to be included
in that line.
Some text, such as the table title (Quantime Sample Table 1), is created at ‘table level’,
while some is generated at the ‘run level’ so that it applies to all the tables in the current
run. In our sample tables, this is the title ‘Bread Purchase Survey’.
Figure 15.4 Incrementing cells using values read from the data
As was mentioned briefly in the previous chapter, the tabulation section is hierarchical in
that characteristics can be defined at one level which will apply to that and all lower
levels.
The aim of this chapter is to describe those levels and define simply their purpose in the
run as a whole as a prelude to the more detailed discussions of the statements themselves
in subsequent chapters.
☞ The run control statement – the a statement – is described in section 16.4 and the
lower level filter statements (sectbeg and flt) are discussed in chapter 23. Titles and
other text statements are described in chapter 17, chapter 18 and chapter 22.
Quantum’s tabulation section consists of a series of levels, beginning with the lowest
level, the line, and progressing upwards to the entire run. Within each of these
hierarchical levels, conditions and characteristics can be specified for the current (and
sometimes lower) level.
At the highest level, an option on the a statement applies to every table in the run; it
might specify, for example, that all cells are to show absolute figures. The same option
placed at the next level down (the tab statement) may state that the cells
in this particular table only will show, say, column percentages. This option overrides the
option on the a statement for this table only.
At the lowest level, the option on the a or tab statement can be overridden by an option
on the statement creating a single line (i.e., an n, col, val, bit or fld statement). The option
might specify that the line created by this statement will display absolute figures only
rather than absolutes and column percentages.
These hierarchies greatly increase the flexibility of the tabulation program. However,
they do mean that you must pay attention to what level you are on otherwise you may not
get the results you expect.
Quick Reference
To define global and default conditions for the run, type:
a;options
Global run conditions, if any, are defined on the a statement. If used, it must be the first
statement in the tabulation section. Its format is:
a;options
where options are keywords defining the global characteristics of the run. You may list
as many keywords as you like, provided that they are separated by semicolons (;), for
example:
a;dsp;op=12;date;dec=1
This statement tells Quantum that all rows will be double-spaced (dsp), cells will contain
absolute figures and column percentages (op=12), the date will be printed on each table
(date) and absolutes will be shown to one decimal place (dec=1). These and all other
options are described below.
Most options which are valid on the a statement are also valid on sectbeg, flt or tab
statements. In this chapter, unless an option’s description specifically states that the
option is not valid on particular statements, you may assume that the option is valid on a,
sectbeg, flt and tab statements.
Where a keyword appears on two or more of these statements, the setting at the lower
level will override the setting at the higher level for that table or group of tables only. For
example, if the same option is present with different values on a flt and a tab statement,
the option on the tab will override the option on the flt for that table only. Similarly,
where an option is present on both the a statement and a flt, the option on the a statement
will be overridden by the option on the flt until another flt is read.
☞ This concept of overriding options is discussed further in section 21.3 and section
23.2.
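For instance, a minimal sketch (the axis names ax01 and ax02 are placeholders):
a;op=12
.
tab ax01 ax02;op=1
Here every table in the run shows absolutes and column percentages except the table
created by this tab statement, which shows absolutes only.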
Options can be divided into two categories: output options and data options. The former
determine the format of each table in the run, but have nothing to do with the numbers in
each cell, whereas the latter determine how the cell counts are to be created but have
nothing to do with the overall appearance of the tables.
Jobs in which only the output options have been changed can be rerun without rereading
the data, but jobs in which data options have been altered must be rerun just as if they
were new jobs.
☞ For further information on running Quantum, see chapter 34 and chapter 35.
Output options
Output options are those which affect the way your tables are formatted and printed. They
do not determine how the data is tabulated or how the individual cell counts are
calculated.
Unless otherwise stated, all options are valid on a, sectbeg, flt and tab statements.
acr100 This prints the text 100% on each cell of the base column when row
percentages are requested with op=0. Normally, a base column
contains absolute figures only. If acr100 is used without row
percentages, it is ignored.
anlev= Defines the level at which axes are to be cross-tabulated in a
hierarchical (trailer card) job. Analysis levels are described in chapter
28.
axttx This option creates table titles of the form ‘axis name by axis name’. x
at the end of the keyword may be l for a title printed on the left, c for a
title printed in the center of the line, r for a title on the right, or a number
between 1 and 9 to have the title indented by ten times that number of
spaces. For instance, axtt5 indents the title by 50 spaces. You may
also type axttg to have the start of the title lined up with the start of the
column headings.
baft This keyword causes any table titles starting with the word ‘Base’ to
be printed last after all other titles for that table. If the keyword base
appears on a ttbeg/ttend/ttord statement and baft is also used, an error
message is generated.
✎ Do not use baft with ttbeg=base since the two are incompatible.
☞ For a full discussion about creating column headings, see section 20.5.
csort Sort tables column-wise (i.e., horizontal sorting rather than vertical
row-wise sorting).
date By default, tables are printed without a date. Use of the keyword date
causes the current date to be printed in the top right-hand corner of each
table. The date is printed in the form day month year (e.g., 3 OCT 83).
dec=n This determines the number of decimal places for absolute figures. If
dec= is not used, the default of no decimal places is assumed.
decp=n This sets the number of decimal places required for percentages. The
default is decp=1 meaning one decimal place. This applies when op=0,
2, 7 or & (see below). Any number of decimal places are allowed, as
long as you make each column wide enough to accommodate them.
dsp This leaves one blank line between each row of data in a table. Without
this, one line follows directly underneath another.
flt=name Invokes the filter conditions and titles named on the flt= statement. If
the filter defines conditions, the rules governing data options apply.
This option is valid on sectbeg, flt and tab statements, but not on the a
statement.
font=(ttype=fnum, ... )
You will use this keyword when you want your tables to be laser
printed. It defines the fonts in which various types of output are to be
printed. Fonts are entered in the format shown above where ttype
defines the text type and fnum is the number of the font to be used for
that text. Fonts and the numbers which represent them are defined on a
per site basis: your Account Manager will know what they are. Text
types are:
def default font
a text following the a statement
bot text following bot statements
foot text following foot statements
flt text following flt statements
tab text following tab statements
tb table numbers
If most of the table is to be printed in the same font, you may define this
font as the default font (using the text type def=). However, if this
option is used, it must precede all other options.
Similarly, you may use the option pc to define a font for all percentage
figures, but if you then wish to have row and/or column percentages in
yet another font, the options rowpc and/or colpc must follow pc
otherwise they will be overridden by the more general percentage font.
Let’s say that we have three fonts; 1 is standard type, 2 is bold and 3 is
italic. We wish to have all run level titles (i.e., those following the a
statement) printed in a bold font and all percentages in italics. We
would write:
a;font=(a=2,pc=3)
All other texts are printed in font number 1, the standard font.
If tables are subsequently printed on the line printer, any font changes
are ignored.
Because of the way in which Quantum stores the font changes, tables
to be laser printed must have a page width of 132 or 158 characters
defined on the a statement.
graph= Produces SYLK format files for use with graphics or spreadsheet
packages which read this type of file (e.g., Chart, Graphwriter for
graphics and Symphony for spreadsheet applications). A separate file
is created for each table containing the statements necessary to
reproduce the table as a 2-dimensional bar chart.
hitch= Prints the current table on the same page as the previous table if there
is room for the whole table on the page. If the current table has more
than one page, Quantum prints its first page on the same page as the
previous table.
indent=n Where a row text is longer than the space allocated to the row text in
the table, Quantum breaks the line in between words and continues the
text on the next line. To have these continuation lines indented from the
left margin, specify the amount of indentation required with indent=.
Texts may be indented by between 0 and 15 spaces: the default is
indent=0.
linesaft= Defines the number of blank lines to print after the last line of column
headings. The default is one blank line.
linesbef= Defines the number of blank lines to print before the first line of
column headings. The default is two blank lines.
manipz Apply spechar, nz, nzrow and nzcol to elements created using
manipulation.
✎ For netsort to work, the keyword sort must be present on the same statement as
netsort or on a statement at a higher level. For example, to sort the nets in a single
table, place netsort on the l statement of the row axis and sort on the a, sectbeg, flt
or tab statement.
☞ For examples of nets and sorted nets, see section 17.6 and section 31.3.
☞ See the section entitled "Sorting with subsort and endsort" in chapter 31 for more
details about subsorts.
☞ See chapter 20 for a description of how to create column headings with g statements.
nzrow Suppresses the printing of rows where all cells are zero or round to
zero.
op=n This keyword governs the type of output in the tables. Output types are:
& Total percentages. The value in the cell is percentaged against
the number in the upper left-hand corner of the table (normally
the base) rather than on the totals in the relevant column or
row. If the table contains more than one base element,
percentages are calculated using the leftmost figure in the most
recent base element.
- Row rank figures are printed below each cell. Figures are
ranked within rows, using 1 for the largest figure. Where two
or more numbers have the same rank, they are all assigned the
lowest rank possible. Thus, if the previous rank was 2 and the
next value to be ranked occurs in the row three times, those
numbers will all be ranked 5.
✎ You may not request row and column ranks in the same table.
0 Row percentages.
1 Absolute figures (default).
2 Column percentages.
3 Column rank figures are printed below each cell. Figures are
ranked within columns, using 1 for the largest figure. Where
two or more numbers have the same rank, they are all assigned
the lowest rank possible. Thus, if the previous rank was 2 and
the next value to be ranked occurs in the column three times,
those numbers will all be ranked 5.
✎ You may not request row and column ranks in the same table.
6 Column percentages calculated on the first base element in the
table rather than on the most recent base.
You might use this when you have a table showing which of
two products people preferred, and their reasons for preferring
this product. Percentages could be calculated against a
redefined base such as ‘All preferring Brand A’, and then
against the first base (all respondents).
7 Cumulative percentages.
8 Indices. The index for a cell is generated by dividing the row
percentage in the cell by the row percentage in the base row. If
the table contains more than one base row, indices are
calculated using the row percentage in the most recent base
row. It shows you how closely the percentages in the current
row reflect those in the base row. The nearer the index is to
100%, the more closely the current row mirrors the base row.
9 Prints absolutes and percentages side by side. Four columns
are allocated to the percentage if it has no decimal places;
percentages with decimal places are allocated 5+decp columns
(e.g., 7 columns for percentages with decp=2).
This can be useful for tables with very wide column axes where
no column contains 100%.
When you create a table with more than one output type, Quantum
prints the different values one under the other in each cell. If you’d
prefer to have a separate table created for each output type (e.g.,
absolutes and column percentages as separate tables rather than both on
the same table), enter the letter s in upper or lower case between the
equals sign and the list of output types. For example:
op=S012
creates three tables, one of absolutes (1), one of column percentages (2)
and one of row percentages (0).
There is no significance in the order in which you list output types with
op=. Quantum always prints them in the order shown below:
1 Absolutes
2 Col percentage on current base
6 Col percentage on first base
0 Row percentage
& Total percentage
8 Indices
– Row or column ranks
Absolutes for tables of means/proportions (Quantum only)
Proportions (Quanvert only)
Means (Quanvert only)
Statistical probabilities
Statistical flags
Percentage differences
Thus, although the example above specifies output types in the order
row percentages (0), absolutes (1) and column percentages (2), the first
page will show absolutes only, the second page will show column
percentages only and the third page will show row percentages only.
The exception is when you want Quantum to calculate percentage
differences. In this case, the difference is calculated using the last
percentage type named with op=. For example, column percentages if
op=012 is used.
☞ Examples of tables containing some of these type of output can be found in section
16.5. For information on percentage differences, see section 18.9.
page This option invokes automatic page numbering. Since this is the default
– pages are numbered from 1 automatically – this option is generally
used in its negative form of nopage which suppresses automatic page
numbering.
paglen=n This determines the number of lines printed on each page. The default
is paglen=60 lines but any value between 10 and 10,000 is valid.
pagwid=n Normally tables can be up to 132 characters wide. pagwid= enables
you to decrease the page width or to extend it to a maximum of 10,000
characters.
✎ If your job uses a variety of page widths, you must ensure that the largest one is
defined on the a statement. Additionally, if you will be laser printing tables using
the font= option, the tables may be 132 or 158 characters wide only.
pc This prints percent signs after percentage figures. This is the default, so
this option is usually used negatively – nopc – to print percentage
figures without percent signs.
pcpos=[±]n When Quantum prints percentages underneath absolutes, it normally
prints the percentages offset one column to the right of the absolutes. pcpos= is an
extension of flush, provided so that you may determine more precisely
where percentages are printed in relation to absolutes.
The number of columns to offset is defined by indicating the position
of the rightmost digit of the percentage in relation to the rightmost digit
of the corresponding absolute figure. Offsets of up to ±7 characters are
valid. A negative value indicates positioning to the left of the absolute,
while a positive value indicates positioning to the right. The default is
pcpos=1 which has the same effect as noflush; pcpos=0 is the same as
flush.
Here is a brief example to illustrate the effect of pcpos:
        39               39              39
       15.1%           15.1%          15.1%
     pcpos=1         pcpos=0        pcpos=-1
    (default)    (same as flush)
pcsort Sort on percentages rather than absolutes.
pczerona Prints NA instead of 0.0 as the percentage in cells which have a zero
base.
✎ NA is printed in the same position as the decimal part of the percentage. By default
this is one character to the right of the absolute value.
printz This prints tables in which all cells are zero. Normally such tables are
suppressed.
rinc Indicates that when a table is both too long and too wide to fit on one
page, rows should take precedence over columns when the table is
paginated. This means that Quantum will print all the rows on as many
pages as necessary with the left-hand side of the column headings
before repeating the process for the right-hand side of the column
headings.
☞ For further details, see the section entitled "Automatic pagination" in chapter 18.
round Force row and column percentages to round to 100%. This option
works only with op=2 and op=0 and is ignored for op=6 and op=&.
All op= options on the elements of an axis are ignored for rounding
purposes.
The option noround may be used on elements which may be totaled
(e.g., n01) to prevent them from being altered in any way by the rounding
process. n15 and n25 statements and nets are automatically assigned
the default of noround.
When deciding which percentages may be altered to force the row
and/or column percentages to round to 100%, Quantum adds up the
absolute numbers in the relevant rows and/or columns (between bases
if there are several bases). It then compares the result with the absolute
value in the base.
If both numbers are integers, and they are the same, forcing can be
done. If both are integers and they differ, no forcing is done. Any
elements with the option noround are not altered by the rounding
process.
If either the base or the sum of the absolutes is not an integer, and the
values are the same to within one part in 1,000, forcing can be done. As
noted above, elements with the noround option are not affected by the
rounding process.
If rounding may be done, Quantum finds the largest absolute and alters
the corresponding percentage so that the sum of all percentages is
100%. Quantum chooses the largest absolute because changing the
percentage for this figure will cause least distortion to the original
percentages. If several elements have the same largest value, Quantum
changes the percentage for the first of those elements and leaves the
others untouched.
The decisions on whether or not rounding may take place, and which
value to round, are based on the values with all the decimal places that
Quantum holds for them, not just the values you see printed in the table.
rsort Sort tables row-wise (i.e., vertical row-wise sorting rather than
horizontal column-wise sorting). This is the default.
side=n This option can be used to alter the row text width. The default is
side=24 and the maximum side text width is 120 characters.
smallbase= Defines the value below which bases are to be marked as small bases in
tables with special T-statistics.
☞ See chapter 20 for further information about column headings and g statements.
smflag=n When statistical tests are carried out on cells with small bases, the
percentages in those cells may appear to be significant when they are
not. smflag= enables you to determine what constitutes a small base and
to flag cells with that base when the table is printed.
n is an integer or real value which determines the maximum size of a
‘small’ base. When a table containing row, column and/or total
percentages is printed, Quantum will print the letter ‘s’ to the right of
any cell in which the unweighted base in the appropriate direction (e.g.
column base for column percentages) is smaller than n.
smrow Indicates that values defined with smsupa/smsupp/smsupt refer to
rows. This is the default.
smsupa=n Suppresses any element in which all absolutes are below the given
value. Use with smrow to suppress rows and/or smcol to suppress
columns. If all elements in a table are suppressed, the table itself is
suppressed. You can therefore use this option to suppress tables in
which the base is less than a value of your choice.
smsupp=n With smrow, suppresses any row in which all column percentages are
below the given value. With smcol, suppresses any columns in which
all row percentages are below the given value. If all elements in a table
are suppressed, the table itself is suppressed.
smsupt=n Suppresses any element in which all total percentages are below the
given value. Use with smrow to suppress rows and/or smcol to suppress
columns. If all elements in a table are suppressed, the table itself is suppressed.
smtot Used with one of the smsup options to suppress rows in which the
leftmost Base value is less than the value given with smsup. Without
smtot, all values in the row are compared against the smsup value.
sort Creates sorted or ranked tables.
spechar=ab When a cell in a table is zero or would round to zero, you may wish to
have specific characters or blanks printed in the cell in place of the
zeroes. spechar (short for ‘special characters’) makes this possible. The
first character (a) is placed in cells which have true zero values, while
values which round to zero are replaced with the second character (b).
The special characters may be any non-numeric character or blank. If
either character is blank, enclose the pair of characters in double
quotes. For example:
spechar=" *"
sets blank as the character for values which are truly zero and an
asterisk for values which round to zero.
✎ Quantum differentiates between means and other statistics (e.g., standard deviation)
in which the sum of values is zero, and those in which the sum of cases (i.e.,
respondents) is zero, and only prints the special character if the number of cases
going into the mean is zero. Thus, 0/3 is always printed as zero, whereas 0/0 will be
printed as the special character for zero if one is defined.
Zero values generated as the result of manipulation are always printed as such and
are never replaced with a special character.
squeeze= Prints as many pages of the current table as possible on the same page.
✎ Tables formatted with the postprocessor pstab for printing on a PostScript printer are
automatically centered within the page width.
☞ For further information on pstab, see chapter 38.
title Creates left-justified table titles of the form ‘row by column’ from the
axis titles defined with hd= on the row and column l statements. For
example, if the axes age and sex are introduced by the statements:
l age;hd=Age of Respondent
l sex;hd=Sex of Respondent
the table title will be ‘Age of Respondent by Sex of Respondent’.
topc Prints a percent sign at the top of each column of the table. This option
is only valid if you use nopc (don’t print percent signs after
percentages) and op=2 (print column percentages) together when
op=1, op=3 or op=8 are not present.
tstatdebug Prints the intermediate figures for all special T-statistics in a run. You
may use this option instead of placing debug on every tstat statement.
ttbeg=(text,text, ... )
Quantum normally prints titles in the following order:
table number
higher (3+) dimension texts
texts following column l statements
texts following row l statements
texts following tab statements
texts following flt statements
texts following flt= statements
If you are generally happy with this order but there are one or two titles
that you want to print first rather than in their default positions, specify
just those titles with ttbeg=. Quantum will then print those titles first,
in the order you list them with ttbeg=, and then all other types of titles
in their default order.
For example, with ttbeg=base, any titles starting with the word ‘Base’
are printed first. Any other titles would be printed below the Base texts
in the order ‘tb, high, top, side, nflt’.
✎ Using ttbeg=base with baft generates an error message since the two are
incompatible.
ttend=(text,text, ... )
This is the opposite of ttbeg because it defines those titles which are to
be printed at the end of the list of table titles. Keywords are as described
for ttbeg=. You would normally use ttend when you are generally
happy with the order in which Quantum prints titles, but there is one
particular type of title that you always want to print last rather than in
its default position.
If baft is also present in the program, it is imputed as ttend=base after
all other ttends, although if base is present on the ttend= an error will
occur.
ttord=(text,text, ... )
This defines the order in which table titles are to printed. You use it
when you want to specify your own order for titles, and to have this
order override completely the default order that Quantum normally
uses. Conventions are as described for ttbeg=, except that the top
parameter is not valid with ttord.
If your list omits some of the titles that Quantum normally prints, these
titles will not be printed. For example, if you type:
ttord=(tb,tab)
your tables will have a table number and any titles defined with tt
statements under tab statements. Titles defined in any other place
(under the l statement, for example) will never be printed.
type Prints the output type (e.g., Absolutes, Column Percents) in the top
right-hand corner of the table. This is the default, but notype may be
used instead to suppress printing.
Data options
Data options are options which determine how the numbers in the tables will be
calculated. They have nothing to do with the way those values are subsequently printed.
Unless otherwise stated, all options are valid on a, sectbeg, flt and tab statements.
axcount This keyword requests a summary of the records present in each axis.
For each axis, Quantum reports the axis name, the number of records
which were excluded from the axis because they failed the condition on
the l statement, the number of blank records (i.e., those with no
responses to any of the totalizable elements), and the number of records
which had at least one code in a totalizable element. For this group of
records, a further breakdown shows the number of records with 1 code,
2 codes, and 3+ codes.
The report is printed at the end of out2, just before the counts of records
accepted and rejected. Axes are sorted in alphabetical order and, in
levels jobs, within level. For example:
Axis Skipped Blank Coded 1 code 2 codes 3+ codes
ax01 - 4 96 26 38 32
ax02 - 8 92 27 31 34
ax03 - 4 96 22 44 30
ax04 7 - 93 17 49 27
In this example, all records are eligible for inclusion in ax01: 4 are blank
and 96 are coded. Of those, 26 are single-coded, 38 have 2 codes and 32
have 3 or more codes. In ax04, 7 records skip this axis because they fail
the condition on the l statement. The remaining 93 records are all coded.
axreq= This allows an axis to be used to edit the data and report on whether the
record fulfilled the conditions within the axis. The type of coding
required within the axis is defined by a keyword such as sc
(single-coded), nb (not blank) or scb, as the example below shows.
If a record fails the condition on the l statement, then the checks for
type of coding are ignored. In all other cases, the checks are made, and
any records failing the requirement are written to out2 with an
explanatory message, as for require statements in the edit. Lines are
also appended to the summary at the end of the print file, out2, and to
the summary file, sum_, reporting the number and percentage of
records failing each type of condition.
For example, the axes:
l sex;axreq=sc
col 7;Base;hd=Sex;Male;Female
l prefer;axreq=scb
col 15;Base;hd=Brand Preferred;Brand A;Brand B;
+Brand C; ...
l tried;axreq=nb
col 22;Base;hd=Brands Tried;Brand A;Brand B;
+Brand C; ...
produce:
60 in file
___+___ 1___ +____ 2___+ ....
Columns 101-200 are |006110*3116162*101001 14 ...
not single-coded in axis sex
multi-coded in axis prefer
blank in axis tried
✎ axreq= merely reports records with incorrect coding; it does not reject them from
the axis or from tables using that axis.
c=logical expression
This defines conditions which a record must satisfy to be included in
the tables. Conditions may be any valid logical expression. The option
c= is most frequently used when creating rows and columns.
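For example (a sketch; the column reference is illustrative and assumes a ’1’ in
column 106 identifies men):
tab mstat age;c=c106’1’
restricts the marital status by age table to male respondents only.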
✎ Double precision produces more accurate results when you are working with very
large numbers or numbers with many decimal places. It also increases the time taken
to run your job. For these reasons you should think carefully before using this option
unnecessarily.
You should also note that it is only the cell calculation (accum) and output (qout)
stages of a job that work in double precision, and that these values are used
internally by these programs. The values in the cell counts (nums) file are always
written in single precision.
Numbers held in single precision are only ever precise to six digits. If accum
calculates a number as being 1234.56789, it will write the number out to the nums
file as 1234.56???, where ? is unpredictable. The same is true of 1234567890 which
becomes 123456????.
If you have to deal with very large numbers, you are advised to scale the numbers
down by 1,000 or 10,000 so that the printed output will appear more correct. You
can add a title to the table explaining the scaling factor applied.
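A sketch of this approach (the axis name and column references are illustrative):
tab turnover region;inc=c(130,138)/1000
ttlAnnual turnover (figures shown in thousands)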
inc=arithmetic expression
This causes the cell counts in a table to be incremented by the value of
the arithmetic expression, rather than by 1, for each respondent
included in that cell.
☞ You may find out more about inc= by reading the section entitled "Data options" in
chapter 18.
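For example (column references are illustrative):
tab packs region;inc=c(125,127)
If columns 125 to 127 hold the number of packets bought, each cell then
shows the number of packets bought rather than the number of buyers.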
maxim Produces tables in which cells are the maximum values of inc=
variables. Means that are zero are omitted from the calculation.
means Produces tables in which cells are the mean values of inc= variables.
☞ For an example, see the section entitled "Table of means" later in this chapter.
median Produces tables of medians using values read from the data using inc=.
medint=n Determines when and how to interpolate when medians are created
from incs. Values of n are:
0 Interpolate between the value that goes over the 50% mark and
the previous value. Theoretically, this means that the values
are being treated as the high boundaries of intervals. This is the
default.
1 No interpolation. Return the exact value that went over the
50% mark. If a value goes precisely to the 50% mark, the
median is the midpoint between that value and the next value.
2 Interpolate between
a) the midpoint between this and the previous value, and
b) the midpoint between this and the next value
This corresponds to values being treated as the midpoints of
intervals.
3 Interpolate between this and the next value. This corresponds
to values being the low points of intervals.
minbase=n Defines the minimum size of the effective base for T-statistics. If the
effective base is less than n the tests are not carried out.
minim Produces tables in which cells are the minimum values of inc=
variables. Means that are zero are omitted from the calculation.
✎ If the run has an edit section and missing values processing is switched on in the edit,
this setting carries through to the tab section. If you want missing values
processing in the edit but not in the tab section, you must remember to switch it off
with a missingincs 0 statement at the end of the edit.
☞ For an explanation of what missing values are and when they occur, see section 12.6.
missing=logical expression
Treat any record that satisfies the logical expression as a record with a
missing value. This option is normally used with inc=.
☞ For further information see the notes on inc= in the section entitled "Data options"
in chapter 18.
nsw Causes the compiler to insert a squared weighting statement after each
base element in every axis and before every n12.
overlap Causes Quantum to calculate T-statistics using a formula that takes into
account the fact that a respondent may be present in more than one of
the elements being tested.
scale=[/]n This option defines a scaling factor by which all cells in a table will
be multiplied (scale=n) or divided (scale=/n). For example, if you
have a table reporting the number of gallons bought, you can convert
these counts to liters by including the option scale=4.55 on the tab
statement.
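Using that example, and assuming axes named gallons and region, the
statement might be:
tab gallons region;scale=4.55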
stat=stat.name Defines the statistics to be calculated on the table.
useeffbase Causes the n19 standard error to be calculated using the weighted count
of respondents rather than the unweighted count.
☞ For further information, see the section entitled "The mean, standard deviation,
standard error and error variance" in chapter 19.
Many of these keywords act as switches, turning a particular feature on or off. For
instance, we noted above that the options page and type define defaults for printing the
page number and output type at the top of each page automatically, and are therefore
more commonly used in their negative form to switch off these facilities when they are
not required.
Some options can be switched off by preceding the keyword with the word no. Thus,
printing of page number and output type are switched off by the options nopage and
notype. Options in which the keyword is followed by an equals sign lose the equals sign
when no is added; thus, scale= becomes noscale.
In all cases except page, pc and type, the negative version of these keywords is the default
and would not normally appear on the a statement. You would probably use it on a
sectbeg, flt or tab statement to turn off the default for a particular table or group of tables.
This will be discussed more fully in the appropriate sections.
You may switch off global suppression of small absolutes, column or total percentages
by setting the suppression value to zero (e.g. smsupp=0). Again, you would normally do
this on a flt or tab statement to cancel a run-level option for a specific table or group of
tables only.
Often you will find that you need the same default options for a series of jobs. You can
either write an a statement for each job or you can set yourself up a run defaults file which
defines default options for one or more jobs.
The run defaults file should contain only an a statement listing any of the options
described above and depending on the file’s location, these defaults will refer either to all
jobs run at an installation or just to one particular job.
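For example, a run defaults file might contain nothing more than a line such as this
(the options shown are purely illustrative):
a;decp=0;notype;smsupp=5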
This section contains sample tables to illustrate more clearly the function of some of the
options described in this chapter. It is not important at this moment to understand how
the tables were created.
Total percentages
Our first example illustrates total percentages. These are calculated by percentaging each
cell against the total number of people in the most recent base. In this example there is
only one base so percentages are calculated using the total number of people in the table.
From this we can see that 21% of our sample were women between 21 and 34 years of
age. Note that we have also overridden the default of 1 decimal place for percentages and
that percentages are printed directly beneath absolutes. This table was produced by the
statement:
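A sketch of the kind of statement involved (the exact op= and decimal-place options
shown here are assumptions rather than the original specification) is:
tab age sex;op=1&;decp=0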
Page 1
Absolutes/total percents
Q2. Age
Base: All Respondents
Total Male Female
Base 605 341 264
56% 44%
11-20 yrs 120 73 47
20% 12% 8%
21-34 yrs 290 161 129
48% 27% 21%
35-54 yrs 146 81 65
24% 13% 11%
55+ yrs 49 26 23
8% 4% 4%
The next table is exactly the same except that it uses op=127 to give absolutes and
cumulative column percentages:
Page 2
Absolutes/col percents
Q2. Age
Base: All Respondents
Total Male Female
Base 605 341 264
11-20 yrs 120 73 47
20% 21% 8%
21-34 yrs 290 161 129
68% 69% 67%
35-54 yrs 146 81 65
92% 92% 91%
55+ yrs 49 26 23
100% 100% 100%
Indices
Our third table with op= shows indices created with op=128. This time we have used the
default of 1 decimal place for percentages. Notice, though, that the indices are only ever
shown as whole percentages. Each index is created by taking the column percentage for
the cell and dividing it by the percentage in the most recent base column. For instance,
the index of 108% for women aged 55 or more is created by dividing 8.7% (the column
percent) by 8.1% (the base column percent).
Page 3
Absolutes/col percents/indices
Q2. Age
Base: All Respondents
Total Male Female
Base 605 341 264
100% 100%
55+ yrs 49 26 23
8% 8% 9%
100% 94% 108%
Table of means
tab q7 ban1;means;dec=2
The row axis for this table was as follows. The meaning of each statement is explained
in the chapters on axes.
l q7
n01Age of Car;c=c121’1/4’;inc=c121
n01Price;c=c122’1/4’;inc=c122
n01Availability of Spare Parts;c=c123’1/4’;inc=c123
n01Reputation of Manufacturer;c=c124’1/4’;inc=c124
n01Mileage;c=c125’1/4’;inc=c125
n01Sound Bodywork;c=c126’1/4’;inc=c126
n01Reputation of Dealer;c=c127’1/4’;inc=c127
n01Extras (e.g. Radio);c=c128’1/4’;inc=c128
If a ’1’ in a column means that the item is not important, and a ’4’ in that column means
that it is very important, the total mean value for the first row (3.10) tells us that people
think the age of the car is quite important when buying a secondhand car.
Tables of the maximum or minimum values of inc variables can be generated in the same
way simply by replacing the keyword means on the tab statement with maxim or minim
as appropriate.
The keywords minim and maxim are also valid on individual statements in an axis. When
used in this way they create an element that shows the minimum or maximum values of
the inc= variable specified for that element. Here is an axis that creates three elements.
The first is the minimum value paid, the second is the maximum price paid, and the third
is the mean price paid (the n25 does not create a printed element):
l price
n01Minimum price paid;inc=paid;minim
n01Maximum price paid;inc=paid;maxim
n25;inc=paid
n12Mean price paid
✎ The minim and maxim calculations ignore means that are zero.
The axis is an integral part of your tabulation program: without it there can be no tables.
At its simplest level an axis represents a question on the questionnaire, and contains
statements which define the responses to that question and the codes by which Quantum
can identify them.
For instance, if we have an axis called region, we can use it to create tables in which each
row is a different region, or in which each column is a different region. We can also use
it in such a way that each region creates one or more pages in a group of tables.
Items in an axis are called elements and each element may generate one or more lines or
pages in a table. For example, when an axis is used to create the rows of a table, one
element may show the same set of figures presented in three different forms, say,
absolutes, column percentages and row percentages, to name the most common.
In this chapter we introduce some of the statements used in axes and tell you how to
define precisely which respondents should be included in which element.
Quick Reference
To name an axis and define any options applicable to the axis as a whole, type:
l name[; options]
All axes in a program must have a unique name. This name may be up to seven letters
and numbers long, and must start with a letter.
l axis_name
For example:
l product
anlev=level Defines the level at which the axis should be created. Only used when
data is processed with analysis levels.
☞ For further information about analysis levels, see the section entitled "Table and axis
analysis level" in chapter 28.
axreq=ctype Defines the coding requirements for the axis. The type of coding may
be:
none no requirements (default). May be used on the l statement
to override a different option on the a statement.
sc single-coded
nb not blank
☞ For further information, and an example of the output, see the section entitled "Data
options" in chapter 16.
byrows Exports grid axes on a row-by-row basis when exported in SAS format
in Quanvert.
c=logical expression
Defines the condition which must be met in order for a respondent to
be included in the axis.
clear=logical expression
Determines when flags should be reset in the intermediate table when
trailer cards are read.
colwid=n Specifies the output column width when the axis is used as a
breakdown (banner).
☞ To find out more about column widths and colwid=, read section 20.5.
dsp Causes the elements to be double spaced when the axis is used as the
rows of a table – that is, a blank line is printed between each row of the
table (see Figure 17.1).
figbracket Prints the character defined with figchar in front of each absolute, and
prints the corresponding closing bracket after each absolute.
☞ For further information on the fig group of options, see section 17.8.
☞ For information on ttord=, see the section entitled "Output options" in chapter 16.
inc=arith.expression
Causes cell counts in a table using this axis to be incremented by the
value of the arithmetic expression, rather than by 1 for each respondent
present in the axis.
If inc= is also present on the a/sectbeg/flt or tab statement, then the
increment on the l statement is applied in addition to that at the higher
level. For example:
tab region house;inc=c132
l region;inc=c(115,116)
☞ For a more detailed description, see the section entitled "Data options" in chapter 18.
missing=logical expression
Treat any record that satisfies the logical expression as a record with a
missing value. This option is usually used with inc=.
☞ For further information, see the notes on inc= in the section entitled "Data options"
in chapter 18.
☞ See the section entitled "Accumulation of suppressed elements by net level" for an
example and a more detailed explanation.
✎ For netsort to work, the keyword sort must be present on the same statement as
netsort or on a statement at a higher level. For example, to sort the nets in a single
table, place netsort on the l statement of the row axis and sort on the a, sectbeg, flt
or tab statement.
☞ For examples of nets and sorted nets, see section 17.6 and section 31.3.
☞ See the section entitled "Sorting with subsort and endsort" in chapter 31 for more
details about subsorts.
notstat Sets the default for the axis to be that elements are excluded from
special T-statistics.
numcode Flags an axis as being single coded. When Quantum encounters an axis
flagged in this way, it only allows space for single coding in the
datapass. This reduces the amount of temporary disk space required for
processing large axes during the datapass and accumulation stages of
the run (i.e., when the data is read and the table cell counts are
calculated), as well as when flipping databases for use with Quanvert.
Quantum flags axes as multicoded not only when the data on which
they are based is multicoded, but also when they contain net, ndi, nsw
or n25 elements.
✎ If you flag an axis with numcode, Quantum assumes that it is single coded and does
not check this during the datapass. If a record in the axis is then found to be
multicoded, only the first code is taken. The first code is determined by the order in
which codes are defined in the axis. For example, if the axis is:
col 123;Red;Blue;Green;Yellow;Orange;Black;Brown
and the record is multicoded with ’247’, only code 2 will be accepted so the record
will be treated as if the respondent chose blue only.
nz Causes elements in which all cells are zero to be omitted from the
printed tables.
tstat Include all elements of this axis in the special T-statistics. This is used
to set a default for the axis when only a few elements need excluding
from the statistics. If you neither include nor exclude elements from the
tests, Quantum will include all suitable elements.
uplev=level Defines the level at which the axis should be updated. Only used with
analysis levels.
Certain of these options may also appear on the a, sectbeg, flt or tab statements. To switch
off a global setting for an individual axis, you may precede the following options with
no. Options ending with = lose the = sign when preceded by no.
Text elements
These elements create nothing but text; no cells containing counts or values are
created from these elements.
Arithmetic elements
These are elements which contain arithmetic values rather than counts. For
example, one element may tell you the number of times a product was bought
rather than the number of people who bought it.
Statistical elements
These elements contain totals, subtotals or statistical functions such as means
and standard deviations.
Quick Reference
The general format of a condition is:
c=logical expression
Let’s take the question asking which color the respondent likes best. There are four
choices, Red, Blue, Green and Yellow, coded 1, 2, 3 and 4 respectively in column 25 of
card 1. This will generate four elements, one for each color. What we need to do is find
some way of telling Quantum that anyone with a 1 in column 25 of card 1 belongs in the
first element, while anyone with a 4 should go in the fourth element.
In chapter 5 we talked about various types of expression, one of which was the Logical
Expression which returns a value of true or false. We said that statements of the form
cn’p’ were logical expressions since the expression is true if column n contains the code
’p’ or false if it does not. This is just what we want because it means that we can write
c125’1’ to gather together all respondents having a ’1’ code in column 25 of card 1
(remember that with multicard records the last two digits are the column number and any
previous ones are the card type). If the respondent has a ’1’ in this column, the expression
is true. The respondent satisfies the condition for the element and is included in the counts
for it. If there is no ’1’ in c125 the expression is false and the respondent is rejected from
the counts.
Having found a way of defining the condition, we now need to present it in a way that
Quantum can understand. Quantum knows what c125’1’ means, but if you just write that
by itself Quantum will not know what to do with it. To show that this defines a condition
for an element we write c= (short for condition=) and then the expression, thus:
c=c125’1’
c=logical expression
☞ See section 5.2 for a reminder of the various forms a logical expression can take.
Types of conditions
Quick Reference
Special conditions unique to the tabulation section are:
c=+ respondents in any of the previous elements since the last base
A condition can be any valid logical expression. The conditions are written exactly as
they are in the edit. For example, a condition such as:
c=c234’12’
is read as an ‘or’ condition meaning that any respondent for whom column 234 contains
code ’1’ or code ’2’ or both is eligible for inclusion in the element created by this
condition. Any other codes in this column are ignored.
To specify that a respondent may be included if he has a specific code or set of codes
only, use the form c=c234=’1’. This means that the respondent is added into the counts
if c234 contains a ’1’ and nothing else. Notice here that the statement contains two equals
signs, one for the c= and one as part of the logical expression.
The logical expression c=c234n’12’ is used when the condition requires that the
respondent does not have a ’1’ or a ’2’ or both in column 234.
The expression need not be restricted to single columns. It is quite correct to write:
c=c(121,123)=$101$
if you mean to gather respondents who have a ’1’ in c121, a ’0’ in c122 and a ’1’ in c123.
You might do this when items have been coded with numbers rather than codes; a 101 in
c(121,123) could represent a 1971 Ford Escort car.
Other sorts of logical expression are valid as well. For example, the condition:
c=miles.gt.100
indicates that respondents are eligible for inclusion if the value of miles is greater than
100. The condition:
c=numb(c163,c171,c175).eq.1
only counts respondents having one code overall in columns 163, 171 and 175. That
means that one of those columns must be single-coded and the others blank for a
respondent to be eligible.
There are two special conditions which can be used to accumulate counts of respondents
who have or have not been included in any element since the last base in the axis:
c=- produces a count of all respondents eligible for inclusion in the axis who
have not been included in any element other than the base so far
c=-n counts respondents eligible for inclusion in the axis but not counted in the
previous n elements
c=+ counts respondents already included in one of the previous elements in the
axis
c=+n counts respondents already included in the previous n elements
c=+ and c=+n are often used to create ‘net’ elements in axes for questions with multiple
choice or open end responses while c=- and c=-n are generally used to deal with Don’t
Know and No Answer when there are no specific codes for these responses.
All of these options stop counting if they encounter an nsw element in the axis. These are
elements inserted automatically by Quantum if your run requests special T-statistics.
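For example, in an axis such as this (texts and column references are illustrative), the
last two elements use these special conditions:
l aware
n10Base
n01Aware of Brand A;c=c110’1’
n01Aware of Brand B;c=c110’2’
n01Aware of Brand C;c=c110’3’
n01Aware of Any Brand (Net);c=+3
n01Not Aware of Any Brand;c=-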
Count-creating elements are the basis of any table since they tell you how many
respondents gave which responses. There are several statements which will create
numeric elements; which you use will depend upon the type of data to be read and the
complexity of the condition defining eligibility for inclusion in the element. Statements
are:
Quick Reference
To define one count-creating element, type:
n01[element_text] [;options]
n01[Text];[Options]
Each n01 in an axis will create one row/column/page in the table. In a row axis, this
element may consist of several lines, depending on the types of figures requested.
Conditions were explained in section 17.4 above, so let’s now look at a sample table to
see how the elements were created.
Married 122 27 95
61.0% 61.4% 60.9%
Divorced 33 10 23
16.5% 22.7% 14.7%
Widowed 1 1 0
.5% 2.3% 0%
Do not worry about how this whole table was created; for the time being we are only
concerned with how to create the rows entitled Single, Married, Divorced and Widowed,
and the columns named Male and Female. We will come on to how to create the other
elements later in this chapter.
First, let’s assume that marital status is coded as ’1’ to ’4’ in column 109 and that sex is
a ’1’ or a ’2’ in c106. Next we need to name our axes. We’ll call them mstat and sex so
that we know straight away which questions they refer to. To set up the mstat axis, we
write:
l mstat
n01Single;c=c109’1’
n01Married;c=c109’2’
n01Divorced;c=c109’3’
n01Widowed;c=c109’4’
We can deal with the sex axis in exactly the same way:
l sex
n01Male;c=c106’1’
n01Female;c=c106’2’
The first n01 in the axis mstat defines the element text as ‘Single’ and the condition as
c109’1’. As you can see, the element starts with the given text. Notice that it is printed
exactly as it was written in the axis. If we had wanted it all in upper case or indented by
two spaces, we would have had to write it in upper case or precede it by two spaces on
the n01 statement. Normally a semicolon separates the element text from the element
conditions. If you want a semicolon as part of the element text, type in a backslash (\)
before the semicolon, thus:
n01Hotels\;Guest Houses;c=c15’12’
✎ Quantum normally allows 24 characters per line for text. Shorter texts are padded
with blanks, longer ones are split at the nearest blank, hyphen (–) or slash (/) and are
continued on the next line. You may reset the amount of space allocated to side texts
using the option side= on the a/sectbeg/flt/tab statement, or you can split long texts
manually as described in section 17.5.
☞ See the section entitled "Output options" in chapter 16 for information about setting
side text widths with side=.
For details on splitting long row texts manually, see section 17.5
To find out how to set break points in element texts in column axes, read section
20.3
The condition for single people states that only respondents having a ’1’ in c109 will be
included in the counts. Exactly which of those people is included in each cell of the row
depends upon the column conditions. Cells in a table are created by the intersection of a
row with a column. This creates an ‘and’ condition since the respondent must satisfy
both the row and the column conditions to be included in that cell. Take, for example, the
elements Single and Male. The cell created by their intersection has the condition:
c109’1’.and.c106’1’
There are six respondents satisfying this condition, so we have six men who are single.
So far, we have dealt with simple conditions – those with one column and one code only
– but you may write conditions of any complexity. Here is an example.
Suppose our questionnaire contains two awareness questions, one for awareness of the
product and the other for awareness of advertising about it. Awareness of the product is
tested using aided and unaided responses, with the first unaided response being coded
separately from other unaided responses. Awareness of advertising is also aided and
unaided, but there is no distinction between first and subsequent mentions. Here is part
of the questionnaire:
n01Aware of Sparkle;c=c110’1’.or.c111’1’.or.c112’1’
n01Aware of Gleam;c=c110’2’.or.c111’2’.or.c112’2’
.
n01Aware of Washo Advertising;c=c113’4’.or.c114’4’
As you can see, the conditions for each element are quite long and require careful typing
to collect the appropriate respondents. A more efficient way of writing such conditions
is to merge the codes in columns 110, 111 and 112 into a spare column in the edit as
follows:
ed
/* c181 = aware of product at all
c181 = or(c110,c111,c112)
.
end
.
l q3
n01Aware of Sparkle;c=c181’1’
n01Aware of Gleam;c=c181’2’
.
Here we are saving in c181 any codes which are present in at least one of the columns
c(110,112): that is, any brand that the respondent is aware of, either spontaneously or
after prompting. We then use this variable to determine which respondents are collected
into each element. In our example we have used the or operator, but this method works
equally well for and and xor.
☞ See the section entitled "Assignment with and, or and xor" in chapter 8 for further
details on these operators.
Producing tables for product tests can often be simplified by copying data to different
cards according to the order in which the products were tried. If we take the previous
example, in which half the respondents tried A then B while the rest tried B then A, the
only way of finding which product the respondent was talking about was to look at the
code in column x telling us which product was tested first.
This can lead to unnecessarily complex and lengthy specifications. One of the simplest
solutions to this is to copy data for each group of respondents to a different card, and to
reorganize the order of the data for one set of respondents so that answers about product
A always precede answers about product B, regardless of the order in which they were
tried. Once this recoding is done, the tabulation of the data becomes straightforward.
The questionnaire simply refers to the first and second product tried, but the client wants
to know whether respondents preferred Brand A or Brand B for each task. He also wants
to know whether the item preferred depends upon whether or not it was tried first.
Here is an example of how to write an edit to shift the data and a table specification using
the new card.
ed
/*A then B : copy in existing order
if (c118’1’) c(401,480)=c(101,180)
/*B then A : reorganize data as it is copied
if (c118’2’) c(401,419)=c(101,119); c420=c121; c421=c120; c422=c123;
+ c423=c122; c424=c125; c425=c124; .... ; c(435,480)=c(135,180)
/* rest of edit follows using c(401,480) etc.
end
a;dsp;decp=0;spechar=-*;flush
tab tests order
ttlPreference for Selected Characteristics
l tests
n10Base
n23Washing Woolens
n01Noticed a Difference;c=c427’2/4’
n01 Prefer Product A;c=c427’2’
n01 Prefer Product B;c=c427’3’
n01 No Preference;c=c427’4’
n01Did Not Notice a Difference;c=c427’5’
l order
n10Total
n01Tried A First;c=c418’1’
n01Tried B First;c=c418’2’
Prod A Prod B
Total First First
----------------------------------
Base 330 | 150 | 180
| |
Washing Woolens | |
Noticed a Difference 200 | 97 | 103
61% | 65% | 57%
| |
Prefer Product A 90 | 44 | 46
27% | 30% | 26%
| |
Prefer Product B 82 | 36 | 46
25% | 24% | 26%
| |
No Preference 28 | 17 | 11
9% | 11% | 6%
| |
Didn’t Notice a 130 | 53 | 77
Difference 39% | 35% | 43%
If we had not copied the data to other cards, the definition of someone preferring product
A for washing woolens would have been:
Quick Reference
To define a non-printing count-creating element, type:
n15[text] [;options]
This statement acts exactly like an n01 except that the numbers it generates are not
printed in the table. However, these figures are used in various statistical calculations and
totals.
Creating a base
Quick Reference
To create a printing base element, type:
n10[text] [;options]
n11[text] [;options]
In most tables, including the sample table at Figure 17.1, the first row and column
contain totals. These are the total number of respondents eligible for inclusion in that row
or column. The intersection of the base row and the base column is the table base – that
is, the total number of respondents eligible for inclusion in the table as a whole.
Notice that we say ‘eligible for inclusion in the table’ rather than actually in the table.
Bases are not totals and should not be confused with them. If everybody who is eligible
for inclusion in the table is, in fact, included, the base and the total may well be the same,
but this is not always the case.
In a table of marital status by sex, the base is generally the total respondents in the table
since everyone has a sex and a marital status. But, if we omitted the element for Males,
our base would still be the total number of respondents, even though the table would only
contain women.
The purpose of a base element is to define a set of figures against which figures in
subsequent elements can be percentaged. With this in mind, let’s look at the sample table
above which has a base row and a base column. The base row has three cells, the latter
two showing us the total numbers of men (44) and women (156) eligible for inclusion in
the table. These are created by the combination of the conditions ‘everyone’ (from the
base row) and ‘male’ and ‘female’ from the column axis. The column base shows the
total number of single, married, divorced and widowed respondents eligible for inclusion
in the table. The table informs us that there are 10 divorced men. This represents 22.7%
of all men in the table (10/44∗100=22.7). This is a column percentage.
There are two statements which create a base: n10 and n11. They are formatted as
follows:
n10[Text];[Options]
n11[Text];[Options]
l sex
n10Base
n01Male;c=c106’1’
n01Female;c=c106’2’
✎ Any table in which percentages are required must have an appropriate base
otherwise no percentages will be calculated. By ‘an appropriate base’ we mean a
base row for column percents (op=2) and a base column for row percents (op=0).
Both bases are needed if you want total percentages (op=&).
There are three statements which are used within an axis to create text-only elements.
These are n03, n23 and n33.
All are formatted in the same way, with the text required starting immediately after the n
statement number (e.g., n23Heading Text). If the text is to be indented, precede it with
spaces. A limited set of options are valid on n03 and n23 statements, most of which may
also be used on other types of count-creating statements.
Options unique to a particular text-creating element are discussed below in the relevant
section.
Quick Reference
To create a text-only element, type:
n03[element_text] [; options]
The n03 statement creates a new row of text whenever the axis is used as a row axis. It is
ignored if the axis is used any other way, that is, as a column or higher dimensional axis.
n03 statements are often used with no text to create extra spacing within tables. In our
sample table above, the blank line between the base row and the row for single
respondents was created with an n03. The axis mstat now looks like this:
l mstat
n10Base
n03
n01Single;c=c109’1’
n01Married;c=c109’2’
If you use an axis containing n03 statements to form the columns of a table, the n03s are
ignored. Thus, the axis in the previous example will create a table whose first column is
the Base and whose second is for single people.
Do not use an n03 to continue long texts from an n01 statement. An n03 creates a
completely new row in the table, which means that if the row created by the n01 consists
of several lines (e.g., absolutes and percentages) the text on the n03 will be printed on the
line after the last line of figures for the n01.
✎ Use an n33 for continuation as described in the section entitled "Text continuation
statements" later in this chapter.
Quick Reference
To suppress text-only rows if all count-creating elements in the block are suppressed,
type:
n03[element_text];nz
Text-only rows created with n03 statements can be suppressed if all count-creating rows
between them and the next text-only element, the next base or the end of the axis,
whichever comes first, are also suppressed. This facility is not applicable in column axes
since n03 columns are always ignored.
To flag an n03 as eligible for suppression, use the option nz. For example:
n03Preferred green;nz
When Quantum decides whether to print n03s flagged in this way it considers two things:
a) It goes to the first text-only row and scans the following elements to see whether this
n03 is one of a block. If so, it checks whether all n03s in the block are flagged with
nz. If they are not, Quantum marks all the texts as printable and skips to the next n03.
b) If all elements in the text-only block are flagged with nz, Quantum then scans all
count-creating elements between that block and the next text-only element or the next base
which follows at least one row of non-base figures, whichever comes first. If all those
count-creating elements are suppressed, Quantum suppresses the text-only elements
as well.
These two steps are repeated until the end of the axis.
n10base
n03First group. Both rows and all;nz
n03elements flagged for suppression;nz
col 10;one;%nz;two;%nz;three;%nz
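The remaining three groups referred to below follow the same pattern; they might look
like this (texts and column numbers are illustrative; what matters is which elements
carry nz):
n03Second group. Both texts flagged but;nz
n03second col element never suppressed;nz
col 11;one;%nz;two;three;%nz
n03Third group. First text flagged;nz
n03but second text is not
col 12;one;%nz;two;%nz;three;%nz
n03Fourth group. Neither text
n03flagged for suppression
col 13;one;%nz;two;%nz;three;%nz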
In the first group, the two text-only rows will be suppressed if rows one, two and three
are suppressed.
In the second group, the text-only elements will never be suppressed because the second
element on the col statement is never suppressed.
In the third group, the first text-only element will never be suppressed even if all rows in
the group are suppressed because the second n03 has no nz.
In the fourth group the text-only elements will never be suppressed because they are not
flagged with nz.
Quick Reference
To place text in the main body of the table, type:
n03;coltxt=text
You may also use an n03 to place text in the body of the table (i.e., above or below the
cell values). A simple example would be to print a row of hyphens above a total or
subtotal row to separate it more clearly from the rows which are included in it. The text
to print is defined on an n03 statement with the option keyword coltxt=, as the example
below shows.
Suppose our row axis counts self-employed people. The data has been coded to the
nearest 1,000, so that a 1 in column 132 means that there were up to 999 self-employed
people. The axis is as follows:
l people
n10Base
n03Thousands of people
n03;coltxtr+2=000’s
n01Under 1000;c=c132’1’
n011000-2000;c=c132’2’
n012001-3000;c=c132’3’
n013001-4000;c=c132’4’
n014001-5000;c=c132’5’
n03;coltxt=----
n04Total
If we tabulate this against an axis defining the region in which the survey was carried out,
we can see, to the nearest thousand, the number of self-employed people in each region.
Notice how we have used coltxt to print headings at the top of each column of figures
(underneath the base) and again before the total row. The first coltxt element is offset to
the right of the figures by one column, whereas the second one uses the default of right
justification.
Base 50 10 14 10 16
Thousands of People
000 ’s 000 ’s 000 ’s 000 ’s 000 ’s
Under 1000 7 1 2 2 2
1000-2000 7 1 3 0 3
2001-3000 3 1 1 0 1
3001-4000 3 1 1 1 0
4001-5000 5 1 1 1 2
---- ---- ---- ---- ----
Total 25 5 8 4 8
Quick Reference
To define a subheading element, type:
n23[heading_text] [;options]
hdlev= allows you to define a hierarchy among subheadings if the axis contains
subheadings at different levels. If heading_text is unsuitable for use when the axis forms
the columns of a table, define a new heading with toptext=.
Sometimes it is nice to give the axis a heading as we have done in Figure 17.1. Here the
axis headings are ‘Marital Status’ and ‘Sex’. These were written using n23 statements.
As you can see, when the axis is used as a row axis, the heading is printed as a row of the
table, at the point at which it appears in the axis definition. In our table it follows the base
element because we wrote the n23 statement after the n10 base-creating element:
l mstat
n10Base
n03
n23Marital Status
n01Single;c=c109’1’
n01Married;c=c109’2’
n01Divorced;c=c109’3’
n01Widowed;c=c109’4’
When the axis is used to create columns, the heading is printed above the individual
column headings and is separated from them by a blank line. If the axis is used at a higher
level (e.g., as a third dimensional axis) the axis heading becomes the title of each
multidimensional table.
If you would like subheadings to be underlined, place one of the options unl1, unl2 or
unl3 on the n23. unl1 underlines the complete text, unl2 underlines everything except
blank strings, and unl3 underlines non-blanks only. Underlining for row subheadings is
done by overprinting the heading text with underscore characters and looks like true
underlining. With column subheadings, Quantum replaces the blank line that it would
normally print after the subheading line with a line of hyphens extending across all the
columns to which the subheading refers.
When an axis is used as a higher dimension and the subheading becomes a table title the
request for underlining is ignored.
If a column axis contains more than one n23, the text for each one is centered above the
elements between it and the next n23 or the end of the axis. The hdlev= keyword allows
you to define various levels of subheading, starting at level 1 for the top subheading down
to level 9 for the lowest level. Quantum uses these level numbers to determine the order
of precedence amongst the n23 texts in the axis and hence the text’s position in the
column headings. For example:
l ban01
n23Visitors to the Museum;hdlev=1
n10Base
n23Sex;hdlev=2
col 110;Male;Female
n23Age;hdlev=2
col 111;11-20=’12’;21-34=’34’;35-54=’56’;55+=’78’
This example has one level 1 heading and two level 2 headings. The level 2 headings are
at a lower level than the level 1 heading, so they are printed beneath that heading. At both
levels, the headings are centered across the columns to which they refer so, for instance,
the level 2 subheading Sex will be printed centrally above the columns for Male and
Female, while the level 1 heading Visitors to the Museum will be printed centrally above
all columns. Here is an illustration of how these headings might be printed (the exact
layout depends on the width of the side text and the page width):
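It might look something like this (an approximation; column widths are illustrative):

                    Visitors to the Museum

                Sex                      Age
           Male    Female     11-20   21-34   35-54   55+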
This example starts at level 1 and uses sequential numbers for the lower level headings.
This is not a requirement. As long as the lower level headings have a larger level number
than the higher level headings, you may use any numbering system you like. Possible
substitutes for this example would be 1 followed by 5, 2 followed by 3, or 3 followed by
7, to name but a few. Levels of the form 9 followed by 1 are invalid because the higher
level has a larger number than the lower level.
Headings which are too long to fit in a column or across a group of columns, can be
defined on a block of n23 statements all at the same level so that they will be printed one
below the other. The heading Visited Museum Before in the example below illustrates
this point.
l ban01
n23Visitors to the Museum;hdlev=1
n10Base
n23Sex;hdlev=2
col 110;Male;Female
n23Age;hdlev=2
col 111;11-20=’12’;21-34=’34’;35-54=’56’;55+=’78’
n23Visited;hdlev=2
n23Museum Before;hdlev=2
col 116;Yes;No
Entering blocks of n23s in this way does, of course, mean that you’ll also have two lines
of subheading in the rows if you use the axis as a row axis. If this is not satisfactory, you
may wish to consider using the toptext= option to define different headings for row and
column texts for that element. For example, to replace the two-line heading Visited
Museum Before with the single line Been Before, you would write:
Each heading or block of headings at the lowest level must be followed by some elements
which produce counts; for example, basic elements such as n01 or col, or statistical or
totalling elements such as n12 or n04. In our example above, each group of level 2
subheadings is followed by a col statement.
Headings at higher levels need not be followed by count creating elements and are
therefore useful for creating extra lines of subheading in the middle of the headings.
Quick Reference
To define the justification of a subheading above its columns when the axis forms the
columns of a table, type:
n23heading_text; hdpos=x
where x is l for left justification, r for right justification or c for centered text (the default).
Unless directed otherwise, Quantum centers n23 texts above the elements to which they
refer. If you would prefer the text to be left justified above the columns to which it refers,
add the option hdpos=l to the n23. If you would prefer the text to be right justified, use
hdpos=r instead. (hdpos=c is also available for centered text but since this is the default
you are unlikely to need it). The example below uses hdpos=l for the main heading and
the subheadings Sex and Age, and hdpos=r for Visited Museum Before:
l ban01
n23Visitors to the Museum;hdlev=1
n10Base
n23Sex;hdlev=2;hdpos=l
col 110;Male;Female
n23Age;hdlev=2;hdpos=l
col 111;11-20=’12’;21-34=’34’;35-54=’56’;55+=’78’
n23Visited;hdlev=2;hdpos=r
n23Museum Before;hdlev=2;hdpos=r
col 116;Yes;No
Quick Reference
To continue a long element text on an n01 or n10 statement, type:
n33continuation_text
The n33 is used to continue long texts from an n01 or an n10. We have already said that
Quantum splits long texts automatically at a blank, hyphen or slash, but this may not
always provide an acceptable solution. Using an n33 means that you can write your
element texts exactly as you want them to appear in your table. As with n01s, texts on
n33s which are longer than the element-text width (side=) will be split at a blank, slash
or hyphen.
If the n01 creates more than one line of figures, the n33 text will be printed adjacent to
the second line of figures. Additionally, if the table is sorted (ranked) the n33 will remain
with the statement whose text it continues rather than being ignored or sorted to the end
of the table.
An n33 placed immediately after an n03 (rather than an n01 or an n10) will be ignored.
17.6 Netting
Quick Reference
To create an element which nets respondents present in previous elements, type:
netnet_level[element_text] [;options]
endnetnet_level
netendnet_level
Nets are generally used with multicoded responses to show how many people answered
rather than how many answers they gave. For example, if five people said the product
was badly made and expensive, and these two comments were coded separately, the table
would show two lines each containing five people. A net line including people who gave
one response or the other or both would tell us that five people thought the product was
badly made or expensive or both.
Nets may follow or precede the lines to which they refer. Nets which come before the
lines to be netted are created by the statement:
netn[Text];[options]
where n is the net level number and options are any options valid on elements, except c=.
The first net in an axis must be net1. Any respondent fulfilling the conditions for at least
one of the subsequent elements will be added into this net.
There are three ways of ending a net. The first way is to add the option endnetn, to the
last element in the net, where n is the net level number, thus:
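For example (the column reference and element text are illustrative):
n01No Preference;c=c140’3’;endnet1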
The second is to follow the last element in the net with a netendn statement, where n is
the level number of the net to be terminated:
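For example (again with an illustrative element):
n01No Preference;c=c140’3’
netend1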
The third method is to terminate the net by starting a new net with a net statement with
the same or a higher level number. For example, a net3 statement terminates nets at levels
3, 4, 5, and so on; a net1 statement terminates nets at all levels:
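For example, an axis along these lines (the axis name, the fragrance elements and their
column references are illustrative; the efficacy conditions are those quoted later in this
section):
l comments
n10Base
net1Efficacy Comments
n01Cleans Well;c=c132’2’
n01Cleans Automatically;c=c132’9’
n01Don’t have to Scrub;c=c133’1’
.
net1Fragrance Comments
n01Smells Fresh;c=c134’1’
n01Smells Fruity;c=c134’2’
netend1
n03
n01No Comments;c=-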
Here, the net entitled ‘Efficacy Comments’ counts respondents who satisfy any of the
conditions up to the net entitled ‘Fragrance Comments’. Each respondent is counted once
only, regardless of the number of comments he makes. The second net counts
respondents who comment about the product’s fragrance. This time the net is terminated
by a netend statement (an endnet1 option would have been equally acceptable) because
the next element is not a net element.
Nets may contain subnets nested up to nine levels deep (net2 to net9). As with top-level
nets, each subnet may be terminated by:
a) the option endnet on the last element in the subnet
b) a netend statement
c) another net at a higher level, that is, with a lower level number (e.g., net1 terminates net2)
d) another net at the same level (e.g., net2 followed by another net2)
Any respondent who is part of a subnet is automatically included in the net for the parent
level. Therefore, if net2 is a subnet of net1, everyone who is part of net2 is automatically
part of net1 as well. If endnet or netend terminate two or more nets at different levels, the
level number must be that of the highest level (i.e., endnet1 or netend1 to terminate net2
and net1). Let’s look at an example.
l q27;hd=Orange Juice
n10Base
n03
net1Favorable Comments;unl1
net2Packaging
n01 Liked Bottle;c=c(123,124)=$12$
n01 Liked Label;c=c(123,124)=$13$
n01 Liked Color;c=c(123,124)=$15$
n03
net2Taste
n01 Tasted Sweet;c=c(123,124)=$22$
n01 Tasted Fizzy;c=c(123,124)=$23$
n01 Refreshing;c=c(123,124)=$25$
n03
net1Unfavorable Comments;unl1
net2Packaging
n01 Too Big;c=c(123,124)=$51$
n01 Not Strong;c=c(123,124)=$52$
n03
net2Texture
n01 Too Thick;c=c(123,124)=$57$
n01 Too Many Bits;c=c(123,124)=$60$
netend1
n03
n01No Comments;c=-
Here, the net for favorable comments has two independent subnets. All respondents
commenting favorably about the packaging are included in the first subnet, and everyone
commenting favorably about the taste is included in the second. Anyone who is
present in one or both subnets is automatically included in the overall net for favorable
comments. The same principle applies to the net for unfavorable comments and its
subnets.
Notice that the last element of the Texture subnet is followed by a netend1 statement
because we are terminating a net1 and a net2.
Sometimes you will want the net row printed underneath the elements which it includes.
Nets of this type can either have the condition written in full:
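For example, using the codes listed in the note at the end of this passage, the full
condition would be something like:
n01All Efficacy Comments (Net);c=c132’29’.or.c133’1/79’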
or you can use the shorthand c=+n. Using the previous example we would write:
n01Cleans Well;c=c132’2’
n01Cleans Automatically;c=c132’9’
n01Don’t have to Scrub;c=c133’1’
.
n01All Efficacy Comments (Net);c=+10
We have used c=+10 because c132’29’ and c133’1/79’ refer to ten lines.
Quick Reference
To have element texts for subnets indented automatically, type:
netsort[=spaces_per_level]
In the example we have just shown, we have started the element texts for the subnets with
spaces, so that when the table is printed, the row texts for those elements are indented.
Quantum can do this for you automatically.
If you place the keyword netsort on the a or l statement, nets at each level below level 1
will be indented by 2 spaces per level. Thus, nets at level 2 are indented by 2 spaces (1×2
spaces); nets at level 3 are indented by 4 spaces (2×2 spaces), and so on. The elements
comprising each net are indented by an additional two spaces.
✎ Unlike most options that are valid on the a statement, netsort is not valid on sectbeg,
flt or tab statements.
If we rewrite the example about orange juice using netsort, the axis will look like this:
l q27;hd=Orange Juice;netsort
n10Base
n03
net1Favorable Comments;unl1
net2Packaging
n01Liked Bottle;c=c(123,124)=$12$
n01Liked Label;c=c(123,124)=$13$
n01Liked Color;c=c(123,124)=$15$
n03
net2Taste
n01Tasted Sweet;c=c(123,124)=$22$
n01Tasted Fizzy;c=c(123,124)=$23$
n01Refreshing;c=c(123,124)=$25$
n03
net1Unfavorable Comments;unl1
net2Packaging
n01Too Big;c=c(123,124)=$51$
n01Not Strong;c=c(123,124)=$52$
n03
net2Texture
n01Too Thick;c=c(123,124)=$57$
n01Too Many Bits;c=c(123,124)=$60$
netend1
n03
n01No Comments;c=-
Favorable Comments
  Packaging
    Liked Bottle
    Liked Label
    Liked Color
If you want to indent by something other than 2 spaces per level, use the option netsort=n,
where n is the number of spaces by which to indent. To turn off netsorting for an
individual table when netsort has been specified on the a statement, use either nonetsort
or netsort=0 on the l statement for that table’s row axis.
Quick Reference
To define a text-only net element, type:
nttnet_level[element_text]
Sometimes you will want to group a number of elements together under a net heading,
but will not want to see any figures printed for that row. This usually happens when you
have a group of miscellaneous comments which you want to list at the end of the table.
There are two ways of dealing with this. The first is to write a standard n03 with the text
required; the second is to use an ntt statement as follows:
nttnet_level[element_text]
For example:
ntt1Miscellaneous Comments
This creates a text-only net heading at level n in which the text is indented by the number
of spaces appropriate to that level. For instance, with netsort (or netsort=2) an ntt1 is not
indented, an ntt2 text is indented by 2 spaces, and an ntt3 text is indented by 4 spaces.
Although there are no figures associated with this element, any elements between this
statement and a corresponding endnet option or netend statement will be included in any
nets at lower levels. Thus, elements following an ntt2 statement will be included in the
net created by the previous net1.
If the table is sorted, the elements in the ntt group will be sorted although the group as a
whole, including the ntt element will retain its original position in the axis.
☞ For an example of this, see the section entitled "Sorting with subsort and endsort" in
chapter 31.
Quick Reference
To control how suppressed elements are accumulated into smsup+ elements in tables of
nets, use netsm and nonetsm on the a, sectbeg, flt, tab or l statement.
To have net levels honored so that an smsup+ element contains only suppressed elements
for that element’s level, use netsm. To ignore net levels and include suppressed elements
in the next element with smsup+, use nonetsm.
When all cells in an element are less than a given value, Quantum can suppress that
element and, optionally, add the suppressed values into the corresponding cells of
another element. The options which define the value below which figures will be
suppressed are smsupa, smsupp and smsupt. smcol and smrow determine whether small
suppression is required for rows, columns, or both.
The option which marks the row into which suppressed elements will be added is
smsup+. If this option is not present, suppressed elements will not appear in the table at
all.
☞ All sm options except smsup+ are described in the section entitled "Output options"
in chapter 16.
When an element definition includes smsup+, Quantum adds into it not only the records
which satisfy the element’s condition, but also any previous elements which have been
suppressed since the start of the table, the most recent base, or the most recent element
with smsup+.
The same principle applies to tables of nets. Let’s take the axis below as an example:
l netaxis;c=numb(c232,c233).gt.0
n10Base
net1Color (Net)
n01Green;c=c232’1’
n01Yellow;c=c232’2’
net2Red (Net)
n01Scarlet;c=c232’3’
n01Vermillion;c=c232’4’
n01Crimson;c=c232’5’
n01Other reds;c=c232’6’;smsup+
net2Blue (Net)
n01Navy;c=c232’7’
n01Royal;c=c232’8’
n01Sky;c=c232’9’
n01Other blues;c=c232’0’;endnet2;smsup+
n01Other colors;c=c233’-&’;endnet1;smsup+
n01Other comments;c=c233n’ ’
Each net and subnet ends with an element for shades not mentioned specifically in the
net. These elements include smsup+ so that any elements suppressed in that net will be
added into the cell counts for these elements. However, if you create a table with this axis
as it is here, you’ll find that Quantum takes no account of the net levels and simply adds
suppressed elements into the next element with smsup+. Thus, if the Green or Yellow
elements are suppressed, they will be added into the element for Other Reds which is not
what is required. They will not be included in the Other Colors element which is where
you’d expect them to go.
When you use nets and smsup+ together, you should also use one of the options netsm
or nonetsm to indicate how you want Quantum to deal with suppressed elements. These
options may be used on a/sectbeg/tab/flt/l statements, but the option on the l statement
overrides options on any of the other statements.
With netsm, Quantum honors the different net levels and only includes suppressed
elements in an smsup+ element at the appropriate level. In the example above, Other reds
will include records which have c232’6’ and any suppressed red elements; Other blues
will include records which have c232’0’ and any suppressed blue elements; Other colors
will include records with c233’-&’ and also Green or Yellow if they are suppressed. With
nonetsm, Quantum ignores the net levels when placing suppressed elements in smsup+
rows.
When you request percentages on a table of nets, Quantum calculates the percentages
against the number of respondents in the base rather than on the number of responses in
the net. If you want percentages based on net figures there are a number of things you can
do.
The first option is simply to place the keyword base on the net element. This flags the net
element as a base and percentages for all elements that come after this net will be
calculated using the net figure as a base. For example:
l opinion
n10Base
n03
net1Favorable Comments;base
fld c123 :2;Liked bottle=12;Liked label=13;Liked color=14;
+Tasted sweet=22;Tasted fizzy=23;Refreshing=24
n03
net1Unfavorable Comments;base
fld c123 :2;Too big=51;Not strong=52;Too thick=57;Too many bits=60
Here the two level 1 nets are flagged as bases. This means that the percentage for
favorable responses about the bottle will be calculated using the number of respondents
making favorable comments of any sort as a base, rather than against the total number of
respondents in the axis. The same is true for unfavorable comments where, for instance,
the percentage for the comment ‘Too big’ is calculated against the number of people
making unfavorable comments.
Base 126 59 67
Favorable Comments 74 45 29
Liked bottle 19 13 6
25.7% 28.9% 20.7%
Liked label 15 5 10
23.3% 11.1% 34.5%
Liked color 24 16 8
32.4% 35.6% 27.6%
Tasted sweet 15 10 5
20.3% 22.2% 17.2%
Tasted fizzy 10 7 3
13.5% 15.6% 10.3%
Refreshing 10 7 3
13.5% 15.6% 10.3%
Unfavorable Comments 51 26 25
Too big 20 11 9
39.2% 42.3% 36.0%
Not strong 18 9 9
35.3% 34.6% 36.0%
Too thick 10 4 6
19.6% 15.4% 24.0%
Too many bits 8 6 2
15.7% 23.1% 8.0%
With this method you have percentages calculated against the net figures, but you have
lost the percentages of the favorable and unfavorable comment nets against the table
base.
An alternative is to create the table using op=126 and to place base on the net elements.
With this method the net elements have percentages calculated against the table base,
while other elements have two percentages, one calculated against the table base and the
other calculated against the appropriate net.
A third choice is to define nonprinting base rows immediately after the net elements:
l opinion
n10Base
n03
net1Favorable Comments
net2;base;norow
fld c123 :2;Liked bottle=12;Liked label=13;Liked color=14;
+Tasted sweet=22;Tasted fizzy=23;Refreshing=24
n03
n11;base
net1Unfavorable Comments
net2;base;norow
fld c123 :2;Too big=51;Not strong=52;Too thick=57;Too many bits=60
This gives you one percentage for each row. The net rows are percentaged against the
table base and the other elements are percentaged against the preceding net:
Base 126 59 67
Favorable Comments 74 45 29
58.7% 76.3% 43.3%
Liked bottle 19 13 6
25.7% 28.9% 20.7%
Liked label 15 5 10
23.3% 11.1% 34.5%
Liked color 24 16 8
32.4% 35.6% 27.6%
Tasted sweet 15 10 5
20.3% 22.2% 17.2%
Tasted fizzy 10 7 3
13.5% 15.6% 10.3%
Refreshing 10 7 3
13.5% 15.6% 10.3%
Unfavorable Comments 51 26 25
40.5% 44.1% 37.3%
Too big 20 11 9
39.2% 42.3% 36.0%
Not strong 18 9 9
35.3% 34.6% 36.0%
Too thick 10 4 6
19.6% 15.4% 24.0%
Too many bits 8 6 2
15.7% 23.1% 8.0%
Quantum provides facilities for defining an axis in which some of the elements are
combined to form a subaxis (or pseudo-axis) which may be cross-tabulated like any
standard axis. For example, the main axis may contain all the comments made about a
new type of chocolate bar, but you may also need to produce some tables which include
only comments about the flavor, color or price of the product. Instead of creating a
separate axis for these comments, you can just flag the appropriate elements with a group
name in the main axis. The subaxes may then be tabulated in the usual way by using the
group name as the axis name on the tab statement.
Defining subaxes
Quick Reference
To mark the start of a subaxis, type:
groupbeg groupname
To mark the end of a subaxis, type:
groupend groupname
If the elements forming a subaxis are scattered throughout the axis, flag each one with
the option:
group=groupnames
There are two ways of allocating elements to a subaxis, depending on whether or not the
elements come one immediately after the other in the main axis. If the elements forming
a subaxis do come one after the other, they can be grouped by placing the statement:
groupbeg groupname
immediately before the first element in the group, and the statement:
groupend groupname
immediately after the last element in the group. For instance:
l allchoc
n10Base
n01Trouble undoing wrapping;c=c142’8’
n01Didn’t like plastic wrapping;c=c141’1’
groupbeg taste
n01Too sweet;c=c141’4’
n01Chocolate coating was too sweet;c=c141’7’
n01Chocolate coating was awful;c=c141’0’
n01Not sweet enough;c=c141’5’
n01No taste at all;c=c141’3’
groupend taste
n01DK/NA;c=c(141,142)=$ $
Here a subaxis containing comments about the taste of the chocolate bar has been
defined. It can be used as an axis in its own right by naming it on a tab statement, or its
elements can be treated as part of the main axis when that axis is used in a table. In the
latter case, the groupbeg/end statements will be ignored.
If an element forms the start (or end) of more than one group, the group names may be
listed on the same statement, separated by commas, thus:
groupbeg taste,coating
n01Chocolate coating was too sweet;c=c141’7’
n01Chocolate coating was awful;c=c141’0’
Quantum allows you to enter the same group name for several blocks of elements in the
main axis. This provides for occasions when a subaxis comprises more than one set of
consecutive elements. An example might be when you want to create a subaxis of
comments about the chocolate bar itself, but to exclude comments about the wrapping or
the price:
l allchoc
n10Base
n01Trouble undoing wrapping;c=c142’8’
n01Didn’t like plastic wrapping;c=c141’1’
groupbeg bar
n01Too sweet;c=c141’4’
n01Not sweet enough;c=c141’5’
n01No taste at all;c=c141’3’
groupend bar
n01Too expensive;c=c142’1’
n01Not what I would buy;c=c142’2’
groupbeg bar
n01Chocolate coating was too sweet;c=c141’7’
n01Chocolate coating was awful;c=c141’0’
groupend bar
n01DK/NA;c=c(141,142)=$ $
An axis may contain up to 32 subgroups of this type, and the groups may overlap (i.e. a
second subaxis may be started before the first one is finished) or they may be nested (i.e.
a subaxis may contain another subaxis). The example below shows a nested subaxis for
comments specifically about the taste of the chocolate coating:
groupbeg taste
n01Too sweet;c=c141’4’
groupbeg ctaste
n01Chocolate coating was too sweet;c=c141’7’
n01Chocolate coating was awful;c=c141’0’
groupend ctaste
n01Not sweet enough;c=c141’5’
n01No taste at all;c=c141’3’
groupend taste
Here is the allchoc axis again, this time with a subaxis called sweet nested inside the bar
subaxis:
l allchoc
n10Base
n01Trouble undoing wrapping;c=c142’8’
n01Didn’t like plastic wrapping;c=c141’1’
groupbeg bar,sweet
n01Too sweet;c=c141’4’
n01Not sweet enough;c=c141’5’
groupend sweet
n01No taste at all;c=c141’3’
groupend bar
n01Too expensive;c=c142’1’
n01Not what I would buy;c=c142’2’
groupbeg bar,sweet
n01Chocolate coating was too sweet;c=c141’7’
groupend sweet
n01Chocolate coating was awful;c=c141’0’
groupend bar
n01DK/NA;c=c(141,142)=$ $
When the elements in a group are scattered throughout the axis (as in the previous
example), or when an element belongs to all subaxes in the main axis, the element may
be flagged with the option:
group=groupnames
on the element itself, where groupnames is a comma-separated list of the groups in which
the element belongs:
l allchoc
n10Base;group=all
.
n01Too sweet;c=c141’4’;group=sweet
n01Chocolate coating was too sweet;c=c141’7’;group=sweet,coating
n01Chocolate coating was awful;c=c141’0’;group=coating
n01Not sweet enough;c=c141’5’;group=sweet
n01No taste at all;c=c141’3’
The special group name ‘all’ is used for elements which belong in all subaxes. In our
example, the Base element is to be included in all groups.
An axis may contain both types of grouping, and an element may be present in more than
one group. Elements which are not part of a groupbeg/end group and which do not have
the group= keyword are part of the main axis only: it is not an error for an element to be
omitted from all subaxes.
✎ Quantum does not allow a subaxis to be created from the elements of more than one
axis. To do this, it is still necessary to create a completely new axis and to copy the
required elements into it.
To create a table using a subaxis, write a tab statement but instead of the main axis name,
enter the name of the subaxis as it is defined by groupbeg or group=.
✎ When subaxes are used, the name of the main axis must be mentioned on a tab
statement, otherwise Quantum will skip the whole axis and will complain that
subgroups have been tabbed but not defined.
☞ The tab statement is described in chapter 21.
Quantum provides a group of keywords for printing additional characters in the cells of
a table apart from the standard counts and percentages. You might find this useful for
printing base figures in brackets or for printing a $ or £ sign in front of currency figures.
The keywords work only with absolutes and are:
figchar= defines the character to be printed with the absolutes
figpre prints the fig-character in front of each absolute
figpost prints the fig-character after each absolute
figbracket shorthand for figpre;figpost (i.e., print character before and after absolutes)
You can use these keywords on the l statement to make them apply to the whole axis or
on n01, col, val, fld and bit statements to apply these facilities to individual elements. If
you specify figpre, figpost or figbracket on the l statement you may turn it off for an
individual element by respecifying the keyword on the element, preceded by no; for
example, nofigbracket on an element to turn off figbracket for that element only.
The character you specify with figchar= may be a single character or a single character
enclosed in dollar signs. You may choose any character you wish, but certain bracketing
characters will be treated in a special way. Characters that count as bracketing characters
are:
( ) { } [ ] < >
If you set figchar to one of these characters and figpre and figpost are both operative, or
figbracket is operative, then that character is printed before each absolute and the
matching bracket in the pair is printed after them. For example:
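The axis below is an illustrative sketch (the axis name, column number and element texts
are assumptions); with figbracket and figchar=< on the l statement, every absolute in the
axis is printed between a < and the matching >:
l region;figbracket;figchar=<
col 110;Base;North;South;East;West
A base count of 126, say, is then printed as <126>; percentages are unaffected.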
If you specify figbracket but do not declare a fig-character, Quantum uses the ( symbol
so that figures are enclosed in parentheses.
If you tabulate two axes that have different fig-characters, so that a cell has two
fig-characters specified, the column character is printed and the row character is ignored.
If you import a table with fig-character into Excel (via q2cda), any value with
fig-characters in it will be treated as a text cell rather than a number cell.
18 More about axes
This chapter tells you more about the statements used in axes. Statements covered
include n00, which filters groups of rows on a specific condition; n09, which controls
the pagination of long tables; and col, val, bit and fld, which in some cases provide
alternatives to the n01 statement.
We also discuss keywords which may be used on all row-creating statements, thus
increasing their flexibility.
Quick Reference
To define a list of elements with codes all in the same column, type:
col n;[base];Rtext1[=’p1’];Rtext2[=’p2’]
where n is the column containing the codes for this question, base creates a base element,
and Rtext1=’p1’, Rtext2=’p2’ and so on define the texts and conditions for the individual
elements.
To explain more clearly how the col statement works, let’s take the axis mstat that we
wrote earlier and rewrite it using a col statement. Originally it consisted of five
statements:
n10Base
n01Single;c=c109’1’
n01Married;c=c109’2’
n01Divorced;c=c109’3’
n01Widowed;c=c109’4’
Using a col statement, the whole axis can be defined in a single line:
col 109;Base;Single;Married;Divorced;Widowed
The texts Single, Married, Divorced and Widowed define the element texts and their
positions in the statement tell Quantum which code represents which response.
When Quantum encounters a col statement which just contains element texts, it assumes
that those responses are single-coded in the order 1234567890-& and blank. Thus, in our
example, Single is the first element so it is assumed to be a code ’1’ in column 109;
Widowed is the fourth element so it is assumed to be a code ’4’ in column 109.
Suppose we want to print the elements in mstat in a different order, say, Widowed,
Single, Married, Divorced. We can either write each response followed by its code, thus:
col 109;Base;Widowed=’4’;Single=’1’;Married=’2’;Divorced=’3’
or we can just define the code for Widowed and let Quantum assign the other codes by
default. The default for the first response without a specific code is always ’1’:
col 109;Base;Widowed=’4’;Single;Married;Divorced
The code list for col consists of 12 codes followed by blank. This means that blank is the
13th code in the list. If a col statement lists 13 responses, the code for the 13th response
is blank. For example:
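The statement below is a sketch – the column number and the month texts are illustrative
– of a question with thirteen responses:
col 130;Base;January;February;March;April;May;June;
+July;August;September;October;November;December;Not stated
January is assumed to have code ’1’, October code ’0’, November code ’-’, December
code ’&’ and, because Not stated is the thirteenth response in the list, it is assigned the
blank code.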
Codes may be combined to form ‘or’ conditions, just as on n01 statements. If the
condition for an element is one code only, the code may be written without the single
quotes. Obviously this does not apply when the code is a blank: blank codes must always be
enclosed in quotes. Here’s an example showing both ways of entering codes:
col 109;Base;Divorced/Widowed=’34’;Single=1;Married=’2’
This statement creates four elements; a base for everyone eligible for inclusion in the
table, and elements for single people, married people and respondents who are divorced
or widowed.
Codes on col statements need not be exclusive: an element whose codes overlap with
those of other elements creates a net for a multicoded response. You can also use a
user-defined data variable in place of a column in the C array. A variable which is a
single data variable behaves just like a single column in the C array; if the variable is an
array with more than one column, type the variable name followed by the number of the
column containing the data, so that, for example, the respondent’s first choice is read
from column 3 of a color array and the second choice is read from column 4 of the same
array. The sketch below illustrates both points.
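In this sketch the variable name color, the codes and the element texts are all illustrative.
The first statement defines a net of all reds on a multicoded data variable; the second and
third read the first and second choices from columns 3 and 4 of a color array:
col color;Base;Any red=’123’;Scarlet=1;Crimson=2;Vermillion=3
col color3;Base;Scarlet=1;Crimson=2;Vermillion=3
col color4;Base;Scarlet=1;Crimson=2;Vermillion=3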
✎ You are advised to enter codes for all responses in the list if the code order is
anything other than 1234567890-& blank. That way you will know exactly which
code represents which response straight away rather than having to look to see
which codes have been assigned elsewhere.
Take care if the element text contains a semicolon. Semicolons separate the individual
responses on a col statement, so if you want a semicolon to be printed as part of a text,
precede it with a backslash. When the text is printed, the backslash will be replaced by a
space. For example:
col 157;Base;One;Two;Three;Four
creates five rows – a base and one row each for One, Two, Three and Four – whereas:
col 157;Base;One\;Two;Three\;Four
generates three rows, the second of which has the text ‘One ;Two’ and the third of which
is ‘Three ;Four’.
Col statements may be continued using a + in column 1 of the next line. If a response will
not fit on the current line, you should split the statement after the semicolon separating
two responses, never in the middle of a response text. Splitting after the semicolon, so
that the continuation line reads ‘+Outside London’, creates two rows – England and
Outside London – whereas splitting in the middle of the text, so that the continuation line
reads ‘+London’, creates the rows England, Outside and London, which is not what we
want.
✎ The continuation must not occur within a set of semicolons, otherwise Quantum
assumes that the continued parameter is, in fact, two items, not one.
Quick Reference
To create an element on a col statement that counts respondents present in the base but
not in any other element since then, type:
Rtext=rej
In chapter 17 we described how to deal with Don’t Knows using the notation c=– on the
n01 statement. To achieve the same thing on a col statement we use the keyword =rej.
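For instance, suppose c121 holds a rating coded ’1’ to ’5’ (the response texts below are
illustrative):
col 121;Base;Very good;Good;Average;Poor;Very poor;
+Don’t Know/Not Answered=rej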
In this example the row entitled ‘Don’t Know/Not Answered’ will include all
respondents for whom c121 is blank or contains any code other than ’1/5’. As with c=–,
rej checks back to the previous Base or, if there is no base, to the beginning of the table.
If an axis contains two col statements, the first containing the word Base and the second
containing the keyword rej but no base, the line with rej will contain anyone not already
included by either statement.
Quick Reference
To include a respondent in an element if the column contains codes for that element and
no other, type:
col =n;[base];Rtext1;Rtext2; ...
The condition on an n01 statement can be any valid logical expression. This is not true
for the col statement which can be used only when the conditions can be represented by
the codes of one column. Nevertheless, the flexibility of col is somewhat increased by the
fact that one can type the letter u before the column number to mean ’exactly equal to’.
This is the same as writing a logical expression using Cn=’p’.
☞ For further information on the = operator in logical expressions, see the section
entitled "Comparing data variables and data constants" in chapter 5.
Suppose c114 contains information on the age ranges of children in the household as
follows:
(c114)
Under 5 years 1
5 - 10 years 2
11 - 15 years 3
16 - 18 years 4
If the household contains children in more than one age group, the column will be
multicoded. You may wish to set up a table for people whose children are in one age
group only; that is, they may have any number of children but they must all be under 5
or all aged 5 to 10; people with one child under five and one aged between 5 and 10 will
be ignored. We would set up our table as:
l child1
col =114;Base;Under 5;Aged 5-10;Aged 11-15;Aged 16-18
The = includes respondents who have the specified code and no other codes in c114. It
is the same as writing the condition c=c114=’1’. The first row after the base will be a
count of all households having children under 5 only; the next row will tell us how many
households have children aged 5 to 10 only.
Quick Reference
To define a base element on a col statement, type:
base[=text]
If the keyword base is given by itself, the base element will be labeled ‘Base’.
As is stated above, the keyword base creates a base row or column with the text Base, the
text being printed exactly as it appears on the col statement. To have a base with a
different text, follow the keyword base with an equals sign and the desired text. For
example, a col statement containing the parameter:
Base=All Using Brand A
will print a base row with the text ‘All Using Brand A’.
Tables may also be weighted, in which case any base created with base or base= will also
be weighted.
Quick Reference
To define a subheading element on a col statement, type:
hd=text
The hd= option on a col statement performs the same task as an n23 in the axis. To create
the axis heading ‘Marital Status’ with hd= on a col statement, we would write:
col 109;Base;hd=Marital Status;Single;Married;Divorced;Widowed
Quick Reference
To define a text-only element on a col statement, type:
tx=text
The option:
tx=text
on a col statement creates a text-only element in the axis in the same way that an n03
element does. For example:
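A sketch – the axis name, column numbers, color names and group headings are all
illustrative – might be:
l shades
col 135;Base;tx=Dark colors;Navy;Maroon;Bottle green
col 136;tx=Light colors;Cream;Pink;Pale blue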
When this axis forms the rows of a table, the two tx elements form text-only rows
defining the types of colors which follow. When the axis is used as the columns of the
table, the tx= elements are ignored.
Val is used when the conditions defining eligibility for inclusion in an element are
positive numbers or ranges of positive numbers rather than codes; that is, where the
question in the questionnaire requires a numeric response rather than a single or
multicoded answer; for example, the number of people in the household, or the number
of telephone calls made.
Quick Reference
To define elements whose condition is that a variable contains a specific value, type:
val variable;[base];=;Text1 n1;Text2 n2; ... ;Textn nn
If the elements contain text as well as a number, the number may appear anywhere in the
text.
The base, hd=, tx= and =rej options described for col statements are also valid on val
statements of this type.
Val can be used to test whether the value of a variable is equal to a given value. If it is
equal, the cell count is incremented by 1. The format is:
val variable;[base];=;Text1 n1;Text2 n2; ... ;Textn nn
where variable is the data, integer or real variable whose value is to be tested, n1 to nn
are the values against which the variable is to be compared, and Text1 to Textn are the
row descriptions to be printed in the table.
The equals sign indicates that the test is for arithmetic equality rather than ranges. Base,
hd= and tx= are optional and create the base, sub-heading and text-only rows of the table
as described for col statements.
Let’s work through an example to illustrate this. Suppose c(110,111) contains data on the
number of people in the household, and we wish to set up a table showing how many
respondents live in households containing 1, 2, 3, 4, 5 or 6 people, so we write:
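Assuming the axis is called hshld, the specification might be:
l hshld
val c(110,111);Base;=;1 Person;2 People;3 People;4 People;5 People;
+6 People;Others=rej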
If the arithmetic value of c(110,111) is equal to 1, the respondent is included in the second
row of the table (the first row is the base row). If the value is equal to 6, he will fall into
the row reading ‘6 People’.
The text is printed exactly as it appears between the semicolons. The expression ‘One
person’ is not valid because the program can’t read ‘One’. Each text has to have a number
in it or associated with it so that Quantum knows what to check for. Texts may contain
any characters you like except semicolons and other numbers. The text may also appear
on both sides of the number: the parameter ‘For 5 people’ is valid.
Now let’s look at the table itself (Figure 18.1). You will notice that the last row is entitled
‘Others’: it gathers together all respondents living in households of more than six people.
This was created by Others=rej at the end of the val statement and collects all
respondents not included in any of the previous elements. We could equally well have
written an n01 statement with the condition c=–6 or just c=– to collect anyone not
included in the previous six elements.
Absolutes/Row Percentages
Base: All Who Bought Fabric Conditioner
Brand Bought Most Often
Base Brand A Brand B Brand C Others
-----------------------------------------
Base 190 54 50 55 31
100.0% 28.4% 26.3% 28.9% 16.3%
Number in Household
1 Person 32 12 11 2 7
100.0% 37.5% 34.4% 6.3% 21.9%
2 People 82 16 14 42 10
100.0% 19.5% 17.1% 51.2% 12.2%
3 People 18 6 6 2 4
100.0% 33.3% 33.3% 11.1% 22.2%
4 People 19 6 6 5 2
100.0% 33.3% 33.3% 26.3% 11.1%
5 People 9 3 1 4 1
100.0% 33.3% 11.1% 44.4% 11.1%
6 People 7 2 3 0 2
100.0% 28.6% 42.9% 0% 28.6%
Others 23 9 9 0 5
100.0% 39.1% 39.1% 0% 21.7%
Sometimes, as is the case with Others, the numbers are not a valid part of the element text.
Suppose c(132,133) contains codes for the first brand of fabric conditioner bought. If
c(132,133)=$77$, the first brand purchased was Brand A, and so on. To set up the table
shown above, we used an equals sign between the Brand and the number which
represents it:
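Only the code for Brand A is given above; assuming Brands B and C are coded 78 and
79, the statement might read:
val c(132,133);Base;=;Brand A=77;Brand B=78;Brand C=79;Others=rej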
If the values on the val statement are just numbers which are incremented by 1 for each
element, they may be abbreviated using the notation:
val c(m,n);Base;=;start:end
where start is the value required for the first element and end is the value required for the
last element. Note that if there are spaces between the start and end values and the colon,
the range will not be recognized.
When the axis is used, Quantum will create as many elements as there are numbers in the
range. For example:
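Using the household size columns again as an illustration:
val c(110,111);Base;=;1:9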
will create ten elements in all: one for the base and nine for the numbers in the range 1:9.
When elements are defined in this way, no element texts may be entered since Quantum
will use the individual numbers in the range as the row and column texts.
Usually the values for each element will be positive, but val can also cope with negative
numbers. If the data contains three records:
-10
-0010
10
val c(1,5);=;-10;0;10
will report that there are two records with the value -10, and one with the value 10.
Testing ranges
Quick Reference
To define elements whose condition is that a variable contains a value within a given
range, type:
val variable;[base];I;Text1 range1;Text2 range2; ... ;Textn rangen
if the maximum value of each range is part of the range, or:
val variable;[base];R;Text1 range1;Text2 range2; ... ;Textn rangen
if it is not.
If the elements contain text as well as ranges, the range may appear anywhere in the text.
Ranges which are not part of the element text may be entered after the text and separated
from it by an = sign.
The base, hd=, tx= and =rej options described for col statements are also valid on val
statements of this type.
Val may be used to test whether the value of any variable (data, integer or real) is within
a given range. The format is almost identical to that for testing equalities, except that the
equals sign is replaced by I or i for inclusive ranges (the maximum value is part of the
range) or R or r for exclusive ranges (the maximum is not part of the range):
Ranges are entered as the minimum and maximum values separated by a hyphen
(min-max) or slash (min/max). If no upper value is given, a maximum of infinity is
assumed. Thus we might have:
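For the household size question the axis might contain (the middle elements here are
illustrative):
val c(110,111);Base;I;1-2 People;3-4 People;5-6 People;7 or More People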
If the value in c(110,111) is in the range 1-2, the respondent is included in the element
entitled ‘1-2 People’; if the value in c(110,111) is 10, he is included in the element ‘7 or
More People’. Note that all ranges are inclusive because we have used the I operator.
The R operator may be used to indicate that the maximum value specified is not part of
the range. We might have:
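As a sketch, assume a real variable called liters, created in the edit, holds the number of
liters bought per person; the variable name and the intermediate break points are
illustrative:
val liters;Base;R;0-10.5;10.5-20;20-27.5;27.5-35.5;35.5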
In this situation, the first element will contain the number of households in which the
number of liters bought was between 0 and 10.4999 per person; the second will contain
the number of households in which the number of liters purchased per person was
between 10.5 and 19.9999, and so on. The last element will include all households in
which the number of liters per person is 35.5 or more.
Where the numbers defining the ranges are not part of the element texts, the text may be
entered followed by an equals sign and the range specification, as described in the
previous section.
If both numbers in a range are negative, the smaller value (i.e., the one farthest away from
zero) must come first. For example:
val c(1,5);i;-10--5;-4-0;1-10
You may also mix equalities and ranges in a single statement by switching between the
= and I (or R) operators, as in:
l hshld
val c(110,111);Base;hd=Size of Household;=;1 Person;2 People;
+I;3-4 People;5-6 People;7-8 People;9 or More People
✎ If you combine an I or R operator with elements that specify a single value only,
Quantum reads those values as open-ended ranges. For example, if you write:
val c120;i;Male=1;Female=2
Quantum increments the count for Male every time it finds a value of 1 or greater in
c120, and the count for Female every time it finds a value of 2 or greater. To correct
this example you would replace the i operator with the = operator.
Records whose values are missing_ fail all the standard conditions on a val statement. To
create an element on a val statement that counts missing values, enter the condition as
missing_. For example, you could write the axis for the video rental example you saw
earlier as:
l rental
val c(9,10);Base;i;None=0;1-5;6-10;11-20;21-30;31+;
+=;DK/NA=missing_
Here, the DK/NA element counts any record in which c(9,10) is not numeric.
Quick Reference
To define elements whose condition is that a field contains a specific numeric code, type:
fld col_specs;[base[=btext]];[hd=hdtext];[tx=text];element_specs
The base, hd=, tx= and =rej options described for col statements are also valid on fld
statements.
Most of the data you tabulate consists of codes in columns. Each code in each column
represents a different response. For example, if the questionnaire shows:
Q6A: Which films did you see on your last three visits to the
cinema?
(12) (13) (14)
Columbus .................... 1 1 1
Aliens 3 .................... 2 2 2
Pretty Woman ................ 3 3 3
Green Card .................. 4 4 4
Batman 2 .................... 5 5 5
and the respondent saw Columbus, Green Card and Aliens 3, you could count the number
of respondents who saw Columbus by writing an n01 statement:
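One way to write it is to test each of the three columns in turn:
n01Columbus;c=c12’1’.or.c13’1’.or.c14’1’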
Alternatively, you could use integer variables in the edit and set each one to 1 if the
respondent saw the film. The n01 statement might then be:
n01Columbus;c=t1 .gt. 0
Now suppose that the films are coded with two-digit numeric codes instead. The
questionnaire shows:
Q6A: Which films did you see on your last three visits to the
cinema?
(12-13) (14-15) (16-17)
Columbus .................... 01 01 01
Aliens 3 .................... 02 02 02
Pretty Woman ................ 03 03 03
Green Card .................. 04 04 04
Batman 2 .................... 05 05 05
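With this coding the condition for each film has to test all three two-column fields; for
instance, the element for Columbus might be written:
n01Columbus;c=c(12,13)=$01$.or.c(14,15)=$01$.or.c(16,17)=$01$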
There is nothing wrong with writing statements of this type if this is what you want.
However, Quantum offers the choice of using either a fld statement in the axis, or a
combination of a field statement in the edit and a bit statement in the axis as a means of
avoiding long conditions of this type.
fld is like a col or val statement because a single statement can create a number of
elements. The format of a fld statement is as follows:
fld col_specs;[base[=btext]];[hd=hdtext];[tx=text];element_specs
The options base, base=, hd= and tx= define base elements, subheadings and text-only
elements as they do with col and val statements.
The column specs on a fld statement define the columns to be read. There are three ways
of entering them. First, you may list each column or field reference one after the other,
separated by commas. The list must be enclosed in parentheses. In our example this
would be:
Second, if you have sequential fields as you do here, you can type the start columns of
each field followed by the field length. The list of start columns is separated by commas
and enclosed in parentheses, and the field length comes after the closing parenthesis and
starts with a colon. If you use this notation for the film example you would write:
If you wish, you can abbreviate this further by typing just the start columns of the first
and last fields, followed by the field length. This time you do not use parentheses:
Third, if the fields are not sequential, you may list the start columns and field width of
each group of columns (as shown above) and separate each group with a slash. For
example, to read data from columns 12 to 17 and 52 to 57, with each field being two
columns wide, you would type:
You can also use this notation for single non-sequential fields. For example:
The element specs part of the statement defines the element texts and the codes which
represent those responses. If you enter element texts by themselves, Quantum assumes
that the first text is code 1, the second text is code 2, and so on. The codes apply to all
fields named in the column specs part of the statement. Therefore, to define elements
which will count the number of people who saw each film, you would write:
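Using the second of the column notations described above (the start columns 12, 14 and
16 listed in parentheses, followed by the field length), the statement might be:
fld (12,14,16):2;Base;Columbus;Aliens 3;Pretty Woman;Green Card;Batman 2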
Remember that there are several ways of defining the columns: this example is not the
only way you can write this statement.
If the response codes are not sequential, or you do not wish to list them in sequential
order, you must follow the element text with an equals sign and the code number. You
can allocate a number of codes to one element by listing them separated by commas; if
the codes are a range, type the first and last codes separated by a hyphen. For example:
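For instance, grouping the films into broader categories (the groupings here are purely
illustrative):
fld (12,14,16):2;Base;Science fiction=2,5;Romance=3-4;Columbus=1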
When you type codes in this way, Quantum checks that none of the codes is longer than
the given field lengths, and flags any which exceed the field length with an error message.
In our example, fields are two columns long so Quantum will reject codes greater than
99 and strings longer than two characters.
If you have responses such as No Answer or Don’t Know with non-numeric codes, type
the code enclosed in dollar signs, as shown below:
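Here a non-numeric code && – assumed to mean ‘None of these’ – is entered in dollar
signs at the end of the list:
fld (12,14,16):2;Base;Columbus;Aliens 3;Pretty Woman;Green Card;Batman 2;
+None of these=$&&$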
You may specify an element to gather responses not counted since the last base by typing
an element with the value =rej as you would on a col or val statement.
Quick Reference
To define elements whose data is to be read from an integer array created in the edit
section, type:
bit input_array;[base[=btext]];[hd=hdtext];[tx=text];element_specs
Using the statements field and bit together has the same effect as fld. You would use them
when you want to reorganize or otherwise manipulate the data before tabulating it.
☞ You’ll find a full explanation of field in section 8.6, but we’ll explain it briefly here
so you can see exactly how it relates to bit.
field is an edit statement that counts the number of times a particular code appears in a
list of fields for each respondent. It stores these counts in an integer array that consists of
as many cells as there are fields to count. For example, if the questionnaire contains the
question and response list:
Q6A: Which films did you see on your last 3 visits to the cinema?
(12-13) (14-15) (16-17)
Columbus .................... 01 01 01
Aliens 3 .................... 02 02 02
Pretty Woman ................ 03 03 03
Green Card .................. 04 04 04
Batman 2 .................... 05 05 05
you could use field to count the number of times each film was seen by each respondent.
If you call the array films, films1 would contain the number of times the respondent saw
Columbus, films2 would contain the number of times he/she saw Aliens 3, and so on. If
the respondent did not see a film, the cell for that film will be zero.
The format of the bit statement is:
bit input_array;element_specs
where input_array is the name of the integer array you created with field, and
element_specs are the element texts and codes. The rules for defining elements with bit
are the same as for fld, so you could tabulate the films by typing:
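Here the array name films comes from the field statement in the edit, and the element
texts follow the order of the film codes:
bit films;Base;Columbus;Aliens 3;Pretty Woman;Green Card;Batman 2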
Quantum will increment the count for Columbus by 1 whenever cell 1 of the films array
is greater than zero.
✎ The value of a cell in the array has no bearing on the number by which Quantum
increments the element count. As long as the cell value is greater than zero,
Quantum increments the element count by 1. It does not increment the element
count by the cell value.
You can group and re-order responses in the same way as with fld. For example:
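The category names here are illustrative; the numbers refer to cells of the films array:
bit films;Base;Fiction/Fantasy=2,5;Historical=1;Others=3,4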
Quantum will increment the count for Fiction/Fantasy whenever cells 2 and/or 5 are
non-zero; it will increment the count for Historical whenever cell 1 is non-zero; it will
increment the count for Others whenever cells 3 and/or 4 are non-zero.
bit does not allow detailed analysis of out-of-range or non-numeric codes. The field
statement counts out-of-range codes (e.g., a code 6 or 7 when the array contains five
cells) in cell zero which Quantum creates for itself. Non-numeric codes such as $&&$
for ‘None of these’ are stored in a cell which you define as part of the field statement. If
you want to include these types of responses in the axis, you may do so with an element
of the form Out of range=rej.
bit allows use of the keywords base, base=, hd= and tx=.
When you have numeric codes and you want simple counts of respondents, it is quicker
and easier to use fld than it is to use field and bit. However, if you want to count the
number of times each film was seen, for example, you will need to increment the
elements of the axis by the number of times each response code appears. Normally you’d
use inc= to name the column or field to use as the increment, but with fields of numeric
codes this is not possible. This is where you’d choose field and bit instead of fld, because
Quantum has already done some of the work for you.
☞ For information on inc=, see the section entitled "Data options" below.
You’ll remember that field increments cells of the array each time it finds a particular
code. If code 01 appears in the list of fields twice, the value in cell 1 will be 2. This is the
value by which you want to increment the element count, so you write a bit statement
which tells Quantum to do just that. If you want an axis that shows the number of times
each film was seen, you will write the bit statement as:
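One way of writing this – assuming that inc= may be attached to each element using the
percent-sign notation described later in this chapter – is:
bit films;Base;Columbus;%inc=films1;Aliens 3;%inc=films2;Pretty Woman;%inc=films3;
+Green Card;%inc=films4;Batman 2;%inc=films5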
If we assume that all respondents are eligible for inclusion in this axis, the base will be
the total number of respondents, even though a cell may be incremented by more than
one per respondent.
In the previous section you saw how bit provides a quick and easy method of updating
the cells in a table by more than one per respondent. In the example, the cells of the table
will show the number of viewings per film, but the base will show the number of
respondents eligible for inclusion in the table.
Although a table of this type immediately highlights films which were seen more than
once per respondent, there may be times when you want the base to be calculated on the
same basis as the rest of the table; that is, it should be a count of viewings rather than of
respondents. To obtain a table of this type you need to increment the base once per field
rather than once per respondent. This means reading and tabulating one field at a time.
The statement which lets you do this is process. This is an edit statement which sends the
record temporarily to the tabulation section. Let’s rewrite the films example using
process and compare the results with the ones we obtained using bit.
☞ For full details on process and further examples, see section 9.10.
In the edit we write a loop which copies the contents of each field in turn into some spare
columns and then sends the record to the tabulation section for analysis. Since we do not
want all tables to be incremented more than once per respondent, we set a flag in another
spare column which we can use as a filter on the table which requires the extra updating.
The edit is:
ed
/* set a flag for use as filter in tab section
c180=’1’
/* take each field in turn
do 5 t1 = 112,116,2
if (c(t1,t1+1)=$ $) go to 5
c(181,182) = c(t1,t1+1)
/* skip to tab section
process
/* back from tab section, so try the next field
5 continue
/* reset filter flag for other tables
c180=’ ’
.
end
The record passes through the entire tabulation section but only tables in which the spare
column contains a ’1’ are updated. Once all fields have been dealt with, the edit resets
this column to blank so that when the edit finishes Quantum can create the other tables
in the usual way.
To compare the results of using bit and process, let’s take the record:
0 ---+--- 1
020402
The counts for the individual films are the same with both methods. bit and inc increment
each table cell by the value of the film’s cell in the array but, because the record passes
through the tabulation section once, the base is incremented only once per respondent.
With process the record passes through the tabulation section once per film; the base and
the individual film counts are incremented each time this happens.
Quick Reference
To define a condition that applies to a group of consecutive elements, type:
n00;c=logical_expression
Often you will have a condition that applies to a group of rows in a table, in addition to
the individual row condition. There are two ways of dealing with this. One way is to make
each row condition an ‘and’ condition by combining the global condition with the row
condition, for instance:
n01First Row;c=c136’1’.and.c140’1’
n01Second Row;c=c136’1’.and.c140’235’
The other is to enter the global condition separately using an n00 (N-zero-zero)
statement.
An n00 defines a condition applicable to all subsequent rows until another n00 is read or
until the end of the axis, whichever is the sooner. Its format is:
n00[;c=condition]
One of the most frequent uses of this statement is in tables which show whether or not a
respondent liked a particular product, and then lists reasons why he did or did not like it.
For example, the axis:
l pref1
col 321;Base;Liked Product;Disliked Product;DK/NA=rej
n03
n00;c=c321’1’
col 322;hd=Reasons for Liking Product;Cleans Well;
+Lasts a Long Time;Smells Nice; ....
n03
n00;c=c321’2’
col 325;hd=Reasons for Disliking Product;Inconvenient to Use;
+Too Expensive; ....
has three sections, each one separated by a blank line (n03). The first tells us how many
respondents liked the product (c321’1’) and how many disliked it (c321’2’). So that we
include all eligible respondents in this section, we set up the third row to include anyone
who does not have a 1 or a 2 in that column. These would be people who did not have a
preference or who refused to answer the question.
The second section deals only with those who said they liked the product, so we have
used the n00 to exclude everyone else. Use of an n00 here is efficient as regards your time
and the computer’s: without it, you would have had to write one statement for each
reason, with each reason having a complex condition, as shown previously.
The third section is exactly the same as the second, except that it includes respondents
who did not like the product.
In this example, the first filter was turned off by another n00 defining a new filter, and
the second was terminated by the end of the axis. To turn off a filter before the end of the
axis without defining a new filter, enter an n00 without a condition. This causes all
respondents to be eligible for inclusion in the following elements.
Tables may contain several base elements. The most common occurrence of a second or
redefined base is in preference tables as in the example above. Percentages for the
products themselves are calculated using the number of respondents eligible for inclusion
in the table, while percentages for the reasons for preference are generated using the
number of people preferring that product as a base.
If you wish, you can also have the reasons for preference percentaged against the table
base. This is done with the option op=6 on the a/flt/tab statement.
☞ For further information about output types, see the section entitled "Output options"
in chapter 16.
When the two sets of percentages are printed, the percentage against the redefined base
(e.g., all preferring product A) will be printed on the line immediately beneath the
absolute figures, with the percentages against the table base on the line below.
l pref
col 145;Base;hd=Product Preferred;Brand A;Brand B;Brand C
n00;c=c145’1’
col 146;Base=All Preferring A;Cleans Better;Pleasant Smelling;
+Lots of Bubbles; ...
n00;c=c145’2’
col 146;Base=All Preferring B; ....
Region
Base North South East West
-----------------------------------------
Base 282 78 81 57 66
Product Preferred
Brand A 110 37 42 21 10
39.0% 47.4% 51.9% 36.8% 15.2%
Brand B 95 15 28 15 37
33.7% 19.2% 34.6% 26.4% 56.1%
Brand C 77 26 11 21 19
27.3% 33.4% 13.5% 36.8% 28.7%
Pleasant Smelling 27 9 12 5 1
54.0% 24.3% 28.6% 23.8% 10.0%
9.6% 11.5% 14.8% 8.8% 1.5%
Lots of Bubbles 33 15 10 5 3
30.0% 40.6% 23.8% 23.8% 30.0%
11.7% 19.2% 12.4% 8.8% 4.5%
All Preferring B 95 15 28 15 37
.
.
Figure 18.2 Using n00 as a filter in an axis
There are, of course, several minor variations which would work just as well: for
example, in place of the n00s and the Base parameters on the col statements we could use
n10s with conditions on them. However, this would mean replacing the cols with separate
n01s for each element and adding the condition to each of those statements as well.
Alternatively, since the same col statement applies to all brands and the n00 is almost the
same, we could put them in a separate file and include it in the relevant places with an
*include statement.
18.7 Pagination
When a table is too large to fit on a single page, Quantum starts a new page automatically.
If you wish to have more control over where pagination occurs you may insert statements
in the axis marking points at which a new page must always be started, or points at which
a new page should be started if there is insufficient room for a given number of lines.
Automatic pagination
If the row axis is too long for the page, Quantum prints as much as will fit on the first
page and then reprints the column headings at the start of the next page before printing
the rest of the row axis. If the column axis is too wide for the page, Quantum prints the
full row axis with as much of the column axis as will fit on the first page. It then starts a
new page and reprints the row axis with the remainder of the column axis.
Quantum repeats the table headings and the base row or column, as appropriate, before
printing the rest of the table. Any texts created by foot and bot statements are also carried
forward. The table number will remain the same, but the page number will be increased
by 1.
☞ For further details about foot and bot, see section 22.3 and section 22.4 respectively.
When a table is very large and has both its row and column axes split across pages,
Quantum normally prints all the pages which make up the full column heading before it
prints those containing the continuation of the row axis. Suppose, for example, that an
age axis (Base, 11-20 yrs, 21-34 yrs, 35-54 yrs, 55+ yrs) is tabulated against a regional
breakdown of ten columns, and that each page has room for five columns and three rows.
Quantum prints the table in the following order:
Page 1
North North
Base North East West East
Base
11-20 yrs
21-34 yrs
Page 2
South South
West South East West Central
Base
11-20 yrs
21-34 yrs
Page 3
North North
Base North East West East
35-54 yrs
55+ yrs
Page 4
South South
West South East West Central
35-54 yrs
55+ yrs
To alter the pagination so that rows take precedence over columns, place the keyword
rinc (short for ‘rows in columns’) on the a, flt, sectbeg or tab statement. The order of
printing for our example would then be:
Page 1
North North
Base North East West East
Base
11-20 yrs
21-34 yrs
Page 2
North North
Base North East West East
35-54 yrs
55+ yrs
Page 3
South South
West South East West Central
Base
11-20 yrs
21-34 yrs
Page 4
South South
West South East West Central
35-54 yrs
55+ yrs
Manual pagination
Quick Reference
To override automatic pagination in long axes with your own page breaks, type:
n09[Text][;hug=num_elms]
hug= defines an optional page break which should take place if there is insufficient room
for num_elms elements on the current page.
To override the automatic page turnover within an axis, insert the statement:
n09[Text]
at the point at which a new page is required. ‘Text’ is an optional text which will be
printed beneath the table headings at the top of the next page.
Quantum always honors n09 elements in row axes. If the axis is used as a column axis,
Quantum starts a new page at the n09 if the full set of column headings is too wide for
the page. You may change this for an individual n09 using the keywords norow and col.
Use norow to suppress the page break when the axis is a row axis; use col to force a page
break when the axis is a column axis. In the example below Quantum prints the sex and
age columns on different pages even though there would be room to print them side by
side on the same page.
l agesex
col 109;Base;Male;Female
n09;col
val c(110,111);I;11-20 yrs;21-34 yrs;35-54 yrs;55+ yrs
The presence of n09 in an axis turns off the automatic page throw for the whole axis, so
in long axes you may need to insert more than one n09 statement to get the required
pagination.
An alternative to forced page throws is to say that a new page is necessary if there is not
room for a given number of lines at the bottom of the page. To do this, add the option
hug=n to the n09, where n is the number of lines still needed at the bottom of the page.
For example:
n09Continued;hug=5
will start a new page if there is not room for five more lines on the current page. The new
page will have the heading ‘Continued’ in the top left-hand corner with the table titles.
When Quantum reads hug= and decides whether or not to start a new page, it counts all
elements which would normally be printed, even if in this particular table they are
suppressed by options such as those in the nz or smsupp groups. This count includes n23s
and n03s if the axis is a row axis, and n23s if the axis is a column axis (n03s are ignored
in column axes). Elements such as n15 which are never printed are not counted at all.
✎ As with automatic page turnovers, n09 prints a base row before continuing with the
next line of the table. To suppress the base, follow the n09 with an n11.
A limited set of options is available for n09 statements – these are discussed below.
Quick Reference
To flag elements for printing at the top of continuation pages in addition to or instead of
the table base, include the option:
hold=position
on those elements, where position is a number between 1 and 9 indicating the position in
which that element is to be printed in relation to other held elements.
When a table is too long to fit onto one page, the base is automatically printed as the first
row of all subsequent pages of the table. By flagging elements in the row axis with the
option hold=n, you can determine which other elements should be printed in addition to
or instead of the base.
This option may appear on any n statement, or an equivalent. The value of n may be
between 1 and 9, with elements with lower values being printed before those with higher
values. If two or more elements have the same hold value, the most recent one before the
page break supersedes any previous elements with that value.
The default for non-base elements is no hold at all; the default for bases is hold=0 which
causes them to be printed at the top of the table before any other held items. You may
place your own hold value on base elements to override the default.
Here is an example:
l likes
n10Base
n01Unweighted base;hold=1
n03Color;hold=2
col 123; Color comments ......
n03Flavor;hold=2
col 126; Flavor comments .....
n03Packaging;hold=2
col 129; Packaging comments ....
n03Other comments;hold=2
col 132; Miscellaneous comments
In a table created with this axis as the row axis, the base and the unweighted base will be
printed at the top of every page of the table. Each of the subcategories in the axis has the
option hold=2. This means that only the current category heading will be printed at the
top of the continuation page. So, if the table breaks in the middle of the flavor category,
the Flavor heading will appear underneath the two base elements at the top of the next
page. The Color category is finished, so its hold flag has been overridden; the Packaging
and Other comments categories have not yet been reached so their flags have not been set.
If the axis is sorted and the category headings are defined with net statements, as in this
version of the axis:
l likes;sort
n10Base
net1Color;hold=2
col 123; Color comments ......
n03
net1Flavor;hold=2
col 126; Flavor comments .....
n03
net1Packaging;hold=2
col 129; Packaging comments ....
netend1
n03Other comments;hold=2
col 132; Miscellaneous comments
the n03s before the nets will hug with the nets when the hold flags are assigned. This can
cause unwanted blank lines to appear at the top of continued tables. To prevent this, place
nosort on the n03s before the nets.
There are three situations when Quantum produces more than one page per table:
• There is no room on the current page for the row about to be printed.
Quantum looks at the hold value on that row, if any, and cancels any previously held
rows with that value.
• An n09 statement forces a new page.
Quantum looks at the hold value of the row immediately after the n09 and cancels
any previously held elements with that value.
• An n23 is encountered with insufficient room on the page for the n23, any following
text-only rows, and the next row containing figures.
First, Quantum looks at the hold value, if any, on the n23 and cancels the previously
held rows at this level. Then it looks at the row containing figures. If this is a base
with the default hold=0, it cancels the previous row at that level; if not, it doesn’t
cancel any hold= value associated with that row.
In all cases, Quantum then prints the table headings and any held rows, followed by the
row about to be printed. Finally, it puts the hold= in force for this row. In cases 2 and 3,
Quantum does not distinguish between printing and non-printing rows when looking
ahead.
Once hold is in force at a given level, the only way to cancel it is to introduce a new hold
at that level. To cancel a hold without generating a new printing element, enter a
statement of the form:
n03;hold=n;norow
Options may be used on n01, n15, n10, n11, col, val, bit and fld statements to define more
specifically when and how a row should be printed. In many cases, an axis, table, filter
or run-level option may be switched off by entering the appropriate keyword on the
element, preceded by the letters no. For example, if you have applied a scaling factor to
the table as a whole, you may switch it off for a single element by placing the keyword
noscale on the statement which defines that element.
A small subset of these options may be used on n03, n23 and n09 statements:
[no]row [no]col
When used on an n statement, options are listed after the row text and are separated from
it and from each other by semicolons, for example:
n01Red;c=c132’1’;dec=2
whereas on a col or val statement the option follows the relevant row text, still separated
by a semicolon, but also preceded by a percent sign, for example:
col 132;Base;Red;%dec=2;Blue;Green;Yellow
Where an option appears on an element and also at a higher level (i.e., a/sectbeg/flt/tab),
the option on the element will always override the same option at the higher level. Thus,
dec=2 on an n01 will override dec=1 on the a statement for that element only.
✎ Where conflicting options appear on the row and column axes of a table, the column
option takes precedence.
Output options
Output options determine how the axis will be printed in the tables. They have nothing
to do with how the counts in the tables are created.
base Indicates that the row or column should be used as a base for
percentaging.
colwid=n This option defines the column width for this element when the axis is
used as a breakdown (banner) and column widths are not set with p
statements.
☞ For more information about using axes to create columns, see chapter 20.
dec=n Determines the number of decimal places for absolute figures in this
element. The default is dec=0.
decp=n Sets the number of decimal places for percentages when op=0, 2, 7, or
&. The default is decp=1.
dsp This leaves one blank line between each row of data in a table. Without
this, one line follows directly underneath another.
flt=name
dummy Defines a dummy element used to ‘pad’ the axis to the size required to
accommodate the final table. This may be used when two or more
tables of different sizes are added together with the add statement.
Dummy is only valid on n01 and n15 statements.
☞ Adding tables is discussed in the section entitled "Creating a table" in chapter 21.
endsort Denotes the end of a group of rows to be sorted within the axis. If a row
terminates more than one level of sorting, the notation endsort=n must
be used, where n is the number of levels terminated.
figbracket Prints the character defined with figchar in front of each absolute, and
prints the corresponding closing bracket after each absolute.
☞ For further information on the fig group of options, see section 17.8.
hold=n Prints this element at the top of continuation pages, in addition to or instead
of the table base. n is a number between 1 and 9 which determines the order
in which held elements are printed.
☞ For an example and further information, see the section entitled "Repeating
elements on continued tables" earlier in this chapter.
id=name Assigns an identity code of up to six characters for use in row or table
manipulation.
indent=n Defines the number of spaces by which the second and subsequent lines
of a long element text should be indented when it is broken by
Quantum. n may be any number between 1 and 7; the default is 0.
nocol Indicates that the element should be ignored if the axis is used as a
breakdown axis. This is the default for n25 elements. To print these
elements in a table add the option col to the n25 statement.
norow When this is used, the element is ignored when the axis is used as a row
axis. This is the default for n25 elements. To print these elements in a
table add the option row to the n25 statement.
nosort Indicates that the row is not to be sorted with the other rows in the axis
or group.
notstat Exclude this element from the special T statistics. This is the default for
n10, n11 and their equivalents on col, val, bit and fld statements.
nz Indicates that the element should not be printed if all its cells are zero.
Elements that contain nonprinting cells that are not blank are
suppressed as long as all the printing cells are blank. This means that if
you have an element that is all blank apart from a non-printing base
column that is suppressed by nocol, the row element will be
suppressed.
op=n The same as op= on an a statement.
☞ See the section entitled "Output options" in chapter 16 for a description of the global
output options available on the a statement.
scale=n Defines a scaling factor by which all values in the element must be
multiplied before printing. To have elements divided by a given value,
the notation is scale=/n. For instance, scale=/100 will divide all
numbers by 100 before printing them.
smsupa=n Suppress the element if all absolutes are below the given value. At least
one of the options smcol or smrow must also be present on the
a/sectbeg/flt/tab statement to determine which types of elements in the
table are affected.
smsupp=n Suppress the element if all column percentages are below the given
value. At least one of the options smcol or smrow must also be present
on the a/sectbeg/flt/tab statement to determine which types of elements
in the table are affected.
When columns are suppressed in an axis whose column headings are
defined on g statements, Quantum ignores the g statements and creates
its own column headings using the texts defined on the elements
themselves.
☞ For more information on defining column headings on g statements, see chapter 20.
smsupt=n Suppress the element if all total percentages are below the given value.
At least one of the options smcol or smrow must also be present on the
a/sectbeg/flt/tab statement to determine which types of elements in the
table are affected.
smsup+ Creates an element which includes the sum of all elements suppressed
by the smsup group of keywords, in addition to any other records which
normally belong in that element. Suppressed elements are collected
from the start of the axis, the most recent base, or the most recent
smsup+ element, whichever is the most recent. Elements suppressed
after the smsup+ are not included. For example:
n10Base
col 132;Red;%smsupa=10;Blue;Green;Yellow
n01Other primary;c=c132’5/&’;smsup+
col 133;Pale blue;Mauve;Grey;%smsupa=10
Here, the Other element will include all answers coded as 132’5/&’ and
also the count for red if it is less than 10. It will not include the count
for grey, even if that element is suppressed. To create an element which
is solely a count of suppressed elements, write a statement of the form:
n01Suppressed elements;dummy;smsup+
If you’re using smsup+ in a table of nets, and you want the suppressed
elements to be accumulated into other elements at the correct net level,
include netsm on the a/sectbeg/flt/tab/l statement.
supp Suppresses percentages for a single row when percentages have been
requested for the rest of the table.
toptext= Defines a text to be used as a column heading when the element text is
unsuitable for this purpose (e.g., it is too long). For example:
n23Visited Museum Before;toptext=Been Before
tstat Include this element in the special T statistics. This is the default for
n01, n15 and their equivalents on col, val, bit and fld statements.
unln Underlines the element text. The amount of underlining is determined
by the value of n: unl1 underlines the whole text up to and including the
last non-blank character, unl2 underlines the whole text apart from
strings of blanks, and unl3 underlines non-blank characters only.
Data options
Data options determine how the counts in the table will be calculated. These options are
not concerned with the way the axis is laid out.
c=logical expression
Defines the conditions for an element. c= is only valid on n statements.
☞ For further information on the effective base, see the section entitled "The effective
base" in chapter 32.
ex=expression Alters the figures in the row using row manipulation techniques.
fac=n Defines a factor to be associated with the element for use in statistical
calculations such as means and standard deviations.
✎ The factor does not alter the figures in the row itself: if you want to define a scaling
factor to increase or decrease the figures in the row by a given amount, use the
option scale= which is described below.
☞ For further examples of fac=, see chapter 19.
inc=arithmetic expression
In most tables cells are counts of people because each cell is
incremented by 1.0 for each respondent included in that cell. Cells may
also be incremented by the value of an arithmetic expression; for
instance, we may wish to know how many boxes of dog biscuits a
respondent bought, or the number of pre-school-age children in
households of varying sizes.
In both cases we would produce the table using the appropriate axes for
row and column definitions, but instead of incrementing the cells by 1
for each respondent, we would increment it by the number of boxes of
dog biscuits bought or the number of pre-school-age children in his
household. (This presupposes, of course, that this information is
available somewhere in the data file).
To increment cells by the value of an arithmetic expression rather than
by 1, we use the option inc=arithmetic expression.
Let’s work through an example. Suppose we have a question asking
how many times the respondent visits various shops in a week. The
information is stored in the following columns:
c(109,110) = Safeways
c(111,112) = Sainsburys
c(113,114) = International
c(115,116) = Tescos
We need to create a table showing the total number of times each shop
is visited. This entails incrementing each cell in a given row by the
number in the relevant columns, for instance, if row 1 refers to
Safeways, cells in that row need to be incremented by the value in
c(109,110).
What we do is set up an axis in which each row refers to a different
shop, and use inc= to define the columns whose values are to be added
into the cell:
l shop
n04Total Visits;base
n01Safeways;inc=c(109,110)
n01Sainsburys;inc=c(111,112)
n01International;inc=c(113,114)
n01Tescos;inc=c(115,116)
Area of residence
Base Area 1 Area 2 Area 3
Shops Visited
Safeways 635 143 257 235
30% 35% 29% 29%
✎ inc= may also be used on elements when tables of means, medians, maximum or
minimum values are required.
If you are using n25;inc= as the basis for calculating means in an uplev’d axis in a
levels job, and a record has more than one set of data at the lower level, Quantum
uses only the data in the last set in the mean calculation. Data in the other sets for
the record is ignored.
See section 16.5 for examples of a table of means and a table of minimum/maximum
values of incs.
levbase Increments the base element of an uplev’d axis for all records at the
anlev level.
☞ For further details, see the section entitled "Axis update level" in chapter 28.
maxim Used with inc= to produce an element that shows the maximum value
of the inc= variable. Zero values are ignored.
median Used with inc= to produce an element that shows the median value of
the inc= variable. Zero values are ignored.
minim Used with inc= to produce an element that shows the minimum value
of the inc= variable. Zero values are ignored.
missing=expression
Treats values defined by the logical expression the same as missing
values if missing values processing is switched on for the tabulation
section.
This option is valid anywhere that inc= is valid, but you will usually
use it on definitions for statistical elements such as n25s or on n01s that
create medians. For example:
n01Median value;inc=c(123,127);median;
+missing=c(123,127).le.0
This calculates a median value by summing up the values in columns
123 to 127. Values that are less than or equal to zero are treated as
missing values and are excluded from the calculation if missing values
processing is switched on for the tabulation section.
missingval Exports non-numeric data as missing_. This facility is always available
regardless of whether missing values processing is on or off.
When you use Quantum to export data for use with SAS or SPSS,
Quantum codes the data according to the position that the element
occupies in the axis. For example, in the axis:
l sex
col 110;Base;Male;Female
n01Not answered;c=-
men are coded as 1, women as 2 and anyone who did not answer is
coded as 3. If you want the data to be exported so that respondents in
the Not answered element have the value missing_, add the option
missingval to the element. For example:
n01Not answered;c=-;missingval
noround Prevents the cell count for this element being altered when rows or
columns are rounded to 100%.
op=A/B Selects an element for use in a percentage difference calculation.
rej=n This is used to reject records belonging in one row from one or more
other rows. The rows from which records are rejected have the option
rej=0, and the row which is to be rejected from them has the option
rej=1. For example:
l ax01
n10Base;rej=0
col 238;Brand A;Brand B;Brand C;
n01DK/NA;c=c238n’1/3’;rej=1
Here, any respondent in the category ‘DK/NA’ will be excluded from the
base row; that is, anyone who does not have a 1, 2 or 3 in column 238.
Quick Reference
To calculate the difference between two percentages, type:
keep[=pc_type]
on one of the elements to be used in the calculation. The default is to use unrounded
figures in the calculation; to use rounded figures, enter pc_type as print.
op=A
to subtract the keep= percentage from the percentage for this element, or:
op=B
to subtract the percentage for this element from the keep= percentage.
Quantum can calculate the difference between two percentages. The difference is simply
the result of subtracting one percentage value from the other (e.g. 25%−15%=10%). This
involves the three options keep, op=A and op=B. For example, the axis:
n11Base
n01Brand A: January;c=c115’1’;keep;op=12
n01Brand A: February;c=c115’2’;op=12
n01Increase(+) / Decrease(-) Jan to Feb;c=c115’2’;op=A
calculates the difference between the percentages for January and February by
subtracting the January percentage from the current percentage, thus:
Base
Brand A: January 100
50%
Brand A: February 150
75%
Increase(+) / Decrease(-) 25%
Jan to Feb
In this example, we have created a separate element for the percentage difference. If you
prefer, you can create an axis in which all elements except the first one contain
percentage differences. If we rewrite our example as:
n11Base
n10Brand A: January;c=c115’1’;keep;op=12
n10Increase/Decrease to February;c=c115’2’;op=A
Base
Brand A: January 100
50%
Increase/Decrease to 25%
February
If we wanted to see the actual figures for February as well, we would write the op=A
element as:
n01Increase/Decrease to February;c=c115’2’;op=12A
Base
Brand A: January 100
50%
Increase/Decrease to 150
February 75%
25%
The order of options with op= is important with this facility if you are to get the figures
you expect. When Quantum encounters a keep element, it looks at the options for that
element (or the defaults for that table or run if none are defined) and keeps the ‘last’ one.
In this instance, ‘last’ means last in the order:
1 Absolutes
2 Column % on current base
6 Column % on first base
0 Row percentage
& Total percentage
8 Indices
3 Row ranks
so with op=120, keep will store the row percentage for calculation purposes. In the
examples above, the options were always op=12, so all calculations were based on
column percentages.
If a cell in the op=A/B element does not have the percentages required by the keep
element, it will use whatever figures are available in the op=A/B element, even if these
happen to be absolute figures rather than percentages. This might happen if the op=A/B
element has a row percentage flag (i.e., op=10A), but the table or run does not have the
option acr100 for percentages on the base column.
Where a table contains keep and op=A/B options in more than one axis, the keep/op= in
the column axis overrides that in the row axis, which in turn overrides that in the third
dimension axis. Keep/op= in higher dimensions are ignored.
Keep and op=A/B may not appear on the same element. Therefore, if you want a table of
rolling differences (i.e., the difference between the current element and the previous
element), you need to create a separate element for each percentage difference element,
which may or may not be printed in the table. Here are two variations of such an axis:
n01January;c=c115’1’
n01February;c=c115’2’;keep
n01% change Jan to Feb;c=c115’1’;op=B
n01March;c=c115’3’;keep
n01% change Feb to Mar;c=c115’2’;op=B
n01April;c=c115’4’;keep
n01%change Mar to Apr;c=c115’3’;op=B
In this example, notice that the % change element refers to the previous rather than the
current month, so that the difference is always calculated by subtracting the current
month from the previous month. This might produce the table:
January 25%
February 28%
% change Jan to Feb 3%
March 26%
% change Feb to Mar -2%
April 22%
% change Mar to Apr -4%
To create the same table so that only percentage difference elements are shown for each
month, the axis may be written with suppressed elements, thus:
n01January;c=c115’1’;keep;op=2
n01Change January to February;c=c115’2’;op=2A
n15;c=c115’2’;keep;op=2
n01Change February to March;c=c115’3’;op=2A
n15;c=c115’3’;keep;op=2
n01Change March to April;c=c115’4’;op=2A
January 25%
Change January to February 3%
Change February to March -2%
Change March to April -4%
Our examples so far have all referred to elements in axes, but keep and op=A/B are also
valid on tab statements to generate tables in which the cell values are the differences
between corresponding cells in two tables. Generally, you’ll use this facility when the
two tables refer to different sections of the population. For example, the first table might
be the whole population and the second table, say, everyone who works
full-time. Quantum does not print this second table as such. Instead, it creates it in
memory but then subtracts the figures for that table from those in the kept table and prints
the results as the second table. Thus, if the population base for the eastern region is 17%
and the base of full-time workers for the same region is 10%, the figure in the final table
will be 7%.
✎ Percentage signs are printed according to the requirements for the table as a whole.
op=A and op=B are mutually exclusive (i.e., cannot be present together).
Quick Reference
To define an element whose counts are to be distributed among the other elements in the
axis, type:
ndi[element_text];c=condition
Quantum allows you to distribute records belonging in one chosen element between the
other elements in the axis. One example might be a No Answer element in a brand list
axis, where those respondents who gave no answer are distributed amongst the brand
elements of the axis.
ndi elements are normally suppressed when the axis is used as the rows or columns of a
table, and are always ignored when the axis is a higher dimension. Use the options row
and/or col to override this suppression with row and/or column axes.
Records belonging in an ndi element are distributed proportionally across all totalizable
elements between the most recent base or ndi statement, whichever is the latest, and the
ndi statement being processed. All totalizable elements are eligible to receive distributed
values regardless of whether or not they are printed. (Totalizable elements are elements
such as n01 and its equivalents which may be added up to form a total. Statistical and
base elements are not totalizable because they are excluded from totals). If you want to
prevent an element from receiving distributed values, include the keyword ntot in its
definition. This flags it as a non-totalizable element which Quantum ignores for
distribution purposes.
✎ The distribution of values is proportional to the sum of the values in the totalizable
elements, not the base, since there may be records which are present in the base but
not in any other elements of the axis, or certain of the elements may be multicoded.
cell = count * (1.0 + Nx / ∑Nj)
where Nx is the number of records in the ndi element and ∑Nj is the sum of the counts in
the totalizable elements between which they are distributed.
Compare the two tables below. The first is a standard table with no ndi element; the
second is a variation of this in which the No Answer element has been defined with an
ndi statement instead of an n01. The 5 respondents in this element have been distributed
proportionally amongst the other elements according to the number of records in those
elements. Hence, the increase from 31 to 33 respondents for Brand A is the result of the
calculation (31*(1+5/95)).
In this example, the base is the sum of the totalizable elements, but this is not always the
case.
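Purely as an illustration (the column number, codes and texts are invented, and this is
not the spec behind the tables discussed above), an axis of this type might be sketched as:
l brands
n10Base
col 125;Brand A;Brand B;Brand C
ndiNo Answer;c=-
Here, records with a blank column 125 would be distributed proportionally between the
three brand elements.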
Part of Quantum’s power lies in the fact that it offers you the ability to create various
types of statistical output without having to know the formulae necessary to calculate
them. This chapter describes these statements and provides examples of how to use them.
It also discusses the statements available for creating totals and sub-totals within a table.
☞ Other less frequently-used statistics are described in chapter 29, chapter 30 and
chapter 32.
Quick Reference
To create a total, type:
n04[element_text]
n05[element_text]
Totals in Quantum are sums of all rows or columns created by n01, n15, col, val, bit or
fld statements. Statements which generate totals are n04 and n05.
Regardless of its position in the axis, the n04 sums all counts from the beginning of the
table up until the end of the table, or until another base-creating element is read in the
axis, whichever comes sooner. Therefore you need not put the n04 at the end of the axis
if that is not where you want it to go. For instance,
n10Base
n04Total
col 257;First Row;Second Row;Third Row
If the n04 is placed between the top of the table and a base, it provides a total of all counts
between the top of the table and that base. Likewise, if it occurs between a base and the
bottom of the table, the total is the sum of all counts between the base and the end of the
table. If it is between two bases, the n04 totals all counts between the two base rows. If
one of the other rows is a subtotal (n05), the n04 will ignore the figures in that row.
There is one thing to beware of. Some tables will include net rows which are counts of
respondents included in the previous or following groups of rows. If the net is created
using an n01;c=+ statement, the total calculated by the n04 will include not only the
individual rows but also the nets of those rows. The n04 is a sum total of counts, just as
if you had typed the figures from each row into a calculator. If you want to produce a total
which excludes nets you will need to replace the n04 with an n01 containing an
appropriate condition, or create the net using a net statement.
l try1
n10Base
col 153;Base;hd=Overall Rating;Excellent;Very Good
n01Excellent/Very Good (Net);c=+2
col 153;Good=’3’;Fair=’4’;Poor=’5’;DK/Refused=’&’
n04Total
The total for this axis will be the sum of the two rows created by the first col statement,
plus the n01 row, plus the three rows on the second col statement. To exclude the net row
from the total, replace the n04 by an n01, as follows:
l try1
n10Base
col 153;Base;hd=Overall Rating;Excellent;Very Good
n01Excellent/Very Good (Net);c=+2
col 153;Good=’3’;Fair=’4’;Poor=’5’;DK/Refused=’&’
n01Total;c=+
The condition c=+ says that anyone who has been included in any of the previous rows
should be included in this total row. Each person is counted once only, regardless of the
number of rows he appears in. If column 153 could be multicoded, we would have used
the condition c=c153’1/5&’ to be sure that our total was a count of codes rather than of
people.
The n05 creates subtotals starting from the beginning of the table, or the most recent base,
or the most recent n05, whichever comes last, and ending with the element immediately
before the n05 itself, as illustrated in Figure 19.1.
The table and axes described below provide a good example of how n04s and n05s can be used.
If you are new to Quantum, you may be glad of a little more explanation about this table.
The table base is the number of people buying bread, and the rest of the table shows the
number of loaves purchased by those people. The cell with the conditions ‘White’ and
‘11-20’ under the subheading ‘Bought from Supermarket’ tells us that 429 loaves were
bought by people who bought white bread in a supermarket and who bought between 11
and 20 loaves of bread last month. The cell is a count of loaves bought rather than of
people buying bread because of the inc= on the n00 statement in the axis. This says,
whenever a respondent satisfies the condition for a cell, increment that cell by the value
in c(132,133) rather than by 1.
Without the inc=, this cell would tell us how many people who bought 11 to 20 loaves
last month bought white bread in a supermarket.
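Drawing these threads together, here is a sketch of an axis that combines n05 subtotals
with a final n04 (the column numbers, codes and texts are invented):
l purchase
n10Base
col 130;Monday;Tuesday;Wednesday;Thursday;Friday
n05Subtotal: weekdays
col 131;Saturday;Sunday
n05Subtotal: weekend
n04Total
The first n05 sums the five weekday rows, the second n05 sums the two weekend rows,
and the n04 sums all seven day rows while ignoring the two subtotal rows.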
The statistical elements described in this chapter are:
n07 average
n12 mean
n13 sum of factors
n17 standard deviation
n19 standard error of the mean
n20 error variance of the mean
n30 medians
In most cases, all you need to do is enter the statement at the point at which it is to be
printed. All statements are formatted in the same way, namely:
naa[text];[options]
where ‘aa’ is the n statement number. The only options valid on all statistical statements
are dec=, id= and scale=. The n30 also needs a fac= to define the percentile required (for
example, fac=50 for the median); the elements preceding it carry the factors for each
range of values.
If the table contains absolute and percentage figures, the percentages are suppressed for
the rows or columns created by these elements.
✎ Quantum differentiates between means and other statistics (e.g., standard deviation)
in which the sum of values is zero, and those in which the sum of cases (i.e.,
respondents) is zero. If your run or table uses the option spechar to print special
characters in place of zeroes, Quantum will only print the special character if the
number of cases going into the mean is zero. Thus, 0/3 is always printed as zero,
whereas 0/0 will be printed as the special character for zero if one is defined.
19.3 Averages
Quick Reference
To create an element whose value is the average of the values in the other elements of the
axis, type:
n07[element_text]
In Quantum, the average (n07) is calculated by summing up the cell counts generated by
n01, col, val, bit and fld statements, and dividing the result by the number of elements
summed. Only rows before the n07 are included. If the axis contains more than one n07,
the second average deals only with those rows between it and the previous n07.
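As a sketch only (the column ranges and texts are invented), an axis that prints an
average of three inc= rows might be written as:
l visits
n10Base
n01Shop A;inc=c(109,110)
n01Shop B;inc=c(111,112)
n01Shop C;inc=c(113,114)
n07Average visits per shop
The n07 row shows the sum of the three shop rows divided by three, the number of
elements summed.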
If we had placed an n07 at the end of the row axis in Figure 19.1, the average for the base
column would be:
Do not confuse the average with the mean. The average is based on counts in the table,
and the rows containing those counts; the mean is based on values in the data (or factors
associated with those values), and respondents.
☞ For further information on means, see the section entitled "The mean, standard
deviation, standard error and error variance" below.
The cell counts are the same as for the average because they are read directly from the
data with inc=, but instead of dividing by the number of elements we divide by the
number of respondents giving those answers.
✎ Be careful if you use averages in axes with nets: the net figures will be included in
the average which is probably not what you want.
19.4 The mean, standard deviation, standard error and error variance
Quick Reference
To create an element whose value is the mean of the values in the other elements of the
axis, type:
n12[element_text]
n17[element_text]
n19[element_text]
n20[element_text]
All rows that are to be included in these calculations need factors. These may be defined
using fac= on each row or by the option inc= on an n25 statement. If the latter, the n25
must come before the statistical statement.
When an axis contains both fac= and n25;inc=, and statistics were requested, Quantum
must decide which values to use in the statistical calculations. To do this, it works
backwards from the statistical element to the start of the axis looking for elements with
fac= or an n25 statement. If it finds an n25 first (i.e., latest in the axis), it uses the values
named with inc=. If the first thing found is a group of elements with fac=, then the
statistics are based on those factors. If neither is found, Quantum issues an error message
to that effect.
The mean (n12) reports the mean value of the factors or values belonging to each
respondent.
The standard deviation (n17) tells you the amount by which you would expect
respondents’ answers to differ from the mean. You can apply the standard deviation to
the mean at three levels. If you just take the standard deviation as reported by Quantum,
you would expect 67% of answers to lie within the range mean±std.dev. You may also
expect that 95% of answers lie within the range mean±2std.dev, and that 99% of answers
will lie within the range mean±3std.dev.
You use the standard error (n19) to estimate the mean for the population as a whole,
based on the mean of your sample. You can be 95% certain that the mean score for the
population as a whole will lie within the range mean±2std.err.
The sample variance (also known as the error variance) is the square of the standard error.
Quick Reference
To define incremental values for means, standard deviations, standard errors and error
variances, type:
n25[element_text];inc=expression
The n25 does not normally print anything in the table. Use row and/or col to print these
values as the rows and/or columns of the table.
The n25 does not normally print out any rows in the table, but it creates three rows which
are used by Quantum as part of the other statistical work. They are the sum of x2, the sum
of x and the sum of n, and they are always in that order. To clarify this, here are examples
of Quantum statements which you could write to produce the same figures. Throughout
the examples we have used arbitrary columns and codes. The sum of n is the number of
people in the table, which Quantum calculates as:
n01Number of people;c=c115’2’
The sum of x is the sum of the multipliers (defined using fac= or inc=):
n01sum of mult;inc=c(122,125);c=c115’2’
If you want to print these figures in your table, add the options row and/or col to the n25
statement. If the axis is used to create the rows of a table, and row appears on the n25,
three rows will be printed labeled text.1, text.2 and text.3 for the sum-of-x², the sum-of-x,
the sum-of-n respectively. Text is the element text defined on the n25. The same applies
when the axis is used as a column axis and the n25 contains the option col, except that
three columns are produced, rather than rows.
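For instance, the sketch (element text and column range invented):
n25Spend;inc=c(122,125);row
would print three rows labeled Spend.1, Spend.2 and Spend.3 when the axis forms the
rows of a table.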
Using factors
As we said in the section entitled "Data options" in chapter 18, fac= defines factors when
the numbers in the data are not to be used (e.g., the data may be multicoded) whereas
inc=, also mentioned in the Data Options section, reads the data from the column and
uses that as the factor for each row. What to use when is best illustrated by examples,
although in general you should try to use fac= whenever possible since, in processing
terms, it is more efficient than inc=.
The respondent has been asked to say how much he agrees or disagrees with a particular
statement. If he agrees very much, he has a code ’1’ in, say, C210. If he agrees somewhat,
he has a ’2’; if he neither agrees nor disagrees he is coded as ’3’; disagrees somewhat, a
’4’ and disagrees very much, a ’5’. People who refuse to answer are coded as C210’&’.
We wish to obtain a numerical mean value of these opinions using factors of +2 for agrees
very much down to –2 for disagrees very much. These are not the same as the codes
representing these responses in the data, so we enter them with fac=. People who refused
to answer will appear in the table but will not be included in the mean.
We can write this axis using n01s or a col statement; both are equally correct:
l vers1
n01Agrees Very Much;c=c210’1’;fac=2
n01Agrees Somewhat;c=c210’2’;fac=1
n01Neither Agrees Nor Disagrees;c=c210’3’;fac=0
n01Disagrees Somewhat;c=c210’4’;fac=-1
n01Disagrees Very Much;c=c210’5’;fac=-2
n01Refused;c=c210’&’
n12Mean;dec=2
l vers2
col 210;Agrees Very Much;%fac=2-1;Agrees Somewhat;
+Neither Agrees Nor Disagrees;Disagrees Somewhat;
+Disagrees Very Much;Refused=’&’;%nofac
n12Mean;dec=2
The mean is the sum of the factors divided by ‘n’, where ‘n’ is the number of respondents
having an opinion – c210’1/5’. If there were 25
people who agreed very much, 19 who agreed somewhat, 7 who neither agreed nor
disagreed, 15 who disagreed somewhat, 3 who disagreed very much, and 6 who refused
to answer, the calculation would be:
Mean = [(2x25)+(1x19)+(0x7)+(−1x15)+(−2x3)] / 69
yielding a mean value of 0.70. Notice that the 6 people who refused to answer were
ignored completely by the n12.
Using n25;inc=
Now suppose that instead of using factors of +2 to −2 we are asked to use factors of 1 to
5, where 1 indicates that the respondent agrees very much and 5 means that he disagreed
very much. Since these match the way the responses have been coded, we can use an n25
with an inc= to define the factors:
l vers3
col 210;Agrees Very Much;Agrees Somewhat;
+Neither Agrees Nor Disagrees;Disagrees Somewhat;
+Disagrees Very Much;Refused=’&’
n25;inc=c210;c=c210’1/5’
n12Mean;dec=2
This time we exclude respondents refusing to answer by adding a condition on the n25,
indicating that it refers only to those respondents having a 1, 2, 3, 4 or 5 code in c210. If
we keep the same number of respondents in each group as before, the mean will be 2.30
as shown below:
Mean = [(1x25)+(2x19)+(3x7)+(4x15)+(5x3)] / 69
When you have missing values processing switched on in the tab section, Quantum
automatically excludes records with the value missing_ from the inc= element. In the
axis:
l rental
val c(156,157);Base;i;None=0;1-5;6-10;11-15;16-20;21-30;31+
n01DK/NA;c=-
n25;inc=c(156,157)
n12Mean number of videos rented by all who rent videos;dec=2
the n25 statement will include any respondent for whom c(156,157) contain a numeric
value. Respondents for whom the field is non-numeric or totally blank will be excluded.
If you want to create the same axis without using missing values processing, you can
write:
✎ If you want other values to be treated the same as missing values, add the option
missing=logical_expression to the n25 statement.
☞ For further information, see the description of inc= under the heading "Data
options" in chapter 18.
Weighted runs
Quick Reference
Quantum normally calculates the standard error and the error variance using the
unweighted count of respondents. If you would prefer to use the effective base in the
calculation, place the keywords:
nsw;useeffbase
on the a statement.
In weighted runs, the standard error (n19) and error variance (n20) of the mean are
calculated using an unweighted count of respondents (sum-of-n) rather than the weighted
count of respondents. This unweighted value is calculated automatically and is stored as
an n15 element with the option nontot to prevent it being included in totals created by
n04 or n05 elements. Additionally, standard deviations, standard errors and error
variances for which the weighted base is less than 1.0 result in a value of zero.
If you would prefer to calculate the standard error using weighted figures, place the
keywords useeffbase and nsw on the a statement. This tells Quantum to use the effective
base rather than the unweighted count of respondents. The effective base is a value that
is based on weighted totals but takes into account the possibility that the weighting may
have drastically altered the proportion of one group of respondents relative to another. It
is calculated as:
effective base = (sum of the weights)² / (sum of the squared weights)
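As an illustration with invented weights: if an axis contains three respondents whose
weights are 1, 2 and 3, the effective base is
(1+2+3)² / (1²+2²+3²) = 36/14 = 2.57
rather than the unweighted count of 3.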
An axis may contain more than one block of factors and associated statistics: each block
will be dealt with independently.
☞ See the section entitled "The n15 statement" in chapter 17 for information on the n15
statement.
Formulae for the mean, standard deviation, the standard error of the mean and the
error variance of the mean are as shown at the end of this chapter.
For a more detailed description of the effective base and the nsw option, see
section 32.3.
Quick Reference
To create an element whose value is the sum of the factors used in the other elements of
the table, type:
n13[element_text]
The sum of factors created by an n13 is similar to the mean (n12) except that there is no
division by the number of respondents. The calculation merely involves taking each row
and multiplying the number of respondents in that row by the factor defined by fac= or
inc=, and then adding these results together to get the sum of factors for all respondents.
In the examples used for means, the sum of factors where factors of +2 to –2 were used
is 48, but the sum of factors when they are 1 to 5 is 159.
19.7 Medians
Quick Reference
To create a median element using factors, type:
n30[element_text];fac=factor[;dec=dec_places]
where factor is 50 for a median but may be any number between 1 and 100 depending on
the calculation required.
To create a median element using values read directly from the data, type:
n01[element_text];inc=variable;median;c=log_expr
There are two ways of calculating medians: with factors (fac=) or with multipliers
(inc=). Multipliers are more accurate and are used when the data contains the actual
answer the respondent gave rather than a code representing a range of responses. Factors
are used when responses have been coded into ranges or when deciles or quartiles are
required. The difference between the two will become clear as we explain how to create
them.
The format of medians with factors is:
n30[Text];fac=n[;dec=p]
where n is a number between 1 and 100 indicating the desired level (e.g., fac=50 for
medians, fac=10 for the first decile, and so on), and p is the number of decimal places
required (the default is dec=0).
The n30 must follow a set of n01/n15/col/val statements with factors defining the highest
point of each range. The statements must be arranged so that the factors are listed in
sequential order. Quantum uses the factors to determine the range in which the median
falls, and then interpolates within that range to produce a single median value. These
factors are incompatible with those required
for means. For example, suppose we have ranges of income coded into c281. We could
set up our axis for medians and quartiles as follows:
l income
n10Base
n23Household Income (in Pounds)
n01No Income;c=c281’0’;fac=0
col 281;1-1000;%fac=1000+1000;
+1001-2000;2000-3000;3001-4000;4001-5000;5001-6000;
+6001-7000;7001-8000;8001-9000;9001-10000;
+10001-10500=’-’;%fac=10500+500;10501+=’&’
n03
n30First Quartile;fac=25;dec=2
n30Median Income;fac=50;dec=2
n30Third Quartile;fac=75;dec=2
The effort involved in writing out this axis has been minimized by using a col statement
instead of n01s. However, if you only wanted to show the statistics (i.e., the three n30
rows) you would have to suppress the other rows either by writing each one separately
on an n15, or by using the option smsupa= on the rows to be suppressed.
If the midpoint is the range 6001-7000, Quantum will interpolate to produce a median
value of, say, 6258.
The n30 calculation for medians requires factors to be assigned to elements in ascending
sequential order. If you define factors in reverse sequential order, that is, from highest to
lowest, Quantum still calculates a median but it is incorrect when compared with the data.
Here are two simple tables to illustrate this. The row axis for the first table is:
l rating1
col 137;Base;Excellent (fac=1);%fac=1+1;Very Good (fac=2);
+Good (fac=3);Fair (fac=4);Poor (fac=5)
n12Mean;dec=2
n30Median;dec=2;fac=50
l rating2
col 137;Base;Excellent (fac=5);%fac=5-1;Very Good (fac=4);
+Good (fac=3);Fair (fac=2);Poor (fac=1)
n12Mean;dec=2
n30Median;dec=2;fac=50
rating 1 rating 2
Base 25 Base 25
Excellent (fac=1) 4 Excellent (fac=5) 4
Very Good (fac=2) 5 Very Good (fac=4) 5
Good (fac=3) 3 Good (fac=3) 3
Fair (fac=4) 4 Fair (fac=2) 4
Poor (fac=5) 9 Poor (fac=1) 9
Mean 3.40 Mean 2.60
Median 3.13 Median 2.88
The first table shows that the median rating was between Good and Fair. This is correct
because the value 3.13 comes between the code 3 for Good and the code 4 for Fair.
The second table also shows that the median rating was between Good and Fair.
However, when compared with the data, the median of 2.88 implies that the rating comes
between the code 2 for Very Good and the code 3 for Good. This is wrong.
To create a table that uses the reverse order factors and produces a correct median, define
the axis in two sections. The first section defines the elements as they are to be printed
and with the factors required to create the mean. The second section defines the factors
for the median but because it is defined using n11 and n15 statements it does not create
lines in the table. In this section the elements are defined in the opposite order (Poor to
Excellent) so that the factors for the median appear in ascending sequential order. Here
is the revised axis:
l rating3
col 137;Base;Excellent (fac=5);%fac=5-1;Very Good (fac=4);
+Good (fac=3);Fair (fac=2);Poor (fac=1)
n12Mean;dec=2
n11
n15Poor;fac=1
n15Fair;fac=2
n15Good;fac=3
n15Very Good;fac=4
n15Excellent;fac=5
n30Median;dec=2;fac=50
Base 25
Excellent (fac=5) 4
Very Good (fac=4) 5
Good (fac=3) 3
Fair (fac=2) 4
Poor (fac=1) 9
Mean 2.60
Median 2.88
Very occasionally Quantum may produce medians that look wrong. This will usually
happen when the medians are based on factors and all respondents appear in the first
element with a factor. For example, if you have a rating scale with factors of 6 for
Excellent through to 1 for Poor, and all respondents rated the product as excellent,
Quantum will report the median value as 3.0. You might have expected the median to be
6.0 since that is the rating everyone gave. The reason for the difference is due to the way
that factored medians work.
Factored medians attempt to mimic the way real medians work by calculating artificial
ranges for each element. These ranges are based on the factor assigned to the current
element and the element immediately before it. In our example, the factor for the second
element, very good, is 5 and the factor for the previous element, excellent, is 6, so if all
respondents answered very good, the median would be the midpoint between 5 and 6,
namely 5.5.
When Quantum looks at the first element in the rating scale, it finds that there is no
previous element so it uses a factor of zero for the nonexistent element. The range for
Excellent is therefore 0 to 6 and its median is 3.0.
If you have a table of this type, and you want to see a median that matches the factor of
the element in which all respondents are held, insert a nonprinting n15 element with an
impossible condition above the first real element, and give it the same factor as the real
element. In our example, if we inserted a dummy element with a factor of 6 before the
Excellent element, the median for all respondents in Excellent would be the midpoint of
6 to 6 which is 6.0. Here is the axis:
l rating
n10Base
n15;c=c143’1’.and.c143n’1’;fac=6
col 143;Excellent;%fac=6-1;Very good;Good;Satisfactory;
+Poor;Very poor
n30Median;fac=50;dec=1
This use of a dummy element does not affect means, nor does it affect the median
calculation if respondents are more evenly spread across the elements. It also works with
reverse factors (1 to 6, say) although you would have to change the factor on the dummy
element to match the lowest rather than the highest factor in the scale.
✎ In statistical terms, medians are only useful on data that is spread out across the full
range of the scale, that is, where respondents are present in all elements. Medians
based on data that is clustered at one or other end of the scale may be misleading.
Quantum can calculate medians using values read directly from the data file. An example
is where the data contains the actual family income rather than a code representing a
range of incomes. This allows you to obtain a more accurate median by using inc= to read
the values from the data. The format of medians with inc= is:
n01[element_text];inc=variable;median;c=log_expr
For example:
n01Median Value;inc=c(123,127);median;c=c(123,127)u$ $
Notice that this calculation uses an n01 for both the elements and the median: it does not
use an n30. Thus, our example might read:
l income
val c(123,127);Base;hd=Household Income (in Pounds);=; 0; i;
+1-1000; 1001-2000; 2001-3000; 3001-4000; 4001-5000; 5001-6000;
+6001-7000; 7001-8000; 8001-9000; 9001-10000; 10001-10500; 10501+
n01Median Income;inc=c(123,127);median;c=c(123,127)u$ $
If you are in doubt as to how to request medians on a table, follow the rules below:
As with the other statistical statements, the keyword median suppresses percentages for
the row or column of median values.
When Quantum calculates real medians it writes information to temporary files on the
disk which it then reads later in the calculation. If you see error messages to do with
Quantum being unable to locate or open files at this point it is possible that you filled up
the disk and the file was not created. If this happens, try to make some extra space on the
disk before rerunning your job. Alternatively, remove some of the median requests or
split the job into two smaller jobs.
Quantum provides four methods of interpolation for calculating median values. This is
defined using the option medint= on the a, tab or flt statement, as described below. In the
examples which we use to illustrate the various types of interpolation, the table was as
follows:
Base
Base 20
Value is 1 5
Value is 2 6
Value is 3 3
Value is 4 3
Value is 5 3
medint=0 Interpolate between the value exceeding the 50% mark and the
previous value. This is the default. In the sample table, the median is
1.83, created by interpolating between the first two rows – that is,
between the 10th and 11th values (1 + (10-5)/6 = 1.83).
medint=1 No interpolation. Quantum reports the exact value which exceeds 50%.
If the value is exactly 50%, then the median is reported as being the
midpoint between this and the next value. The median is reported as
2.00 since this is the 11th value found.
medint=2 Interpolate between:
Quick Reference
To calculate and print an F value and a triangle of T values for a group of columns, type:
nft
The nft element creates an F value and a triangle of T values for groups of columns. A
group of columns starts at a base or n23 statement and continues until another base or n23
is read, or until the end of the axis, whichever is sooner. Columns which are
non-totalizable (e.g., n04, n05, n12) or not printed are ignored. Groups with fewer than two
non-ignored elements are ignored.
This element is meaningful only in row axes and is therefore ignored in column or higher
dimension axes. It is specified simply as nft with no row text or options.
With ordinary formatted output, the F value for a group is printed under the middle of
that group. With PostScript output, it is printed under the rightmost column in the group.
The probability, expressed as a percentage, is printed underneath the F value.
The triangle of T values is lined up with the values in the columns to which they refer.
The probability for each T value, expressed as a percentage, is printed underneath the
corresponding T value. Asterisks are printed down the leading diagonal of T values.
Here is an example of the kind of table that nft produces:
Base 340 65 93 76 70
Excellent 95 21 20 28 16
Very Good 21 4 5 7 5
Satisfactory 70 15 21 15 19
Poor 69 14 21 17 17
Abysmal 49 11 21 9 13
F stat 2.40
F prob 6.55
T stat * -1.90
T prob 5.68
T stat *
T prob
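Purely as an illustration (the column number, factors and texts are invented, and this is
not the spec that produced the table above), a row axis requesting these statistics might
be sketched as:
l rating
n10Base
col 143;Excellent;%fac=5-1;Very Good;Satisfactory;Poor;Abysmal
nft
The nft element is simply placed at the end of the row axis; the groups of columns it
compares are defined by the axis used as the breakdown.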
19.9 Formulae
The formulae for the statistics described in this chapter are given below.
The table below shows a typical rating axis with a number of statistical values. The
factors assigned to each element are shown in parentheses after the element text. Notice
that the NA/DK element has no factor so it is excluded from the statistical calculations.
The explanations of the formulae refer back to this table to illustrate how the figures were
produced:
Base 22 8 5 4 5
Very satisfied (5) 3 1 2 0 0
Satisfied (4) 3 0 2 1 0
No opinion (3) 6 2 0 3 1
Dissatisfied (2) 4 2 0 0 2
Very dissatisfied (1) 4 2 0 0 2
NA/DK 2 1 1 0 0
In the formulae, ni is the number of respondents in element i, xi is the factor assigned to
element i, and N is the number of respondents included in the calculation.
A dot suffix indicates summation over the replaced index; so, for example, the formula
for a column total is:
n.j = ∑i nij
Sum of factors
sum of factors = ∑ ni xi
Taking the column for respondents aged 18-24 in the sample table, the sum of factors is
calculated as follows:
(5x1) + (4x0) + (3x2) + (2x2) + (1x2) = 17
Mean
mean = ( ∑ ni xi ) / N
Taking the column for respondents aged 18-24 in the sample table, the mean is calculated
as follows. The sum of factors is 17 as described above, and the number of respondents
included in the sum of factors is:
1+0+2+2+2 = 7
so the mean is 17/7 = 2.43.
Viewed in relation to the element texts, this tells you that the mean rating given by 18-24
year-olds is between Dissatisfied and No opinion.
Standard deviation
s = sqrt( ( ∑ ni xi² − ( ∑ ni xi )² / N ) / ( N − 1 ) )
The standard deviation tells you the amount by which you would expect respondents’
answers to differ from the mean. For 18-24 year-olds this is:
s = sqrt( ( 53 − 17²/7 ) / 6 ) = 1.40
You would expect 67% of answers to lie within mean±std.dev; that is, within the range
2.4±1.4 (between 1.00 and 3.80).
Standard error of the mean
se = s / sqrt(N)
For 18-24 year-olds this is 1.40/sqrt(7) = 0.53. You can be 95% certain that the mean
score for the population as a whole will lie in the range mean±2std.err. In this example
the population mean should lie within the range 2.4±2*0.53; that is, somewhere between
1.34 and 3.46.
Error variance of the mean
sv = se²
For 18-24 year-olds this is 0.53² = 0.28.
The variation of the formula used when the run contains the useeffbase keyword is as
follows. Let the sum of the squared weights, as calculated by the nsw statement, be:
wi² = ∑k wik²
where k moves through all respondents in the axis, and let wi. (the sum of the weights)
be ∑k wik. The effective base is then:
ei = (wi.)² / ∑k wik² = (wi.)² / wi²
and the standard error and error variance of the mean become:
se(x) = sqrt( s² / ei )
sv(x) = ( se(x) )²
Let:
coln be the number of cases in column n
colnx be the sum over all cases in column n of the fac= or inc= values
colnxx be the sum over all cases in column n of the squared fac= or inc= values
where the sums below run over the ncol columns in the group. Then:
F = [ ( ∑ colnx²/coln − ( ∑ colnx )² / ∑ coln ) / ( ncol − 1 ) ] /
[ ( ∑ colnxx − ∑ colnx²/coln ) / ( ∑ coln − ncol ) ]
Let colPnxx, colPnx and colPn be the same as colnxx, colnx and coln, defined above, for
values P=1 and P=2 (the two columns being compared). Then:
T = ( col2nx/col2n − col1nx/col1n ) / sqrt( ts1 + ts2 )
where:
tsP = ( colPnxx − colPnx²/colPn ) / ( colPn * ( colPn − 1.0 ) )
The text across the top of the table defining the contents of the columns is usually called
the breakdown or banner, and is defined in the column axis. All axes may be used to
create either rows or columns, so there is no need to write two versions of the same axis
simply because at some times it forms the rows of an axis and at other times it forms the
columns. There are three ways that Quantum can create column headings:
i) fully automatically using element texts. The column width is calculated by dividing
the space available for column headings by the number of columns to be printed.
ii) semi-automatically using element texts and a user-defined column width defined
with colwid= on the a, sectbeg, flt, tab or l statement or on the individual elements
themselves.
iii) manually using heading texts defined on g statements and column widths defined on
p statements.
Before it prints any tables, Quantum calculates the maximum amount of space available
for table headings using the formula:
space = pagwid − side − %_sign − pcpos_value
where pagwid is the page width (default 132), side is the width of the row text (default
24). %_sign is an additional position needed at the end of each line to print the percentage
sign for the last column, and pcpos_value is the amount of offset defined for percentages
with the option pcpos=. These options are ignored if there are no percentages, or if
percentages are printed without percentage signs (nopc), or if there is no pcpos= option
for this table.
Any tables in which the amount of space required for the columns of the table is wider
than this must be split. For example, suppose you have:
a;pagwid=78;side=20;pcpos=-1
which leaves 58 characters of space for the column headings.
With this method, the width of the column is calculated automatically and the text of the
breakdown is taken from the row text of the count-creating statements in the axis. Long
texts are broken up by Quantum.
The width of each column is calculated by dividing the maximum width available for the
table heading by the number of columns required. If the table is 132 columns wide with
a row-text width of 24 (both defaults), the maximum heading width will be 132−24 = 108 characters.
If the axis has ten count-creating elements, each column will be allocated ten spaces. If
there are 15 count-creating elements, each one would be seven characters wide.
The maximum and minimum column widths allowed are 16 and eight characters
respectively. If the result of the division is greater than 16, the column width is set to 16
characters; if the result is less than eight, the column width is set to eight characters.
The overall column headings are also generated automatically. The top line contains the
axis heading (from hd= on the l statement) centered across the entire breakdown. The
second line contains the axis subheadings (from an n23 statement or hd= on a col or val)
centered across the columns to which they refer. The text for each column is taken from
the relevant n01/col/val statement. This text is split up above the column as logically as
possible, using spaces, hyphen or slashes in the text as breakpoints. Column headings
will be right-justified and may be split over four lines. A simple example is:
l region;hd=Area of Residence
col 15;Base;hd=London;Inner London;Outer London;
+hd=Southern England;Cornwall/Devon;Kent/Surrey/Sussex; ...
Area of Residence
Kent/
Inner Outer Cornwall/ Surrey/
Base London London Devon Sussex
Here, we have made each column eleven characters wide. The axis heading ‘Area of
residence’ is centered over all columns, including the base, whereas the axis subheadings
are centered over the columns to which they refer.
If you would prefer left or right justified subheadings, use hdpos on the n23 statements.
☞ See the section entitled "Laying out subheadings" in chapter 17 for further
information on positioning of subheadings across columns.
If the axis contains more than one n23 subheading, the text on each one will be centered
above the elements between it and the next n23 or the end of the axis. If you want to alter
this so that you can create blocks of text above the same group of columns, set up nested
headings using hdlev= on the n23 statement as described in the section entitled "Nested
subheadings for column axes" in chapter 17. For example:
l ban01
n23Visitors to the Museum;hdlev=1
n10Base
n23Sex;hdlev=2
col 110;Male;Female
n23Age;hdlev=2
col 111;11-20=’12’;21-34=’34’;35-44=’56’;55+=’78’
n23Visited;hdlev=2
n23Museum Before;hdlev=2
col 116;Yes;No
If you would prefer to use a different text for when an element forms a column in a table,
you may define this text with the option toptext= on the element. For example to replace
the two-line heading Visited Museum Before with the single line Been Before, you
would write:
l ban01
n23Visitors to the Museum;hdlev=1
n10Base
n23Sex;hdlev=2
col 110;Male;Female
n23Age;hdlev=2
col 111;11-20=’12’;21-34=’34’;35-44=’56’;55+=’78’
n23Visited Museum Before;hdlev=2;toptext=Been Before
col 116;Yes;No
and, with the default of centralized texts, the headings might be printed as:
Absolute figures are printed with the right-most digit one print position to the left of the
right-most character in the column heading. Percentages are printed with the right-most
digit under the right-most character in the column text, unless flush or pcpos are present
on the a, sectbeg, flt or tab statement. If flush is used, percentages are printed directly
under absolutes. For example:
When Quantum prints column headings, it prints two blank lines before the first line of
text and one blank line after the last line. You may alter one or both of these settings using
the keywords linesbef= and/or linesaft= on the a, flt, sectbeg or tab statement. For
example, if you type:
a;linesbef=3;linesaft=2
Quantum will print three blank lines before column headings and two blank lines after
them in all tables in the run.
Quick Reference
To mark breakpoints in element texts for use when those texts are used to form column
headings, use the characters ! or |.
When you have a long element text in a column axis, Quantum will normally fit as much
text as possible in the column width. If the text is wider than the column, Quantum goes
back to the previous space character and breaks the line there. If there are no spaces,
Quantum breaks the line in the middle of words.
Sometimes this method produces satisfactory column headings and other times not. One
way round this is to define breakpoints for use only when the axis is used as the columns
of a table. If the element text is too wide for the column, Quantum will split the text at
the breakpoints. The characters you use to mark breakpoints are exclamation marks and
bars (! and | ). For example:
n01Respon!dents Who!Did Not!Buy Bread
would create at most a four-line heading when the axis is used as a column axis:
Respon
dents Who
Did Not
Buy Bread
! and | are only applicable to column axes. If the axis is used as the rows of a table the !
and | are replaced by a space and nothing respectively.
If you want to print either of these characters as part of the element text you must switch
off their special meanings using either the qtform file or the QTFORM environment
variable. Any changes you make in this way apply to the whole run, not just the axis in
which you wish to print these characters. If you need to print ! as part of the element text
but still wish to define breakpoints, remove the special meaning from ! and use | to mark
the breakpoints.
Quick Reference
To define widths for columns in table headings, type:
colwid=num_spaces
The option colwid=n defines the number of print positions to be allocated to a column.
It may appear on a, flt, sectbeg, tab and l statements to define column widths for an axis,
table, or group of tables, as well as being used on n, col or val statements to define column
widths for individual items.
Any texts which exceed the colwid= column width will be split at blanks, hyphens or
slashes, or at the special characters ! and | .
If colwid= is present anywhere in a run, it will override the automatic set-up for the tables
or axes to which it refers. If colwid= is present at more than one level, the setting at the
lowest level overrides those at any higher levels. For example, colwid= on an element
overrides colwid on the l statement or higher for that element only. colwid= on an l
statement overrides colwid on the a statement for that axis only.
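For instance, the sketch (the axis name and widths are invented):
a;colwid=9
l demog;colwid=12
would give columns nine characters wide in all tables except those built from the demog
axis, whose columns would be twelve characters wide.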
Column headings are defined manually either when they differ from the row texts or when
the automatic breakdown is not acceptable. The statements which do this are g and p.
In certain circumstances, Quantum will ignore the heading texts on the g statements and
will use the element texts instead. It will, however, retain the column widths defined on
the p statements. Quantum ignores g statements when:
c) the table is sorted with csort on the a, sectbeg, flt or tab statement
d) the column headings defined on the g statements are too wide to fit on the page. In
this case you’ll see the message ‘error: G-card width xx too wide for tab G space yy’
printed at the output stage of the run.
✎ When setting up a breakdown make sure that each column is wide enough to
accommodate the largest value to be set in the column. If the value has percentages
as well as absolutes, the breakdown must be large enough to contain the percentages
set one character to the right of the absolutes, unless flush or pcpos is used.
☞ See section 20.2 above for information on positioning percentages in columns.
Quick Reference
To define column headings, type:
g text_of_column_headings
Column headings must be written on g statements exactly as they are to appear in the
printed output. An axis may contain as many as 20 g statements grouped together to form
blocks of column text.
For instance:
g Marital Status
g Base Single Married Divorced Widowed
An axis may contain more than one block of g statements as long as each group is
followed by a p statement.
The texts on each line of the second block will be placed adjacent to those on the first.
Do not worry if some blocks have more g statements than others: Quantum always starts
with the bottom line in each block, and adds extra spaces where necessary.
When printed in the table, these two lines will be printed as one:
Column headings start in the column immediately after the g, and are placed in the table
in the column after the end of the side-text. Thus, if our row-text width is 24 characters,
the column headings start in print position 25 unless you leave extra spaces before the
leftmost text on the first block of g statements.
When groups of g statements are placed side by side, Quantum does not leave any spaces
in between the two unless you tell it to. This can be done in one of two ways. You may
either insert the number of spaces required at the start of the headings on the second group
of g statements, or you may place an ampersand (&) on the end of each of the first group
of g lines in the column in which the texts from the second group are to start.
Quick Reference
To define the positions in which cell counts will be printed relative to the column
headings defined on g statements, type:
p x [x .... ]
p x x ...
where x is any non-blank character. Absolute figures are then right-justified in the
columns marked with a character.
Columns of figures may be separated by vertical lines. These are initiated by a vertical
bar in the appropriate columns of the p statement, thus:
l ax04
col 109;Base;Single;Married;Divorced;Widowed
g Marital Status
gBase Single Married Divorced Widowed
p x | x | x | x | x
produces:
Marital Status
Base Single Married Divorced Widowed
Base 200 | 44 | 122 | 33 | 1
Male 44 | 6 | 27 | 10 | 1
Female 156 | 38 | 95 | 23 | 0
Quick Reference
To underline text defined on g statements, type:
un
after the last g statement in a block. n is 1 to underline the whole text, 2 to underline
everything apart from blank strings, or 3 to underline non-blank characters only.
un
u1 underlines the whole line up to and including the last non-blank character; u2
underlines the whole line (as u1) except that strings of blanks are not underlined; u3
underlines non-blank characters only. This is the same as unl1, unl2 and unl3 on
n/col/val/fld/bit statements. Here are the spec and table from the previous section
updated to include underlining:
l ax04
col 109;Base;Single;Married;Divorced;Widowed
g Marital Status
u2
gBase Single Married Divorced Widowed
u1
p x | x | x | x | x
produces:
Marital Status
Base Single Married Divorced Widowed
Base 200 | 44 | 122 | 33 | 1
Male 44 | 6 | 27 | 10 | 1
Female 156 | 38 | 95 | 23 | 0
If u does not meet your needs, you can enter your own underlining on a g statement with
hyphens in the appropriate places. This is often useful in complex breakdowns using
categories from more than one axis. For example:
l demog
n10Base
col 112;Under 30;30-50;Over 50
col 111;Male;Female
g Age Sex
g Base Under 30 30-50 Over 50 Male Female
g----- ---------------------------- ---------------
If you will be converting your Quantum tables into a comma-delimited ASCII file using
the program q2cda, you may want each column in the column heading to be treated as a
separate text (the default is to treat each line as a single text). If so, you must underline
the column texts, and each set of underlining must extend the full width of the column.
Do not, however, use u for this type of underlining. Use = or - characters on a g statement
instead since it is these characters, rather than the underlining character, that q2cda uses
for finding the start and end of each column.
When it is ready to print a table, Quantum checks whether it needs to split the table and,
if so, where. To do this it adds up the column widths and compares the amount of space
needed with the amount of space available. Quantum adds each column width separately
and checks the amount of space needed after each addition. As soon as the amount of
space required exceeds the amount of space available, Quantum splits the table just
before the current column. For example, if the sum of columns A, B, C and D exceeds
the amount of space available, Quantum will split the table between columns C and D.
After a split, Quantum resets its counter to zero and starts counting again in case it is
necessary to split the heading more than once.
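Suppose, for instance, that 108 print positions remain for column headings after the row
texts, and that each column needs 12 characters. The first nine columns use exactly
9 x 12 = 108 positions; adding the tenth would bring the total to 120, which exceeds the
space available, so Quantum splits the table just before the tenth column. (The figures
here are purely illustrative.)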
If you convert your tables for printing on a PostScript printer, the size of the font you use
may make it possible to fit more columns on the page than in the standard format.
Information on fonts is not available to Quantum at the time it formats the tables so it
always splits tables as if they were to be printed in the standard format.
☞ For information on how Quantum paginates split tables, see section 18.4.
Tables are created from axes, each cell being generated by the intersection of two
conditions, one from the row axis and one from the column axis.
In this chapter we will discuss the statements needed to produce various types of table,
and the methods of combining tables by addition and division.
Quick Reference
To create a table, type:
tab axis_names[;options]
In order to create a table, Quantum needs to know which is the column axis and which is
the row axis. If the table has more than two dimensions you will need to say which axes
should be used for the extra dimensions.
Tab statements must precede the axes definitions in your program file.
Multidimensional tables
Multidimensional tables are ones created from more than two axes. They occur when a
series of tables has the same rows and columns, but each table in the group has additional
characteristics which are themselves the conditions of other axes. This sounds
complicated, so let’s take an example.
We have been asked to produce a separate table of age by sex for each region of the
country. Whereas before each cell had two conditions (age and sex) it now has three
(region, age and sex).
There are two ways of writing this specification. You may either:
a) write as many tab statements as there are regions, and filter each table of age by sex
to include only those respondents resident in a given region, or
b) name a region axis as a third dimension on a single tab statement so that the whole
set of tables is produced by one statement.
Both methods produce the same results – the main advantage of (b) over (a) is that (b)
involves you in a lot less work. For method (b) the tab statement names region as the
third axis:
tab region age sex
and the region axis is defined like this:
l region
col 135;Base;hd=Area of Residence;North;South;East;West
When these tables are printed, the region names from the col statement will be printed at
the top of the table, as you can see from the sample table.
Absolutes/col percents
Area of Residence - North
Q2. Age
Base: All Respondents

                 Base     Male   Female
Base              153       76       77
11-20 yrs          39       17       22
                  25%      22%      29%
21-34 yrs          64       36       28
                  42%      48%      36%
35-54 yrs          38       19       19
                  25%      25%      25%
55+ yrs            12        4        8
                   8%       5%      10%
Tables may have up to six dimensions: that is, up to six axes may be named on a single
tab statement.
It is difficult to visualize a four, five or six-dimensional table. The easiest way is just to
think of the higher dimensions as extra conditions, and the multidimensional tab
statement as a simpler way of writing them. Again, let’s work through an example. This
time we need a four-dimensional table using the axes mstat, region, age and sex.
Age and sex are the rows and columns respectively, region is the third dimension and
mstat (marital status) is the fourth. Mstat is set up as follows:
l mstat
col 134;Base;Single;Married;Divorced;Widowed
The first table of age by sex is for all respondents, since the conditions defined by the
higher dimensions are both ‘All respondents’ (i.e., Bases). The seventh table is a
cross-tabulation of age by sex for all respondents who are single and live in the North. In
all, this tab produces 25 tables because there are five rows in the third and fourth
dimension axes.
If the tab statement has options on it, they will apply to all the tables in the set. All tables
will have different page numbers, but they will all have the same table number.
All options that are valid on the a statement, apart from dp and netsort, are also valid on
a tab statement. Since tab is the fourth level down in the hierarchy of the tabulation
section, any option occurring at a higher level will be overridden by that same option at
the tab level. Thus, dsp set globally may be overridden by nodsp for a specific table;
dec=2 set globally may be reset to dec=1 for the current table only.
The exception to this is the option c= which is additive at lower levels. If a group of tables
has the condition c=c12’1’ and one of that group of tables also has the condition
c=c15’6’, that table will only include people for whom both conditions are True (i.e.,
c=c12’1’.and.c15’6’).
☞ Options which are valid on the a, sectbeg, flt and tab statements are listed in the
section entitled "Options on a, sectbeg, flt and tab statements" in chapter 16.
celllev=name This is used with analysis levels to define the level at which cell
counts are to be incremented for the table.
✎ For further information on analysis levels, see the section entitled "Table update
level" in chapter 28.
id=name This assigns an identifier of up to six characters to the table for use
in table manipulation or T-statistics.
☞ For further details on table manipulation, see the section entitled "Referring to tables
in the current run" in chapter 27.
Quick Reference
To cross-tabulate two data variables, type:
tab cm cn
where m and n are column numbers.
When we were discussing the edit section, we said that the holecount was a useful means
of gaining an overall view of the data. A similar facility exists in the tabulation section
for seeing how respondents are distributed across the cells of the table.
tab cm cn
where m and n are column numbers. This produces a table in which the rows are the
twelve positions (1/&) in cm and the columns are the twelve positions (1/&) in cn. In a
table of c15 by c34 the cell for position 4 by position 3 would tell us the number of
respondents who had c15’4’ and c34’3’.
Quick Reference
To request a series of tables with a common axis, type a tab statement for the first table
and follow it with:
and[n] axis_names
where n is blank or 1 for row axes, 2 for column axes and 3 to 6 for higher dimensions.
and is a very useful statement if you have to create a series of tables using
the same row, column or higher dimension axis. For instance, to set up a series of tables
all of which have the same row axis, we could write:
Each and statement may contain any number of axes as long as they are separated by
spaces and the total length of the statement does not exceed 200 characters.
Any text associated with the table specified on the tab statement also applies to the tables
created using the axes on the and line. By default, each table in the series will have a
different table number and page number. You may, if you prefer, use the option nand on
a tb statement to cause all tables created by ands to have the same table number as the
parent table.
There are also and statements for defining additional rows or higher dimension axes.
They are and2 to and6, where the number following and is the dimension number of the
axis. Incidentally, and for rows may also be written and1. As you can see, the dimension
number is found simply by counting the number of axes in the table, starting with the
column (right-most) axis.
Any number of ands may follow a tab statement as long as they are all of the same type;
that is, all and or all and2: a mixture is not acceptable.
The sid and und statements described below may not stand alone: they must always follow a tab statement defining
the parent table. For example, with und, the tab statement would define the table to be
printed at the top of the page while und would define the table to be printed underneath
that table.
All tables combined with these statements must generate the same number of cells in
order for them to be matched correctly. Where tables are of differing sizes the smaller
tables should be padded with dummy elements as shown below.
Quick Reference
To place tables side by side, type a tab statement for the first table and follow it with:
sid axis_names[;options]
Options are any of anlev=, c=, celllev=, inc=, maxim, means, median, minim and wm=.
To place tables one underneath the other, type a tab statement for the first table and
follow it with:
und axis_names[;options]
Options are any of anlev=, c=, inc=, maxim, means, median, minim and wm=.
As we mentioned in section 21.3, there will be times when you have several tables all
using the same row or column axis. We also said that such tables could be most
efficiently specified using and statements. However, this prints each table on a separate
page, making it slightly difficult to compare figures in different tables.
The Quantum statements sid and und provide a means of printing two or more tables on the
same page, either side by side or one underneath the other, making any comparisons
much simpler. There are, however, some provisos.
If sid is used:
a) The overall width of the tables must not exceed the designated table width. The
default is 132 characters – this may be changed with pagwid= on the a statement.
e) If more than one table contains statistics such as means (n12) or standard deviations
(n17) you must create the statistics using n25 statements in the axes rather than fac=
options on the elements. If you use factors, Quantum takes the factors as they are
defined in the earliest table in the group and applies them to all subsequent tables,
overriding the factors defined for those tables.
With und:
c) The total number of elements in all axes in the table must not exceed 500.
The formats are:
sid axis1 axis2[;options]
und axis1 axis2[;options]
where axis1 and axis2 are the axes forming the table, and options are any of the keywords
listed in the Quick Reference above.
Output options such as the number of decimal places for absolutes, or the amount of
spacing between lines in the output are taken from the tab, flt, sectbeg or a statement in
that order.
✎ A table created using tab with sid or und has become one table by the time output
options are applied at the table formatting (qout) stage of the run. This means that
any output options are applied to all elements in both the tab table and the sid or und
tables together. Quantum does not apply output options to the tab and sid/und tables
independently and then merge the tables.
So, for example, if you place sort on the tab statement of a tab/und pair, Quantum
places the two tables one underneath the other and from then onwards treats the
figures as a single table. At the output stage the figures are sorted with no distinction
between the tab section and the und section.
☞ For a full list of output options, see the section entitled "Options on a, sectbeg, flt
and tab statements" in chapter 16.
Any table headings for the second table must be entered as part of the text following the
tab statement. If you are printing tables side by side and you want to define the layout of
the column headings yourself, you must place the g/p statements in the axis which defines
the columns and codes for each element, not all in the one named on the tab statement.
For example:
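/* a sketch only: axis names and column locations are illustrative
/* each column axis carries its own g and p statements
tab ax01 bk01
sid ax01 bk02
l bk01
col 110;Base;North;South
gBase      North      South
p  x          x          x
l bk02
col 111;Base;Male;Female
gBase      Male     Female
p  x         x          x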
If percentage figures are requested or the table is to be sorted (ranked), these tasks are
carried out once the tables have been arranged. With sid, if both column axes have base
columns, row percentages for the tab and sid tables will be created using the appropriate
base column. If the tab column axis has a base, but the sid one does not, row percentages
for the whole table will be generated using the base from the tab column axis. If there is
no base in the tab column axis, but one exists in the sid column axis, row percentages will
be created for the sid table but not for the tab table.
Up to 40 sids and any number of unds may follow the tab as long as the overall limits for
table sizes are not exceeded.
Adding tables
Quick Reference
To add tables, type a tab statement for the first table and follow it with:
add[col_offset[,row_offset] ] axis_names
where axis_names is the same number of axis names as appears on the tab statement.
Another way of combining tables is to add them together. Quantum can add tables by
placing them one on top of the other and adding the corresponding cells in each table, or
by offsetting them so that, for example, the cells in row 1 of the first table are added to
those in row 3 of the second.
The easiest way of adding two tables is to add all cells in the first row of the first table to
the corresponding cells in the first row of the second table, and so on. All that’s needed
is a tab statement to create the first table followed by an add statement generating the
second, for example:
tab ax01 bk01
add ax02 bk02
Here we are creating the table ax02 by bk02 and adding it to the table ax01 by bk01. A
practical example might be an office equipment survey in which a set of 3-column fields
store the number of different makes of typewriter that each company owns. If the first ten
fields refer to manual typewriters and the next ten refer to electric typewriters, we may
want a table showing the total number of typewriters of each brand owned. If we write:
the row for Brand A will tell us how many Brand A manual and electric typewriters each
company owns. Notice that we are using inc= to count the number of typewriters rather
than using c= to note the presence or absence of typewriters.
Tables to be added may be offset by any number of rows or columns; that is, the table
defined by add may be shifted a number of columns to the right or down a given number
of rows before the cells are added. For instance, you may want to add cell 2 of row 3 in
the first table to cell 1 of row 1 in the second table. This is an offset of 2 columns and 3
rows.
To offset a table by a given number of columns, the add statement must be entered as:
addn axis_names
where n is the number of columns towards the right that the table is to be shifted before
it is added to the tab table. Note that there is no space between the keyword add and the
offset. So,
shifts the second table three columns to the right before adding it to the first table:
column 3, row 1 of ‘ax01 by bk01’ will be added to column 1, row 1 of ‘ax02 by bk02’,
and so on.
To offset a table by a given number of rows, the statement is entered as:
add0,m axis_names
where m is the number of rows to offset. When doing this, you must be sure to enter a
zero for the column offset otherwise Quantum will combine the tables horizontally
instead. A single number after add is assumed to be a column offset.
To define row and column offsets for the same table, we write:
addn,m axis_names
where n is the column offset and m is the row offset. If the row or column axis contains
an n25 statement, remember that it creates three elements even though none of them are
printed in the table.
A tab statement may be followed by any number of adds with the same or varying offsets.
✎ Offsets are always based on the tab table rather than on the intermediate adds
themselves.
For example:
tab ax01 bk01
add2 ax02 bk02
add1,2 ax03 bk03
The first add statement
takes the table ax02 by bk02, offsets it by 2 columns and adds it to the tab table. Then the
table ax03 by bk03 is offset by 2 rows and 1 column and added to the tab table. To clarify
this further let’s look at some numbers - the tables shown below are the individual tables
before adding takes place:
ax01 by bk01
      c1    c2    c3    c4    c5
r1    12     9     3     0     0
r2     6     4     2     0     0
r3     6     5     1     0     0
r4     0     0     0     0     0
r5     0     0     0     0     0
By adding the first add table (ax02 by bk02) to the tab, we have:
      c1    c2    c3    c4    c5
r1    12     9    10     2     5
r2     6     4     5     2     1
r3     6     5     5     0     4
r4     0     0     0     0     0
r5     0     0     0     0     0
This table is only temporary and it is never printed, but as you can see, the numbers in
column 3 are the sum of the numbers in column 3 of table 1 and column 1 in table 2.
The next step is to add the table created by the second add to this temporary table.
Remember that the offsetting is based on the original table rather than the previous add,
so, ignoring the row offset for the moment, we will be adding column 1 of table 3 to
column 2 of table 1, which is also column 2 of the temporary table. The row offset is 2,
so row 1 of the third table is added to row 3 of the first table. The final printed table is as
follows:
      c1    c2    c3    c4    c5
r1    12     9    10     2     5
r2     6     4     5     2     1
r3     6    10     9     1     4
r4     0     4     3     1     0
r5     0     1     1     0     0
The number 9 in r3c3 is the sum of table 1 r3c3, table 2 r3c1 and table 3 r1c2; that is,
1+4+4=9.
Because of the way Quantum creates tables, you must make sure that the axes on the tab
statement have as many rows or columns as there will be in the final table (see output of
ax01 by bk01 above). To do this it may be necessary to pad these axes with dummy n01
statements. If the dummy elements are omitted, any extra columns or rows in the other
axes will be ignored when the tables are added.
All row and column headings are taken from the axes on the tab statement, so the dummy
elements should also define any texts that are to be used for the additional rows and
columns. If no text is given, the rows and columns will be printed without headings.
In our example, the original table has three rows and columns compared to five of each
in the final table, therefore we must pad out the axes ax01 and bk01 so that they too have
five each. For example:
l ax01
col 27;r1;r2;r3
n01r4;dummy
n01r5;dummy
If the axis is later used in another table, the dummy rows are ignored.
If you fail to allow sufficient rows and columns in added tables with offsets, you may
well find that counts for some cells are larger than you would expect because counts for
the ‘extra’ cells have been added into the next valid cell in the table.
Any of the options listed as valid on sid and und statements may also be used
on an add.
Quick Reference
To divide one table by another, define the top table on a tab statement followed by:
div axis_names[;options]
where axis_names is a list of as many axis names as there are on the tab statement, and
options is any of the keywords anlev=, c=, inc=, maxim, means, median, minim or wm=.
When tables are divided, the table on the tab statement is always the numerator (table to
be divided) and the table on the div statement is the denominator (table by which to
divide). Both tables must have the same number of rows and columns since the final table
is achieved by placing tables on top of one another and dividing the corresponding cells.
If the division for a cell has a remainder, the result is rounded up to the next highest whole
number. Where a cell in the second table is zero, the corresponding value in the first table
will remain unchanged (e.g., 4/0 prints a value of 4).
Only one div statement may follow a tab – any more will be ignored.
Output options such as the number of decimal places for absolutes, or the amount of
spacing between lines in the output are taken from the tab, flt, sectbeg or a statement in
that order.
☞ For a full list of output options, see the section entitled "Options on a, sectbeg, flt
and tab statements" in chapter 16.
Statements of the following form (the axis name and the data field holding the number of
loaves bought are illustrative):
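/* illustrative: loaves is the row axis; columns 130-132 hold the number of loaves bought
tab loaves region;inc=c(130,132)
div loaves region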
produce a table showing the number of loaves bought per person. The tab table creates a
table containing the total number of loaves bought whereas the div table contains the
number of people buying bread. The axes are the same; the difference is inc= on the tab
line. By dividing the first by the second we arrive at the average number of loaves bought
per person.
Quick Reference
To print more than one table per page use:
hitch=number
to print the first page of the current table on the same page as the previous table, or:
squeeze=number
to print as many pages of the current table as there is room for on the same page.
In both cases, number determines how Quantum deals with table titles, footnotes and
bottom texts. Acceptable values for hitch= are 0 to 4 and for squeeze= are 0 to 2.
There may be times when you will want to print more than one table on a page. The und
statement described earlier in this chapter will do this, but it is only useful if the two
tables have the same column axis, and you want Quantum to treat the two tables as a
single table.
When this is not what you want, you should find that the keywords hitch= and squeeze=
offer a solution. Examples of when you will find hitch or squeeze useful are:
• When you want to suppress page breaks between tables simply to save paper on test
or checking runs.
• When you have a short but very wide table and you want to print the continuation
page of the table on the same page as the first part.
• When you have a number of short tables that would look satisfactory printed on the
same page.
• When you have tables with T-statistics on overlapping data that you want to print one
underneath the other, and und does not produce the correct results.
hitch= and squeeze= are both valid on the tab statement. Exactly how Quantum prints
your tables depends on which values you assign to these keywords. The table shown
below summarizes these values and specific examples are given later:
Option Action
hitch=0 Starts this table on a new sheet of paper (the default).
hitch=1 Prints the first page of the current table immediately below the previous
table, if there is room for the whole page.
hitch=2 As hitch=1 but suppresses bot and foot texts and titles that are identical
to those on the previous table, and inserts the current table between the
end of the previous table and its foot/bot texts, if any.
hitch=3 Prints the column headings and rows of the current table before the
footnotes or bottom texts of the previous table. If the value of squeeze=
is 0 this action continues for all pages of this table.
hitch=4 As hitch=3 but prints only the rows of the current table before the
footnotes or bottom texts of the previous table. If the value of squeeze=
is 0 this action continues for all pages of this table.
squeeze=0 Prints each page on a separate page (the default).
squeeze=1 Prints the entire page below the previous page if there is room for it.
squeeze=2 As squeeze=1 but suppresses bot and foot texts and titles that are
identical to those on the previous table, and inserts the current table
between the end of the previous table and its foot/bot texts, if any.
You may use hitch= and squeeze= singly or together. The sections which follow explain
how various combinations produce different types of layout. When you use hitch= or
squeeze= Quantum calculates the amount of space remaining in the page by subtracting
the number of output lines used on the page so far from the maximum number of output
lines allowed on the page. The page length is the value of paglen on the a statement for
which the default is 60.
You use hitch=1 when you want Quantum to print the current table, or the first page of
the current table if it is a multipage table, on the same page as the previous table. If there
is not enough room for this to happen Quantum ignores the request. For example:
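/* axis names are illustrative
tab age region
tab prefer region;hitch=1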
defines two tables that will be printed on the same page if there is room. If the second
table is longer than one page, Quantum will print the first page on the same page as the
first table and will print the second page on a new sheet as it normally does.
You’ll find hitch=1 by itself useful if you have a set of short tables and you want to
suppress the page break between them.
You use squeeze=1 with tables that spread over more than one page, to tell Quantum to
fit as many pages of the table as possible on one printed page. You might use this if you
have a short table with too many columns to fit across the page. When Quantum finds a
table that is too wide to print on one page, it prints as many columns as it can on the first
page and then starts a new page and prints the remaining columns.
☞ For a complete explanation of how Quantum splits and paginates very wide tables,
see section 18.7.
By specifying squeeze=1 you can force Quantum to print the two parts of the table on the
same page if there is room. For example:
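/* illustrative: widebrands is a column axis with too many columns to fit across one page
tab region widebrands;squeeze=1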
When you use squeeze=1 by itself, Quantum squeezes tables onto one page simply by
suppressing the page break that it would normally insert between pages. Titles that would
normally be printed at the top or bottom of each page are still printed for each table page,
even if it means that the same text appears twice or more on the same printed page. You
can suppress some or all these titles by specifying a different value for squeeze as
described below in "Controlling titles, footnotes and bottom text".
Paper saving
The way to save most paper is to use hitch=1 and squeeze=1 on the same tab statement,
as shown below:
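/* axis names are illustrative; both tables may run to more than one page
tab brands region;squeeze=1
tab rating region;hitch=1;squeeze=1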
This example defines two tables, both of which may spread over more than one page.
squeeze=1 on the first line tells Quantum to fit as many pages as possible of that table on
each printed page. hitch=1 and squeeze=1 on the second line tell Quantum to print the
first page of this table on the same page as the last page of the previous table, and to
continue printing as many pages as possible on each physical page until it reaches the end
of the table.
This combination is handy if you just want a draft set of tables to check before printing
the final copy, or if you are using continuous stationery where page breaks may not be
important.
As the examples so far have explained, setting hitch or squeeze to 1 suppresses page
breaks but does not do anything with the table texts themselves. You will often find that
you can improve the appearance of the printed page by suppressing some or all of the
titles for the second and subsequent tables. As an illustration of what can be achieved,
consider the following table that was created by the specification:
a;op=1
ttcWASHING POWDER SURVEY
bot
ttlPrepared for XYZ Marketing
tab prefer ban1;squeeze=1
ttlBrand preference
foot
ttl*Biological powders only
There are a number of ways you can improve this table simply by controlling the table
titles, footnotes and bottom text. The way to do this is to use squeeze=2.
If you create these tables with squeeze=2, Quantum still prints the first page of the second
table on the same page as the previous table but it also:
• Suppresses title lines that are identical to the previous page until it finds a
non-identical title line. Trailing spaces are ignored in comparisons.
✎ Quantum considers not only tt texts as titles but also table numbers, page numbers
and output type descriptions. If you want to suppress all titles at the top of the page
you must suppress page and table numbers on that table. If the output types are
different you may need to suppress them too.
• Prints the second page between the foot text and the bot text of the first page.
• Removes the foot text from the first page if it is identical to the foot text of subsequent
pages. Trailing spaces are ignored in comparisons.
The effect of joining the two tables in the previous example using squeeze=2 is as follows
(page numbers were suppressed by placing nopage on the a statement):
Now that you have seen what can be achieved, you may be interested to know more about
how Quantum decides whether to suppress titles, in case you have tables where you
expected Quantum to suppress text and it did not.
When it compares titles on tables, Quantum generally looks at blocks of titles rather than
individual lines. When comparing the footnotes of the two tables, Quantum compares all
the tt texts for the first table with all the tts for the second and suppresses the footnote for
the second table only if the two blocks are identical. If the two footnotes have the same
first line but different second lines, Quantum sees the blocks as different and prints each
footnote in full below the table to which it belongs.
When Quantum compares titles at the top of the table it again looks at the titles as a block,
only this time it suppresses titles on the second table as long as they are identical to the
corresponding title lines for the first table. In the tables shown here, the job title, the
output type description and the first title line are identical for both tables so Quantum
prints them at the top of the first table only.
Once Quantum finds a title line that differs, it stops comparing lines. This is one reason
why you are advised to suppress page and table numbers. They are printed above table
titles and can prevent Quantum from suppressing otherwise identical table titles.
Another reason for suppressing page numbers is that Quantum does not recalculate page
numbers to take into account changes made by hitch and squeeze. If you have three tables
which would normally be printed on pages 1, 2 and 3, and you use hitch=1 to print tables
1 and 2 on the same page, the page number printed on table 3 will still show page 3 even
though it is now physical page 2.
If you have two one-page tables with the same column headings you can paste just the
rows of the second table under the rows of the first table so that the printed table looks
like a single table (similar to the types of tables you can create with und but without the
restrictions). Write your table specs as usual and place the keyword hitch=4 on the tab
statement of the second table. For example:
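/* likes and dislike are the row axes; ban1 is an illustrative column axis
tab likes ban1
tab dislike ban1;hitch=4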
The table created by this spec would look like a single table. The table titles and column
headings would come from the first table. These would be followed by the elements from
likes and the elements from dislike just as if they had been specified as a single axis. If
you have been having trouble using und with tables with T-statistics on overlapping data
you should find that using hitch=4 solves your problems.
Quick Reference
To suppress percentages or statistics in a cell if the base is less than a value of your
choice, place the option:
smbase=number
on the a, sectbeg, flt or tab statement.
The smsup group of options suppresses all the figures in a cell if the specified values
(absolutes or percentages) are below a given value. The smbase= option allows you to
suppress just percentages in a cell without also suppressing the absolutes, if the base for
the percentage is less than a figure of your choice. You may use it on the a, sectbeg, flt
or tab statement and its setting at a higher level will carry through to all lower levels,
unless specifically overwritten at a lower level, as is the case with most a-statement
options.
When suppressing percentages with small bases, the word percentage means any of the
following:
Tables that are specified without any of these output types, either explicitly with op= on
the tab statement or implicitly from an op= carried over from a higher level, are not
subject to small percentage suppression even if the run requests it. (This allows you to
specify smbase on the a statement but prevents Quantum from manipulating the output
options for tables, such as those with absolutes only, where small percentage suppression
is inappropriate.)
The statistics that may be suppressed in this way are:
n12 mean
n17 standard deviation
n19 standard error
n20 sample (error) variance
All other statistics are always left untouched regardless of the size of the base.
Requesting suppression
To request suppression of column percentages with small bases, type:
smbase = number;smcol
where number is an integer or real number greater than zero. Specifying a suppression
value of 0 or 0.0 effectively switches small percentage suppression off unless you have
tables with bases less than zero.
Column suppression is the default, so if you forget to specify the type of percentages you
wish to suppress this is what Quantum will do. It applies to all types of percentages
except row percentages. To request suppression of row percentages with small bases,
type:
smbase = number;smrow
You can, of course, use smcol and smrow together to suppress all percentages from small
bases.
The notes that follow explain how Quantum applies small percentage suppression. They
use column percentages as an example, but the same rules apply to row percentages. Just
substitute the word row wherever you see references to columns, and vice versa.
When Quantum creates a table where small percentage suppression is possible, it looks
at each cell separately and checks which output types have been requested for that cell.
If the cell belongs to a row that is not a statistic Quantum then compares the most recent
base for the current column with the smbase value. If the column base is less than the
smbase value Quantum switches off all percentage output types for the current cell. If
the cell is part of a statistical row, Quantum compares the total number of respondents
contributing to the statistic (the sum-of-n) with the smbase value and switches off all
percentage output types if it is less than the smbase value.
Next, Quantum checks whether the current cell is a base cell and, if so, switches absolutes
(op=1) back on. If the current cell is a mean Quantum switches absolutes off.
If these tests result in no output types being set for the cell, Quantum prints a blank in the
table when it reaches that cell. Otherwise it prints the remaining output types specified
for the cell (absolutes or ranks, for example).
The final check that Quantum makes is on the base or sum-of-n itself. If the base for a
column is less than the smbase value, Quantum prints two asterisks to the right of that
base value. If the sum-of-n for a statistic is less than the smbase value, Quantum prints
two asterisks to the right of the sum-of-n (or the blank if absolutes were switched off). If
there is insufficient room to print the asterisks at the side of the column, Quantum prints
them in the column instead.
If there are any rows before the first base row these tests are not applied since the
information required by the tests is not available.
Quantum prints a footnote on tables with small percentage suppression, explaining the
meaning of the double asterisks; this footnote is shown in the examples below. If the table
also contains T-statistics, the T-statistic footnote about very small bases may be printed
instead, as the last example in this section shows.
Examples
The examples in this section give you an idea of how smbase works in different
situations. The first specification is:
a;smbase=30;flush
tab prefer region;op=1
tab prefer region;op=2;nopc
l region
col 115;Base;North;South;East;West
l prefer
col 120;Base;Brand A;Brand B;Brand C;Brand D;Brand E
This creates two similar tables, one showing absolutes only and the other showing
column percentages only. Percentages in the second table are suppressed if the base is
less than 30.
Absolutes
             Base   North   South    East    West
Base          170      27      54      40      49
Brand A        38       6      11       8      13
Brand B        57       8      24      13      12
Brand C        29       5       8       5      11
Brand D        34       7       9       8      10
Brand E        11       1       2       5       3
Col percents
             Base   North   South    East    West
Base          170   27 **      54      40      49
Brand A     22.4%           20.4%   20.0%   26.5%
Brand B     33.5%           44.4%   32.5%   24.5%
Brand C     17.1%           14.8%   12.5%   22.4%
Brand D     20.0%           16.7%   20.0%   20.4%
Brand E      6.5%            3.7%   12.5%    6.1%
** very small base (under 30): percents suppressed
The first table is flagged with op=1 only so the specification of smbase=30 on the a
statement is ignored for this table. The second table has the same rows and columns but
shows column percentages only. When Quantum processes the specification for this table
it takes each cell in turn and compares the column base for that cell with 30. If the base
is less than 30 Quantum suppresses all percentages for that cell. This is how the column
percentages for North were suppressed.
The asterisks next to the base of 27 indicate that it is a small base; the footnote explains
this. Notice that it reports the value specified with smbase=.
The next specification has a mean element and a No answer element that does not
contribute to the base:
a;smbase=40;flush;nopc
tab rating grid;op=2
l rating
n10Base
n01Brand A;col(a)=131
n01Brand B;col(a)=132
n01Brand C;col(a)=133
n01Brand D;col(a)=134
side
col a00;Base;Excellent;%fac=5-1;Very Good;Good;Satisfactory;Poor
n01No Answer;c=ca00n’1/5’;nofac
n12Mean;dec=2
Col percents
              Base  Brand A  Brand B  Brand C  Brand D
Base           181    34 **       58       40       49
Excellent     21.0              19.0     20.0     26.5
Very Good     31.5              41.4     32.5     24.5
Good          16.0              13.8     12.5     22.4
Satisfactory  18.8              15.5     20.0     20.4
Poor          12.2              10.3     12.5      6.1
No Answer      0.6               0.0      2.5      0.0
Mean          3.31       **     3.43       **     3.45
** very small base (under 40): percents suppressed
The main point to notice in this table is the difference between the columns for Brands
A and C. The mean is suppressed in both columns and the cells are flagged with two
asterisks indicating that suppression is because the numbers of respondents contributing
to the means (in statistical terms, the sums-of-n) are less than 40.
Looking at the base for Brand C you might not expect any percentages to be suppressed
because the base is not less than the smbase value, but since the percentage for No
Answer in this column is greater than zero you may assume that there is at least one person
excluded from the mean. The sum-of-n is therefore less than the required value so the
mean is suppressed.
The column for Brand A has all the rating percentages suppressed because the base is less
than 40. The mean is suppressed because the number of people contributing to it is less
than 40.
a;smbase=40;flush
tab brand region;op=12
l brand
col 112;Base;Brand A;Brand B;Brand C;Brand D
l region
tstat prop;elms=ABCD;clevel=90
n10Base
n01North;c=c111’1’;id=A
n01South;c=c111’2’;id=B
n01East;c=c111’3’;id=C
n01West;c=c111’4’;id=D
Absolutes/col percents
The point to notice in this table is that the T-statistic footnote about very small bases has
overridden the one you would normally see to do with small bases for percentages.
Here is the same table run with the small base for T-statistics set to 25. Notice the
different footnote and the fact that column D is now included in the tests even though the
percentages are suppressed:
Absolutes/col percents
In this chapter we will discuss how to generate table titles and how to print page and table
numbers for each table.
Quick Reference
To define a table title, type:
ttx title_text
where x defines the position of the title and is l, r or c for left, right or centered
justification, a number between 1 and 9 to indent the title by 10 times x spaces, g to line
the text up with the start of the column headings, and a or b to switch justification from
left to right on alternate pages. tta prints on the left on the first, third, etc., pages, while
ttb prints on the left on the second, fourth, etc., pages.
Titles of any sort are created using tt statements at the appropriate place in the program.
Their format is:
ttx[Text]
ttg line up text with the first column allocated to the column headings. If you are
using the standard row text width of 24 characters, ttg will print the first
character of each table title in position 25 of each line. For example:
a;side=20;pagwid=57
tab age sex;op=12;decp=0;flush
ttgBase: All Respondents
l age
ttgQ2: Age
val c(110,111);Base;11-20 yrs;21-34 yrs;35-54 yrs;55+ yrs
l sex
col 109;Base;Male;Female
Page 1
Absolutes/total percents
Q2. Age
This produces a printed table with the title ‘Q7: Type of holiday taken’,
which appears in the table of contents as ‘Type of holiday taken (standard
demographic breakdown)’.
ttis do not appear in the printed tables, nor do they appear in the table of
contents unless you add the keyword index=1 to the a statement in your
tc.def file.
You may type up to 200 characters on each tt statement. tt statements may not be
continued, but you may group any number of them together to form blocks of text. Text
may be in upper or lower case, or both, and should be entered exactly as it is to be printed.
Titles can be made to refer to different levels in the tabulation hierarchy. tts following the
a statement define the run title which is printed at the top of each page. Text after a flt
statement refers to a group of tables (e.g., a sub-report), whereas tts following a tab
statement are relevant to that table only. The tt statement may also appear in an axis
immediately after the l statement in which case the text will be printed whenever that axis
is used for rows in a table. Titles in column axes are ignored.
a;side=20;spechar=-*;dsp;op=12
ttcProduct Awareness Test
tab age sex
ttlBase: All Respondents
tab brand area
ttlQ.5: Which brand did you try first?
ttlBase: All Respondents
When there are titles from more than one level to be printed at the top of the table, you
may decide what the printing order should be. The default is:
This may be altered globally or for a group of tables or for an individual table using one of
the options ttord=, ttbeg= or ttend= on the a, flt or tab statement.
☞ These options are described in the section entitled "Output options" in chapter 16.
Quick Reference
To define a title that will be printed only on tables with special T-statistics, type:
ttx title_text;tstat
You may wish to specify titles that apply only to tables that contain T-statistics. To do
this define the title using a tt statement of your choice and append the keyword tstat to
it. For example:
a;dsp;clevel=90
ttcSouth East Local Electors’ Survey
ttlConfidence level: 90%;tstat
tab age sex
tab prefer ban1
stat ntd;elms=ABCD
Here, the job title defined with ttc will be printed at the top of both tables, whereas the
title to do with confidence levels will be printed on the second table only.
tstat is valid on tt statements at tab level and above. If an a, sectbeg, flt or tab statement
is followed by more than one tt statement, only one of those statements may contain the
tstat keyword.
Quick Reference
To underline a title, type:
ttx title_text;unlnumber
where number is 1 to underline the complete text, 2 to underline everything except blank
strings, or 3 to underline non-blank characters only.
Titles may be underlined by appending the option unl to the end of the tt lines to be
underlined. Unl must be separated from the rest of the text by a semicolon to stop it being
printed as part of the title, and should be followed by a number indicating the type of
underlining required.
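For instance, a statement such as the following (the title text is illustrative):
ttcRegional Analysis;unl3
prints the title with each word underlined but leaves the space between the words without
underlining, whereas unl1 would underline the whole title, including the space.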
Quick Reference
To introduce titles to be printed after the last line of the table, type:
foot
Some tables require footnotes. These are defined using a foot statement followed by tt
statements with the footnote required, for example:
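/* the footnote text is illustrative
foot
ttl*Biological powders only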
Footnotes are printed on the line immediately after the last line of figures in the table. To
have the footnote separated from the table, insert a few blank tts before the tts containing
the text of the footnote, as shown below:
foot
tt2
tt2
tt2This footnote is preceded by two blank lines
Foot and its tt statements may follow the a and flt statements to define a footnote globally
or for a group of tables. To set up a footnote for one table only, put it after the tab
statement for that table. Each level may define a footnote of up to 30 tt statements. If any
line is underlined, it counts as two lines.
Foot may also appear in an axis to introduce a footnote which is required whenever that
axis is used. It must appear after the l statement, and after any tstat elements in the axis.
An axis may contain a maximum of eleven tt statements altogether.
When Quantum reads an l statement, it takes all titles to be titles for the top of the page
until it reads a foot statement. Thereafter, titles are assumed to belong at the foot of the
table until another foot is read.
Once a footnote has been set up for a specific level, it remains operative until replaced by
another footnote at the same level. Therefore to replace a footnote for a group of tables
we would enter another flt statement followed by a foot and the tt statements with the new
footer.
When bottom or footer texts are defined at different stages of the run, the order of printing
is determined by the ttord option, and unless you specify otherwise, axis-level texts will
be printed before tab-level texts. To turn off a footer, just enter a foot statement followed
by a blank tt statement, thus:
flt
foot
ttlFootnote for a group of tables
tab ax1 bk1
/*The next table has no footnote
tab ax4 bk1
foot
ttl
/*Footnote still applies to this table
tab ax5 bk2
Quick Reference
To introduce titles to be printed at the bottom of the page, type:
bot
Printing text at the bottom of a page is virtually the same as printing it at the foot of a
table, the only difference being that the foot statement becomes a bot instead. For
instance:
bot
tt1Print this text at the bottom of the page
tt1and indent it by 10 spaces
Tables may have both foot and bot texts if you wish.
Quick Reference
To print table numbers, type:
tbx n
where x determines the position of the table number on the line and is one of l for left
justified, r for right justified, c for centered, or a or b for alternate justification. tba prints
on the left on the first, third, and so on, pages, while tbb prints on the left on the second,
fourth, and so on, pages.
nand forces tables created by and statements to have the same table number as the parent
tab statement.
Alternatively, type:
<<tab>>
on a tt statement.
Table numbers are not printed unless you request them either with a tb statement, or with
the notation <<tab>> on a tt statement. The format of the tb statement is:
tbl n
tbr n
tbc n
to have table numbers printed at the top left of the table, at the top right of the table, or
centrally within the page width. In all three cases, n is the number the next table is to have.
If tables are to be numbered consecutively from the given table number, only one tb
statement is required. However, if some tables have non-consecutive numbers, the tab
statements creating these tables must be preceded by a tb with the appropriate number:
tbl 5
tab brand region
tbl 9
tab tried region
You may need to do this when you are rerunning a selection of tables from a previous job.
Table numbers may also be defined on tba and tbb statements so that their justification
alternates between left and right on alternate pages. A table number defined on a tba
statement is printed left justified on the first, third, fifth page, and so on, and right
justified on the second, fourth, sixth page, etc. A table number defined on a tbb statement
works the other way around; that is, right justification on the first page, left justification
on the second page, and so on.
Tables generated by and statements are generally assigned individual table numbers even
though they all come from the same statement. If you prefer, you may force all tables
created by ands to have the same table number as the table on the tab statement. This
option is invoked by the option nand on the tb statement:
tbr 1;nand
To switch off table numbering once it has been initiated, use the option notbl (this
cancels tbl and tbr). To reset it, simply enter another tb statement carrying the required
table number as before.
If you want more control over the positioning of the table number, switch off the
automatic table numbering with notbl and define the table number on a tt statement, using
the notation:
<<tab>>
to mark the position in which the table number is to be printed. For example:
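/* a title of this form marks where the table number is to go
ttlTable <<tab>>: Age by Sex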
When the table is created, Quantum will substitute the appropriate table number in place
of <<tab>>. Since the title is defined on a ttl statement, the table number will be printed
left-justified at the top of the table, together with any other titles from the axes.
You may use this notation on tt statements at any position, either inside or outside the
axis.
Quantum numbers the tables sequentially from 1, as it does when you use tb. If you want
to force a table to have a specific number, use a tb statement with the required number
just before that table.
a;dsp;op=12;notbl
tba 1
tab age sex
ttlTable <<tab>>: Age by Sex
tab marry sex
foot
ttl
ttl
ttcTable <<tab>>: Marital Status by Sex
tba 10
tab prefer region
l region
ttlTable <<tab>>: Regional breakdown
In this example, the first table is Table 1, and this text is printed at the top left corner of
the page. The second table is Table 2. The title is printed at the foot of the table, separated
from the last line of the table by two blank lines. The third table is Table 10. This text is
printed at the top left corner of the page. Notice, that this title is defined as part of the row
axis.
Quick Reference
To define the page number to be printed on a page, type:
pag number
or type:
<<pag>>
on a tt statement.
Page numbers are always printed in the top right-hand corner of the table unless you use
the option nopage on the a, flt or tab statement. The first page is Page 1, the second is
Page 2, and so on. Note that page numbers are completely independent of table numbers:
one table may cover several pages with different numbers, but the table number will
remain the same.
Page numbering is controlled by the pag statement which looks like this:
pag n
As with tb, page numbers for various tables may be set individually by preceding the
relevant tabs with a pag.
Page numbering may be turned off for a single table by including the option nopage on
the tab statement. To switch off page numbering for a group of tables, place nopage on
a flt statement. When page numbers are switched off, Quantum does not stop
incrementing the page count for each new page, it simply does not print the page number.
Therefore, if we write:
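/* axis names are illustrative
tab ax01 bk01
tab ax02 bk01;nopage
tab ax03 bk01
the first table is printed as Page 1, no page number is printed on the second table, and the
third table is printed as Page 3 because the page count is still incremented for the second
table.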
If you want more control over the positioning of the page number, switch off the
automatic page numbering with nopage and define the page number on a tt statement,
using the notation:
<<pag>>
to mark the position in which the page number is to be printed. For example:
bot
ttrPage <<pag>>
tab age sex
ttlTable <<tab>>: Age by Sex
When the table is created, Quantum will substitute the appropriate page number in place
of <<pag>>. Since the page number is defined on a ttr statement under bot, it will be
printed right-justified at the bottom of the page.
If you want to force a page to have a specific number, use a pag statement with the
required number just before that table.
Let’s expand the example we used above with tb by adding page numbers:
a;dsp;op=12;notbl;nopage
bot
ttrPage <<pag>>
tba 1
tab age sex
ttlTable <<tab>>: Age by Sex
pag 5
tab marry sex
foot
ttl
ttl
ttcTable <<tab>>: Marital Status by Sex
tbl 10
tab prefer region
l region
ttlTable <<tab>>: Regional breakdown
In this example, all tables have their page number in the lower right corner of the page.
The first table starts on Page 1, the second starts on Page 5, and the third starts on Page 6
(assuming the previous table did not spread over more than one page).
Quick Reference
To override the default justification of titles defined by the keywords tta, ttb, tba and tbb,
use ori. Type:
ori 0
to print text at the top of the current table in the same position as on the previous table, or:
ori 1
to print it the other way around (tta text on the right and ttb text on the left).
The ori statement determines the justification (page orientation) for text at the top of the
table, and may be used to give finer control over the conditions imposed with tta, ttb, tba
and tbb. If used, it must come before a tab statement. Its syntax is:
ori n
If n is 0 (zero), the text at the top of the current table will be printed in the same position
as in the first table (i.e., tta on the left and ttb on the right). If n is 1, the text will be printed
the other way around (note that this may mean that two consecutive pages have the same
orientation). For example:
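/* illustrative: assume the first and third tables each spread over two pages
/* and the second table fits on one page
tba 1
tab ax01 bk01
ttaBase: All Respondents
tab ax02 bk01
ttaBase: All Respondents
ori 0
tab ax03 bk01
ttaBase: All Respondents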
The first table will have its table number and title left justified on the first page and right
justified on the second. The second table will have its table number and title left justified.
Because it is preceded by ori 0, the third table will have its table number and title left
justified on the first page and right justified on the second. Without ori, the third table
would have right justification on the first page (the 4th page overall) and left justification
on the second page (the 5th page overall).
Filter statements are the second level in the tabulation hierarchy. They define conditions,
options and text applicable to the tables on all subsequent tabs until another filter is
defined.
There are two types of filter: general (flt) and named (flt=).
Quick Reference
To define filters and options for a group of tables, type:
flt;filters and options
The flt statement provides conditions, options and text for the group of tab statements
which follow it, and should be placed immediately before the first tab to which it refers.
The filters and conditions remain operative until another flt statement is read or until
overridden by different options for a single table.
The format is:
flt;filters and options
where filters and options are any of the keywords mentioned in section 21.2 as valid on
the tab statement. Each option in the list must be separated by a semicolon. The
statement:
flt;c=c106’2’;nz;decp=2
causes all subsequent tables to be filtered to include women only (c=c106’2’), all rows
in which all cells are zero to be omitted and percentages to be shown to two decimal
places.
Conditions (c=) defined on a flt statement are additional to those defined later on the
individual tabs, whereas options apply to all tables unless overridden by a different
version of themselves on the tabs. If we write:
flt;c=c106’2’
tab occup region;c=c132’1’
ttlBase: Women in Full Time Employment
the condition applied to the table is effectively:
c=c106’2’.and.c132’1’
which will include all women who have full-time employment. On the other hand, if the
flt defines a global scaling factor of 10 and an individual tab has a scaling factor of 5, the
cells in all tables except the one with scale=5 will be multiplied by ten before they are
printed. Cells in the other table will be multiplied by five.
General filter statements may have and and tt statements associated with them, the
former listing higher dimension axes to be used in all tables in the group, while the latter
defines the text to be printed at the top of all tables, in addition to texts generated by the
tab statement. For instance:
flt;c=c106’2’;nz;decp=2
and3 region
tab age occup
tab class age
tab occup class
Flts may also be followed by foot and bot statements, each with tt statements defining
other texts relevant to the group of tables as a whole.
Flt is useful when groups of tables have the same overall filters, or where a set of tables
are to be produced more than once using different filters. For example:
flt;c=c106’1’;op=12
ttcMen
tab ax01 demo;c=c121’1’
ttcResident in Central London
tab ax01 demo;c=c121’2’
ttcResident in Outer London
tab ax01 demo;c=c121’3’
ttcResident in England Outside London
flt;c=c106’2’;op=12
ttcWomen
tab ax01 demo;c=c121’1’
ttcResident in Central London
In this case we are tabbing the same set of axes for men and women separately. If we did
not want such specific table titles (i.e., showing the region), we could have written this
example more simply by generating a 3-dimensional table with region as the third
dimension, thus:
flt;c=c106’1’;op=12
ttcMen
tab region ax01 demo
.
Other tables
.
flt;c=c106’2’;op=12
ttcWomen
tab region ax01 demo
.
.
Yet another way of cutting down on the amount of writing is to make this a 4-dimensional
table:
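/* assumes a sex axis (Base;Male;Female, coded in column 106) is defined in the run
tab sex region ax01 demo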
As you can see, there are many ways of tackling this. Which one you choose depends on
what other tables you have to produce and the type of output requested by the client.
To cancel a filter altogether, rather than replacing it with another one, just enter a blank
flt. Note that this will cancel not only the conditions and options on the previous filter,
but also any tt texts or and statements associated with it.
Quick Reference
To define a set of filter conditions and options that you may refer to by name, type:
flt=name;filters and options
Named Filters are very useful ways of increasing the efficiency of, and reducing work in,
a program where sets of tables, each with its own flt statement and text, are interspersed.
As described above, conditions, options and text are defined using filter statements, but
this time we give the filter a name. The syntax is:
flt=name;filters and options
where name is a name of up to 15 characters, and filters and options are as described for
general filters above. This may then be followed by tt statements describing the filter. For
instance:
flt=male;c=c106’1’;op=120
ttlMales
To use a named filter, simply add the option flt=name to each tab statement requiring
the filter. The statement:
tab ax01 demo;flt=male
produces a table of ax01 by demo filtered by c106’1’, using the heading Males.
When filters are applied in this way, they refer only to the table on which they are named.
Note also that a named filter on a tab does not override any previous general flts whose
conditions apply to that table. The conditions are additive in this case.
For example, if we have a series of tables consisting of one table each for men, women
and all respondents, followed by another set for the same three groups, and for one reason
or another we cannot rearrange them and use general filters or use 3-dimensional tables,
we can greatly reduce our work by creating three named filters and applying each one in
turn to the tab statements which create the sets of tables for men, women and all
respondents. We might write:
flt=male;c=c106’1’;op=012
flt=fem;c=c106’2’;op=012
flt=all;c=c106’12’;op=012
tab ax01 demo;flt=male
tab ax01 demo;flt=fem
tab ax01 demo;flt=all
Quick Reference
To mark the start of a nested filter section, type:
sectbeg[;options]
and to mark the end of the section, type:
sectend
Sometimes, you’ll have a group of tables which share the same overall set of titles and
filters, but some tables within the group will require additional titles and filters. If you
use flt statements, you’ll define the filters and titles for the first group and then repeat
them, with the additional titles, for the second group. A more efficient method is to write
a nested table spec.
In a nested table spec, you define your tables in groups or sections. The outermost group
has filters and titles which apply to all tables in that group and also to any table subgroups
which may occur in the group. Inner groups have just their additional filters and titles
defined in their specs, but when Quantum creates the tables it will take the filters and
titles for the outer group(s) first and then apply the additional filters and titles required
for the subgroup.
As an example, suppose you have a set of tables for people who bought a new car during
the last six months. You then want a second group of tables for women who bought a new
VW car during that period. The basic titles and filters are the same – bought a new car
during the last six months – but there are extra titles and filters for the second set of tables,
namely, women who bought VWs. Therefore, you have two sections for filtering. The top
level or section is people who bought new cars during the last six months, the lower level
or subsection is the women who bought VWs.
The statements associated with this facility are sectbeg and sectend. To start a section,
type:
sectbeg; options
where options is any of the options permitted on a/flt/tab statements, except dp and
netsort. To end a section, type:
sectend
You may use up to ten sectbeg statements before using a sectend, but you must be certain
that you include one sectend for every sectbeg used. The use of consecutive sectbeg
statements without an intervening sectend signals a subsection within the main section.
This ends at the first sectend, leaving the main section to continue until another sectend
(with no further sectbeg statements) is read.
Having defined your filter requirements with sectbeg, you then enter titles and tab
statements as usual. Here’s the spec for our new car buyers:
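As a sketch of how such a spec might look (the axis names other than likes and ban1, the
column numbers in the conditions and the title texts are illustrative assumptions):
a;op=12;ttend=tab
/* start of main section: all buying a new car in the last six months
sectbeg;c=c120’1’
ttlBase: All buying a new car in the last six months
tab buys ban1
/* start of subsection: women buying a new VW
sectbeg;c=c106’2’.and.c130’1’
ttlWomen buying a new VW
tab likes ban1
sectend
/* end of subsection: the main section continues
tab wants ban1
sectend
/* end of main section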
The comments in the spec explain where each section starts and ends. When Quantum
reaches the second sectbeg statement, it knows that it hasn’t yet read a sectend statement
to terminate the first section, so it assumes that this is the start of a table subsection. Any
filters or options on the second sectbeg plus any titles defined after it are applied to the
tables in that section in addition to filters and titles already in place for the main section.
The table of likes by ban1 will have four titles:
We used the option ttend=tab on the a statement to force Quantum to print titles in this
order rather than printing filter titles after table titles.
If you wanted to write the same spec using flt statements, you’d need one flt statement
for each sectbeg, containing the full filter specification for the next set of tables, followed
by a listing of all titles required for those tables. For the tables in the subsections, this
would mean repeating titles which you’d already defined for previous tables.
/* define axes
l sex
ttlQ1. Sex
ttlBase: All Respondents
col 110;Base;Male;Female
/*
l age
ttlQ2. Age
ttlBase: All Respondents
col 111;Base;11-16 yrs;17-20 yrs;21-24 yrs;25-34 yrs;35-44 yrs;
+45-54 yrs;55-64 yrs;65+ yrs
/*
l q7
ttlQ7. Have you visited Museum before?
ttlQ8. If so, number of previous visits excluding this one
ttlBase: All Respondents
col 116;Base;Yes;No
n00;c=c116’1’
n11Base
n23All visiting Museum previously
val c(117,118);=;0 times;1 time;2 times;3 times;4 times;5 times;
+6 times;7 times;8 times;9 times;10 times;i;11+ times=11-99
n01DK/NA;c=-
/*
l q12
ttlQ12. Have you visited any other museum/art gallery before today
ttl and/or do you intend to visit any others?
ttlBase: All Respondents
col 123;Base;Yes;No;DK/NA=rej
/*
l q13;c=c123’1’
ttlQ13. Museums/Art galleries visited/intend to visit
ttlBase: All who visited other museums before today
ttl and/or intend to visit others
n11Base
col 124;hd=All visited/intending to visit other museums;
+Science Museum;Victoria and Albert;IGS;British Museum;
+Tate Gallery;National Gallery;Others;DK/NA=rej
/*
l lq1
ttlQ1. How long have you been in the Museum today?
col 137;Base;A few minutes;Half hour;Three-quarters of an hour;
+One hour;One and half hours;Two hours;Two and half hours;
+Three hours;Three and half hours;4 hours/half a day;Whole day;
+DK/NA=rej
/*
l lq2
ttlQ2. Was your stay longer/shorter than intended?
col 138;Base;Longer;Shorter;Hadn’t planned particular length of time;
+About what I’d planned;DK/NA=rej
/*
l lq3
ttlQ3. What do you remember seeing?
n10Base
col 181;Human Biology;Man’s place in evolution;Wildlife in danger;
col 182;Dinosaurs;Collections/Conservation/Research;Fish and reptiles;
+Fossil galleries;Birds;Insects;Whale hall;Mammals;Minerals/meteorites;
+Introducing ecology;Botany;Origin of species
n01Others;c=c181’4’
n01DK/NA;c=-
/*
l lq7
ttlQ7. How did you find your way round the Museum?
col 232;Base;Signposting;Guidebook;Attendant;Wandered;
+Brief description of route taken;"With difficulty";
+Gallery plan;With someone who knew/already knew;DK/NA=rej
/*
l lq8
ttlQ8. Could signposting be improved?
col 233;Base;Yes;No;Don’t know;Other;DK/NA=rej
/*
l lq9;c=c233’1’
ttl Q9. How do you think it might be improved?
ttlBase: All who think signposting could be improved
col 234;Base;Increase frequency;Larger/clearer;Color;
+Device to indicate section;Specific problem/solution;
+Non specific comment;Comment about something other than signing;
/* define banners
l ban1
n10TOTAL
col 110;Male;Female
col 111;11-20=’12’;21-34=’34’;35-54=’5/6’;55+=’78’
col 112;Yes;No
col 116;Yes;No
g
g
g Sex Age
g ----------- --------------------------
g TOTAL Male Female 11-20 21-34 35-54 55+
g -----------------------------------------------------
p x x x | x x x x |
g Completed Visited
g Full Time Museum
g Education Before
g ---------- -----------
g Yes No Yes No
g------------------------------
p x x | x x
11-16 yrs 38 23 15
6% 7% 6%
17-20 yrs 82 50 32
14% 15% 12%
21-24 yrs 98 52 44
16% 15% 17%
35-44 yrs 91 49 42
15% 14% 16%
45-54 yrs 55 32 23
9% 9% 9%
55-64 yrs 33 16 17
5% 5% 6%
65+ yrs 16 10 6
3% 3% 2%
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 605 341 264 | 120 290 146 49 | 480 135 | 306 299
| | |
Yes 306 177 129 | 70 136 76 24 | 245 61 | 306 -
51% 52% 49% | 58% 47% 52% 49% | 51% 49% | 100% -
| | |
No 299 164 135 | 50 154 70 25 | 235 64 | - 299
49% 48% 51% | 42% 53% 48% 51% | 49% 51% | - 100%
| | |
All visiting Museum previously | | |
1 time 88 52 36 | 17 44 24 3 | 77 11 | 88 -
29% 29% 28% | 24% 32% 32% 13% | 31% 18% | 29% -
| | |
2 times 56 29 27 | 19 24 10 3 | 41 15 | 56 -
18% 16% 21% | 27% 18% 13% 13% | 17% 25% | 18% -
| | |
3 times 33 14 19 | 9 12 8 4 | 26 7 | 33 -
11% 8% 15% | 13% 9% 11% 17% | 11% 11% | 11% -
| | |
4 times 22 14 8 | 6 9 7 - | 15 7 | 22 -
7% 8% 6% | 9% 7% 9% - | 6% 11% | 7% -
| | |
5 times 14 7 7 | 4 7 3 - | 10 4 | 14 -
5% 4% 5% | 6% 5% 5% - | 4% 7% | 5% -
| | |
6 times 21 15 6 | - 15 4 2 | 20 1 | 21 -
7% 8% 5% | - 11% 5% 8% | 8% 2% | 7% -
| | |
7 times 5 5 - | 2 3 - - | 4 1 | 5 -
2% 3% - | 3% 2% - - | 2% 2% | 2% -
| | |
8 times 3 3 - | 2 1 - - | 3 - | 3 -
1% 3% - | 3% 1% - - | 1% - | 1% -
| | |
9 times 1 - 1 | - - 1 - | 1 - | 1 -
* - 1% | - - 1% - | * - | * -
| | |
10 times 12 5 7 | 4 6 2 - | 6 6 | 12 -
4% 3% 5% | 6% 4% 3% - | 2% 18% | 4% -
| | |
11 times 51 33 18 | 7 15 17 12 | 42 9 | 51 -
17% 19% 14% | 10% 11% 22% 50% | 17% 15% | 17% -
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 605 341 264 | 120 290 146 49 | 480 135 | 306 299
| | |
Yes 427 256 171 | 92 207 97 31 | 339 88 | 222 205
71% 75% 65% | 77% 71% 66% 63% | 71% 70% | 73% 69%
| | |
No 178 88 93 | 28 83 49 18 | 141 37 | 84 94
29% 25% 35% | 23% 29% 34% 37% | 29% 30% | 27% 31%
| | |
All visited/intending to visit other museums | | |
Science Museum 334 210 124 | 74 158 81 21 | 268 66 | 172 162
78% 82% 73% | 80% 76% 84% 68% | 79% 75% | 77% 79%
| | |
Victoria and Albert 92 49 43 | 20 35 27 10 | 76 16 | 46 46
22% 19% 25% | 22% 17% 28% 32% | 22% 18% | 21% 22%
| | |
IGS 47 31 16 | 11 22 8 6 | 35 12 | 21 26
11% 12% 9% | 12% 11% 8% 19% | 10% 14% | 9% 13%
| | |
British Museum 26 18 8 | 8 11 6 1 | 17 9 | 11 15
6% 7% 5% | 9% 5% 6% 3% | 5% 10% | 5% 7%
| | |
Tate Gallery 19 7 12 | 6 9 4 - | 12 7 | 9 10
4% 3% 7% | 7% 4% 4% - | 4% 8% | 4% 5%
| | |
National Gallery 21 11 10 | 4 10 5 2 | 15 6 | 10 11
5% 4% 6% | 4% 5% 5% 6% | 4% 7% | 5% 5%
| | |
Others 37 16 21 | 10 17 7 3 | 39 8 | 18 19
9% 6% 12% | 11% 8% 7% 10% | 9% 9% | 8% 9%
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
A few minutes 7 5 2 | 1 4 2 - | 5 2 | 6 1
2% 3% 2% | 2% 3% 3% - | 2% 3% | 4% 1%
| | |
Half hour 35 26 9 | 4 19 6 4 | 29 6 | 18 17
12% 15% 7% | 11% 13% 8% 19% | 12% 9% | 12% 12%
| | |
Three-quarters of an 27 18 9 | 4 12 8 3 | 23 4 | 17 10
hour 9% 11% 7% | 7% 8% 11% 14% | 10% 6% | 11% 7%
| | |
One hour 61 35 26 | 17 24 17 3 | 45 16 | 28 33
20% 20% 20% | 30% 16% 24% 14% | 19% 25% | 18% 23%
| | |
One and half hours 64 36 28 | 13 36 11 4 | 49 15 | 35 29
21% 21% 22% | 23% 24% 15% 19% | 21% 23% | 22% 20%
| | |
Two hours 50 22 28 | 8 25 13 4 | 42 8 | 24 26
17% 13% 22% | 14% 16% 18% 19% | 18% 13% | 15% 18%
| | |
Two and half hours 21 8 13 | 2 14 5 - | 17 4 | 8 13
7% 5% 10% | 4% 9% 7% - | 7% 6% | 5% 9%
| | |
Three hours 13 8 5 | 3 5 4 1 | 9 4 | 8 5
4% 5% 4% | 5% 3% 6% 5% | 4% 6% | 5% 3%
| | |
Three and half hours 2 2 - | 1 1 - - | 1 1 | 2 -
1% 1% - | 2% 1% - - | * 2% | 1% -
| | |
4 hours/half a day 11 5 5 | 1 5 3 2 | 9 2 | 7 4
4% 3% 5% | 2% 3% 4% 10% | 4% 3% | 4% 3%
| | |
Whole day 10 6 4 | 1 7 2 - | 8 2 | 3 7
3% 4% 3% | 2% 5% 3% - | 3% 3% | 2% 5%
| | |
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
Longer 49 25 24 | 13 26 8 2 | 33 16 | 19 30
16% 15% 18% | 23% 17% 11% 10% | 14% 25% | 12% 21%
| | |
Shorter 91 59 32 | 13 48 21 9 | 78 13 | 49 42
30% 35% 25% | 23% 32% 30% 43% | 33% 20% | 31% 29%
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
Hadn’t planned 84 44 40 | 19 43 19 3 | 64 20 | 39 45
particular length of 28% 26% 31% | 33% 28% 27% 14% | 27% 31% | 25% 31%
time | | |
| | |
About what I’d planned 74 42 32 | 11 34 22 7 | 60 14 | 46 28
25% 25% 25% | 19% 22% 31% 33% | 25% 22% | 29% 19%
| | |
DK/NA 3 1 2 | 1 1 1 - | 2 1 | 3 -
1% 1% 2% | 2% 1% 1% - | 1% 2% | 2% -
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
Human Biology 123 60 63 | 23 71 25 4 | 98 25 | 65 58
41% 35% 48% | 40% 47% 35% 19% | 41% 39% | 42% 40%
| | |
Man’s place in evolution 120 62 58 | 28 62 21 9 | 90 30 | 51 69
40% 36% 45% | 49% 41% 30% 43% | 38% 47% | 33% 48%
| | |
Wildlife in danger 32 17 15 | 2 25 4 1 | 29 3 | 19 13
11% 10% 12% | 4% 16% 6% 5% | 12% 5% | 12% 9%
| | |
Dinosaurs 259 152 107 | 53 130 58 18 | 203 56 | 128 131
86% 89% 82% | 93% 86% 82% 86% | 86% 88% | 82% 90%
| | |
Collections/ 58 38 20 | 11 28 15 4 | 49 9 | 35 23
Conservation/Research 19% 22% 15% | 19% 18% 21% 19% | 21% 14% | 22% 16%
| | |
Fish and reptiles 111 77 34 | 25 52 27 7 | 90 21 | 56 55
37% 45% 26% | 44% 34% 38% 33% | 38% 33% | 36% 38%
| | |
Fossil galleries 64 35 29 | 6 40 15 3 | 53 11 | 30 34
21% 20% 22% | 11% 26% 21% 14% | 22% 17% | 19% 23%
| | |
Birds 100 52 48 | 19 59 16 6 | 84 16 | 50 50
33% 30% 37% | 33% 39% 23% 29% | 35% 25% | 32% 34%
| | |
Insects 75 41 34 | 18 41 12 4 | 61 14 | 39 36
25% 24% 26% | 32% 27% 17% 19% | 26% 22% | 25% 25%
| | |
Whale Hall 83 43 40 | 14 49 15 5 | 72 11 | 43 40
28% 25% 31% | 25% 32% 21% 24% | 30% 17% | 28% 28%
| | |
Mammals 142 64 78 | 31 69 31 11 | 111 31 | 67 75
47% 37% 60% | 54% 45% 44% 52% | 47% 48% | 43% 52%
| | |
Minerals/meteorites 88 40 48 | 19 44 19 6 | 70 18 | 31 57
29% 23% 37% | 33% 29% 27% 29% | 30% 28% | 20% 39%
| | |
Introducing ecology 29 14 15 | 4 22 3 - | 23 6 | 14 15
10% 8% 12% | 7% 14% 4% - | 10% 9% | 9% 10%
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
Botany 58 31 27 | 12 33 10 3 | 45 13 | 26 32
19% 18% 21% | 21% 22% 14% 14% | 19% 20% | 17% 22%
| | |
Origin of species 101 52 49 | 24 51 19 7 | 74 27 | 51 50
34% 39% 38% | 42% 34% 27% 33% | 31% 42% | 33% 34%
| | |
Others 82 51 31 | 14 42 18 8 | 61 21 | 46 36
27% 30% 24% | 25% 28% 25% 38% | 26% 33% | 29% 25%
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
Signposting 127 71 56 | 19 66 31 11 | 104 23 | 59 68
42% 42% 43% | 33% 43% 44% 52% | 44% 36% | 38% 47%
| | |
Guidebook 22 6 16 | 2 14 4 2 | 20 2 | 9 13
7% 4% 12% | 4% 9% 6% 10% | 8% 3% | 6% 9%
| | |
Attendant 11 3 8 | 2 3 5 1 | 8 3 | 10 1
4% 2% 6% | 4% 2% 7% 5% | 3% 5% | 6% 1%
| | |
Wandered 120 76 44 | 28 58 25 9 | 93 27 | 52 68
40% 44% 34% | 49% 38% 35% 43% | 39% 42% | 33% 47%
| | |
Brief description of 4 2 2 | 1 1 1 1 | 3 1 | 2 2
route taken 1% 1% 2% | 2% 1% 1% 5% | 1% 2% | 1% 1%
| | |
"With difficulty" 16 10 6 | 3 7 5 1 | 12 4 | 11 5
5% 6% 5% | 5% 5% 7% 5% | 5% 6% | 7% 3%
| | |
Gallery plan 35 23 12 | 7 20 8 - | 26 9 | 21 14
12% 13% 9% | 12% 13% 11% - | 11% 14% | 13% 10%
| | |
With someone who knew/ 30 14 16 | 7 13 8 2 | 23 7 | 28 2
already knew 10% 8% 12% | 12% 9% 11% 10% | 10% 11% | 18% 1%
| | |
DK/NA 3 2 1 | - 3 - - | 3 - | 1 2
1% 1% 1% | - 2% - - | 1% - | 1% 1%
| | |
Completed Visited
Full Time Museum
Sex Age Education Before
----------- ----------------------------- ---------- ----------
TOTAL Male Female 11-20 21-34 35-54 55+ Yes No Yes No
----------------------------------------------------------------------------------------
Base 301 171 130 | 57 152 71 21 | 237 64 | 156 145
| | |
Yes 95 51 44 | 21 48 23 3 | 75 20 | 64 31
32% 30% 34% | 37% 32% 32% 14% | 32% 31% | 41% 21%
| | |
No 176 103 73 | 30 87 42 17 | 137 39 | 77 99
58% 60% 56% | 53% 57% 59% 81% | 58% 61% | 49% 68%
| | |
Don’t know 27 15 12 | 6 14 6 1 | 22 5 | 14 13
9% 9% 9% | 11% 9% 8% 5% | 9% 8% | 9% 9%
| | |
DK/NA 3 2 1 | - 3 - - | 3 - | 1 2
1% 1% 1% | - 2% - - | 1% - | 1% 1%
| | |
Base 95 51 44 | 21 48 23 3 | 75 20 | 64 31
| | |
Increase frequency 15 8 7 | 7 6 2 - | 8 7 | 11 4
16% 16% 16% | 33% 13% 9% - | 11% 35% | 17% 13%
| | |
Larger/clearer 30 15 15 | 5 14 11 - | 24 6 | 23 7
32% 29% 34% | 24% 29% 48% - | 32% 30% | 36% 23%
| | |
Color 11 7 4 | 4 4 3 - | 6 5 | 11 -
12% 14% 9% | 19% 8% 13% - | 8% 25% | 17% -
| | |
Device to indicate 2 - 2 | - 1 1 - | 3 - | 1 1
section 2% - 5% | - 2% 4% - | 3% - | 2% 3%
| | |
Specific problem/ 32 18 14 | 3 18 9 2 | 28 4 | 20 12
solution 34% 35% 32% | 14% 38% 39% 67% | 37% 20% | 31% 39%
| | |
Non specific comment 4 2 2 | 1 3 - - | 4 - | 3 1
4% 4% 5% | 5% 6% - - | 5% - | 5% 3%
| | |
Comment about something 21 13 8 | 4 11 5 1 | 18 3 | 10 11
other than signing 22% 25% 18% | 19% 23% 22% 33% | 24% 15% | 16% 35%
| | |
Moving from 57 to 1,500 is the fine art of weighting. In this case, each middle-aged
housewife has a weight of 10,000/380. Since 57 of them buy cheddar cheese, the number
in the cell will be:
57 × 10,000/380 = 1,500
Weighting is also used to correct biases that build up during a survey. For example, when
conducting interviews by telephone you may find that 60% of the respondents were
women. You may then want to correct this ratio of men to women to make the two groups
more evenly balanced.
The basic idea behind weighting is that when someone falls into a given cell (i.e. satisfies
the conditions for that cell) the number in the cell is not increased by 1; rather, it is
increased by 1 multiplied by the individual’s weight.
Quantum is sufficiently flexible to allow more than one set of weights for a given set of
respondents. Which set is applied is determined by options on the a, sectbeg, flt or tab
statement or on the statements which create the individual rows or columns of a table.
Each set of weights, however, will apply one weight for each respondent. There are two
ways of calculating weights:
a) The weight for each respondent may be part of the data for that respondent, or it may
be calculated in the edit and passed to the tabulation section as a variable.
b) The more common method of weighting is to define a set of characteristics and apply
specific weights to respondents satisfying those characteristics.
Our example above uses characteristic weighting, where the characteristics are age, sex
and working status. Thus, all respondents who are women aged between 45 and 54 and
who do not work outside the home receive a weight of 10,000/380.
The characteristics must be such that each record satisfies one unique set. Each
respondent falls into one, and only one, set and no respondent is left out. Because of this,
you must check all columns containing the characteristics and if necessary, correct any
errors. If one characteristic is sex, then you must make sure that, say, c106 is single coded
with a ’1’ or a ’2’ only: it must not be blank, multicoded or otherwise miscoded in any
way.
Any respondent who is present in the base of the weighting matrix but not in any other
row or column of the matrix will be given a weight of 1.0, and his record will be printed
in the print file with the message ‘unweighted’.
Quantum offers factor, target and rim weighting, preweights, postweights, weighting
using proportions and weighting to a given total. These are described, with examples, in
the sections which follow. The keywords used to write the weighting matrices are
described later in this chapter.
Factor weighting
With factor weighting, every record which satisfies a given set of conditions is assigned
a specific weight. You would generally use it when the weights are calculated outside of
Quantum – for instance, you may be told that all unemployed people in London require
a weight of 10.5, whereas unemployed people in the rest of the country need a weight of
7.3.
When Quantum creates the weighted table, it will check which cell of the weighting
matrix each respondent belongs in, and will apply the weight associated with that cell
before placing the respondent in the table.
You can also use factor weighting, with a factor of 1.0, when you just want to use weights
stored in the data or calculated in the edit, without defining any other weights. These
weights are defined as preweights.
Target weighting
Target weights may be used when you know the exact number of respondents you want
to appear in each cell of the weighted table. For example, in a table of age by sex, you
may know the exact number of men under 21, women under 21, and so on, to appear in
the table once it has been weighted. The weights that you define in your matrix are
therefore the values to appear in the weighted table rather than the weights to be applied
to each respondent of a given age and sex.
When Quantum creates your weighted table, it calculates the weight for an individual
respondent by taking the target figure for the appropriate cell in the weight matrix and
dividing it by the number of respondents in that cell.
As an example, suppose that you have three groups of people. The first contains 100
people, the second contains 200, and the third contains 300. You know that in the total
population, the spread of any 600 respondents across these three groups would be 150,
200 and 250. When Quantum finds someone in the first group it will apply a weight of
1.5 (150/100) in order to obtain the total of 150 respondents in the weighted table.
Respondents in the second group will have a weight of 1.0 because the number of
respondents in this group matches the value in the weighting matrix for that group.
Respondents in the third group will have a weight of 0.83 (250/300) because there are
more people in that group than in the corresponding cell of the weighting matrix.
In this example, the number of people in our three groups was the same as the population
defined in the weighting matrix. This will not always be the case. Often you will find that
the values in the weighting matrix add up to more or less than the number of people you
have in your sample. For instance, the spread of the population across your three groups
may be 150, 250 and 250, giving a total of 650 respondents. When Quantum balances
your sample, it will weight each respondent according to the values in the matrix so that
the total number of respondents in your weighted table will be 650, rather than the 600
that were interviewed.
If you decide that you want the total in the weighted table to be the same as the total
number of respondents in your sample, you may define this total as part of the weighting
matrix using the keyword total= which is described below. When Quantum reads this
keyword it balances the three groups according to the weights in the matrix and then
adjusts all three weights so that the weighted total is 600.
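As an illustration (a sketch only: the axis name grp is assumed, and the exact placement of
the total= keyword should be checked against the keyword descriptions later in this
chapter), the 150/250/250 example could be written as:
wm1 grp;total=600;150;250;250
Quantum would first weight the three groups towards 150, 250 and 250, and would then
scale all the weights so that the weighted total is 600 rather than 650.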
Another variation of target weighting occurs when instead of knowing the actual number
of people in each group of the population, you know that each group is a given percentage
of the population. For instance, the first group may be 27% of the population, the second
may be 48%, and the third may be 25%. In cases like this, you include the keyword input
(see below) in the weighting matrix with the percentages for each group.
Rim weighting
Rim weighting may be used when either:
a) you want to weight according to various characteristics, but do not know the
relationship of the intersection of those characteristics, or
b) you do not have enough respondents to fill all the possible cells of the table if you
were to weight the data using the multidimensional technique described above.
For example, you may want to weight by age, sex and marital status and may know the
weights for each category of those characteristics (e.g. people aged 25 to 30; men; single
people). However, you may not know the weights for, say, single men aged between 25
and 30, married women aged between 31 and 40, and so on.
On another study, you may need to weight by a large number of characteristics at the
same time (e.g., sex, age, race, occupation and income). Since each of these
characteristics will be broken down into categories, you will require a weighting matrix
with many cells. You may not have enough information to write a standard
multidimensional weighting matrix which defines weights for the intersection of all these
characteristics. However, as long as you have information on each category individually
(e.g., male, female, 21-24, 25-30, and so on) you will be able to perform the weighting
required with rim weights.
Rim weighting is designed to attempt to weight all characteristics at the same time. The
accuracy of your weighting will depend on how well your sample matches the known
universe. If the sample is a good match, then it is likely that Quantum will generate
acceptable weights; if the sample is not a good match it is possible that the weights will
look perfectly acceptable when you look at the number of men or the number of married
people, but will look totally unacceptable when you look at the number of married men.
As the rim weighting process runs, it tries to distort each variable as little as possible
while still trying to attain all the desired proportions among the characteristics. The ‘Root
Mean Square’ figure which Quantum produces will tell you how much distortion you
have introduced (i.e. how reliable your sample is). The larger the number, the more
distortion and thus the less accurate your sample is. This is discussed in more detail later
in this chapter.
Another very powerful facility of rim weighting is the fact that it automatically rescales
all the target values to the same base. For instance, suppose you have a sample of 5,000
respondents. Your rim weighting matrix defines:
Quantum will calculate the weights for these characteristics, using the figures given, and
will then adjust them so that the total for the weighted table is 10,000. If you do not define
a total, the weights will be adjusted to the total of the first variable defined in the matrix.
As you can see from this simple example, rim weighting can be used when you have
weights coming from different sources, and when those weights need to be brought to a
common base or total.
When we were talking about target weighting, we said that sometimes you might not
know the actual counts of respondents in a group, even though you may know that the
group is a certain percentage or proportion of the total population. For instance, you may
know that 60% of the population is women, but you may not know how many women
that represents.
When this happens, you can enter the percentages or proportions as the weights for each
group, and use the keyword input to indicate that these figures should be used as targets.
For example, in a table of age by sex you would enter the proportion or percentage that
each combination of age and sex is of the total population, and Quantum would calculate
what weight to assign to each respondent in each category.
When you define targets which add up to more than the number of respondents in your
sample, Quantum will calculate the weights for each respondent such that the total for the
weighted table equals the total of the figures in the weighting matrix. You may define
your own total figure (usually the number of respondents in your sample) using the
keyword total=n, where n is the required weighted total. Quantum will then calculate the
weights according to the values in the weighting matrix and will then adjust them to
match the total you have defined.
Preweights
Preweights, stored as part of each respondent’s data or created during the edit, are applied
to individual records before target or factor weighting is applied. When the characteristic
weights are targets, the preweights are used in the calculation of the weight for each
respondent. For example, suppose that each of our 380 housewives has a preweight in
columns 181 to 189 of their data record: one has the value 10 in c(181,189), while for
another the weight in that field is 20. If all the rest have a weight of 1, we would appear
to have:
378 × 1 + 10 + 20 = 408 respondents
To reach our target of 10,000, the weight for each woman would be:
10,000/408 (approximately 24.5), multiplied by her preweight
Preweights are often used in studies which deal with newspaper readership, or the like,
where a male adult respondent in a household will be counted as the total number of male
adults in the household, on the theory that the other males will probably have the same
demographics and similar behavioral patterns. Another use is in political polls, where a
respondent is preweighted by the number of calls it took to reach him. The supposition
behind this theory is that the more calls it takes to reach a respondent, the more people
there are like him, who are equally hard to reach. The respondent must therefore be
preweighted in order to help represent the many like him who were never interviewed.
Postweights
The opposite of preweights are postweights, which are applied after all other weights
have been applied, and therefore have no effect on the way in which targets are reached.
They are generally used to make a final adjustment to a specific item.
Suppose, for instance, that a survey was conducted in London and Inverness, and 200
respondents were interviewed in each city. The standard weighting might balance each
group according to sex and age so that the samples match the patterns of the total
populations in those cities. After this is done, you might apply a postweight to adjust the
totals for each city into their correct relative proportions, where London has a much
larger population than Inverness.
Although the a statement is the first statement in the tabulation section, weights are
calculated before any conditions specified on the a statement are applied. A similar thing
applies to filters defined on flt, sectbeg and tab statements. Therefore, filters do not
exclude respondents from weighting calculations.
If you want to exclude certain respondents from the weighting calculations, you can either:
• use reject;return in the edit section to reject them from the whole tabulation section,
or
• create a cell in the weighting matrix for those respondents and give them a weight of
zero.
Quick Reference
To define a weighting matrix, type:
wmnumber axis_names[;weight_type];[maxwt=max];[minwt=min];weight1;weight2;...
There are two ways of defining characteristic weights. You may either set up a weight
matrix that declares the weighting conditions and the weights to be applied, or you may
declare the weights in an ordinary axis and then label that axis as a weighting axis. This
section explains how to declare weights in a weighting matrix.
✎ Although you may write jobs that have weights declared in weighting matrices and
in axes, the syntax for the two methods is not interchangeable. If you want to define
weighting based on a combination of age and sex, say, you must either specify it all
using a weighting matrix or all using an axis.
When you weight using matrices, each weighting matrix defines a set of conditions and
the weights to be applied when a respondent is found having those characteristics. Matrix
characteristics are specified in ordinary axes which may be used for other parts of the
program, or not, as you choose. In our original example, if a respondent’s exact age is
stored in c(107,108) and sex is a ’1’ (Male) or ’2’ (Female) in c106, the middle-aged
women have c106’2’ and some arithmetic value between 45 and 54 in c(107,108). When
these axes are used for weighting, base statements (n10, n11 and Base on col/val/fld/bit),
text statements (e.g., n03), statistical rows (e.g., n12) and unweighted elements are
ignored.
A weighting matrix is defined with a statement of the form:
wmn axes_names;[options];weights
where n is a unique number by which the matrix can be identified, axes_names are the
axes defining the characteristics of this matrix, options are keywords defining the type of
weighting required, and weights are the targets, factors or proportions to be used for
weighting.
✎ You cannot use a grid axis on a wm statement. If you need to define weights
specifically for a grid axis, you should define a dummy axis with as many elements
as there are cells in the grid axis and use that instead.
☞ For further information, see "Weighted grids" in chapter 26.
The matrix number may be any number between 1 and 9, and as long as no number is
repeated, matrices may be numbered in any order.
Options are:
☞ These options and their relevance to weighting are described in the section entitled
"c= and anlev with wm statements".
Let’s look at a simple matrix defining weights for age and sex. The matrix will consist of
targets based on population figures for the area covered by our survey:
so we would write:
If you prefer, you can enter the weights on separate lines, as the figures are to be printed
in the table:
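As a sketch (the age breaks and all of the target figures are assumed purely for
illustration, and the optional weight_type keyword is omitted), the one-line form might
look like:
wm1 age sex;1200;1300;1400;1500;1100;1250;900;1050
and the same matrix with the weights laid out as they would appear in the table, one age
group per line:
wm1 age sex;1200;1300;
+1400;1500;
+1100;1250;
+900;1050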
In this example, all the weights are whole numbers, but Quantum can cope equally well
with weights which are real numbers.
If several consecutive weights are the same, you may save yourself time by writing the
weight out once, preceded by the number of times it is to be repeated and an asterisk.
For example:
3*10.5
If your tables are to be correct, it is imperative that you enter the axis names and the
weights in the correct order. Axes are entered as they are for tab statements; that is, higher
dimension axes followed by column axis followed by row axis.
Weights are entered on a row by row basis, working from left to right along the row. As
you can see by comparing the numbers on the wm statement with those in the chart above,
the first two numbers are the weights for men aged 18 to 24 and women aged 18 to 24,
in that order. Note that there is no need to keep weights for different characteristics on
different lines; just string them one after the other separated by semicolons on the same
line. If you run out of room, continue on the next line remembering to start the line with
a plus sign to tell Quantum it is a continuation.
If you like, think of weighting as creating a table in which you not only specify the axes
to create the cell-conditions but also define the numbers to go into those cells. When a
table is created that uses these weights, the program will first check which cell of the table
the respondent belongs in, then it will look to see which cell of the weighting matrix
refers to him. Finally, Quantum reads the appropriate weight from the matrix and
increments the cell count in the table by this value instead of by 1.0.
The preweight read from c(181,189) will be applied before the targets listed on the wm
statement.
A Quantum run may contain up to nine weighting matrices (wm1 to wm9), each of which
may name up to nine axes defining the conditions of that matrix. The maximum number
of weights per run is 2,000.
Do not be put off by the prospect of multidimensional weight matrices: they are exactly
the same as multidimensional tables. The last two axes named on the wm statement are
the rows and columns of the table, and weights are entered with reference to the last axis.
For example, we might have:
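(A sketch of the shape such a statement might take; the axis names and every figure after
the first three are assumptions added for illustration.)
wm2 work age sex;1200;2400;1400;2800;1100;2200;900;1800;
+600;1200;700;1400;550;1100;450;900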
where 1,200 is the target for working men aged 18-24, 2,400 is the target for working
women of the same age, 1,400 is the target for working men in the age group 25-34, and
so on. Not until all weights have been defined for people who work do we come onto
those for people who do not work. Remember that base-creating elements are ignored.
Quick Reference
To define weights in the elements of an ordinary axis, add one of the options:
wttarget=number
wtfactor=number
to the statements that create the elements.
✎ Elements must be specified using n statements since this facility does not yet work
with col, val, fld or bit statements.
Rim weights are not yet supported by this method; you must specify them using a
weight matrix.
To define weighting targets for the elements of an axis, add the option:
wttarget=number
to the n01 statements that create the elements, where number is the number of
respondents you want Quantum to show in the element. The example below shows how
to define targets based on sex. When you weight tables by sex, Quantum will count the
number of men in the data and will calculate a weight such that the number of men
matches the target for men.
l sex
n01Male;c=c156’1’;wttarget=485
n01Female;c=c156’2’;wttarget=515
n01Not answered;c=c156n’12’
The ‘Not answered’ element has no target defined so it is ignored for weighting purposes.
This means that records in that element are unweighted. If you do not already have an
axis whose elements define the weighting characteristics you want to use, just create the
axis but do not use it on a tab statement.
You define factors in the same way except that you use the keyword:
wtfactor=number
Any elements in a weighting axis that do not contain either wttarget or wtfactor are
ignored for weighting purposes.
Each weighting axis must have a number by which Quantum can refer to it. Type the
statement:
wmnumber=axis_name
at the top of the tabulation section under the a statement. If weighting is defined in the
sex axis, you could write the wm statement as:
wm1=sex
If you use the weighting axis as the rows, columns or higher dimension of an unweighted
table the weighting specifications are ignored. For example:
wm1=sex
tab sex region
tab sex region;wm=1
l sex
n10Base
n01Male;c=c110’1’;wtfactor=48
n01Female;c=c110’2’;wtfactor=52
l region
col 123;Base;North;South;East;West
This specification produces two tables. The first is unweighted so the weighting
information in the sex axis is ignored:
The second table has the same rows and columns but the cell values are weighted using
the weights in the sex axis:
Quantum does not accept a weighting axis as the rows or columns of the table if the table
itself is weighted using a different axis, as in the example shown here:
wm1=sex
tab region brand;wm=1
l region
n10Base
n01North;c=c115’1’;wttarget=100
n01South;c=c115’2’;wttarget=110
n01East;c=c115’3’;wttarget=120
n01West;c=c115’4’;wttarget=115
To alert you to this error Quantum issues the message ‘weight line needs one target or
factor’ for each element in the row or column axis (in this example, for each region).
Preweights and postweights, weighting using proportions, and weighting to a given total
are all requested using keywords on the l statement:
Sometimes you will be dealing with data in which all weighting information is already in
the record or where the weights are all calculated in the edit. The only way to weight
using information from the data or edit is to use preweights because otherwise Quantum
expects to read the weights from the wm statement. However, preweights cannot be used
by themselves, so we need to set up a dummy weighting matrix as shown here. Using a
weighting matrix you would write:
a;op=12;dsp;wm=1
wm1 axdum;pre=wtvar;factor;1
.....
l axdum
n01
If you are declaring weighting in the axes themselves you would write:
a;op=12;dsp;wm=1
wm1=axdum
.....
l axdum;pre=wtvar
n01;wtfactor=1
The weight is read from the given variable (in this case wtvar) and treated as a preweight.
Since preweights must be used with targets, factors or proportions, we define a factor of
1 which will not alter the value of the preweight when the two are multiplied.
In the weight matrix version, the dummy axis, axdum, contains a single n01 statement
with no conditions to correspond to the single factor in the matrix.
If the value in wtvar is 5, the final weight for the respondent will be 5 (5*1=5).
Quick Reference
To specify the maximum and/or minimum weights you will accept, place the keywords:
maxwt=value and/or minwt=value
either on the wm statement if you are using a weight matrix, or on the l statement or on
individual elements if you are declaring weights in axes.
The options maxwt= and minwt= allow you to define the maximum and/or minimum
weights that can be applied in tables using a specific weighting matrix or axis. The values
you specify may be integers or reals.
If you are using weighting matrices you place these keywords on the wm statement; if
you are specifying weights in axes you place these keywords on the l statement or on
individual elements. If you use the same keyword on an element and on the l statement
with different values, the setting on the element overrides the setting on the l statement.
When you specify maxwt= and/or minwt=, Quantum tries to ensure that the maximum
and/or minimum weights used in the table match your specifications. Quantum performs
the weighting calculations and adjustments as follows:
1. Calculate the weights in the usual way.
2. Examine each weight and compare it against the minimum and/or maximum values
defined. If a weight is less than the minimum value it is set to the minimum value; if
it is larger than the maximum value it is set to the maximum value. If no adjustment is
necessary the weighting calculation is complete.
3. If adjustments were necessary, Quantum calculates the total obtained using the
modified weights and compares it with the total obtained using the unmodified
weights. If the totals are different, Quantum attempts to correct for this by adjusting
the weights which were not set to maxwt or minwt and then returns to step 2.
If all weights are set to maxwt or minwt so that no correction is possible, Quantum
uses the unmodified values.
All adjustments made with this type of weighting are recorded in the weighting report
file.
☞ The name of the weighting report file varies between machines. See section 36.9 for
the name of this file on the type of machine you are using.
You can use this facility with target, factor, input or rim weighting. Pre- or postweights
are not affected by adjustments made by this stage of the weighting process.
✎ Use this facility with care. If minimum or maximum adjustment takes place it is
possible that the targets or proportions defined in the matrix will not be met.
In all our examples so far, we have known the total number of people in our population
who have two or more characteristics in common – one from each axis. For instance, in
the weighting matrix for region, age and sex we knew how many men aged 65 or over live
in the west.
Suppose, however, we don’t have these figures. We know only that our universe of 1,000
people can be described as follows:
200 people live in the north, 380 people live in the south,
150 people live in the east, and 270 people live in the west
We don’t know, for instance, how many men aged 65 and over live in the north, so we
cannot create a standard weighting matrix using region, age and sex as characteristics.
Instead we use rim weighting which permits us to weight on these three conditions, thus:
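(A sketch of the shape such a statement might take. The region totals are those listed
above; the age and sex totals, the axis names, and the rim keyword, assumed here to be
the weight_type that selects rim weighting, are illustrative assumptions.)
wm1 region age sex;rim;200;380;150;270;
+250;300;280;170;
+480;520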
Here, we have listed the four population totals for region, followed by the four totals for
age with the two totals for sex at the end. We have also put the three sets of weights on
separate lines. This has been done to make the example easier to read, but you can string
them all together on the same line if you wish.
Note that when rim weighting is used, there is a maximum of 16 weighting axes per run.
Rim weighting is also useful when you know the relationship between some axes but not
others - for instance, you may know how many people of each sex you have in each age
group, but not the relationship between these and the region in which the respondent
lives. To weight using age, sex and region as characteristics, create an axis called, say,
agesex, which combines the axes age and sex as follows:
l agesex
n10Base
n01Male, 18-24;c=c110’1’.and.c112’1’
n01Female, 18-24;c=c110’2’.and.c112’1’
Your rim weighting matrix will then contain weights for age and sex combined and
region:
Rim weighting calculates weights using a form of regression analysis. This requires two
parameters: a ‘limit’ which defines how close the weighting procedure must get to the
targets you have given in order for the weights to be acceptable, and a number of
‘iterations’ which defines the number of times the weight calculations may be repeated
in order to reach the cell targets.
The regression technique compares the root mean square (rms) deviation of each weight
with the given limit, and once all weights are within the given limit weighting is
considered successful. If, after the maximum number of iterations, the root mean squares
are still outside the limits, a message ‘rim weighting failure’ is issued.
The default limit is 0.005 and the default number of iterations is 12. You may change
these parameters by creating a rim weighting parameters file containing a line of the
form:
where n is the number of the weighting matrix concerned, x is the new limit (between
0.0001 and 0.05) and y is the new number of iterations required (between 5 and 500). For
example, you may wish to reduce the limit and increase the number of iterations on a
large sample to increase the accuracy of the weights.
☞ The name of the rim weighting parameters file varies between machines. See section
33.7 for the name you should use.
As with ordinary weighting, rim weighting writes a summary report of the weights it
applied in a file called weightrp. This shows the weights for each category as they were
specified in the Quantum spec, and the input and projected frequencies and percents, and
then the weights it calculated. If you wish, you may request a more detailed report that
shows the rim weights calculated at every iteration.
This more detailed report has, in addition to the standard pages, one page per iteration
showing the root mean square (rms) and limit at that iteration, plus a table showing the
current weight, output and projected frequency for each weighting category. For
example:
Weighting matrix 1:
After 1 iteration:
rms=607.817042 limit=0.500000
RIM OUTPUT PROJECTED
WEIGHT FREQUENCY FREQUENCY
---------- ---------- ----------
2.200000 10.000 22.000
1.250000 16.000 20.000
1.823529 17.000 31.000
2.250000 12.000 27.000
0.927721 50.662 47.000
1.074218 49.338 53.000
To request the detailed report, add the line:
report=detailed
to the weight matrix entries in the rim weighting parameters file for which you require
the report. (The standard report type is report=normal, but this need never be specified.)
☞ For further information on rim weighting see the Rim Weighting Theoretical Basis
Paper entitled ‘ON A LEAST SQUARES ADJUSTMENT OF A SAMPLED FREQUENCY
TABLE WHEN THE EXPECTED MARGINAL TOTALS ARE KNOWN’, by W. Edwards
Deming and Frederick F. Stephan, in Volume 11, 1940 of the Annals of
Mathematical Statistics.
You may use maxwt= and minwt= with rim weighting as for ordinary jobs. Once
Quantum has calculated the weights according to your rim weighting specification, it
checks whether there are any that are lower than the minimum value you specified or
higher than the maximum. If so, it adjusts the weights as it does for other weighting
methods.
Quantum makes 100 attempts (iterations) at setting weights that fall within the given
range, after which it stops and reverts to the original weighting factors calculated before
the adjustments.
If the adjustments fail, Quantum produces the standard rim weighting report showing the
weighting factors calculated by the rim weighting process. If the adjustments succeed,
the rim weight, output frequency and output percent columns in the weighting report will
contain the following information:
RIM WEIGHT The original weight factors calculated by the rim weighting
process. These are used only if the adjustments fail.
OUTPUT The final output frequencies. If the adjustments succeeded
FREQUENCY then the figures will be based on the adjusted factors,
otherwise they will be based on the values reported in the
RIM WEIGHT column.
OUTPUT PERCENT The percentages for each rim element, either adjusted or
unadjusted as appropriate.
In addition, the Rim Weighting Efficiency and the Maximum and Minimum Respondent
Rim weights will also show adjusted figures if adjustment was successful.
✎ It is not possible to write out the adjusted rim weights as it is the cell weights that
Quantum adjusts rather than the rim weights.
Quick Reference
To use weights, type:
wm=number
as an option on the a, sectbeg, flt, tab or n (or equivalent) statement, where number is the
number of the weighting matrix or axis.
To switch weighting off for a particular element or table, type:
wm=0
Using weights is easy compared to setting them up. All you have to do is tell Quantum
which weighting matrix or axis to use when. Weighting is invoked by the option:
wm=n
on the a, sectbeg, flt, tab or n line, where n is the number of the weighting matrix or axis
to be used. For example, to weight a table using weight matrix/axis 3 we would write:
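For example (the axis names are reused from earlier examples purely for illustration):
tab ax01 demo;wm=3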
wm= on the a statement is operative for the whole run, whereas on a tab line it refers only
to tables created by that statement (remember, and statements take their options from the
previous tab statement). When used on an n statement or with elements on a col, val, fld
or bit statement, wm= weights that element only, according to the given matrix. If a table
contains row elements weighted using one matrix and column elements weighted using
a second matrix, each cell will be weighted using the matrix named on the column
element for that particular cell. Weighting defined for the row axis is ignored.
To turn weighting off for a particular cell, for example to produce a table containing a
weighted and an unweighted base, place the option:
wm=0
on the element. To produce an unweighted table in an otherwise weighted run, add this
option to the tab statement for that table.
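As a sketch of both uses (the element text is an assumption): adding a second base
element such as
n10Unweighted base;wm=0
to an otherwise weighted axis gives a table showing both a weighted and an unweighted
base, while
tab ax01 demo;wm=0
produces a completely unweighted table in an otherwise weighted run.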
Information about the weights calculated and applied is written to the weighting report
file. Some information is also displayed on the screen.
☞ For further information on the name and content of this file, see section 36.9.
Normally weights are calculated and applied each time a record is read in or a process
statement is executed. However, when trailer cards are read in one at a time, the weight
is calculated as if each trailer card were a new record. When using targets this can lead
to incorrect weights being used. To limit the cards that contribute to the weighting
calculations and the number of times each weight is applied, the c= option may be
specified on the wm statement. It takes the form:
This causes weights to be calculated whenever the first card of a new record is read. If
the current card is not the first card of a new record, Quantum applies the last weights
it calculated (i.e., those calculated when the first card in this record was read).
The keyword anlev may also appear on the wm statement to cause weights to be applied
only at the named level. For example, the statement:
Quick Reference
To transfer weights into the data, type:
wttran [matrix_number] c(m,n) :decimal_places
in the edit.
Often you may wish to transfer the weights created during a run back into the data file
itself, in order, for example, to give them to clients. This can be done using the statement:
wttran [matrix_number] c(m,n) :decimal_places
in the edit. This tells Quantum which weighting matrix to use (you may omit this
parameter if there is only one matrix), which columns to store the weights in and the
number of decimal places required. Remember that the decimal point takes up a column
in the data, so you will need to assign at least one column more than there are digits in
the largest weight. If some weights have more digits before or after the decimal point than
others, do not worry. Quantum always puts the decimal point in the same column.
✎ When you use wttran, make sure that you include a write or split statement in the
edit after the wttran statement to save the new data in a file. Remember that unless
you specifically save your new data, all alterations or additions exist only as long as
each record is being processed.
For example, the statement:
wttran 2 c(75,80) :2
which copies the respondent’s weight from matrix 2 into columns 75 to 80 of the data
record. The weight is copied with two decimal places and has the decimal point in c78.
This chapter tells you how you can increase the efficiency of your program by filing
frequently-used groups of statements separately and calling them up as and when they are
required.
A facility exists to allow parts of Quantum programs to be filed and included at the
appropriate position in the main Quantum program. Entire blocks of axes and table
control statements may be filed and later retrieved whenever called for. It is quite
possible, and indeed common, to have a whole run consisting of a series of files to be
included.
An additional bonus is that Quantum allows you to replace variable items in the file with
symbolic parameters, and to define the values these parameters are to have each time the
file is retrieved.
Quick Reference
To include one data file in another, or to include a file of Quantum statements in the main
program file, type:
#include file_name
Programs may be split up into logical units, for example, edit, tabs, and axes, with each
group of statements saved in a different file.
A program is most likely to contain groups of identical statements when a series of tables
is reproduced a number of times using different filters. Say, for example, that we are
asked to produce a series of tabulations by area of residence, and for each area we require
the tables:
This could be done using a flt statement for every area and writing out these statements
five times, thus:
flt;c=c121’1’
ttlBase: Respondents Living in Central London
tab brand demo
ttcBrand Bought Most Often
.
.
but we can reduce our work greatly by creating a new file containing the tabs and tts
instead. Let’s call it tab1.
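Based on the statements shown above, tab1 simply holds the tab and ttc statements for the
tables required in each area, for example:
tab brand demo
ttcBrand Bought Most Often
.
.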
Now, to create these tables we need to tell Quantum to read the file. This is done with an
*include or #include line in the program at the point at which the file should be read.
(There is no difference between *include and #include, but we’ll stick to #include for all
our examples.) For example:
flt;c=c121’1’
ttlBase: Respondents living in Central London
#include tab1
flt;c=c121’2’
ttlBase: Respondents Living in Outer London
#include tab1
flt;c=c121’3’
ttlBase: Respondents Living in England Outside London
#include tab1
.
.
The first time this file is included, Quantum will list its contents in the output listing, but
whenever it is included again in the same program, only the include statement will be
printed so that you can see which file Quantum has read in.
This example may be abbreviated even further by using symbolic parameters for the
items which differ from filter to filter. This is discussed below. Because tab1 is in the
same place as the rest of our program we have only entered its filename. If the Include
file is in a different directory or partition, you must give its full name otherwise Quantum
will tell you it is unable to find the file.
An Include file may itself include other files: for example, the file called subs may
include the contents of a file called ax9A. Each time we call up
subs we are also calling up ax9A. This is called nesting. Quantum allows you to use up
to 200 different Include files in a run.
The statement #include filename may appear in a Quantum data file as well. Includes may
be nested in a data file to four levels: that is, the main data file may contain an include to
call up the file DATAB, which itself includes DATAC which in turn includes DATAD:
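Schematically, the nesting looks like this:
main data file contains the line   #include DATAB
DATAB contains the line            #include DATAC
DATAC contains the line            #include DATAD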
Quick Reference
To read a non-standard data file into Quantum, create a dummy data file containing the
statement:
#include file_name;reclen=num_cols[;header=number[r]]
where header is the number of characters (number) or records (numberr) to skip at the
start of the file.
Quantum is able to read certain types of non-standard data file. Facilities currently exist
for:
• reading records which are not terminated by a new-line character; for example,
where the data is a continuous string of characters in which each record is exactly a
given length.
To read a non-standard data file, create a dummy data file containing the line:
#include datafile;reclen=record_length[;header=number[r]]
where datafile is the name of the non-standard data file, and reclen defines the length of
each record in bytes (characters).
The header parameter is required when there is information at the start of the file which
you wish to ignore. If you enter it as ‘header=n’, it indicates that n bytes (characters)
should be skipped at the start of the file; if you enter it as ‘header=nr’, it indicates that n
complete records should be skipped, where a record is assumed to be reclen bytes long.
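For instance (the file name and figures are illustrative only), the dummy data file might
contain:
#include rawdat;reclen=80;header=2r
which reads rawdat as a series of fixed-length 80-column records, skipping the first two
records (160 characters) of the file.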
Quantum reads the data file in binary, and ignores any values in the skipped header. Any
non-printable characters in the rest of the data are converted to blanks (these are values
outside the range 32-127 inclusive on Unix and other ASCII-based machines, or outside
the range 160-255 inclusive on machines that use EBCDIC).
Sometimes a Quantum program contains a set of statements which are similar but not
identical. Quantum allows the set to be filed with symbolic parameters for the varying
parts. You may then define the actual values of these parameters each time the Include
file is called up.
Symbolic parameters are frequently used when you have a series of questions all having
the same response list, with the only difference being the columns or codes with which
the answers are coded. You can either write out the axes specifications separately with a
different column number each time, or you can put the axis into an Include file and
replace the column number and/or code with symbolic parameters. You can then call up
the file as many times as necessary, defining a new value for the symbolic parameter each
time.
When parameters are defined for files included in this way, the parameter value refers
only to statements within that file. Statements in the main file need their own symbolic
parameters and definitions. Global definitions may be assigned to parameters using the
#def statement.
Quick Reference
To define a column symbolic parameter, type:
col(letter)=column_number
on the #include statement, and refer to columns within the Include file with the notation:
cann
where a is any single upper- or lower-case letter, except C, N, T, U, X or any other letter
that is the name of a user-defined variable. nn is any integer number. C, T and X may not
be used because they are the names of Quantum variables – if you write ct10, this is an
error because Quantum will think you mean c(t10) to substitute the value of t10 – while
the letters N and U are invalid because they represent the operators ‘not equal to’ and
‘unequal to’ – if you write cU30, this is an error because Quantum will think you are
trying to check whether the value in a column field is unequal to 30, but have omitted the
column numbers.
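As an illustration (the element texts and codes here are hypothetical), suppose we file statements such as the following
n10Base
n01Brand X;c=cb00’1’
n01Brand Y;c=cb01’1’
n01Brand Z;c=cb02’1’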
in the file ifil. b00 is the symbolic parameter which will be replaced by the appropriate
column number each time ifil is used. This is done with the notation:
col(b)=n
on the #include statement. b is the name of the symbolic parameter and n is the value to
be substituted. For instance:
#include ifil;col(b)=256
Now, when the contents of ifil are read, the column numbers will be interpreted accordingly:
b00 is read as column 256, b01 as 257, and so on.
Remember that the value of 256 for column b00 is relevant only to the statements within
ifil. Any statements in the main file using column b00 will need separate definitions for
this column.
Let’s take a practical example. Our respondents were given two detergents marked A and
B to try. After two weeks they were interviewed and asked to say which product they
preferred for a variety of tasks. Our client wants a table showing which product was
preferred for each task.
The questionnaire tells us that each task is coded into a different column, but the codes
defining which product was preferred remain the same throughout: a ’3’ in c134 indicates
that Product A was preferred for washing woollens, while the same code in c135 means
that Product A was preferred for washing cottons. To make this example clearer, here is
the relevant part of the questionnaire:
The row texts and the codes are the same for each product, the only thing that changes is
the column, therefore we can replace it with a symbolic parameter and file the statements
away in a file called ipref, as follows:
n01Noticed a Difference;c=ca00’3/5’
n01 Prefer Product A;c=ca00’3’
n01 Prefer Product B;c=ca00’4’
n01 No Preference;c=ca00’5’
n01Did Not Notice a Difference;c=ca00’2’
n01Only Used One Product (DK Which);c=ca00’6’
n01Only Used Product A;c=ca00’7’
n01Only Used Product B;c=ca00’8’
l prefer
n10Base
n23Washing Woollens;unl1
#include ipref;col(a)=134
n23Washing Cottons;unl1
#include ipref;col(a)=135
n23Washing By Hand;unl1
#include ipref;col(a)=136
.
.
.
Note that when you define a value for a symbolic parameter, it always refers to the
parameter numbered 00. Whenever Quantum reads a00 from ipref, it will substitute 134,
135 or 136 in its place, as defined on the #include statement. If the notation were a06
instead, Quantum would replace it with 140 (or 141 or 142), since a06 is six more than
a00.
In our next example, we have a series of unaided questions asking respondents to name
products they are aware of and then to say for which products they have seen or heard
advertising. For each question we record the first response separately from any others.
We want to set up two axes, one for first mentioned in brand awareness (aware1) and the
other for first mentioned in advertising awareness (aware2).
Each brand in each of the two categories is coded into a separate column. If the column
contains a ’1’, the product was mentioned first, if it contains a ’2’, the product was not
named first. The questionnaire looks like this:
Both axes use the same brand list, so in our program file we would simply write:
l aware1
#include blist;col(a)=124
l aware2
#include blist;col(a)=148
n10Base
n01Brand A;c=ca00’12’
n01Brand B;c=ca01’12’
n01Brand C;c=ca02’12’
Notice that the number of the symbolic parameter is increased by 1 for each brand,
because each brand has its own column. In aware1, Brand A is in c124, Brand B is in
c125, Brand C is in c126, and so on, whereas when we call up blist for aware2, Brand A
is read from c148, Brand B from c149, and so on.
More than one symbolic parameter of the same sort may be used on a line, as long as the
names of the parameters are different. Suppose we wanted to set up an Include file for
rows which show whether a respondent mentioned a brand first for both questions. The
condition is an ‘and’ condition requiring two columns to be named. We can either use the
same symbolic parameter and remember to increment the second by the difference in
column numbers between the two questions:
n01Brand A;c=ca00’1’.and.ca25’1’
or we can use two different parameters and have each start from 00:
n01Brand A;c=ca00’1’.and.cb00’1’
#include bfil1;col(a)=124;col(b)=148
In all our examples we have used n01 statements. Symbolic parameters may also be used
on col and val statements.
☞ For examples of using column symbolic parameters, see the section entitled "Grid
axes".
Quick Reference
To define a symbolic parameter for codes, type:
punch(letter)=’code’
and refer to it in the Include file as:
var_name’letter’
Codes are assigned symbolic parameters in much the same way as columns, except that
the notation is:
cn’a’
where n is any whole number and ’a’ is any single letter in upper or lower case.
We might have a set of questions asking which brands of cat food were bought on various
visits to the shops. Each brand has a different code, but all information about any one visit
is stored in the same column. However, we need to set up axes based on the brand bought
rather than the visit when it was bought. Thus we might have an axis as follows:
l brda
n10Base
n01First Visit;c=c134’1’
n01Second Visit;c=c137’1’
n01Third Visit;c=c140’1’
n01Fourth Visit;c=c143’1’
The axes for the other three brands are exactly the same except that codes 2, 3 and 4 are
used instead of a ’1’. Therefore it makes sense to use an Include file:
n10Base
n01First Visit;c=c134’p’
n01Second Visit;c=c137’p’
n01Third Visit;c=c140’p’
n01Fourth Visit;c=c143’p’
l brda
#include brds;punch(p)=’1’
l brdb
#include brds;punch(p)=’2’
For each run, you may use up to 31 different symbolic parameters for codes (including
blanks). Also, all definitions are only relevant to the statements within the Include file.
✎ If your code substitution appears not to be working, check that you have not
accidentally typed an extra character with the symbolic parameter inside the quotes.
Quantum only performs the substitution if the symbolic parameter is the only code
inside the quotes. If there is any other code inside the single quotes with the
symbolic parameter, Quantum treats the symbolic parameter as a multicode. For
example, if you type:
l ax1
*include brds;punch(p)=’1’
n01Brand A;c=c132’5p’
Quantum will treat the ’p’ as a multicode of ’7–&’ and will include in the element
any record in which column 132 contains any of the codes 5, 7, – or &. Without the
’5’, the specification would refer only to records with a code 1 in c132.
If you type any additional characters inside the quotes, Quantum converts the letter to its
corresponding codes and does not perform the substitution you require.
Quick Reference
To define a symbolic parameter for text, type:
name=text
and refer to it in the Include file as:
&name
A text symbolic parameter is referred to by its name, one or more characters, preceded by an
ampersand. For example:
ttl&txt
n23All Traveling by &txt
Symbol names must not contain blanks and, if more than one symbol is used in a program,
the names must be distinguishable within the length of the shortest one. That means that if
there are two symbols, one of three characters and one of four, the first three characters of
the two names must differ in at least one position. Similarly, if both symbols are the same
length, only the last character need differ for them to be unique.
Now that we know this, we can write the example used above in the section entitled
Symbolic Parameters for Columns even more efficiently, since we can use a symbolic
parameter for the texts as well. Instead of writing out the heading on the n23 statement
we can represent it with the parameter ‘wash’:
n23&wash;unl1
n03
n01Noticed a Difference;c=ca00’3/5’
n01 Prefer Product A;c=ca00’3’
n01 Prefer Product B;c=ca00’4’
n01 No Preference;c=ca00’5’
n01Did Not Notice a Difference;c=ca00’2’
n01Only Used One Product (DK Which);c=ca00’6’
n01Only Used Product A;c=ca00’7’
n01Only Used Product B;c=ca00’8’
As you can see, the text itself may be longer than the parameter which represents it. Just
make sure, though, that when the text is substituted, it does not make the whole line
longer than the 200 character maximum allowed.
Symbolic parameters may also appear in Include files made up of tabulation section
statements. In this example the Include file uses a code parameter p for the region column
and a text parameter area for the region name:
flt;c=c121’p’
ttlBase: Respondents Living in &area
tab brand demo
ttcBrand Bought Most Often
tab prefer age
ttcBrand Preferred
tab demo bk01
ttcDemographics
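The file would then be called up once per region, with statements such as these (the file name regtabs and the region codes are illustrative):
#include regtabs;punch(p)=’1’;area=London
#include regtabs;punch(p)=’2’;area=Scotland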
Once again, the definitions for the symbolic parameters refer to the statements in the file
included, but not to the rest of the program. A maximum of 15 different text symbolic
parameters is allowed per run.
Users are sometimes misled by the col(a) notation into thinking that it is only for
substitutions to do with columns. This is not true. You can use col(a) to define symbolic
parameters for variables too.
To illustrate this, let’s go back to the product test we used for explaining symbolic
parameters for columns. In the example that dealt with product and advertising
awareness we explained how you could use two symbolic parameters on the same line to
test whether a product was named first at both the existence and the advertising
questions. To do this you could either use the same parameter with different values or two
different parameters, as shown earlier.
An alternative is to create an array of named variables in the edit section that are set to 1
if the respondent mentions the same brand first at both questions. You can then replace
the ‘and’ condition on each element with a reference to one variable containing all the
information you need.
int first 3s
ed
t2 = 0
do 10 t1 = 124,126
t2 = t2+1
first(t2) = c(t1)’1’ .and. c(t1+24)’1’
10 continue
end
tab faware bk01
l faware
ttlSame brand mentioned first for existence and advertising
#include bfil1;col(a)=1
The include file, bfil1, contains the elements of the axis as follows:
n01Brand A;c=first(a00).eq.1
n01Brand B;c=first(a01).eq.1
n01Brand C;c=first(a02).eq.1
At the end of the edit first(1) will be set to 1 if the respondent mentioned Brand A first at
both the existence and advertising questions. It will be set to zero if Brand A was
mentioned first at one question only or was not mentioned at all. first(2) and first(3)
contain similar information for brands B and C respectively.
Quick Reference
To assign global values to symbolic parameters, type:
#def [col(a)=cc];[punch(b)=’p’];[txt=text]
Occasionally a group of includes will have the same column, code or text parameter in
common, even though they may differ in all other aspects. You may either enter all values
separately on each include, in which case they refer only to the parameters in the file
included, or you may save yourself time by defining the common parameter values on a
#def statement so they will remain operative until overwritten by another #def.
#def [col(a)=cc];[punch(b)=’p’];[txt=text]
where col, punch and txt are optional, depending upon the value to be defined, ‘a’ and ‘b’
are the parameters whose values are to be defined, ‘cc’ is a column number, ‘p’ is a code
and ‘text’ is a string of text. (#def also has an equivalent *def which you may use if you
prefer. We’ll use #def in our examples).
For example, the statement:
#def col(b)=157
assigns the value 157 to the column parameter b in all the files included after this point,
until it is overwritten by another #def.
Grid tables are a special sort of table whose specification invariably uses symbolic
parameters. Normally, as in all the tables we have discussed so far, the condition
determining whether or not a respondent is eligible for inclusion in a cell is the same for
all cells in a given row. In the axis sex, for instance, all men are included in the first row
and all women in the second. There is usually one condition for all cells in a given column
as well.
With grid tables, the conditions for a row or column vary from cell to cell, so we use
symbolic parameters to refer to the data columns and/or codes for each cell. Values are
then assigned to these parameters via n01 statements in the axis.
Grid tables are easily recognizable in a questionnaire because in many cases the
questionnaire contains a chart on which the interviewer is to record the responses, with
the rows of the chart being the rows required on the table, and the columns being the same
as the columns required in the table. In your output you will want to reproduce this chart
with the only difference being that on the questionnaire you see the codes representing
each answer whereas on your table you will see the number of people giving each
response. Typical questions are ones dealing with rating scales.
Grid axes
Grid tables come from grid axes which comprise four items:
1. The l statement which names the axis and defines its conditions.
2. n01 statements to define the column headings and assign values to the symbolic
parameters.
3. The side statement to separate the column definitions from the row specifications.
4. Statements after side (col, val or n statements) which define the rows of the table using
the symbolic parameters.
Let’s work through an example, starting with the chart on the questionnaire. There are
four products to be rated on a scale of 1 (Excellent) to 5 (Abysmal); in the questionnaire
we see:
Before deciding how to write this in Quantum, let’s look at the conditions for each cell
of this table and consider how they differ from those in an ordinary table. In a normal
table the axes used for the rows and columns are independent of each other. In this grid
table the condition for the row depends on the condition for the columns, and vice versa.
Then, following the instructions laid out for grid axes, we write:
l q10
n01Brand A;col(a)=134
n01Brand B;col(a)=135
n01Brand C;col(a)=136
n01Brand D;col(a)=137
side
col a00;Base;Excellent;Very Good;Satisfactory;Poor;Abysmal
The texts on the first four n01s provide us with the column headings and the values to be
assigned to the symbolic parameter for each column. If we had wanted to lay out the
column headings in a particular way, we could have written them on g statements
immediately before the side line.
The col statement names the symbolic parameter to be used and lists the row texts
required. Since no codes have been defined it is assumed that the first response is code
’1’, the second is a ’2’, and so on.
This axis produces the table shown above, except that the numbers 1 to 5 in each column
are replaced by counts of respondents giving each rating.
Suppose, now, that each rating is coded into a different column, so that Brand A is always
a ’1’, Brand B is always a ’2’ and a rating of Satisfactory for Brand C would be c136’2’.
The grid on the questionnaire is:
Nevertheless, we still want to create the same table as before, with brands as columns.
This time we would write:
l q11
n01Brand A;punch(p)=’1’
n01Brand B;punch(p)=’2’
n01Brand C;punch(p)=’3’
n01Brand D;punch(p)=’4’
side
ttlQ11: Overall Brand Rating
ttlBase: All Respondents
n01Excellent;c=c134’p’
n01Very Good;c=c135’p’
n01Satisfactory;c=c136’p’
n01Poor;c=c137’p’
n01Abysmal;c=c138’p’
By comparing these two examples we can see when to use symbolic parameters for
columns in a grid axis, and when to use them for codes. In the first example, the column
numbers differed from column to column of the table, so we defined them with column
symbolic parameters; in the second, the codes differed between columns, so we defined
them with code symbolic parameters. In short, look at the columns of the table to see what
alters and define the changing item with parameters.
In more complicated tables it is possible that both columns and codes will change from
column to column, especially if the grid refers only to specific elements of a response list.
Once again let’s work through an example to explain this.
In a survey on washing powders respondents have been questioned to test their awareness
of various brands on the market and to check what sort of advertising they have seen or
heard for various brands. The manufacturer who commissioned the survey is now
interested in finding out more about people’s attitudes to his particular products and asks
for a table showing the number of people who have already purchased one of his products
and also how likely people would be to buy them. c(212,213) tell us whether the
respondent buys the product, c(317,320) contain the number of times each brand was
bought, and c(321,324) show how likely the respondent will be to buy the brand in the
future.
Here is the axis which will produce the required table – we will discuss it presently:
l q20
n01Washo;col(a)=317;col(b)=212;punch(p)=’6’
n01Suds;col(a)=318;col(b)=212;punch(p)=’7’
n01Gleam;col(a)=319;col(b)=213;punch(p)=’3’
n01Sparkle;col(a)=320;col(b)=213;punch(p)=’5’
side
ttlQ20 – Number of Times Purchased
ttlQ21 – Purchase Intent
n00;c=cb00’p’
n10Base – All Who Purchased Product
col a00;hd=Number of Times Purchased;%unl1;One;Two;
+Three-Four;Five-Six;Seven-Ten;More Than Ten;dk=rej
n00
col a04;hd=Purchase Intent;%unl1;Base=All Respondents;
+Definitely Will Buy;Probably Will Buy;Might or Might Not Buy;
+Probably Will Not Buy;Definitely Will Not Buy;DK/NA/Refused
Let’s start by looking at the row definitions after side. The first n statement is an n00 –
a filter – which sets up the base for the number of people buying each of the products. It
uses both column and code symbolic parameters which are defined on the column n01s
above. The filter for all buying Washo is c212’6’, while all buying Sparkle are collected
with c213’5’.
The first col statement reads each of columns 317 to 320 to see how many times the
respondent bought each of the four products. If c317’3’ it means that the respondent
bought Washo three or four times; a ’6’ in c320 tells us that he bought Sparkle more than
ten times.
The blank n00 switches off the condition on the first n00 and all respondents now become
eligible for inclusion in the rest of the table. This time we re-use one of our symbolic
parameters, incremented by four (a04). The second col statement therefore reads data
from columns 321 for Washo, 322 for Suds, 323 for Gleam and 324 for Sparkle. If c322’4’
we know that the respondent probably will not buy Suds.
The second part of this example (from the blank n00 onwards) assumes that respondents
were only asked about purchase intent if they said they were aware of the product and
that no other respondent has a code in these columns. If this was not the case, we would
put a condition on the n00 to exclude anyone who was not aware of one of the key brands.
(Part of the table which this axis creates is shown below; the columns are the four brands named on the n01 statements.)

                           Washo    Suds   Gleam  Sparkle
Number of Times Purchased
One                            6      12      26      20
                              5%      8%     24%     11%
Two                           23       3      11      30
                             18%      2%     10%     17%
Three-Four                    12      34       5      26
                              9%     23%      5%     14%
Five-Six                      17      39       9      51
                             13%     26%      8%     28%
Seven-Ten                     15      28      23      16
                             12%     19%     21%      9%
More-Than-Ten                 42      25      23      20
                             32%     17%     21%     11%
DK                            15       9      13      17
                             12%      6%     12%      9%
Purchase Intent
DK/NA/Refused                  3       6       0       0
                              2%      3%      0%      0%
Even though this example looks complicated, it is by no means uncommon. Its main
attraction is that it enables you to produce a grid table from several questions giving the
maximum amount of input in the minimum number of statements.
You do not use *def with grids themselves, but there may be times when your job
contains *def statements and grids. When this happens, you need to be particularly
careful about the letters you use for symbolic parameters. If you use the same letters for
symbolic parameters on a *def statement as in a grid axis you will find that either your
job will fail at the compile stage or it will produce incorrect tables.
• If you use the same letter for a column symbolic parameter on *def and in a grid axis,
the specification from *def overrides the specification in the grid. If the two
parameters refer to different columns your grid table will contain incorrect data for
that column. For example, if you specify *def col(a)=145 and then write
n01text;col(a)=167 in the grid axis, Quantum will create the element using
column 145.
• With punch(p)=, the specification from *def does not carry forward to grid axes. If
you specify punch(p) on *def but not in a grid axis your run will fail at the compile
stage with the message ’grid axis with no substitution’, even if *def declares the
punch letter you have used in the grid. This also happens if you specify punch(p) with
*def and in a grid axis, and you use the same letter to represent the punch, since
Quantum ignores the specification in the grid.
With punches you must use different letters in the grid and on *def.
• Although grids do not normally use text symbolic parameters, if a grid contains the
notation &txt, where txt is declared on a *def statement, substitution will take place.
For example, specifying *def txt=Brand A and then writing n01&txt in a grid
axis will generate an element in the grid labeled Brand A.
All options valid on n statements may be used in grid axes, with the exception of rej=.
Incs in grid axes can be specified in two ways. The first is to use the standard notation,
for example:
inc=c(a00,a01)
The second is to define it with a number, and then to call it up using that number. On the
column n01 statement the column whose contents are to be used as the increment is
identified with the option:
inc(n)=arithmetic_expression
and the increment is then called up in the side section of the axis by its number on an n25
statement:
n25;inc=inc(n)
To illustrate this, we will go back to our rating question discussed at the beginning of the
section entitled Grid Axes earlier in this chapter. Respondents have been asked to rate
four products on a scale of 1(Excellent) to 5 (Abysmal), and we want to calculate a mean
score. The axis we would write to do this is as follows:
l rates
n01Brand A;col(a)=134;inc(1)=c134
n01Brand B;col(a)=135;inc(1)=c135
n01Brand C;col(a)=136;inc(1)=c136
n01Brand D;col(a)=137;inc(1)=c137
side
col a00;Base;Excellent;Very Good;Satisfactory;Poor;Abysmal
n25;inc=inc(1)
n12Mean Score
Quick Reference
To create a table from a grid axis, type:
tab axisname grid [;options]
to create rows and columns as they appear in the axis definition, or:
tab axisname rgrid [;options]
to rotate it by 90 degrees (i.e., the row definitions become the columns, and the column
definitions become the rows).
Grid tables are the only time that the tab statement may be followed by a single axis name
only. To produce a table in which the rows and columns are as defined in the grid axis,
type:
tab axisname grid [;options]
where axisname is the name of the grid axis and grid is a keyword indicating that this is
a grid table. Since this is so, it follows that you cannot have an axis called grid. Options
may be any of the keywords listed in the section entitled "Options on a, sectbeg, flt and
tab statements" in chapter 16 and also section 21.2.
Normally, a grid table is created using the row and column definitions as they appear in
the axis. However, you may rotate the table by 90 degrees so that the rows in the axis
form the columns of the table, and the columns in the axis form the rows of the table. The
keyword which requests this type of table is rgrid:
tab axisname rgrid [;options]
The axis:
l rating
n01Brand A;col(a)=134
n01Brand B;col(a)=135
n01Brand C;col(a)=136
n01Brand D;col(a)=137
side
col a00;Base;Excellent;Very Good;Satisfactory;Poor;Abysmal
tabulated in the normal way with grid produces a table with brands as the columns and
ratings as the rows. With rgrid it generates a table in which the ratings are the columns
and the brands are the rows.
If there are no g and p statements in the lower section of the axis, the column headings
will be generated in automatic mode.
Note that any options defined with op= on the a or flt statement will be applied to the
table whether or not it is rotated. If you have op=12 (absolutes and column percentages)
specified on the a/flt statement, and you want to use the same axis with grid and rgrid,
you must specify the correct output options for the rgrid table if horizontal percentages
and absolutes are required. For example:
a;dsp;op=12
tab rating grid
tab rating rgrid;op=01
Grid axes in levels jobs work exactly the same as grid axes in ordinary jobs. To create a
table at one level and update its cells at a lower level, define the table creation level with
anlev= on both the tab and l statement. Define the update level with uplev= on the l
statement. For example:
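A minimal sketch of the statements involved (the axis name flavor is illustrative; the level names person and shop follow the description below):
tab flavor grid;anlev=person
l flavor;anlev=person;uplev=shop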
This table is created at person level and updated at shop level. It tells you how many
people bought each flavor of yogurt in each shop.
You can create the same table using anlev= and celllev= on the tab statement and anlev=
by itself on the l statement:
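Again as a sketch, with the same illustrative names:
tab flavor grid;anlev=shop;celllev=person
l flavor;anlev=shop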
This table is created at shop level, but its cells are updated once per person once all data
for that person has been read.
Weighted grids
Grid tables can be weighted. Normally, they’ll be part of a weighted run and the
weighting characteristics will be defined using ordinary axes such as age, sex or region.
If you need to define a weighting matrix specifically for a grid you must do so using a
dummy axis since Quantum does not accept grid axes on wm statements.
The dummy axis must have the same number of elements as there are cells in the grid
axis. You can create it simply by using a col statement that refers to a blank column. For
example, if the grid axis is:
l rating
n01Brand A;col(a)=134
n01Brand B;col(a)=135
side
col a00;Base;Excellent;Satisfactory;Very bad
the dummy axis could be:
l dumgrid
col 90;1;2;3;4;5;6
You then create a weighting matrix for the grid but instead of using the name of the grid
axis on the wm statement you use the name of the dummy axis:
wm9 dumgrid;factor;1.25;1.16;1.30;0.98;1.01;0.92
To weight the grid, name the matrix on the tab statement as you would for any other table:
When the table is created, respondents who rated brand A as excellent will be given a
weight of 1.25, respondents who weighted brand A as satisfactory will be given a weight
of 1.16, and so on.
The specification for the sample table uses n00 statements in the side section of the axis
to define additional filters for groups of row elements. Columns in grid axes may be
filtered using c= as you do in ordinary axes. The example below is of a product test in
which filtering of columns is quite common.
In our survey respondents are asked to test two products, A and B. One group tests them
in the order AB, while the other tries B then A, but neither group knows which product
is which. When respondents are asked about these products, they are simply referred to
as ‘the product tried first’ or ‘the product tried second’. The interviewer answers the
question about the order in which the products were tried.
Suppose the client wants a table to show how likely people would be to buy each product,
and whether this varies according to the order in which the products were tried. The
finished table is shown in Figure 26.2.
Figure 26.2, Table 10: Q9. Purchase Intent (absolutes/col percents)
l gax01
n01A;col(a)=146;c=c123’1’
n01B;col(a)=147;c=c123’1’
n01A;col(a)=147;c=c123’2’
n01B;col(a)=146;c=c123’2’
g Tried A First Tried B First
g A B A B
g------- ------- ------- -------
p x x x x
side
ttlQ9. Purchase Intent
col a00;Definitely would buy;Probably would buy;
+Might/might not buy;Probably would not buy;
+Definitely would not buy
The first cell of the leftmost column contains people who have the condition
c123’1’.and.c146’1’ – that is, people who tried brand A first and who would definitely
buy it. The bottom cell of the rightmost column will contain respondents who satisfy the
condition c=c123’2’.and.c146’5’ – that is, those who tried brand B first and would
definitely not buy it.
Quick Reference
To have Quanvert export a grid axis in SAS format on a row-by-row basis rather than a
column-by-column basis, place the keyword:
byrows
on the l statement for that axis. This keyword has no effect on the Quantum tables.
When a Quanvert user exports a grid axis in SAS format, Quanvert exports it on a
column-by-column basis so that each cell of each column becomes a separate SAS
variable. For example, if the grid axis is defined in Quantum as:
l rating1
n01Brand A;col(a)=123
n01Brand B;col(a)=124
n01Brand C;col(a)=125
side
col a00;Base;Excellent;Very good;Satisfactory;Poor
Quanvert creates one SAS variable per brand. However, if the grid is specified as:
l rating2
n01Excellent;punch(p)=’1’
n01Very good;punch(p)=’2’
n01Satisfactory;punch(p)=’3’
n01Poor;punch(p)=’4’
side
n10Base
n01Brand A;c=c123’p’
n01Brand B;c=c124’p’
n01Brand C;c=c125’p’
each variable would contain data from more than one column of the data (it would be
multicoded) so Quanvert creates one variable for each cell of the column instead. In this
example the SAS data will contain one variable for each combination of brand and rating,
for example, Brand A Excellent, Brand A Very good, and so on.
With this example, it is not a problem to write the grid in the first format so that the SAS
data contains one variable per brand which is usually what is required. However, not all
grids are as simple, so Quantum allows database administrators to flag grid axes that
need to be exported in Quanvert on a row-by-row basis rather than in the standard
column-by-column way. The keyword that does this is byrows and it is used on the l
statement. It has no effect in Quantum.
Using the examples shown here, if you did not wish to rewrite the rating2 axis as it was
shown in the example called rating1, you could simply add byrows to the l statement of
rating2 to achieve the same set of SAS variables in Quanvert:
l rating2;byrows
n01Excellent;punch(p)=’1’
n01Very good;punch(p)=’2’
n01Satisfactory;punch(p)=’3’
n01Poor;punch(p)=’4’
side
n10Base
n01Brand A;c=c123’p’
n01Brand B;c=c124’p’
n01Brand C;c=c125’p’
Row and table manipulation are two of Quantum’s more advanced features. They enable
you to create new rows and tables using whole tables or parts of previously created tables.
A previously created table is one whose tab statement appears before the tab statement
for the current table in this run, or one which appears anywhere in a previous run. You
cannot manipulate tables which have yet to be created, nor may you manipulate tables of
more than two dimensions.
Quick Reference
m[element_text]; ex=manip_expression[;options]
Row manipulation is the process whereby a row is created from other rows – for
example, by dividing one row by another or adding two or more rows together. These
facilities may also be applied to the columns of a table to create new columns from
existing ones. However, if a table contains both row and column manipulation, the row
manipulation will be done before the column manipulation. Throughout our explanations
of these facilities we will refer to row manipulation only.
Manipulation may be done to an existing row (i.e., one produced by an n01 etc.) using
the keyword ex=, or a new row may be generated with an m statement:
m[Text];ex=expression[;options]
where Text is the text to be printed in the table and ex=expression determines how the
row or column is to be created. Options define more specifically how the row is to be
printed. All options valid on an n01, except c=, inc= and wm= are valid with m (see
section 18.8).
Expressions on M statements
The operators +, −, * and / are straightforward, but the functions min(), max(), sqrt() and
exp() require more detailed explanation.
min(ex1,ex2, ... ) This returns the lowest value of the expressions within the
parentheses. Here, an expression is a number, a reference to
another row in the table, or any of min(), max(), sqrt() or exp()
themselves. For example, if Row1 and Row2 are the names by
which those rows are identified, the expression:
ex=min(Row1,Row2)
will compare the values in rows 1 and 2 for each column separately
and print whichever is the smaller in the corresponding column in
the manipulated row. Suppose we have the following:
Row 1 10 15 9 10
Row 2 6 20 9 1
then our new row will read:
Row 3 6 15 9 1
max(ex1,ex2, ... ) Max returns the highest value of a list of expressions. In all other
respects it is the same as min above. If we write:
max(Row1,Row6)
the new row will contain, for each column, the larger of the values in
rows 1 and 6.
sqrt(ex1) This returns the square root of the expression inside the parentheses.
For example, if the smaller of the values in rows 1 and 6 is 16, the
expression:
sqrt(min(Row1,Row6))
will yield the value 4. This is the square root of 16 which is the
smaller of the values in the two rows.
exp(ex1,int) This expression only has two items in the parentheses: the first is
an arithmetic expression and the second is a whole number which
is the power to which ‘ex1’ is raised (that is, ex1 to the power int).
For instance:
exp(Row6,4)
raises the values in row 6 to the power of 4. If the value in this row
is 15, the expression yields 15 to the power 4, which is 50625.
The operands in an ex= expression may be:
a) A number, or an arithmetic combination of numbers, for example:
ex=10+4–2
b) A vector of numbers enclosed in braces, one value per column. If our breakdown axis
has five columns and we want to put a new value into each cell we could write:
mVector Row;ex={10.0,6.2,–8.3,15.6,–3.5}
c) A numeric element of the current axis which may be referenced in any of four ways
described below.
i) text, or as many characters of that text as make the element unique. All text must
be entered exactly as it appears on the element itself, and must be enclosed in
single quotes. Let’s use the axis region as an example; it has six elements:
l region
col 117;Base;Central London;Outer London;
+England excluding London;Scotland;Wales
and we want to create a row showing the total number of people living in England
including London. Since this is a simple axis we could use an n01 with the filter
c=c117’1/3’, but if we were to use row manipulation instead we would write:
mEngland incl. London;ex=’Central’+’Outer’+’England’
Notice that we have only used the first word of each row text since these are all
unique – in fact, just the first letter of each text would be sufficient because they
are also unique. Notice also, that the words are entered exactly as they appear in
the table and in the original axis specification. If this was not so, the rows would
not be recognized. If we look at the table again, we will see the result of our
manipulation.
ii) The second way of referring to a row is to give it an identifier (Id) with the id= option
and then to use that Id on the m statement. Rewriting the axis region with Ids, we have:
l region
col 117;Base;Central London;%id=R1;Outer London;%id=R2;
+England Excluding London;%id=R3
mEngland incl. London;ex=R1+R2+R3
col 117;Scotland=’4’;Wales=’5’
The base row and the last two rows are not included in any manipulation so there
is no need to give them Ids. The table produced by this axis is the same as that
shown above.
iii) A third way of referring to a row is by its overall position in the axis. This is
calculated by starting with the first element in the axis and counting down until the
row in question is reached – all n statements, including n03 and n09, all elements
on col and val statements and all intermediate m statements count as one element
each. The only exceptions are n00 which is ignored and n25 which creates three
unprinted rows.
☞ For information about manipulating n25 elements, see the section entitled
"Manipulating the components of an n25" later in this chapter.
When overall row positions are written on the m statement, they must be preceded
by a hash/pound sign (#). So, if we rewrite the axis region once again we will have:
l region
col 117;Base;Central London;Outer London;England excl. London
mEngland incl. London;ex=#2+#3+#4
col 117;Scotland=’4’;Wales=’5’
iv) The fourth method of picking up rows for manipulation is to use their relative
position in the axis. This is obtained by counting backwards from the m statement
to the row to be manipulated. All relative positions must be preceded by the ‘at’
sign (@). @0 and @ both refer to the current line; that is, they refer to the m
statement itself.
Therefore, we could create our sum of people living in England including London
by writing:
mEngland incl. London;ex=@1+@2+@3
Any of these four options is correct; just use the one which suits you best at the time. You
can, of course, mix the various types in one statement if you wish.
Although an n25 statement does not print any rows in the table, it does create three rows
which are used as part of statistical calculations such as means and standard deviations.
☞ For an explanation of these values in Quantum terms and further information about
n25 statements, read the section entitled "The mean, standard deviation, standard
error and error variance" in chapter 19.
To refer to any of these figures on an m statement you must either give the n25 an
identifier or refer to it by its absolute position in the axis. The individual elements of the
n25 can then be called up as:
where ‘pos’ is the absolute position of the n25 in the axis, and ‘id’ is an identifier assigned
to the n25 with id=.
This example shows some of the more usual tasks you might accomplish with row
manipulation. The table was created using the following axis:
l manax
col 109;Base;Single;%id=r1;Married;%id=r2
mSingle / Married;dec=4;ex=r1 / r2
mMarried / Single;ex=#3 / #2
mSingle + Married;ex=@4 + @3
n03
n00;c=c125’1’
n01People Who Bought Bread
n01Number of Loaves Bought;inc=c(250,251)
mLoaves Bought Per Person;ex=#9 / @2
n01Loaves Bought Last Month;inc=c(132,133)
mLoaves Bought Per Person Last Month;ex=@1 / ’People’
Quantum does not apply spechar, nz, nzrow and nzcol to elements created using
manipulation. Also, if either nzcol or nzrow is in effect and an axis contains all-zero
elements, those elements are not suppressed if the axis is tabulated against one containing
manipulated elements.
To have Quantum treat manipulated elements the same as other elements with regard to
special characters for zero or near-zero values and suppression of all zero elements, place
the keyword manipz on the a, sectbeg, flt or tab statement.
You may revert to the default method of ignoring manipulated elements by placing a
nomanipz option at the point at which you wish this to happen. You may switch methods
many times in a run if you wish.
Here is the same table produced with and without manipz.
The tables produced without manipz are as follows. The Third and Fourth elements
created with n01s have been suppressed but the Third+Fourth element that is created by
manipulating those elements is not.
Base 89 43 46
First 22 15 7
Second 67 28 39
First + Second 89 43 46
Third + Fourth 0 0 0
The same table produced with manipz suppresses not only the all-zero n01s but also the
all-zero manipulation element:
Base 89 43 46
First 22 15 7
Second 67 28 39
First + Second 89 43 46
Quick Reference
To manipulate the figures in an existing element, include the option:
ex=manip_expression
All the examples so far have used m statements. However, ex= may also be used on n
statements to manipulate the figures in that row prior to printing. For example, the row
showing the number of loaves bought per person in the table above could have been
specified with an ex= option on an n statement instead of with a separate m statement.
In most cases you will see no difference in a table between a row created with m and the
same row created using n01;ex=. The difference is an invisible one to do with the
efficiency of your code.
An m statement performs whatever calculation is specified with its ex=. When Quantum
reads an n01 with an ex=, it ignores the ex= at first and calculates cell counts based on
the data and any inc= specifications. Once these calculations are finished and the basic
cell counts are available, Quantum applies the ex= specification.
So, which method should you use? If the values that are used to create the m row need
to appear in the table as rows in their own right, as in the example on the previous page,
then an m is more efficient. If an ex= expression on an n01 uses values that need to be
calculated by that statement, and those values do not need to appear in the table, then
using n01;ex= is more efficient. In the example, using this approach would only have
been better if we had not wanted to see the row showing the number of loaves bought.
The averages created by an n07 are simply the average of values appearing in a row or
column (i.e., sum of values divided by number of values). To create a row in which the
average is the value in one row divided by the value in another row, you will need to write
an m statement with the appropriate expression.
Suppose a tour operator has conducted a survey of the hotels it uses in various places in
an effort to improve the service it offers to holiday-makers. As part of this survey, hotel
managers are asked how many rooms and beds are available in their hotel each month
and, of those, the number actually occupied each month. The tour operator wants a table
summarizing hotel usage for a particular month by showing the average number of rooms
and beds occupied during that month. Additionally, all averages are to be shown as
percentages rather than as absolute values.
This average can be calculated by dividing the number of rooms occupied by the number
available, and the number of beds occupied by the number available, using row
manipulation. If we wanted to break these figures down according to the regions in which
the hotels are situated, we would have two axes as follows:
l avers
n10Total Hotels
n01Rooms Available;inc=c(15,18)
n01Rooms Used;inc=c(25,28)
mAverage % Room Occupancy;ex=@1 * 100 / @2
n01Beds Available;inc=c(35,38)
n01Beds Occupied;inc=c(45,48)
mAverage % Bed Occupancy;ex=@1 * 100 / @2
l region
col 10;Base;North;North East;Midlands;East Anglia;
+South West;South East;South;London
If our first hotel has 210 rooms available of which 179 were occupied, the average room
occupancy is 85%, ignoring all decimal places.
Quick Reference
ex manip_expression
New tables may be generated by manipulating tables created previously in the current run
or anywhere in any other run. For instance, in a countrywide survey, the data may have
been collected on a regional basis, with each region having a separate directory. We may
wish to create some tables which refer to the country as a whole rather than to a particular
region. With table manipulation, we could create these tables separately and then add
them together to create a new table for the whole country.
Tables are manipulated with a statement of the form:
ex expression
An ex statement by itself means nothing: it must always follow a tab statement defining
the basic table to be manipulated. The unmanipulated table created by the tab statement
is not printed as part of the output. For instance:
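(A sketch of such a pair of statements, using the T@ notation for the current table that is explained later in this section:)
tab ax01 bk01
ex T@ * 2.0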
These statements create the table ax01 by bk01, store the cell values and then multiply
them by 2.0 before writing them to the tables file.
The operators and functions that may be used are exactly the same as for row
manipulation, except that the expressions enclosed in the parentheses with min(), max(),
sqrt() and exp() refer to whole tables rather than to rows.
For example, an expression that multiplies the table created by the previous tab statement
by 1.45 generates a new table in which every cell is 1.45 times its original value.
Vectors may be used to replace numbers in the table created by the previous tab statement
or to define constants by which the cells in that table are to be incremented or
decremented prior to printing. When supplying raw numbers all you have to do is type in
the numbers separated by commas and enclosed in braces, thus:
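(A sketch of the form such statements take; only the first two values, referred to below, are shown:)
tab ax01 bk01
ex {80.0,32.0, ... }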
This creates the table ax01 by bk01 but instead of showing the cell counts read from the
data (see example above) it shows the values specified by the ex statement. Hence, the
table base will be 80.0 instead of 70, the base for Col1 will be 32, and so on. Any cells
for which values have not been given will be shown as zero.
On the other hand, when vectors define incremental values any cell for which any
incremental value has not been given will retain its original value. For example, we could
create a table of ax01 by bk01 with a base of 80 respondents as before, except that instead
of entering the exact values for each cell we would enter the values by which the original
totals are to be incremented:
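(Again a sketch, showing just the first two increments referred to below:)
tab ax01 bk01
ex +{10.0,2.0, ... }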
As you can see, the difference between the original base (70) and the manipulated base
(80.0) is 10.0, the original base for Col1 (30) must be incremented by 2.0 to reach the
required figure of 32.0, and so on.
In this example we have preceded the braces with a plus sign, but any of the operators −,
* and / are equally valid. Note also that long vector lists may be spread over more than
one line by ending the first line at a comma and preceding the vector at the start of the
second line with a ++ continuation.
Quick Reference
Tquantum_table_number
Tid_name
T#absolute_position
T@relative_position
As with individual rows, a table may be referred to in several ways. The first is to use the
Id which is assigned automatically by Quantum: the first table in the run is 1, the second
is 2, and so on. These numbers are printed to the right of each table-creating statement in
the compilation listing. For example, the first table in a run is table number 1, and the
statement:
ex T001 * 10
produces a table in which each cell is ten times its original value. When automatic
Ids are used, they must be preceded by the letter T in upper or lower case, as shown in
our example. The leading zeros in the table Id number are optional – we could have
written ex t1 * 10.
Alternatively, you can make up your own identifiers using the id= option on the tab
statement. This is a code of up to six numbers and/or letters starting with a letter, for
example:
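(The Id mytab1 below is simply an illustrative name:)
tab ax01 bk01;id=mytab1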
Note that the automatic identifier is generated for each tab statement and each axis on an
and statement, but not for add, div, sid and und statements. Also, when an id= appears
on a tab statement, it interrupts the automatic count so that the next table without a
user-specified Id will carry on where the last one left off:
✎ You are advised not to use Ids of the form Tn (e.g., T15, T29) as these can easily be
confused with Quantum’s automatic Ids.
The second method is to refer to the table’s absolute position in the run, preceded by the
characters T# (the T must be upper case). This is found by starting at the first tab
statement and counting down until the required table is reached – all tab, ex, add, sid, und
and div statements count as one table, while ands cause the count to be incremented by
the number of axes they contain. For example, to refer to the third table we would say:
ex T#3 / 10
Here the cells in the new table will be created by dividing the numbers in the third table
by 10. All tables mentioned in this manner must come before the current table - you
cannot be manipulating table 12 when you are only on table 8.
Finally, tables may be called up according to their relative positions in the run, preceded
by the characters T@. As with rows the relative position is calculated by counting
backwards from the ex statement to the appropriate tab statement. T@ and T@0 both
mean the current table; that is, the table created by the tab immediately before the ex
statement.
If we had:
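(One way such statements might look, using relative table references:)
tab ax01 bk01
tab ax02 bk02
ex T@1 * T@1 / T@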
we would have two tables: the first which was the table ‘ax01 by bk01’ and the second
which is the first table squared and divided by the table ‘ax02 by bk02’.
Quick Reference
Rrun_id/table_manip_expression
or:
Rrun_id>table_manip_expression
In order for tables from previous runs to be manipulated, the numbers in those tables need
to be saved somewhere. Quantum does this automatically whenever a run produces
tables. Although you need not worry about saving tables, it is nevertheless necessary to
understand a little of this mechanism in order to manipulate your tables correctly.
In a run containing no manipulation at all, Quantum saves the cell values of each table in
a numbers file. You cannot read this file yourself so think of it as a list of numbers
separated by spaces or commas, where the first number belongs in the first cell of the first
table, the second number belongs in the second column of the first row of that table, and
so on.
Now, when a run contains manipulation statements, all ordinary tables are saved in the
numbers file as usual, while both manipulated and unmanipulated figures are saved in the
manipulated numbers file. Again, you cannot read this file so just think of it as a list of
numbers and spaces.
Tables from anywhere in previous runs may be manipulated by preceding the table
specification with the letter R, a run Id of up to six characters and a slash (/). For example,
to multiply the second table in a run called JAN by two, we would write:
tab a1 b1
ex RJAN/T#2 * 2
✎ To use run Ids, you must set up a run definitions file in the same place as your
Quantum program. Each line in this file must contain the run Id and the location of
the run it represents, separated by a space.
☞ For further information on creating this file, see section 33.4.
Sometimes the numbers from previous runs may themselves be the result of some
manipulation, but unless you say otherwise, Quantum assumes that you will be using
unmanipulated figures and will search for the named table in the ordinary numbers file.
To force it to read manipulated figures from the manipulated numbers file follow the run
location name in the definitions file with a space and the word manip. If our run
definitions file names regA as the location of the numbers file for Region A, and regB as
the location of the manipulated numbers file for Region B, we would write statements
along the following lines:
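(The run locations shown in the definitions file are hypothetical.) In the run definitions file:
regA /jobs/survey/rega
regB /jobs/survey/regb manip
and in the Quantum program:
tab age sex
ex T@ + RregA/T@ + RregB/T@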
This might create a table showing age by sex for people interviewed in regions A, B and
C (the region we are currently analyzing).
✎ When referring to tables in other runs, take great care that you name the right table:
the notation T@ meaning the current table should only be used for the other runs if
the table being called up is in the same relative position in the run as the table created
by the current tab statement. If we are on our fifth table, T@ will mean the fifth table
in all other runs as well.
When tables other than the current one are used in an expression, Quantum compares the
element texts of those tables with that of the current table. If the texts are identical, the
elements are manipulated. If an element is present only in the current table, it appears in
the table unmanipulated, whereas if it exists only in the previous tables, it is ignored.
Elements which are present in both/all tables but have non-identical wording cause the
manipulation to fail.
This need not prevent you from manipulating rows with different texts or in different
positions in the tables because Quantum will also manipulate rows which have the same
Id. For example, in the following axes, the two rows named B will be dealt with together
because their rows texts are identical, as will the rows A and Z because they both have
the same identifier. Row C will be ignored because it only appears in the first table, while
row X which is present in the second table only will appear in its original form
Quick Reference
ex element_ref = expression
after the tab statement for that table. Side elements are referred to as sid and banner
(column) elements are referred to as bid. In both cases, id is the row number or identifier.
References to elements in other tables must start with the table reference followed by
either / or >.
New tables may also be created on an element by element basis using rows and columns
from tables created previously in the same or different runs. Statistical and totalling
elements may not be manipulated.
When we say that new tables can be created, what we mean is that you can create a table
and replace all the numbers in that table with numbers from other tables. Manipulation is
not an alternative to the tab statement: manipulation deals only with numbers whereas a
tab statement takes texts as well and formats them to produce row and column headings.
To create a table using elements from other tables, use the statements:
tab axis_1 axis_2
ex element = expression
where the tab statement defines the basic table to be modified, ‘element’ names the row
or column to be created in the current table and ‘expression’ is an expression defining the
manipulation required.
Elements are entered as sn for side (row) elements and bn for breakdown (column)
elements, where n is the absolute position of the new element in the table. For example,
the first row is s1 and the first column is b1. When calculating an element’s position in a
table, remember that each n statement, other than n00, counts as one element.
If the elements to be manipulated have Ids you may use these instead – e.g., spr1 for the
side element whose Id is pr1.
The expression is made up of operators and operands. Operators and operands allowed
are any of those mentioned earlier in this chapter for row and table manipulation, or
references to elements in the current table or in any previously created table. Elements in
expressions are entered as sn, sId, bn or bId. If they are not part of the current table, the
element specifications must be preceded by the table reference and a / or > sign. For
instance:
tab aa bb
ex s1=s1 + T#2/s2
creates the table aa by bb and then adds the figures in the second row of the second table
in the run to row 1 of aa by bb. Note that the expression could also have been written as
ex s1+T#2>s2.
This form of ex statement must refer either to complete tables or to side elements or to
breakdown elements; a combination in one statement is not allowed. To perform
manipulations using a variety of elements, write a separate ex statement for each type.
There is no limit to the number of ex statements which may follow a tab.
The statements:
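(A sketch of what these statements might look like:)
tab ax01 bk01
ex s3=s1+s2
ex b1=b1*10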
create the table ax01 by bk01 and then replace the counts in the third row with counts
which are the sum of first and second rows. Then, the values in the first column are
multiplied by 10. Let’s use some numbers to clarify this:
Since Quantum allows you to manipulate tables from previous runs, it is quite possible
to create new tables without having a proper data file for the run. For example, suppose
we have a monthly survey which asks a panel of respondents the same set of questions
each month, and we want to produce a set of tables showing summary figures for the first
three months of the year.
We could merge the three data files and rerun our Quantum program against this data, but
it would be more efficient to write a short manipulation program to merge corresponding
tables from each of the three runs into the summary tables we require. If we decide to use
manipulation, the first thing we do is set up a run definitions file naming the runs
containing the monthly information. Next, we take a copy of the program used to produce
the monthly tables and after each tab statement we write an ex statement which adds
together the figures from this table in each of the three previous months. For instance:
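(A sketch, assuming the three monthly runs have the run Ids JAN, FEB and MAR in the run definitions file:)
tab age progs
ex RJAN/T@ + RFEB/T@ + RMAR/T@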
These statements produce one table of ‘age by progs’ which is the sum of table 1 for
January, February and March. This process is repeated for each table required.
Finally, create a dummy data file to be used for this run. All that’s needed is a file
containing one record with a serial number and card type in the appropriate columns. If
the record contains codes in any other column, you run the risk of it being accepted by
the tables, and thus making your three month summary table incorrect.
When you run your job, Quantum will read in the dummy record and create each table
according to the ex statements in your program.
Data of this type is frequently entered using trailer cards for the information which occurs
more than once; in this example, the answers given by each person in the household. This
is not always the case. Very occasionally you will come across surveys in which all data
has been entered as one long record. Do not worry. In this section we will discuss ways
of dealing with both types of hierarchical data.
Analysis levels are a relatively easy way of processing hierarchical data with or without
trailer cards. Each analysis level is a field of columns, a card or set of cards containing
information for a specific level in the data hierarchy. For example, if card 1 contains
information about the household as a whole, and each card 2 stores data for a different
person in that household, we have two levels, one for the household and one for people
within the household.
You may edit and tabulate data by level by giving each level a name and applying it to
the edit and tabulation statements for that level. A maximum of nine levels is allowed.
You may name levels and define the structure of the levels data either using a standard
struct statement with some additional parameters for levels, or using a levels file. Both
methods are described below.
Quick Reference
To define the top level, type:
level_name cards=n1[,n2,...]
Cards which must be present in every record must have the card number followed by the
letter r.
Levels and where to find the relevant data are defined in the levels file which must be
created in the same place as your Quantum program file. Levels must be defined in order
of priority, with the highest level first. The top level is specified as follows:
level_name cards=n1,n2
where n1 and n2 are the cards containing data for this level. If any of the cards are
mandatory, you must follow the card type with the letter ‘r’. Suppose our top level is the
whole household, which we call hhold, and the data for this level is stored on cards 1 and
2, both of which are mandatory. We would write:
hhold cards=1r,2r
The second level is defined in much the same way, except that we also have to show that
it is a sub-level of the top level. This is done by following the last card number with a less
than sign (<) and the name of the parent level. If our second level refers to the individual
people in the household we might write:
person cards=3 <hhold
From this we can see that data for each person in the household is on card type 3. If there
is more than one person in the household, we will have a card 3 for each person.
If the data for a sub-level is on the same card as its parent level there is no need to enter
the card specification. Thus, the statement:
trip <person
tells us that information about journeys made is a sub-level of each person’s data but does
not have a card of its own: it is all on card type 3.
Any level may contain more than one sub-level of the same importance in the overall
hierarchy. Suppose we also have a card 5 for each pet present in the household: this is a
sub-level of the top level since the pet belongs to the household as a whole rather than to
an individual, so we write:
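pet cards=5 <hhold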
Even though pet is at the same level as person, it is not defined until person and all
sub-levels of person have been defined. Once you start defining a level, you must
continue down it until there are no more sub-levels before starting on the next level; you
cannot skip from one to the other.
Quick Reference
ser=start_pos, end_pos
crd=start_pos[, end_pos]
max=highest_card_number
reclen=num_cols
maxsub=max_cards_per_level
In addition to defining levels, the levels file also describes the format of the data to be
read, thus making the struct statement redundant when you are using analysis levels. All
descriptions of data must precede the levels specifications.
The position of the serial number is defined with the statement:
ser=m,n
where m and n are column numbers without a preceding C. For example, if the serial
number is in columns 1 to 5 of each card we would write:
ser=1,5
The position of the card type is noted in a similar manner with the statement:
crd=n or crd=m,n
For example:
crd=6 or crd=79,80
If each record contains more than nine cards, you must also enter the number of the
highest card type as follows:
max=n
If your data file contains records which are longer than 100 columns, you will need to
include the statement:
reclen=length
There is no need to define the overall length of the respondent’s record (that is, the combined
length of all its cards) since Quantum resets the C array to blanks when a new top level
record is encountered (e.g., a household record after the previous household’s person
records).
Quantum assumes that each respondent’s record will contain a maximum of 4096
sub-records or cards. If any of your records contain more cards than this, you will need
to extend the default by entering the statement:
maxsub=n
in the levels file, where n is the maximum number of sub-records (cards) per record. If a
record is found with more than the given number of cards, the datapass terminates and
messages are written to out2 and the log file.
On small (int=short) machines, the absolute maximum number of sub-records per record
is 32,767. On large machines (int=long), the theoretical limit is 2,147,483,647, but the
actual limit will depend on the amount of memory space available to the datapass
program.
ser=1,4
crd=5,6
hhold cards=1r
person cards=2r <hhold
purch cards=3 <person
Here we are defining records with a serial number in columns 1 to 4 and a card type in
columns 5 and 6. Each record may contain data at up to three levels. The highest level is the
household (hhold) which is read from card 1 which is mandatory. The second level is
person on a mandatory card 2 which is a sublevel of household. The lowest level is the
purchase (purch) which is a sublevel of person. Data for this level is read from an optional
card 3.
Quick Reference
As an alternative to the levels file, you may define the structure of a levels data file using
a struct statement with additional options that name the levels and define their
relationships to each other.
On the struct statement, specify read=2, and use ser= and crd= to define the serial
number and card type columns as you would for an ordinary data file. If records have
more than nine cards you will also need to use max= to increase the size of the C array
so that it has room to store the data for cards 10 and above.
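Each level is then named and placed in the hierarchy with a lev= option of the form:
lev=level_name[<parent_level][,card_numbers]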
level_name is the name of the level you are defining. parent_level is the name of the level
to which the current level belongs: it must be one which has already been declared.
card_numbers are the cards on which the data for level_name is held. If the data is on the
same card(s) as the parent level you may omit the card reference. You may specify ranges
of card numbers either by typing the individual numbers as a comma-separated list, or
using one of the range specifications start/end or start:end. If some cards must be present
in every record, follow the card number with the letter ’r’. If all cards in a range are
mandatory and you have used a / or : range specification, you may type ’r’ once at the
end of the range.
struct;read=2;ser=c(1,4);crd=c5;
+lev=hhold,1r; lev=person<hhold,2r,3
This example declares a data file with two levels. The top level is household which is held
on card 1. Every record must have a card 1. The second level is person. This is a sublevel
of household and the data is held on a mandatory card 2 and an optional card 3.
struct;read=2;ser=c(1,4);crd=c5;
+lev=person,1r; lev=trip<person,2; lev=shop<trip
The data file for this example has three levels. Person is the top level and is held on a
mandatory card 1. Trip is the second level and, if it is present, is held on a card 2. The
third level is shop which, since it has no card specification of its own, is assumed to be
held on card 2.
struct;read=2;ser=c(1,4);crd=c5;
+lev=doctor,1:3r; lev=patient<doctor,4r,5
This example has two levels. Information about the doctor is held on cards 1, 2 and 3, all
of which must be present for each record. Information about the doctors’ patients is held
on cards 4 and 5. Card 4 is mandatory and card 5 is optional.
To define the maximum number of sublevels a record may have, use the maxsub=
option:
maxsub = number
☞ For further information about sublevels and maxsub=, see the section entitled
“Defining the data structure in the levels file” earlier in this chapter.
Quick Reference
To start the edit in a levels job, type:
ed level_name
level level_name
endlevel level_name
return
When analysis levels are used, all editing must be done by level. Edits start with the
statement:
ed levelname
where levelname is that of the level to be edited, and end with a return. Subsequent edits
for other levels start with:
level levelname
and end with return. The edit section as a whole is terminated with an end statement:
ed hhold
/* Edit statements for household data
return
level person
/* Edit statements for person data
return
end
All statements between an ed or level and a return are executed after the cards for a given
level have been read into the C array. So, if the current level contains two cards of the
same type, all statements for that level will be executed twice, but if the cards are of
different types, the statements will be executed once only.
Where it is necessary to perform a task after all data at a given level has been read in for
the current record, you will need to enter the edit statements preceded by the statement:
endlevel levelname
and followed by a return. This causes all statements up until the return to be executed
once when all data for the named level has been read in. In our example this would be
when the third card 2 had been read. Note, however, that if the level in question contains
sub-levels, the statements following endlevel will not be processed until all data at those
lower levels has been read. For example:
ed hhold
/* ptot counts number of people in household
/* amount accumulates total family income
ptot=0; amount=0
return
level person
ptot=ptot+1
/* Personal income is in C(232,235)
amount=amount+c(232,235)
return
endlevel hhold
/* Calculate average family income
amount=amount / ptot
return
end
Here, the statements between level person and return are repeated for each person. The
statements following endlevel are not processed until the data for the last person in the
household has been read.
When data is processed using Analysis Levels, tab and l statements must indicate the
level at which the table or axis should be updated - that is, whether it is to be updated once
per respondent or once per sublevel.
Before tables are produced, Quantum creates an intermediate file in which cells are
switched on or off depending on whether a respondent satisfies the conditions of a
particular axis. If the condition is satisfied, the cell is switched on; if not, it remains off.
Normally, these cells are switched on when the record fulfils the conditions specified on
the axis. All cells in the intermediate table are reset to zero before a new record is read.
The axis:
l sex
col 10;Base;Male;Female
has two cells – Male and Female. Whenever a Male respondent is found, the first cell is
switched on, and whenever a Female respondent is found, the second cell is switched on.
In the example below, a 1 means that the cell is on and a zero means that it is off:
Respondent 1 – Male 10
Respondent 2 – Male 10
Respondent 3 – Female 01
Respondent 4 – Male 10
If the record contained trailer cards, the cells would be reset between reads. Therefore,
with a record containing four trailer cards, the intermediate file would read:
which is the same as the first example. In both cases, tables using these cells would tell
us that there are three men and one woman: the table is a count of people. However, this
may not always be what you want.
With hierarchical data, you will often want to produce tables in which the base is a count
of records rather than a count of the number of times a particular condition was true. For
instance, you may want a table in which the base shows the number of households
containing women, rather than the number of women in all households. For tables of
this sort, you need to update the cells for each trailer card without resetting them between
reads. This means that for each record, once a cell is switched on, it remains on for the
whole of that record, regardless of the number of times the condition is fulfilled.
If we take our household of three men and a woman, the intermediate table would read:
Notice the difference between Persons 2 and 3 in this example and the same people in the
previous example. Here, the cells remain switched on until all data has been read,
whereas in the previous example cells were reset between reads. Tables produced with
these cells would show us that we have one household which contains men and women,
but not how many of each there are.
Exactly when the cells in the intermediate table are updated or reset depends upon the
type of table required and hence the keyword used on the tab or l statement. The keywords,
which are described below, are anlev, uplev, levbase and celllev.
Quick Reference
To define the level at which tables and/or axes are to be created, type:
anlev=level_name
The level at which tables or axes are to be created is defined using the anlev= keyword
on the tab and l statements. It means ‘update the table or axis when all data for the named
level has been read in’. For example:
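(In this sketch the column numbers are illustrative.)
tab region class;anlev=hhold
l region;anlev=hhold
col 120;Base;North;South;East;West
l class;anlev=hhold
col 125;Base;AB;C1;C2;DE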
produces a table of region by class where the cells will be incremented once per
household.
Why have we used anlev=hhold on both l statements? All people in a household are of
the same class and live in the same region, therefore, each axis need only be updated once
per household rather than once per person, which gives us anlev=hhold on the axes.
There are two reasons for anlev=hhold on the tab statement. The first is simply because
we want a table based on households not people; the second is because no table may have
an analysis level higher than that of its component axes. If tables need to be updated at a
level different to that of the axes, you must use uplev or celllev in addition to anlev as
shown below.
Tab statements need not have the same analysis level as the axes which they use. For
example, we may require two tables of region by class, one showing the number of
households of each class in each region, and the other showing the number of people of
each class in each region.
The two tables use the same axes, both of which have an analysis level of household. The
first table is a count of households, so it too has an analysis level of household, but the
second table is a count of people and thus needs to be updated once for every person in
the household. Therefore we give it an analysis level of person:
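(A sketch; the axis definitions are as in the previous example.)
tab region class;anlev=hhold
tab region class;anlev=person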
The only point to bear in mind with this type of table is that the analysis level on the tab
statement must not be higher than that on the axes. It is incorrect to have anlev=person
on the axes and anlev=hhold on the tab. To produce tables of this sort you must use uplev
in addition to anlev.
Quick Reference
To update an axis at a level lower than that at which it is created, type:
uplev=level_name
on the l statement.
To increment the base of an uplev’d axis for every record containing data at the anlev
level, type:
levbase
Sometimes you will want to create an axis at one level but update or net it by a lower
level. For instance, you may wish to know how many households in each region contain
people in specific age groups. The table must be created at household level because it is
a count of households, but, because each household will probably contain more than one
person, each of whom may be in a different age group, the axis defining the person’s age
needs to be updated on a person basis.
A table at household level gives us anlev=hhold on the tab and l statements, but because
the cells in the age axis need to be updated for each person in the household we add
uplev=person to the appropriate l statement, thus:
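(A sketch; the column numbers and age bands are illustrative.)
tab region age;anlev=hhold
l region;anlev=hhold
col 120;Base;North;South;East;West
l age;anlev=hhold;uplev=person
col 232;Base;Under 16;16-24;25-34;35-44;45-54;55-64;65+
As each person is read, the cells in the intermediate table build up as follows: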
Region Age
Household 1 – North 1000 Person 1 – 59 0000010
Person 2 – 53 0000110
Person 3 – 26 0010110
Household 2 – East 0010 Person 1 – 36 0001000
As you can see, the cells for age are merged using an ‘or’ comparison to build up an
overview of the household before the cells in the final table are incremented.
By default, the base in an axis updated using uplev= is only incremented for records
which contain data at that level. In the example we have just used, the base will be
incremented for every record read since every person in the household has an age.
Suppose, instead, that we have an axis which counts the number of households owning
different makes of car. The base for this axis will only report households owning cars –
those which do not own a car are excluded.
If you want the base to be incremented for every record containing data at the higher
level, regardless of whether it contains data at the lower level, place the option levbase
on the base element of the axis. For example:
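(A sketch; the column numbers are illustrative.)
tab region car;anlev=hhold
l region;anlev=hhold
col 120;Base;North;South;East;West
l car;anlev=hhold;uplev=car
n10Base;levbase
col 320;Ford;Vauxhall;Rover;Other make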
In this example, the table is created at household level and updated at car level. Normally
it will include only those households owning at least one car, but by placing levbase on
the base element we force Quantum to update the base for every household it reads even
if they do not contribute to other elements in the table.
In the car-ownership example, this would create a base which is all households, rather
than just all households owning cars.
When you request statistics in a table created with uplev=, you may notice a difference
between statistics created with inc= and the same ones created with fac=, which is not
simply due to the fact inc= is more accurate than fac=. If some of the records contain
more than one case at the lower level, this could be the reason for the differences. Here’s
why.
When you create statistics based on factors, Quantum creates the table and then applies
the factors to the values in the table. Only one factor is applied to each household. Where
the statistic requires the number of cases as well, this is the number of people whose data
was included in the table.
When you create statistics using inc=, Quantum adds up the values at the lower level and
then, for a mean, divides by the number of cases whose values were added. However, if
a record has more than one set of data at the lower level, Quantum needs some way of
deciding which case to take, since it can only take one value per household. It takes the
data for the last case in each record and ignores the data for the other cases. As before,
the number of cases is the number of people whose data was included in the table.
Here is an example that illustrates the difference. Suppose the data is:
00011
00012 2
00012 3
00021
00022 1
00022 2
00031
00032 2
00032 1
00032 3
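An axis of the following general shape (the rating column and the factor values shown
here are illustrative) would give the figures shown below:
l rating;anlev=hhold;uplev=person
n10Num households
n01Excellent;c=c7'3';fac=3
n01Good;c=c7'2';fac=2
n01Poor;c=c7'1';fac=1
n12Mean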
Base
Num households 3
Excellent (3) 2
Good (2) 3
Poor (1) 2
Mean 2.00
The mean is calculated as the sum of factors (14) divided by the number of cases (7
people).
If the mean is created using inc= instead of fac=, the table will be the same except that
the mean will be shown as 2.67. This is because every household has more than one person,
so only the data for the last person in each household contributes to the mean. The sum of
the included values is therefore 3+2+3=8 and the number of cases is 3.
Quick Reference
To update a table at a higher level than that of the axes used to create it, type:
celllev=level_name
A table may be updated at a higher level than the axes by using celllev= on the tab
statement. For example, to produce a table of age by sex which shows, not the number of
men between 20 and 30 years of age, but the number of households having men in that
age group, we might write:
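(A sketch; the column numbers and age bands are illustrative.)
tab age sex;anlev=person;celllev=hhold
l age;anlev=person
col 232;Base;Under 20;20-30;31-40;41-50;51+
l sex;anlev=person
col 230;Base;Male;Female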
Both axes are created on a person-by-person basis since each person’s data has to be read
to obtain his age and sex. Because both axes have an analysis level of person, the tab
statement which cross-tabulates them must also be at that level. Normally this produces
a count of people, but as we want a count of households we use celllev which causes
Quantum to increment each cell in the table only once per household, regardless of the
number of people in each cell.
As with uplev, the table base is only updated for records that have data at the level defined
with celllev (that is, the level at which the cell counts are made). For example, if level 1
is household and level 2 is child, tables created with anlev=child;celllev=hhold
will only include households with children because the axes on which the cell counts are
based will be at child level. If you compare this table’s base cell (i.e., the top left-hand
cell) with a table created entirely at household level you should not expect the two table
bases to be the same unless every household in the data file has at least one child.
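For example, the two tables might be requested as follows (the axis names and column
numbers are illustrative; the child-level axes read data from columns 210, 213 and 214):
tab house region;anlev=hhold
tab chage chschool;anlev=child;celllev=hhold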
The first table shows the number of households by type of house and region. The table
base is the total number of households in the data file. The second table uses data at child
level: only records with data in columns 210, 213 and 214 are included in the body of the
table. The tab statement for these axes shows that the cell counts must be given at
household level rather than child level, so the figures are reduced accordingly.
When you are creating tables with levels you may find that you are not sure whether to
use uplev on the axes or celllev on the tab statement. There are times when the two
produce the same results: one instance is when you are using axes which are updated at
different levels. Suppose we are working on a shopping survey, the top two levels of
which are person and shop, and we want a table showing the number of respondents in
each region shopping in a particular shop.
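One way to do this is with uplev (a sketch; the column numbers are illustrative):
tab region store;anlev=person
l region;anlev=person
col 120;Base;North;South;East;West
l store;anlev=person;uplev=shop
col 510;Base;Safeways;Sainsburys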
Here, region is updated once per person and store is updated once per shop visited. The
table as a whole is updated once per person. If our first respondent lives in the north and
visited Safeways twice and Sainsburys once, our table would show:
This says that we have one person in the North who visited Safeways and Sainsburys.
If we now write:
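(Column numbers as in the previous sketch.)
tab region store;anlev=shop;celllev=person
l region;anlev=person
col 120;Base;North;South;East;West
l store;anlev=shop
col 510;Base;Safeways;Sainsburys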
we are updating region once per person and store once per shop visited. The table has
anlev=shop and celllev=person because we want a table of information about shopping
habits which is updated once per person. The table itself is:
This again tells us that one person in the north shopped at Safeways and Sainsburys.
If we simply wanted a table showing the number of times people in various regions
visited each shop we would write:
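(Column numbers as before.)
tab region store;anlev=shop
l region;anlev=shop
col 120;Base;North;South;East;West
l store;anlev=shop
col 510;Base;Safeways;Sainsburys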
This works because each person in a household must live in the same region.
Now let’s look at an example in which uplev and celllev are not the same. In our shopping
survey, part of the data in the shop level tells us which flavors of yogurt the respondent
bought in each shop. We need a table of flavor by store at person level. The cell for
banana flavored yogurt bought in Safeways must show the number of people buying
banana flavored yogurt in Safeways at least once, not the total number of times banana
yogurt was bought in Safeways by every person interviewed.
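At first sight uplev seems to be the answer (a sketch; the column numbers are illustrative):
tab flavor store;anlev=person
l flavor;anlev=person;uplev=shop
col 520;Base;Banana;Strawberry;Mango;Peach
l store;anlev=person;uplev=shop
col 510;Base;Safeways;Sainsburys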
However, if we look at the table that this creates, it becomes apparent that this does not
produce the figures we want. If we take just one person as follows:
We can see that we have one person who bought a variety of flavors in both shops, but
we cannot see exactly which flavor was bought where. This is because uplev takes all
answers and ‘ors’ them together before the axes are combined in the table.
Now, if we try:
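(Column numbers as in the previous sketch.)
tab flavor store;anlev=shop;celllev=person
l flavor;anlev=shop
col 520;Base;Banana;Strawberry;Mango;Peach
l store;anlev=shop
col 510;Base;Safeways;Sainsburys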
We can now see that we have one person who bought banana and strawberry yogurts in
both Safeways and Sainsburys, mango yogurt in Safeways and peach yogurt in
Sainsburys. By giving the axes and the tab statement the same analysis level and updating
the table with celllev, we force Quantum to retain the relationship between the flavor and
store data for each person.
If we merely wanted a table showing the number of times each flavor was bought in each
store (i.e., a straightforward table of flavor by store) we would just use anlev=shop on
the tab and the axes, and ignore uplev and celllev altogether.
Quick Reference
process level_name
Process is followed by the name of an Analysis Level lower than the level currently
being processed in the Edit. This passes control to the tabulation section and updates only
tables at the named level. The format of the statement is:
process level
This is useful when information for a specific level is on the same card as its parent level.
Suppose our information about the brand of bread bought is read from level Trip, which
is a sublevel of Person, as follows:
person cards=1
trip cards=2 <person
brand <trip
level person
--edit statements for person-level data--
/* For each of the brand codes stored in columns 134 to 141, copy it
/* into c(134,135) and then update the tables at brand level
do 10 t1 = 134,140,2
c(134,135)=c(t1,t1+1)
process brand
10 continue
endlevel person
Quick Reference
To reset cells in Quantum’s intermediate file when the data contains trailer cards which
are not analyzed with levels, type:
clear=logical expression
on the l statement.
When we talked about tabulating data using levels, we said that cells are normally reset
in the intermediate file when a new respondent is read in or between reads when the data
contains trailer cards. If you are using analysis levels, the time at which these cells are
updated can be altered using anlev and uplev. If you are not using levels, you can achieve
the same effect using the option:
clear=logical expression
on the l statement. This causes the cells in the intermediate table to be reset only when
the given logical expression is true. If you want to reset these cells for each new
respondent, rather than for each record or read, enter the option as:
clear=firstread
If we take the axis Sex as we did when we explained analysis levels, you can see that both
clear= and anlev produce the same results. If we write our axis as:
l sex;clear=firstread
col 10;Base;Male;Female
and take a household of three men and a woman, the intermediate table would read:
The first cell of Sex is switched on when a household containing a man is found. It
remains on until all data for that household has been processed. If a household also
contains a woman, the cell for Female is switched on and is not reset until the next
respondent’s data is read in. If a household contains both men and women, both cells are
switched on.
Once the first card of the next record is read, the reserved variable firstread is 1 (i.e.,
true), so both cells will be reset to zero ready for the next household.
When axes which use clear are tabulated, the reserved variable lastread may be used in
a condition on the tab statement so that the cells in the final table are only incremented
when all cards for a respondent have been read.
A major use of this facility is in the production of Penetration or Profile tables on trailer
card data. As the trailer cards are read, the intermediate table is updated to build a profile
of the respondent. Cells in the printed table are incremented only when all trailer cards
for a respondent have been processed.
In the example below two tables are being produced: the first shows the number of
products bought, the second shows the number of products bought per respondent. Card
1 contains demographic data and card 2 (a trailer card) gives details of the items bought.
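(A sketch; the column numbers and brand names are illustrative.)
tab items class
tab resps class;c=lastread.eq.1
l items
col 210;Base;Brand A;Brand B;Brand C;Brand D
l resps;clear=firstread.eq.1
col 210;Base;Brand A;Brand B;Brand C;Brand D
l class
col 110;Base;AB;C1;C2;DE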
As you can see, the elements of items and resps are exactly the same – they are the brands
bought. The difference is in the l statement which names the axis and defines its
conditions. Items is just a straightforward axis whose cells in the intermediate table will
be reset to zero between reads. Any tables produced with this axis will be a count of the
number of times each brand was bought by respondents in each class. If the cell created
by the intersection of Brand B and Class DE contained the number 52, this would mean
that Brand B was bought 52 times by class DE respondents. This may mean that 52
respondents bought that brand once each or that 20 respondents bought it, with some
buying it more than once.
Resps, on the other hand, has the condition clear=firstread.eq.1 indicating that
cells in the intermediate table should only be reset to zero when a new respondent is
reached. This means that these cells will contain respondent profiles – that is, they will
tell us whether or not a respondent bought a particular brand at any time; they will not
show the number of times he bought each one.
The tab statement using this axis has the condition c=lastread.eq.1 meaning that the
table itself is not to be updated until all data for a respondent has been read. The cells in
this table will tell us how many respondents in each class bought each brand. This time,
if the cell created by the intersection of Brand B and Class DE contains the value 52 it
will be because 52 class DE respondents bought brand B at least once.
Quantum provides facilities for calculation of a set of basic statistics from the figures
produced in Quantum tabulations. They include the statistics most commonly used for
testing hypotheses about the values of proportions (percentages) and the locations
(average values) of variables, and about differences in these between two or more subsets
of the data. There are also chi-squared statistics for testing hypotheses about a single
distribution or about differences between two or more distributions.
• F Test for testing differences between a set of means (one-way analysis of variance
(ANOVA))
For each statistic, Quantum also calculates and prints an associated significance level so
that you can readily see the results of the tests you have performed.
The Quantum test statistics are divided into two groups. Axis-level statistics are specified
in the axis, and are calculated for each table in which the axis appears, whether as a row
or column axis. Table-level statistics are specified as an option on the tab statement, and
are applied only to the table(s) for which they are specified.
In general more than one test may be specified for a given axis or table, and table-level
statistics may be specified for tables where the axes also contain statistical elements.
In both cases, statistics are produced as part of a normal Quantum tabulation run. They
are requested by means of the keyword stat= on the tab statement or in the axis.
Axis-level statistics
Quick Reference
To use an axis-level statistic, type:
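stat=statname[,text][;options]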
Axis-level statistics are requested by means of a stat= statement as an element in the axis:
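stat=statname[,text][;options]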
where statname is the name of the statistic to be computed, text is an optional element
text to be printed in the table against the row or column of statistical values (if no text is
given, none is printed), and options determine how the statistics will be printed.
Options are one or more of dec=, decp=, fac= and id=. Axis-level statistics are printed
by default with two decimal places, but this may be altered by using the dec= option.
Similarly, the significance levels are printed with three decimal places, and this may be
altered using the decp= option.
☞ See chapter 18 for the use of these options on axis elements. Note that the decp=
option which normally refers to percentages is used on a stat= element to refer to
the significance level.
For instance:
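stat=chi1,1-D chi-sq;dec=3;decp=4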
calculates a one-dimensional chi-squared statistic and prints it with 3 decimal places and
an element text of ‘1-D chi-sq’. Significance levels are printed with 4 decimal places.
Axis-level statistics only use data present in elements which appear before the stat=
element in the axis. If the statistic is to use all the data in the axis it must come at the end
of the axis. Additionally, if the axis contains a base element, the statistic is calculated
using elements between the base and the statistics elements only. To include all elements,
the base should therefore be the first element in the axis. For instance:
l brands
ttlQ5: Brands Bought
col 132;Base;Brand A;Brand B
n03
stat=chi1,1-D chi-sq
col 132;Brand C=’3’;Brand D=’4’
In this example the statistics will be calculated for brands A and B only, whereas in the
following example the statistic will be calculated for Brands C and D only, since they are
the only elements between the base element and the stat= element:
l brand2
ttlQ5: Brands Bought
col 132;Brand A;Brand B
col 133;Base;Brand C=’3’;Brand D=’4’
n03
stat=chi1,1-D chi-sq
If appropriate, more than one stat= element may be placed in an axis, as long as each one
is preceded by a base element, if necessary, and the requisite number of rows. To request
statistics for one set of elements in an axis, and further statistics for another
non-overlapping set of elements, you will need to divide the axis into segments, each one
beginning with a base element and ending with a stat= element. To request two or more
tests on the same group of rows, enter the stat= statements, one to a line, after those rows.
Thus:
l allbrd
ttlQ5: Brands Bought
col 132;Base;Brand A;Brand B
n03
stat=chi1,1-D chi-sq
n11Base
col 132;Brand C=’3’;Brand D=’4’
n03
stat=chi1,1-D chi-sq
produces 2 chi-squared statistics – the first for Brands A and B and the second for Brands
C and D.
Statistics and the associated significance levels are printed as a row or column
(depending on whether the axis is used as a row or column axis in the table), for every
table in which the axis appears, at the point at which the stat= element is defined in the
axis. The row or column text is as specified on the stat= statement, although if the axis
is to be used as a column axis, the column headings may be defined on g statements. The
statistic in each column (or row) is calculated from the figures in that column (row), from
the most recent base element, or the beginning of the table, through to the stat= element.
Table-level statistics
Quick Reference
To use table-level statistics, type:
stat=statname
on the tab statement, where statname is the name of the test required, and is one of chi2,
chis, ks, anova, z2, z3, z4, t2 or nk.
If you need more than one statistic on the same table, list their names after the stat=
keyword, separated by commas. For example:
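tab brands region;stat=t2,anova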
produces a table of brand usage by region and runs a 2-sample T test and an F test on it.
The statistic(s) requested are printed at the bottom of the table or on a new page if there
is insufficient space on the current page.
The chi2, ks and anova statistics produce a single statistic and significance level,
preceded by text at the left margin naming the statistic. The chis statistic prints a + or −
sign next to the percentage figure in a cell indicating whether it is significantly larger or
smaller than expected.
The z2, z3, z4, t2 and nk statistics all produce a triangular array of statistics and
significance levels, titled at the left margin with the name of the statistic. Each row and
column of the array is named by the side text defined on the corresponding axis element:
the row text is printed in the left margin and the columns are spread automatically across
the remainder of the page. If the texts are longer than 15 characters they will be truncated.
If there are so many columns in the triangular table that each one would be allocated less
than 5 columns, the statistical table is suppressed. The statistic testing the difference
between any two axis elements is found at the intersection of the row and column named
as those elements. (If this sounds complicated, an examination of one of the examples in
the following sections should make it clear.)
In all table-level statistics, the statistics and significance levels are printed to three places
of decimals. There are no options for modifying the text or layout of table-level statistics.
General notes
1. Many of the statistics require that the axis (or axes, with table-level statistics)
contains one or more base elements. In the case of some axis-level tests, these are
only necessary to separate segments of an axis from one another. Whether or not a
base element is required is defined in the notes which follow the description of each
statistical test.
Base elements may be printing or non-printing elements created using n10 or n11
statements, or base options on n01, n15, col or val statements.
2. Many of the statistics require a certain number of basic count (totalizable) rows.
These are rows created by n01, n15, col or val statements which obtain information
directly from the data file. In an ordinary job these rows will generally be counts of
people; in levels (hierarchical or trailer card) jobs they will be counts of households,
people, trips made, and so on. Other elements, such as text-only elements, are
ignored.
3. Some tests (generally those based on table-level statistics) require that the elements
to be tested are mutually exclusive. This means that respondents may be present in at
most one element of the axis. Examples of mutually exclusive axes are sex, age,
marital status, product preferred, and so on, where someone present in one category
will not be present in any other.
Complex axes often contain elements which are not wholly mutually exclusive: for
example, one containing elements for sex and elements for age, where respondents
may be present in both a sex category and an age category. Axes of this type may be
used as the basis for statistics although in many cases the values for the overlapping
categories should be ignored.
4. Some statistical tests become unreliable when performed on tables containing cells
with small numbers of respondents (e.g., less than 10) in them. These are generally
the tests resulting in a chi-squared or Z statistic. In these cases it is best, if possible,
to combine the row or column element, or both, with the logically nearest element to
increase the cell sizes. Otherwise, exclude the row or column from the test altogether
by specifying it with the option ntot. (The logically nearest element is the one whose
meaning is nearest to that of the small element – for example, the ‘18–24’ age range
would be combined with the ‘25–34’ age range rather than the ‘55 and over’ age
range.)
The table below summarizes the statistical facilities described in this manual. It indicates
where each test is specified, what name is used following the stat= keyword, what
requirements there are for Quantum to be able to perform the test, and whether one
statistic or a triangular array of statistics is produced. You will need to refer to the
appropriate chapter for a full description of the requirements of each test.
This table does not describe in full the statistical requirements of each test. The
descriptions in the following chapters provide basic information about these
requirements.
Quick Reference
To request a one-dimensional chi-squared test, type:
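stat=chi1
as an element in the axis.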
The one-dimensional chi-squared test statistic is an axis-level statistic. You may use it to
test whether the counts in an axis or a segment of an axis differ from those which would
result from a uniform distribution. A uniform distribution is one where all values have
the same relative frequency. Thus if an axis has 4 elements, and the respondents are
uniformly distributed over that axis, you would expect the number present in each
element to be 25% of the base for that axis.
l flavor
col =123;Base;hd=Low Fat;Strawberry;Raspberry;Blackcurrant;Pineapple
n03
stat=chi1, 1D chi-squared
n03
n11Base
col =123;hd=Original Flavor;Peach=’5’;Mango=’6’
n03
stat=chi1, 1D chi-squared
1. If all cell counts in a segment are the same, the chi-squared value will be zero.
2. Although the nz option suppresses all-zero rows in a table, these rows are still used
in the calculation of the chi-squared statistic.
3. The elements in the axis or in each segment must be mutually exclusive. This means
that a respondent must appear in only one element of the axis or segment.
4. Chi-squared tests may give misleading results when expected cell counts are small.
In this case, a useful guide is that the total of the counts in the axis, and in each
segment tested, should be 5 times the number of elements in the axis (or segment).
That is, the average of the counts in the axis or segment used should be at least 5.
5. Although a base element must be present as the first element of the axis, or of each
segment in the axis, only the basic count elements are actually used in the calculation.
Base : 60
Row 1 : 25
Row 2 : 19
Chi-Squared
Row 3 : 7
Row 4 : 9
Here, the statistic will be calculated for Rows 1 and 2 using a base of 44. Quantum
will then test whether those two counts are significantly different from 22.
Let’s look at an example. You have carried out a survey of purchases of washing powder
throughout the country, and now wish to test whether there is a preference for certain
powders in different regions. The Quantum program:
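(A sketch; the column numbers are illustrative.)
tab powder region
l powder
n10Base
col 125;Washo;Suds;Gleam;Sparkle
n03
stat=chi1,Chi-squared value
n33Significance level
l region
col 120;Base;North;South;East;West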
produces:
✎ Notice how an n33 statement may be used to enter text against the row of
significance levels.
Quick Reference
To request a two-dimensional chi-squared test, type:
stat=chi2
1. If there is no association – that is, the figures in each column (and, equivalently, in
each row) are distributed in the same proportions – the chi-squared value will be zero.
This would indicate, for example, that political opinions do not vary according to
age.
2. Elements in which all cells are zero are ignored by this statistic. You may suppress
them with the option nz.
4. Chi-squared tests may give misleading results when cell counts are small (i.e., less
than 10) or when there are both row and column bases which are small compared to
the overall base.
5. The statistic is calculated using the sum of totalizable (basic count) rows, the sum of
totalizable columns and the sum of all totalizable cells rather than the row base,
column base and table base.
You might use a two-dimensional Chi-Squared test when you wish to use the results of
a survey of political habits to test whether there is an association between voting patterns
and region.
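(A sketch; the axis names are illustrative.)
tab vote region;stat=chi2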
produces:
Quick Reference
To request a single classification chi-squared test, type:
stat=chis[(options)]
b) the rows of the table are taken as the responses (e.g. brand preferred), and the
columns as the subsamples (e.g. sex=female). The + or − sign is printed to the right
of the column percentage;
c) the test uses unweighted data only, even if the table itself is weighted.
Options on the command line which change these are as follows. If more than one option
is required, the keywords must be separated with commas.
b) there is no previous base element in either direction (i.e. a missing row base, column
base or both), or
c) the row or column does not have the appropriate op= option on it (i.e. op=2 if the
columns are the subsamples, or op=0 if the rows are the subsamples).
In addition, if the weighted formula is requested, the following condition also applies.
The weighted formula uses both weighted and unweighted data, so when looking at the
subsample elements in a weighted table, Quantum expects each of those elements to be
preceded by a version of itself which is suppressed, unweighted and nontotalizable:
n15;c=condition;wm=0;nontot
n01Subsample 1;c=condition
The test will not be applied to elements where this suppressed element is not found. You
should note that Quantum does not check that the condition on the suppressed element
matches that on the corresponding subsample element.
The test is applied identically to single-coded and multicoded responses, and, although it
compares absolute figures, prints the results next to the appropriate percentage figures.
Whether or not a value is significant depends on the value of chi-squared at the given
confidence level, the value of chi-squared for the subsample being tested, and the size of
the subsample in relation to the total. Critical values used for testing significance are 3.84
at the 95% level and 6.63 at the 99% level.
If the value of chi-squared returned by the test is greater than the value of chi-squared at
the given level, and the subsample proportion is greater than the total proportion, the
sample is deemed to be significantly greater than expected, and a + sign is printed next
to the subsample proportion. If the value of chi-squared returned by the test is less than
the value of chi-squared at the given level, and the subsample proportion is less than the
total proportion, the sample is deemed to be significantly less than expected, and a − sign
is printed next to the subsample proportion.
In all other cases the difference is deemed insignificant and nothing is printed.
           Base     Yes      No      DK
Male         80      39      30      11
                  48.8%+  37.5%-   13.8%
Female      120      22      83      15
                  18.3%-  69.2%+   12.5%
The cell for men answering yes is flagged with a + sign. This means that it is significantly
greater than would be expected according to the overall proportion of people who
answered yes. In statistical terms this means that:
a) the value of chi-squared for that cell is greater than 6.63, and that
b) the proportion of men answering yes is greater than the proportion of the sample as a
whole answering yes.
Where no + or − sign is shown, the subsample proportions are not significantly different
from the proportion for the sample as a whole.
Kolmogorov-Smirnov test
Quick Reference
To request a Kolmogorov-Smirnov test, type:
stat=ks
To request a Kolmogorov-Smirnov test, include the option stat=ks on the tab statement.
The table must have three columns only: the first must be the base column, and the other
two columns divide the sample into the two groups to be compared. For example, in the
shopping survey the table would require a base column and a column for each
supermarket.
The first row of the table must be the base row, while the other rows represent some
ordered classification of the respondents – numbers, numeric ranges, or measurements on
some ordered scale – listed in increasing order of magnitude.
1. Both the row and column axes must contain only elements which are mutually
exclusive.
2. When the rows comprise numeric ranges, remember that the test is based only on the
figures in the table, and therefore the more information there is in the table, the more
powerful the test will be. In other words, the more categories the better – you can lose
information by collapsing data too much into a few large categories. The counts in
the cells of the table can be small, even zero.
3. This test uses the sum of totalizable rows rather than the figures in the base row in its
calculation.
As an example, let’s expand on the shopping survey we mentioned just now. Suppose you
wish to compare frequency of shopping between a sample of people who shop at
Sainsbury’s and a sample who shop at Safeways, and you wish to know, not whether the
average number of visits differ, but whether the distributions themselves differ. A
Kolmogorov-Smirnov test is appropriate:
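(A sketch; the column numbers and frequency bands are illustrative.)
tab freq store;stat=ks
l freq
col 30;Base;Less than once a month;1-3 times a month;Once a week;2-3 times a week;Most days
l store
col 25;Base;Sainsburys;Safeways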
produces:
McNemar’s test
Quick Reference
To request a McNemar test, type:
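stat=mcnemar
as an element in the axis.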
McNemar’s test is used to test for differences in a variable with just two possible values (e.g., yes/no). It is most
commonly used to test whether differences between ‘before and after’ measurements on
the same sample indicate a real change or are simply due to chance.
To run a McNemar test, you will need a stat=mcnemar element in the axis. This must
be preceded by exactly two basic count elements representing the changes. For instance,
one might count those respondents answering yes before and no after, and the other those
answering no before and yes after.
The first element in the axis must be a base element. If several McNemar tests are
required in the same axis, each must follow a base element (use n11 if you don’t want to
see these extra bases) and a pair of elements representing the changes.
1. The McNemar test is not concerned with the number of respondents whose opinions
do not change.
2. If the two counts are equal, the statistic will have a small but non-zero value.
3. In the same way as for the one-dimensional chi-squared test, the sum of the two
counts should be at least 10 to avoid giving misleading results.
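For example, an axis of the following general form (the column references are illustrative)
compares opinions before and after a campaign:
l change
n10Base
n01Yes before and no after;c=c120'1'.and.c230'2'
n01No before and yes after;c=c120'2'.and.c230'1'
n03
stat=mcnemar,McNemar value
n33Sig. level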
This produces:
Quick Reference
To request Friedman’s two-way analysis of variance, type:
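stat=friedman
as an element in the axis.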
The test is performed by ranking each set of scores – that is, giving the value 1 to the
lowest score given by each respondent, 2 to the next lowest, and so on. (Sometimes, the
data are already in this form; for example, respondents may themselves have been asked
to rank their preferences for a set of products.)
Friedman tests are produced by a stat=friedman element in the axis. Each such element
must be preceded by at least two basic count elements identifying the products, tests etc.
to be compared. Each element must contain an inc= to calculate the sum of the ranks
given to the item specified in that element (as shown in the example below).
If your data columns contain scales which are not ranked (such as scores), you must use
statements in the Quantum edit to set the ranks – numbers from 1 upwards – into
variables. These can then be used to define the incs on the n statements used by the test.
For example:
data rnk 4s
ed
if (c131’9’) set rnk1’1’
if (c131’7’) set rnk1’2’
The first element in the axis must be a base element: other base elements may be present,
in which case they define the beginning of additional segments in the axis.
1. If there is no overall tendency for one product (or test or whatever) to score or be
ranked more highly than another, the value of the Friedman statistic will be zero. On
the other hand, the greater the disagreement between the ranks due to the different
respondents, the greater this value will be.
2. It makes no difference whether ranks are assigned by giving a rank of 1 to the lowest
score or preference and so on upwards, or to the highest score or preference and so
on downwards. Though the sums of ranks will, of course, be different, the value of
the Friedman statistic in each case will be exactly the same.
3. The Friedman test is extremely sensitive to any errors in assigning ranks to the
elements in the axis or segment. Each respondent must have assigned a score or rank
to each item in the axis or segment.
If the ranks are read directly from columns of data, you must ensure that the columns
contain one rank for each item and that the ranks the respondent has given are valid.
For example, when ranking four products on a scale of 1 to 4, the respondent must
have ranked each product within the range 1 to 4.
If the data columns contain scores, your Quantum edit must convert these correctly
into a valid set of ranks. Normally these will be exactly one of each of the numbers
from 1 to the number of elements; thus if there are four products which have been
ranked, there would be 4 elements in the axis or segment of the axis, and, for each
respondent, each element would contain one of the numbers 1 through 4.
If some products have not been ranked or invalid ranks are present in any of the data
columns, the Friedman statistic will be incorrect.
4. A respondent may assign the same rank to more than one product.
5. In order for the significance level associated with this statistic to be correct, there
should be a minimum of 10 respondents who have assigned scores or ranks to all the
items in the axis or segment.
6. Elements whose cells are all zero are included in the calculation of this statistic.
In the example below, respondents have expressed their preference among four washing
powders by giving the one they use most a value of 1, the next a value of 2, then 3 and 4.
We may write a section in the Quantum edit to check that these values result in a valid
product ranking, and then construct an axis which sums the ranks given to each product,
and performs Friedman’s test on the results. The resulting Quantum program is:
ed
r sp ’1/4’ o c(29,32)
c81 = xor(c29,c30,c31,c32)
if (c81 = ’1/4’) go to 5
write c(29,32) $product ranking incorrect$
5 continue
end
tab prdrank age
ttlProduct Preference
ttlBase: All Respondents
l prdrank
n10Base
n01Washo;c=c29’1/4’;inc=c29
n01Suds;c=c30’1/4’;inc=c30
n01Gleam;c=c31’1/4’;inc=c31
n01Sparkle;c=c32’1/4’;inc=c32
n03
stat=friedman,Friedman value;dec=5
n33Sig. level
l age
col 10;Base;18–24;25–34;35–44;45–54;55+
and it produces:
Product Preference
Base: All Respondents
29.6 Formulae
The formulae for the statistical tests in this chapter are shown below. The following
conventions have been used in these formulae:
• In the formulae for axis-level test statistics, the formula is applied separately to the
counts in each column or row, according to whether the axis containing the stat=
option is the row or column axis:
r represents the number of basic count rows from which the statistic
is calculated
c represents the number of basic count columns from which the
statistic is calculated
nij represents the (weighted) count in row i, column j,
N, Ni, Nj represent the (weighted) bases of the table overall, row i and
column j respectively.
• A dot suffix indicates summation over the replaced index; so, for example, the
formula for a column total is:
    n_{\cdot j} = \sum_{i=1}^{r} n_{ij}

One-dimensional chi-squared test

    X^2 = \sum_{i=1}^{k} \frac{(n_i - \bar{n})^2}{\bar{n}}

is tested against the χ² distribution with (k − 1) degrees of freedom, where

    \bar{n} = \frac{n_{\cdot}}{k}

Two-dimensional chi-squared test

    X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}

is tested against the χ² distribution with (r − 1)(c − 1) degrees of freedom, where

    e_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}

Single classification chi-squared test

    \chi^2 = \sum \frac{(O_i - e_i)^2}{e_i}

which, for the cell being tested, becomes

    \chi^2 = \frac{\left(a - \frac{b}{N}\,n\right)^2}{\frac{b}{N}\,n}
           + \frac{\left((n - a) - \frac{N-b}{N}\,n\right)^2}{\frac{N-b}{N}\,n}

where a is the count in the cell being tested, n is the base of the subsample, b is the
count for that response in the sample as a whole and N is the overall base.

Kolmogorov-Smirnov test

    X_{KS}^2 = 4D^2\, \frac{N_1 N_2}{N_1 + N_2}

is tested against the χ² distribution with 2 degrees of freedom, where

    D = \max_{i=1,\ldots,r} \left| \sum_{k=1}^{i} \frac{n_{k2}}{N_2}
        - \sum_{k=1}^{i} \frac{n_{k1}}{N_1} \right|

McNemar’s test

    X_{MN}^2 = \frac{(\,|n_1 - n_2| - 1\,)^2}{n_1 + n_2}

is tested against the χ² distribution with 1 degree of freedom.

Friedman’s test

    X_r^2 = \frac{12}{Nk(k+1)} \sum_{i=1}^{k} R_i^2 - 3N(k+1)

where R_i is the sum of ranks in element i of the axis, is tested against the χ² distribution
with k − 1 degrees of freedom.
30.1 Z-tests
Quantum provides four types of Z-test for comparing proportions with specified values
and with other proportions.
Quick Reference
To request a one-sample Z-test, type:
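stat=z1;fac=factor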
as an element in the axis. factor is a value between 1 and 100 and is the proportion against
which values are to be compared.
This is an axis-level statistic which is used to test whether the percentage of respondents
in a particular element differs from a given value. For example, if you wanted to see
whether more than 40% of yogurt buyers buy low-fat brands only, you would compare
each sample percentage with the number 40.
To request a one-sample Z-test, place a stat=z1;fac=n element in the axis. The fac=
option on the stat statement specifies the value, expressed as a percentage, with which
the percentages in the preceding element are to be compared. n may be a whole number
or a decimal number with up to six decimal places. Each stat= element must be preceded
by a base element and a single element of basic counts.
The Z-tests may give misleading results when the bases from which proportions are
calculated are small. In this case, the base element should contain at least 10 respondents
for the test to be valid.
The Z-value will be zero if the difference is zero, otherwise it will be negative if the
calculated percentage is smaller than the specified value, and positive if it is greater.
The example which follows defines an axis for a survey investigating purchases of dairy
products. We are checking whether 50% of respondents buy low-fat brands of yogurt,
and then testing this hypothesis among different age-related sub-samples. The Quantum
spec. is:
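(A sketch; the column number for the yogurt question is illustrative.)
tab lowfat age
l lowfat
n10Base
n01Buy low-fat brands only;c=c145'1'
n03
stat=z1,Z value;fac=50
n33Sig. level
l age
col 10;Base;18-24;25-34;35-44;45-54;55+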
produces:
Quick Reference
To request a two-sample Z-test, type:
stat=z2
This test is produced by the option stat=z2 on the tab statement. The table must consist
of a base row and one row of basic counts only. The test calculates a Z-value comparing
each column percentage with each of the other column percentages, and produces a
triangular table showing all the Z-values and their associated significance levels. The
triangular table is labeled with the text ‘Z TEST – TYPE 2’.
1. The row of basic counts defines an attribute which respondents in that row have; for
example, the attribute of having a full-time job.
2. The percentages (proportions) which are compared are always calculated for the test
by dividing the count in each cell of the row to be tested by the corresponding cell in
the base row. It is not necessary for the column percentages to be printed using the
option op=2 (though you might find it confusing to use a two-sample Z-test and print
the row or total percents instead).
3. The columns of the table should define the different groups of people in such a way
that each group is mutually exclusive – for example, age groups or sex. If the column
axis defines more than one set of mutually exclusive elements the test will still be
printed, but the comparisons between elements which are not mutually exclusive will
be meaningless and should be ignored. For example, if the column axis contains both
sex and age breakdowns, the comparison between, say, ‘Female’ and ‘Age 18–25’
must be ignored since some respondents may be women and aged 18–25.
4. The Z-tests may give misleading results when the bases from which proportions are
calculated are small. In this case, tests involving a column whose base is less than 10
should be treated as approximate only. Such columns should preferably be combined
with the nearest logical equivalent.
5. The calculation for Z subtracts the first proportion from the second, rather than the
more usual method of subtracting the second proportion from the first.
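For example, a specification of the following general form (the column numbers are
illustrative, and the filter selecting women only is omitted):
tab work age;stat=z2
ttlJob Status
ttlBase: All Women
l work
n10Base
n01Full-Time;c=c34'1'
l age
col 10;Base;18-24;25-34;35-44;45-54;55+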
produces:
Job Status
Base: All Women
Base 18-24 25-34 35-44 45-54 55+
Base 605 96 194 91 126 98
Full-Time 297 29 107 66 75 20
Z TEST - TYPE 2
18-24 25-34 35-44 45-54
25-34 2.388
0.017
Quick Reference
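To request a Z-test on non-overlapping samples, type: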
stat=z3
This test requires a stat=z3 option on the tab statement. The table must consist of a base
row only, and the first element of the column axis must be a base element. The test
calculates a Z-value comparing each row percentage with each of the other row
percentages, and produces a triangular table showing all the Z-values and their associated
significance levels. The table is labeled with the text ‘Z TEST – TYPE 3’.
2. The columns of the table should define groups of respondents in such a way that the
groups are mutually exclusive – for example, age groups or sex. If the column axis
defines more than one set of mutually exclusive elements the test will still be printed,
but the comparisons between elements which are not mutually exclusive will be
meaningless and should be ignored. For example, if the column axis contains both
sex and age breakdowns, the comparison between, say, ‘Female’ and ‘Age 18–25’
must be ignored since some respondents may be women and aged 18–25.
3. The Z-tests may give misleading results when the base from which proportions are
calculated is small. In this test the base should be at least 20.
4. The calculation for Z subtracts the first proportion from the second, rather than the
more usual method of subtracting the second proportion from the first.
produces:
Quick Reference
To request a Z-test on overlapping samples, type:
stat=z4
This test requires a stat=z4 option on the tab statement. The table must consist of an axis
tabbed against itself, and the axis must allow multicoding. The first element of the axis
must be a base element, and there must be at least two basic count elements in the axis.
The test calculates a Z-value comparing each row percentage with each of the other row
percentages in the base row, and produces a triangular table showing all the Z-values and
their associated significance levels. This table is labeled with the text ‘Z TEST – TYPE 4’.
2. Although the test is only comparing proportions in the base row, it is necessary to
have the axis tabbed against itself because Quantum needs to know the extent of the
overlap between the different elements of the axis.
3. The Z-tests may give misleading results when the base from which proportions are
calculated is small. In this test the base should be at least 20.
4. The calculation for Z subtracts the first proportion from the second, rather than the
more usual method of subtracting the second proportion from the first.
produces:
In this section we describe a set of tests which may be used to investigate whether means
differ significantly from each other or from specified values. The statistics used are the
T and F statistics, two of the so-called ‘classical’ test statistics.
Quick Reference
To request a one-sample or paired T-test, place in the axis:
stat=t1
The one-sample T-statistic is an axis-level statistic. It may be used to test whether the
mean of a numeric variable or factor (fac=) is significantly different from zero or some
other specified value. It may also be used to test for differences between means measured
on matched samples (the paired T-test) – for example, between the means of two
variables both obtained from the same sample of respondents (see the ‘Notes’ section
below).
For example, you may wish to test whether respondents spent the same length of time per
day, on average, watching broadcast television before and after purchasing a video
recorder. To request a one-sample or paired T-test you should include a stat=t1 element in
the axis at the point at which you want the statistic displayed. To define the variable or
factor to be tested and perform the necessary statistical summations, you will need either
a fac= option on each basic count element in the axis, or an n25 element in the axis with
the inc= option.
1. The value of T will be zero if there is no difference in the data, otherwise the sign of
T will reflect the sign (direction) of the difference.
2. It is not necessary to use n12 (mean), n17 (standard deviation) or n19 (standard error)
elements in the axis as these are automatically calculated by the stat=t1 element.
However, you will probably wish to print at least the mean using n12 so that you can
see the values which are being tested by the T-statistics.
3. In a weighted run, the compiler inserts an unweighted n15 with the option nontot in
the axis so that the T-test can be calculated using unweighted figures.
4. The simplest use of the one-sample T-test is when testing whether the mean of a
variable already coded in columns of the data is zero. In this case you need only
specify the required columns on the inc= option of the n25 statement. For example:
n25;inc=c(120,122);c=c119’1’
stat=t1,One-Sample t-test
5. There may be occasions when you wish to use a one-sample T-test on values which
are not the same as those in the data. You may create these values using fac= on n01
or col elements.
6. If you wish to test whether a mean may be different from some non-zero value, you
should subtract that value from each data value. In other words, to test whether the
mean number of visits to a supermarket is equal to 2 you actually test whether the
mean of (number of visits to supermarket – 2) is equal to 0. For example:
n25;inc=c(120,122)-2;c=c119’1’
stat=t1,One-Sample T-test
7. If you wish to make a paired test between two data values, you should test whether
the difference between them is zero. For example, to make a test of the difference
between the data values in columns 120–122 and in columns 123–125 you would
write:
n25;inc=c(123,125)-c(120,122);c=c119’1’
stat=t1,Paired T-test
If the calculation of the values to be used by the T-test is more complicated than this,
you may need to write an edit to calculate the values. A simple example which has
the same effect as that shown above is:
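The sketch below assumes that columns 130 to 132 are spare; the embedded edit stores the
difference there and the n25 element then picks it up. The column positions are purely
illustrative:
#ed
c(130,132)=c(123,125)-c(120,122)
#end
n25;inc=c(130,132);c=c119’1’
stat=t1,Paired T-test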
8. If the axis being tested contains fac= and inc=, Quantum scans backwards through
the axis from the stat=t1 element and uses whichever of the two it finds first (i.e.,
whichever of fac= or inc= occurs closest to but still before the statistical element).
produces:
Two-sample T-test
Quick Reference
To request a two-sample T-test, type:
stat=t2
The two-sample T-statistic is a table-level statistic. It may be used to test whether the
means of a numeric variable or factor are the same in two separate samples or
sub-samples, or to make a number of such comparisons, pair-wise, between more than
two samples. For example, you might wish to compare the length of time per day, on
average, spent watching broadcast television by owners of video recorders, with the same
figure for non-owners.
This test is produced by a stat=t2 option on the tab statement. The column axis defines
the groups to be compared. These must be mutually exclusive. The row axis must include
a base element, a mean (n12) and a standard deviation (n17) – these require fac= options
on the axis elements or an n25 element with the inc= option.
1. This statistic calculates T-values using rows of means and standard deviations. Each
mean in the n12 row is compared against every other mean value in that row. A
triangular matrix of T-values and significance levels is produced with values for each
pair of means. It is labeled with the text ‘T TEST – TYPE 2’.
2. The column axis must define groups of respondents which are mutually exclusive –
for example, age groups or sex. If there is more than one set of mutually exclusive
elements in the axis the test will still be printed, but the comparisons between
elements which are not mutually exclusive will be meaningless and should be
ignored. For example, if the column axis contains both sex and age breakdowns, the
comparison between, say, ‘Female’ and ‘Age 18–25’ should be ignored since some
respondents may be women and aged 18–25.
3. The value of T will be zero if there is no difference in the data, otherwise the sign of
T will reflect the sign (direction) of the difference.
4. The calculation for T subtracts the first mean from the second, rather than the usual
method of subtracting the second mean from the first.
5. Elements whose cells are all zero are excluded from this test. You may suppress them
with the nz option if you wish.
6. This test uses the sum of totalizable rows and the input to the means and standard
deviation in its calculations.
7. If the axis being tested contains fac= and inc=, Quantum scans backwards through
the axis from the stat=t1 element and uses whichever of the two it finds first (i.e.,
whichever of fac= or inc= occurs closest to but still before the statistical element).
This produces:
Notice, in this example, that the column headings at the top of the main table are different
to those in the statistical table. Those at the top of the main table are defined by the g
statements in the axis, whereas those in the statistical table are taken from the col
statement. The reason for this is that the full element text, as shown on the g statements
is too long to fit into the 15 characters allocated to the statistical columns.
☞ For general information on the size and layout of statistical output, see the section
entitled “Axis-level statistics” in chapter 29.
Quick Reference
To request a one-way analysis of variance, type:
stat=anova
Being a table-level statistic, this test requires a stat=anova option on the tab statement.
The column axis defines the groups to be compared, which must be mutually exclusive.
The row axis must include a base element, a mean (n12) and a standard deviation (n17)
– these require fac= options on the axis elements or an n25 element with the inc= option.
2. The F-test is invalid if the column axis defines groups which are not mutually
exclusive.
3. Elements whose cells are all zeros are included in this test.
4. The calculation uses the sum of totalizable rows and the input to the mean and
standard deviation rather than the base and the mean and standard deviation values
themselves.
5. If the axis being tested contains fac= and inc=, Quantum scans backwards through
the axis from the stat=t1 element and uses whichever of the two it finds first (i.e.,
whichever of fac= or inc= occurs closest to but still before the statistical element).
As an example, we may use the F-test to examine more carefully the results of the
previous example, used to illustrate the two-sample T-test. The Quantum spec. is the
same as that used in the previous example, except that the tab statement becomes:
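Something along these lines, where rowaxis and colaxis stand for whatever row and column
axes the two-sample T-test example used; the only change is that stat=t2 is replaced by
stat=anova:
tab rowaxis colaxis;stat=anova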
Notice how the significance of the F-statistic allows us to be more confident about the
real significance of the differences between the means.
Quick Reference
To request a Newman-Keuls test, type:
stat=nksig_level
The standard Newman-Keuls test (as described in Winer) is a table-level statistic which
may be used as an alternative to T-tests when you wish to compare the differences
between the means of two or more samples.
The test is produced by the stat=nknn option on the tab statement, where nn is 90,
95 or 99 depending on the level at which results are required. The column axis defines
the groups to be compared. The row axis must include a base element, a mean (n12) and
a standard deviation (n17) – these require fac= options on the axis elements or an n25
element with the inc= option.
1. This statistic calculates Q-values at the 90%, 95% or 99% level, as defined on the tab
statement. A triangular matrix of Q-values is produced with values for each pair of
means. It is labeled with the text ‘NEWMAN-KEULS STATISTIC’ followed by the
level at which the values have been calculated.
2. This statistic uses the sum of totalizable rows and the input to the mean and standard
deviation rather than the base and the mean and standard deviation themselves.
3. If the axis being tested contains fac= and inc=, Quantum scans backwards through
the axis from the stat=t1 element and uses whichever of the two it finds first (i.e.,
whichever of fac= or inc= occurs closest to but still before the statistical element).
The table below uses the same row and column axes as those used for the Two-Sample
T-test (see Figure 30.7). It was created by the statement:
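A statement of roughly this form, again with rowaxis and colaxis standing for the axes used
in the two-sample T-test example; stat=nk95 requests the test at the 95% level:
tab rowaxis colaxis;stat=nk95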
30.5 Formulae
The formulae for the statistical tests in this chapter are shown below. The following
conventions have been used in these formulae:
• In the formulae for axis-level test statistics, the formula is applied separately to the
counts in each column or row, according to whether the axis containing the stat=
option is the row or column axis:
r represents the number of basic count rows from which the statistic is calculated
c represents the number of basic count columns from which the statistic is calculated
nij represents the (weighted) count in row i, column j
N, Ni, Nj represent the (weighted) bases of the table overall, row i and column j
respectively.
• A dot suffix indicates summation over the replaced index; so, for example, the
formula for a column total is:
$$n_{\cdot j} = \sum_{i=1}^{r} n_{ij}$$
• The sum of factors, mean, standard deviation, standard error and sample variance of
a row or column are calculated in exactly the same way as by the n13, n12, n17, n19
and n20 statements.
The sum of factors is given by:
$$\sum_i n_i x_i$$
The mean is given by:
$$\bar{x} = \frac{\sum_i n_i x_i}{N}$$
The standard deviation is given by:
$$s = \left( \frac{\sum_i n_i x_i^2 - \left( \sum_i n_i x_i \right)^2 / N}{N - 1} \right)^{1/2}$$
In all cases, $x_i$ represents the factor or increment associated with the ith cell.
Z-test of a single proportion against a specified value $p_0$:
$$z = \frac{p - p_0}{\left( \frac{1}{N}\, p_0 (1 - p_0) \right)^{1/2}}$$
where
$$p = \frac{n}{N}$$
Two-sample Z-test (stat=z2):
$$z = \frac{p_2 - p_1}{\left( \left( \frac{1}{N_1} + \frac{1}{N_2} \right) p (1 - p) \right)^{1/2}}$$
where
$$p_j = \frac{n_j}{N_j}$$
and
$$p = \frac{n_1 + n_2}{N_1 + N_2}$$
Z-test comparing proportions in the base row (stat=z3):
$$z = \frac{p_2 - p_1}{\left( \frac{1}{N} \left( p_1 (1 - p_1) + p_2 (1 - p_2) + 2 p_1 p_2 \right) \right)^{1/2}}$$
where
$$p_j = \frac{N_j}{N}$$
Z-test on overlapping samples (stat=z4):
$$z = \frac{p_2 - p_1}{\left( \frac{1}{N} \left( p_1 (1 - p_1) + p_2 (1 - p_2) + 2 p_1 p_2 - 2 p_{12} \right) \right)^{1/2}}$$
where
$$p_j = \frac{N_j}{N}$$
and
$$p_{ij} = \frac{n_{ij}}{N}$$
One-sample T-test (stat=t1):
$$t = \frac{\bar{x}}{se(\bar{x})}$$
Two-sample T-test (stat=t2):
$$t = \frac{\bar{x}_2 - \bar{x}_1}{\left( \frac{(N_1 - 1) s_1^2 + (N_2 - 1) s_2^2}{N_1 + N_2 - 2} \left( \frac{1}{N_1} + \frac{1}{N_2} \right) \right)^{1/2}}$$
One-way analysis of variance (stat=anova):
$$MS_B = \frac{\sum_{j=1}^{c} \frac{\left( \sum_{i=1}^{r} n_{ij} x_{ij} \right)^2}{N_j} - \frac{\left( \sum_{j=1}^{c} \sum_{i=1}^{r} n_{ij} x_{ij} \right)^2}{N}}{c - 1} = \frac{\sum_j N_j \bar{x}_j^2 - \frac{\left( \sum_j N_j \bar{x}_j \right)^2}{N}}{c - 1}$$
$$MS_W = \frac{\sum_{j=1}^{c} \sum_{i=1}^{r} n_{ij} x_{ij}^2 - \sum_{j=1}^{c} \frac{\left( \sum_{i=1}^{r} n_{ij} x_{ij} \right)^2}{N_j}}{N - c} = \frac{\sum_j SS_j - \sum_j N_j \bar{x}_j^2}{N - c} = \frac{\sum_j (N_j - 1) s_j^2}{N - c}$$
where
$$SS_j = \sum_i n_{ij} x_{ij}^2$$
$$F = \frac{MS_B}{MS_W}$$
Newman-Keuls test
The formula for this test is as given in Winer, Statistical Principles in Experimental
Design, 3rd Edition (ISBN 0007070923).
This chapter describes miscellaneous features of the tabulation section. These are the
inclusion of C programming code or edit statements in the tabulation specifications and
the sorting (ranking) of tables.
Quick Reference
To include C code in the tabulation section, type:
#c
C code
#endc
Quick Reference
To include edit statements in the tabulation section, type:
#ed
edit statements
#end
Edit statements may be embedded in the tabulation section by enclosing them in #ed and
#end statements. This can be useful when you need to do a recode in an axis but do not
want to write a full edit. For instance:
l ax01
#ed
if (c104’2’) c181=or(c151,c152,c153,c154,c155)
#end
n01First Row;c=c181’1’
This has the same effect as writing the recode as a separate edit, followed by the axis in
the tabulation section:
ed
if (c104’2’) c181=or(c151,c152,c153,c154,c155)
end
l ax01
n01First Row;c=c181’1’
✎ If the edit statements are in an axis that is not used, your program will not compile.
If a value is assigned to a variable, the variable retains that value for all other axes
unless it is reset within another axis. The exception is when using inc= statements
which are stored until the run.
Sometimes we wish rows to be arranged according to the size of the counts or values in
the cells, the largest in the first row, the second largest in the second row, and so on. The
Quantum words that control the production of sorted or ranked tables are:
sort row-wise sort (i.e., largest row first, smallest row last).
rsort row-wise sort (the same as sort).
csort column-wise sort (i.e., largest column first, smallest column last).
pcsort sort on percentages in the direction defined by sort or csort.
nosort do not sort this table.
sortcol selects the column on which to sort when sorting is row-wise. The
default is to sort on the figures in the base column.
You may place any of these keywords on the tab statement of the table which is to be
sorted, or on the a/flt/sectbeg statement, in which case all tables at that level will be sorted
(tables which are not to be sorted would then take the keyword nosort on the tab
statement).
Sorting rows
Quick Reference
To sort the rows of a table, type:
sort
Sorting is usually done on the figures in the base column. To sort on a different column,
type:
sortcol
on that element.
To define an unsorted element, axis or table in an otherwise sorted axis, table or run, type:
nosort
The following statement, in which the axis names prefer and sex are taken from the
surrounding text:
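tab prefer sex;sort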
produces a table of prefer by sex in which the product preferred by most people forms the
first row, and the product preferred by fewest people is the last row, thus:
Suds 59 10 49
Bubbles 49 11 38
New Foam 35 7 28
Sparkle 30 6 24
Glow 18 8 10
Extra Glow 9 2 7
As you can see, sorting is done using the figures in the base column of the table. By
chance, this means that the column for women is also sorted in descending order.
If you want to sort on, say, the third column of the table, just put the option sortcol on the
appropriate element in the axis:
l sex
col 10;Base;Male;%sortcol;Female
If you want an axis to be sorted every time it is used as a row axis, you may place sort on
the l statement. Alternatively, if the axis is always to be unsorted in an otherwise sorted
run, place the keyword nosort on the l statement. Both these methods are quicker and
easier than remembering to place sort/nosort on every tab statement which uses the axis.
Sorting columns
Quick Reference
To sort the columns of a table, type:
csort
To sort the columns of a table, place the keyword csort on the tab statement. For
example, a statement of the following general form, where rowaxis and colaxis stand for
the table’s row and column axes:
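tab rowaxis colaxis;csort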
Sorting percentages
Quick Reference
To sort on percentages rather than absolutes, type:
pcsort
To sort on percentages rather than absolutes, use pcsort. This is not a keyword that you
can use by itself: you must use it with sort or csort since these define whether sorting is
by rows or columns. Without one of these keywords, Quantum will not know which
direction to sort in and will therefore ignore pcsort.
The direction of sorting also determines what type of percentages will be sorted. If you
are sorting in row order with sort (i.e., vertically), Quantum will sort column (vertical)
percentages; if you are sorting in column order with csort (i.e., horizontally), Quantum
will sort row (horizontal) percentages. You cannot sort row percentages in row order or
column percentages in column order. In terms of keyword combinations, this means
Quantum will sort percentages for the following combinations only:
sort pcsort sorts column (vertical) percentages, largest row first.
csort pcsort sorts row (horizontal) percentages, largest column first.
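For example, assuming the prefer and sex axes again, the statement:
tab prefer sex;sort;pcsort
sorts the rows of the table using column percentages rather than absolute counts.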
Tables may be sorted at different levels: rows may be grouped together and sorted
internally within the group before the group as a whole is sorted with the other elements
of the axis. This is extremely useful when you are sorting tables containing nets.
All rows in a net may be sorted amongst themselves, completely separate from the rows
of any other net. Then the nets may be sorted according to the number of people in each
net. The resultant table will show the largest net first and within that, the most frequently
occurring response.
There are two ways of sorting nets. The simplest is to place the keyword netsort on the a
or l statement. This sorts nets and their component elements automatically, and indents
each net and standard element text by a fixed amount according to the level at which the
text occurs. The second method is to define the elements to be sorted as a group using the
keywords subsort and endsort to mark the start and end of each sort group. You may find
this method useful if you want to indent element texts by different amounts for each net
level or sort group.
The sections which follow deal with each method separately, but use the same sample
tables as a means of illustrating the differences and similarities between the two methods.
Quick Reference
To create a sorted table of nets, type:
netsort[=spaces_per_level]
where spaces_per_level is the number of additional spaces by which to indent element
texts at each level.
The quickest way to define an axis which will create a sorted table of nets is to place the
keyword netsort on the a or l statement (netsort is not valid on sectbeg, flt or tab
statements) and the keyword sort on the a/sectbeg/flt/tab statement. When these two
keywords are used in the same table, a net statement determines not only the level at
which the net is to be created (i.e. whether it is a top level net, a subnet, a sub-subnet, and
so on), but also the level at which the net and the elements it contains are to be sorted in
relation to the other elements in the table. This means that nets at level 1 will be sorted
and, within them, nets at level 2 will be sorted, and so on. Individual elements within a
net will be sorted too.
When we were discussing nets by themselves in section 17.6, we said that netsort
determines the number of spaces by which each net and element text is indented. This is
still the case when netsort is used in sorted tables. Texts at each level below level 1 will
be indented by 2 spaces per level, thus nets at level 2 are indented by 2 spaces (1×2
spaces); nets at level 3 are indented by 4 spaces (2×2 spaces). The elements comprising
a net are indented by an additional two spaces. You may request a different indent by
typing netsort=n, where n is the number of spaces by which to indent, instead of netsort
by itself.
To turn off indenting for a single table in a run where indenting is the default, add the
option nonetsort or netsort=0 to the l statement of the table’s row axis.
Sometimes the axis will contain rows which are not to be sorted at all. These elements
require the option nosort. If the element is part of a group, it will retain its original
position in the group even if the group later occupies a different position in the sorted
output. If the element is not part of any group, it will retain its original place regardless
of any other elements.
Let’s take a simple example to start with. We have an axis dealing with people’s opinions
of a new chocolate bar they have tried. Responses are netted under the headings Taste
and Texture, and there is also a row to gather respondents giving no answer at all. Taste
and texture are to be sorted so that the one containing the most respondents appears first.
The No Answer row is to remain as the last row of the table.
Within taste and texture the various comments are to be sorted so that the one mentioned
by most people is printed first. Each net contains a Don’t Know row which must always
be the last line of the net. Element texts are to be indented by 1 space per net/sort level.
To satisfy this specification we write:
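The axis might look something like the sketch below. The column locations and the
respondent counts shown in parentheses are purely illustrative, except for the two net totals
of 310 and 340 which are quoted later in this section; the table that uses the axis would carry
the keyword sort on its tab statement:
l opinion;netsort=1
n10Base (700)
net1Taste Observations (Net) (310)
n01Too Sweet;c=c121’1’ (150)
n01Not Sweet Enough;c=c121’2’ (120)
n01Don’t Know;c=c121’3’;nosort (40)
net1Texture Observations (Net) (340)
n01Too Hard;c=c122’1’ (180)
n01Too Soft;c=c122’2’ (130)
n01Don’t Know;c=c122’3’;nosort (30)
netend1
n01No Answer;c=c123’1’;nosort (50)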
The numbers in the parentheses are not part of the row specifications: they are the number
of respondents giving each response.
First, let’s see how each group is sorted internally. The sort is conducted on the first
column (created by the first condition in the axis). We will assume that this is the base,
so the figures in parentheses are totals.
With Taste, all elements down to the net statement for Texture are assumed to be part of
the same net and sort group. Thus, we would have:
The same method of sorting applies to the Texture group, except that this time the net and
sort group is terminated by the netend1 statement.
Then, the two groups are sorted in relation to each other. Since the net for Texture (340)
is larger than the net for Taste (310), Texture Observations is placed first, without regard
to the other values within either group. Thus, our final table is as follows:
Notice that the three elements which were specified with the nosort keyword have all
retained their original places in relation to the other elements in their groups. Notice,
also, that the texts at level 1 are not indented, whereas those at level 2 (i.e. the elements
which make up the two nets) are indented by 1 space, as requested with netsort.
Net and sort groups may be made up of any number of rows and may themselves be
subdivided into smaller groups. This is called nesting. Suppose our taste and texture nets
refer to the chocolate topping on a cake, and our axis also has comments about the body
of the cake itself. This gives us two main groups for sorting – the topping and the cake –
and within each we have two sublevels, namely the taste and the texture. This would be
specified as follows:
l taste;netsort
n10Base (142)
net1Chocolate Topping (Net) (47)
net2Taste Observations (Sub-net) (46)
n01Too Sweet;c=c121’1’ (27)
n01Not Sweet Enough;c=c121’2’ (19)
net2Texture Observations (Sub-net) (41)
n01Too Course;c=c122’1’ (20)
n01Too Fine;c=c122’2’ (21)
net1Cake (Net) (50)
net2Taste Observations (Sub-net) (48)
n01Too Sweet;c=c123’1’ (22)
n01Not Sweet Enough;c=c123’2’ (26)
net2Texture Observations (Sub-net) (44)
n01Too Course;c=c124’1’ (20)
n01Too Fine;c=c124’2’ (24)
Total
Base 142
Cake (Net) 50
Taste Observations (Sub-net) 48
Not Sweet Enough 26
Too Sweet 22
Texture Observations (Sub-net) 44
Too Fine 24
Too Course 20
Chocolate Topping (Net) 47
Taste Observations (Sub-net) 46
Too Sweet 27
Not Sweet enough 19
Texture Observations (Sub-net) 41
Too Fine 21
Too Course 20
If a net or element text is too long for one statement, you can continue it on an n33
statement; the continuation text stays with its element when the table is sorted:
net2Taste Observations on
n33the Chocolate Topping (Sub-net)
If the axis contains an ntt to create a text-only net element, the elements in the ntt group
will be sorted although the group as a whole, including the ntt element, will retain its
original position in the axis. To illustrate this, let’s add a group of miscellaneous
comments to the end of the previous axis:
l taste;netsort
n10Base
net1Chocolate Topping (Net)
net2Taste Observations (Sub-net)
n01Too Sweet;c=c121’1’
n01Not Sweet enough;c=c121’2’
net2Texture Observations (Sub-net)
n01Too Course;c=c122’1’
n01Too Fine;c=c122’2’
net1Cake (Net)
net2Taste Observations (Sub-net)
n01Too Sweet;c=c123’1’
n01Not Sweet enough;c=c123’2’
net2Texture Observations (Sub-net)
n01Too Course;c=c124’1’
n01Too Fine;c=c124’2’
netend1
n01No Mentions;c=c(121,124)$ $;nosort
ntt1Miscellaneous Mentions
n01Other topping observations;c=c121’3/&’.or.c122’3/&’
n01Other cake observations;c=c123’3/&’.or.c124’3/&’
Total
Base 142
Cake (Net) 50
Taste Observations (Sub-net) 48
Not Sweet Enough 26
Too Sweet 22
Texture Observations (Sub-net) 44
Too Fine 24
Too Course 20
Chocolate Topping (Net) 47
Taste Observations (Sub-net) 46
Too Sweet 27
Not Sweet enough 19
Texture Observations (Sub-net) 41
Too Fine 21
Too Course 20
No Mentions 80
Miscellaneous Mentions
Other cake observations 35
Other topping observations 29
Quick Reference
To sort the table in sections, define the start of each section with:
subsort
on the first element in the section. Mark the end of the section with:
endsort[=num_sections]
The second way of specifying sorted nets is to use the keywords subsort and endsort to
mark the start and end of each sort group in the axis. Although this involves you in more
work, it can be useful when you want to use different amounts of indentation for different
sort levels – for instance, to indent all second-level elements by 1 space and all third-level
elements by 2 spaces.
The differences between this example and the previous version are as follows:
1. The netsort keyword has been removed from the l statement; instead, the start and end
of each sort group are marked explicitly with subsort and endsort.
2. All indentation has been done manually by preceding each element text with a space.
Notice that there is no need to mark the top-level sort groups (i.e., the net and No Answer
elements) since these will be sorted automatically by the keyword sort on the tab
statement.
Groups defined with subsort and endsort may be nested up to a depth of seven levels of
sorting – that is, you may type in up to seven subsorts before typing an endsort to
terminate one of the groups. If one row terminates more than one group, endsort must be
entered as endsort=n where n is the number of groups terminated. If we rewrite the
specification for Figure 31.1, it becomes:
l taste
n10Base
net1Chocolate Topping (Net)
net2Taste Observations (Sub-net);subsort
n01 Too Sweet;c=c121’1’;subsort
n01 Not Sweet Enough;c=c121’2’;endsort
net2Texture Observations (Sub-net)
n01 Too Coarse;c=c122’1’;subsort
n01 Too Fine;c=c122’2’;endsort=2
net1Cake (Net)
net2Taste Observations (Sub-net);subsort
n01 Too Sweet;c=c123’1’;subsort
n01 Not Sweet Enough;c=c123’2’;endsort
net2Texture Observations (Sub-net)
n01 Too Coarse;c=c124’1’;subsort
n01 Too Fine;c=c124’2’;endsort=2
The top level of sorting is between the rows ‘Chocolate Topping’ and ‘Cake’. Within the
topping net we have two sublevels, each of which is delimited by the keywords subsort
and endsort. The row entitled ‘Too Fine’ in the Texture subnet terminates the texture
subsort as well as the sort between taste and texture observations in general, so we use
endsort=2 to indicate that we are terminating two levels of sorting.
Text-only nets with ntt work with subsort and endsort exactly the same as with netsort,
except that the elements within the net will require subsort and endsort keywords if they
are to be sorted within the net. If we indent second-level texts by 1 space and third-level
texts by 2 spaces, the spec. at Figure 31.2 becomes:
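A sketch of the rewritten axis is shown below, following the pattern of the earlier
subsort/endsort example; the exact amount of indentation given to each text is, of course, a
matter of choice:
l taste
n10Base
net1Chocolate Topping (Net)
net2 Taste Observations (Sub-net);subsort
n01  Too Sweet;c=c121’1’;subsort
n01  Not Sweet enough;c=c121’2’;endsort
net2 Texture Observations (Sub-net)
n01  Too Course;c=c122’1’;subsort
n01  Too Fine;c=c122’2’;endsort=2
net1Cake (Net)
net2 Taste Observations (Sub-net);subsort
n01  Too Sweet;c=c123’1’;subsort
n01  Not Sweet enough;c=c123’2’;endsort
net2 Texture Observations (Sub-net)
n01  Too Course;c=c124’1’;subsort
n01  Too Fine;c=c124’2’;endsort=2
netend1
n01No Mentions;c=c(121,124)$ $;nosort
ntt1Miscellaneous Mentions
n01 Other topping observations;c=c121’3/&’.or.c122’3/&’;subsort
n01 Other cake observations;c=c123’3/&’.or.c124’3/&’;endsort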
Total
Base 142
Cake (Net) 50
Taste Observations (Sub-net) 48
Not Sweet Enough 26
Too Sweet 22
Texture Observations (Sub-net) 44
Too Fine 24
Too Course 20
Chocolate Topping (Net) 47
Taste Observations (Sub-net) 46
Too Sweet 27
Not Sweet enough 19
Texture Observations (Sub-net) 41
Too Fine 21
Too Course 20
No Mentions 80
Miscellaneous Mentions
Other cake observations 35
Other topping observations 29
Figure 31.3 Sorted table of nets created with subsort and endsort
Text-only elements created with n03 statements automatically attach themselves to the
next numeric element in the axis and are sorted with that element. If there is no
subsequent numeric element, or the sort level changes (e.g., with a new net statement or
with subsort) the n03 remains unsorted. To force an n03 to be unsorted, add the option
nosort at the end of the statement.
n33 statements which define continuation text attach themselves to the element whose
text they continue and remain with that element if it is sorted.
Subheadings created with n23 statements are always unsorted unless they specifically
carry the option sort.
The two examples below show how to use an n03 to separate unsorted No Answer and
Don’t Know responses from the rest of the table. The table itself is shown at the end of
this chapter, but the overall layout that we want to achieve is this:
Efficacy Net
Comments
Freshening Sub-Net
Comments
Fragrance Net
Comments
Fragrance Intensity Sub-Net
Comments
Miscellaneous
Comments
DK/NA
The highest level is the two nets and the group of miscellaneous statements. The second
level is the comments within these groups, and the third level is the comments within the
two sub-nets.
Before we write our axis, there are some other points to bear in mind. First, the group of
miscellaneous comments and the DK/NA row are to remain in that order at the bottom of
the axis, even though the miscellaneous comments themselves are to be sorted. Second,
all subnets are to remain at the end of the main net, even though their components are to
be sorted. Third, since there will only be a few rows in the efficacy net, they are not to be
sorted at all.
Because we want a sorted table, we start by putting the keyword sort on the tab statement.
This sorts all rows which are not part of a sublevel, namely the two net rows, the
miscellaneous row, and the two rows at the end of the axis. The nets are counts of people
and are therefore created by net statements, but Miscellaneous is a heading only and is
therefore created by an ntt at the appropriate level. The rows entitled ‘Nothing Disliked’
and ‘Don’t Know/No Answer’ are to remain in their original places so they take the
option nosort.
When the table is sorted, the two top level nets are sorted according to the number of
respondents they contain, and within that the subgroups are sorted. The group of
miscellaneous comments is sorted but retains its original place in the axis (i.e., after the
two nets).
☞ The table which these axes produce is shown at the end of this chapter.
The other way of writing this axis is to use subsort and endsort in place of netsort:
l dislike
ttlTable 5: What is there about this product
ttl that you think you would dislike?
n10Total Respondents
net1EFFICACY (Net)
n33==============
n01 Doesn’t Work;c=c223’12’;subsort;nosort
n01 Other Efficacy Comments;c=c223’&’;nosort;endsort
net2 Freshening (Sub-Net)
n33--------------------
n01 Doesn’t Freshen Room;c=c223’7’;subsort;nosort
n01 Other Freshening Comments;c=c223’&’;endsort;nosort
net1FRAGRANCE (Net)
n33===============
n01 Dislike Fragrance;c=c224’125’;subsort
n01 Smells Artificial;c=c224’3’
n01 Fragrance Names not Descriptive Enough;c=c224’4’
n01 Other Fragrance Comments;c=c224’&’;nosort;endsort
net2 Fragrance Intensity (Sub-Net)
n33-----------------------------
n01 Strong Overpowering Smell;c=c227’1’;subsort
n01 Weak Fragrance/ Not Strong Enough;c=c227’2’
n01 Other Fragrance Intensity Comments;c=c227’34&’;nosort;endsort;endnet1
ntt1MISCELLANEOUS
n33==============
n01 Doesn’t Last Long;c=c225’1’;subsort
n01 Too Expensive;c=c225’7’
n01 More Expensive Than Other Products;c=c225’8’
n01 Difficult / Inconvenient to Use;c=c228’1/4’
n01 Don’t Buy This Type of Product;c=c226’12’
n01 Might be Harmful / React Chemically;c=c226’34’
n01 Allergic to This Type of Product;c=c226’678’
n01 Prefer Other Types of Products;c=c226’90-’
n01 Other Miscellaneous Comments;c=c226’&’.or.c229’1’;nosort;endsort
n03
n01Nothing Disliked;c=c232’-’;nosort
n01Don’t Know/ No Answer;c=c232’0&’;nosort
Here, the first row in each net has the keyword subsort, indicating that it and all
subsequent rows form a subgroup of the net and are to be printed in rank order beneath
it. The group of miscellaneous comments also forms a subgroup. Subgroups are
terminated by the keyword endsort.
Notice that even though the subnets Freshening and Fragrance Intensity are part of the
nets, they are dealt with as a separate group within the net (i.e., they have their own
subsort/endsort group). This is because we want to keep all comments to do with
freshening and fragrance intensity under their respective net rows. If we left them as part
of the overall Efficacy or Fragrance net, the individual comments would be sorted with
the other comments in those nets according to their size, rather than being kept together
as a subgroup.
Subtotals, totals, statistical elements and elements created using m statements are not
sorted unless you place the sort keyword on the element itself.
Sorting takes place after the cell counts for these elements have been calculated, not
before, and subtotals and totals are not recalculated after elements have been sorted. If
you are careless in the way you write your spec you could create tables that look wrong
simply because the order of elements in the axis has changed after the cell values were
calculated.
You can create a sorted table of means using just n25 and n12 statements. For example:
tab q1 banner;sort
l q1
n25;inc=c120;c=c120’1/7’
n12Mean;dec=2;sort
n25;inc=c121;c=c121’1/7’
n12Mean;dec=2;sort
n25;inc=c122;c=c122’1/7’
n12Mean;dec=2;sort
n25;inc=c123;c=c123’1/7’
n12Mean;dec=2;sort
An alternative is to use the means option on the tab statement. This example also uses
op=3 to print the rank numbers under each cell:
tab q1 banner;means;sort;op=123
l q1
n10Base
n01First mean;c=c56’1/5’;inc=c56
n01Second mean;c=c57’1/5’;inc=c57
n01Third mean;c=c58’1/5’;inc=c58
n01Fourth mean;c=c59’1/5’;inc=c59
Some people like to create sorted summary tables of means, standard deviations and
standard errors, where the row axis consists of blocks of these three elements for a
number of different columns. The table is sorted on the means, but each mean needs to
be followed by its own standard deviation. To solve this problem, create an include file
with the basic specification and then include it as many times as necessary with the
appropriate substitutions defined on the *include element. The specification in the
include file might be as follows:
n25;inc=ca00;c=ca00’1/7’
n03&txt;&unl;sort
n12 Mean;dec=2;nodsp;sort
n17 Std. dev;dec=2;nodsp;subsort
n19 Std error;dec=2;nodsp;endsort
Dislike Fragrance 26 1 13 3 31
1.2% 0.8% 7.1% 1.6% 4.6%
Smells Artificial 8 - 4 2 2
1.2% - 2.2% 1.1% 0.5%
Other Fragrance 10 2 3 3 2
Comments 1.5% 1.7% 1.6% 1.6% 1.0%
Fragrance Intensity 60 5 21 16 18
------------------- 8.7% 4.2% 11.5% 8.4% 9.2%
(Sub-Net)
---------
Strong Overpowering 44 3 19 13 9
Smell 6.4% 2.5% 10.4% 6.8% 4.6%
Other Fragrance 11 - 10 3 7
Intensity Comments 1.6% - 5.5% 1.1% 3.6%
EFFICACY (Net) 19 - 10 2 7
============== 2.8% - 5.5% 1.1% 3.6%
Doesn’t Work 14 - 6 2 6
0.9% - 3.3% 1.1% 3.1%
Freshening (Sub-Net) 5 - 4 - 1
-------------------- 0.7% - 2.2% - 0.5%
Other Freshening 1 - 1 - -
Comments 0.1% - 0.5% - -
MISCELLANEOUS
=============
Doesn’t Last Long 101 16 35 29 21
14.7% 13.4% 19.2% 15.3% 10.7%
Don’t Buy This Type of 79 - 26 5 48
Product 11.5% - 14.3% 2.6% 24.5%
Difficult / 67 13 16 20 18
Inconvenient to Use 9.8% 10.9% 8.8% 10.5% 9.2%
Too Expensive 62 8 23 11 20
9.0% 6.7% 7.1% 5.8% 10.2%
Might be Harmful / 9 1 3 1 4
React Chemically 1.3% 0.8% 1.6% 0.5% 2.0%
Other Miscellaneous 13 2 4 2 5
Comments 1.9% 1.7% 2.2% 1.1% 2.6%
Quantum provides a variety of special types of t-test for comparing pairs or groups of
rows or columns. They are:
• the t-test on column proportions
• the t-test on column means
• the paired preference test
• the significant net difference test
• the Newman-Keuls test
All tests, except the Paired Preference test, compare columns of data; the Paired
Preference test compares rows.
Quantum normally includes all basic elements in a test. Basic elements are elements
created by n01, n15, n10 or n11 statements or their counterparts on col, val, bit or fld
statements, and elements created by m statements which manipulate basic elements. All
other types of element are ignored.
If you do not want to test all basic elements in the axis you may either select the ones you
do want to test or reject those you do not. To exclude elements from a test, place the
keyword notstat on the statements that create those elements. For example:
l age
n10Base
n01Under 30;c=c112’1’;id=A
n0130 to 50;c=c112’2’;id=B
n01Over 50;c=c112’3’;id=C
n01Not answered;c=c112’ ’;notstat
✎ Although the base element is not flagged with notstat Quantum always excludes it
from all tests. If you want to test the base element you must flag it with tstat.
If you want to exclude more elements than you want to include, an alternative is to place
notstat on the l statement to set exclusion as the default for the axis, and then to flag the
elements you want to test with tstat. Here is the previous example in reverse:
l age;notstat
n10Base
n01Under 30;c=c112’1’;id=A;tstat
n0130 to 50;c=c112’2’;id=B;tstat
n01Over 50;c=c112’3’;id=C;tstat
n01Not answered;c=c112’ ’
tstat and notstat are also valid on a, flt, sectbeg and tab statements to request or suppress
t-tests for all tables at the given level. If you use notstat on a tab statement, for instance,
Quantum will ignore tstat statements under the tab statement as well as tstat options in
the column axis of that table. This can be useful when you want to produce several tables
using the same column axis but only want the t-tests in certain of those tables. You define
the axis with the necessary statements and options for the t-tests, and then request or
suppress the tests using tstat or notstat on the tab statement.
For all tests except the paired preference test, each column element to be tested must have
an id. For the paired preference test each row element to be tested requires an id. Ids are
single upper case letters assigned with id=. Quantum prints the identifiers with the
element texts and uses them in the rows or columns to mark elements which differ
significantly from the other elements in the test. In addition, when the elements form the
columns of a table, Quantum generates an extra line of column headings showing the
element ids for each column.
All row axes in tables to be tested must have a base element. Although the base itself is
not normally tested, the figures from the base are used to determine whether there are
sufficient respondents in the element for the test to produce valid results. The base figures
are also used in the calculations of some of the statistics.
Quick Reference
If you are requesting T-statistics on weighted tables place the keyword:
nsw
on the a statement.
If the table is weighted, you must weight the base element using the same weighting
matrix as the tab statement. Either specify the weighting matrix for the whole axis using
wm= on the tab statement or make sure that the base element and the tab statement have
identical wm= options.
The T-statistics need to know the sum of the squared weights for the axis. Each
respondent’s weight is squared and then added into the total for the axis. To create this
figure, place the keyword nsw on the a statement. Quantum will then insert a squared
weighting statement after each base element in every axis and before every n12 as it
compiles your spec. If the spec contains weighting matrices and T-statistics but no nsw
option on the a statement, Quantum issues a warning message at the end of the
compilation stage.
If you want to see the squared weight element in your table, you may type the nsw
statement by hand after the base element:
n10Base
nswSum of squared weights
Quick Reference
To print the effective base as a row or column in the table, use an n10, n11, n01 or n15
statement with the keyword:
effbase
Quantum creates T-statistics on weighted tables using a special base figure called the
Effective Base. The purpose of the effective base is to reduce the likelihood of the
statistics producing significant results simply because the weighting has made
adjustments to the data.
Suppose, for example, that 70% of the sample are men and 30% are women, but that the
sample is weighted to represent a population which is 48% men and 52% women. In this
case, the weighting will inflate the answers given by women and deflate the answers given
by men in order to match the population proportions. Any answer given by women will
count as greater than 1 in the tables and any answer given by men will count as less than 1.
To be precise, women’s answers count as 52/30, which is 1.733, and men’s answers count
as 48/70, which is 0.686.
The effective base takes these adjustments into account. It is calculated by dividing the
squared sum of weighting factors for an axis by the sum of the squared weighting factors;
that is:
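$$\text{effective base} = \frac{\left( \sum_i w_i \right)^2}{\sum_i w_i^2}$$
where $w_i$ is the weighting factor applied to the ith respondent in the axis.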
If the data for a particular column has both unweighted and weighted bases of 40, and
comes from 12 women and 28 men, the effective base is 32.509. The calculation that
produces this value is:
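Using the weighting factors from the example above (1.733 for each woman and 0.686 for
each man):
$$\frac{(12 \times 1.733 + 28 \times 0.686)^2}{12 \times 1.733^2 + 28 \times 0.686^2} = \frac{40.0^2}{49.2} \approx 32.5$$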
The effective base is a good criterion for judging how good your weighting is. If the
weighting is inflating the answers from a particular group by a large factor, the effective
base tends to be much smaller than the unweighted and weighted bases. The closer the
effective base is to the unweighted base, the better the weighting is.
You can print the effective base in your tables by writing an n10, n11, n01 or n15
statement with the keyword effbase:
l ax01
n10Base
n10Effective Base;effbase
In order for Quantum to report the effective base correctly, make sure that your axis is
specified as follows:
• Any condition applied to the effbase element is the same as that applied to the most
recent base element.
• The weighting matrix applied to the effbase element is the same as that applied to the
most recent base element.
If you just want to check what the effective base is you can use the debug or tstatdebug
options to produce a file of intermediate values used in the calculation of the statistics.
☞ These options are described further in the section entitled "Checking how Quantum
calculated your statistics" later in this chapter.
Elements created by effbase will only appear in Quanvert for Windows databases if you
are running a version of Quanvert for Windows later than v1.2r6. If you use an earlier
version of Quanvert on a database that was created with the effbase facility built in, the
figure that you see for the effective base will be wrong: Quanvert for Windows will
simply print the sum of the squared weights, but will not tell you directly what it has
done. The only indication that the effective base is wrong is in your out3id file, where
there will be the message ‘Error- Table n, Page p-Summ 34 not processible’.
Quick Reference
The default value for a small base, when the base will be flagged as small, is 100 and the
default value for a very small base, when no statistics will be calculated, is 30. To define
your own small base figure, place the keyword:
smallbase=number
on the a, sectbeg, flt or tab statement or on the tstat statement for the test. To define your
own very small base figure, place the keyword:
minbase=number
on the a, sectbeg, flt or tab statement or on the tstat statement for the test.
The base element plays an important part in T-statistics, not least because its size
determines whether or not the tests are run at all.
The previous section explained how Quantum uses the effective base rather than the
weighted base in the calculation of T-statistics in weighted tables. In unweighted tables
the unweighted base is the same as the effective base.
Where an axis contains more than one base element, the t-statistic is calculated on the
most recent base. In weighted runs, a new nsw element will be created for each base with
the same conditions as the main base.
All statistics are more reliable if they are based on large samples. If the base for a
T-statistic in an unweighted table, or the effective base in a weighted table, is less than
100, Quantum treats it as a small base and flags it with a single asterisk in the printed
table.
If the unweighted base/effective base for a column (row for the Paired Preference test) is
less than 30 Quantum sees it as a very small base and flags it with two asterisks and does
not carry out the test. Quantum issues the message ‘base too small (<30)’ to warn you
that a test has been suppressed.
You can specify your own values to act as small and very small bases instead of these
defaults. To define your own small base, place the option:
smallbase=number
on the a, sectbeg, flt or tab statement or on the tstat statement that requests the test. The
maximum value you can use for a small base is 255.
To define your own value for a very small base, place the option:
minbase=number
on the a, sectbeg, flt or tab statement or the tstat statement of the test itself.
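For example, the following sketch (with illustrative values) flags bases under 80 as small
and suppresses the test for bases under 25:
tstat prop;elms=ABC;smallbase=80;minbase=25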
Quantum prints a footnote for each test that it runs on a table. The footnote reports the
name of the test, the ids of the elements tested, and the risk level at which the T-statistic
was tested for significance. If the table contains small or very small bases the footnote
reminds you that * marks a small base and ** marks a very small base.
The footnote shows that all combinations of column pairs were tested, and that the results
were checked for significance at the 5% risk level – that is, the 95% confidence level.
Suppressing footnotes
Quick Reference
To suppress the footnotes that Quantum generates automatically place the keyword:
notauto
You can suppress these automatic footnotes and, if you wish, replace them with titles of
your choice. To suppress the footnotes, place the keyword:
notauto
Quick Reference
To define your own titles for tables with T-statistics use tt statements with any of the
following keywords where you require information about the T-statistic:
If you suppress Quantum’s automatic footnotes you may wish to replace them with titles
of your own. Set up your titles using tt statements, as you would for any other titles. The
position of the tt statements in the spec determines when and where the titles will be
printed. For example, a title under the tab statement will be printed on that table only
whereas one in the axis will be printed in all tables of which that axis is a part.
When Quantum generates its own automatic footnotes it can pick up the variable
information such as the name of the test or the names of the columns tested from the
instructions it holds internally about how the tables are to be created. To allow you the
same flexibility in titles that you define yourself, Quantum provides a number of special
keywords that you may use on tt statements at tab level and above. When Quantum prints
the title it replaces any special keywords with the appropriate type of information specific
to the current table.
All keywords are enclosed in pairs of double angle brackets and are as follows:
These types of titles work with one T-statistic per table only. You cannot request two
different tests on the same table and define different titles with different replaceable texts
for each test.
If you specify titles with these special texts for tables without T-statistics, Quantum treats
the replaceable text keywords as ordinary text and prints them. If you do need to define
global titles for tables with T-statistics you can flag them with the word tstat and
Quantum will print those titles only on tables with T-statistics. For example:
☞ For further information about the tstat option on tt statements, see the section
entitled "Titles for T-statistics tables only" in chapter 22.
Quick Reference
To request a special t-test, type:
tstat test_name
underneath the tab or l statement for the table or axis in which the test is required.
You request T-statistics using a tstat statement. This goes underneath the tab statement
for the table on which the statistics are required, or underneath the l statement if the test
is required whenever that axis is used. The full syntax of a tstat statement is:
tstat name;elms=elm_ids[;clevel=level_1[,level_2]][;smallbase=n][;minbase=n][;debug]
where name is the name of the test you want to run and elm_ids are the ids of the elements
you want to compare. Both these parameters are required; the others are all optional. The
sections below explain each parameter in turn.
You may run more than one test on a table by listing the appropriate tstat statements
under a single tab or l statement. The exceptions to this are a combination of ‘prop’ (t-test
on column proportions) and ‘ppt’ (paired preference test) which is not allowed because
one tests rows and the other tests columns, and a combination of ‘prop’ and ‘mean’ which
you request with the option propmean.
✎ tstat is disallowed under a, sectbeg and flt statements, after add, div, sid and und,
and before and statements.
To define the test you want to run, type the name of the test immediately after the tstat
keyword. Valid names are:
The automatic footnote tells you which test(s) were applied to each table, or you can
define your own titles using the <<test>> keyword to insert the test name.
Although you give elements identifiers and flag them with tstat or notstat, these do not
in themselves tell Quantum which elements to use with a specific test. The identifier is a
marker that you can use to refer to the element and tstat and notstat flag elements as
eligible or ineligible for inclusion in tests.
To tell Quantum which elements a test should compare, type:
elms=element_ids
on the tstat statement, separated by a semicolon from the test’s name. For example:
tstat prop;elms=ABC
tells Quantum to run a t-test on column proportions on all possible pairs of columns from
A to C; that is, on AB, AC and BC. Any other elements in the axis are ignored even if they
have identifiers and are flagged with the tstat option.
You can control more precisely which combinations of elements are compared by listing
the pairs or sets individually, separated by commas. The option:
elms=ABC,DE
tells Quantum to test all possible pairs within ABC plus the pair DE. Combinations of
elements from the two groups, such as AD or CE are ignored.
Different tests expect to test different numbers of elements. The notes given with each
test tell you the exact requirements.
The automatic footnote lists the combinations of elements tested, or you can print your
own title using the <<cols>> keyword to list the columns tested.
If the test finds that a comparison of two elements is significantly different, it prints the
element identifier of the larger value next to the cell count of the smaller value. You will
see how this looks in the sample tables shown later in this chapter.
Confidence level and risk level are two ways of looking at the same thing. The confidence
level tells you how certain you can be that any significant differences between the
columns tested are due to some external factor rather than being due to chance. The risk
level is the opposite of the confidence level and tells you how likely it is that any
differences are simply due to chance rather than being significant for some other reason.
The sum of the confidence level and the risk level is 100, so a confidence level of 95%
implies a risk level of 5%, and vice versa.
Quantum can test the significance of statistical values at a number of confidence levels.
Acceptable values in Quantum for all tests except the Newman-Keuls test are 99, 95, 90,
85, 80, 75 and 68. Acceptable values for the Newman-Keuls test are 99, 95 and 90 only.
To specify the confidence level you want for a particular test, add the option:
clevel=level_1[,level_2]
to the tstat statement. If you want to set a global confidence level for all tests, place this
option on the a statement instead.
The option requires you to specify one confidence level, but allows an optional second
level. If you specify a second level it must be lower than the first level and must be
separated from it by a comma.
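For example, a sketch requesting significance checks at both the 95% and 90% confidence
levels for a proportions test might read:
tstat prop;elms=ABCD;clevel=95,90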
✎ If you define two confidence levels you must specify the element ids with elms= all
in the same case. A mixture of upper and lower case is not allowed.
For the proportions, means and Newman-Keuls tests, Quantum checks first for
significance at the higher level and prints an upper-case letter if the value is significant
at that level. If the test fails, Quantum tests for significance at the lower level and prints
a lower-case letter if the value is significant at that level. Otherwise, no letter is printed.
The same rules apply to the paired preference test, but significance at a given level is
shown by the letter S (higher level) or s (lower level) as appropriate.
The significant net difference test uses the higher level only and silently ignores any
lower level that is set.
If you do not specify a confidence level, Quantum uses the default of 95% confidence.
The automatic footnote reports the risk level at which significance was tested. You can
specify your own titles that show the risk level or the confidence level using the options
<<risk>> and/or <<conf>> on the tt statement that creates the title.
Sometimes you may be surprised by the results of your tests and you will want to check
how Quantum arrived at a particular value. By placing the keyword debug on the tstat
statement you can have Quantum write out the intermediate figures it used to calculate
the statistics. This information is written to a file called tstat.dmp. Here is part of the file
that was created for the table shown earlier in this chapter (some long lines have been split
and printed on two lines):
The values in this report are as follows. The first section refers to all respondents in the
row being tested:
The second section refers to respondents who have overlapping data in the row being
tested:
If you want to see this type of information for all special T-statistics in a run, place the
option tstatdebug on the a statement and omit debug from the individual tstat statements.
Probability (P) values tell you how likely it is that a value returned by a statistic could
have happened by chance in a 2-tailed t-test. The general rule is that the smaller the
probability the greater the significance of the statistic. A probability value of less than
0.05 means that there is less than a 5% chance that the result is due to chance. This
probability corresponds to the 5% risk and 95% confidence levels. If the statistic has a
probability of less than 0.05 then the results are significant at the 95% confidence level.
Quantum reports probabilities as a decimal value with three decimal places, where 1.000
corresponds to 100%. A probability of 0.862 indicates that the value of the statistic is
86.2% likely to occur by chance in a 2-tailed t-test. The number of decimal places is not
affected by either dec= or decp=.
The way Quantum reports P-values varies according to the type of test you are running;
for example, with a test on column proportions Quantum writes the P-values out to a
separate file whereas with a significant net difference test the P-values are printed on the
table itself.
Quick Reference
If the t-test is to be performed on overlapping data, type:
overlap
To suppress the footnote that is automatically generated whenever the overlap formulae
are used place the keyword:
nooverlapfoot
The t-tests on column proportions and column means, the significant net difference test
and the Newman-Keuls test will work on mutually exclusive or overlapping data.
When the data is overlapping, the formulae which calculate the statistics must be
modified to take this into account. To do this, place the keyword overlap on the tab
statement of the table to be tested, or on the a, flt or sectbeg statement above it:
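For instance, with illustrative axis names:
tab tried banner;overlap
tstat prop;elms=AB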
To clarify the difference between mutually exclusive data and overlapping data consider
a question that asks respondents which of two products they tried. If you create an
element that is ‘Tried one product only’ it will be single coded so overlap is not needed.
The element ‘Tried both A and B’ will always be multicoded so you must use overlap.
Elements such as ‘Tried A or B, or A and B’ could be single coded if no one tried more
than one product, or multicoded if some respondents tried both products. You should
always use overlap with elements of this type. You should also use overlap if you are
testing combinations of overlapping and non-overlapping data.
If you do not use overlap when the data is overlapping you can obtain incorrect results.
Specifying overlap when the data is mutually exclusive does not affect the validity of the
statistic as this part of the calculation will simply produce a value of 0.0.
When you request tests on overlapping data, Quantum displays the message ‘Overlap
formulae used’ as part of the standard footnote, just above the small base/very small base
messages. You may suppress this message by placing the keyword nooverlapfoot in the
a, sectbeg, flt or tab statement. To switch the message back on after switching it off use
overlapfoot.
Quick Reference
To request a t-test on column proportions, type:
tstat prop [;propcorr]
after the tab or l statement. Use propcorr to apply a continuity correction to the numerator
of the proportion’s T-value.
This test looks at each row of the table independently and compares pairs of columns to
test whether the proportion of respondents in one column is significantly different from
the proportion in the other. For each pair in which the difference between the columns is
significant, the id of the smaller column is printed beside the figures in the larger column.
For example, if you compare columns A and B, and proportion A is found to be
significantly smaller than proportion B, the letter A will be printed beside the figures in
column B.
If two confidence levels have been defined, the id will be shown in upper case if the test
was significant at the higher level, or in lower case if it was significant at the lower level.
The t-test is a two-tailed test. You can check which side of the curve the T-statistic is on
by running the test with one of the options tstatdebug or debug and looking in the
tstat.dmp file. Negative values are on the left of the curve and positive values are on the
right.
You may run this test by itself or with a t-test on column means.
With a t-test on column proportions, either on its own or with a t-test on column means,
a continuity correction can reduce the difference between the two proportions compared.
It is applied to the numerator of the proportion’s T-value. If the difference between the
two proportions is positive, Quantum subtracts the correction value from the difference.
If the difference is negative, Quantum adds the correction value to the difference.
✎ When you use propcorr with a propmean test, the correction is applied to the
proportions part of the test only. It is ignored for the means test.
1. Insert a tstat prop statement after the tab or l statement of the table or axis to be
tested.
To run this test with a similar test for column means, use tstat propmean instead.
Any number and combination of column pairs may be specified as noted earlier in
the section entitled ’Which Elements to Compare?’ and in the examples above.
2. To request the optional continuity correction, add the keyword propcorr to the
tstat prop or tstat propmean statement.
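As a sketch combining these two steps (age and sex are placeholder axis names, and the
semicolon separator for propcorr is assumed; the wine-brand table below was produced from
a fuller specification that is not reproduced here):

tab age sex
tstat prop;propcorr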
The example table below tests the differences between the proportions of people trying various types
of wine in London and Manchester. In the comparison between columns B and D (women
in London and women in Manchester) a significant difference has been found for those
trying brand A: the letter D beside the figure for women in column B indicates that the
proportion of women in London is significantly larger than the proportion of women in
Manchester. A similarly significant difference exists between columns C and D for brand
A. The other comparisons in that row (columns A and B, and columns A and C) did not
produce a significant difference at the 90% level.
London Manchester
--------------------- ----------------------
Male Female Male Female
(A) (B) (C) (D)
Base 90 86 90 73
Brand A 8 9D 9D 2
Brand B 8 11 14A 11
Brand C 18B 9 11 10
Brand D 8 6 7 10
Brand E 21 23 24 22
Brand F 6 9 7 6
Brand G 21 19D 20D 12
None 10 14 10 27CB
______________________________________________________________
Proportions: Columns Tested (10% risk level) AB / CD / AC / BD
Quantum generates a P-value for each pair of columns tested in each row. So that all these
values may be viewed in a legible form, Quantum writes them out to a separate file,
tstat.log. This file is laid out so that there is one display column per pair of columns
tested, and one row per row tested. Headings indicate which display column refers to
each pair, and the side text for each row (truncated if necessary) is printed at the side of
each row. For example:
Lines in the T-stats log file may be a maximum of 160 characters long, and a minimum
of five characters is required for each P-value. If your table has many columns and you
have requested t-tests for many pairs of columns you may find that Quantum has
insufficient space to write all the information it needs in one line. You’ll know when this
happens because you will see error messages of the form:
This does not affect the validity of the T-statistics in the table; it merely points to a
problem in writing to the log file.
Quick Reference
To request a t-test on column means, type:
tstat mean
after the tab or l statement of the table or axis to be tested.
This test is similar to the t-test on column proportions, except that instead of comparing
proportions it compares column means which have been created with n12 statements
(this test does not work on tables of means). Where the mean in one column is
significantly different from the other mean in the pair, the id of the smaller mean is
printed next to the figures in the larger column.
If two confidence levels have been defined, the id will be shown in upper case if the test
was significant at the higher level, or in lower case if it was significant at the lower level.
1. Insert a tstat mean statement after the tab/l statement of the table/axis to be tested.
To run this test with a similar test for column proportions, use the option tstat
propmean instead.
2. Make sure that the row axis contains an n12 statement for the mean.
The Quantum programs required to run a t-test on column means, and the tables
produced, are as shown above for the test on column proportions, except that the row axis
must contain an n12. The test will place letters next to those means which are
significantly different from those with which they were compared.
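As a sketch (rating and sex are placeholder axis names; the elements of the rating axis,
which supply the values being averaged, are not shown):

tab rating sex
tstat mean

with an element such as the following somewhere in the rating axis:

n12Mean Score;dec=2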
The section above entitled P-Values for a Test on Column Proportions is also applicable
to the t-test on column means.
☞ For an alternative method of testing means, see section 32.13 on "Testing means
using the least significant difference test".
Quick Reference
To request a Newman-Keuls test, type:
tstat nkl
after the tab or l statement of the table or axis to be tested.
This test compares the differences between the means of two or more samples. For each
pair of means in which the difference is significant at the chosen level, the id of the
smaller column(s) is printed next to the figures in the larger column, as for the t-test on
column proportions in section 32.8.
If two confidence levels have been defined, the id will be shown in upper case if the test
was significant at the higher level, or in lower case if it was significant at the lower level.
• Insert a tstat nkl statement after the tab or l statement of the table or axis to be tested.
The test is applied to all rows for which the tstat flag is set. If the row is an n12, then
the calculation uses the means formulae; if not, the propns formulae are used.
The notes in the section above entitled "P-values for a T-test on column proportions" are
also applicable to the Newman-Keuls test.
Quick Reference
To request a significant net difference test, type:
tstat ntd [;options]
stat ntd, element_ids
This test deals with each row independently and compares the proportions in four
columns at a time to test whether the difference between the values in the first pair of
columns is significantly different from the difference between the values in the second
pair of columns. For example, when comparing columns A, B, C and D, the difference
between A and B will be tested against the difference between C and D to see whether the
difference between the two is significant.
1. Insert the statement tstat ntd under the tab/l statement which creates the table/axis
to be tested.
stat ntd;elms=ABCD,EFGH
If the number of sets of letters does not match the number of stat ntd statements in
the axis, the excess of either type is ignored and a warning message to this effect is
issued. Therefore, if there are three groups of columns defined with tstat but only two
stat ntd elements in the axis, only two statistics will be calculated.
2. For each group of columns to be tested, place a stat ntd statement in the column axis
to determine where the results for those columns should be printed.
☞ The stat ntd statement has the same format as those discussed in chapter 29 and
chapter 30.
3. Optionally, add decp= with a value greater than 0 to the row elements of the table
being tested. (decp= at any higher level has no effect on the way Quantum prints this
statistic.)
The number of decimal places shown for the T-statistic is controlled by decp=, for
which the default is zero. If the value of the T-statistic is less than 1 you could find
that the T-statistic is replaced by the spechar characters. Setting the number of
decimal places to one or more prevents this happening.
This example tests whether the difference between working and nonworking women who
have tried nonalcoholic wine in London is significantly different from the difference
between the same groups of women in Manchester. The column labeled ABCD shows the
value of the T-statistic for each row.
London Manchester
------------- -------------
Does Does
Not Not
TOTAL Works Work Works Work ABCD
(A) (B) (C) (D)
With significant net difference tests, the P-value is printed in the ntd column in place of
the value returned by the T-statistic. Here is the same table as shown above but with
P-values instead of the value of the T-statistic shown in the ABCD column.
London Manchester
------------- -------------
Does Does
Not Not
TOTAL Works Work Works Work ABCD
(A) (B) (C) (D)
The P-value is the probability that the difference could have occurred by chance. In this
table, any value less than or equal to 0.05 indicates a difference that is significant at the
95% confidence level or any higher level. Since the smaller the probability, the greater
the significance, the figure of 0.008, for example, indicates that the difference is
significant at the 99.2% confidence level.
Quick Reference
To request a paired preference test, type:
tstat ppt [;ppnse]
after the tab or l statement, and place:
stat ppt
in the axis at the point at which the letters should be printed. ppnse tells Quantum to print
NS or E depending on whether or not the value of the statistic is significantly different
from sqrt(2.0).
This test deals with each column independently and compares pairs of rows to see
whether the figures in each pair differ significantly from one another. If the results of the
test are significant at the selected level, the letter S is placed in that column. Thus, if the
proportion of women preferring Brand A is larger than those preferring Brand B, and the
difference between the two proportions is significant, the letter S will be printed in the
column for women.
If two confidence levels have been defined, significance at the higher level is shown by
upper-case S and significance at the lower level is shown by lower-case s.
This test is generally used in product tests where respondents test two or more products:
the rows are then the products tested.
The presence of overlapping data is irrelevant with this test, and since overlap
calculations involve more processing time and a larger nums file, you are advised not to
use overlap with this test.
1. Insert a tstat ppt statement underneath the tab/l statement which creates the
table/axis to be tested.
2. Insert a stat ppt statement in the row axis to create a row in which the significance
letters should be printed:
3. Optionally, place the option ppnse on the tstat statement to have the letters NS and E
printed in the columns where the difference is not significant at the given level. NS
indicates that the value of the statistic is significantly different from sqrt(2.0); E
indicates that the value of the statistic does not differ significantly from sqrt(2.0).
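As a sketch combining these steps (pref and region are placeholder axis names, and the
semicolon separator for ppnse is assumed):

tab pref region
tstat ppt;ppnse

with the following placed in the pref axis where the significance letters should appear:

stat ppt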
The example below tests whether the number of respondents preferring brand A is significantly
different from the number preferring brand B.
Target Total
market Respondents Respondents
age 13-29 age 13-39 age 13-55
------------- ------------- -------------
Base 206 275 303
Overall preference
Prefer Brand A (A) 104 147 166
Prefer Brand B (B) 101 126 135
No Preference 1 2 2
Paired Preference E NS NS
The Paired Preference row shows that the value of the statistic does not differ
significantly from sqrt(2.0) for the target market, whereas the difference for the other
categories does.
With paired preference tests, the P-value is printed instead of the letter indicators S, NS,
E or blank, except that the S is printed next to the value if appropriate. For example, a
paired preference test without P-values might produce the following output:
1st ppt E S E E
Prefers C 18 13 20 22 15
Prefers D 23 22 29 17 23
No preference 4 4 0 5 6
Total 45 40 49 44 45
2nd ppt S E E E
When run with the pvals option on the tstat statement, the output would be:
Prefers C 18 13 20 22 15
Prefers D 23 22 29 17 23
No preference 4 4 0 5 6
Total 45 40 49 44 45
Quick Reference
To request a least significant difference test, append the keyword:
lsd
to the tstat mean or tstat propmean statement of the table to be tested.
There is an alternative to testing means using the mean and propmean tests. This involves
the use of a least significant difference (LSD) in which the variance is computed across
all the columns defined with one elms= keyword at once rather than pairwise.
The calculation used depends on whether the sample is independent (non-overlapping)
or overlapping. In both cases, the LSD is compared against the difference between each
pair of means. When the difference is greater than the LSD it is significant and the
column is marked with a letter in the same way as for other tests. The value of the LSD
is printed directly under the mean to which it relates and, for each value, Quantum reports
the group of elements (non-overlapping groups) or pair of columns (overlapping groups)
tested.
• Append the keyword lsd to the tstat statement for your test on column means or
column means and proportions.
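As a sketch (the semicolon separator for the lsd keyword is assumed):

tstat mean;lsd

or, to test column means and proportions together:

tstat propmean;lsd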
32.14 Formulae
The formulae for the statistical tests described in this chapter are shown below.
When you ask for special T-statistics, Quantum compares the T-statistic that is calculated
from your data with a figure from a look-up table of values. If the number calculated from
the data is greater than the number from the look-up table, this is significant and you
should expect to see a T-statistic letter on your table. (The comparison uses the size of the
T-statistic regardless of its sign, so a large negative value is treated in the same way as a
large positive one.) If you have asked Quantum
to print the intermediate figures used in the calculation of the statistics, you will see that
the last two figures shown per test are the significance value from the look-up table, and
the T-statistic which is derived from the data.
The name of the look-up file is qtab.qt, and it is located in the include directory of the
main Quantum program directory. The table has seven columns and 120 rows. The
columns correspond to the various significance (confidence) levels (99, 95, 90, 80, 75,
68 and 50), and the rows correspond to the 1 to 120 degrees of freedom. The degree of
freedom used in a calculation (see below) determines which row of the table is checked.
For example, if the degree of freedom is 50 and you have requested a confidence level of
99%, Quantum will compare the T-statistic it derives from your data with the value
shown in row 50, column 1 of the table.
For degrees of freedom over 120, the normal approximation values are used. These are:
clevel 99 95 90 80 75 68 50
Value 3.64 2.77 2.33 1.81 1.63 1.41 0.00
All values in the look-up table and these normal approximation values are multiplied by
sqrt(2.0).
The T values generated by Quantum are multiplied by sqrt(2.0) and will therefore differ
from the results you will get from making the calculation manually, and from the
T-values printed in books of statistical tables. If you wish to compare the results produced
by Quantum with external figures, either divide Quantum’s figures by sqrt(2.0) or
multiply the external ones by the same value. This difference does not affect the
significance values which Quantum also reports. These are always the same as those you
will achieve through manual calculation.
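As a quick illustration of this scaling: a manual two-tailed calculation that gives T = 1.96
at the 95% level corresponds to a Quantum-reported value of 1.96 × sqrt(2.0) ≈ 2.77, which
matches the 95% normal approximation value shown above.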
For any row, and any pair of columns called 1 and 2 with absolute values r1 and r2,
proportions p1 and p2 , and weighted bases w1 and w2:
Let the sum of the squared weights for a given column, as calculated by the nsw
statement, be:
w_i^2 = \sum_k w_{ik}^2
Then, the effective base for each of the two columns is:
e_i = \frac{(w_i)^2}{\sum_k w_{ik}^2} = \frac{(w_i)^2}{w_i^2}

Let

e_0 = \frac{(w_0)^2}{w_0^2}

be the effective base for respondents common to both columns (otherwise e0 is zero), and
let r0 be the correlation co-efficient (it appears r0 = 1).
Let

S^2 = \frac{\dfrac{r_1 + r_2}{w_1 + w_2}\left(1 - \dfrac{r_1 + r_2}{w_1 + w_2}\right)}{1 - \dfrac{1}{e_1 + e_2}}
Then

T = \frac{p_1 - p_2}{\left[ S^2 \left( \dfrac{1}{e_1} + \dfrac{1}{e_2} - \dfrac{2 r_0 e_0}{e_1 e_2} \right) \right]^{1/2}}
where e1 is the effective base for the first column tested and e2 is the effective base for
the second column tested. The continuity correction (propcorr), if requested, is applied to
the numerator of the proportion’s T-value: if the difference between the two proportions
is positive, Quantum subtracts the correction value from the difference; if the difference
is negative, Quantum adds the correction value to the difference.
For any pair of columns called 1 and 2, let the sum of weights be:
w_i = \sum_k w_{ik}

t_i = \sum_k w_{ik} x_{ik}

SS_i = \sum_k w_{ik} x_{ik}^2

v_i = SS_i - \frac{t_i^2}{w_i}

As before, w_i^2 is the NSW and we have:

e_i = \frac{(w_i)^2}{w_i^2}

n_i = \frac{t_i}{w_i} = \frac{\sum_k w_{ik} x_{ik}}{\sum_k w_{ik}}

Let:

S^2 = \frac{v_1 + v_2}{\left( w_1 - \dfrac{w_1^2}{w_1} \right) + \left( w_2 - \dfrac{w_2^2}{w_2} \right)}

Then:

T = \frac{n_1 - n_2}{\left[ S^2 \left( \dfrac{1}{e_1} + \dfrac{1}{e_2} - \dfrac{2 r_0 e_0}{e_1 e_2} \right) \right]^{1/2}}
For any row, and any set of four columns called 1, 2, 3 and 4, let the proportions p_i and
the effective bases e_i be as previously described. Let p_ij represent the column proportion
in the overlap between columns i and j, and e_ij represent the effective base in the overlap.

Let:

s^2 = \sum_i \frac{p_i (1 - p_i)}{e_i} - \sum_{i \neq j} \frac{2 (p_{ij} - p_i p_j)(e_{ij} - 1)}{(e_i - 1)(e_j - 1)}

Then:

T = \frac{(p_2 - p_1) - (p_4 - p_3)}{s}
For any column and any pair of paired preference rows, let
r = \frac{-\,p_1 p_2}{\left[ p_1 p_2 (1 - p_1)(1 - p_2) \right]^{1/2}}

Let:

S^2 = \frac{\dfrac{c_1 + c_2}{2 w_0}\left(1 - \dfrac{c_1 + c_2}{2 w_0}\right)}{1 - \dfrac{1}{2 e_0}}

Then:

T = \frac{p_1 - p_2}{\left[ S^2 \cdot \dfrac{2}{e_0} (1 - r) \right]^{1/2}}
s = \left[ \frac{\sum_{i=1}^{ncols} xsq_i - \sum_{i=1}^{ncols} \dfrac{x_i^2}{n_i}}{N - ncols} \right]^{1/2}
where:
xsq i is the sum over all observations in column i of the ‘squares of x’
xi is the sum over all observations in column i of the ‘sum of x’
ni is the number of observations in column i
N is the sum of n_i over all the columns (i.e., the total
number of observations in the elms= set of columns).
\frac{1}{h} = \sum_{i=1}^{ncols} \frac{1}{n_i}

LSD = t(df) \times s \times \left( \frac{2}{h} \right)^{1/2}
The significance is tested using the normal tstat method. The LSD is then computed as
follows:
LSD = t(df) \times se
Besides reading your program and data files, Quantum will also read a number of other
files if they exist. These files are ones which you create to determine more precisely how
you want Quantum to work.
Many of the files may exist in more than one place; for example, in a subdirectory of the
Quantum installation directory and also in the project directory. This enables you to
define a set of defaults for the way Quantum will operate for all users, but to override
those defaults for specific projects.
All files are text files which you may create with an editor of your choice. Unless
otherwise stated, you must create these files in the area in which the job is to run (e.g.,
the project directory).
A file called variable may be used when the user needs to increase the number of C, T or
X variables or to set up named variables for use in the Quantum program. The file may
contain up to 300 definitions including those for C, T and X variables.
Type s after the variable’s size if you want to be able to omit the parentheses from
references to single cells in the variable.
DOS variables
HP Spectrum variable
Unix variable
VMS variable
☞ For further information on naming and creating variables, see chapter 14.
The levels file is required only if multicard records are to be processed using the levels
facility. It is an alternative to the struct statement used in ordinary runs and names the
various levels and determines which cards belong in each level.
The file starts with a number of statements defining the overall structure of the data file.
Each statement starts on a new line. The statements are as follows:
The rest of the file is concerned with defining the levels in the data. Up to nine levels are
allowed. They must be defined in order, starting with the highest level. You define this
with a statement of the form:
If no card types are given, the data for this level is read from the cards of the parent level.
DOS levels
HP Spectrum levels in the group and account in which the job is to run.
Unix levels
VMS levels
The default options file contains only an a statement defining default options for
Quantum jobs. This file may appear in various places, depending on which jobs it refers
to.
If several default files exist at different levels, the defaults in the file at the lowest level
override those defined at higher levels.
DOS %qthome%\include\defaulta.qtm
HP Spectrum defaulta in the group and account in which the job is to run.
Unix defaulta.qtm in the project directory, your home directory or
$QTHOME/include
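As an illustration only (the option values are arbitrary and simply reuse options shown
elsewhere in this manual), a defaulta.qtm file might contain a single line such as:

a;pagwid=80;side=20;op=12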
The Run Definitions file is used in table manipulation to define run ids for previous
Quantum runs. Each line consists of a Run Id, one or more spaces and the name of the
run it represents. Run Ids may be a maximum of six letters long.
JAN \barbara\quantum\run1
HP Spectrum qtvrdefs
Unix run.def. Runs are identified by their directory pathname; the
notations . and .. are valid but ~ is not. For example:
JAN [USR.QT.BARBARA.QUANTUM]RUN1
☞ For further information on row and table manipulation facilities, see chapter 27.
When data for a respondent is to be read from more than one file, a file called merges
must exist in the current directory, defining the files to be merged and the type of merge
required.
The merge type is the first item in the merges file, and it is followed by the names of the files to be
merged with the main data file. These may be entered on separate lines or on the same
line as the merge type, in which case items must be separated by semicolons. For
example:
DOS merges
HP Spectrum merges in the group and account in which the job is to run
Unix merges
VMS merges
The corrections file defines corrections to be made as the data is read and before the
statements in the main Quantum program are executed. Serial numbers in the corrections
file must appear in the same order as those in the data file.
serial ; corrections
for non-trailer card records, where serial is the record serial number and corrections are
the corrections to be made. The format for trailer card records is:
serial/n ; corrections
where n is the trailer card number found from the error listing.
For example:
DOS corrfile
HP Spectrum corrfile
Unix corrfile
VMS corrfile
☞ For further information on correcting data from a file, see section 12.5.
This file defines the parameters to be used in rim weighting calculations. It contains a
single line, as follows:
The value how_close determines how close the weighting procedure must get to the
targets you have given in order for the weights to be acceptable. The default is 0.005, but
any value between 0.0001 and 0.05 inclusive is valid. The number of iterations is the
number of times the weight calculations may be repeated in order to reach the cell targets.
The default is 12, but values of between 5 and 500 are acceptable.
The optional report=detailed requests a report of the weights calculated at each iteration.
Normally Quantum reports only the weights achieved at the final iteration.
DOS rim.par
HP Spectrum qtrimpar
Unix rim.par
VMS rim.par
Users may define aliases for Quantum statements. These are defined in a file which may
exist in the project directory or in the central Quantum directory. If an alias file exists in
both locations, the aliases in the project directory are additional to those in the central file
except where the two files contain the same alias code. In this case, the alias in the project
directory overrides the central one.
Each line of the file defines one alias: an alias code followed by a space and then the
statement(s) and options which the alias represents.
For example, if the alias is:
sd n17Standard Deviation
and your Quantum program contains the statement:
sd;dec=2
Quantum will replace the alias with the statement it represents so that the line reads:
n17Standard Deviation;dec=2
The substitution has no effect on any other parameters which are defined on the line.
If the alias represents more than one Quantum statement, the statements must be
separated by a backslash, as in the example below. You may also define text to be
substituted before the first semicolon only, by use of an asterisk in the alias definition.
For example, if the alias is:
msd n12Mean *;dec=2\n17Standard Deviation of *;dec=3
the statement:
msdScore
is expanded into:
n12Mean Score;dec=2
n17Standard Deviation of Score;dec=3
You may define your own customized texts to replace the ‘hard-wired’ text which
Quantum displays at the top of tables, describing the contents of the table. You may also
redefine or translate the default on-line edit commands, or define your own abbreviations
for them.
For table texts, each line of this file will define one text in the form:
id:text
where id is a single character from the list below defining the text to be replaced, and text
is the replacement text. The replacement text must start in the third character position of each line.
p Page
t Table
1 Absolutes
2 Column percentages
0 Row percentages
& Total percentages
8 Indices
3 Ranks
l Last absolutes
r Proportions
m Means
The example below defines German texts for page and table number, absolutes, row and
total percentages, and means:
p:Seite
t:Tabelle
1:Absolutzahlen
0:Reihenprozenten
&:Gesamtprozenten
m:Durchschnitt
If you would like to define your own names for the online edit commands, you may do
so by typing definitions of the form:
onumber:text
where number is a number identifying the online edit command, and text is the command
you wish to use. The text may contain a # sign marking the minimum number of
characters to be entered for each command. If there is no # sign, you must type the full
command. To redefine the delete command to be remove, for example, and to allow an
abbreviation of rem to be accepted, you would type:
o5:rem#ove
The numbers and defaults for each edit command are as follows:
1 s#et
2 ac#cept
3 ad#d
4 ca#ncel
5 de#lete
6 di#splay
7 e#mit
8 rt
9 rj
10 rm
11 ed#it
13 rej#ect
DOS %qthome%\include\texts.qt
HP Spectrum qtexts.h
Unix texts.qt in the project directory or $QTHOME/include/texts.qt
VMS QTHOME:[INCLUDE]TEXTS.QT
Quantum has a number of internal parameters which limit how large your jobs may be.
Most may not be changed. The exceptions are:
• the maximum number of axes per run (default 500) when runs fail with the message
‘limit of number of axes in run exceeded’.
• the maximum number of elements per axis (default 500) when runs fail with the
message ‘too many elements in axis’ or ‘too many p fields’.
• the maximum number of characters per axis (default 20,000) when runs fail with the
message ‘too many chars in axis text heap’.
• the maximum number of different inc= per run (default 600) when runs fail with the
message ‘too many inc=’s specified’.
• the number of characters reserved for storing the names of inc= variables (default
8,000) when runs fail with the message ‘inc=’s names too long in total’.
• the maximum number of different text symbolic parameters per run. The default is
15 different text symbolic parameters, but one of these is always used internally so
the usable number is, in fact, 14.
• the maximum number of named variables per run. The default is 300 named variables
per run, but since 33 of these are used internally, the true default is 267.
You may define limits in several ways, depending on whether they apply to the
installation as a whole, to a particular job, or to an individual user. To define
installation-wide limits, edit or create a file called maxima.qt containing the lines:
axes=max_axes
elms=max_elms
heap=max_chars
incs=max_incs
incheap=max_ichars
textdefs=max_params
namevars=max_vars
with the new maxima you wish to implement. You need enter only the lines whose values
you wish to change; if you want to retain the default maximum for axes per run, for
instance, omit this entry from the file.
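For instance, to raise only the limits on axes and elements per run you might create a file
containing just the two lines below (the values are illustrative):

axes=800
elms=750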
To define limits for a particular job, create the file in the same location as the Quantum
program and data files for that job.
The third method is available with systems such as Unix, Vax VMS and DOS which
recognize environment variables. On these systems, you may define your own personal
limits via the environment variables QTAXES, QTELMS, QTHEAP, QTINCS,
QTINCHEAP, QTTEXTDEFS and QTNAMEVARS. The values assigned to these
variables must be integers (whole numbers). Limits set in this way are read after any
installation or job limits and override limits set at those levels if appropriate.
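As a sketch for a Unix C shell user (the value is illustrative), a personal limit on the
number of axes per run could be set with a line such as the following in the .login file:

setenv QTAXES 1000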
If you wish to use C subroutines in your Quantum run, define them in a file called
private.c in the project directory. When Quantum compiles your program, it will
automatically compile the contents of the private.c file if it exists.
A Quantum run consists of three stages:
1. Check the syntax of the Quantum program and convert it into C code (compilation).
2. Convert the C code into the program, known as the datapass program, which will read
the data.
3. Read and process the data using the program created at step 2.
You can either run all stages automatically one after the other, or you can run a specific
stage in isolation.
Your computer may have more than one version of Quantum available, for example, a
standard version for client tables and a newer version for in-house testing. To indicate
which version you wish to use, you must assign the pathname of that version to an
environment variable called QTHOME and then add the Quantum bin directory to your
path.
On Unix systems, you define QTHOME in your .login file with a setenv statement. For
example:
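If, say, the version you want were installed under /usr/quantum/v5e (a placeholder path),
the lines in .login might read:

setenv QTHOME /usr/quantum/v5e
set path=($QTHOME/bin $path)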
On Vax machines running VMS, you define QTHOME in your .login file with a define
statement, and then add:
[QTHOME]BIN
to your path.
Under DOS, the version to use is set at the time the software is installed. For information
on how to switch between versions, see your installation instructions.
quantum This version silently deletes all temporary files created during a run
unless you include the option –k on the command line.
quantumx This version does not delete temporary files.
In the examples of commands in the rest of this section, we will use the word quantum to
mean quantum or quantumx.
At installations where automatic deletion of temporary files is not desirable, you may
find that the administrator has renamed the files so that quantumx is called quantum, and
vice versa. You should check this before you run your first job.
Quick Reference
To run a complete Quantum job, type:
If you omit the program and/or data file names, Quantum will prompt you for them as it
needs them. If you omit the name of the tables output file, Quantum will save any tables
in a file called tab_.
• run only one section of the job such as the compilation stage or the table creation
stage,
• define a run Id when you want to do more than one run in the directory,
• define the names of directories in which Quantum should look for program and data
files or create intermediate files,
• convert the Quantum program and data files into a Quanvert database.
They are:
Quantum can deal with compressed data files whose file names end with a .Z suffix. If
you have a data file of this type, there is no need to type the suffix on the command line.
Quantum always checks first for a file with the exact name you typed on the command
line. If it cannot find this file, it makes a second search for that file with a .Z suffix.
Quantum can also cope with files that start with records you wish to ignore, or in which
records are not terminated by a new-line character. We refer to these globally as
non-standard data files. If you have a file of this type, create a dummy data file and enter
the name of that file on the quantum command line.
☞ For further information on dummy data files see the section entitled "Reading
non-standard data files" in chapter 26.
Quick Reference
To compile a Quantum program, type:
quantum –c program_file
The first step in any Quantum run is to check the syntax of your Quantum specification
and to convert it into C code. We call this compilation. You can run the compilation stage
by itself by typing the quantum command with the –c option:
The compilation output file contains a listing of your program file as Quantum reads it.
If errors are found, Quantum will mark them in this file. If you do not name an output
file, Quantum will use its default compilation output file, out1.
If your Quantum program file is called run1 and you want to write the compilation output
to a file called run1.out, you would type:
The compilation creates many files, the most important of which are:
☞ For further information on the contents of these files, see chapter 36.
Quick Reference
To load the C code created by a compilation under any system other than DOS, type one
of:
quantum –l data_file
After a successful compilation, Quantum converts the C code created by the Quantum
compile into a program and, if there are no problems, reads the data. We call this program
the datapass program. You can run this stage as a separate task on non-DOS systems by
typing:
or
quantum –l data_file
This stage also creates a number of files, most of which are normally deleted at the end
of the run. The file you need to know about is:
Quick Reference
To read the data file after a previous compile and load, type:
quantum –r data_file
The datapass program reads and processes data as you defined in your Quantum program
file. Normally, this happens as an automatic extension of the load phase, but if you have
corrected errors in the data or added more data to the data file, you may rerun the datapass
without recompiling and reloading your program file. To do this, type:
quantum –r data_file
The datapass reads and processes each record separately. If you requested that data
should be separated into clean and dirty data files, or that it should be written out to
another file, Quantum will do so during this stage. Any holecounts or frequency
distributions are also created now. Finally, Quantum sets flags indicating the cells and
tables in which each record is to be included.
Quick Reference
The weighting program, weight, weights records according to the figures given in your
Quantum program file. If the run has no weighting, the weighting program is ignored.
The accumulation program, accum, builds a file containing the cell values for each table.
If your job uses row or table manipulation, Quantum runs a program called manip. This
carries out your manipulation requests and creates a second file of cell values. Note that
this file contains values for all tables whether or not they are the result of manipulation.
You cannot run the weighting, accumulation or manipulation stages in any way except as
part of a complete Quantum run.
Quantum creates the following files, amongst others, during these stages:
Quick Reference
To create tables, type:
quantum –o [program_file]
The final step in most runs is to take the cell values and use them to create tables.
Quantum reads the page and table headings and positions them as requested. If tables are
to be sorted, added or placed side by side, the relevant figures are rearranged or
combined.
To change the table layout without changing the cell counts (e.g., to print more decimal
places for percentages, or use special characters for absolute zero or rounding) you may
rerun just the compilation and output stages using the command:
quantum –o [program_file]
Files created during this phase which you should know about are:
If you want to rerun a single table only, you may run the Quantum output program by
name rather than via the Quantum shell script. Type:
qout -o tab_file -t table_num
where tab_file is the name of the file to which the table will be written and table_num is
the number of the table you wish to reprint. For example, to rerun table 10 and save it in
the file tab_10 you would type:
qout -o tab_10 -t 10
Quick Reference
To run the job in the background, append & to the end of the command line.
To run a VMS Quantum job in the background and create a log file, type:
The notes in this section do not apply to DOS Quantum since these facilities are not
available on that platform.
Quantum normally runs interactively. With large jobs, this can lock up your terminal for
a considerable time, so you may wish to use facilities provided with your operating
system to run your jobs in the background. This then frees up your terminal for other uses.
When you run jobs in the background, they still write messages to your screen unless you
redirect them to a log file. Quantum provides for this on the systems which need it with
the –l option. This writes any messages which would normally appear on your screen into
a file called log instead. You use it on the quantum command in addition to any other
options required for the job. For example, to run a complete job in the background under
Unix, you might type:
✎ On some systems your system manager may prefer you to run large jobs via the
batch system.
Quick Reference
To run more than one job in a directory, assign a unique suffix to each run by typing:
You may run more than one job in a directory without overwriting existing files by
assigning a unique suffix to each run. All files created during this run will have names
which end with a dot and the given string. For example:
File names that already contain a dot will not have a suffix appended. If your run creates
clean and dirty data files, these will retain their original names, clean.q and dirty.q.
We advise you to avoid a suffix of Z since this is the suffix assigned to compressed files
and it may lead to confusion if compressed files also exist.
Quick Reference
To create intermediate files in a directory other than the project directory, type:
–td directory_name
on the quantum command line.
Quantum can create its temporary work files in a directory other than that in which the
job is running. The directory is named using the option –td on the command line; for
example, specifying –td temp tells Quantum to create temporary files in a subdirectory
called temp in the project directory.
Creating temporary files in a different directory is one way of improving the performance
of large jobs running under DOS. When the number of files associated with a job rises
above 500, you’ll find that the job runs more quickly if the temporary files are created in
a different directory. You’ll also find it more convenient to scan directories’ contents
when the number of files in each one is reduced.
You may also find that using –td when creating a Quanvert database helps to keep the
project directory clean of unwanted files. It is also useful if you need to do multiple
Quantum runs to create the database. As long as you use a different temporary directory
for each run, you can then combine the directories with qvmerge to create the Quanvert
database.
☞ For information on creating Quanvert databases, see chapter 41 and chapter 42.
Quick Reference
To read program, data or include files from a directory other than the one in which the
program is being run, and to create permanent files such as report files in that same
directory, type:
–pd directory_name
on the quantum command line.
Quantum normally reads its program, data and include files from the directory in which
you are running the program, and creates permanent output files such as print or report
files in that directory. If you want to use a different directory, define it on the command
line with the option –pd, followed by the pathname of the directory.
The exceptions are filedef and include with absolute pathnames. In these cases Quantum
uses the directory named in the pathname.
The individual programs which make up DOS Quantum are able to use extended
memory. This means that Quantum can access all the memory available on your PC and
not just the 640K which is always available to all programs. In Quantum terms, this
allows you to compile larger jobs, define larger axes and produce more and bigger tables.
Extended memory does not, by itself, allow you to create or run bigger datapass programs
(qtm_ex_.exe). To do this, you will need a 32-bit compiler such as the Watcom compiler
described below.
Quantum’s extended memory capabilities do not affect the Quantum datapass program.
Quantum generates this program by loading the C code that the Quantum compile phase
created from your program file. Whether the datapass program can access all the
computer’s memory or just 640K depends on what program you use to load the C code.
The default is to use the Microsoft C compiler. This is a 16-bit compiler which creates a
16-bit datapass program. 16-bit programs can access 640K of memory only.
Quantum has been written so that you have an option to load the C code with a 32-bit
compiler. This creates a 32-bit datapass program which can access all the memory
available on your PC. The 32-bit C compiler that Quantum recognizes is the Watcom C
compiler.
✎ The Watcom compiler is not distributed as part of Quantum: it is something that you
may purchase separately if you wish. You do not have to buy the Watcom compiler
if you do not want to. If you can do all your work using the 16-bit datapass program
created by the standard Microsoft C compiler, you may do so. Quantum will still
work for you as described in this manual.
If you are interested in using this facility, please ensure that you read the notes in the
sections below.
Quick Reference
If you install the Watcom compiler and you want to be able to use it, you must define its
location with the environment variable watcom. For example, if you install the Watcom
compiler in the directory c:\watcom, you must add the line:
set watcom=c:\watcom
to your autoexec.bat file. If this line is missing, Quantum will not know where to find the
compiler.
emm386 is the extended memory manager supplied with DOS and Windows. The
emm386.sys file supplied with DOS is incompatible with Watcom. If you use Watcom
with this extended memory manager it will not work.
The Windows version of emm386.sys is compatible with Watcom. If you have this
version of emm386 installed and you want to use the Watcom compiler, make sure you
have an entry of the form shown below in your config.sys file:
In this example, c:\windows is the name of the directory in which you have installed
windows. If you have installed it in a different place, the path name in the line will be
different. The other parameters on the line may vary. If you already have a statement of
this type in your config.sys file, leave these parameters as you have set them. If not, we
suggest that you use the values shown above.
☞ For further details on other values for these parameters, see your Windows
documentation.
If your config.sys file already contains a windows device= entry containing other
parameters, you must make sure that those parameters come at the end of the line after
256 (or whatever value you already have set).
✎ Using the Windows version of the extended memory manager does not mean that
you must run Quantum through Windows.
In order to run Quantum from a DOS Window you must have a computer that can
perform floating point arithmetic. Normally, this requires a maths co-processor fitted into
your PC, but if you do not have one you can have Windows load an additional driver file
when it starts up instead. You can check whether you need to do this by running a
Quantum job and checking for messages issued by the compiler, qcom. If you see the
message ‘Error: Floating point malfunction’ your machine does not have a maths
co-processor and you will need to load the driver file.
The driver file is called wemu387.386 and it is installed in the directory \qtime\bin as part
of the Quantum installation procedure. To use this driver, add the line:
device=wemu387.386
to the 386 section of your Windows system.ini file. (If you are unsure about where to add
this line, look for the line [386Enh] and insert the new line beneath it.)
No XMS limit
Windows normally limits the amount of extended memory for applications running in
DOS windows to 1,024K. This is insufficient for the C compiler that Quantum uses,
so you’ll need to change your configuration so that this limit is removed. The instructions
that follow explain how to do this.
2. From the File menu select Properties and make a note of the filename shown in the
Command Line box. This will normally be DOSPRMPT.PIF.
4. Go to the Main program group and double click on the PifEdit icon to run the PIF
editor.
5. From the File menu select Open and open the file whose name you noted at step 2.
6. Look at the box labeled XMS KB and check whether it is set to −1. If not, overtype
the current value with −1, and then select File/Save to save your change.
7. From the File menu select Exit to leave the PIF editor.
A Quantum run consists of three stages:
1. Check the syntax of the Quantum program and convert it into C code (compilation).
2. Convert the C code into the program, known as the datapass program, which will read
the data.
3. Read and process the data using the program created at step 2.
You can either run all stages automatically one after the other, or you can run a specific
stage in isolation.
Your computer may have more than one version of Quantum available, for example, a
standard version for client tables and a newer version for in-house testing. To indicate
which version you wish to use, you must assign the group and account of that version to
an environment variable called QTHOME. For example:
Quick Reference
To run a complete Quantum job, type:
If you omit the program and/or data file names, Quantum will prompt you for them as it
needs them.
• run only one section of the job such as the compilation stage or the table creation
stage,
They are:
Quantum can cope with files that start with records you wish to ignore, or in which
records are not terminated by a new-line character. We refer to these globally as
non-standard data files. If you have a file of this type, create a dummy data file and enter
the name of that file on the quantum command line.
☞ For further information on dummy data files see the section entitled "Reading
non-standard data files" in chapter 26.
If you make changes to your program and then rerun it, Quantum will overwrite any
existing output files with the new versions. If you want to run more than one job in the
same account and group, you will need to assign a unique suffix to the files created for
each job to prevent files from one job being overwritten by those of another. Quantum
prompts you for a suffix with the words:
at the start of each run. Press the return key if you don’t want a suffix.
Quick Reference
To compile a Quantum program, type:
quantum –c program_file
The first step in any Quantum run is to check the syntax of your Quantum specification
and to convert it into C code. We call this compilation. You can run the compilation stage
by itself by typing the quantum command with the –c option:
quantum –c program_file
The compilation creates many files, the most important of which is:
out1 the program file listing made as the program is checked. If errors
are found, Quantum will mark them in this file.
☞ For further information on the contents of this file, see section 36.1.
Quick Reference
To load the C code created by a compilation, type:
quantum –l data_file
After a successful compilation, Quantum converts the C code created by the Quantum
compile into a program and, if there are no problems, reads the data. We call this program
the datapass program. You can run this stage as a separate task by typing:
quantum –l data_file
This stage also creates a number of files, most of which are normally deleted at the end
of the run. The file you need to know about is:
Quick Reference
To read the data file after a previous compile and load, type:
quantum –r data_file
The datapass program reads and processes data as you defined in your Quantum program
file. Normally, this happens as an automatic extension of the load phase, but if you have
corrected errors in the data or added more data to the data file, you may rerun the datapass
without recompiling and reloading your program file. To do this, type:
quantum –r data_file
The datapass reads and processes each record separately. If you requested that data
should be separated into clean and dirty data files, or that it should be written out to
another file, Quantum will do so during this stage. Any holecounts or frequency
distributions are also created now. Finally, Quantum sets flags indicating the cells and
tables in which each record is to be included.
Quick Reference
The weighting program weights records according to the figures given in your Quantum
program file. If the run has no weighting, the weighting program is ignored.
The accumulation program builds a file containing the cell values for each table.
If your job uses row or table manipulation, Quantum runs the manipulation program. This
carries out your manipulation requests and creates a second file of cell values. Note that
this file contains values for all tables whether or not they are the result of manipulation.
You cannot run the weighting, accumulation or manipulation stages in any way except as
part of a complete Quantum run.
Quantum creates the following files, amongst others, during these stages:
Quick Reference
To create tables, type:
quantum –o [program_file]
The final step in most runs is to take the cell values and use them to create tables.
Quantum reads the page and table headings and positions them as requested. If tables are
to be sorted, added or placed side by side, the relevant figures are rearranged or
combined.
To change the table layout without changing the cell counts (e.g., to print more decimal
places for percentages, or use special characters for absolute zero or rounding) you may
rerun just the compilation and output stages using the command:
quantum –o [program_file]
Files created during this phase which you should know about are:
If you want to rerun a single table only, you may run the Quantum output program by
name rather than via the Quantum shell script. Type:
qout -o tab_file -t table_num
where tab_file is the name of the file to which the table will be written and table_num is
the number of the table you wish to reprint. For example, to rerun table 10 and save it in
the file tab_10 you would type:
qout -o tab_10 -t 10
Quick Reference
The table of contents file is a list of the tables which have been produced in a run. There
is an example of one, produced for the sample tables in chapter 24, at the end of chapter
37. Normally, this file is produced automatically whenever Quantum creates tables, but
you may also create it separately with the tabcon command.
Quantum creates a number of files at each stage of the run. Many are intermediate files
which are used by later stages of the run and which Quantum may delete automatically
at the end of the run, others are files which may be of interest to you.
In this chapter we will look at the more important of these files and will explain when
they are created and what they contain.
The compilation stage of a Quantum run checks the syntax of your program file and, if
it is correct, converts it into statements in the C programming language. As it reads your
program, Quantum writes the statements to a compilation listing file.
If Quantum finds errors, it prints messages explaining the error at the point at which it
finds the error.
Here is a short Quantum program and the out1 generated from it. The program is:
struct;read=2;ser=c(1,4);crd=c5;req=1;rep=2,3
ed
count c(101,180)
end
a;pagwid=80;side=20;dsp;op=12
tab age sex
l sex
col 110;Base;Male;Female
l age
val c(111,112);Base;i;18-24;25-44;45-64;65+
The file starts with the maximum values to be assigned to various parameters which
control the run. These are read from the maxima.qt file; the values shown here are the
defaults.
Each time Quantum reads a tab statement, it increments its count of tables by 1 and prints
the table number. (These are useful if you want to refer to tables in a manipulation run.)
At the end of the run Quantum prints a set of compilation statistics summarizing what it
has read and what it will be required to do in later stages of the run.
Using maxima: 500 axes, 500 elements, 20000 heapsize, 600 incs,
8000 incheap
struct;read=2;ser=c(1,4);crd=c5;req=1;rep=2,3
ed
count c(101,180)
end
a;pagwid=80;side=20;dsp;op=12
tab age sex
000001
l sex
**** sex ****
col 110;Base;Male;Female
l age
**** age ****
val c(111,112);Base;i;18-24;25-44;45-64;65+
********** end of run **********
This file is created by the compilation stage of the Quantum run if the run contains a
tabulation section. It reports the columns and codes used in all non-ignored axes (i.e.,
those which are used to form tables). The format of each line is:
Columns Contents
1–8 axis name, left-justified
9 blank
10–12 n00 if condition was on n00, otherwise blank
13 blank
14–19 first/only column in the condition, right-justified
20–21 blank
22–27 second column in the condition, if any, right-justified, otherwise blank
28–29 blank
30+ either the words ‘all punches’ or space-separated codes, with a blank
represented by the letter B. Codes are listed in the order they are
mentioned in the axis
l sex
col 110;Base;Male;Female
l age
val c(111,112);Base;i;18-24;25-44;45-64;65+
sex 110 1 2
age 111 112 all punches
The column and code allocation file is called colmap on all systems.
The Quantum datapass creates files of clean and dirty data when the edit section contains
a split statement. Correct records are written to the clean data file and records with errors
are written to the dirty data file.
When Quantum reads the data and finds a record which fails a write or require statement
in the edit, it writes the record out to a text file for you to check. For example:
1 in file
----+----1----+----2-- ... --9----+----0
columns 1 – 100 are |12345
write
2 in file
----+----1----+----2-- ... --9----+----0
columns 1 – 100 are |23456
write
The file ends with a summary of the errors found and a listing of any T variables with
non-zero values. This information is also written to the sorted summary of errors file
which is documented later in this chapter.
The Quantum datapass creates a holecount file if the edit section contains a count
statement.
DOS hct_
HP Spectrum qqhct
UNIX hct_
VMS hct_
The Quantum datapass creates a frequency distribution file if the edit section contains a
list statement.
DOS lst_
HP Spectrum qqlst
UNIX lst_
VMS lst_
The Quantum datapass creates a summary of errors found during the datapass. It shows
the number of records read, accepted and rejected, and then lists the number of records
written out by write and require statements. A separate line is printed for each such
statement. For example, the program:
ed
r b c15
if (films0 .gt. 0) write $bad film code$
produces a summary that begins:
15 records
15 (100%) records accepted
0 ( 0%) records rejected
DOS sum_
HP Spectrum qqsum
UNIX sum_
VMS sum_
If your program contains a require statement that writes records out as a data file, the
datapass will write them to a file called:
DOS punchout.q
UNIX punchout.q
VMS punchout.q
If your run contains weighting, the weighting program will generate a report of the
weights for each cell in each matrix. For example, if the weighting matrix defines two
cells with targets of 100.00 and 125.00, and there are 20 respondents in each cell, the
weighting report will be as follows:
Summary of Weighting
Matrix 1
1 Given 100.00 125.00
Counts 20.0 20.0
Weights 5.000 6.250
Quantum displays these statistics on the screen and prints them beneath the matrix in the
weighting report file.
If the run is weighted using rim weighting, the file contains a set of tables showing, for
each cell in the matrix:
If Pj is the preweight for person j, and Rj is the rim weight for person j, the Rim
Weighting Efficiency is:
\frac{100.0 \left( \sum_j P_j R_j \right)^2}{\sum_j P_j \; \sum_j P_j R_j^2}
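As a simple check of this formula: if every one of n respondents has a preweight of 1 and a
rim weight of 1, the numerator is 100.0 × n² and the denominator is n × n, so the efficiency
is 100%, as you would expect for a sample that needs no reweighting.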
If the data for many respondents needs to be weighted heavily up or down, the
efficiency percentage will be low. The greater the percentage, the better balanced
the sample.
Normally Quantum reports the rim weights as they are at the final iteration. If requested
to do so, Quantum will report the rim weights it calculates at every iteration separately.
DOS weightrp
HP Spectrum wghtrp
UNIX weightrp
VMS weightrp
The accumulation stage of a Quantum run calculates the values to be printed in each cell
of each table. If the run contains row or table manipulation statements, the manipulation
program creates a second file of cell counts containing manipulated figures where
necessary. The second file, if it exists, contains cell counts for each cell of every table,
and supersedes the first file of cell counts.
DOS tab_
HP Spectrum qqtab
UNIX tab_
VMS tab_
This file is created by the output stage of the run and contains a listing of the number of
pages and tables printed, and the number of tables suppressed.
Quantum creates graphics files during the output stage if the program contains the
keyword graph= on the a, sectbeg, flt or tab statements. Files are named with a standard
name and a variable number. This number is the count of graphics files created; the first
one is number 1, the second is number 2, and so on. In the list below, the letter n
represents the graph number.
DOS tabn.syl
HP Spectrum qrsyln
UNIX tabn.syl
VMS tabn.syl
If the project directory contains a file called private.c containing C subroutines, Quantum
will create a compiled version of the code in a file called private.o. This is a temporary
file which Quantum usually removes at the end of the run.
Quantum creates a number of intermediate files which it usually deletes at the end of each
run. You will see these files in the project directory if you use the version of Quantum
that does not delete intermediate files (installed as quantumx), or you use the –k option
on the standard quantum command.
The table of contents is a list of the tables which have been produced in a run. There is
an example of one, produced for the sample tables in chapter 24, at the end of this chapter.
Quantum on the HP Spectrum creates a table of contents at the end of any run which
produces tables. You can also create it separately by typing:
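tabcon -o tofc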
The table of contents program uses intermediate files generated during the Quantum run
which created the tables. Normally, Quantum deletes these files at the end of the run. If
you want a table of contents you must keep these files. To do this, either use the version
of Quantum which does not delete intermediate files, or use the –k option as part of your
standard Quantum command.
The sample table of contents at the end of this chapter shows the default layout and
content produced by the table of contents program. To obtain a table of contents like this,
type:
tabcon -o tofc
There are a number of other parameters you can use with tabcon to control how the table
of contents is formatted and printed, besides –o. The full list of options is:
The format file determines the layout for the table of contents and the type of information
it will contain. It consists of up to four statement types labeled a, tt, ord and sel
respectively. If used, these statements must appear in this order. The format file may
contain only one each of the a, ord and sel statements, and up to 32 tt statements. It may
not contain any blank lines. If you want to use blank lines to improve readability, type a
space or press the tab key before pressing Return.
The a statement
a;options
where options is a list of keywords from the list below separated by semicolons:
basettl print table titles starting with Base under the base column rather
than under the title column
date print the date
delim draw a line between entries
dsp print a blank line between entries
index=n determines which titles to print in the table of contents. Values for
this keyword are:
n10 print the first n10 statement in the row axis under the base column.
newp start a new page whenever the table numbering is restarted at 1.
page print the page number of the current table of contents page.
paglen=n set the page length to n lines.
pagwid=n set the page width to n characters.
roman print page numbers in roman numerals.
section print all section and filter titles associated with a table (i.e., all titles
between level 0 and the level of the table titles) as section headings
within the table of contents. Section headings are printed the first
time they are defined and then each time they change. This is the
default.
The following options may be preceded by no to turn off the defaults which they define:
a;pagwid=132;paglen=66;nodsp;nodate;page;nodelim;n10;basettl;noroman
This statement produces a table of contents with:
• single spacing
• no date
• page numbers printed in arabic numerals (i.e., 1,2,3) on each contents page
• the first n10 element in the row axis is printed in the base column of the contents
• base titles starting with the word Base are printed in the base column.
The tt statement
tt statements define titles to be printed at the top of each page of the table of contents. The
format of these statements is:
ttxtext
where x is a letter defining the position of the text in the line. It may be either l for left
justification, c for centered, or r for right justification.
You may define 32 lines of titles. If you do not define any, the table of contents will be
labeled with the main title from the Quantum run itself.
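For example, the following statements (the texts are illustrative) define a centered main heading and a right-justified subheading for the top of each contents page:
ttcLondon Shopping Survey
ttrTable of Contents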
The ord statement defines the content and layout of each line in the table of contents. Its
format is:
ord;options
where options is a list of keywords from the list below separated by semicolons:
blank=n print n blank spaces between this and the next column. Use this
keyword as many times as you need.
base[=n] print the table base in n spaces.
basettl[=n] print base titles in n spaces. This keyword is overridden by non10
or nobasettl on the a statement.
page[=n] print the table’s page number in n spaces.
table[=n] print the table number in n spaces.
title[=n] print, in n spaces, any titles selected by the sel statement.
You define the line layout by typing keywords from this list in the order you want the
items to appear in the line. If, for example, you want the line to contain the page number
in a field five spaces wide, the table number in a field 3 columns wide, and the table title,
and for each field to be separated by three blank spaces, you would type:
ord;page=5;blank=3;table=3;blank=3;title
If your ord statement includes both title and basettl, but only one is defined with a column
width, the other title will be printed in whatever space remains on the line. If both options
are present without column widths, tabcon takes the amount of space remaining after
other columns have been allocated and divides it equally between the two sets of titles.
ord;page=4;blank=1;table=5;blank=1;title;blank=1;basettl;blank=1;base=6
This prints the page number, the table number, the table titles, the base titles and the base
for each table, with one blank space between each item.
If you reduce the page width on the a statement but do not define a new ord statement,
tabcon automatically adjusts the program defaults so that they fit the new page width.
The sel statement determines what types of titles are printed. Its format is:
sel;options
where options is a list of keywords separated by semicolons. For example, the statement:
sel;flt;tab;side
prints titles found under flt and tab statements, and titles defined in row axes.
Titles starting with the word base are normally printed in the base column. You may force
tabcon to treat these titles in the same way as other titles by placing nobasettl on the a
statement.
tabcon has default settings built into it, and if these are satisfactory you need not use a
format file at all. If you want a different layout for some tables, or you want to include
more or different information about each table, you need to create a format file. If you
call the file tc.def and create it in one of a list of directories that tabcon searches
automatically, there is no need to name the file on the tabcon command line. Where you
create tc.def depends on which tables it applies to.
If you want all tables of contents to look the same (i.e., you have a house style), then
create the file in the main Quantum include subdirectory:
DOS %qthome%\include\tc.def
UNIX $QTHOME/include/tc.def
VMS QTHOME:[INCLUDE]TC.DEF
If you have a style which you always use for your jobs, create the file in your home
directory.
If the layout or content is unique to one job, create the file in the project directory.
You can also create format files with other names and in other directories. This is
particularly useful if you do work for a number of clients who each have different
requirements for their tables of contents. If you create a separate file for each client and
keep it in, say, your home directory, you can call up the file for an individual client by
naming it on the tabcon command line with –f. For example:
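tabcon -f clientA.def -o tofc
Here clientA.def stands for whatever name you have given to that client’s format file.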
Since it allows many format files to exist, tabcon always searches for them in a fixed
order:
1. internal defaults
2. main Quantum include directory
3. your home directory
4. project directory
5. file named with –f
tabcon reports the names of the format files it has used and the order in which it has used
them as it runs.
In the example below, tabcon has used the installation format file and the project format
file only:
Sometimes a statement in a file at a lower level overrides the same statement in a file at a
higher level; at other times the information in the lower level file is additional to that in
the higher level file. The table below shows what to expect for each format statement:
a new keywords at lower levels are additional to those at higher levels. If the
same keyword is present with different values at different levels, the lowest
level overrides the higher levels.
tt lowest level is additional to higher levels.
ord lowest level overrides higher levels.
sel lowest level overrides higher levels.
--------------------------------------------------------------------------------------------------------------------------------------------
2 2 Q7. Have you visited the Museum before? All Respondents 605
Q8. If so, number of previous visits excluding
this one
----------------------------------------------------------------------------------------------------------------------------------------
3 3 Q12. Have you visited any other museum/art All Respondents 605
gallery before today
and/or do you intend to visit any others?
Q13. Museum/Art Galleries visited/intended to All who visited other museums before today
visit
and/or intend to visit others
----------------------------------------------------------------------------------------------------------------------------------------
4 4 Q1. How long have you been in the Museum today? All Leaving Museum 301
Q2. Was your stay longer/shorter than intended? All Leaving Museum
--------------------------------------------------------------------------------------------------------------------------------------------
6 5 Q3. What do you remember seeing? All Leaving Museum 301
--------------------------------------------------------------------------------------------------------------------------------------------
8 6 Q7. How did you find your way around the Museum? All Leaving Museum 301
--------------------------------------------------------------------------------------------------------------------------------------------
9 7 Q8. Could signposting be improved? All Leaving Museum 301
Q9. How do you think it might be improved? All Leaving Museum
All who think signposting could be improved
--------------------------------------------------------------------------------------------------------------------------------------------
If you have a laser printer which recognizes the PostScript language, you can produce
Quantum tables in this format by running a postprocessor, pstab, after the standard
Quantum run. This provides you with:
• more flexible formatting capabilities for row text and column headings
The Quantum commands required for these facilities are described below.
The fonts which are used for printing Quantum tables on a laser printer are usually
proportionally spaced. This means that each character is printed in the minimum amount
of space required; thus, the letter ‘i’ takes up less space on a line than the letter ‘m’. In
axes where you define the column headings with g statements, or where you lay out the
row text in a particular way, the use of proportional fonts means that the printed output
will not look quite the same as the layouts in your Quantum program. It is therefore
necessary to insert some additional characters in these texts to define how they should
appear on the laser-printed output.
Quick Reference
To generate tables in PostScript format, run a standard Quantum tabulation job and then
type:
pstab [–x] [–d] [–s] [–f font_file] [–o output_file] [–p printer] [–t tabcon_file]
To print output on a PostScript printer, first run the job as usual to create a tables file.
Then run the program pstab to format and print the tables on the laser printer:
pstab [–x] [–d] [–s] [–f font_file] [–o output_file] [–p printer] [–t tabcon_file]
☞ For further information, see the section entitled "Fonts and logos" below.
–o output-file save the PostScript output in the named file. The file may then be
printed on a PostScript printer using the lpr command.
–p printer print the output on the named printer. (At Quantime the default is
the Agfa printer).
–t tabcon-file take the format for the table of contents from the named file. pstab
automatically generates a table of contents file. This will be printed
after the last table.
✎ pstab uses many of the intermediate files produced by the normal Quantum run. To
keep these files for use with pstab, either use the version of Quantum which does not
delete intermediate files or include the option –k on the standard Quantum
command line. Also, do not clean the directory in any way after the run has finished.
Quick Reference
To determine the justification of column headings above columns, use the characters:
Column headings which are generated automatically without the use of g and p
statements are laid out so that the table extends across the full width of the page. Long
texts are folded as necessary to create multi-line headings right-justified above the
numbers in each column.
The characters { } and ^ mark the start of a column and ~ marks the end. Any text
between these characters is justified according to the character which precedes them. For
example:
l color
col 32;Base;Red;Green;Yellow
g} Base~ } Red~ } Green~ } Yellow~
p x x x x
requests that the column headings should be right-justified in the space between the } and
~ characters – that is, the rightmost character of the column heading should be printed
immediately above the rightmost digit in that column, as specified by the p statement.
Without the ~, the text would be right-justified between the } signs before and after the
column text.
Similarly,
l color
col 32;Base;Red;Green;Yellow
g{ Base{ Red{ Green{ Yellow~
p x x x x
requests that each column heading should be left-justified in the space between the { signs.
l color
col 32;Base;Red;Green;Yellow
g^ Color Preferred ~
g} Base^ Red^ Green^ Yellow~
p x x x x
indicates that the Base text should be right-justified while the remaining column texts
should be printed centrally above each column. The overall axis heading will be centered
above the column headings. Right-justified text above columns generally looks best,
particularly for the lowest level of headings (i.e., the colors in our examples), so you’ll
probably use the } character in most of your headings.
You will notice from these examples that the column heading is set out exactly as it has
always been; the only difference is the presence of the } and ~ characters at the start of
each column where you would normally have typed a space. The position of the special
characters is important since they determine the space in which the column headings will
be justified. In our examples we’ve placed the special characters immediately to the left
of the cell marker on the p statement so that the text is justified between the end of the
previous column and the end of the current column. If you place the special characters in
other positions, the text will still be justified between those characters even if this means
that it no longer lines up with the columns themselves.
If you type text on g statements but omit the layout control characters, the tables will
contain blank lines instead of column headings. Control characters do not affect your
tables when they are formatted for printing on a non-PostScript printer (i.e., you run
quantum but not pstab). The column headings retain their correct layout with the special
characters being replaced by spaces.
In tables with a large number of column headings, it may occasionally happen that there
is not enough room on the page to fit the individual column texts in the space allocated
to each column. For instance, where you ask for a long word to be centered across a
column, the column may not be wide enough to print the text in the font you are using. If
this happens, Quantum will squash the characters until the text fits in the space available.
Quick Reference
To define your own special characters create a file called qtform containing one line of
just six characters (blank counts as a character). The first character is the replacement for
the default ~ character, the second character is the replacement for the default ^ character,
and so on.
Alternatively, you may define the six characters using the variable QTFORM.
The characters:
~ ^ { }
on g statements and
| !
in element texts determine how the column headings will be laid out on the table. These
characters are defaults which you may change if you wish. There are several ways of
doing this, but in all cases you must always define all six special characters, in the order
they are shown here, even if some of them are the same as the defaults.
Replacement of special characters may be defined globally for all jobs, or individually
for particular jobs by placing the new characters in a file in one of the following
locations:
If you wish to define your own personal defaults, you may do so using the environment
variable QTFORM.
Quantum searches first for the environment variable, then for qtform at the project level,
and finally for qtform at installation level, stopping at whichever one it finds. If none of
these exist, Quantum uses the default characters.
To define your own special characters, create a qtform file containing one line of just six
characters:
char1char2char3char4char5char6
where char1 is the replacement for the ~, char2 is the replacement for the ^, and so on.
You may include blanks (spaces) in the list of characters, but these mean that the special
characters they replace will have no special meaning at all to pstab.
The example below replaces the curly braces and the pipe symbol with a colon, a
semicolon and an ‘at’ sign respectively. All other characters are unchanged:
~^:;@!
The next example contains blanks for the { and } characters. If pstab reads these
characters on a g statement it will print them as part of the column heading rather than
reading them as left and right justification symbols:
~^ |!
The final example illustrates how to prevent the characters | and ! from having special
meanings in element texts. You might wish to do this if you need to print these characters
as part of the element text:
~^{}
(The last two characters on this line are blanks, replacing | and !.)
✎ This change affects all axes in the run, not just the axis in which these characters
form part of the element text.
If you wish to use the environment variable QTFORM rather than creating a file, just list
the six characters as the value of the variable in the usual way. If the list of characters
contains blanks you must enclose it in double quotes.
If the file or character list contains more or fewer than six characters, Quantum issues an
error message to this effect as the tables are formatted and uses the built-in defaults
instead. If this is your only error, you may correct the value of $QTFORM and rerun the
job with quantum –o.
Quick Reference
To underline text, type the following characters on a g statement:
Lines drawn with hyphens or underscores in column headings may still be printed. Thin
lines may be drawn with – or _ characters, and a single thick line with the = character.
Quick Reference
To determine the justification of words or sections in row texts, type:
n03#& character
at the start of the axis, where character is one of the special characters shown above.
{, }, ^ and ~ may also be used with row texts which need to be laid out in a particular
format. For instance, where the row texts are scores you may wish to print the factors
directly underneath one another rather than just in the next free column, thus:
Excellent (+2)
Very Good (+1)
Average (0)
Poor (−1)
Very Bad (−2)
In order to have row texts aligned in columns you need to start the axis with an n03
statement of the form:
n03#& X
where X is one of the special characters { } or ^. The number of spaces between the &
and the special formatting character depends on where you want the aligned column to
start. For instance, if you want text to be aligned on column 20 you would type 19 spaces
and then the formatting character in the 20th position.
Then in the individual elements you place a ~ immediately before the first character
which is to be part of the aligned column (e.g., the scores). Long texts before the ~ will
be folded, but texts after it will be squashed if they will not fit within the remaining space
in the side text.
Here is an example which lines up the first character of each score in column 20. Notice
the { for left-justification on the n03#& statement and the ~ on the elements:
n03#& {
n10Base
n01Excellent ~(+2);fac=2;c=c237’1’
n01Very Good ~(+1);fac=1;c=c237’2’
n01Average ~(0);fac=0;c=c237’3’
n01Poor ~(-1);fac=-1;c=c237’4’
n01Very Bad ~(-2);fac=-2;c=c237’5’
Since this is a simple axis it could have been written using a col statement:
n03#& {
col 237;Base;Excellent ~(+2);%fac=2-1;Very Good ~(+1);
+Average ~(0);Poor ~(-1);Very Bad ~(-2)
Quick Reference
To define the percentage by which titles should be enlarged or reduced in size, type:
#snumber
at the start of the text on the tt statements. The standard size is 100 (i.e., 100%).
#fnumber
at the start of the text on the tt statements. number is the number of the font you wish to
use as it is defined in the font table.
Table titles defined on different parts of the table, for example, a ttl followed by a ttc, will
be printed on the same line. A new-line will be produced when a subsequent title is
defined for the same position as a previous one, for example, a ttr followed by a ttr. The
texts are not checked as to whether there is sufficient space to fit them in. Therefore, the
user must check whether this is likely and control the production of a new-line by means
of blank tt statements.
All numbers and texts in a table are printed in a standard size using the fonts you request
(see below). Text defined on tt and n03 statements may be enlarged or reduced in size by
preceding the text with the notation:
#sn
For example, #s150 indicates that the text should be printed 50% larger than the rest of the
table text (that is, 1.5 times as large). The same notation may be used at the start of the
text on an n03 statement.
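As an illustration only (the title text is hypothetical), a centered table title one and a half times the standard size might be defined as:
ttc#s150London Shopping Survey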
Quantum has a standard set of fonts that it uses for printing tables. The default is
Helvetica. These fonts are declared in a font table, and each font is represented by a
number based on the position of that font in the table. The first font in the table is font 0.
You might wish to print titles in a different font; for example, Helvetica Bold or
Helvetica-Oblique (italic). To do this, type:
#fnumber
at the start of the title text, where number is the font number as defined in the font table.
For example, #f2 tells Quantum to print the title in font 2. If you use Quantum’s standard fonts this is
Helvetica Bold.
☞ For more information on fonts and the font table, see section 38.7
Quick Reference
To suppress the automatic table border, type:
#nopagebox
To enclose selected cells in boxes, place box statements in both the row and column axes; these include:
boxl draw a line between the start and end of the box
boxg start boxes above column headings on g statements (used with boxs).
Tables printed on the PostScript printer are automatically enclosed in a border. If you
want unboxed tables, place the statement #nopagebox in the font-file (specified with the
–f option on the command line).
Additional statements have been added to Quantum to enable you to place boxes around
selected cells in a table. For example, it is possible to print a table in which rows 3 to 5
and columns 1 to 4 are enclosed in a box.
boxl draw a line from the start of the box to the end of the box. In row axes this
generates a horizontal line; in column axes a vertical line is drawn.
boxg start boxes above rather than below column headings defined with g
statements. This keyword requires boxs to be specified as well.
Boxes will only be drawn if these statements are present in both the row and column axes;
if they are present in only one axis in a table, no boxes will be drawn. Additionally, a box
extends the full width of the column rather than enclosing just the numbers themselves.
Box statements in illegal or irrelevant positions are silently ignored. Here is an example:
This generates a table with 16 cells of which four are enclosed in a box, thus:
Quick Reference
To introduce a list of fonts for the tables, type:
#fonttable
in the font file. Follow this with the names of the fonts to be used, one per line. Terminate
the list with:
#endfonttable
To scale a font up or down from its default size, enter the font name followed by a space
and the scaling factor.
The PostScript laser printer offers a wide variety of typefaces and type styles. The font=
option on the a statement determines which parts of the table are printed in which
typeface – for example, you may decide to print absolutes in the standard typeface,
percentages in italic and the run title in bold.
Quantum provides a standard set of fonts which will be used if you use font= but do not
name the fonts to use. These are:
font=0 :Helvetica
font=1 : Helvetica-Oblique
font=2 : Helvetica-Bold
font=3 : Helvetica-BoldOblique
font=4 : Times-Roman
font=5 : Times-Italic
font=6 : Times-Bold
font=7 : Times-BoldItalic
The first font in the list is the default font which will be used if no other fonts are defined,
or if an invalid font= option is given (font=8 when no font is defined for position 8), or
if you misspell a font name.
You may also choose to print your tables using an entirely different font to the default –
for example in the AvantGarde or Souvenir font. A full list of the fonts available may be
obtained from your printer supplier. To print in a different font, create a font-file starting
with a line containing the word #fonttable and then list the names of the fonts you wish
to use, one font name per line. The list must be terminated with a line containing the word
#endfonttable. If you want to print your tables using the Avant-Garde font, for example,
your font file might be:
#fonttable
AvantGarde-Book
AvantGarde-Book
AvantGarde-Demi
AvantGarde-DemiOblique
AvantGarde-BookOblique
#endfonttable
Here, the first font corresponds to font=0 (the default font), the second font corresponds
to font=1, and so on. If the font= option is:
font=(a=2,tab=2,pc=3,numb=1,page=4,date=4,type=4)
titles following the a and tab statements will be printed in font 2 (i.e. AvantGarde-Demi),
percentages will be printed in font 3 (AvantGarde-DemiOblique), all other numbers will
be printed in font 1 (AvantGarde-Book), and the page number, date and output type will
be printed in font 4 (AvantGarde-BookOblique). All other parts of the table will be
printed in font 0 (AvantGarde-Book).
If you find that the standard character sizes are too small or too large you may alter them
by scaling the character size for a font up or down. You do this by entering the font name
in the font table followed by a space and the scaling factor.
Scaling factors are proportions written as decimal values, so to make characters twice as
large you would enter the scaling factor as 2.0. To make the characters one and a half
times their default size you need a factor of 1.5, whereas to make the characters 95% of
their current size you need a factor of 0.95.
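For example, a font table of the following form (a sketch using the standard Helvetica fonts) prints text in the first two fonts at one and a half times their default size and leaves the third at its normal size:
#fonttable
Helvetica 1.5
Helvetica-Oblique 1.5
Helvetica-Bold
#endfonttable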
Logos are defined using the PostScript language and are included in the font-file after the
font definitions. They are printed at the same time as the tables. If you wish to include a
computerized logo on your tables, contact your Quantime Support Representative.
Quick Reference
To print small tables left-justified on the page, type:
#tableleft
Because the laser printer uses proportionally-spaced fonts and prints in a smaller type
size than a line printer, tables which would normally fill a page of line printer paper will
only partially fill the page when the table is laser printed. When the column headings are
defined on g statements and the table is narrower than the page width, Quantum will print
it centrally in the width of the page. If you want the table to be left-justified on the page,
enter the command:
#tableleft
You may also scale the table as a whole up to a larger type size by defining smaller values
for pagwid= and paglen= on the a statement. For example, when printed using the
default type size, the sample table occupies half a page. We have scaled it to fill the whole
page by defining the page to be just a little larger than the table itself – that is,
pagwid=115; paglen=50.
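In the Quantum program this simply means setting those values on the a statement, for example (other a options omitted):
a;pagwid=115;paglen=50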
If you are printing more than one table, you will need to take the width of the widest table
and the length of the longest one as guidelines for the overall length and width required.
Since it is not possible to gauge exactly how many lines the table will occupy when it is
laser printed, it is best to increase your line counts by a small amount as in the example
above.
pstab automatically creates a table of contents which it prints, starting on a new page, at
the end of the tables.
Under Unix you can suppress the table of contents if you do not want it, by running qout
and q2ps rather than pstab:
If you want the table of contents but not the tables, you will need to strip out all lines
before the start of the table of contents. The edit command that you type at the Unix
prompt is:
This command is too long to print on one line, but you should enter it all on one line.
If this command is to work it is important that you type the substitution text enclosed in
quotes exactly as it is shown here. If you insert extra spaces or use upper case when the
command shows lower case, the command will not work.
Font encoding is a method of defining and printing characters that are not part of the
standard ASCII character set. You can think of it as a translation service whereby the
printer reads a character from the tables or table of contents, looks it up in a translation
table and prints the corresponding translation character.
Quantum comes with one encoding scheme already set up, which allows you to print
characters from the Microsoft Multilingual (Latin 1) character set (in DOS this is known
as code page 850), but you can define other translations if you wish.
Each font encoding scheme is held in a separate file in the QTHOME include directory. The
files are called name.fen, where name is any name you choose (it is sensible to choose a
name that reflects the name of the character set or code page). The name of the Microsoft
Multilingual encoding file is cp850.fen.
To use an encoding scheme, include the option:
–e encoding_scheme
in your pstab command. encoding_scheme is the name of the encoding file without the
.fen extension, so to request encoding using the Microsoft code page 850 scheme you
would use the option -e cp850.
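As an illustration (the output file name is hypothetical), a command of the form:
pstab -e cp850 -o tables.ps
formats the tables using the code page 850 scheme and saves the PostScript output in the file tables.ps.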
If you are using a different character set you may create your own encoding files. For
example, if you are working on a PC that uses the Slavic character set (code page 852)
you may wish to create an encoding file for that character set so that your printed tables
match those you display on your screen.
Creating encoding files requires some knowledge of PostScript, but the general
procedure is as follows.
First, make a list of the characters in your character set that print differently to the way
they appear on your screen (mostly, these characters will appear as blanks in the printed
output). These are the characters for which translations are required. Normally these will
be the characters 127 to 255 which form the extended ASCII character set: characters 0 to
31 are not used for keyboard characters, and characters 32 to 126 (the standard ASCII
character set) are common to all code pages.
Create a new encoding file with an appropriate name and a .fen extension. You could use
cp850.fen as a template.
For each character requiring translation, write an Encoding statement that defines the
character’s decimal value and the name of the PostScript character or symbol you want
to print in its place. You’ll find a list of character and symbol names for all standard
PostScript fonts in the back of the Adobe PostScript Reference Manual.
An alternative to creating an encoding file is to place all the encoding information in the
Quantum job’s PostScript format file as user-defined code, as described in the next
section.
You may now include any PostScript code you like in the format file for pstab and
pstabcon. Type:
#postscript
before the first line of your PostScript code, and
#endpostscript
after the last line. You could use this feature for declaring your own encodefont routine
for font encoding.
This chapter introduces a number of utility programs which are included as part of each
Quantum release. These include a number of programs for tidying up Quantum
directories, and a program for reporting on column and code usage in the data and the
spec file.
All programs mentioned in this chapter are available under Unix, DOS and Vax VMS
unless otherwise stated.
Quick Reference
To remove unwanted files after a Quantum run, use one of the programs quclean,
qtlclean, qteclean, qtoclean or manipclean as follows:
program_name [–a] [–y] [–n] [–td path_name] [–pd path_name] [–id suffix]
Use –a to delete all intermediate files, –y to delete all output files without prompting, or
–n to keep all output files without prompting.
The standard Quantum program deletes temporary files as it goes along; the alternative
quantumx program does not. To help you remove unwanted files quickly and easily,
Quantum provides a number of programs designed to remove files created by particular
types of runs:
quclean remove all intermediate and output files from the directory.
qtlclean remove temporary files created during compilation from the
project directory or from the named temporary directory. Files
deleted are filedefs.h, params.h, qtheader.h, requires.h, variable.h,
varset.*, editQ.*, tabsQ.*, tabsQ1.* and axesQ.*.
qteclean remove the files created by an edit-only datapass.
qtoclean remove the files created by a quantum –o run.
manipclean remove all files other than those required for a manipulation run.
The syntax for these commands is identical apart from the program name, so we’ll use
quclean as a model:
quclean [–a] [–y] [–n] [–td path_name] [–pd path_name] [–id suffix]
where:
–a removes all intermediate files. Normally, the C source and object files (e.g.,
editQ.cold) which Quantum creates are not deleted.
–y silently deletes the files which normally cause the program to prompt for
permission to delete. May not be used with –n. These files are out1, out2,
out3, the holecount and frequency distribution files, the column and code
map, the clean and dirty data files, the output data file, the weighting report,
the tables file, the error summary, and the cell counts files.
–n silently keeps the files which normally cause the program to prompt for
permission to delete. May not be used with –y.
–id names the run suffix of files to be deleted. Files with other suffixes or no
suffix are not deleted.
–pd names the directory containing the output files to be deleted.
–td names the directory containing the intermediate files to be deleted.
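For example, a command of the form:
quclean -a -y
removes all intermediate files, including the C source and object files, and silently deletes the output files that would otherwise cause the program to prompt for permission.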
The DOS and Unix operating systems allow the use of special characters (wildcard
characters) in commands to represent characters that are common to a group of filenames.
You may use this facility with quclean, qtlclean, qteclean, qtoclean and manipclean to
remove, say, all files from temporary directories whose names start with temp.
Under DOS, just type the wildcard character on the command line and it will be passed
directly to the commands in the DOS program you are running. For example, typing:
removes all files in directories whose names start with temp and which are at the same
level as the directory in which the Quantum job is specified.
Under Unix you must either enclose the parts of the command that contain wildcard
characters in single quotes, or precede the wildcard characters with backslashes. This
tells the Unix shell that it must pass the wildcard characters to the program exactly as they
are rather than intercepting and interpreting them as it normally does. If you type:
Unix will issue the message ‘Unknown flag’ because the shell will read the asterisk as an
option to quclean. To make this command work, type either:
or
Quick Reference
To compare column and code usage in the data file with the columns and codes
mentioned in the axes of your run, type:
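colrep [options]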
The colrep program summarizes information found in the holecount and column map
files into a single report. Quantum creates a holecount file if the edit section of your
program contains a count statement. It always creates a column map file.
colrep [options]
where options define the type of output required. If no options are given, a file called
colm_ is created listing the columns, codes, and the number of occurrences of each code
present in the holecount which are not mentioned in the column map. Columns which are
missing from the column map altogether are flagged with the letter M. For example:
M: 101: 0/878
M: 102: 1/101 2/100 3/96 4/99 5/95 6/99 7/100 8/88 9/1 0/99
.
113: B/2
.
M: 140: 1/3 2/5 4/75 B/799
.
149: 4/23 5/3 6/2 7/2 B/827
In this example, the holecount file shows that 878 records have a ’0’ in c101. This column
is not used in the axes defined for the run and, therefore, does not appear in the column
map file. Column 149 is present in both the holecount and column map files. However,
the holecount reports 23 records coded c149’4’, 3 records coded c149’5’, 2 records coded
c149’6’, 2 records coded c149’7’, and 827 records with a blank in c149, which do not appear in
the column map and are therefore not part of any axis.
113: B/2
–b do not print columns in the colm_ file where only blanks are missing from the
axes. In the example above, column 113 would not be printed.
–o as –a with the addition of a flag marking the way in which the axis is used. Flags
are ‘p’ for standard use as a condition, or ‘o’ for a filter option on an n00
statement.
M: 104: 1/89 2/88 3/88 4/87 5/88 6/86 7/88 8/87 9/88 0/89
113: B/2
–c list in the cola_ file the axis names mentioned in the column map file, the
columns used in those axes, and any columns used as filters (n00s) in the axes. A
colm_ file is also created in the format specified by the options a, b or o.
ax.name punches n00 from.to
extent : 110
112
113
114
112
110
113
1,2,3,4,5,6 1,2
extent : 110
1,2,3,4,5
112
1,2,3,4
113
1,2,3
114
1,2,3,4,5,6,7,8,9,0,&,-
–h lists in the file colh_ the columns named in the column map file and, for each
column, the codes which are used. A colm_ file is also created in the format
specified by the options a, b or o.
105: 1 2 3 4 5 6 7 8 9 0 & - B
106: 1 2 3 4 5 6 7 8 9 0 & - B
107: 1 2 3 4 5 6 7 8 9 0 & - B
108: 1 2 3 4 5 6 7 8 9 0 & - B
109: 1
110: 1 2 3 4 5
Options may be combined as necessary; for instance, –ab to request a colm_ file which
excludes columns where only blanks are missing, but which shows, for all non-missing
columns, the codes not mentioned in the column map file, and the axes in which those
columns are used.
This chapter describes utilities for converting Quantum data and/or program files into
formats suitable for further processing with other packages. The utilities are:
q2cda/qvq2cda convert Quantum and Quanvert tables into comma delimited ASCII
format
qtspss convert a Quantum program and data file into SPSS format
nqtspss a variation of qtspss with additional or variable functionality,
including:
• coding of data values as 0/n, where n is the element’s position
in the axis, rather than as 0/1.
• processing of fld/bit statements.
• user-definable value for missing values.
• processing of hd= on the l statement.
• titles for numeric variables created from inc=.
qtsas convert a Quantum program and data file into SAS format
nqtsas a variation of qtsas. If you rename nqtsas to nqtspss you can create
SPSS data instead. This version of nqtspss offers different
functionality from the standard nqtspss, including:
• the ability to process subaxes.
• fixed or free-format data.
• control over the number of digits in numeric variables.
• control over the number of decimal places for real values.
• missing values may be output as blanks.
Quick Reference
qout –p temporary_file
To convert Quanvert tables, create them with the export option in Quanvert, then leave
Quanvert and type:
q2cda and qvq2cda are postprocessors to Quantum and Quanvert which convert a tables
file into a comma-delimited ASCII file. By this we mean a file in which items (texts or
numbers) are separated by commas. Files in this format can be read into many PC
packages including Lotus 123 and Harvard Graphics.
These programs have been designed with a wide range of requirements in mind and offer
options for choosing a different delimiter, printing or suppressing table titles and
percentage signs, as well as for printing user-defined strings before or after each entry in
the file.
When q2cda or qvq2cda reads a Quantum or Quanvert table it takes each non-blank line
in the table and converts it into a list of texts and/or numbers separated by commas and
writes it out to an output file. Texts are enclosed in double quotes. Thus, the row:
Base            173        34       139
becomes
"Base",173,34,139
The exception is the column headings of the table which, if created by g and p statements,
are not enclosed in double quotes. If column headings are created automatically from
n01/col/val/bit statements, each entry will be enclosed in double quotes.
Blank lines anywhere before the first row of the table are treated as empty text lines and
appear as two consecutive double quotes at the start of the next text row in the table. If
there is more than one blank line, the text line will start with as many empty entries as
there are blank lines. Lines of the form:
""
""
"","Base Male Female"
define a set of column headings. The two empty entries represent two blank lines before
the column headings. The empty entry at the beginning of the third line is the row text
(which is blank) before the column headings. The empty entry at the end of the line
indicates that there is a blank line between the column headings and the first row of the
table.
Blank lines within the table (e.g., if it has been created double spaced) are ignored.
When a row in the table consists of more than one type of figures (e.g., absolutes and
column percentages), the row text for the percentage lines is often blank. This is treated
as a blank text area and the entry in the output file will therefore start with two
consecutive double quotes.
When the table contains percentages, these appear as a row of numbers without
percentage signs.
Page 1
London Shopping Survey
Absolutes/col percentages
Notice the page number. This is always shown as #page for every table and it appears
after the table title, whereas in Quantum the page number precedes the title.
In order to use q2cda you must run the initial Quantum job using the version of Quantum
which does not delete the temporary files. Normally this is called quantumx, but if it is
standard policy at your installation not to delete temporary files this program may have
been renamed to quantum.
qout –p creates an intermediate tables file containing the texts and numbers
for each table. As this program runs it displays information about
when it was created followed by the number of tables printed and
suppressed.
q2cda reads the intermediate file and prints it in the format defined by the
user.
where asc.filename is the name of the comma-delimited file you want to create. Options
is a list of parameters which you may use when you want your output in a format other
than the default shown above. These parameters are described later on. If you use the pipe
mechanism ( | ) to link the two programs, output from qout is passed directly to q2cda without
an intermediate file being written to disk.
On DOS systems you need to run the two programs separately, so type:
qout –p intermed.file
intermed.file is the file in which the intermediate tables will be saved, and asc.filename
is the name of the comma-delimited ASCII file you want to create. Options are one or more
parameters defining the format required (see below).
When you’re running Quanvert and you want to process your tables with qvq2cda, create
them using export as one of the table output options. This creates two versions of your
tables, one in the standard tabular format and the other in Quantime intermediate code the
same as that created for Quantum tables. The intermediate code file is called name.exp,
where name is the name you enter when prompted for a tables file name.
When you leave Quanvert you can convert this file by typing:
where asc.filename is the name of the comma-delimited ASCII file you wish to create and
options define any additional formatting requirements (see below).
Options allow you to control more precisely the type of output you create. Some
parameters are followed by a character or filename. Spaces between the key letter and the
character or filename are optional.
Options are:
–a string insert the given character or string after each entry. The default is
the double quote. You’d use this option together with the –b option
if you wanted non-numeric items to be enclosed in something other
than double quotes.
–b string insert the given character or string before each entry. The default is
the double quote. You’d use this option together with the –a option
if you wanted non-numeric items to be enclosed in something other
than double quotes.
–d char use the given character as the delimiter between items. The default
is a comma.
–h treat column headings in g statements as separate blocks of text,
one per column, rather than as a single block of text covering all
columns.
If you want this type of layout, you must include a g statement with
underlining using = or – signs, and the underlining must extend the
full width of the longest text in each column since it is this which
q2cda and qvq2cda use to determine the column widths and the
positions of the column breaks.
q2cda and qvq2cda check each table to see whether the column
headings are defined with g statements. If they are, it then looks at
the last g statement in the block to see whether it defines
underlining with = or – signs and, if so, breaks all texts in the block
wherever it finds a space in the underlining. If the column headings
are not defined on g statements or there are g statements but the last
one does not contain underlining, q2cda and qvq2cda place all the
column texts in a single block as it would without this option.
–S string replace asterisks with the given string or character. Neither q2cda
nor qvq2cda prints asterisks at all and normally replaces them with
zeroes. The program issues a message for each asterisk found
confirming that it’s been replaced with the given string or
character.
–s string insert the given string or character in the first cell of the second and
subsequent lines of an entry consisting of more than one line. The
default is to leave that cell blank (i.e., two consecutive double
quotes).
Multi-line entries generally occur when the table consists of more
than one type of figure for each row. For instance, if the table
contains absolutes and column percentages, the absolutes are the
first row and have the row text in the first cell, while the column
percentages are the second line and have a blank first cell. With this
option you could replace this blank cell with a special continuation
character such as @.
This option also applies to the blank cell at the start of the column
headings line.
–t don’t print table titles. The only texts are the column and row
headings.
–x print a reminder of the syntax and options available.
Let’s use some of these options to produce a different form of output for the sample table
we used earlier in this document. If we type:
we’ll have a file called sample.out in which percentages are followed by percent signs,
items are separated by tildes, texts are preceded by carets and followed by dollar signs,
and continuation lines have @ in the first cell. Here is sample.out:
Now let’s look at an example of using –h to treat each column heading as a separate block
of text. The default is to treat each line of column headings as a single block of text:
If you include –h, q2cda and qvq2cda will treat each column heading as a separate text.
The column headings shown above would become:
" "," "," "," "," Tho- "," "," Buy "," "
" "," "," Some-"," Knew "," ught "," "," 1 Or "," Buy "
" "," Bght."," one "," It "," It "," First"," Less "," 2 - 4"
" "," Issue"," Else "," Was "," Was "," Issue"," Out "," Out "
" "," My- "," Bght."," Extra"," Reg. "," Ever "," Of 4 "," Of 4 "
" Total"," Self "," Issue"," Issue"," Issue"," Bght."," Issue"," Issue"
" ====="," ====="," ====="," ====="," ====="," ====="," ====="," ====="
The layout above shows the column headings enclosed in double quotes separated by
commas. These are the defaults. If you define other characters using options on the
command line, q2cda and qvq2cda will use those characters instead.
Quick Reference
To create a qtspss data description and data set, run Quantum using the version that does
not delete its intermediate files (or use the –k option with the standard version). Then
type:
qtspss
qtspss is a postprocessor to Quantum which converts a Quantum program and data file
into an SPSS data description file with a transformed data set which matches that data
description. There are two ways you can benefit from this program:
1. You don’t have to write the SPSS specification to match the raw data (this would be
the procedure without qtspss).
2. Any complicated editing done in the Quantum edit is already implemented in the SPSS
data set.
The notes in this section are designed for users who are unfamiliar with SPSS, but feel they
would like some background information for checking that their run has worked.
✎ If you’re already familiar with Quantum and SPSS, you may wish to skip this section
and go straight to the section called "Preparing the Quantum run".
The tab section of a Quantum run consists of tab statements defining the tables required,
and axes which define the elements which will form the rows and columns of those tables
and the conditions under which a respondent will be accepted into those elements.
qtspss takes this specification and uses the intermediate files created from it to generate
an SPSS data file and data description file.
• A TITLE statement.
This defines the run title which is read from tt statements following the a statement.
If there is no run title in the Quantum program, this statement is omitted.
will become:
The order of variables in this list defines the order of values in the data file (see
below).
l sex
col 15;Base;Male;Female
becomes:
The file ends with a comment reporting the number of cases in the data file.
The data file which qtspss creates is not directly related to the original Quantum data file.
In that file, the answers to any one question are always found in the same location in each
record (e.g., always in column 15 of card 1). In addition, the codes in the data correspond
to the codes on the questionnaire.
The records in an SPSS data set are lists of values separated by spaces. A record always
contains one value for every variable in the DATA LIST, in the order the variables are
given in that list. However, because some values may contain more digits than others, the
values for a given variable will not always appear in exactly the same place in the record.
Here is a sample data set:
1 0 1 5 8 4 7 4 123.45 9 5 4
6 4 4 6 12 3 5 6 99.01 4 0 0
999 0 5 7 9 5 3 87 16.12 11 10 6
The values which a variable has are assigned by qtspss according to the position of the
corresponding element in the Quantum axis. Thus, the first element is assigned value 1,
while the fifth element is assigned value 5, even if these are not the codes in the original
data file. Further information on the values you can expect to see is given below.
In order to understand how an axis becomes a variable, let’s look at the different types of
element you can have and see how they appear in the SPSS files. The simplest axis is one
in which the elements are single-coded. We’ll take sex as an example. The axis:
l sex
col 15;Base;Male;Female;Not stated=’&’
becomes:
In the axis, the element’s condition determines the code which represents the response in
the data. In the variable, the values for each answer are determined solely by the
element’s position in the axis. Thus, Not stated is the third answer so it becomes value 3
in the SPSS variable.
The base element is simply a count of respondents in the axis rather than an answer so it
is ignored.
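As a rough illustration (the exact layout of the qtspss output files is not reproduced here), the sex axis above corresponds to SPSS value labels along these lines:
VALUE LABELS sex 1 'Male' 2 'Female' 3 'Not stated'.
The values 1, 2 and 3 reflect the positions of the elements in the axis, not the codes punched in the data.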
SPSS does not handle multicoded data so qtspss converts it into the standard SPSS format.
When an axis is multicoded, qtspss creates as many variables as there are elements in the
axis. Each variable is named according to the axis name, but has a numeric suffix
associated with the element’s position in the axis. For instance, the multicoded axis:
l color;hd=Colors chosen
col 15;Base;Red;Blue;Green;Yellow;Others=’&’
will be converted into five variables named @color1 to @color5. The variable names
start with @ indicating that they are derived variables rather than variables related
directly to a Quantum axis.
Where the data is numeric (specified using the Quantum val statement), the axis is treated
as single-coded and the SPSS values are calculated according to the element’s position in
the axis. Thus:
l persons
n10TOTAL
val c(8,9);Base;i;1 person;2-3 people;4-5 people;6-9 people;
+More than 9 people=10+
becomes:
qtspss cannot deal with fld/bit statements, so you’ll need to convert them to their
col/val/n01 equivalents.
If a respondent answers all questions in the questionnaire, the record will contain one
value for each variable. If some questions are unanswered, or if filtering excludes a
respondent from the axis, qtspss inserts dummy (999) values in the appropriate positions
in the records. In SPSS, the same dummy value is used for all variables and is called the
missing value.
Quantum runs can be weighted. In Quantum, weights (apart from pre- and postweights)
are defined in the tab section. qtspss converts the weighting matrix into a variable with a
numeric suffix, and appends each respondent’s weight to the end of the record. Because
the weight variable is not a bona fide Quantum axis, qtspss starts the variable name with
an @ sign. Thus, the statement:
wm1 sex;factor;100;200
will appear at the end of the DATA LIST as @weight1, and records will have either 100
or 200 as their last value, depending on whether the respondent is male or female.
If the run contains more than one weighting matrix, the second matrix will be named
@weight2, the third will be named @weight3, and so on.
Data in numeric variables (e.g., those created in the Quantum edit) is not normally
transferred to the SPSS data file. If you want it to become part of the record, create a
dummy tab statement with that variable as an inc:
real liters 1
ed
liters=c(41,42)*4.22
end
tab dummy dummy;inc=liters
This creates a variable with the same name as the numeric variable. Since this is a
variable rather than an axis, it has no predefined set of values, so there is no VALUE
LABELS section. Instead, the exact value of the variable is placed on a new line
underneath the record in the data file. A comment of the form:
1. Write a separate tab statement for each axis (variable) you want in the data set.
Variables which are not named in this way will be ignored.
2. Convert all global filters (i.e. on a/flt/tab statements) into axis-level filters.
4. Enter any numeric variables you wish to include in the SPSS data as inc’s on tab
statements.
To create a qtspss data description and data set, run Quantum using the version that does
not delete its intermediate files (or use the –k option with the standard version). Then
type:
qtspss
qtspss reads the Quantum intermediate files and creates two files:
Quick Reference
To create an nqtspss data description and data set, run Quantum using the version that
does not delete its intermediate files (or use the –k option with the standard version).
Then type:
nqtspss [options]
• Axis titles defined with hd= are transferred to the SPSS data description.
• Multicodes may now be coded as 0/n, where n is the element’s position in the axis,
rather than as 0/1.
• Titles may be assigned to numeric variables created from inc= variables in the
Quantum program.
• The values of weighting and numeric variables are appended to the end of the record
rather than being entered on separate lines after the record.
• A number of command-line options are available for altering the default output
format.
The notes in this section are designed for users who are unfamiliar with SPSS, but feel they
would like some background information for checking that their run has worked.
✎ If you’re already familiar with Quantum and SPSS, you may wish to skip this section
and go straight to the section called "Preparing the Quantum run".
The tab section of a Quantum run consists of tab statements defining the tables required,
and axes which define the elements which will form the rows and columns of those tables
and the conditions under which a respondent will be accepted into those elements.
nqtspss takes this specification and uses the intermediate files created from it to generate
an SPSS data file and data description file.
• A TITLE statement.
This defines the run title which is read from tt statements following the a statement.
If there is no run title in the Quantum program, this statement is omitted.
• A DATA LIST statement.
This lists the names of all the variables in the data file.
The order of variables in this list defines the order of values in the data file (see
below).
l sex;hd=Sex of Respondent
col 15;Base;Male;Female
becomes:
The end of each statement group is marked with a dot, either at the end of the line or on
a line by itself.
The data file which nqtspss creates is not directly related to the original Quantum data
file. In that file, the answers to any one question are always found in the same location in
each record (e.g., always in column 15 of card 1). In addition, the codes in the data
correspond to the codes on the questionnaire.
The records in an spss data set are lists of values separated by spaces. A record always
contains one value for every variable in the DATA LIST, in the order the variables are
given in that list. However, because some values may contain more digits than others, the
values for a given variable will not always appear in exactly the same place in the record.
Here is a sample data set:
1 0 1 5 8 4 7 4 123.45 9 5 4
6 4 4 6 12 3 5 6 99.01 4 0 0
-99.99 0 5 7 9 5 3 87 16.12 11 10 6
The values which a variable has are assigned by nqtspss according to the position of the
corresponding element in the Quantum axis. Thus, the first element is assigned value 1,
while the fifth element is assigned value 5, even if these are not the codes in the original
data file. Further information on the values you can expect to see is given below.
In order to understand how an axis becomes a variable, let’s look at the different types of
element you can have and see how they appear in the SPSS files. The simplest axis is one
in which the elements are single-coded. We’ll take sex as an example. The axis:
l sex;hd=Sex of Respondent
col 15;Base;Male;Female;Not stated=’&’
becomes an entry of roughly the following form in the SPSS data description:
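VARIABLE LABELS sex "Sex of Respondent"
VALUE LABELS sex
1 "Male"
2 "Female"
3 "Not stated"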
In the axis, the element’s condition determines the code which represents the response in
the data. In the variable, the values for each answer are determined solely by the
element’s position in the axis. Thus, Not stated is the third answer so it becomes value 3
in the SPSS variable.
The base element is simply a count of respondents in the axis rather than an answer so it
is ignored.
SPSS does not handle multicoded data so nqtspss converts it into the standard SPSS format.
When an axis is multicoded, nqtspss creates as many variables as there are elements in
the axis. Each variable is named according to the axis name, but has a numeric suffix
associated with the element’s position in the axis. For instance, the multicoded axis:
l color;hd=Colors chosen
col 15;Base;Red;Blue;Green;Yellow;Others=’&’
will be converted into five variables named @color1 to @color5. The variable names
start with @ indicating that they are derived variables rather than variables related
directly to a Quantum axis. Each variable has two values: 1 if the respondent gave that
answer, or 0 if he or she did not. The definition for @color1 will be of roughly the following form:
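VARIABLE LABELS @color1 "Colors chosen"
VALUE LABELS @color1
0 "Not Red"
1 "Red"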
A similar thing happens with numeric codings specified using the Quantum bit/fld
statements. nqtspss creates a separate variable for each value on the bit/fld statement and
inserts a 0 or 1 in the record depending on whether the respondent gave a particular
answer. For example:
l cars
fld (c126,c129) :3;Base;VW/Audi=500-572;Honda=990,991
is converted into two variables @cars1 and @cars2, each with 0/1 values.
Where the data is truly numeric (i.e., numbers rather than numeric codes) the axis is
treated as single-coded and the SPSS values are calculated according to the element’s
position in the axis. Thus:
l persons
val c(8,9);Base;i;1 person;2-3 people;4-5 people;6-9 people;
+More than 9 people=10+
becomes an entry of roughly the following form:
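VARIABLE LABELS persons "persons"
VALUE LABELS persons
1 "1 person"
2 "2-3 people"
3 "4-5 people"
4 "6-9 people"
5 "More than 9 people"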
If a respondent answers all questions in the questionnaire, the record will contain one
value for each variable. If some questions are unanswered, or if filtering excludes a
respondent from the axis, nqtspss inserts dummy (−99.99) values in the appropriate
positions in the records. In SPSS, the same dummy value is used for all variables and is
called the missing value.
Quantum runs can be weighted. In Quantum, weights (apart from pre- and postweights)
are defined in the tab section. nqtspss converts the weighting matrix into a variable with
a numeric suffix, and appends each respondent’s weight to the end of the record. Because
the weight variable is not a bona fide Quantum axis, nqtspss starts the variable name with
an @ sign. Thus, the statement:
wm1 sex;factor;100;200
will appear at the end of the DATA LIST as @weight0, and records will have either 100
or 200 as their last value, depending on whether the respondent is male or female.
If the run contains more than one weighting matrix, the second matrix will be named
@weight1, the third will be named @weight2, and so on.
Data in numeric variables (e.g., those created in the Quantum edit) is not normally
transferred to the spss data file. If you want it to become part of the record, create a
dummy tab statement with that variable as an inc:
real liters 1
ed
liters=c(41,42)*4.22
end
tab dummy dummy;inc=liters
This creates an SPSS variable with the same name as the Quantum numeric variable. As
with axes without an hd=, these variables have their name as their description. In the
example here, the definition of liters will be of the form:
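VARIABLE LABELS liters "liters"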
Since this is a variable rather than an axis, it has no predefined set of values, so there is
no VALUE LABELS section. Instead, the exact value of the variable is placed in the
appropriate place in each record.
If you would prefer the variable to have a more detailed description, create it as a dummy
axis with the same name as the numeric variable. Place the inc= on an n01 and define the
variable’s description with hd= on the l statement. Don’t forget to include a tab statement
that uses the dummy axis. For example:
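l liters;hd=Number of liters bought
n01Liters bought;inc=liters
tab liters dummy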
An option is available for merging the two variables into a single variable.
1. Write a separate tab statement for each axis (variable) you want in the data set.
Variables which are not named in this way will be ignored.
2. Convert all global filters (i.e. on a/flt/tab statements) into axis-level filters.
3. Enter any numeric variables you wish to include in the SPSS data as inc’s on tab
statements. If you want the variable to have a title, create a dummy axis as well.
To create an nqtspss data description and data set, run Quantum using the version that
does not delete its intermediate files (or use the –k option with the standard version).
Then type:
nqtspss [options]
where options are one or more optional keywords that determine how the SPSS files will
be created (see below).
nqtspss reads the Quantum intermediate files and creates two files:
Options
Unless otherwise stated, options may be entered in upper or lower case. They are:
will be:
VARIABLE LABELS @color4 "Colors chosen"
VALUE LABELS @color4
0 "Not Green"
4 "Green"
use this option to create one variable containing all the information about
this variable rather than two in the SPSS data description. In this example
the variable liters will be defined as:
VARIABLE LABELS liters "Number of liters bought"
The second variable whose name would normally start with the letter A
(e.g., Aiters) is not created.
–fn.d defines the format of values in numeric variables. n is the maximum
number of characters per value, including the decimal point, and d is the
number of decimal places required. Thus, with –f6.2, the value
365.26001 becomes 365.26.
–h display a list of options.
–k Indicates that where element texts start with a number, values should be
assigned according to those numbers rather than sequentially from one.
If an element text does not start with a number, it will be assigned a value
one greater than the current maximum value so far. For example:
l age;hd=Age of respondent
val c(20,21);Base;Under 21=21–;21–24;25–34;35–44;45–54;
+55–64;Over 64=64+
normally becomes:
VARIABLE LABELS age "Age of respondent"
VALUE LABELS age
1 "Under 21"
2 "21–24"
3 "25–34"
4 "35–44"
5 "45–54"
6 "55–64"
7 "Over 64"
✎ Note that when deciding whether to suppress a number, nqtspss looks at it as a string
rather than as a numeric value. Thus, 01 is not seen as being the same as 1.
Quick Reference
To create a qtsas data description and data set, run Quantum using the version that does
not delete its intermediate files (or use the –k option with the standard version). Then
type:
qtsas
qtsas is a postprocessor to Quantum which converts a Quantum program and data file into
a SAS data description file with a transformed data set which matches that data
description.
There are two ways you can benefit from using this program:
1. you don’t have to write the SAS specification to match the raw data (this would be the
procedure without qtsas); and
2. any complicated editing done in the Quantum edit is already implemented in the SAS
data set.
The notes in this section are designed for users who are unfamiliar with SAS, but feel they
would like some background information for checking that their run has worked. A
glossary of Quantum terminology is included in this section. Users whose background is
mainly SAS and who want to compare their SAS files with the Quantum originals may find
it helpful to read the glossary before continuing.
✎ If you’re already familiar with Quantum and SAS, you may wish to skip this section
and go straight to the section called "Preparing the Quantum run".
The tab section of a Quantum run consists of tab statements defining the tables required,
and axes which define the elements which will form the rows and columns of those tables
and the conditions under which a respondent will be accepted into those elements.
qtsas takes this specification and uses the intermediate files created from it to generate a
SAS data file and data description file.
This section is related to single-coded and numeric axes in your Quantum program.
Each Quantum axis becomes a SAS variable and this section defines, for each variable
(axis), the values which may be found in the data and the responses which those
values represent. For example:
l sex
col 110;Base;Male;Female
becomes
proc format;
value sexfmt
1 = ’Male’
2 = ’Female’
;
run;
Each format name is generated by taking the first five characters of the axis name (or
the full name if shorter) and appending the string ‘fmt’.
The section ends with a run statement which tells SAS to process the statements in this
section.
Statements in this section name the data file and list the names of the axes (variables)
present in that file in the order they occur. Axes which appear on more than one tab
statement are ignored after the first time. Thus, Quantum statements such as:
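tab age sex
tab occup sex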
become:
data;
infile "statdata";
input
age
sex
occup
;
This section associates a format from the proc format section with each single-coded
or numeric variable in the data list. It also defines a descriptive text for the SAS labels
generated for other types of axes and variables in the Quantum run. These include
complex axes formed from more than one question and variables whose values are
generated during the run rather than being read directly from the data file.
Since the sex axis is single-coded, its format will have been defined in the proc
format section, so the entry in this section will simply name the format to be used:
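format sex sexfmt.;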
The section ends with a run statement telling SAS to process these statements.
Sometimes you’ll find that qtsas has omitted elements or axes from the SAS specification.
With elements this is because there are no respondents in those elements; with axes it is
because they contain no ‘basic’ elements (e.g., n01/col).
The data file which qtsas creates is not directly related to the original Quantum data file.
In that file, the answers to any one question are always found in the same location in each
record (e.g., always coded in column 15 of card 1). In addition, the codes in the data
correspond to the codes on the questionnaire.
The records in a SAS data set are lists of values separated by spaces. A record always
contains one value for every variable in the data section, in the order the variables are
given in that list. However, because some values may contain more digits than others, the
values for a given variable will not always appear in exactly the same place in the record.
Here is a sample data set (it doesn’t matter what the data refers to, it is the layout that’s
important):
1 0 1 5 8 4 7 4 123.45 9 5 4
6 4 4 6 12 3 5 6 99.01 4 0 0
. 0 5 7 9 5 3 87 16.12 11 10 6
The values which a SAS variable has are assigned by qtsas according to the position of the
corresponding element in the Quantum axis. Thus, the first element is assigned value 1,
while the fifth element is assigned value 5, even if these are not the codes in the original
data file. In the second line of the example above, the respondent gave the 6th answer to
the first SAS variable, the 4th answer to the second SAS variable, and so on. Further
information on the values you can expect to see is given below.
In order to understand how an axis becomes a variable, let’s look at the different types of
element you can have and see how they appear in the SAS files. The simplest axis is one
in which the elements are single-coded. We’ll take sex as an example. The axis:
l sex
col 110;Base;Male;Female;Not stated=’&’
becomes:
proc format;
value sexfmt
1 = ’Male’
2 = ’Female’
3 = ’Not stated’
;
run;
data;
infile "statdata";
input
sex
;
format sex sexfmt.;
run;
In the axis, the element’s condition determines the code which represents the response in
the data. In the variable, the values for each answer are determined solely by the
element’s position in the axis: ‘Not stated’ is the third answer so it becomes value 3 in
the SAS variable. However, if an element in the axis contains no respondents (i.e., no one
gave that answer), that element is ignored and subsequent elements are renumbered.
Thus, if column 110 of the Quantum data file never contains a code 2, the SAS
specification for sex will be:
proc format;
value sexfmt
1 = ’Male’
2 = ’Not stated’
;
The base element in the axis definition is simply a count of respondents in the axis rather
than an answer so it is ignored. Axes which contain only a base are ignored.
A similar thing happens with numeric data specified using the Quantum val statement.
qtsas creates a format description for the variable listing the values present in the data.
These will not be the numeric values in the Quantum data but codes allocated to those
values according to the element’s position in the axis (and whether or not that element
contains any respondents). qtsas then assigns this format to the variable, as shown below:
l persons
val c(8,9);Base;i;1 person;2-3 people;4-5 people;6-9 people;
+More than 9 people=10+
becomes:
proc format;
value persofmt
1 = ’1 person’
2 = ’2-3 people’
3 = ’4-5 people’
4 = ’6-9 people’
5 = ’More than 9 people’
;
run;
data;
infile "statdata;
input
persons
;
format persons persofmt.;
Notice that the axis name has been truncated to five characters to make the format name.
You may wish to ensure that all the original Quantum axes have names of fewer than five
characters to avoid duplicate names in the SAS data description (tmpxxy and tmpxxz
would both become tmpxxfmt).
SAS does not handle multicoded data so qtsas converts it into the standard SAS format.
When an axis is multicoded, qtsas creates as many variables as there are elements used
in the axis. Each variable is named according to the axis name, but has a numeric suffix
associated with the element’s position in the axis. For instance, the multicoded axis:
l color
col 115;Base;Red;Blue;Green;Yellow;Others=’&’
will be converted into five variables named _color1 to _color5. The variable names start
with an underscore (_) indicating that they are derived variables rather than variables
related directly to a Quantum axis.
Each variable has two values: 1 if the respondent gave that answer, or 0 if he/she did not.
Since there are no other values, no entry is generated in the proc format section. Instead,
the answers which the variables represent are defined in the format section as follows:
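label _color1 = "Red";
label _color2 = "Blue";
label _color3 = "Green";
label _color4 = "Yellow";
label _color5 = "Others";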
As with single-coded variables, any unused values are ignored and subsequent items are
renumbered.
Sometimes responses are coded with numeric codes of more than one digit; for example,
500 for a VW Scirocco, 520 for an Audi Quattro. Axes for data of this type will often be
set up with fld or bit statements. If so, qtsas deals with them in same way as for
multicoded axes. Thus, the axis:
l cars
fld (c126,c129) :3;Base;VW/Audi=500-572;Honda=990,991
is converted into two variables _cars1 and _cars2 labeled with the element text defined
in the axis:
data;
infile "statdata";
input
_cars1
_cars2
;
label _cars1 = "VW/Audi";
label _cars2 = "Honda";
If a respondent answers all questions in the questionnaire, the record will contain one
value for each variable. If some questions are unanswered, or if filtering excludes a
respondent from the axis, qtsas inserts a dot in the appropriate positions in the records. In
SAS, the dot is called the missing value.
Quantum runs can be weighted. In Quantum, weights (apart from pre- and postweights)
are defined in the tab section. qtsas converts the weighting matrix into a variable with a
numeric suffix, and appends each respondent’s weight to the end of the record. Because
the weight variable is not a bona fide Quantum axis, qtsas starts the variable name with
an underscore. Thus, the statement:
wm1 sex;factor;100;200
will appear at the end of the data list as _weight0, and records will have either 100 or 200
as their last value, depending on whether the respondent is male or female. For instance:
proc format;
value sexfmt
1 = ’Male’
2 = ’Female’
;
run;
data;
infile "statdata";
input
sex
_weight0
;
format sex sexfmt.;
label _weight0 = "@weight0";
run;
As you can see, the label for a weighting variable is the same as the variable
name except that the underscore is replaced with an @ sign. If the run contains more than
one weighting matrix, the second matrix will be named @weight1, the third will be
named @weight2, and so on.
Data in numeric variables (e.g., those created in the Quantum edit) is not normally
transferred to the SAS data file. If you want it to become part of the record, create a dummy
tab statement with that variable as an inc:
real liters 1
ed
liters=c(41,42)*4.22
end
tab dummy dummy;inc=liters
This creates a variable with the same name as the numeric variable. Since this is a
variable rather than an axis, it has no predefined set of values, so there is no proc format
section. Instead, the variable is defined with a label statement of the form:
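label liters = "liters";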
and the exact value of the variable is placed in the appropriate place in each record.
1. Write a separate tab statement for each axis (variable) you want in the data set.
Variables which are not named in this way will be ignored.
2. Convert all global filters (i.e., on a/flt/tab statements) into axis-level filters.
3. Enter any numeric variables you wish to include in the SAS data as inc’s on tab
statements.
To create a qtsas data description and data set, run Quantum using the version that does
not delete its intermediate files (or use the –k option with the standard version). Then
type:
qtsas
qtsas reads the Quantum intermediate files and creates two files:
Quick Reference
To create an nqtsas data description and data set, run Quantum using the version that does
not delete its intermediate files (or use the –k option with the standard version). Then
type:
nqtsas [options]
nqtsas is a variation of qtsas which converts a Quantum spec and data file into a SAS data
description and transformed data set. Enhancements include:
–G1 when an axis is exploded into a binary variable, create a SAS format clause
without a corresponding variable in the variable list.
–G2 as G1, except that variable labels and formats for binary variables are
excluded from the SAS commands file.
–g create format clauses for binary variables and include their variable labels
in the commands file. This is the default.
–U elements in subaxes contribute once only to the SAS data description and
data set.
–u elements in subaxes contribute to the data description and data set once for
every subaxis in which they appear. This is the default.
–A ignore subaxes. Elements in subaxes are assumed to be part of the main
axis only.
–f create a free format data set.
–F create a fixed format data set. This is the default.
–Wn create numeric variables with n digits. The default is 8 digits.
–Pn create real variables with n decimal places. The default is 2 decimal
places.
–O behave as the existing qtsas.
–mchar or –Mchar
defines the character to represent missing data. The default is a dot.
–b or –B show missing values as blanks.
Preparing and running an nqtsas job are as described above for qtsas, except that you list
the option you wish to use at the end of the nqtsas command. For example:
nqtsas -W3 -f
to create a free format data set with numeric values shown as 3-digit values.
If you copy or rename the nqtsas program to nqtspss, it will generate SPSS output instead.
If you want to keep the original nqtspss program, remember to rename it before you copy
or rename nqtsas.
Before you can convert a Quantum spec and data file into a Quanvert database there are
several tasks you may need to carry out first. These include checking the Quantum
program to ensure that it will create the required information in the appropriate places,
and setting up subdirectories if variables are not to be stored in the main project directory.
If you have a large database from which you require only a few variables, you may use
the raw Quantum data rather than creating a full Quanvert database. Notes on this facility
are included at the end of this chapter.
1. Make sure axis names reflect the contents of the axis. Names such as age, childs and
tried are much more helpful than ax01 or q1.
2. Put axis titles on tt or n23 statements within the axes, or define them with hd= on the l
statement.
Of the three options, hd= is the best choice for two reasons. First, when the axis is
used as the columns of the table Quanvert will print the hd= text above the columns
of the table. Quanvert also does this with n23 texts, but if you use tts, Quanvert prints
them at the top of the table so it is unclear whether they belong to the rows or the
columns.
Second, if you will be using the database in a multiflip project, you can change the
hd= text between projects without having to change the titles of earlier projects to
match. If you use n23s, and you merge axes based on element texts, mflip will not
merge elements that are identical apart from the subheadings.
Any axis which may be used as a third or higher dimension should have n23
statements. tt statements following tab statements are ignored.
3. Split up column axes containing more than one type of information into the
individual axes for each of the questions involved in the axis. For example, if the
column axis contains sex, age, and marital status information, split it into three
separate axes. You can still keep the original axis, but by splitting it you offer
Quanvert users the choice of using just part of it if the original is too wide to be
displayed across the screen.
An easy way to create the individual axes is to insert the keywords groupbeg and
groupend around the elements which make up each axis. For example:
l agesex
n10Base;group=all
groupbeg sex
col 110;Male;Female
groupend sex
groupbeg age
col 111;Young;Middle aged;Old
groupend age
4. Put axis conditions on the l statement rather than on an n00 if the condition refers to
all elements in that axis. Using n00s in these circumstances increases the amount of
disk space and processing time required to deal with the data.
If the axis is filtered you must ensure that this is clear to the Quanvert user by
amending the text of the base element or adding an axis title that explains who has
been included in the axis.
5. Ensure that each axis is named on at least one tab statement. Axes will not be flipped
if:
b) they have illegal or reserved names and the .ax files cannot be renamed for any
reason (see item 7. below);
c) they are weighting axes which do not have a tab statement, in which case the
necessary information is not present in Quantum’s intermediate file;
6. Do not leave column axes without row text (i.e., only g and p statements) since this
will cause problems if the axis is used as a row axis in Quanvert. Insert n01
statements with suitable text.
7. The words dummy, project, cancel, help, q, qu, qui, quit, restart and same have
special meanings within Quanvert when used as axis names. If you have axes with these
names, rename them. If you forget to do this, Quantum will flip them with temporary names:
where nn is 01, 02, 03, and so on. nn is reset to 01 for each variable type.
8. Numeric variables (i.e., inc=) will be included if they appear on tab statements.
Make sure they have useful names rather than just column references; for example,
numloaf rather than c(132,133). For further control over which numeric variables are
flipped, create a flip configuration file naming the variables to be flipped.
☞ See section 41.2 for further information about the flip configuration file.
9. Conditions on filters (c= on flt and flt=) are ignored. To make them part of the
database, create an axis in which each filter condition is an element. Remember to
include a tab statement for this axis.
10. To avoid wrap-around on an 80-column screen, set the page width to less than 80
columns with pagwid= on the a statement, and shorten text on g statements
accordingly. If you remove the g statements altogether, Quanvert will format the
column headings internally which allows users to select a suitable page width during
the session.
11. Always use foot in preference to bot. The latter causes the tables to scroll off the top
of the screen.
12. Quanvert does not add tables. Axes on add statements are treated in the same way as
those on tab statements. However, be sure to remove any dummy elements (i.e.,
n01;dummy) from the axes named on the related tab statements.
13. The following points apply to databases created from Quantum programs containing
weighting:
a) All weight matrices are copied into the Quanvert database and the user may
weight tables using any matrix of his/her choice.
To create the database with a default weight matrix, name that matrix on the a
statement using wm=. Quanvert will then display figures weighted using this
matrix and will create tables using this matrix unless the user chooses a different
matrix or requests an unweighted table.
c) The flip process silently ignores wm= on tab statements. It also ignores wm=
with any value other than zero on elements, but issues a warning message for
each option ignored.
e) Quanvert users can choose between creating weighted and unweighted tables. If
a Quantum axis contains both weighted and unweighted base elements remove
one of them.
14. To make the record serial numbers available in the database, insert statements in the
Quantum edit to create a variable called serial and copy the serial number into it.
Then add inc=serial to any of the tab statements.
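For example, if the serial number is held in columns 1 to 5 (the column positions and
the tab statement shown here are illustrative):
real serial 1
ed
serial=c(1,5)
end
tab age sex;inc=serial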
15. Trailer card data is best analyzed using levels. However, when trailer cards are
analyzed according to the contents of the variables firstread, lastread etc. (i.e.,
analysis levels are not used), there are two points to consider:
a) Wherever possible, axes that are not at the lowest level should have a filter (e.g.,
c=lastread) as an option on the l statement not the tab statement.
l level
n01Household;c=thisread1
n01Person;c=thisread2
n01Trip;c=thisread3
b) Numeric variables (inc=) on the tab statement are flipped at the level of the tab
statement;
c) In weighted runs, each level must have a wm statement associated with it. If there
is only one set of weights which applies to all levels, it must be repeated at each
level.
e) Define at least one table at each level so that the levels cross-reference (qvlv*)
and weights files contain the correct information for each level.
17. If the data file contains text fields which will need to be available in Quanvert, add a
statement of the form:
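call alpha($name$,cn,length)
where name is the name of the text variable, n is the column in which the text starts, and
length is the number of columns to read.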
For example:
call alpha($address$,c210,50)
reads each respondent’s address into the variable ‘address’. 50 columns are read
starting in column 210.
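In a levels job, use call lalpha, which takes the level number as an additional first
parameter:
call lalpha(level,$name$,cn,length)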
where level is the level’s number determined by its position in the levels file (e.g., 5
for the fifth level defined), and all other parameters are as described above for call
alpha.
✎ In a panel study you can use call lalpha for setting up the cross-referencing for the
levels in the database.
18. Quanvert can deal with missing values in numeric variables. These are referred to as
DNA (question did not apply to respondent) and NA (question was applicable to the
respondent but was not answered) and have the values −2.0e9 and −1.0e9
respectively. You may have coded these as, say, & for DNA and blank for NA.
Quantum does not create these special values automatically, but you can create them
when the data is flipped by including code of the form shown below in the edit
section of the Quantum program:
real dn103 1
ed
dn103=c103
#c
if (c[103]==2048) q_dn103 = -2.0e9;
else if (c[103]==0) q_dn103 = -1.0e9;
#endc
return
end
This example is setting missing values for a numeric variable in column 103. The
lines of C code between the #c and #endc statements are the equivalent of:
if (c103’&’) dn103=DNA;
else; if (c103’ ’) dn103=NA
19. process statements create databases with misleading counts of respondents, so write
an edit program to format the data into the standard card format for a levels job first.
Then modify the axes definitions in the original Quantum program to match the new
data format. The example below illustrates a typical process problem. If there are
1,000 respondents and the Quantum program is:
ed
c180=’1’
do 5 t1 = 112,116,2
if (c(t1,t1+1)=$ $) go to 5
c(181,182) = c(t1,t1+1)
process
5 continue
c180=’ ’
end
tab filmax region;c=c180’1’
ttlBase: Number of Viewings
flt;c=c180’ ’
tab age sex
Quanvert will claim to have a base of 3,000 respondents rather than 1,000 since
respondents are counted each time they pass through the tab section.
In addition, since flip ignores filter conditions on flt statements (see point 9 above),
the table of age by sex will report 3,000 respondents rather than 1,000, even though
the percentages will be correct. If some respondents bypass process because one or
more of the fields is blank, the percentages will be incorrect as well.
20. If disk space is likely to be a problem, consider the following as ways of reducing the
amount of space required:
a) Use numcode on the l statement of single-coded axes. This reduces the amount of
space required for Quantum’s intermediate files and also the length of time taken
to create the Quanvert database.
b) Use fac= instead of n25 and inc=. n25 creates .mul files whereas fac= does not.
21. Check all elements that are flagged with norow, nocol or nohigh and decide whether
these options are really necessary in the database.
These options carry forward from the Quantum spec into the database and also into
any new axes that the user creates using axes that contain those elements. For
example, if an axis contains an element flagged with nocol, and that element is used
in a new axis, that element will not be printed if the new axis is used as the columns
of the table.
22. The current versions of Quanvert cannot create n25 elements in new axes. If the user
creates a new axis using an axis that contains an n25, the n25 will be missing from
the new axis.
23. If you want Quanvert for Windows users to be able to request special T-statistics:
a) Add an nsw option to the a statement. If the Quantum run contains this keyword
but there are no tstat statements in the run, the default for the database will be no
T-statistics. However, the Quanvert user will be able to select whichever tests
he/she wishes to run and Quanvert will have all the information it needs to run the
test.
b) If some axes are multicoded, add the overlap keyword to the a statement. This will
ensure that Quanvert uses the formulae that take into account places where a
respondent is present in two or more of the elements being tested.
c) The minimum base for T-statistics is 30. If the base (effective base in weighted
tables) is less than 30 the test is not run. You may specify a different minimum base
for the database using the keyword minbase= on the a statement.
d) The default confidence level is 95%. Use clevel= on the a statement to set a
different default.
e) Quanvert does not normally save its intermediate figures in a debugging file.
Specifying tstatdebug on the a statement alters this so that the default is to save
debugging information in the file tstat.dmp.
f) A tstat statement in an axis in the Quantum spec carries over into the Quanvert
database, so that whenever the user selects that axis the T-statistic comes into
effect with any options that were also defined on the tstat statement. The user may
turn off the T-statistic or alter its options using Quanvert. If you want the default
to be no T-statistics, remove all tstat statements from the Quantum axes.
g) Any significance test that calculates means may only be run if the axis contains an
n12 element. You may wish to add n12 statements to axes where you think users
may wish to apply these tests.
☞ For further information on the keywords mentioned in this section, see chapter 32.
Quick Reference
The flip configuration file provides additional control over the way a Quanvert database
is created from a Quantum program and data file. Currently, it is only used for numeric
variables.
A Quantum flip run normally creates one numeric variable for each inc= in the run. By
placing statements in the flip configuration file you can exercise more control over
exactly which numeric variables are created.
inc = specification;
inc = all;
to flip all numeric variables; this is the default if there is no configuration file
inc = none;
to flip none of the numeric variables
As the database is created, Quantum builds a flip output file reporting which numeric
variables have and have not been flipped. For example, if your flip configuration file
contains the line:
and the variables serial, numpots and total are named on tab statements, the flip output
file will contain three statements:
The names of the flip configuration and output files are flip.cnf and flip.out respectively.
Quick Reference
To store variables in subdirectories under Unix and DOS, create a file called numdir.qv
containing the number of subdirectories you require.
Under Unix and DOS you may indicate that axes, numeric and text variables and named
filters should be stored in subdirectories to avoid cluttering up the main project directory.
To do this, create a file called numdir.qv containing the number of subdirectories you
want: any value from 1 to 99 is valid. You must create this file before you flip the
database so that the program knows how many subdirectories to create.
When the flip program is running, it creates the given number of subdirectories naming
them subdnn.qv, where nn is a number in the range 0 to the number you defined in
numdir.qv. As variables are flipped, the files associated with them are placed in these
subdirectories. There is no need for users to know which subdirectories contain which
files since Quanvert takes care of this itself.
If you flip a job with subdirectories and you later want to change the number of
subdirectories (e.g., if you are about to add extra variables via qvmerge or via commands
within Quanvert itself):
1. Create a new project directory and, in it, a file called numdir.qv containing the new
number of subdirectories required.
2. Run qvmerge to copy the whole of the original directory into the new one.
☞ See section 42.5 for further information on merging variables into databases.
3. Delete the old directory and qvmerge and/or create new variables in the new
directory. These variables will be allocated to subdirectories according to the new
number.
As implied above, it is quite legal to merge variables (using the program qvmerge) from
databases with different numbers of subdirectories, or even between one with
subdirectories and one without. This is also true when using mflip. Each database that
makes up a multiflip project may have its own numdir.qv (or none at all) as may the
multiflip database itself, and the numbers of subdirectories in each may be different.
In order for the flip process to work with subdirectories, you must ensure that each user’s
$QTHOME/bin contains the version of quanflip (Unix) or quantum.bat (DOS) which makes
calls to a program called qvaxmove. Without this, the .ax files will not be moved into the
subdirectories and the programs in the Quanvert suite will fail.
✎ Under DOS, you are strongly advised to use subdirectories for databases with more
than 100 variables since this enables Quanvert to run much faster. Additionally, you
should choose the value in numdir.qv such that between 50 and 100 variables or files
are allocated to each subdirectory.
Quick Reference
To allow Quanvert to read raw Quantum data, create a file called axesmap.qv in the
database directory and insert the following lines:
data_file_name
axes_definitions
Quanvert can read raw Quantum data. This facility should not be used as an alternative
to flipping data since it can only deal with simple axes and grid axes, and cannot read
trailer card or levels databases. Nevertheless it is useful when only a few variables are
required from a database which would take several hours to flip.
First, make sure that .ax files exist for the axes you wish to use. These will be created
automatically in a full Quantum run, but if you do not want to do this, a compilation-only
run (quantum –c) will suffice. Next, create a file called axesmap.qv in the database
directory defining the axes which should be made available to Quanvert. The format of
this file is as follows. The first line names the data file. The second line contains six
numbers: the column in which the serial number starts, the number of columns it occupies,
the column in which the card type starts, the number of columns it occupies, the maximum
number of cards per record, and the maximum number of columns per card.
In the example below, the serial number starts in column 1 and is four columns long, the
card type starts in column 5 and is two columns long, there are a maximum of three cards
per record, and each card contains a maximum of 80 columns.
Lines 3 onwards define the axes to be used. Each line starts with an upper-case letter
indicating the type of information it contains:
A name Represents an l statement in a Quantum program.
S Represents a side statement in a grid axis.
U Represents an unconditional element such as:
n10Base
P colnum codes Defines an element with a standard column and code definition.
For example:
P 2
Here is an axesmap.qv file and its Quantum equivalent. The axesmap.qv file is shown in
the left-hand column with the corresponding Quantum statements in the right-hand
column:
data
1 4 5 2 3 80
A sex l sex
U col 107;Base;
P 107 1 Male;
P 107 2 Female
A age l age
U val (108,109);Base;
I 108 109 21.0 24.0 21-24;
I 108 109 25.0 34.0 25-34;
I 108 109 35.0 44.0 35-44;
I 108 109 45.0 54.0 45-54;
I 108 109 55.0 64.0 55-64;
I 108 109 65.0 99.9 65 and over
A coding l coding
U col 110;Base;
P 110 12 First=’12’;
P 110 3 Second=’34’;
P 110 56 Third=’56’;
P 110 7890-& Other=’7/&’
B 110 n01None;c=c110’ ’
G gridax l gridax
C 111 n01Brand A;col(a)=111
C 112 n01Brand B;col(a)=112
C 113 n01Brand C;col(a)=113
S side
U col a00;Base;
P 0 1 Very good;
P 0 2 Good;
P 0 3 Bad;
P 0 4 Very bad
Conditions which are more complex than those shown above or which use numeric or
text fields cannot currently be dealt with.
✎ If Quanvert will use the raw Quantum data rather than inverted data, you must
include a dummy axis, and a tab statement which uses it, in the Quantum job and the
axesmap.qv file. This axis must not exist for flipped jobs. If there are any errors in
the axesmap.qv file, Quanvert will not work.
A panel study is one in which the same set of respondents is asked the same set of
questions at periodic intervals. The data belonging to one period in the study is called a
wave. Since a panel study database is designed to enable users to look at more than one
database (wave) at a time, there are many similarities between creating a panel study
database and creating a multiflip database.
✎ Panel analyses cannot yet be performed within Quanvert Menus. Quanvert Menus
treats a panel study as an ordinary multiflipped database.
Each record in a wave must have a unique serial number, and the data file must be sorted
in ascending order on that serial number. Flip uses the serial number to determine which
record in each wave belongs to each respondent, so it is important that a respondent’s
record has the same serial number in all waves in which it is present. Don’t worry if a
respondent drops out of the panel; flip has been written to cope with gaps in serial
numbers. The serial number must be flipped as a text variable called serial.
✎ Use call alpha in the Quantum run to create the serial number variable as
documented in section 41.1.
In a levels job, there must be a serial number for each record at each level at which the
user may want to cross-reference records. For example, if there are three levels,
household, person and purchase the user may want to track a household or person through
the various waves, but is less likely to want to track the same purchase across all waves.
The serial number of the top level must be flipped as a text variable called serial1. Serial
numbers for lower levels may be flipped as text variables called serialn, where n is a
number corresponding to the level’s position in the levels file. In the example below, the
serial number for records at person level will be flipped as serial2, while records at trip
level will be flipped as serial4.
hhold cards=1,2,3
person cards=4,5 <hhold
purch cards=6,7 <person
trip cards=8 < person
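To create these serial number variables, use call lalpha in the edit section for the
appropriate level:
call lalpha(level_number,$serialn$,cn,length)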
where level_number is the level number as described above, and all other parameters are
as described above for call alpha.
The serial number for lower levels is a composite number which includes the serial
numbers of all higher levels as well as the level’s own unique number. Thus, if the
household serial number is in c(1,5) and the person serial number is in c(6,7), the full
serial number for the person level is columns 1 to 7, not just columns 6 to 7:
call lalpha(2,$serial2$,c1,7)
✎ You must ensure that you place these statements in the correct sections of the edit
because the compiler is unable to check this for you. If you do put a statement in the
wrong place (e.g., call lalpha(2 ..) in the edit for level 3), the data will be nonsense.
To link the individual waves into a panel database you use mflip with the –p option.
In a nonlevels job, the program expects to find a serial.alp file in each wave directory,
and expects it to be sorted in ascending order. If either of these conditions is not true, the
run terminates with an appropriate error message. Like flip, mflip –p has been written to
allow for missing serial numbers.
In a levels job, the program expects to find the file serial1.alp in each wave directory,
sorted as described above. It also looks for any other serialn.alp files for values of n
defined in the levels file, but does not complain if none exist.
If the panel study is weighted, you may need to define combined weighting for the
various combinations of waves in the study. You do this using Quanvert itself.
quanvert
2. Select the waves you are going to weight. This is the same as choosing projects in a
multiflip directory.
The help and what commands at this point tell you about the three analysis types and
also about the W option. The latter is only mentioned if:
b) you have access to unweighted figures (unwt=y in the users file), and
Since you should define combined weights as part of the database set-up, ordinary
users ought never to see the W option.
4. Enter the name(s) of the axis or axes upon which the weighting is based at the prompt
for axes names for tabulation. In a levels database, these must be at the same level,
and they determine the level at which the combined weights will be created.
5. Enter the name of the file containing weighting targets or press RETURN to enter them
manually. In this case Quanvert will read the axes you selected to calculate the
number of target figures required, and will then prompt ‘Enter n targets’, where n is
the number of targets you are to enter.
The values, either in a file or entered at the terminal, may be integers, reals or a
mixture. They may be separated by spaces, new lines and/or semicolons. If you key
them in on your terminal, Quanvert will re-issue the prompt, with n suitably adjusted,
each time you press RETURN until you have entered the expected number.
Once Quanvert has all the targets, it reads the data and creates the combined weights for
the level and set of projects chosen. When this is done, you are prompted again for the
axes for tabulation. If there are no more levels requiring combined weights for the current
set of projects, press RETURN to choose another set of projects.
Quanvert has a group of optional files which you may create in the database directory to
determine how various users may use the package.
Although all users need to be able to read these files, you may wish to restrict write access
to prevent them being accidentally changed or removed.
If you wish to provide Quanvert users with additional information about a variable (e.g.,
a fuller description of an axis, or advice on when to use a particular variable), create a file
called name.nts, where name is the name of the axis, variable or filter, and key in the text
using an editor. The first time in a session that a variable with an information file is used,
a message will be displayed inviting the user to list the file on his screen.
The profile options file allows you to define aliases for commands to be available as
output postprocessors for profiles created during a Quanvert session. You may define up
to twelve aliases per database.
Each definition covers two lines. The first line names the postprocessor and describes its
function. This line will be displayed if the user types the / character to request help at the
prompt for a postprocessor. The name may be up to 20 characters long, and the
description may be up to 80 characters. The two items should be separated by at least one
space.
The second line of the definition is the command(s) to be executed. Up to 120 characters
are allowed. Type the command exactly as you would if you were typing it at the system
prompt. For example, on a Unix system you might define a command for sorting profile
output as:
To use this command in Quanvert, the user will type or select ‘format’ at the prompt for
a postprocessor.
Weighting matrices in Quantum are identified by numbers. Quanvert users will find it
more useful if weighting matrices have names reflecting the characteristics on which the
weights are based.
To name weighting matrices, create a file containing one line per weight matrix. Each
line has two fields separated by spaces. The first is the number of the weighting matrix
as it appears on the wm statement in the Quantum spec, and the second is the name by
which it is to be known in Quanvert.
For example:
1 agesex
2 region
Here, the user may use the matrix called wm1 in Quantum by asking Quanvert to weight
by agesex.
The users file determines which commands a user may or may not use with the current
database. It consists of a line or block of lines for each user.
Each user’s entry starts with the user’s login name. This is followed by a space-separated
list of keywords from the list below, either on the same line as the user’s name, or on a
number of lines underneath. If many users share the same permissions, you may define a
global set of access rights at the top of the file, under the special user name ‘all’, and omit
those users from the rest of the file.
unwt=y | n Allow/disallow access to unweighted counts, and the profile and list
commands. The default is unwt=y.
create=y | n Allow/disallow the creation of new axes, numeric variables and
named filters. The default is create=y. Create and basic are never
valid commands in multiflip projects.
alter=y | n Allow/disallow alteration of texts and titles. alter=n also disallows
use of rename and delete. The default is alter=y.
lang=language Name the language in which prompts should be displayed.
Possibilities are English (default), French, German, Italian and
Spanish, but others may be defined by adding their names to the
availang file.
☞ For further information about the availang file, see the section entitled "Translatable
prompts" below.
profopts=path_name
Names the profile postprocessor file for this database (the default is
profopts in the database directory). Under Unix and DOS you may
abbreviate long path names using the special characters . and .. (i.e.,
dot and dot dot). The Unix symbol ~ meaning the user’s home
directory is not recognized.
include=names Allow access to the named axes and variables only. Names must be
separated by commas. The dummy axis is always available to all
users. Lists of more than one line must be split at a comma with the
second (and subsequent) lines starting in a column other than 1.
Include and exclude (see below) should be used with care since it is
possible that a user may create axes in one session but be unable to
access them in another. Include= is not valid under DOS, and
Quanvert will issue an error message if such a statement is found.
exclude=names Disallow access to the named axes and variables. Notes are as for
include.
filter=axis_name.element_names
Filter all analyses by the given filter. You may specify the filter as a
named filter (e.g., filter=male, where male is a named filter which
counts men only), or as an axis name followed by one or more
element names. The axis name must be separated from the element
name by a dot and, if there is more than one filter element, the
elements must be separated by commas. Thus:
filter=tried.White,Brown,Wholemeal
will filter all analyses so that they include only those respondents
who tried white, brown and/or wholemeal bread.
page=y | n Determine whether or not to pause every 22 lines when displaying
output on the screen. The default is page=y.
termwid=n Define the number of characters to display per line on the user’s
screen. Any value between 60 and 160 is acceptable, the default
being termwid=80.
dummy=axis_name
Use the named axis as the default column (breakdown) axis instead
of the default dummy axis.
pagwid=n Use a page width of n characters. This overrides the default set on
the flipped a statement.
side=n Use a row text width of n characters. This overrides the default set
on the flipped a statement.
n11print=none | first | all
Determine which, if any, n11 elements are to be printed. The default
is none which suppresses all n11s. first prints the first n11 in the axis
as if it were an n10, and all prints all n11s as if they were n10s.
The defaults given above are system defaults which may be overridden by the same
parameter at ‘all’ or user level. You may place several parameters on a line as long as
they are separated by spaces, and may continue an entry over several lines by starting the
continuation lines with at least one space. A sample users file might be:
all unwt=y
ben unwt=n create=n alter=n
include=age,sex,marry,house,region,childs
barbara
profopts=pr_proc
exclude=tried,bought,likes1,likes2
why,numloaf
lang=f page=y termwid=80
This says that everyone except Ben has access to the unweighted figures. Ben may not
create or alter axes, and has access to the named axes only. Barbara reads profile options
from the file pr_proc in the database directory, is prompted in French, has output which
stops every 22 lines and has 80 characters per line. She may use all variables except those
named with exclude.
If you create a users file which contains the names of specific users but does not start with
the global user ‘all’, only users named in the users file will be allowed access to the
database. If unnamed users are to have unrestricted access to the database, enter the
username ‘all’ at the top of the file and follow it with a parameter with a default setting
(e.g., create=y; alter=y; etc.).
Translatable prompts
Quanvert provides facilities for displaying prompts, table options and help texts in
languages other than English. Up to three different languages are allowed per machine.
The default is English: other languages which are automatically searched for are French,
German, Italian and Spanish.
Help texts may simply be translated as they stand, but the prompts and other texts need
to be translated and then converted into the fast machine-readable form expected by
Quanvert.
Each text in the prompt files consists of a 4-digit zero-filled number (e.g., 0001), a space
and the text starting in column 6. You must retain this format in your translations.
Additionally there are some special characters used internally by Quanvert which should
be retained in their original form during translation – these are generally single characters
preceded by % or \ signs.
English texts are stored in the english subdirectory of the main Quanvert program
directory, $QVHOME. Files containing help texts have names starting with the letters
qvh_, while the files of program prompts are called qotext.dat and qvtext.dat.
1. Go to $QVHOME and create a subdirectory with the name of the new language.
2. Copy all the English text files into this new directory.
3. Translate the texts in the .dat files (and the qvh_ files if you wish).
4. Convert each translated prompt file into Quanvert’s machine-readable form by typing:
textconv file_prefix
This creates a file prefixconv.dat (e.g., qvtext.dat creates qvconv.dat) which is used for
each Quanvert session by users whose entry in the users file requests this language. If
Quanvert is only ever to be run in one language you need only have that language
directory and can delete those you will never use.
To display prompts in any language other than those mentioned above, create the
language file as described above and then add a line to the file availang in the Quanvert
program directory as follows:
language [directory/]
where language is the name of the language you have just created as you wish to use it
in the users file, and directory is the name of the directory containing the language files.
If this is the same as the language it may be omitted. For example, on a Unix system we
might write:
english
if we have lang=english in the users file and want to read English texts from the
directory $QVHOME/english. Alternatively, if we have lang=french in the users file and
we want to read French texts from the directory $QVHOME/notenglish, we would write:
french notenglish
When textconv creates the ??conv.dat files, it stores the texts in blocks of a fixed number
of characters (bytes) each. On all machines other than PCs, there are 5,000,000 characters
(bytes) per block; on PCs there are 32,000 characters (bytes) per block. These values are
the maximum number of characters allowed per block; they are also the defaults.
Generally this blocking will not cause problems. However, PC/Quanvert may
occasionally fail with the message ‘not enough space for language text’. If this happens,
you should reformat the conv.dat files using a smaller block size with the command:
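textconv -bchars_per_block file_prefix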
where chars_per_block is the number of characters per block and is any value less than
32,000. For example:
textconv -b25000 qv
✎ You may need several attempts before finding a blocking factor that works.
• cleaning a database
Quick Reference
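To create a Quanvert database, run Quantum with the –v option so that the flip program
runs in place of the tabulation stage:
quantum –v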
–pd and –td allow you to read files from and create temporary files in directories other
than the directory in which you are running Quantum.
All Quanvert projects originate from Quantum. Although Quanvert produces tables
identical to those generated by Quantum, it does not normally use the raw data and
Quantum program files. Instead, it uses a series of compressed data and axis files, one
pair per axis, derived from the Quantum files. These individual databases are referred to
as inverted or transposed databases, and the process which creates them is called
flipping. In databases with simple axes it is possible to run Quanvert almost immediately
on the raw Quantum data.
✎ There are many ways in which minor modifications to your Quantum program file
can improve the content and usability of the Quanvert database. For a list of points
to consider, see chapter 41.
The –v parameter tells Quantum not to produce tables but, when it reaches the output
stage, to run the flip program instead.
The –pd and –td parameters allow you to read files from and create temporary files in
directories other than the directory in which you are running Quantum.
✎ For further information about these options, see section 34.10 and section 34.11.
☞ The flip process may run for some time so you may prefer to run it in the background
or via the batch system if these facilities are available on your machine.
The flip program creates a number of files. The ones which are important to Quanvert are:
Filename Contents
*.ax axes text files
*.fli inverted data files
*.inc numeric variables (inc) files
*.mul values for numeric variables in axes
*.bit bit files for named filters
*.btx text for named filters
*.alp text (alphanumeric) variables files
axes.inf names of axes present in the database
incs.inf names of numeric variables present in the database
alpha.inf names of text variables present in the database
bits.inf names of named filters present in the database
qvinfo levels and weighting information
qvlvmn levels cross-reference files defining the relationship between the
higher level m and the lower level n
seg1.qv default run conditions and titles from the a statement
wmvalsn.q weights for weight matrix n
The sex axis, for instance, will have two files: sex.ax containing the element texts and
sex.fli containing the inverted data for that axis.
The levels cross-reference files, qvlvmn, define the relationship between the data at levels
m and n. The values of m and n represent each level’s position in the levels file. For
example, if your database has the following structure:
                lev1
                  |
        -----------------------
        |                     |
      lev2a                 lev2b
        |                     |
   -----------                |
   |         |                |
 lev3a     lev3b            lev3c
             |
           lev4a
lev1 cards=1
lev2a cards=2 <lev1
lev2b cards=3 <lev1
lev3a cards=4 <lev2a
lev3b cards=5 <lev2a
lev3c cards=6 <lev2b
lev4a cards=7 <lev3b
The file qvlv12 defines the relationship between lines 1 and 2 – that is, lev1 and lev2a,
the file qvlv27 defines the link between lines 2 and 7 – that is, lev2a and lev4a, and so
on. Notice that there is no file qvlv26, for example, because lev3c is not related to lev2a.
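Applying the same rule to the rest of the structure, the full set of cross-reference files for
this example should be qvlv12, qvlv13, qvlv14, qvlv15, qvlv16, qvlv17, qvlv24, qvlv25,
qvlv27, qvlv36 and qvlv57, one file for each pair of levels where one is an ancestor of the
other.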
If you alter the order of the levels in this file, the names of the cross-reference files will
be different even though the information they contain will be the same. If the levels file
is:
lev1 cards=1
lev2a cards=2 <lev1
lev3a cards=4 <lev2a
lev3b cards=5 <lev2a
lev4a cards=7 <lev3b
lev2b cards=3 <lev1
lev3c cards=6 <lev2b
then, for example, the file defining the relationship between lev2a and lev4a is called
qvlv25 rather than qvlv27, because lev4a is now on line 5 of the file.
For Quanvert to work, the database directory needs to be readable by everyone who will
be using the database. The following files must be present in the database directory:
Additionally, if users are to be allowed to create new axes and numeric variables, the
directory must have write permission for those users and the files axes.inf and incs.inf
must have write access for those users.
Quick Reference
To tidy a directory once the database has been created, type:
flipclean [–a]
or, under DOS:
flipclea [–a]
flipclean [–a]
This deletes any temporary files created during the flip process but leaves intact any files
which are needed for Quanvert.
If any of the following files is present, flipclean will prompt for confirmation before
deleting them: out1, out2, out3, the holecount and frequency distribution files, the tables
file, and the error summary.
Use the –a option on the command line to delete the source and object code files created
by the Quantum compilation and load phases.
Quick Reference
To clean a database directory when the database is no longer required, type:
qvclean
When you have finished with a Quanvert database you are advised to delete as many files
as possible to save disk space. qvclean is a variant of quclean and flipclean which will
delete all the files which make up a database. Just go to the database directory and type:
qvclean
The DOS version of qvclean has a standard list of files that it tries to delete, and has no
way of testing whether a file exists before trying to delete it. When qvclean tries to delete
a file that does not exist it issues an error message saying that it cannot find the file. You
can ignore these messages.
✎ Do not remove axes files with operating system commands. Always use the
Quanvert delete command for the relevant axes.
If you’re not happy with any axes in the database, you can replace them with new or
updated versions by reflipping them and then merging them into the database as described
in the next section. When you run qvmerge, any axes in the database directory with the
same names as those being merged will be overwritten. The program will report the
number of axes replaced and will list their names on the screen.
Once a project has been flipped you can add axes or variables to it or replace existing axes
or variables with revised versions. You might add new axes if extra questions are added to
the questionnaire, or if you decide you need some extra information, such as the
interviewer’s number, which was present in the original data but wasn’t flipped. An
example of a revised axis is one that was incorrectly specified in the original Quantum
run, or specified in a way that did not produce the best results in Quanvert, and which you
want to replace with a new version.
1. Either update the original Quantum program or write a new Quantum program
containing just the new or amended axes or variables, making sure that each axis is
mentioned on at least one tab statement. Unless you want to change the default
options and titles for the database, keep the options on the a statement as they were
for the original database.
Whether you write a new program or just change and rerun the old one is up to you
and probably depends on the number and type of changes you wish to make.
2. Flip the data using the new specification and the original data.
The merge program assumes that there is no change in the order of the data in the data
file between the original flip run and the merge run, and that there is an equal number of
cases in each run.
If you are working on a weighted database, do not change the weighting between the
original flip run and the merge run. Changes to the weighting, including the definition of
new weighting matrices, count as changes to the basic structure of the database and the
merge run will fail. If you need to change the existing weighting or define new matrices
you must delete the database and recreate it.
Quanvert users may also add new variables to a database themselves using the program
qvnewvar.
☞ qvnewvar is described in the Quanvert Menus and Quanvert Text User’s Manuals.
Quick Reference
To merge variables into an existing database, go to the directory containing those
variables and type:
qvmerge database_path
To merge new or replacement axes or variables into an existing database, flip the variable
definitions against the original Quantum data as described above. You will now have
pairs of axis and data files for each new axis or variable. Go to the directory containing
these files and type:
qvmerge database_path
where database_path is the pathname of the database directory into which the variables
are to be merged. Pathnames may be absolute or relative.
As it merges variables, qvmerge updates the seg1.qv, wmvals and qvlv files with
information about the new variables.
If the command line includes parameters, only the variables named by those parameters
will be merged. You can enter lists of names by separating the names with commas (a
space indicates a new parameter).
To merge all variables of a given type without naming them individually (for example,
when you name particular axes but want all numeric and text variables), use the notation
−x+, −i+ or −p+ as appropriate (e.g., −i+ to merge all numeric variables).
In levels jobs, only axes and variables at the named levels are merged; if you omit the −l,
all levels are assumed.
If you are using Unix and you wish to merge new files from /usr/qt/barbara/temp into the
database in /usr/qt/quanvert/summary, you would type:
cd /usr/qt/barbara/temp
qvmerge /usr/qt/quanvert/summary
If you are using DOS and you wish to merge files from \barbara\temp
into \quanvert\dbases\summary, you would type:
cd \barbara\temp
qvmerge \quanvert\dbases\summary
The examples which follow assume that the files to be merged are in the current directory
and that the database directory, myproj, is a subdirectory within the current directory.
The first command merges just the axes age, sex and mstat into the database. Since there
are no other parameters on the command line, all numeric or text variables, and any other
axes in the current directory are ignored:
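(The command shown here is an illustration only; it assumes that axis names are listed
after the –x option.)
qvmerge myproj -x age,sex,mstat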
The next example merges the axes age, sex and mstat, and all numeric and text variables.
If there are no numeric or text variables, or you name a non-existent axis or variable, a
warning is issued:
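(Again as an illustration, assuming that –i+ and –p+ request all numeric and text
variables as described earlier.)
qvmerge myproj -x age,sex,mstat -i+ -p+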
The next command merges axes at household or person level which are called income.
All other axes and variables at those levels are ignored, as are all axes and variables at
other levels:
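(As an illustration, assuming level names are given with the –l option.)
qvmerge myproj -l household,person -x income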
Sometimes a survey may be conducted as a set of separate surveys in which each
questionnaire has the same axes. For
instance, a nationwide survey of newspaper readership may be carried out on a regional
basis, the data for each region being loaded and processed separately as if each region
were an entirely independent survey. However, when all regional tabulations are done, it
may be necessary to produce tables based on the population as a whole.
Users requiring information based on all regions could create separate tables for each
region using Quanvert and then add up the corresponding cells in each table manually.
However, by using the mflip program, it is possible to link the regional data into a
national database so that the required tables can be produced by using Quanvert on that
database instead.
When databases are combined in this way, mflip builds an internal cross-reference table
showing which axes and variables are present in each database. When the user comes to
analyze this new database he/she may choose whether to look at all its constituent
databases or only a subset of them. For example, he/she may choose to look at all regions
or only the South and East.
The cross-reference table is checked to see which variables are common to the selected
databases and these become available for analysis. For example:
When all regions are selected, only age may be used; when North and Midlands are
selected, age and mstat are available; for North and South, age and sex may be analyzed,
and so on.
Axes that are common to all directories in a multiproject directory need not contain an
identical set of elements in each of the constituent directories. If you know that you have
axes of the same name that are similar but not identical (for example, when an element is
present in some databases but not in others), you may specify when you run mflip
whether to use the first or last named database as the master, and whether to combine
those axes using element texts or positions. For example, suppose the shop axis is present
in varying forms in the three regional databases:
If you combine the databases in the order shown here, and use the last database (South)
as the master, and merge axes of the same name using element texts, a response of
Gateway in the North (row 4) or Midlands (row 3) will be merged into row 2 as this is
where it appears in the master South database.
A response of Sainsbury in North will not contribute to the multiproject database because
there is no element for it in the master database, although respondents who gave that
response will still appear in the base for the shop axis.
If you use the first database as the master, the shop axis in the multiproject database will
contain the four elements shown in North. Elements in the Midlands and South databases
will be merged into the appropriate rows in a similar manner to the one just mentioned.
If users want all elements to appear in the table, even if they do not appear in all the
chosen databases, you should ensure that all related axes contain the same elements in all
databases. To prevent these extra elements being printed when a single database is used,
place the option nz on the elements or on the l statement:
l shop;nz
col 132;Base;Gateway;Safeway;Sainsbury;Tesco;Waitrose
On long-running projects such as those where you gather the same type of data on a
monthly basis, it is possible that an axis may change during the life of the project. For
example, if a shop goes out of business you may decide to replace it with a new shop and
to re-use the code that represented the old shop in the data for the new shop.
If you want the mflip database to include the original element and its replacement, you
will need to create an axis that contains the full list of element texts. Elements which no
longer exist in the database should have conditions that cannot be met, so that the element
appears in the database but has no respondents in it.
In the example, if Safeway is replaced by Asda, you could write the new axis as:
l shop;nz
col 132;Base;Gateway;Asda;Sainsbury;Tesco;Waitrose
n01Safeway;c=c132=’1/&’
If you merge the constituent databases on element texts, and use the latest database as the
master, your multiflip database will contain six supermarkets.
Sometimes an axis will contain two or more elements with the same text. This often
happens when the axis contains subgroups (the Quantum group keyword) or subheadings
(n23s). When this happens, mflip checks whether the element is in exactly one subgroup
and, if not, inserts the subgroup name at the start of the element text in an attempt to make
the element text unique. Elements present in only one subgroup are left untouched. A
similar thing happens with subheadings, except that if the element is preceded by more
than one subheading, only the most recent is taken. An element text may be preceded by
both a group name and a subheading text. For example, if the axis is specified as:
l brands
n10Base
n23Biological
col 116;Brand A;Brand B;Brand C;Brand D
n23Non-biological
col 117;Brand A;Brand C;Brand D;Brand E
mflip treats the element texts as if they read:
Base
Biological Brand A
Biological Brand B
Biological Brand C
Biological Brand D
Non-biological Brand A
Non-biological Brand C
Non-biological Brand D
Non-biological Brand E
It is now clear that the pairs of Brand A, Brand C and Brand D elements are each different.
Having made this conversion, mflip then links the databases and merges elements if they
have identical texts. If axes in databases other than the master database still have
duplicate element texts, mflip issues a warning to this effect. However, if axes in the
master database have duplicate elements, but there are no duplicates in the constituent
axes being matched, no warning will be issued.
1. Although some basic comparisons are performed on the contents of axes of the same
name, they are not exhaustive. Make sure that variables that are not to be merged
have different names.
2. Once checking is complete across all constituent databases, variables are merged into
the new database. If Quanvert is then used in any of the constituent databases to
rename or delete variables, the links to the multiproject directory will be damaged.
You can correct this by rerunning mflip.
3. When databases are multiflipped, a database axis is created whose elements are the
individual databases. These elements are created with the nz option in the Quantum
program file. If a user runs Quanvert on a subset of the multiproject database (e.g.,
North and Midlands only) and uses this axis as the rows or columns of a table,
Quanvert will print only the elements for the databases being used (i.e., the ones
which are non-zero).
4. If a file starting with the string ‘tmpax9’ is found in a constituent database, an error
message is generated and the run is terminated.
To create the new database, use the mflip program as described below.
Running mflip
Quick Reference
To create a multiproject database, create a new database directory, go to it and type:
To create a multiproject database, create a new directory and go to it. Then type:
The command you type must include only one of the options –e, –p, –s or –t, since these
determine how the axes in the constituent databases will be merged:
–e merge on element position. If you use this option axes with the same
names should be identical. Quanvert does not check this; it simply
takes the first element in each axis and merges them, and then
continues with the second element. Unless you specify otherwise,
mflip uses the last-named database as the master against which axes
in other databases will be compared.
☞ Refer to the section entitled "Creating an mflip command file" for further
information about using the –i option on the mflip command line.
First, mflip prompts for a word which can be used to refer to the individual databases
which make up the multiproject database. This word may be up to seven characters long
(six with levels databases) and should not be the same as the name of any of the axes or
other variables in the database; the default is project (truncated to projec on levels jobs).
If you are combining regional databases into a national multiproject database you might
enter the keyword as region. Names may be entered in upper or lower case, but will be
converted to lower case before the file is created; for instance, if you enter the database
name as REGION, the database axis will be created with the name region.
Next, you are prompted to enter the database name and directory name or job id for each
project to be included in the new database. Type the two words separated by a space. An
error message is displayed if you fail to enter both items. The database name may be
anything you like as long as it is a valid filename and does not exceed ten characters in
length. Quanvert uses it as the element text for that database in the project axis. Under
Unix and DOS you may abbreviate directory names using the special characters . and ..
(i.e., dot and dot dot).
Examples
You may enter more than one directory name or job id at each prompt. This has the effect
of merging the named databases into a single element of the multiproject database. For
example:
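(The directory names mar, apr, may, jun, jul and aug used here are illustrative.)
Spring mar apr may
Summer jun jul aug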
These examples take the three databases for March, April and May and merge them to
form the Spring element in the mflip project axis (which you may have called season).
Similarly, the June, July and August databases are merged to form the Summer element.
As you’ll notice, directory names/job ids in the list are separated by spaces.
A database may be present in more than one element of the project axis: you may, for
instance, wish to include September in both the Summer and Autumn periods. Flipping
databases in this way affects the way that Quanvert works for users (it will always use
the project axis as the columns of the table), so it is advisable to include some references
to the constituent databases in the element texts of the project axis. For example:
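(Again the names are illustrative; each element text here simply reminds the user which
months that element contains.)
SumJunSep jun jul aug sep
AutSepDec sep oct nov dec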
When you press Return at the prompt, mflip lists the database names and their
corresponding directories and checks that this is correct before starting to link the
common variables into the multiproject database:
Name Directory
North /usr/barbara/area1/
South /usr/quanvert/area2/
East /usr/ben/area3/
West /usr/barbara/west/
O.k. to proceed? y
As each type of variable is copied, mflip reports the total number of each type linked.
It is at this time that the database axis is created. It consists of a Base element and as many
other elements as there are databases in the multiproject directory. In a national
readership survey, for example, the region axis may have five elements – Base, North,
South, East and West.
If there are no common variables or mflip is unable to copy files, an error message is
issued and the run stops. Where files already exist, they are overwritten without warning.
A message is displayed when the run is complete.
If you want to use the –i option on the mflip command line, you must create an input file
which contains the commands you would otherwise enter in response to the mflip
prompts. The first line of this file must be the name you wish to use for the database axis.
The last line of the file should be the response to the final mflip question, O.k. to
proceed? or a blank line.
For example, suppose you have four projects, North, South, East and West which you
wish to multiflip into a single project directory. If you do not use the –i option on the
mflip command line then under Unix you would enter data in response to mflip prompts
as follows:
To achieve the same result using the –i parameter you should create a file containing the
following text:
region
North /usr/barbara/area1
South /usr/barbara/area2
East /usr/barbara/area3
West /usr/barbara/area4
y
You would do the same thing under DOS except that you would use backslashes instead
of forward slashes in pathnames.
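As an illustration, assuming you merge on element position with –e and that your input
file is called mfinput (a hypothetical name), the command might then be:
mflip -e -i mfinput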
1. For each constituent database, create the new variables in the normal way in a
separate place.
2. Merge the new variables into each constituent database with qvmerge.
If you want to add variables to a database that is part of a multiproject database, and a
variable of a particular name already exists in that database, the new data will not be
merged with the old unless the two variables have the same number of elements. If they
differ, a warning is issued and the axis is ignored. You can get around this restriction by
deleting the variable from the database before you run qvmerge.
A multiproject database may contain data from projects conducted on different subjects,
in different areas or over different periods of time. When a user runs Quanvert on a
multiproject database, Quanvert needs to know which of the component projects the user
wants to work with during the current session. For example, in a multiproject database
whose component projects are january north, january south, february north, february
south, and so on, the user may choose to work on all component projects, or all january
projects, or all north projects, or any other combination of his/her choice.
Normally, Quanvert displays the list of component projects on the user’s screen at the
start of each session and waits for him/her to select the ones to be used during that session.
The user may choose any combination of projects by tagging each one individually or all
projects by tagging the group as a whole.
If there are many projects, and only certain projects are required for this session, it can
take some time to select them one at a time. One way round this problem is to use a
project selection file. This is a file that the user creates before each session containing
abbreviated references to the component projects to be used during the current session.
If this file does not exist, Quanvert displays the list of component projects as usual.
The project selection file can be built by hand using a text editing program, but since this
requires the user to know the abbreviations for each project, it is usually better to create it
automatically using the program qvprjsel. The sections which follow explain how to set
up the files used by qvprjsel and how to use it to create a project selection file.
The project selection file contains a list of references to component projects within a
multiproject database. Quanvert recognizes these references and is able to translate them
internally into the pathnames for projects within the multiproject database. For example:
brdjan
brdfeb
brdmar
These names are compound references to a project carried out in different periods. If the
project was also broken down by region, the references in the project selection file would
also include this information:
brdnjan
brdsjan
brdnfeb
brdsfeb
brdnmar
brdsmar
If the projects in the multiproject database also cover different studies – for example, UK
holidays and foreign holidays – the project parts of the references may vary too:
uknjan
uknfeb
uksjan
uksfeb
fornjan
fornfeb
forsjan
forsfeb
To ensure that the entries in the file are accurate it is best to create it using the program
provided for this purpose (qvprjsel) rather than using a text editor. This is particularly
true when the projects have not been given descriptive names or when the multiproject
directory has a large number of component projects.
Project references are made up of one, two or three parts, depending on how the projects
are organized. The first part is the project type or name: all project references must have
this section. The second part is the area which is optional as in our first example; the third
part is the period.
✎ As the examples illustrate, the naming conventions for the areas and periods
sections of the project references must follow the same pattern for all references in
the file. You cannot have one project whose references are broken down by month,
for example, and another project whose references are broken down by quarter, nor
can you have one project broken down by area and period and another broken down
by area only.
The name of the project selection file is mfwaves.line_number, where line_number is the
number of the line that connects the user to the computer. On multiuser systems, each
user will need his or her own project selection file. On standalone PCs, the line number
will usually be 00.
qvprjsel obtains the information it needs about the structure of the multiproject database
by reading three files: the types file, the areas file and the periods file.
These files enable you to control which project types, areas and periods can be chosen
together. Since all project references require a project type, you must create a types file.
The other two files are optional.
The types, areas and periods files share the same format. Each may contain up to four
lines laid out as follows:
which_lines path_name project_name description
which_lines is a list of line numbers, separated by commas, denoting other lines that may
be selected with this line. If, in the types file, the first entry reads:
2,3 . brd Bread
it indicates that the user may select the bread project (line 1) with any of the projects
named on lines 2 and 3.
Similarly, if the second entry in the areas file reads:
1 . n North
it indicates that the user may select the northern region (line 2) with the area named on
line 1.
If, in the periods file, the first entry reads:
2,3,4 . jan January
it indicates that the user may select the January period (line 1) with any of the periods
named on lines 2, 3 and 4.
An entry that contains only its own line number may not be used in combination with any
other entry.
You may also use negative numbers and zero as line numbers. Use line zero to indicate
that the current line may be combined with any other line. We could write the period
example for january (see above) as:
0 . jan January
A negative line number between −1 and −4 indicates that the current line may not be
combined with that line. For example:
-2 . jan January
means that the january line may be combined with any line except line 2.
The list of line numbers must be all positive, all negative, or zero.
path_name is the path name of the directory to move to after constructing the current
element of the project name in the project selection file. The examples we’ve used so far
all use a dot as the relative path. This represents the simplest case in which the types,
areas and periods files are all in the multiproject directory and you run qvprjsel in that
directory. However you can run qvprjsel from other directories as long as you tell it
where to find the types, areas and periods files.
The types file must be in the directory in which you run qvprjsel but this does not have
to be the multiproject directory. If the areas file is used and it is not in the directory in
which you run qvprjsel then you must specify the path to reach this directory in the types
file. Similarly you must specify the path to the periods file if this is used. You must also
ensure that you specify the path back to the multiproject directory in the last file accessed
by qvprjsel (the periods file if you use all three files).
For example, suppose you have a directory with the following subdirectories:
                  |
     -----------------------------------
     |          |           |          |
  bread1     bread2      brdall      work
where brdall is the multiproject directory and work is the directory in which you run
qvprjsel. bread1 and bread2 each have subdirectories:
                  |
     -----------------------------------
     |          |           |          |
   North      South       East       West
North, South, East and West all have subdirectories holding bread surveys for individual
months. These are the component projects of the multiproject in brdall.
The types file must be in the working directory, work. If the bread1 and bread2
directories both contain areas files then the types file must include the path to the
directory from which qvprjsel is to read the areas file:
..\bread1 or ..\bread2
Similarly if the periods files are held in the North, South, East and West directories the
areas file must include the path to the directory from which the periods file is to be read.
The periods file must include the path back to the multiproject directory:
..\..\brdall
project_name is the name of a project type, area or period, as appropriate. If you select
this type, area or period, Quanvert will use this word as the type, area or period
component of the project reference in mfwaves. For example, suppose you have these
three lines, one in each of the types, areas and periods files:
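(The line-number and path fields shown here are illustrative; what matters for this
example are the project_name fields brd, n and jan.)
0 . brd Bread
0 . n North
0 . jan January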
If you choose these entries from the lists, Quanvert will create the entry brdnjan in the
project selection file.
Once you have set up the types file and, if necessary, the areas and periods files, you can
run qvprjsel to create the project selection file(s). This is a simple procedure which
involves selecting types, areas and periods from descriptions displayed on the screen.
You can therefore, if you wish, leave it to users to run qvprjsel. If you decide to allow
users to do this, you must ensure that the types and, if necessary, the areas and periods
files exist as users cannot create these files themselves.
To create a project selection file, type:
qvprjsel –i line_number
where line_number is a user’s line number to the computer. For example, to create an
entry for the user whose line number is p1 you’d type:
qvprjsel -i p1
Quanvert lists the description fields of the types, areas and periods files, one file at a time,
and waits for you to choose the project types, areas and periods you want to use. You do
this by using the up and down arrow keys to highlight an entry and then pressing Enter
to select it. Quanvert displays an arrow marker to the left of each entry you choose. If a
file indicates, through its line number field, that a certain combination of choices is
invalid, Quanvert disallows the selection.
If you change your mind about which entries to select, you may cancel an entry by
highlighting it again and pressing Enter. The arrow marker disappears from that entry.
When you have made all your choices, press F10 to confirm them.
When Quanvert has dealt with all the relevant files, it creates the entries in the project
selection file by merging each type selection with each area and period selection.
For example, if you choose brd and sp types, n and s from areas, and jan and feb from
periods, Quanvert will create eight entries:
brdnjan
brdsjan
brdnfeb
brdsfeb
spnjan
spsjan
spnfeb
spsfeb
If you are creating project selection files on behalf of users, remember to rerun qvprjsel
with a different line number for each user. To change the contents of an existing project
selection file simply rerun qvprjsel and choose a new combination of project types, areas
and periods.
Quick Reference
To compress .inc files, type:
qvshrinc file_names
.inc files store the values of numeric variables for a Quanvert database and can be quite
large. If the files contain only integers, you can reduce their size by compressing them
with qvshrinc. Type:
qvshrinc file_names
Just type the main part of each file name, not the .inc suffix.
The compressed version of each file overwrites the original version. As files are
compressed, qvshrinc displays messages naming files and any action taken. Options on
the command line change one or both of these things:
The rates of compression vary according to the range of values found. Values in the range
0 to 255 result in a reduction of approximately 75% (on VAX-like machines) in the file
size. This does not mean that the file must contain numbers between 0 and 255
specifically; rather that the difference between the highest and lowest values in the file
must be in the range 0 to 255. For example, if the lowest value is 500 the highest may
then be up to 755 in order for a 75% reduction to be possible.
Similarly, values in the range 0 to 65535 will result in a reduction in size of around 50%.
Quick Reference
To create a titles file for Quanvert Menus databases flipped prior to version 3.6 and
Quanvert Text databases flipped prior to version 8.4a, type:
qvtitles
Quanvert Menus version 3.6 onwards and Quanvert Text version 8.4a onwards read axis
and variable descriptions from a file called titles.inf in the database directory rather than
extracting them from axes, numeric and alpha variables files. Since Quanvert needs to
read fewer files, this makes running Quanvert Menus and Quanvert Text considerably
faster.
✎ The Quanvert numbering scheme has been changed to bring it into line with the
numbering system already used for Quantum. Versions of Quanvert Menus after 3.6
and of Quanvert Text after 8.4a are now all numbered version 5d.x.
Versions of flip provided with Quantum version 5d.1 onwards create this file
automatically, but you can create it for databases flipped with an earlier version by going
to the database directory and typing:
qvtitles
For axes, the lines contain the axis title which will be printed at the top of any tables in
which the axis is used. For numeric variables, the lines contain the minimum, maximum
and mean of the values present in the variable. A similar thing happens with text variables
except that the figures refer to the text length.
These values aren’t very helpful to users, so you may like to replace them with
descriptions along the lines of the axis titles. You can do this by editing the titles.inf file.
This file contains two lines for each variable. The first line in each pair is:
#@variable-type variable-name
For example:
#@axis marry
The second line in each pair is a text line, as described above. For example:
#@axis marry
Marital status
#@inc resage
Min value 21.0,max value 80.0,mean value 45.9
Quanvert only reads the first line after every #@ line, so if you want to change the file in
any way it’s safest to insert the new text immediately after the #@ line, before the
existing title. This way you still have the original title generated by the flip run. This is
especially important with numeric and text variables whose entries are generated by the
flip run and which are not available anywhere else.
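For example, to give the resage variable a more readable description while keeping the
line generated by the flip run, you might edit its entry so that it reads (the new description
text is illustrative):
#@inc resage
Respondent age in years
Min value 21.0,max value 80.0,mean value 45.9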
Quick Reference
To create a secure Quanvert database or check the status of an existing database, type:
Options are:
Quantum provides a program called qvsecure which enables you to make a Quanvert database
secure, preventing access to weighted or unweighted information below a given value.
Once a database has been made secure, options which are normally available in Quanvert
are disallowed, as follows:
qvsecure –m min_value –u
This causes unweighted absolute data with a value less than min_value to be replaced
with the special characters defined with spechar= in the Quantum spec. For example:
qvsecure -m 100 -u
creates a secure database in which all unweighted absolute values less than 100 are
replaced by special characters.
qvsecure –m min_value –w
This causes weighted absolute data with a value less than min_value to be replaced with
special characters.
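For example, by analogy with the unweighted case:
qvsecure -m 100 -w
creates a secure database in which all weighted absolute values less than 100 are replaced
by special characters.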
You may secure a weighted database on weighted or unweighted figures but not on both.
The –w option is disallowed when making an unweighted database secure. However, you
must still specify –u since qvsecure has no default database type.
Provided that you enter a value for min_value that is greater than or equal to zero, and the
first number in any row or column of the table is less than that value, then that number is
replaced with the second spechar character and the rest of the row or column is replaced
with the first spechar character. If you do not define any special characters for the
database, Quanvert uses an asterisk as the second special character and a blank as the
first.
If you define min_value as zero, do not use either –u or –w. Instead just use –m0 by itself.
The database will continue to indicate the other features of a secure database but no
suppression of specific values will take place.
To change the security level of a database, rerun qvsecure with a different value for min_value.
qvsecure –i
qvsecure –x
Quick Reference
To update an old database for use with a newer version of Quanvert, go to the database
directory and type:
qvupdate
As Quanvert develops, the internal structures of some files change and new files become
required in order for Quanvert to run. In most cases you can update a database just by
reflipping it, but where this is not practical you can run the qvupdate program instead.
• If the file titles.inf does not exist, qvupdate runs qvtitles (wqvtitle.exe within Quanvert
for Windows) to create it with a title for each axis, numeric, alpha and bit variable.
Quanvert Menus and Quanvert for Windows both use this file but Quanvert Text
does not.
• If any of the following are out of date, qvupdate runs the program qvaddmar
(wqvadmar.exe within Quanvert for Windows) to make them up to date:
– If marginal values have not been copied from *.fli to *.ax, they are copied.
Quanvert Menus and Quanvert for Windows look for marginals in the .ax files but
Quanvert Text does not.
– If flags indicating which axes are multicoded have not been set in the axes.inf file,
they are set. Only Quanvert for Windows uses these.
– If flags indicating which multicoded axes are single coded for the purposes of
exporting in SAS, SPSS or ASCII format have not been set in the axes.inf file, they
are set. Only Quanvert for Windows uses these.
• If the database is weighted and the file wtnames does not exist, qvupdate creates one
file per weight matrix with the name Weight_n where n is a number. All versions of
Quanvert use this file if it exists.
• If any of the following are out of date, qvupdate runs the program wtincs
(wwtincs.exe within Quanvert for Windows) to make them up to date:
– For each weight matrix, a numeric variable is created using the names in the
wtnames file which by default (see previous point) is Weight_n.
– For each valid pair of levels, qvupdate creates a file called lv_m_n (where m is the
upper level and n is the ;lower level) containing, for each case the upper level the
number of lower level cases inside it. For example, if level 1 is households and
level 2 is people, lv_1_2 is created at level household and contains a count of
people in each household. A pair of levels is valid if they are
ancestor-and-descendent, but not if they are siblings, uncles, and so on.
✎ Users of Quanvert Text databases may also find these files useful for checking
individual respondent records or for filtering.
The notes in this section describe facilities for databases designed for use with Quanvert
for Windows.
Quanvert for Windows users may obtain information about the database they are using
by selecting the About database_name command from the Help menu. The text that this
command displays is read from the file db.nts in the database directory. The file is a text
file that you may create with any text editor.
There are no restrictions on the length of lines or the number of lines in the notes file. If
the file is wider or longer than the notes frame, Quanvert will provide scroll bars for
reaching the extra text.
Quanvert for Windows’ online help screens display the Quanvert icon at the start of their
title bars. Each database may also have its own icon defined in the file db.ico in the
database directory. If this file exists, Quanvert will display it next to the notes frame of
the Help About window.
3. Copy the bitmap file into the database directory and call it db.ico.
In large companies or those with regional offices, you may need to make Quanvert
databases available on more than one computer; for example, on the main machine at
head office and on other machines in various regional offices, or on the main machine
and a number of PCs. Although you could flip the database on each machine in turn, it
saves time if you flip it on one machine and then copy the whole database onto any other
machines which require it.
There are three things to consider when transferring databases between machines:
i) A Quanvert database consists of many files, so copying it onto other machines can
be time-consuming, especially if you have to deal with each file separately.
ii) Depending on the size of the database and the medium you are using to make the
transfer, you may or may not be able to transfer the database all in one go. For
example, if you are transferring the database by copying it onto floppy disks, you will
be constrained by the amount of data the disks can hold.
iii) Although all computers are able to store the same type of information, they often do
so in different ways. For instance, an axis file on a Unix system does not have the
same format as the corresponding file on a PC.
Quanvert provides a group of programs to help you deal with these potential problems.
The first three are available on all machines:
qvpack (or qvpack.exe) to pack and optionally compress the database files into a
single file prior to transmission.
qvtrans (or qvtrans.exe) to uncompress and unpack the database into separate files
after transmission.
qsj (or qsj.exe) to split a large packed file into smaller files, and to join the pieces
together again after transfer.
You will notice that there are no programs for converting between different file formats
or for copying the packed database between computers. Conversion programs are not
necessary because qvpack and qvtrans are able to deal with most file format conversions
internally. File transfer programs are not provided since these are readily available from
elsewhere. You may use any method of your choice for moving the database from one
machine to another.
☞ For further information on file formats, see the section entitled "Unknown file
structures".
For advice on copying files, see the section entitled "Copying the database".
–u Do not compress the database. If you are packing a database for a client who does
not have a version of qvtrans that can uncompress files, you should use this option
to pack the database without compression or provide the client with the latest
qvtrans.
–b The maximum size in bytes for each file if the database needs to be split. A byte is
the same as a character. The default is 1,048,576 bytes.
A qvpk script or batch file has two stages: first run the program qvpack to pack and
compress the database, then, if necessary, run qsj to split the packed file. The default size
for packed files is 1,048,576 characters, but you may change this by defining a different
size with the –b parameter on the command line. Once the database is packed, qvpk
checks its size and, if it is larger than the set number of characters, passes it to qsj for
splitting.
✎ If you omit –b and the packed database file is smaller than 1,048,576 bytes, qvpk
will report that the file is too small to need splitting.
For example, to pack an uncompressed database into the file mydb by placing 1,400,000
characters in each file, you’ll type:
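(This form assumes, as for qvtr, that the name of the packed file is given with the –p
option.)
qvpk -u -p mydb -b1400000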
When the database is split into a number of files, the names of those files consist of a
common root and a three-character suffix separated by a dot. The suffix usually starts
with the letter q. In the example shown above, the split files will be called mydb.q01,
mydb.q02, and so on.
If the name given for the database contains a dot, or more than one dot, qsj takes the file
name up to the first dot and checks to see what the next character is after the dot. If there
is something after the dot, qsj uses that character as the start of the suffix; if not, it uses
the default q. For example, mydb.jan produces files called mydb.j01, mydb.j02, and so on.
If the root of the file name is longer than eight characters, it is truncated to this length so
that it will be valid on any machine. Hence mydatabase.jan produces files such as
mydataba.j01.
If the database name is a complete path name, qsj makes a note of the path name so that
the files can be created in the right location, and then follows its usual rules for file
naming.
Once all the necessary files have been created, qsj creates an index file whose suffix ends in zz (e.g.,
mydb.qzz) containing a list of the files which make up the complete database and the
order in which they must be joined.
If splitting is unnecessary, the database is packed into the file named with the –p option
as usual.
qvpack, which packs and compresses the database, expects to find the files qvinfo,
axes.inf and incs.inf which will tell it which weights, levels, axis and incs files to expect.
The files alpha.inf and/or bits.inf, if they exist, provide the same information for alpha
variables and named filters.
As each file is packed, its name is echoed on your screen. If qvpack is unable to find an
expected file, it issues a message to that effect and terminates immediately.
You may copy the packed database file onto the destination machine by copying it onto
media such as floppy disks or a tape, or by using a file transfer program such as rcp, ftp
or kermit. If you choose a file transfer program, you must copy the file in binary format
with no conversion whatsoever. If you are transferring to or from a PC using kermit, you
can set the file type on the PC by typing:
set file type binary
once you have started kermit up, but before you transfer the packed file. On Unix systems
you can send the file by typing, for example:
kermit -i -s packed_file
where packed_file is the name of the packed database. The –i option tells kermit to send
the file exactly as it is.
When you have finished, go onto the destination machine ready to run qvtr or qvtrans as
described above.
To unpack a database on a machine running SCO Unix, SunOS or DOS you use qvtr. This
runs the program qsj to join any split files which make up the database, and then runs
qvtrans to uncompress and unpack the packed file, and make any conversions in file
formats that are necessary.
1. Go into the directory containing the packed file(s) you have just transferred or
loaded.
2. Create a directory to contain the files which qvtrans will create. This is the directory
which will become the new database directory in which users can run Quanvert. It
may be a subdirectory of the current directory or a separate directory elsewhere on
the system. If you do not create a new directory, qvtrans will place the database files
in the current directory.
3. Type:
qvtr -p file_name [directory_name]
where file_name is the name of the packed database, and directory_name is the name of
the directory in which to place the unpacked files. If you are already attached to the
database directory you may omit this parameter.
Like qvpk, qvtr has two stages: run qsj to join any component files which exist and then
pass that file to qvtrans for unpacking. When it starts, qsj checks to see whether a file
already exists with the database name you have given. If so, it assumes that the
component files have already been joined and therefore stops. If there is no such file, qsj
calculates the name of the index file from the name given on the command line using the
same rules as for splitting the database.
For example, if you type:
qvtr -p mydatabase.jan
and there is no file with this name, qsj will look for an index file called mydataba.jzz.
If qsj cannot find the index file, it will not run. If the file does exist, qsj joins the files
named in the index in the order they are named. If any of the files are missing from the
directory, the run fails.
You can run qsj outside qvpk and qvtr to split and join files of your choice. You split files
with:
where file_name is the name of the packed file you wish to split. To join the files together
again, type:
qsj –j file_name
The notes given above for qvpk and qvtr regarding file naming conventions apply in both
cases.
When you pack a database, qvpack writes an entry at the start of the packed file naming
the file structure of the files in the database. When you unpack a database, qvtrans reads
this entry and makes any file structure conversions that are necessary.
Although qvpack and qvtrans recognize many file structures, and some systems share the
same file structure, there may occasionally be times when you are working on a computer
whose file structure is not known to qvpack and qvtrans. (This is most likely to happen
as new computers are developed.) This does not necessarily mean that you will be unable
to pack or unpack the database. Usually you will find that the unknown file structure is
the same as one that these programs recognize, but that it just has a different name. If so,
you can inform qvpack and qvtrans of this relationship and they will pack and unpack the
database as required.
✎ The vax option refers to Unix machines, apart from SCO Unix which is represented
by the keyword scou. qvpack and qvtrans do not recognize prime, pcs, vax vms and
hp spectrum.
If the unknown file structure is identical to one of the known file structures, you can
either define an alias which links the unknown structure to the known one, or you may
enter the name of the known file structure as part of the qvpack and qvtrans commands.
Aliases
For example, if the alias for Sun Unix systems was not built into qvtrans itself, you could
define it as:
sun = mips
Once you have defined an alias, you should run qvpack as described below and name the
alias with the –m option on the command line.
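A typical invocation (assuming, as elsewhere, that the name of the packed file is given
with the –p option) would be:
qvpack -m alias_name -p packed_file -u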
where packed_file is the name of the packed database file you wish to create, and –u
prevents qvpack from compressing the database once it has been packed.
The machine name is the name of the computer or operating system on which you are
working. If you have defined an alias for this system you should type the alias name,
otherwise type the computer or operating system name. For example, if you are working
on an xyz computer whose file structures are identical to those of a Vax, you could define
an alias:
xyz = vax
and qvpack will assume that the xyz file formats are identical to the vax file formats it
already knows, and will pack the database accordingly. You will be reminded of this with
the message ‘Warning: assuming vax to be identical to xyz’.
If the packed file is large, run qsj as described for the appropriate machine type above,
then copy the file(s) to the destination machine. On the destination machine, run qsj if
appropriate and then qvtrans as described above.
If you copied the database files individually rather than packing them, they will still be
written using their original file structures. To convert them, run qvtrans and include the
name of the originating machine as part of the command. Type:
qvtrans -m machine_name -s source_directory -d destination_directory
Parameters are:
–m names the machine on which the database originated. This may be the machine
name of an alias as described for qvpack.
–s names the directory containing the database files to be converted. If you are running
qvtrans from this location you may omit this parameter.
–d names the directory in which the converted files should be placed. If you are running
qvtrans from this location you may omit this parameter.
✎ There must be at least one space between the option letter and the name which
follows it.
If you create a database on a machine running Unix and you have PCs that connect to that
machine using PC-NFS, you can make that database available to those PCs simply by
converting it into MS-DOS format and placing it in the directory to which PC-NFS users
have access.
To do this:
2. If necessary, copy the files into the directory to which PC-NFS users have access. You
may use qvpk or qvpack if you wish or, since the files are remaining on the same
computer, you may prefer to copy or move them instead.
3. Go to the new database directory and run qvtr or qvtrans to convert the files into
MS-DOS format. If you moved or copied the files, you must run qvtrans with the –m
parameter on the command line, naming the type of Unix system you are using.
PC users may now go to this directory and run Quanvert as if the database was located on
their PCs.
Sometimes a new version of Quantum or Quanvert will contain changes to files that result
in you being unable to use any existing databases with the new versions. You may decide
to keep the old versions of Quantum and/or Quanvert on your machine for use with these
databases or you may wish to upgrade the databases so that they can be used with the new
versions of the software.
qvaddmar upgrades the way in which marginals are stored in databases created prior to
June 1989
✎ These programs are not part of a standard Quantum/Quanvert release. They are
available on request from your account manager.
✎ Quanvert releases now have the same version number as Quantum releases. This
was not always the case. Without the renumbering version 5d.5 of Quantum would
have been version 8 of Quanvert.
To upgrade a Quanvert version 7 database for use with Quanvert v5 (previously numbered
version 8), go to the database directory and type:
qv7to8
This program is available only for VMS and MS-DOS systems, since Quanvert version 7
was available only for those systems.
If you need to convert a database created with Quantum version v5d.5 for use with
version 7 of Quanvert, go to the database directory and type:
qv8to7
In both cases, messages are displayed on the screen as each axis and variable is updated.
The way in which Quanvert stores respondent counts (marginals) has changed since June
1989. If you have users who are still using databases created prior to this, you can
improve response times for them by updating the files to store marginals in the new way.
The program to do this is qvaddmar. Just type the program’s name:
qvaddmar
✎ Quanvert Menus will still run if you don’t run qvaddmar since, if it can’t find the
marginals in the new storage place, it looks in the original file.
1000 columns per card in multicard data records (read=2). No limit for
read=0
327 highest card number that can be written out to a data file
32 different fonts
10 minimum number of lines per page (paglen=)
10000 maximum lines per page (paglen=)
10 minimum page width (pagwid=)
10000 maximum page width (pagwid=)
120 characters of row text (side=)
6 dimensional tables
30 rows of bot titles per table (15 with underlining)
30 rows of foot titles per table (15 with underlining)
40 sid statements per table
15 characters for a named filter name
100 named filters
10 levels of nested filter titles (sectbeg/sectend)
7 decimal places for absolutes
7 decimal places for percentages
54 hugged rows per hug= (minimum is 2)
9 weighting matrices
9 axes per weight matrix
16 weighting axes per rim weighting run
2000 weights per run
256 characters for all level names together (each name ends with a null
termination character, so each name is 1 character longer than it
looks)
16 levbase statements per run
There is no practical limit to the number of characters in the axis text heap. The text heap
is made up of:
This appendix lists the error messages generated by the various stages of a Quantum run.
Explanations are included where the meaning of the message is not immediately obvious
or where limits have been exceeded.
109 Punch( )= not allowed on col cards when within grid axes
Use n01s instead
110 Incorrect form of priority statement
111 Incorrect entry in ttord/beg/end statement
112 Add or bnk must follow 2-d tab
Only 2-dimensional tables can be added or banked
113 Too many rej=0s in axis
You may have ten elements in an axis with rej=0 for each element with
rej=1.
114 Rej=1 without preceding rej=0
115 Keyword can only occur on a 'basic' row
Basic rows are those created directly from the data (i.e. n10, n11, n01, n15
and their counterparts on col/val/fld/bit statements)
116 Illegal characters following 'ed' statement
Only level names defined in the levels file may follow ed
117 Variable subscript out of range
You are trying to refer to a non-existent cell in an array (e.g. brand(10)
when brand has been defined as having only 6 cells).
118 Illegally defined variable subscripts
Subscripts may be whole numbers or integer variables
119 M card must have an EX= option on it
Quantum needs to know how to manipulate the data in order to create this
element.
120 Missing text after keyword
121 Reverse polish stack full
There is no room to store the rest of the manipulation expression
122 Reverse polish stack empty
The manipulation expression has not been converted into anything
123 Invalid expression following EX=
124 Unbalanced parentheses
Each ( must have an ), and vice versa.
125 Unexpected comma in expression
126 Incorrect number of arguments for function
You have enclosed too many or too few items in the parentheses
127 Element text or identity name not defined
However, specifications of this type must come at the end of the fld or bit
statement. You have written a statement that defines non-numeric codes in
the middle of the numeric specifications. If this is really what you want,
split the specification into a number of fld or bit statements so that the
non-numeric specifications come at the ends of the statements.
319 Axis has no elements
320 Axis has no basic elements
Basic elements are elements that are created by reading the data. They are
created by n01, n15, n10 and n11 statements and their equivalents on col,
val, fld and bit statements. Although the axis may contain other elements
that produce numbers (e.g., statistical elements), it does not contain any
that appear in this list.
321 Min weight cannot be >= max weight
322 Min/max weight must be > 0.0
All weights must be greater than zero. You have specified a minimum or maximum weight with minwt= or maxwt= that is zero or negative.
323 basecol or extmap argument incorrect
324 Non-printed row has tstatdump flag, will not print in dump
325 inc= and missing= texts together are too long
Quantum stores the field references or values given with inc= and missing= in a temporary holding area of a fixed size. The values and field references require more space to hold them than is available.
You may be able to reduce the amount of space your field references and
values require by replacing options of the form inc=c(132,135) with,
say, inc=t1, and adding an assignment statement such as
t1=c(132,135) to the edit.
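As a minimal sketch of that rewrite (the element text is illustrative only), the assignment is made in the edit section:

ed
t1=c(132,135)
end

and the variable is then used in the axis in place of the field reference:

n01Total spent;inc=t1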
This message is the result of a problem in one of the files editQ.c or axesQ.c that the
Quantum compiler creates from the edit and axes sections of your run. editQ.c contains
everything that was in your edit section, translated into the C programming language.
axesQ.c contains just the c= conditions from n statements and their equivalents on col,
val, fld and bit statements.
The error message means that the amount of code that needs to be processed is too large.
The only solution to this problem is to reduce the size of the code. In editQ.c, look for long runs of similar statements such as:
if (c(123,124)=$99$) .....
If you find a lot of these lines, see if you can replace them with loops or fetch files.
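As a rough sketch only (the columns and codes are illustrative; see the section on loops in the edit for the exact form of the do statement), a run of tests on consecutive columns such as:

if (c(123)=$9$) .....
if (c(124)=$9$) .....
if (c(125)=$9$) .....

might be collapsed into a single loop:

do 5 t1=123,125
if (c(t1)=$9$) .....
5 continue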
In axesQ.c, the equivalent problem shows up as element conditions such as:
n01Element text;c=c(132,134)=$123$
As in the edit section, this type of coding is inefficient and would be better written using an integer variable defined in the edit section:
int myvar
ed
myvar=c(132,134)
end
....
n01Element text;c=myvar.eq.123
Also look for places where the number of conditions on elements can be reduced by
filtering at a higher level. For example, look for places where you can use c= on an n00
statement or on the l statement. An even better solution would be to move some of the
filtering from the axis section into the tabs section.
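As a purely illustrative sketch (the axis name, columns and codes are invented), instead of repeating a common condition on every element:

l brands
n10Base
n01Brand A;c=c110'1'.and.c156'3'
n01Brand B;c=c110'2'.and.c156'3'

you might carry the shared part of the condition once, on the l statement (or on an n00 statement):

l brands;c=c156'3'
n10Base
n01Brand A;c=c110'1'
n01Brand B;c=c110'2'

The descriptions of n00 and of filters on the l statement explain exactly how filters at each of these levels affect the base.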
If you are having difficulty locating the source of the problem, try commenting out sets
of tab statements and rerunning the job until you find the offending axis.
This message is the result of referring to a nonexistent column of the C array. The most
likely cause is a subscripted column reference such as c(t1), where the value of the
subscript is zero or negative.
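One defensive sketch (the columns and variables here are illustrative) is to test the subscript in the edit section before using it, so that c(t1) is only evaluated when t1 refers to a real column:

t1=c(110,112)
if (t1.ge.1) t2=c(t1)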
A number of the messages issued by accum refer to temporary files, and you will often
see the message:
Fileno number: time time dev dev mode mode size size
giving information about the file: number is the file number, time is the time, dev is the device on which the file is open, mode is the mode in which it is open, and size is the number of characters in the file.
nums is the file containing the values calculated during the datapass. It is read by the table
output program, qout. The message means that qout could not find sufficient values in
nums to fill the cells of the tables. There are a number of reasons this could happen.
• A previous run filled up the disk so the datapass could not write the complete nums
file to the disk. This is most likely to happen under DOS but could happen on any
machine.
• You ran a complete job using options to retain Quantum's temporary files. You then
edited the table specs and reran qout. If the changes you made increased the size of
any table, then qout will now be attempting to process a nums file with fewer values than the new specs require.
• You ran the datapass and output stages as separate jobs but at the same time. qout
was therefore trying to extract values from nums before they were there.
On computers which store information in ASCII format, records in Quantum data files
are stored in ASCII format. Any multicoded columns that cannot be represented using
ASCII characters are shown as asterisks in the appropriate columns, and are coded at
the end of the line as follows:
1. One byte with octal value 0177 to indicate that the multicoded information is about
to start.
2. Two bytes for each asterisk in the preceding part of the record. The lower-order six bits of the first byte correspond to codes &-0123, and the lower-order six bits of the second byte correspond to codes 456789.
The top two bits of each byte are always set to 01. For example, to represent the
multicode '&159':
First byte    Second byte
--------      --------
01100100      01010001
  &-0123        456789
3. The line as a whole is terminated by the new-line character '\n' (decimal 10, octal 12, character ^J – the line feed character).
For example:
converts the Quantum file qdata into a text file called textdata.
When the program finds a multicode, it leaves the asterisk in the column but converts the
multicode at the end of the line into a text of the form col=codes, where col is the column
number and codes are the codes in the multicode. A multicode of '126' in column 56, for
instance, would result in an asterisk in column 56 and the notation 56=126 at the end of
the record.
Once you have converted the data file, you can copy it to the destination machine using
a file transfer program of your choice and then convert it back to Quantum format by
typing:
The standard version of Quantum recognizes characters (except DEL) in the standard
ASCII character set; that is, decimal values 32–126 on Unix, DOS and VMS machines,
and the EBCDIC equivalents of these characters on EBCDIC machines. Characters in the
extended character set (i.e., 128–255) are set to blank and have the following results in a
run:
• if Quantum finds a character outside the standard range it issues an error message and
sets the column to blank,
• if a column in the data is set to a multicode which does not correspond to one of the
standard characters (e.g., '09' is the same as the letter Z) Quantum puts an asterisk in
the column and then lists the bits in the multicode at the end of the card.
If you have data that contains non-ASCII characters, and you do not want Quantum to
treat them as miscellaneous multicodes, you may modify your version of Quantum to
accept some or all characters in the extended ASCII character set.
The notes in this section provide background technical information on how Quantum
stores data. You should find that they help you understand what you're doing when you
make your modifications. You may skip this section and come back to it later if you like.
The computer maintains a table in which it stores all the characters it recognizes in a
variety of formats. This allows it to read and write information in decimal, octal,
hexadecimal or ASCII (text) format. The information in this table is available to
Quantum. We will refer to this as the ASCII character set.
Quantum has a similar table which tells it the relationship between characters in the
ASCII character set and multicodes which may appear in the data. The multicodes are
listed in the same order as their corresponding characters in the ASCII character set so
that, for example, position 101 in the ASCII table contains the letter A while position 101
in the punch code table contains the multicode '&1'. This means that if you create a data
file containing just the letter A, you can write a statement of the form:
to check the contents of this column as punched codes rather than as a single letter.
For Quantum to compare the contents of cell 101 in the two tables, it needs to look at the
contents in the same format. The format it uses is octal.
A column of data may contain up to 12 codes. Quantum treats a column as a binary array
in which a cell is set to 1 if a punch is present in that column, or zero if it is not. For
example, if a column is multicoded '09', the array will be:
& - 0 1 2 3 4 5 6 7 8 9
0 0 1 0 0 0 0 0 0 0 0 1
Quantum converts this into an octal value by reading the cells in the array three at a time.
In the example above, the first three cells are 001. In octal, the first cell in the group has
a value of 4, the second has a value of 2, and the third has a value of 1. If a cell is 'on' (i.e., it is set to 1), Quantum adds the value for that cell into the total for that group of three. When we apply this rule to the first three cells of our array, we obtain a value of 1.
The second three cells are all zero, as are the third three, but the last set of three cells is
001 like the first set. The octal value of our array is therefore 1001.
For example, the letter A corresponds to the multicode '&1', whose array is:
& - 0 1 2 3 4 5 6 7 8 9
1 0 0 1 0 0 0 0 0 0 0 0
Reading this array three cells at a time in the same way gives the octal value 4400. When Quantum reads a column of data, it proceeds as follows:
a) If the data is a standard ASCII character, Quantum finds its position in the ASCII
character set and then looks up the octal code in the corresponding cell of the punch
array. For instance, if the column contains the letter A, which is in position 101,
Quantum will look in cell 101 of the punch array and will find code 4400 (the same
as 100100000000 or '&1'). It stores this code as the data for that column.
b) If the data is a character in the extended character set, the cell at that character's position in the punch array contains zero. Quantum therefore stores a blank as the data for this column.
The same actions are taken, but in reverse, when you write your Quantum data out to a
file. Quantum reads the octal punch code for the current column and finds its cell number
in the punch array. It then writes out the character in the corresponding cell of the ASCII
character set. For example, if the punch code is 4400 which is in cell 101 of the punch
array, Quantum will output the character in cell 101 of the ASCII array; namely, the
letter A.
If the column is stored as blank, an empty column will appear in your data. If the column
is stored as a miscellaneous multicode, Quantum will write out that multicode.
If Quantum makes the wrong association between the ASCII character set and the punch
array, your data will not be the same when you write it out as when you read it in.
Quantum takes the same approach with the extended ASCII character set. However, since
the positions of characters in the extended ASCII character set may vary according to
which hardware and software you are using, Quantum cannot obtain this information
from the system. Instead, you need to create a file which tells it which octal punch codes
represent each character.
This is not as difficult as it sounds, since the Quantum directory contains two files to help
you:
binasc.dat    contains octal punch codes for the standard ASCII character set.
bineas.dat    contains octal punch codes for the most common version of the extended ASCII character set.
By editing binasc.dat and replacing some or all the zero values with octal values from the
corresponding cells in bineas.dat you can have Quantum recognize as many characters
from the extended ASCII character set as you like.
☞ For step by step instructions on editing these files, see section D.3 and section D.4
below.
The file you create is user-readable text, but this format is not efficient for use in a Quantum run. You'll therefore need to convert your text file into a binary file called:
Unix $QTHOME/include/bintab.qt
DOS %QTHOME%\include\bintab.qt
HP Spectrum bintab.h.$QTHOME
VMS [$QTHOME.INCLUDE]bintab.qt
When Quantum starts, it looks for this file. If the file exists, Quantum will use the
information in it to work out whether the multicodes it reads are characters or
miscellaneous collections of punches. If the bintab.qt file does not exist, Quantum uses the standard ASCII character set, as described above.
Follow the instructions in the next section to create the necessary files. If you have any
problems using your extended ASCII character set (e.g., unexpected characters appearing
on the screen), or are at all unsure about what to do, please contact your account manager
for assistance.
If you want Quantum to recognize the full extended ASCII character set, you can convert
the bineas.dat file provided with Quantum without any intermediate editing. Type:
Unix cd $QTHOME/include
DOS cd %QTHOME%\include
HP Spectrum cd $QTHOME
VMS cd [QTHOME.INCLUDE]
and then
✎ If you are using an HP Spectrum, type the output file name as bintab.h rather than
bintab.qt.
The -o on the command line indicates that the numbers in bineas.dat are octal values.
If you want to revert to using the standard ASCII character set later on, either rename
bintab.qt (useful if you may want to use the extended character set again in the future),
or replace it with a standard ASCII file:
There is no need to use the full extended ASCII character set if you do not want to. For
example, a French company may wish to convert only the French characters, and leave
others such as German and Spanish characters as multicodes.
1. Make a list of the characters you want to use, and their positions in the extended ASCII
character set. For example, ä is in position 132.
2. Edit bineas.dat and go to that position in the file. (Start counting from 0 not from 1
and count across the rows). Note down the value which appears in this position. For
the ä character you'll find that its octal punch code is 07104 (all punch codes for the
extended ASCII set start with 07). Repeat this step for each character that you need.
Quit out of bineas.dat when you've finished.
3. Take a copy of binasc.dat and edit this copy. Do not edit binasc.dat itself.
4. Go to the position of the first character you want to use (you should find a zero value
in this position) and replace the current value with the one you noted down for this
character. Repeat this step for all other characters you want to use. For example, go
to position 132 and replace the zero with 07104.
Do not change the value in:
• position 127 (i.e., the 127th value) under Unix and DOS, and position 255 under IBM, since Quantum uses this character to mark the end of the data and the start of the multicoded bit strings in each card.
You're now ready to convert your file into binary. To do this, type:
If you find that your extended ASCII implementation does not have characters in the same
positions as the one in bineas.dat, note down the positions of those codes, as above, and
then enter a unique value between 07100 and 07777 in the corresponding position of your
new file. These codes represent punch combinations which are relatively rare, having
punches '&–0' all set.
The code you choose acts as a marker to prevent Quantum replacing the code with a
blank. For example, if ä is code 222 under your implementation of the extended ASCII
character set, you might choose punch code 07111 to represent it. When Quantum reads
ä it will store code 07111; when it writes out your data it will read this code and, when it looks it up in bintab.qt, will find that it represents ä. You will therefore find ä in your new
data file.
If you prefer, you can create a new input code file rather than editing one of those
provided. You can enter codes in octal, decimal or hexadecimal as long as all codes in
the file are of the same format, and each code is separated from its neighbors by white
space, either spaces, tabs or new-lines.
✎ Before you attempt this, we would prefer that you contact your account manager for advice.
where input is the name of the file you've created and output is the name under which
the binary conversion is to be saved. The options on the command line indicate what type
of values the file contains:
The table below shows the punch code equivalents for characters in the standard ASCII
character set.
In spite of ever more powerful machines with more and more disk space and memory, there may still be times when jobs fail due to lack of disk space or memory. If you have a very large or complex spec, or a huge data file, which you suspect may cause problems, there are some preliminary checks you can make that may save you from wasting valuable time running a job that is too big for your machine.
These notes explain:
• how to check whether you have enough disk space for the intermediate files associated with levels.
• how to have the Quantum load phase use a directory other than /tmp for its temporary files.
The notes in this section explain how to check whether you have enough disk space for
the intermediate files Quantum creates when processing levels data.
Look in the file params.h for the lines beginning:
int nincs[ ] =
int tabseqs_[ ] =
int bufsize_[ ] =
On each of these lines there is a series of numbers enclosed in braces ({numbers}). You can ignore the first number (it is usually zero). Each of the other numbers is associated with one of the levels in your job; there should be one number for each level. Make a note of these numbers.
4. Using the information you have noted down, calculate the size of each of the
intlev?.q files as follows.
Let nincs[X] be the value from the file params.h on the line beginning ‘int nincs[ ] =’
which is associated with level X.
Let tabseqs_[X] be the value from the file params.h on the line beginning
‘int tabseqs_[] =’ which is associated with level X.
Let bufsize_[X] be the value from the file params.h on the line beginning
‘int bufsize_[ ] =’ which is associated with level X.
Then:
The C compiler, which loads the files that the Quantum compiler creates into a runnable program, needs space in which to create temporary files. It creates these files in the system's temporary directory, /tmp.
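If you want to see how much space is available there before running the job, you can check it yourself; for example, on Unix:

df -k /tmp

(df reports the free space on the partition holding /tmp; the -k option shows the figures in kilobytes.)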
Often, although user filesystems are large, the /tmp directory may be on a smaller
partition so there is the possibility that the size of the temporary files will exceed the
amount of space in /tmp. (This can happen even on small jobs if your job is running when
large data files are being sorted or there are other programs running that create large files
in /tmp.)
2. Run the load phase by hand, naming a directory other than /tmp for the C compiler
to use for its temporary files. This means that the standard compilation commands
need to change from:
cc -c editQ.c
to:
cc -c -temp=dirname editQ.c
If you put the following lines in a file called cctemp in your project directory, you
can run the compilation simply by using the Unix source command. This example
places the temporary files in the project directory but you could choose another
directory if you wish:
cc -c -temp=. editQ.c
cc -c -temp=. tabsQ.c
cc -c -temp=. tabsQ1.c
cc -c -temp=. axesQ.c
cc -c -temp=. varset.c
cc -temp=. editQ.o tabsQ.o tabsQ1.o axesQ.o varset.o \
/usr/qtime/qt/v5e.3/lib/libzz1.a
If you are using a version of Quantum other than v5e.3, type the version number in place of v5e.3.
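For example, assuming you have saved those lines in a file called cctemp in the project directory, you would then type:

source cctemp

(source reads and runs the commands in the named file in your current shell.)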
3. If the manual compilation works you can then use Quantum's -r option to read the
data, as in:
quantum -r datafile
✎ If you use this method, you must ensure that each step has worked successfully
before going on to the next step.
This method works for all users, but if your system administrator is available you could
ask him/her to make a symbolic link between the /tmp directory and another directory.
You should then be able to rerun your job in the usual way.
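As a minimal sketch of what the administrator might do (the directory name /bigdisk/tmp is invented, and the exact procedure will depend on your system), a directory can be created on a larger partition and /tmp linked to it:

mkdir /bigdisk/tmp
chmod 1777 /bigdisk/tmp
mv /tmp /tmp.old
ln -s /bigdisk/tmp /tmp

Anything written to /tmp is then actually stored on the larger partition.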
Keywords preceded by an asterisk may be used with the prefix no to switch off a global
requirement for a specific table or element only. If the keyword is followed by an equals sign, the equals sign is omitted when no is used (e.g., inc= becomes noinc).
D Databases continued
make secure 723
d, delete codes in online edit 153 maximum size of packed file 728
Data packing 728, 732
automatic filtering of in Quanvert 697 Quanvert for Windows 726
C array 17 split large packed 731
checking and verifying 4 store variables in subdirectories 687
compressed, reading 591 transfer format 729
convert to Quanvert database 701 transfer programs for 729
convert to SAS format 669, 676 unknown file formats 731
convert to SPSS format 656, 660 unpack 730
converting multicoded to single coded 169 update from v7 Quanvert 734
correcting 149 update how Quanvert marginals are stored 735
counting responses with numeric codes 100 Datapass error messages 763
define structure in levels file 461 Datapass error summary file 613
merging cards from different files 55 Datapass program 593, 605
merging fields from an external file 57 create 32-bit DOS version 599
merging files 582 date, print date on table 194
non-standard, db.ico file for QVTW 726
processing 604 db.nts file for QVTW 726
non-standard, processing 591 debug, see intermediate figures for special T-statistics
output file for require 614 555
overlapping, with special T-stats 212, 557 dec=, decimal places for absolutes 194, 291
Quantum format 767 with stat= 480
reading into C array 44 Deciles 317
reading non-standard format 59, 417 Decimal places
types of 43 absolutes 194, 291
write out fixed length records 65, 69 in significance levels 480, 483
write out in user-defined format 78 in statistics 480, 483
Data constants, comparing 28 percentages 194, 291
Data files decp=, decimal places for percentages 194, 291
#include with 417 with stat= 480
define T variables in 105 Default options file 581
reading non-standard 59, 417 #def, global values for symbolic parameters 426
Data variables with grids 432
blank out 103 *def (see #def)
checking contents of 29, 31 definelist, name a list 40
comparing 28 delete, delete codes from columns 96
comparing contents of 32 Descriptive statistics, exclude elements from 293
defining in subroutines 176 di, display columns in online edit 152
Database icon for QVTW 726 Difference between .eq. and = 33
Database notes file for QVTW 726 Differences between celllev and uplev 472
Databases Dirty data file 611
access Unix with PC-NFS 733 dirty.q, dirty data file 594, 612
add variables to 705 Disk space
change security level 724 check machine has enough for job 777
check security status 724 reduce amount needed for Quanvert 684
converting unpacked files 733 temporary required during run 778
copying 729 Distribution, comparing 487, 492
do not compress 728 div, divide one table by another 351
join split for unpacking 731 example of 352
linking similar 708 options with 351
friedman, Friedman’s two-way analysis of variance 495 header=, header length in non-std data file 417
Friedman’s test 495 Hierarchical data
example of 497 process with 475
Friedman’s test, formula 500 processing with clear= 476
F-test see Analysis of variance, one-way, ANOVA processing with levels 459
Functions, C library 178 Highest card type 53, 461, 580
hitch=, print table on same page as previous table 197,
353
how Quantum compares table texts 357
G paper saving mode 355
paste one table under another 358
g, layout column headings 334
short tables with 354
combining groups of 335
table texts with 355
in laser printed tables 629
hold=, rows to reprint at top of continued tables 288,
sid statements with 346 292
spacing with 335 Holecount file 613
go to, routing in edit section 110 Holecounts 123
.ge., greater than or equal to 28 basic 125
graph=, create graphics input files 197 double quotes in headings 125
files created by 617 filtered 126
Grid axes see Grids multiplied 126
Grid tables see Grids weighted 126
grid, identify a grid table 433 hug=, space required at bottom of page 287
Grids
*def with 432
components of 427
creating tables 433 I
example of 430
with code symbolic parameters 428 Icons for QVTW databases 726
with column and code symbolic parameters –id, multiple runs in a directory 596
429 id=, manipulation id 292, 342, 452, 456
with column symbolic parameters 428 on n/col/val/fld/bit 444
exporting in SAS format in Quanvert 438 ident, default print parameters for write 75
filtered columns in 436 print/suppress ruler with 77
in levels jobs 434 ident, turn off defaults 77
increments in 432 Identical statements, filing and retrieving 415
recognizing 427 Ids for manipulation 292
rotated, op= with 434 if, conditional actions 107
weighted 435 .inc files, compressing 721
group=, axis group for element 259, 292 forced editing with 149
groupbeg, start of subaxis 257 with missingincs 162
groupend, end of subaxis 257 with require 146
.gt., greater than 28 inc(), increments in grids 432
.in., comparing values to a list 39
inc=, increment for cell counts 210, 223, 297
element for maximum values of 299
H element for median values of 299
element for minimum values of 299
Harvard Graphics, convert Quantum data for use with
example of 297
650
exclude missing values from calculations 315
hct_, holecount file 594, 613
flipping for Quanvert 685
hd=, axis subheading 223
in grids 432
hdlev=, nested subheadings for column axes 243
in same axis as fac= 312
hdpos=, position of subheadings above columns 245
Multicoded data, convert to single coded 169 n20, error variance of the mean 310
Multicodes, entering 14 suppress if has small base 204, 359
Multidimensional tables, example of 340 n23, subheading 242
Multiproject databases n25, component values for means etc. 313
creating 711 in Quanvert databases 684
selecting projects from 715 manipulating components of 445
Multiproject directories, merging into components of print in column axes 292, 313
715 print in row axes 293, 313
Mutually exclusive elements in axes 483 n30, medians 317
n33, text continuation 246
Named filters 379
.in. 40
N
disallowing creation of in Quanvert 696
n statements, options on 290 using 195, 380
n00, filtering within an axis 283 Naming variables 181
example of use 430 nand, force same table number for and tables 344, 373
with n04 and n05 308 ndi, distribute element cases across axis 305
with redefined base 281 .ne., not equal to 28
n01, basic counts 230 Nested filter sections 380
n03, text only 238 Nested subheadings in column axes 243
n04, total 307 net, start a net 247
example of 297 Nets
with n00 308 accumulation of suppressed elements in 252
with n05 308 cases in previous elements 229
n05, subtotal 308 cases not yet counted 229
with n00 308 collecting suppressed elements 198, 224
with n04 308 description of 246
n07, average 311 for previous lines 249
n09, start new page 286 for subsequent lines 247
n10, base 237 percentaging with 254
n11, base, non-printing 237 sorting by net level 198, 224, 250, 528
how to deal with n11s in Quanvert 697 switching off 251
n12, mean 313 sorting within 527
suppress if has small base 204, 359 example of 528, 533
with ANOVA 514 with totals 308
with T-tests 508 netsm, small suppression with nets 198, 224
with two sample T-tests 512 netsort, sort nets by net level 198, 224, 250, 528
n13, sum of factors 316 New axes, disallowing creation of 696
n15, basic counts, non-printing 236 New cards, creating, example of 234
n17, standard deviation 310 New page, starting 286
suppress if has small base 204, 359 Newman-Keuls test on differences between means 515,
562
with ANOVA 514
example of 516
with T-tests 508
P-values for 563
with two sample T-tests 512
Newman-Keuls test on differences between means,
n19, standard error of the mean 310
formula 521
alternative formula for 316
nft, F and T statistics 323
calculate using weighted figures 213
formula 327
suppress if has small base 204, 359
nk, Newman-Keuls test 516
with ANOVA 514
nkl, Newman-Keuls test 562
with T-tests 508
noacr100, suppress 100% on base row 213
with two sample T-tests 512
noaxttl, suppress table headings using axis names 213
nobounds, switch off array bounds checking 104 nqtspss, convert Quantum data & spec into SPSS format
nocheck_, possible syntax errors not fatal 10 660
nocol, not a column element 292 how differs from qtspss 661
in Quanvert databases 684 options with 666
nodate, suppress date 213 nsw, squared weight element 212, 316, 547
nodp, suppress double precision calculations 213 ntd, significant net difference test 563
nodsp, no double spacing 213, 226 ntot, exclude element from totals 293
noexport, don’t export element to SAS or SPSS 293 numb, number of codes in a column 25
nofac, no factors 296 Numbering tables 372
noflush, percentages not flush with absolutes 213 Numbers
nograph, suppress graphics 213 decimal places with 16
nohigh, not a higher dimension element 293 integer 16
in Quanvert databases 684 large, in tables 210
noident, switch off default write parameters 77 real 16
noinc, suppress incremental values 213, 226, 298 whole 16
nonetsort, turn off sorting by net level 213, 251, 528 numcode, flag axis as single coded 225
Non-identical statements, filing and retrieving 418 numdir.qv, number of variables per directory 687
Non-standard data files, reading 59, 417 Numeric codes
Non-standard data, processing 591, 604 elements for 274, 276
nontot, exclude element from totals 293 Numeric conditions, defining with val 268
nonz, print all-zero elements 226 Numeric fields, missing values in edit section 161
nonzcol, print all-zero columns 213 Numeric variables
nonzrow, print all-zero rows 213 creating for Quanvert 681
nooverlapfoot, suppress overlap footnotes for special T- define which to flip 685
statistics 199, 558 nums, unmanipulated cell counts file 595, 607
#nopagebox, suppress border on laser printed tables nums.man, manipulated cell counts file 595, 616
636 numsman, manipulated cell counts file 607, 616
nopage, suppress page numbers 202, 213 nz, suppress all-zero elements 226, 293
nopc, suppress percent signs 202, 213 apply to manipulated elements 446
noprint, suppress printing of tables 199 with nocol/norow 293
noround, element not to be force rounded 203, 300 nzcol, suppress all-zero columns 199
norow, not a row element 293 apply to manipulated elements 446
in Quanvert databases 684 nzrow, suppress all-zero rows 199
noscale, ignore scaling factor 213 apply to manipulated elements 446
nosmcol, print small columns 213
nosmrow, print small rows 213
nosmtot, print small totals 213
nosort, unsorted element in sorted table 293, 528 O
nosort, unsorted table in sorted run 213
One dimensional chi-squared test 485
nosort, unsorted table or axis 226
example 485
notauto, suppress automatic titles for T-statistics 199
formula 498
notbl, suppress table numbers 199
One sample T-test 507
Notes file for 726
example 509
Notes file for QVTW
in weighted runs 508
.not., negate logical expressions 36
One sample T-test, formula 520
notitle, suppress page numbering 213
One sample Z-test 501
notopc, suppress percent sign at top of column 213
example of 502
notstat, exclude element from special T-stats 214, 225,
293, 545 formula 518
notstat, suppress special T-statistics 226 One-way analysis of variance 514
notype, suppress output type message 208, 213 example of 514
nqtsas, convert Quantum data & spec into SAS format formula 520
676
rangeb, test arithmetic value of field, allowing blanks 35 require, validating codes and columns 134
Ranges as conditions 272 action codes 135
Ranking see Sorting actions when test fails 145
Ranks in Friedman test 495 automatic error correction 140
Raw Quantum data, Quanvert analyses using 688 checking codes in columns 138
read=, how to read data 49 checking equivalence of logical expressions 144
Real numbers 16 checking exclusive codes 139
copying into columns 92 checking logical expressions 143
saving in integer variables 89 checking routing 144
significant figures with 16 checking type of coding 136
Real variables 20 comments with 137
defining in subroutines 176 correcting errors from 150
reset to zero 103 data output file for 614
Reals and integers in the same expression 25 data validation 133
rec_acc, number of records accepted 117 defaults with 142
rec_count, number of records read so far 48 file of records failing 612
rec_rej, number of records rejected 117 with if 146
reclen=, record length 50, 417, 580 Required card types 580
Record length 50 Required card types, defining 52, 460
in levels data 462 Reserved variables
in non-std data files 417 allread 46
with levels 580 card_count 48
Record structure, defining 49 first card in record read 46
Record type, defining 49 firstread 46, 476
Records last card in record read 46
counting by axis name 208 lastread 46, 477
distribute one element across the axis 305 lastrec 48
examining with list 128 number of cards read so far 48
last in file, checking for 48 number of records accepted 117
maximum cards in, in levels jobs 580 number of records read so far 48
maximum sub-records per record in levels data 462 number of records rejected 117
multicard with more than 100 columns per card 58 printed_ 63
number read in so far 48 rec_acc 117
printing 135 rec_count 48
rejecting from tables 135 rec_rej 117
types of 43 record written to out 63
writing out parts of 64 rejected_ 116
Redefined base, percentaging against 200 stop statement executed 118
Reformatting data 234 stopped_ 118
rej=, excluding elements from the base 300 this record rejected 116
reject, omit record from tables 116 thisread 46
with require 117 with trailer cards 46
rejected_, current record has been rejected 116 Reserved words with flip 680
Rejecting records from tables 116, 135 Resetting variables between respondents 90
rep=, repeated card types 52 Respondent serial number, using with Quanvert 682
Repeated card types return, go to tabulation section 117
defining 52 with levels 464
in unusual order 47 with reject 117
missing 47 rgrid, rotated grid tables 434
report, write data to report file 66
report=, report type for rim weighting 411
req=, required card types 52