LWPRG1
Essentials
Course Notes
SAS Programming 1: Essentials Course Notes was developed by Charlot Bennett and Kathy Passarella.
Additional contributions were made by Davetta Dunlap, Michele Ensor, Susan Farmer, Ted Meleky,
Linda Mitterling, Bill Powers, Jim Simon, and Roger Staum. Editing and production support was
provided by the Curriculum Development and Support Department.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product
names are trademarks of their respective companies.
SAS Programming 1: Essentials Course Notes
Copyright 2013 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States of
America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in
any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written
permission of the publisher, SAS Institute Inc.
Book code E2132, course code LWPRG1/PRG1, prepared date 14Feb2013.
LWPRG1_003
ISBN 978-1-61290-489-4
Table of Contents
Course Description
Prerequisites
Course Description
This course is for users who want to learn how to write SAS programs. It is the entry point to learning
SAS programming and is a prerequisite to many other SAS courses. If you do not plan to write SAS
programs and you prefer a point-and-click interface, you should attend the SAS Enterprise Guide 1:
Querying and Reporting course.
To learn more
For information about other courses in the curriculum, contact the SAS
Education Division at 1-800-333-7660, or send e-mail to [email protected].
You can also find this information on the Web at support.sas.com/training/
as well as in the Training Course Catalog.
For a list of other SAS books that relate to the topics covered in this
Course Notes, USA customers can contact our SAS Publishing Department
at 1-800-727-3228 or send e-mail to [email protected]. Customers outside
the USA, please contact your local SAS office.
Also, see the Publications Catalog on the Web at support.sas.com/pubs for
a complete list of books and a convenient order form.
Prerequisites
Before attending this course, you should have experience using computer software. Specifically, you
should be able to
understand file structures and system commands on your operating systems
access data files on your operating systems.
No prior SAS experience is needed. If you do not feel comfortable with the prerequisites or are new to
programming and think that the pace of this course might be too demanding, you can take the SAS
Programming Introduction: Basic Concepts course before attending this course. SAS Programming
Introduction: Basic Concepts is designed to introduce you to computer programming and presents a
portion of the SAS Programming 1: Essentials material at a slower pace.
Chapter 1 Introduction
Copyright 2013, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED.
What Is SAS?
SAS is a suite of business solutions and technologies
to help organizations solve business problems.
Extensible
Integrated
Powerful
1,000 employees, 90,000 customers, 150,000 orders, 64 suppliers
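/* DATA step: copy orion.emp, keeping rows where Salary is at most 100000 */
data newemp;
   set orion.emp;
   where Salary le 100000;
run;

/* PROC step: summarize Salary for each Job_Title */
proc means data=newemp;
   class Job_Title;
   var Salary;
run;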
SAS Windowing
Environment Editor
Editor
Program Editor
Formatting
automatic
manual
Syntax Help
context-sensitive
Output
SAS Report
HTML
yes
no
Autocomplete yes
no
Program
Flow
Analysis
no
yes
Noninteractive Mode
Example file:
//jobname JOB
// EXEC SAS
//SYSIN DD *
proc freq data=x.pay;
tables ID;
run;
Directory-based example:
SAS filename
z/OS (OS/390) example:
SAS INPUT(filename)
Course file names follow a pattern, for example p104d01x:
chapter #
type: a=activity, d=demo, e=exercise, s=solution
item #
placeholder
%let path=s:\workshop;
filename sales "&path\sales.dat";
infile "&path\payroll.dat";
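The statements above rely on one pattern: a %LET statement stores the data location in the macro variable path, and later statements refer to it as &path. The following is a minimal sketch of that pattern, assuming the default location s:\workshop. The %PUT statement is added here only to show the resolved value in the log; it is not taken from the course programs.

```sas
/* Assumption: course files are in s:\workshop. Change the value if yours differ. */
%let path=s:\workshop;

/* Write the resolved value to the log to confirm the substitution. */
%put NOTE: Course files are read from &path;

/* &path resolves only inside double quotation marks, not single ones. */
filename sales "&path\sales.dat";
```

Because the location is defined once, moving the course files means changing only the %LET statement.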
Level 2
Getting Help
In class, you can get product help in several ways,
depending on the editor being used.
Getting Started tutorials
Help facilities included in the software
web-based help, if web access is available
Business Scenario
Identify a location for the course data files and execute
programs to create the files and define the location.
Data
location?
cre8data.sas
libname.sas
The data location might be different in your libname program. It was defined based on the
data location specified in cre8data.
2. Submit the program. Click the Log tab and verify that there are no errors or warnings.
Exercises
You must complete the exercises to create the course data files. If you do not create the data files,
all programs in this course will fail.
Required Exercise
1. Creating Course Data
a. The default location for all course data is s:\workshop. If your data files are to be created in a
location other than s:\workshop, you must identify a location for your SAS data files.
Create the SAS data files here: ________________________________________
b. Select File > Open Program.
c. Navigate to the data folder, select cre8data, and click Open. The program is displayed in an
editor. Observe the default value in the %LET statement.
d. If your files are to be created at a location other than s:\workshop, change the value assigned to
PATH= to reflect the location of your SAS data files.
SAS Programs
A SAS program is a sequence of one or more steps.
DATA Step
PROC Step
Step Boundaries
SAS steps begin with either of the following:
a DATA statement
a PROC statement
SAS detects the end of a step when it encounters
one of the following:
a RUN statement (for most steps)
a QUIT statement (for some procedures)
the beginning of another step (DATA statement
or PROC statement)
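The boundary rules above can be sketched in one short program. This is an illustration only: it assumes the SASHELP.CLASS sample data set is available in your installation, and the data set name work.class_copy is hypothetical.

```sas
/* DATA step: ends at the RUN statement. */
data work.class_copy;
   set sashelp.class;
run;

/* PROC step with no RUN statement: it ends when the next PROC statement begins. */
proc print data=work.class_copy;

/* PROC step: ends at the RUN statement. */
proc means data=work.class_copy;
   var Height;
run;

/* PROC DATASETS keeps running after RUN, so it ends at the QUIT statement. */
proc datasets library=work nolist;
   delete class_copy;
quit;
```

Ending every step explicitly with RUN or QUIT avoids depending on the next step to close the current one.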
2.01 Quiz
How many steps are in this program?
data work.newsalesemps;
length First_Name $ 12
Last_Name $ 18 Job_Title $ 25;
infile "&path\newemps.csv" dlm=',';
input First_Name $ Last_Name $
Job_Title $ Salary;
run;
proc print data=work.newsalesemps;
run;
proc means data=work.newsalesemps;
var Salary;
run;
p102d01
2.02 Quiz
How does SAS detect the end of each step in this program?
data work.newsalesemps;
length First_Name $ 12
Last_Name $ 18 Job_Title $ 25;
infile "&path\newemps.csv" dlm=',';
input First_Name $ Last_Name $
Job_Title $ Salary;
run;
proc print data=work.newsalesemps;
proc means data=work.newsalesemps;
var Salary;
Business Scenario
Orion Star programmers will create and execute SAS
programs and view results in an interactive environment.
They must become familiar with both environments.
SAS Enterprise
Guide
SAS Windowing
Environment
You can use the Program tab to access and edit existing SAS programs, write new programs, submit
programs, and save programs to a file. The program is color-coded in the Program Editor.
2. To submit the program for execution, select Program > Run program-name on <server> or click
Run on the Program tab. The F3 or F8 keys can also be used to submit a program.
The statements added by Enterprise Guide are used during processing and are referred to as
wrapper code. To suppress the display of the wrapper code in the log, select Tools > Options >
Results > General and clear Show generated wrapper code in SAS log.
To scroll horizontally in the log, use the horizontal scroll bar. To scroll vertically, use the vertical scroll
bar or use the PAGE UP or PAGE DOWN keys on the keyboard.
To scroll vertically in the Results window, use the vertical scroll bar or use the PAGE UP or PAGE
DOWN keys on the keyboard.
3. Click the Program tab to display the program and then run the program again.
4. Click Yes to replace the results from the previous run.
6. Reset the output formats to generate SAS Report or text output (or both) as desired.
Idea Exchange
Have you worked with SAS Enterprise Guide? If so, what
do you like about it?
or Close.
You use the editor to access and edit existing SAS programs, write new programs, submit programs,
and save programs to a file. Programs are color-coded in the Editor window.
Alternatively, you can issue the INCLUDE command to open or include a program in your session.
With the Editor window active, type include on the command bar followed by the name of the
program file enclosed in quotation marks. SAS looks for the file in the current or active folder.
Type include 'p102d01.sas' and press ENTER. The program is displayed in the Editor window.
2. To submit the program for execution, with the Editor window active, click the Submit button on the toolbar, or select
Run > Submit. Alternatively, you can type submit on the command bar or press F3 or F8.
You should always check the log for errors or warnings, even if the program generates output.
1. To access the Log window, click the Log tab near the bottom of the SAS window. You can also select
View > Log or submit the LOG command using the command bar.
With the Log window active, use the vertical scroll bar or the Page Up/Page Down keys to scroll up
to the first line of the program that you just submitted. The contents of the Log window are
cumulative, with the most recent information added to the bottom, so be sure you are looking at the
most recent information.
In this example, the Log window contains no warning or error messages.
2. To clear the contents of the Log window, make sure that the window is active and then click the New button on the
toolbar. You can also issue the CLEAR command or select Edit > Clear All.
SAS HTML output, the default output in SAS 9.3, is displayed automatically in the Results Viewer.
Use the vertical scroll bar or the PAGE UP and PAGE DOWN keys on the keyboard to scroll from top to
bottom. You can also enter the TOP and BOTTOM commands on the command bar to scroll vertically.
Use the horizontal scroll bar, if displayed, to scroll side to side.
3. Return to the Editor window for p102d01 and resubmit the program. The output is added to the
Results window, so it now contains two copies of the report. The log also contains information about
the previous and current programs, unless you cleared it before resubmitting the program.
Output Window
1. LISTING output, produced by request only, is displayed in the Output window. Click the Output tab
to view the Output window.
In the SAS windowing environment, the content displayed in the Output window is the last page of
output generated by the most recent program. Scroll up to see the top of the report.
To scroll vertically in the Output window, use the PAGE UP or PAGE DOWN keys on the
keyboard, use the vertical scroll bar, or submit the FORWARD and BACKWARD commands.
You can also use the TOP and BOTTOM commands to scroll vertically in the Output window.
Use the horizontal scroll bar or issue the RIGHT and LEFT commands to scroll horizontally.
2. To clear the contents of the Output window, make sure that the window is active and then click the New button on the
toolbar, issue the CLEAR command, or select Edit > Clear All.
Click a plus sign to expand or collapse individual bookmarks, or right-click the word Results and select
Expand All/Collapse All. Each bookmark contains an icon indicating the file type: HTML or LISTING.
Double-click a bookmark to view the corresponding report in the appropriate viewer.
2. In the SAS windowing environment, select Help > SAS on the Web > Customer Support Center.
Select Documentation on the Customer Support web page. The SAS Product Documentation web
page is displayed.
Exercises
You can use SAS Enterprise Guide or the SAS windowing environment to complete your exercises.
Select the type of output that you prefer. LISTING output is shown in the exercises.
Level 1
1. Submitting a Program
a. With the appropriate Editor window active, open the SAS program p102e01.
To open the program in SAS Enterprise Guide:
Select Program > Open Program, select p102e01, and click Open.
To open the program in SAS:
Select File > Open Program and select p102e01.
Windows or UNIX
z/OS (OS/390)
b. Submit the program for execution. How many rows and columns are in the report?
rows: __________
columns: __________
c. Examine the Log window. Based on the log notes following the DATA step, how many
observations and variables are in the work.country data set?
observations: __________
variables: __________
d. Clear the Log and Output windows (SAS windowing environment only).
Level 2
2. Exploring Your Environment: SAS Enterprise Guide
a. Customize the appearance and functionality of the Editor by selecting Tools > Options >
SAS Programs. For example, click Editor Options and click the Appearance tab to modify the
font and font size.
b. Customize the type of output produced by selecting Tools > Options > Results > General. Select
or deselect SAS Report, HTML, PDF, RTF, and Text Output. The default format is
SAS Report. Text output is shown in the course notes.
3. Exploring Your SAS Environment: Windows
a. Customize the appearance and functionality of the Enhanced Editor by selecting Tools >
Options > Enhanced Editor. For example, click the Appearance tab to modify the font and
font size.
b. Customize the type of output produced by selecting Tools > Options > Preferences. Click the
Results tab and select or clear Create listing and Create HTML as appropriate. The default
format is HTML. LISTING output is shown in the course notes.
Challenge
4. Enabling and Disabling the Project Log (SAS Enterprise Guide Only)
a. Use the Index tab in SAS Enterprise Guide Help to find information about the project log.
b. Enable and turn on the project log. Run a few programs and view the log to see its contents.
5. Setting Up a Function Key to Clear the Log and Output Windows (SAS Windowing
Environment Only)
a. Issue the KEYS command or select Tools > Options > Keys to access the KEYS window. The
KEYS window is a secondary window used to browse or change function key definitions.
b. Type the following commands in the Definition column for the F12 key:
clear log; clear output
c. Which key is programmed to submit a KEYS command? _____
d. Close the KEYS window. This saves the key definition in your user profile.
e. Press the F12 key and confirm that the Log and Output windows are cleared.
It is not necessary to clear the Log and Output windows in SAS Enterprise Guide because
this is done automatically each time a program is submitted.
Business Scenario
Well-formatted, clearly documented SAS programs are
an industry best practice.
2.04 Quiz
How many statements make up this DATA step?
a. one
b. three
c. five
d. seven
data work.newsalesemps;
length First_Name $ 12
Last_Name $ 18 Job_Title $ 25;
infile "&path\newemps.csv" dlm=',';
input First_Name $ Last_Name $
Job_Title $ Salary;
run;
p102d02
Recommended Formatting
White space can be blanks, tabs, and new lines. Add them to increase the readability of the code.
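To illustrate the point about white space, here is the same DATA step written twice. Both versions are valid SAS and behave identically; only the layout differs. The sketch assumes that the macro variable path is already defined and that newemps.csv exists in that location, as in the course programs.

```sas
/* Hard to read: valid, but every statement runs together on one line. */
data work.newsalesemps;length First_Name $ 12 Last_Name $ 18 Job_Title $ 25;infile "&path\newemps.csv" dlm=',';input First_Name $ Last_Name $ Job_Title $ Salary;run;

/* Easier to read: one statement per line, with indentation inside the step. */
data work.newsalesemps;
   length First_Name $ 12
          Last_Name  $ 18
          Job_Title  $ 25;
   infile "&path\newemps.csv" dlm=',';
   input First_Name $ Last_Name $ Job_Title $ Salary;
run;
```

The indentation width is a matter of taste; what matters is applying it consistently.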
View the log and output to confirm that the program ran successfully.
3. Click the Program tab and select Edit > Format Code. The program is formatted automatically.
You can undo the automatic formatting by selecting Edit > Undo or pressing the CTRL and
Z keys simultaneously.
To customize the formatting, select Program Editor Option and click the Indenter tab.
Program Documentation
You can embed comments in a program as explanatory
text.
/* create a temporary data set, newsalesemps */
/* from the text file newemps.csv            */
data work.newsalesemps;
   /* comment */
   length First_Name $ 12
          Last_Name $ 18 Job_Title $ 25;
   *read a comma delimited file;
   infile "&path\newemps.csv" dlm=',';
   input First_Name $ Last_Name $
         Job_Title $ Salary;
   * comment statement ;
run;
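* comment;       A comment statement begins with an asterisk and ends at the first semicolon, so it cannot contain semicolons.
/* comment */    These comments can be any length and can contain semicolons. They cannot be nested.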
SAS Comments
This program contains four comments.
*----------------------------------------*
| This program creates and uses the      |
| data set called work.newsalesemps.     |
*----------------------------------------*;
data work.newsalesemps;
   length First_Name $ 12 Last_Name $ 18
          Job_Title $ 25;
   infile "&path\newemps.csv" dlm=',';
   input First_Name $ Last_Name $
         Job_Title $ Salary /*numeric*/;
run;
/*
proc print data=work.newsalesemps;
run;
*/
proc means data=work.newsalesemps;
   *var Salary;
run;
p102d03
In SAS Enterprise Guide and the Enhanced Editor in SAS, to comment out a block of code using
the /* */ technique, you can highlight the code and then press the CTRL key and the / (forward
slash) key simultaneously. To uncomment a block of code, highlight the block and press the
CTRL, SHIFT, and / keys simultaneously.
2.05 Quiz
Open and examine p102a01. Based on the comments,
which steps do you think will execute and what output will
be generated?
Submit the program. Which steps are executed?
Business Scenario
Orion Star programmers must be able to identify and
correct syntax errors in a SAS program.
Syntax Errors
A syntax error is an error in the spelling or grammar of
a SAS statement. SAS finds syntax errors as it compiles
each SAS statement, before execution begins.
Examples of syntax errors:
misspelled keywords
unmatched quotation marks
missing semicolons
invalid options
2.06 Quiz
This program includes three syntax errors. One is an
invalid option. What are the other two syntax errors?
daat work.newsalesemps;
   length First_Name $ 12
          Last_Name $ 18 Job_Title $ 25;
   infile "&path\newemps.csv" dlm=',';
   input First_Name $ Last_Name $
         Job_Title $ Salary;
run;
proc print data=work.newsalesemps
run;
proc means data=work.newsalesemps average max;
   var Salary;
run;
The invalid option is AVERAGE in the PROC MEANS statement.
p102d04
Syntax Errors
The Enhanced Editor in SAS and the Program Editor
in SAS Enterprise Guide use the color red to indicate
a potential error in your SAS code.
Syntax Errors
When SAS encounters a syntax error, it writes a warning or
error message to the log.
ERROR 22-322: Syntax error, expecting one of the following:
a name, a quoted string, (, /, ;, _DATA_, _LAST_,
_NULL_.
WARNING: Data set WORK.TEST was not replaced because this step was
stopped.
You should always check the log to make sure that the
program ran successfully, even if output is generated.
In SAS Enterprise Guide, you can use the Up and Down arrows in the Log menu to find the
previous or next warning or error in the log.
     daat work.newsalesemps;
     ---
     14
WARNING 14-169: Assuming the symbol DATA was misspelled as daat.
78      length First_Name $ 12
79             Last_Name $ 18 Job_Title $ 25;
80      infile "&path\newemps.csv" dlm=',';
81      input First_Name $ Last_Name $
82            Job_Title $ Salary;
83   run;
NOTE: The infile "s:\workshop\newemps.csv" is:
      Filename=s:\workshop\newemps.csv,
NOTE: 71 records were read from the infile "s:\workshop\newemps.csv".
NOTE: The data set WORK.NEWSALESEMPS has 71 observations and 4 variables.
84
85   proc print data=work.newsalesemps
86   run;
     --
     22
     202
ERROR 22-322: Syntax error, expecting one of the following: ;, (, BLANKLINE, DATA, DOUBLE,
              HEADING, LABEL, N, NOOBS, OBS, ROUND, ROWS, SPLIT, STYLE, SUMLABEL, UNIFORM,
              WIDTH.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
88
NOTE: The SAS System stopped processing this step because of errors.
89   proc means data=work.newsalesemps average max;
                                       ------
                                       22
                                       202
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, CHARTYPE, CLASSDATA,
              CLM, COMPLETETYPES, CSS, CV, DATA, DESCEND, DESCENDING, DESCENDTYPES, EXCLNPWGT,
              EXCLNPWGTS, EXCLUSIVE, FW, IDMIN, KURTOSIS, LCLM, MAX, MAXDEC, MEAN, MEDIAN, MIN,
              MISSING, MODE, N, NDEC, NMISS, NOLABELS, NONOBS, NOPRINT, NOTHREADS, NOTRAP,
              NWAY, ORDER, P1, P10, P20, P25, P30, P40, P5, P50, P60, P70, P75, P80, P90, P95,
              P99, PCTLDEF, PRINT, PRINTALL, PRINTALLTYPES, PRINTIDS, PRINTIDVARS, PROBT, Q1,
              Q3, QMARKERS, QMETHOD, QNTLDEF, QRANGE, RANGE, SKEWNESS, STACKODS,
              STACKODSOUTPUT, STDDEV, STDERR, SUM, SUMSIZE, SUMWGT, T, THREADS, UCLM, USS, VAR,
              VARDEF.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
90      var Salary;
91   run;
NOTE: The SAS System stopped processing this step because of errors.
If you are using the Enterprise Guide Editor or the SAS Enhanced Editor, the program remains
in the Editor window. However, if you use the SAS Program Editor, the code disappears with
each submission. Use the RECALL command to recall the program. You can also select Run >
Recall Last Submit, or press F4. The program is redisplayed in the Program Editor.
When you modify a program, an asterisk (*) is displayed after the program name in the Editor
window. When you save the program, the asterisk is removed.
If the code is not in the Program Editor, recall it before saving the program.
2.07 Quiz
What is the syntax error in this program?
p102d05
There are no messages of any kind following each step in the log because the steps did not execute. The
absence of messages from SAS typically indicates unbalanced quotation marks. Another indication is the
DATA STEP running message in the banner of the Editor window. This is because the RUN statement
was viewed as part of the character literal and not as a step boundary.
Stopping a Program
You can stop an executing program in an interactive session.
1. To stop a program in the Windows environment, press
the CTRL and Break keys.
2. Select 1. Cancel Submitted Statements in the Tasking Manager window and click OK.
3. Select Y to cancel submitted statements, and click OK. The program stops executing.
Resubmitting a Program
1. Add a closing quotation mark to the DLM= option in the INFILE statement to correct the program.
2. You might choose to insert the following statements, which balance quotation marks programmatically, before your
program code. This is not necessary if you have already stopped the DATA step, but some SAS
programmers include this as the first line of every program that they write. Alternatively, some
programmers add these statements at the end of every program.
*';*";run;
data work.newsalesemps;
length First_Name $ 12 Last_Name $ 18
Job_Title $ 25;
infile "&path\newemps.csv" dlm=',';
input First_Name $ Last_Name $
Job_Title $ Salary;
run;
proc print data=work.newsalesemps;
run;
proc means data=work.newsalesemps;
var Salary;
run;
3. Resubmit the program and verify that the program ran successfully.
It is very similar to the statement that is entered in SAS to correct the unbalanced quotation marks,
but it also contains a single semicolon, a */ to close a comment, and a QUIT statement. SAS
Enterprise Guide is attempting to fix any potential unbalanced situation from a previously submitted
program.
The program did not execute successfully, but this time there is a warning and an error in the log, and
the DATA step stopped. (Your log might contain different warnings or errors than those shown here.)
WARNING: The quoted string currently being processed has become more than 262 characters long.
You might have unbalanced quotation marks.
36
37
38
39
%LET _CLIENTPROJECTNAME=;
%LET _SASPROGRAMFILE=;
;*';*";*/;quit;run;
____
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.NEWSALESEMPS may be incomplete. When this step was stopped there
were 0 observations and 3 variables.
The first warning above says that a quoted string has exceeded 262 characters. This is not a syntax
error, but the message is an indication that the program might have unbalanced quotation marks.
The error is caused by the QUIT statement. Although it is included to prevent errors, in this case it
caused an error, but the program did not hang as it did in the SAS session.
3. Add a closing quotation mark to the DLM= option in the INFILE statement to correct the program.
4. Resubmit the program, replacing the previous results. Verify that the program runs successfully.
Stopping a Program
SAS Enterprise Guide has built-in features to prevent SAS programs from hanging, but occasionally you
might need to stop or interrupt a program while it is executing. There are two methods:
Method 1: Click the Stop button next to the Run button. It is normally dimmed, but it is red and
selectable while a program is executing.
Method 2: Select View > Task Status to access the Task Status window.
The Task Status window shows the status of all running tasks and programs.
Right-click the task and then select Stop or End SAS Process.
Stop ends the current task while maintaining your connection to the server. If the server is in the
middle of processing a step, it might take a few minutes for the server to stop running the task.
End SAS Process immediately ends the current task and terminates the connection with the server.
Because the connection with the server is terminated, any temporary data is lost.
Exercises
Level 1
6. Diagnosing and Correcting Syntax Errors
a. With the appropriate Editor window active, open the SAS program p102e06.
b. Submit the program and use the notes in the SAS log to identify the error.
c. Correct the error and resubmit the program.
d. Save the corrected program.
Level 2
7. Diagnosing and Correcting Syntax Errors
a. With the appropriate Editor window active, open the SAS program p102e07.
b. Submit the program and use the notes in the SAS log to identify the error.
c. Correct the error and resubmit the program.
d. Save the corrected program.
Challenge
8. Identifying SAS Components
Explore the Internet to find how to list SAS software products licensed at your site.
2.4 Solutions
Solutions to Exercises
1. Submitting a Program
a. Submit the program.
b. The report has 238 rows and 3 columns.
c. The Log window indicates 238 observations and 2 variables.
d. SAS windowing environment only: Click the New button on the toolbar or select Edit > Clear All in both windows.
5. Setting Up a Function Key to Clear the Log and Output Windows (SAS Windowing
Environment Only)
a. Submit the KEYS command or select Tools > Options > Keys.
b. Add the commands to the Definition column for the F12 key:
Only the first three characters are needed for each command, so the following are
alternate ways to define the F12 key: clear log; clear out or cle log; cle out
DATA Step
PROC Step
PROC Step
p102d01
The DATA step ends at the RUN statement. The PROC PRINT
step ends at the PROC MEANS statement. The end of the
PROC MEANS step might or might not be detected. A RUN
statement is recommended.
one
three
five
seven
data work.newsalesemps;
   length First_Name $ 12
          Last_Name $ 18 Job_Title $ 25;
   infile "&path\newemps.csv" dlm=',';
   input First_Name $ Last_Name $
         Job_Title $ Salary;
run;
invalid option
Business Scenario
Many SAS data sets related to the Orion Star project
already exist. The programmers need to know how to
display the structure and contents of the data sets.
SAS Data Set
Report
SAS Terminology     Database Terminology
SAS data set        Table
Observation         Row
Variable            Column
Descriptor Portion
The descriptor portion contains the following metadata:
- general properties (such as data set name and number of observations)
- variable properties (such as name, type, and length)

Partial work.newsalesemps

General properties
   Data Set Name   WORK.NEWSALESEMPS
   Engine          V9
   Created         Mon, Feb 27, 2012 01:28 PM
   Observations    71
   Variables       4
   ...

Variable properties
   First_Name   Last_Name   Job_Title   Salary
   $ 12         $ 18        $ 25        N 8
p103d01
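The report that follows is typical PROC CONTENTS output. The text of p103d01 is not reproduced in these notes, so the steps below are only a minimal sketch of a program that could produce such a report, assuming that work.newsalesemps already exists in the current SAS session. The second step is an optional variation and is not claimed to be part of the course program.

```sas
/* Display the descriptor portion of a data set.              */
/* Assumption: work.newsalesemps was created earlier in this  */
/* SAS session (Work data sets do not persist).               */
proc contents data=work.newsalesemps;
run;

/* Variation: the VARNUM option lists the variables in their  */
/* creation order instead of alphabetically.                  */
proc contents data=work.newsalesemps varnum;
run;
```

In the default report, the variable list is alphabetical, which is why the # column in the output is not in ascending order.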
Data Set Name   WORK.NEWSALESEMPS               Observations           71
Member Type     DATA                            Variables              4
Engine          V9                              Indexes                0
Created         Mon, Feb 27, 2012 01:28:51 PM   Observation Length     64
Last Modified   Mon, Feb 27, 2012 01:28:51 PM   Deleted Observations   0
                                                Compressed             NO
                                                Sorted                 NO

#   Variable     Type   Len
1   First_Name   Char    12
3   Job_Title    Char    25
2   Last_Name    Char    18
4   Salary       Num      8
3.01 Quiz
How many observations are in the data set
work.donations?
Data Portion
The data portion of a SAS data set contains the data
values, which are either character or numeric.
Partial work.newsalesemps

First_Name   Last_Name    Job_Title        Salary
Satyakam     Denny        Sales Rep. II     26780
Monica       Kletschkus   Sales Rep. IV     30890
Kevin        Lyon         Sales Rep. I      26955
Petrea       Soltau       Sales Rep. II     27440

The first row shows the variable names. The remaining rows are the data values:
First_Name, Last_Name, and Job_Title hold character values, and Salary holds numeric values.
Variable names are part of the descriptor portion, not the data portion.
p103d02
First_Name   Last_Name    Job_Title         Salary
Satyakam     Denny        Sales Rep. II      26780
Monica       Kletschkus   Sales Rep. IV      30890
Kevin        Lyon         Sales Rep. I       26955
Petrea       Soltau       Sales Rep. II      27440
Marina       Iyengar      Sales Rep. III     29715
Salary
cust_ID
month1
FirstName
data5mon
5monthsdata
data#5
five months data
five_months_data
FiveMonthsData
fivemonthsdata
A variable name can contain special characters if you place the name in quotation marks and immediately
follow it with the letter N (for example, 'Flight#'n). This is called a SAS name literal. In order to use SAS
name literals as variable names, the VALIDVARNAME= option must be set to ANY.
options validvarname=any;
This setting is the default in SAS Enterprise Guide.
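As an illustration of a SAS name literal, the sketch below creates and prints a variable whose name contains a special character. The data set name work.flights and the values are hypothetical and are used only for this example.

```sas
/* Allow variable names that contain special characters. */
options validvarname=any;

data work.flights;              /* hypothetical data set */
   /* 'Flight#'n is a name literal: the quoted name      */
   /* followed immediately by the letter N.              */
   'Flight#'n = 101;
   Dest = 'LHR';
run;

proc print data=work.flights;
   var 'Flight#'n Dest;
run;
```

The name literal must be written the same way wherever the variable is referenced, as in the VAR statement above.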
Data Types
A SAS data set supports two types of variables.
Character variables
- can contain any value: letters, numerals, special characters, and blanks
- range from 1 to 32,767 characters in length
- have 1 byte per character.
Numeric variables
- store numeric values using floating point or binary representation
- have 8 bytes of storage by default
- can store 16 or 17 significant digits.
Last_Name    Job_Title       Salary
Kletschkus   Sales Rep. IV        .
Lyon         Sales Rep. I     26955
Soltau                        27440

A blank represents a missing character value.
A period represents a missing numeric value.
By default, a period is used for a missing numeric value. This default can be altered with the
MISSING= SAS system option.
Date         Stored value   Displayed
01JAN1959        -365       01/01/1959
01JAN1960           0       01/01/1960
01JAN1961         366       01/01/1961
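The relationship between stored and displayed date values can be checked with a short DATA step. This is a minimal sketch that writes to the log only; the variable names are chosen for the example, and the MMDDYY10. format is one of several formats that could be used for display.

```sas
data _null_;
   /* Date constants: 'ddMONyyyy'd */
   d1959 = '01JAN1959'd;
   d1960 = '01JAN1960'd;
   d1961 = '01JAN1961'd;

   /* Without a format, the stored numeric values are written:   */
   /* d1959=-365 d1960=0 d1961=366                               */
   put d1959= d1960= d1961=;

   /* With a format, the same values are displayed as dates:     */
   /* d1959=01/01/1959 d1960=01/01/1960 d1961=01/01/1961         */
   put d1959= mmddyy10. d1960= mmddyy10. d1961= mmddyy10.;
run;
```

The value for 01JAN1961 is 366 rather than 365 because 1960 was a leap year.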
3.03 Quiz
What is the numeric value for today's date?
Submit program p103a02.
View the output to retrieve the current date as a
SAS date value (that is, a numeric value referencing
January 1, 1960).
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Examining the Data Portion of a SAS Data Set
a. Retrieve the starter program p103e01.
b. After the PROC CONTENTS step, add a PROC PRINT step to display all observations,
all variables, and the Obs column for the data set named work.donations.
c. Submit the program to create the PROC PRINT report below. The results contain 124
observations.
Partial PROC PRINT Output
        Employee_
Obs        ID       Qtr1   Qtr2   Qtr3   Qtr4   Total
  1      120265       .      .      .     25      25
  2      120267      15     15     15     15      60
  3      120269      20     20     20     20      80
...
123      121145      35     35     35     35     140
124      121147      10     10     10     10      40
Level 2
2. Examining the Descriptor and Data Portions of a SAS Data Set
a. Retrieve the starter program p103e02.
b. After the DATA step, add a PROC CONTENTS step to display the descriptor portion of
work.newpacks.
c. Submit the program and answer the following questions:
How many observations are in the data set? ______________________________________
How many variables are in the data set? _________________________________________
What is the length (byte size) of the variable Product_Name? _____________________
d. After the PROC CONTENTS step, add a PROC PRINT to display the data portion of
work.newpacks.
Submit the program to create the following PROC PRINT report:
Obs
1
Supplier_Name
Top Sports
Supplier_
Country
DK
Product_Name
Black/Black
Top Sports
Top Sports
DK
DK
X-Large Bottlegreen/Black
Comanche Women's 6000 Q Backpack. Bark
ES
US
Challenge
3. Working with Times and Datetimes
a. Retrieve and submit the starter program p103e03.
b. Notice the values of CurrentTime and CurrentDateTime in the PROC PRINT output.
c. Use the SAS Help facility or product documentation to investigate how times and datetimes are
stored in SAS.
d. Complete the following sentences:
A SAS time value represents the number of __________________________________________.
A SAS datetime value represents the number of _______________________________________.
Business Scenario
Orion Star programmers need to access existing
SAS data sets, so they need to understand how the
data sets are stored in SAS.
SAS Libraries
SAS data sets are stored in SAS libraries. A SAS library is
a collection of SAS files that are referenced and stored as
a unit.
SAS Libraries
You can think of a SAS library as a drawer in a filing
cabinet and a SAS data set as one of the files in the
drawer.
data set
libraries
temporary library
permanent library
work
sashelp
Assigning a Libref
Regardless of the operating system that you use, you
refer to a SAS library by a logical name called a library
reference name, or libref.
work
libref
sashelp
Temporary Library
Work is a temporary library where you can store and
access SAS data sets for the duration of the SAS session.
It is the default library.
work
Permanent Libraries
Sashelp is a permanent library that contains sample
SAS data sets you can access during your SAS session.
sashelp
Permanent Libraries
Sasuser is a permanent library that you can use to store
and access SAS data sets in any SAS session.
sasuser
work.newsalesemps
sashelp.class
libref
libref.data-set-name
Business Scenario
Orion Star programmers need to access and view
SAS data sets that are stored in a permanent
user-defined library.
work
sashelp
orion
User-Defined Libraries
Users can create their own SAS libraries. A user-defined library
- is permanent. Data sets are stored until the user deletes them.
- is implemented within the operating environment's file system.
- is not automatically available in a SAS session.
User-Defined Libraries
Operating Environment   A SAS library is    Example
Microsoft Windows       A folder            s:\workshop
UNIX                    A directory         ~/workshop
z/OS (OS/390)           A sequential file   userid.workshop.sasdata
LIBNAME Statement
The SAS LIBNAME statement is a global SAS statement.
libname orion "s:\workshop";
LIBNAME libref "SAS-library" <options>;
In the Microsoft Windows environment, an existing folder is used as a SAS library. The LIBNAME
statement cannot create a new folder.
In the UNIX environment, an existing directory is used as a SAS library. The LIBNAME statement
cannot create a new directory.
In the z/OS environment, a sequential file is used as a SAS library. z/OS (OS/390) users can use a SAS
LIBNAME statement, a DD statement, or a TSO ALLOCATE command. These statements and
commands can create a new library.
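The following sketch assigns a libref and then checks the assignment programmatically. It assumes that s:\workshop is an existing folder on a Windows system, as in the course setup; substitute the location of your own data. The %PUT check is an optional addition for illustration and is not part of the course program.

```sas
/* Assumption: s:\workshop already exists. On Windows and UNIX, */
/* the LIBNAME statement does not create the folder.            */
libname orion "s:\workshop";

/* The LIBREF function returns 0 when the libref is assigned    */
/* and a nonzero value otherwise. The result goes to the log.   */
%put Return code for libref orion: %sysfunc(libref(orion));
```

Because LIBNAME is a global statement, it is not part of a DATA or PROC step and does not require a RUN statement. The libref stays in effect until it is cleared, reassigned, or the SAS session ends.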
Browsing a Library
Step 3 You can browse a library interactively in a
SAS or SAS Enterprise Guide session, or
programmatically using the CONTENTS
procedure.
PROC CONTENTS
orion
Libref          ORION
Engine          V9
Physical Name   S:\workshop
Filename        S:\workshop

#   Name           Member Type   File Size   Last Modified
1   CHARITIES      DATA               9216   23Aug12:15:58:39
2   CONSULTANTS    DATA               5120   23Aug12:15:58:39
3   COUNTRY        DATA              17408   13Oct10:19:04:39
    COUNTRY        INDEX             17408   13Oct10:19:04:39
4   CUSTOMER       DATA              33792   04Nov11:09:52:27
5   CUSTOMER_DIM   DATA              33792   04Nov11:09:52:27
A member type of DATA indicates a standard SAS data set. INDEX indicates a file that enables SAS to
access observations in the SAS data set quickly and efficiently.
Country   Country_Name     Population   Country_ID   Continent_ID   Country_FormerName
AU        Australia        20,000,000      160            96
CA        Canada                    .      260            91
DE        Germany          80,000,000      394            93        East/West Germany
IL        Israel            5,000,000      475            95
TR        Turkey           70,000,000      905            95
US        United States   280,000,000      926            91
ZA        South Africa     43,000,000      801            94
p103d04
When working with the course data and programs, be sure to run libname.sas at the start of
every Enterprise Guide session to set the path variable and assign the orion libref.
6. Check the log to confirm that the orion libref was assigned. The physical name reflects the location
of your SAS data files and might differ from the name shown below.
15   %let path=s:\workshop;
16   libname orion "s:\workshop";
NOTE: Libref ORION was successfully assigned as follows:
      Engine:        V9
      Physical Name: s:\workshop
7. In the Server List, click Libraries and then click Refresh. The orion library is in the active library
list.
8. Expand ORION to see the list of data sets.
10. Click
11. In the Server List, right-click COUNTRY and select Properties. The Properties window is displayed
with the General tab selected. Click the Columns tab to see information about the variables.
Click
12. Explore the orion library programmatically using the CONTENTS procedure. Open and submit
program p103d03 to generate a list of library members. Partial output is shown below.
proc contents data=orion._all_ nods;
run;
The CONTENTS Procedure

Directory
Libref          ORION
Engine          V9
Physical Name   s:\workshop
Filename        s:\workshop

#   Name          Member Type   File Size   Last Modified
1   CHARITIES     DATA               9216   23Aug12:15:58:39
2   CONSULTANTS   DATA               5120   23Aug12:15:58:39
3   COUNTRY       DATA              17408   13Oct10:19:04:39
    COUNTRY       INDEX             17408   13Oct10:19:04:39
4   CUSTOMER      DATA              33792   04Nov11:09:52:27
13. Open a new Program window. Type and submit the following statement to clear the orion libref.
libname orion clear;
14. Check the log and the Server List to verify that the orion libref was deassigned.
15   libname orion clear;
NOTE: Libref ORION has been deassigned.
In the Server List, click Libraries and click Refresh. Verify that orion is no longer defined.
3. In the Server List, click Libraries and select Refresh. Expand ORION to see its members.
3.05 Poll
The library display in the Server List updates immediately
when a libref is assigned or cleared using SAS Enterprise
Guide.
True
False
2. Double-click Libraries to show all available libraries. The active libraries are displayed.
3. Open libname.sas.
%let path=s:\workshop;
*libname orion "s:\workshop";
The first statement creates a macro variable named path and assigns it a full path to the folder
containing the course data. The second statement, a LIBNAME statement, associates the libref, orion,
with the same data location. The LIBNAME statement is commented out.
4. Modify the program by removing the asterisk from the start of the LIBNAME statement. Save the
modified program.
5. Submit the program.
When working with the course data and programs, be sure to run libname.sas at the start of
every SAS session to set the path variable and assign the orion libref.
6. Check the log to confirm that the orion libref was assigned. The physical name reflects the location
of your SAS data files and might differ from the name shown below.
412  %let path=s:\workshop;
413  libname orion "s:\workshop";
NOTE: Libref ORION was successfully assigned as follows:
      Engine:        V9
      Physical Name: s:\workshop
7. In the Explorer window, verify that orion is now displayed as an active library.
10. Click
11. Double-click the Sales data set or right-click the file and select Open. The data set orion.sales opens
in a VIEWTABLE window.
Variable labels are displayed by default. You can display variable names instead of variable labels by
selecting View > Column Names. In addition to browsing SAS data sets, you can use the
VIEWTABLE window to edit and create data sets, and to customize your view of a SAS data set.
12. Click
13. Explore the orion library programmatically. Open and submit p103d03 to generate a list of library
members.
proc contents data=orion._all_ nods;
run;
14. View the output. Partial output is shown below.
The CONTENTS Procedure

Directory
Libref          ORION
Engine          V9
Physical Name   S:\workshop
Filename        S:\workshop

#   Name          Member Type   File Size   Last Modified
1   CHARITIES     DATA               9216   23Aug12:15:58:39
2   CONSULTANTS   DATA               5120   23Aug12:15:58:39
3   COUNTRY       DATA              17408   13Oct10:19:04:39
    COUNTRY       INDEX             17408   13Oct10:19:04:39
16. Open a new Editor window. Type and submit the following statement to clear the orion libref.
libname orion clear;
17. Check the log and the Explorer window to verify that the orion libref was deassigned.
15   libname orion clear;
NOTE: Libref ORION has been deassigned.
Select Enable at startup to have this library defined automatically when a SAS session
starts.
3. No notes are written to the log when this method is used to assign a library. Use the Explorer window
to verify that orion was assigned.
Exercises
You must submit a LIBNAME statement before beginning this exercise session.
a. Open libname.sas. Modify the program by removing the asterisk from the LIBNAME statement.
b. Submit the modified program.
c. Verify that the libref orion was successfully assigned.
d. Save the program.
Level 1
4. Accessing a Permanent Data Set
a. Use an interactive facility to explore the orion library and answer the following questions:
How many observations are in the orion.country data set? _______
How many variables are in the orion.country data set? ______
What is the name of the last country in the data set? _____________________________
b. Submit a PROC CONTENTS step to generate a list of all members in the orion library. What is
the name of the last member listed? __________________________________________
Level 2
5. Viewing General Data Set Properties
a. Examine the general data set properties of orion.staff.
b. What sort information is stored for this data set? ____________________________________
Challenge
6. SAS Autoexec File
Use the Help facility or product documentation to investigate the SAS autoexec file and answer the
following questions.
What is the name of the file? __________________________________________________________
What is its purpose? _________________________________________________________________
How is it created? ___________________________________________________________________
How could this be useful in a SAS session? ______________________________________________
3.3 Solutions
Solutions to Exercises
1. Examining the Data Portion of a SAS Data Set
data work.donations;
infile "&path\donation.dat";
input Employee_ID Qtr1 Qtr2 Qtr3 Qtr4;
Total=sum(Qtr1,Qtr2,Qtr3,Qtr4);
run;
proc contents data=work.donations;
run;
proc print data=work.donations;
run;
2. Examining the Descriptor and Data Portions of a SAS Data Set
data work.newpacks;
b. Submit a PROC CONTENTS step to generate a list of all members in the orion library.
proc contents data=orion._all_ nods;
run;
What is the name of the last member listed? US_SUPPLIERS
5. Viewing General Data Set Properties
proc contents data=orion.staff;
run;
a. Examine the general data set properties of orion.staff.
b. What sort information is stored for this data set? The General Information section indicates
that the data set is sorted. The Variable section indicates that it is sorted by Employee_ID
using the ANSI character set, and has been validated.
data work.donations;
infile "&path\donation.dat";
input Employee_ID Qtr1 Qtr2 Qtr3 Qtr4;
Total=sum(Qtr1,Qtr2,Qtr3,Qtr4);
run;
proc contents data=work.donations;
run;
p103a01s
data5mon
5monthsdata
data#5
five months data
five_months_data
FiveMonthsData
fivemonthsdata
Chapter 4
Business Scenario
Orion Star management wants a report that displays the
names, salaries, and a salary total for all sales employees.
orion.sales
Obs
1
2
3
Last_Name
xxxxxxx
xxxxxxx
xxxxxxx
PROC PRINT
First_Name
xxxxxxxxxx
xxxxxxxxxx
xxxxxxxxxx
Salary
99999
99999
99999
----99999
PRINT Procedure
By default, PROC PRINT displays all observations, all
variables, and an Obs column on the left side.
proc print data=orion.sales;
run;
Partial PROC PRINT Output
Obs   Employee_ID   First_Name   Last_Name   Gender   Salary   Job_Title       Country   Birth_Date   Hire_Date
  1     120102      Tom          Zhou          M      108255   Sales Manager     AU          3510       10744
  2     120103      Wilson       Dawes         M       87975   Sales Manager     AU         -3996        5114
  3     120121      Irenie       Elvish        F       26600   Sales Rep. II     AU         -5630        5114
  4     120122      Christina    Ngan          F       27475   Sales Rep. II     AU         -1984        6756
  5     120123      Kimiko       Hotstone      F       26190   Sales Rep. I      AU          1732        9405
The columns are listed, left to right, in the order the variables are stored in the data set.
VAR Statement
The VAR statement selects variables to include in the
report and specifies their order.
VAR variable(s);
Last_Name   First_Name   Salary
Zhou        Tom          108255
Dawes       Wilson        87975
Elvish      Irenie        26600
Ngan        Christina     27475
Hotstone    Kimiko        26190
p104d01
SUM Statement
The SUM statement calculates and displays report totals
for the requested numeric variables.
Last_Name     First_Name    Salary
Zhou          Tom           108255
Dawes         Wilson         87975
Elvish        Irenie         26600
...
Capachietti   Renee          83505
Lansberry     Dennis         84260
                           =======
                           5141420
p104d01
NOTE: There were 165 observations read from the data set ORION.SALES.
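The text of p104d01 is not reproduced in these notes. A step along the following lines would produce a report like the one above; it is a sketch that assumes the orion libref is assigned (for example, by running libname.sas).

```sas
/* Assumption: the orion libref points to the course data. */
proc print data=orion.sales;
   /* VAR selects the variables and sets their order.      */
   var Last_Name First_Name Salary;
   /* SUM adds a total line for the listed variable.       */
   sum Salary;
run;
```

A variable named in the SUM statement is included in the report even if it is not listed in the VAR statement, but listing it in VAR makes its position explicit.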
Business Scenario
Orion Star management wants a report that displays the
names and salaries of the sales employees earning less
than $25,500. Suppress the Obs column.
orion.sales
Last_Name
xxxxxxx
xxxxxxx
xxxxxxx
PROC PRINT
First_Name
xxxxxxxx
xxxxxxxx
xxxxxxxx
Salary
25000
20000
23000
WHERE Statement
The WHERE statement selects observations that meet
the criteria specified in the WHERE expression.
p104d02
NOTE: There were 7 observations read from the data set ORION.SALES.
WHERE Salary<25500;
Obs   Last_Name   First_Name   Salary
 49   Tilley      Kimiko        25185
 50   Barcoe      Selina        25275
 85   Anstey      David         25285
104   Voron       Tachaun       25125
111   Polky       Asishana      25110
131   Ould        Tulsidas      22710
148   Buckner     Burnetta      25390
Last_Name   First_Name   Salary
Tilley      Kimiko        25185
Barcoe      Selina        25275
Anstey      David         25285
Voron       Tachaun       25125
Polky       Asishana      25110
Ould        Tulsidas      22710
Buckner     Burnetta      25390
p104d02
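The text of p104d02 is not reproduced in these notes. The sketch below shows one way to write a step that subsets the report and suppresses the Obs column, assuming the orion libref is assigned. The WHERE expression is taken from the log note shown earlier; the rest of the step is a reconstruction, not the course program itself.

```sas
/* Assumption: the orion libref points to the course data.   */
/* NOOBS on the PROC PRINT statement suppresses the Obs      */
/* column.                                                   */
proc print data=orion.sales noobs;
   var Last_Name First_Name Salary;
   /* Only observations that satisfy the condition are read. */
   where Salary<25500;
run;
```

When the Obs column is displayed with a WHERE statement, it shows the original observation numbers from the input data set (49, 50, 85, and so on), not a running count of the selected rows.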
WHERE Statement
The WHERE expression defines the condition (or
conditions) for selecting observations.
WHERE WHERE-expression;
Operands
- character constants
- numeric constants
- date constants
- character variables
- numeric variables
Operators
- symbols that represent a comparison, calculation, or logical operation (for example, > or LT)
- SAS functions
- special WHERE operators
Operands
Constants are fixed values.
- Character values are enclosed in quotation marks and are case sensitive.
- Numeric values do not use quotation marks or special characters.
Variables must exist in the input data set.
where Gender='M';       /* Gender is a variable; 'M' is a character constant */
where Salary>50000;     /* Salary is a variable; 50000 is a numeric constant */
Comparison Operators
Comparison operators compare a variable with a value
or with another variable.
Symbol      Mnemonic   Definition
=           EQ         Equal to
^= ¬= ~=    NE         Not equal to
>           GT         Greater than
<           LT         Less than
>=          GE         Greater than or equal
<=          LE         Less than or equal
            IN         Equal to one of a list
The caret (^), tilde (~), and the not sign (¬) all indicate a logical not. Use the character available on your
keyboard, or use the mnemonic equivalent.
Comparison Operators
Examples
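The examples from this slide are not reproduced in these notes. As a stand-in, the sketch below uses several of the comparison operators against orion.sales, assuming the orion libref is assigned. The particular conditions are chosen for illustration only.

```sas
/* Symbol form: a quoted, case-sensitive character constant. */
proc print data=orion.sales noobs;
   var Last_Name Gender Salary Country;
   where Gender = 'M';
run;

/* Mnemonic form: GE is the same comparison as >=.           */
proc print data=orion.sales noobs;
   var Last_Name Gender Salary Country;
   where Salary ge 50000;
run;

/* NE with a period excludes rows with a missing Salary.     */
proc print data=orion.sales noobs;
   var Last_Name Gender Salary Country;
   where Salary ne .;
run;

/* IN compares the variable with each value in the list.     */
proc print data=orion.sales noobs;
   var Last_Name Gender Salary Country;
   where Country in ('AU','US');
run;
```

Each step uses a single WHERE statement. If a step contains more than one plain WHERE statement, only the last one is used.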
Logical Operators
Logical operators combine or modify WHERE
expressions.
proc print data=orion.sales;
where Country='AU' and
Salary<30000;
run;
WHERE WHERE-expression-1 AND | OR
WHERE-expression-n;
p104d03
NOTE: There were 51 observations read from the data set ORION.SALES.
WHERE (Country='AU') and (Salary<30000);
Symbol     Mnemonic   Priority
^ ¬ ~      NOT        I
&          AND        II
|          OR         III
NOT modifies a condition by finding the complement of the specified criteria. Use the character
available on your keyboard, or use the mnemonic equivalent.
AND finds observations that satisfy both conditions.
OR finds observations that satisfy one or both conditions.
Logical Operators
Examples
where Country ne 'AU' and Salary>=50000;
where Gender eq 'M' or Salary ge 50000;
where Country='AU' or Country='US';
where Country in ('AU','US');
where Country not in ('AU','US');
equivalent expressions
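The priority column in the logical operator table matters when AND and OR appear in the same expression. The sketch below contrasts the default evaluation order with an explicit grouping; it assumes the orion libref is assigned, and the conditions are chosen for illustration only.

```sas
/* AND has higher priority than OR, so this expression is     */
/* evaluated as                                               */
/*    Country='AU' or (Country='US' and Salary>=50000)        */
/* and selects every AU employee regardless of salary.        */
proc print data=orion.sales noobs;
   var Last_Name Country Salary;
   where Country='AU' or Country='US' and Salary>=50000;
run;

/* Parentheses force the OR to be evaluated first, so the     */
/* salary condition applies to both countries.                */
proc print data=orion.sales noobs;
   var Last_Name Country Salary;
   where (Country='AU' or Country='US') and Salary>=50000;
run;
```

Adding parentheses even when they are not strictly required makes the intended grouping clear to anyone reading the program.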
4.02 Quiz
Which WHERE statement correctly subsets the numeric
values for May, June, or July and missing character
names?
a. where Month in (5-7)
         and Names=.;
b. where Month in (5,6,7)
         and Names=' ';
c. where Month in ('5','6','7')
         and Names='.';
Business Scenario
Orion Star management wants a report that lists only the
Australian sales representatives.
orion.sales
Last_Name
First_
Name
xxxxxxxxxx
xxxxxxxxxx
xxxxxxxxxx
xxxxxxxxxx
xxxxxxx
xxxxxxx
xxxxxxx
xxxxxxx
Country
xx
xx
xx
xx
Job_Title
xxxxxxxxxxxx
xxxxxxxxxxxx
xxxxxxxxxxxx
xxxxxxxxxxxx
First_Name   Country   Job_Title
Billy        AU        Sales Rep. II
Matsuoka     AU        Sales Rep. III
Vino         AU        Sales Rep. II
Meera        AU        Sales Rep. III
Harry        US        Chief Sales Officer
Julienne     US        Sales Rep. II
Scott        US        Sales Rep. IV
Cherda       US        Sales Rep. IV
p104d04
CONTAINS Operator
The CONTAINS operator selects observations that
include the specified substring.
Equivalent Statements
where Job_Title contains 'Rep';
where Job_Title ? 'Rep';
Last_Name    First_Name   Country   Job_Title
Elvish       Irenie       AU        Sales Rep. II
Ngan         Christina    AU        Sales Rep. II
Hotstone     Kimiko       AU        Sales Rep. I
Daymond      Lucian       AU        Sales Rep. I
Hofmeister   Fong         AU        Sales Rep. IV
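Special WHERE operators (x = can be used with that variable type)

Operator         Definition                   Char   Num
CONTAINS         Includes a substring          x
BETWEEN-AND      An inclusive range            x      x
WHERE SAME AND   Augment a WHERE expression    x      x
IS NULL          A missing value               x      x
IS MISSING       A missing value               x      x
LIKE             Matches a pattern             x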
BETWEEN-AND Operator
The BETWEEN-AND operator selects observations in
which the value of a variable falls within an inclusive
range of values.
Examples
where salary between 50000 and 100000;
where salary not between 50000 and 100000;
where Last_Name between 'A' and 'L';
where Last_Name between 'Baker' and 'Gomez';
BETWEEN-AND Operator
Equivalent Statements
where salary between 50000 and 100000;
where salary>=50000 and salary<=100000;
where 50000<=salary<=100000;
p104d05
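The text of p104d05 is not reproduced in these notes. The report that follows contains only female Australian employees with salaries under 30000, which is consistent with a step that augments an existing WHERE expression. The sketch below is a reconstruction under that assumption, not the course program itself, and it assumes the orion libref is assigned.

```sas
/* Assumption: the orion libref points to the course data.     */
proc print data=orion.sales;
   var First_Name Last_Name Gender Salary Country;
   /* First condition.                                         */
   where Country='AU' and Salary<30000;
   /* WHERE SAME AND adds a condition to the WHERE expression  */
   /* above instead of replacing it.                           */
   where same and Gender='F';
run;
```

Without SAME AND, the second WHERE statement would replace the first, and the report would list all female employees regardless of country or salary.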
Obs   First_Name   Last_Name   Gender   Salary   Country
  3   Irenie       Elvish        F       26600     AU
  4   Christina    Ngan          F       27475     AU
  5   Kimiko       Hotstone      F       26190     AU
  9   Sharryn      Clarkson      F       28100     AU
 14   Fancine      Kaiser        F       28525     AU
 15   Petrea       Soltau        F       27440     AU
 19   Marina       Iyengar       F       29715     AU
 20   Shani        Duckett       F       25795     AU
 21   Fang         Wilson        F       26810     AU
 23   Amanda       Liebman       F       27465     AU
4.03 Quiz
1. Open p104a01b.
2. Change WHERE SAME AND to WHERE ALSO.
3. Submit the program and view the log.
What message is written to the log?
IS NULL Operator
The IS NULL operator selects observations in which a
variable has a missing value.
Examples
where Employee_ID is null;
where Employee_ID is not null;
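IS NULL can be used for both character and numeric variables. For a character variable, it selects the
same observations as a comparison with a blank (=' '); for a numeric variable, a comparison with a period (=.).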
IS MISSING Operator
The IS MISSING operator selects observations in which a
variable has a missing value.
Examples
where Employee_ID is missing;
where Employee_ID is not missing;
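IS MISSING can be used for both character and numeric variables. For a character variable, it selects the
same observations as a comparison with a blank (=' '); for a numeric variable, a comparison with a period (=.).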
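The behavior of IS NULL and IS MISSING can be tried on a small test data set. The data set work.missing_demo and its values are hypothetical and exist only for this sketch; the DATA step uses assignment and OUTPUT statements to build three rows.

```sas
/* Hypothetical test data: one row with a blank Name and one  */
/* row with a missing Salary.                                 */
data work.missing_demo;
   length Name $ 10;
   Name='Lyon';   Salary=26955; output;
   Name=' ';      Salary=27440; output;
   Name='Soltau'; Salary=.;     output;
run;

/* Numeric variable: selects the row where Salary is missing  */
/* (the Soltau row).                                          */
proc print data=work.missing_demo;
   where Salary is missing;
run;

/* Character variable: selects the row where Name is blank.   */
proc print data=work.missing_demo;
   where Name is null;
run;
```

IS NULL and IS MISSING are interchangeable in these steps, so neither operator requires you to know whether the variable is character or numeric.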
LIKE Operator
The LIKE operator selects observations by comparing
character values to specified patterns. Two special
characters are used to define a pattern:
- A percent sign (%) specifies that any number of characters can occupy that position.
- An underscore (_) specifies that exactly one character can occupy that position.
Examples
where Name like '%N';
where Name like 'T_m';
where Name like 'T_m%';
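To see how the three patterns above behave, the sketch below applies them to a small test data set. The data set work.names and its values are hypothetical and were chosen so that each pattern selects a different subset.

```sas
/* Hypothetical test data. */
data work.names;
   length Name $ 10;
   do Name='Tom','Tim','Tommy','Tam','JOHN','Nathan';
      output;
   end;
run;

/* Ends with an uppercase N: selects JOHN. Nathan is not      */
/* selected because the comparison is case sensitive.         */
proc print data=work.names;
   where Name like '%N';
run;

/* T, exactly one character, then m: selects Tom, Tim, Tam.   */
proc print data=work.names;
   where Name like 'T_m';
run;

/* Same as above, followed by any number of characters:       */
/* selects Tom, Tim, Tam, and Tommy.                          */
proc print data=work.names;
   where Name like 'T_m%';
run;
```

The percent sign also matches zero characters, which is why 'T_m%' still selects the three-letter names.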
4.04 Quiz
Which WHERE statement returns all the observations that
have a first name starting with the letter M for the given
values?
Name
Elvish, Irenie
Ngan, Christina
Hotstone, Kimiko
Daymond, Lucian
Hofmeister, Fong
Denny, Satyakam
Clarkson, Sharryn
Kletschkus, Monica
last name, first name
Business Scenario
The Sales Manager wants a report that includes only
customers who are 21 years old.
orion.customer_dim
Customer_Age=21
Obs Customer_ID
1
2
3
999
999
999
Customer_Name
Customer_
Gender
XXXXXXXXXXXX
XXXXXXXXXXXX
XXXXXXXXXXXX
X
X
X
Customer_
Country
Customer_Group
Customer_
Age_Group
Customer_
Type
XX
XX
XX
XXXXXXXXXXXX
XXXXXXXXXXXX
XXXXXXXXXXXX
XXXXXXXXX
XXXXXXXXX
XXXXXXXXX
XXXXXXXX
XXXXXXXX
XXXXXXXX
p104d06
Customer_ID
Customer_Name
Customer_
Gender
Customer_
Country
Customer_Group
79
11171
46966
Najma Hicks
Bill Cuddy
Lauren Krasowski
F
M
F
US
CA
CA
70210
Alex Santinello
CA
Customer_
Age_Group
Customer_Type
15-30 years
15-30 years
15-30 years
15-30 years
Orion
Report width and wrapping depend on the LINESIZE system option, the ODS destination, or both.
ID Statement
The ID statement specifies the variable or variables
to print at the beginning of each row instead of an
observation number.
Note that the ID variable was removed from the VAR statement. If it were not removed, it would be
displayed twice: once as the leftmost column and again in the position specified by the VAR statement.
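The program for this slide is not reproduced in these notes. The sketch below shows how the ID statement might be combined with the WHERE and VAR statements for the business scenario, assuming the orion libref is assigned. The variable list follows the report layout shown in the scenario.

```sas
/* Assumption: the orion libref points to the course data.   */
proc print data=orion.customer_dim;
   where Customer_Age=21;
   /* Customer_ID replaces the Obs column as the row label.  */
   id Customer_ID;
   /* Customer_ID is left out of the VAR statement so that   */
   /* it is not displayed twice.                             */
   var Customer_Name Customer_Gender Customer_Country
       Customer_Group Customer_Age_Group Customer_Type;
run;
```

When a report wraps across several lines, the ID variable is repeated at the start of each section, which makes wide rows easier to follow.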
Customer_Name      Customer_Gender   Customer_Country   Customer_Group            Customer_Age_Group
Najma Hicks               F                 US          Orion Club members        15-30 years
Bill Cuddy                M                 CA          Orion Club Gold members   15-30 years
Lauren Krasowski          F                 CA          Orion Club members        15-30 years
Lera Knott                F                 CA          Orion Club members        15-30 years
Soberina Berent           F                 CA          Orion Club members        15-30 years
Alex Santinello           M                 CA          Orion Club members        15-30 years

Customer_Name      Customer_Type
Najma Hicks        Orion Club members medium activity
Bill Cuddy         Orion Club Gold members low activity
Lauren Krasowski   Orion Club members high activity
Lera Knott         Orion Club members medium activity
Soberina Berent    Orion Club members medium activity
Alex Santinello    Orion Club members medium activity
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Displaying orion.order_fact with the PRINT Procedure
a. Retrieve the starter program p104e01. Run the program and view the output. Observe that there
are 617 observations. Observations might be displayed over two lines, depending on output
settings.
b. Add a SUM statement to display the sum of Total_Retail_Price. The last several lines of the
report are shown below.
      Order_                             Total_Retail_   CostPrice_
Obs    Type    Product_ID     Quantity           Price     Per_Unit   Discount
610      1     240700100007       2             $45.70        $9.30       .
611      1     240700100017       2             $19.98       $11.40     40%
612      1     240700400003       2             $24.80        $5.60       .
613      1     240800100042       3            $760.80      $105.30       .
614      1     240500200016       3             $95.10       $14.50       .
615      1     240500200122       2             $48.20       $11.50       .
616      1     240700200018       4             $75.20       $10.30       .
617      1     220101400130       2             $33.80        $5.70       .
                                         =============
                                           $100,077.46
c. Add a WHERE statement to select only the observations with Total_Retail_Price more than 500.
Submit the program. Verify that 35 observations were displayed.
What do you notice about the Obs column? ___________________________________________
Did the sum of Total_Retail_Price change to reflect only the subset? ______________________
d. Add an option to suppress the Obs column. Verify that there are 35 observations in the results.
How can you verify the number of observations in the results? ____________________________
e. Add an ID statement to use Customer_ID as the identifying variable. Submit the program. The
results contain 35 observations.
How did the output change? _______________________________________________________
f. Add a VAR statement to display Customer_ID, Order_ID, Order_Type, Quantity, and
Total_Retail_Price.
What do you notice about Customer_ID? ___________________________________________
g. Modify the VAR statement to address the issue with Customer_ID.
Level 2
2. Displaying orion.customer_dim with the PRINT Procedure
a. Write a PRINT step to display orion.customer_dim.
Customer_ID  Customer_Name   Customer_Age  Customer_Type
4            James Kvarniq   33            Orion Club members low activity
9            Cornelia Krahl  33            Orion Club Gold members medium activity
11           Elke Wallstab   33            Orion Club members high activity
...
54655        Lauren Marx     38            Internet/Catalog Customers
70201        Angel Borwick   38            Orion Club Gold members low activity
Challenge
3. Producing a Default Listing Report of orion.order_fact
This exercise assumes you are creating LISTING output in the SAS Windowing
Environment.
a. Produce a default listing report of orion.order_fact. The output might wrap onto a second line.
b. Investigate the use of the LINESIZE= SAS system option to adjust the width of the lines. What
are the minimum and maximum values for the LINESIZE= option? ___________________
Submit an OPTIONS statement with LINESIZE= set to the highest allowed value. Resubmit the
step, and observe the horizontal scroll bar if displayed.
Reset the line size to 96 when finished.
c. Another way to create compact output is to request vertical headings. Investigate the HEADING=
option in the PROC PRINT statement, and then experiment with it to generate vertical headings,
and then horizontal headings.
How do you specify vertical headings? ______________________________________________
How do you specify horizontal headings?_____________________________________________
4. Producing a Default Listing Report of orion.product_dim
This exercise assumes you are creating LISTING output in the SAS Windowing
Environment.
a. Produce a default listing report to display orion.product_dim. Notice that the column width
varies from one page to the next, depending on the width of the values displayed on each page.
b. Investigate the WIDTH= option of the PROC PRINT statement, and modify the program to use
this option with a value of UNIFORM.
How are the results different when using the WIDTH=UNIFORM option? __________________
______________________________________________________________________________
c. Why might the procedure run more slowly with this option? ______________________________
______________________________________________________________________________
d. How can you save computer resources and still display columns consistently across pages? ____
______________________________________________________________________________
Business Scenario
Display observations from orion.sales in ascending order
by the variable Salary.
Employee_ID  Last_Name   Salary
999999       xxxxxxxxxx  99999
999999       xxxxxxxxxx  99999
999999       xxxxxxxxxx  99999

orion.sales -> PROC SORT -> work.sales -> PROC PRINT
p104d08
NOTE: There were 165 observations read from the data set ORION.SALES.
NOTE: The data set WORK.SALES has 165 observations and 9 variables.

Last_Name  Salary
Ould       22710
Polky      25110
Voron      25125
Favaron    95090
Zhou       108255
Highpoint  243190
SORT Procedure
The SORT procedure
replaces the original data set or creates a new one
can sort on multiple variables
sorts in ascending (default) or descending order
does not generate printed output.
The input data set is overwritten unless the OUT=
option is used to specify an output data set.
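As a minimal sketch of the points above, the following step sorts orion.sales by Salary and writes the result to work.sales, so the input data set is not overwritten. The VAR list in the PROC PRINT step is an illustrative choice.

```sas
/* OUT= names a separate output data set; without it,
   the data set named in DATA= is replaced by the sorted copy. */
proc sort data=orion.sales out=work.sales;
   by Salary;                 /* ascending order is the default */
run;

/* PROC SORT produces no report, so print the sorted data set. */
proc print data=work.sales;
   var Employee_ID Last_Name Salary;
run;
```

Writing `by descending Salary;` instead would reverse the order for that variable.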
4.05 Quiz
Which step sorts the observations in a SAS data set and
overwrites the same data set?
a. Answer
proc sort data=work.EmpsAU
out=work.sorted;
by First;
run;
Business Scenario
Produce a report that lists sales employees grouped by
Country, in descending Salary order within country.
-------------------------------------- Country=AU --------------------------------------
Employee_ID  First_Name  Last_Name  Gender  Salary  Job_Title  Birth_Date  Hire_Date
9999         xxxx        xxxxx      x       99999   xxxxxx     9999        9999
9999         xxxx        xxxxx      x       99999   xxxxxx     9999        9999

-------------------------------------- Country=US --------------------------------------
Employee_ID  First_Name  Last_Name  Gender  Salary  Job_Title  Birth_Date  Hire_Date
9999         xxxx        xxxxx      x       99999   xxxxxx     9999        9999
9999         xxxx        xxxxx      x       99999   xxxxxx     9999        9999
9999         xxxx        xxxxx      x       99999   xxxxxx     9999        9999
p104d09
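A grouped report of this kind could be produced along the following lines. This is a sketch under the assumption that the input is orion.sales and that the Obs column is suppressed, as the report layout suggests; it is not necessarily identical to the program named above.

```sas
/* The data must be sorted by the BY variable before PROC PRINT
   can group on it. DESCENDING applies only to the variable
   that immediately follows it. */
proc sort data=orion.sales out=work.sales;
   by Country descending Salary;
run;

/* The BY statement prints a separate section for each Country. */
proc print data=work.sales noobs;
   by Country;
run;
```

If the BY statement names a variable that the data set is not sorted by, the step stops with a "not sorted in ascending sequence" error.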
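----------------------------- Country=AU -----------------------------
First_Name  Last_Name  Gender  Salary  ...  Hire_Date
Tom         Zhou       M       108255  ...  12205
Wilson      Dawes      M       87975   ...  6575
Selina      Barcoe     M       36605   ...  18567

----------------------------- Country=US -----------------------------
First_Name  Last_Name  Gender  Salary  ...  Hire_Date
Harry       Highpoint  M       243190  ...  11535
Louis       Favaron    M       95090   ...  15157
Asishana    Polky      M       84260   ...  13027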
4.06 Quiz
Open and submit p104a02. View the log.
Why did the program fail?
Business Scenario
Modify the previous report to display selected variables, the
salary subtotal for each country, and the salary grand total.
-------------------- Country=AU --------------------
First_Name  Last_Name  Gender  Salary
XXXX        XXXXXXX    X        99999
XXXX        XXXXXXX    X        99999
-------                       -------
Country                        999999   (subtotal)

-------------------- Country=US --------------------
First_Name  Last_Name  Gender  Salary
XXXX        XXXXXXX    X        99999
XXXX        XXXXXXX    X        99999
-------                       -------
Country                        999999   (subtotal)
                              =======
                              9999999   (grand total)
Generating Subtotals
Use a BY statement and a SUM statement in a PROC
PRINT step.
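A minimal sketch of this combination follows, assuming the sales data has already been sorted by Country into work.sales. The VAR list is inferred from the report layout in the business scenario.

```sas
proc sort data=orion.sales out=work.sales;
   by Country descending Salary;
run;

proc print data=work.sales noobs;
   by Country;        /* one section per country            */
   sum Salary;        /* subtotal per BY group, grand total */
   var First_Name Last_Name Gender Salary;
run;
```

With a BY statement present, the SUM statement prints a subtotal at the end of each BY group as well as the grand total at the end of the report.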
p104d10
-------------------- Country=AU --------------------
First_Name  Last_Name  Gender   Salary
Tom         Zhou       M        108255
Wilson      Dawes      M         87975
Daniel      Pilgrim    M         36605
...
Kimiko      Tilley               25185
-------                        -------
Country                        1900015   (subtotal for AU)

-------------------- Country=US --------------------
First_Name  Last_Name  Gender   Salary
Harry       Highpoint  M        243190
Louis       Favaron    M         95090
Dennis      Lansberry  M         84260
...
Tulsidas    Ould                 22710
-------                        -------
Country                        3241405   (subtotal for US)
                               =======
                               5141420   (grand total)
p104a03
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
5. Sorting orion.employee_payroll and Displaying the New Data Set
a. Open p104e05. Add a PROC SORT step before the PROC PRINT step to sort
orion.employee_payroll by Salary, placing the sorted observations into a temporary data set
named sort_salary.
b. Modify the PROC PRINT step to display the new data set. Verify that your output matches the
report below.
Obs  Employee_ID  Employee_Gender  Salary  Birth_Date  Employee_Hire_Date  Employee_Term_Date  Marital_Status  Dependents
1    121084       M                22710   3150        12784               .                   M               3
2    120191       F                24015   1112        17167               17347               S               0
...
422  120261       M                243190  4800        11535               .                   O               1
423  120262       M                268455  5042        11932               .                   M               2
424  120259       M                433800  2946        12297               .                   M               1
6. Sorting orion.employee_payroll and Displaying Grouped Observations
a. Open p104e06. Add a PROC SORT step before the PROC PRINT step to sort
orion.employee_payroll by Employee_Gender, and within gender by Salary in descending
order. Place the sorted observations into a temporary data set named sort_salary2.
b. Modify the PROC PRINT step to display the new data set with the observations grouped by
Employee_Gender.
-------------------------------------- Employee_Gender=F --------------------------------------
Obs  Employee_ID  Salary  Birth_Date  Employee_Hire_Date  Employee_Term_Date  Marital_Status  Dependents
1    120260       207885  3258        10532               .                   M               2
2    120719       87420   4770        14641               .                   M               1
3    120661       85495   -400        10227               17347               M               3
...
190  120196       24025   10257       17167               17347               S               0
191  120191       24015   1112        17167               17347               S               0

-------------------------------------- Employee_Gender=M --------------------------------------
Obs  Employee_ID  Salary  Birth_Date  Employee_Hire_Date  Employee_Term_Date  Marital_Status  Dependents
192  120259       433800  2946        12297               .                   M               1
193  120262       268455  5042        11932               .                   M               2
...
423  120190       24100   10566       17837               18017               M               2
424  121084       22710   3150        12784               .                   M               3
Level 2
7. Sorting orion.employee_payroll and Displaying a Subset of the New Data Set
a. Sort orion.employee_payroll by Employee_Gender, and by descending Salary within gender.
Place the sorted observations into a temporary data set named sort_sal.
b. Print a subset of the sort_sal data set. Select only the observations for active employees (those
without a value for Employee_Term_Date) who earn more than $65,000. Group the report by
Employee_Gender, and include a total and subtotals for Salary. Suppress the Obs column.
Display only Employee_ID, Salary, and Marital_Status. The results contain 18 observations.
-------------------------------------- Employee_Gender=F --------------------------------------
Employee_ID      Salary   Marital_Status
120260           207885   M
120719            87420   M
...
120677            65555
---------------  -------
Employee_Gender   605190

-------------------------------------- Employee_Gender=M --------------------------------------
Employee_ID      Salary   Marital_Status
120259           433800   M
120262           268455   M
...
120268            76105
---------------  -------
Employee_Gender  2072410
                 =======
                 2677600
Challenge
8. Retaining the First Observation of Each BY Group
a. Sort orion.orders by Customer_ID. Place the sorted observations in a temporary data set.
b. Display the sorted data set. The resulting report should contain 490 observations. Customer_ID
is listed multiple times for customers that placed more than one order.
c. Investigate an option that causes PROC SORT to retain only the first observation in each BY
group.
d. Add the appropriate option to the PROC SORT step to retain only the first observation in each BY
group. The results contain 75 observations with no duplicate values for Customer_ID.
e. Explore the DUPOUT= option to write duplicate observations to a separate output data set.
Business Scenario
Enhance the payroll report by adding titles, footnotes, and
descriptive column headings.
Obs  Employee_ID  Last_Name   Salary
1    9999         xxxxxxxxxx  99999
2    9999         xxxxxxxxxx  99999
3    9999         xxxxxxxxxx  99999

Orion Star Sales Staff
Salary Report

Obs  Employee ID  Last Name   Annual Salary
1    9999         xxxxxxxxxx  99999
2    9999         xxxxxxxxxx  99999
3    9999         xxxxxxxxxx  99999

Confidential
p104d11
Obs  Employee_ID  Last_Name    Salary
1    120102       Zhou         108255
2    120103       Dawes        87975
3    120121       Elvish       26600
...
164  121144       Capachietti  83505
165  121145       Lansberry    84260

Confidential
TITLE Statement
The global TITLE statement specifies title lines for
SAS output.
TITLEn 'text';
FOOTNOTE Statement
The global FOOTNOTE statement specifies footnote lines
for SAS output.
FOOTNOTEn 'text';
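The two statements might be used together as in this sketch, which borrows the title and footnote text from the business scenario in this section. The VAR list is an illustrative choice.

```sas
/* TITLE and FOOTNOTE are global statements: they stay in effect
   for later steps until they are changed or cancelled. */
title1 'Orion Star Sales Staff';
title2 'Salary Report';
footnote1 'Confidential';

proc print data=orion.sales;
   var Employee_ID Last_Name Salary;
run;

/* Null statements cancel all titles and all footnotes. */
title;
footnote;
```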
This statement
changes title 1 and
cancels titles 2 and 3.
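The behavior described here can be tried with a short sketch. The title text is hypothetical and serves only to make the three lines distinguishable.

```sas
title1 'The First Line';
title2 'The Second Line';
title3 'The Third Line';

proc print data=orion.sales;
run;

/* Redefining title 1 replaces it and cancels every
   higher-numbered title (here, titles 2 and 3). */
title1 'The Top Line';

proc print data=orion.sales;
run;
```

The second report should show only the one new title line.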
title;
footnote;
Resultant Title(s)
title;
proc print data=orion.sales;
run;
4.08 Quiz
Which footnote or footnotes appear in the second
procedure output?
a. Non Sales Employees
b. Orion Star
   Non Sales Employees
c. Non Sales Employees
   Confidential
d. Orion Star
   Non Sales Employees
   Confidential
Idea Exchange
Which of the following programs do you prefer and why?
a. title 'Orion Star Employees';
proc print data=orion.staff;
where Gender='F';
var Employee_ID Salary;
run;
b.
c.
d.
LABEL variable-1='label' ... variable-n='label';
p104d12
LABEL Statement
The LABEL statement assigns descriptive labels to
variables.
A label can be up to 256 characters and include any
characters, including blanks.
Labels are used automatically by many procedures.
The PRINT procedure uses labels only when the
LABEL or SPLIT= option is specified.
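A minimal sketch of a LABEL statement in a PROC PRINT step follows. The label text is taken from the column headings of the report in this section; the VAR list is an assumption.

```sas
/* The LABEL option in the PROC PRINT statement is what makes
   PROC PRINT display labels instead of variable names. */
proc print data=orion.sales label;
   var Employee_ID Last_Name Salary;
   label Employee_ID='Sales ID'
         Last_Name='Last Name'
         Salary='Annual Salary';
run;
```

Omitting the LABEL option would leave the variable names as the column headings even though the LABEL statement is present.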
Orion Star Sales Staff
Salary Report

Obs  Sales ID  Last Name    Annual Salary
1    120102    Zhou         108255
2    120103    Dawes        87975
3    120121    Elvish       26600
...
164  121144    Capachietti  83505
165  121145    Lansberry    84260

Confidential
SPLIT= Option
The SPLIT= option in PROC PRINT specifies a split
character to control line breaks in column headings.
SPLIT='split-character'
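As a sketch, the split character is placed inside the label text wherever the heading should break. The asterisk is an arbitrary choice made for this example, and the labels mirror the headings in this section's report.

```sas
/* SPLIT= implies that labels are used, so the LABEL option
   is not needed in addition. */
proc print data=orion.sales split='*';
   var Employee_ID Last_Name Salary;
   label Employee_ID='Sales ID'
         Last_Name='Last Name'
         Salary='Annual*Salary';   /* heading breaks at the asterisk */
run;
```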
p104d13

                            Annual
Obs  Sales ID  Last Name    Salary
1    120102    Zhou         108255
2    120103    Dawes        87975
3    120121    Elvish       26600
...
164  121144    Capachietti  83505
165  121145    Lansberry    84260

Confidential
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
9. Displaying Titles and Footnotes in a Detail Report
a. Open and submit p104e09 to display all observations for Australian Sales Rep IVs.
b. Add a VAR statement to display only the variables shown in the report below.
c. Add TITLE and FOOTNOTE statements to include the titles and footnotes shown in the report
below.
d. Submit the program and verify the output. The results contain five observations as shown below.
e. Submit a null TITLE and null FOOTNOTE statement to clear all titles and footnotes.
Australian Sales Employees
Senior Sales Representatives

Obs  Employee_ID  First_Name  Last_Name   Gender  Salary
7    120125       Fong        Hofmeister  M       32040
10   120128       Monica      Kletschkus  F       30890
17   120135       Alexei      Platts      M       32490
41   120159       Lynelle     Phoumirath  F       30765
     120166       Fadi        Nowd                30660
Variable     Label
Employee_ID  Employee ID
First_Name   First Name
Last_Name    Last Name
Salary       Annual Salary

Employee ID  First Name  Last Name  Gender  Annual Salary
121023       Shawn       Fuller     M       26010
121028       William     Smades     M       26585
121029       Kuo-Chung   Mcelwee    M       27225
...
121138       Hershell    Tolley     M       27265
121140       Saunders    Briggi     M       26335
b. Modify the program to use a blank space as the SPLIT= character to generate two-line column
headings. Submit the modified program and verify that two-line column labels are displayed.
Entry-level Sales Representatives

Employee  First      Last              Annual
ID        Name       Name     Gender   Salary
121023    Shawn      Fuller   M        26010
121028    William    Smades   M        26585
121029    Kuo-Chung  Mcelwee  M        27225
...
121138    Hershell   Tolley   M        27265
121140    Saunders   Briggi   M        26335
Level 2
11. Writing an Enhanced Detail Report
Name                City       Zip Code
Amos, Salley        San Diego  92116
Apr, Nishan         San Diego  92071
Arizmendi, Gilbert  San Diego  91950
Armant, Debra       San Diego  92025
Bataineh, Perrior   San Diego  92126
...
4.4 Solutions
Solutions to Exercises
1. Displaying orion.order_fact with the PRINT Procedure
proc print data=orion.order_fact noobs;
where Total_Retail_Price>500;
id Customer_ID;
var Order_ID Order_Type Quantity Total_Retail_Price;
sum Total_Retail_Price;
run;
a. Run the program and view the output.
b. Add a SUM statement and verify the resulting sum.
c. What do you notice about the Obs column? The numbers are not sequential. The original
observation numbers are displayed.
Did the sum of Total_Retail_Price change to reflect only the subset? Yes
d. If the Obs column is suppressed, how can you verify the number of observations in the results?
Check the log.
e. When the ID statement was added, how did the output change? Customer_ID is the leftmost
column and is displayed on each line for an observation.
f. When the VAR statement is added, what do you notice about Customer_ID? There are two
Customer_ID columns. The first column is the ID field, and a second one is included
because Customer_ID is listed in the VAR statement.
g. Remove the duplicate column by removing Customer_ID from the VAR statement.
2. Displaying orion.customer_dim with the PRINT Procedure
proc print data=orion.customer_dim noobs;
where Customer_Age between 30 and 40;
id Customer_ID;
var Customer_Name Customer_Age Customer_Type;
run;
3. Producing a Default Listing Report of orion.order_fact (SAS Windowing Environment)
options ls=max;
proc print data=orion.order_fact;
run;
options ls=96;
proc print data=orion.order_fact heading=v;
run;
a. Submit a simple PROC PRINT step.
b. The minimum value for LINESIZE= is 64 and the maximum size is MAX.
Use this statement to reset the line size to 96: options ls=96;
c. HEADING=V forces all column headings to display vertically.
HEADING=H forces all column headings to display horizontally.
4. Producing a Default Listing Report of orion.product_dim (SAS Windowing Environment)
proc print data=orion.product_dim width=uniform;
run;
a. Submit a simple PROC PRINT step.
b. Add the WIDTH=uniform option. How are the results different? Each column has the same
column width on each page.
c. Why might the procedure run more slowly with this option? With this option, PROC PRINT
must read through the entire data set twice.
d. How can you save computer resources and still display columns consistently across pages? Use a
format on every column to explicitly specify a field width so that PROC PRINT reads the
data only once.
5. Sorting orion.employee_payroll and Displaying the New Data Set
proc sort data=orion.employee_payroll out=work.sort_salary;
by Salary;
run;
proc print data=work.sort_salary;
run;
6. Sorting orion.employee_payroll and Displaying Grouped Observations
proc sort data=orion.employee_payroll out=work.sort_salary2;
by Employee_Gender descending Salary;
run;
NOTE: There were 134 observations read from the data set ORION.SALES.
WHERE Salary<30000;
Name
Elvish, Irenie
Ngan, Christina
Hotstone, Kimiko
Daymond, Lucian
Hofmeister, Fong
Denny, Satyakam
Clarkson, Sharryn
Kletschkus, Monica
last name, first name
NOTE: There were 165 observations read from the data set ORION.SALES.
NOTE: The data set WORK.SORTED has 165 observations and 9 variables.
192
193
194
195
ERROR: Data set WORK.SORTED is not sorted in ascending sequence. The current
BY group has Gender = M and the next BY group has Gender = F.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 64 observations read from the data set WORK.SORTED.
Orion Star
Non Sales Employees
d.
Orion Star
Non Sales Employees
Confidential
Business Scenario
Enhance the appearance of variable values in reports.
Last_Name  First_Name  Country  Job_Title      Salary  Hire_Date
Zhou       Tom         AU       Sales Manager  108255  12205
Dawes      Wilson      AU       Sales Manager  87975   6575
Elvish     Irenie      AU       Sales Rep. II  26600   6575

Last_Name  First_Name  Country  Job_Title      Salary    Hire_Date
Zhou       Tom         AU       Sales Manager  $108,255  06/01/1993
Dawes      Wilson      AU       Sales Manager  $87,975   01/01/1978
Elvish     Irenie      AU       Sales Rep. II  $26,600   01/01/1978
SAS Formats
SAS formats can be used in a PROC step to change how
values are displayed in a report.
PROC Step
FORMAT
statement
variable values
FORMAT Statement
The FORMAT statement associates a format with a variable.
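A minimal sketch of a FORMAT statement in a PROC PRINT step follows. The two formats are the ones called out in this section's report (DOLLAR8. and MMDDYY10.); the VAR list and NOOBS option are inferred from the report columns.

```sas
proc print data=orion.sales noobs;
   var Last_Name First_Name Country Job_Title Salary Hire_Date;
   /* Each variable (or list of variables) is followed by the
      format to apply; the stored values are not changed. */
   format Salary dollar8. Hire_Date mmddyy10.;
run;
```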
p105d01
Last_Name  First_Name  Country  Job_Title      Salary    Hire_Date
Zhou       Tom         AU       Sales Manager  $108,255  06/01/1993
Dawes      Wilson      AU       Sales Manager  $87,975   01/01/1978
Elvish     Irenie      AU       Sales Rep. II  $26,600   01/01/1978
Ngan       Christina   AU       Sales Rep. II  $27,475   07/01/1982
Hotstone   Kimiko      AU       Sales Rep. I   $26,190   10/01/1989

Formats applied: DOLLAR8. (Salary), MMDDYY10. (Hire_Date)
What Is a Format?
A format is an instruction to write data values.
A format changes the appearance of a variable's value
in a report.
The values stored in the data set are not changed.
SAS Date
10866
Numeric
5950.35
01/10/1989
10Jan1989
5,950.35
$5,950.35
SAS Formats
SAS formats have the following form:
<$>format<w>.<d>
format
SAS Formats
Selected SAS formats:
Format
Definition
$w.
w.d
COMMAw.d
DOLLARw.d
COMMAXw.d
EUROXw.d
Format      Stored Value  Displayed Value
$4.         Programming   Prog
12.         27134.5864    27135
12.2        27134.5864    27134.59
COMMA12.2   27134.5864    27,134.59
DOLLAR12.2  27134.5864    $27,134.59
COMMAX12.2  27134.5864    27.134,59
EUROX12.2   27134.5864    €27.134,59
Format      Stored Value  Displayed Value
DOLLAR12.2  27134.5864    $27,134.59
DOLLAR9.2   27134.5864    $27134.59
DOLLAR8.2   27134.5864    27134.59
DOLLAR5.2   27134.5864    27135
DOLLAR4.2   27134.5864    27E3
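The effect of the format width can be checked with a short DATA step that writes one value to the log with each format. This is a sketch; the variable name Amount is hypothetical.

```sas
/* DATA _NULL_ runs the step without creating a data set.
   Each PUT statement writes the value to the log using the
   format that follows the variable name. */
data _null_;
   Amount=27134.5864;
   put Amount dollar12.2;
   put Amount dollar9.2;
   put Amount dollar8.2;
   put Amount dollar5.2;
   put Amount dollar4.2;
run;
```

The log lines can be compared with the table above to see how SAS drops the commas, the dollar sign, and then the decimal places as the width shrinks.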
5.01 Quiz
Use SAS documentation or the SAS Help Facility
to explore the Zw.d numeric format. What is it used for?
Hint: Search for Zw.d or explore Formats by Category.
Format     Stored Value  Displayed Value
MMDDYY10.  0             01/01/1960
MMDDYY8.   0             01/01/60
MMDDYY6.   0             010160
DDMMYY10.  365           31/12/1960
DDMMYY8.   365           31/12/60
DDMMYY6.   365           311260
Format     Stored Value  Displayed Value
DATE7.     -1            31DEC59
DATE9.     -1            31DEC1959
WORDDATE.  0             January 1, 1960
WEEKDATE.  0             Friday, January 1, 1960
MONYY7.    0             JAN1960
YEAR4.     0             1960
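The date formats can be explored the same way. In this sketch the variable names are hypothetical; the values 0, -1, and 365 are SAS date values, that is, day counts relative to January 1, 1960.

```sas
data _null_;
   FirstDay=0;      /* January 1, 1960   */
   DayBefore=-1;    /* December 31, 1959 */
   YearEnd=365;     /* December 31, 1960 */
   put FirstDay  mmddyy10.;
   put YearEnd   ddmmyy10.;
   put DayBefore date9.;
   put FirstDay  worddate.;
   put FirstDay  weekdate.;
   put FirstDay  monyy7.;
   put FirstDay  year4.;
run;
```

Changing only the format, never the stored value, is enough to move between the display styles.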
5.02 Quiz
Which FORMAT statement creates the output shown
below?
a. format Birth_Date Hire_Date mmddyy10.
          Term_Date monyy7.;

Birth_Date  Hire_Date   Term_Date
21/05/1969  15/10/1992  MAR2007
National Language Support (NLS) enables a software product to function properly in every global market
for which the product is targeted. SAS contains NLS features to ensure that SAS applications conform to
local language conventions.
NLS date formats convert SAS date values to a locale-sensitive date string. The LOCALE= system option
is used to specify the locale, which reflects the local conventions, language, and culture of a geographical
region. For example, a locale value of English_Canada represents the country of Canada with a language
of English. A locale value of French_Canada represents the country of Canada with a language of French.
The LOCALE= system option can be specified in a configuration file, at SAS invocation, or in the
OPTIONS statement. For more information, refer to SAS 9.3 National Language Support Reference
Guide in the SAS documentation.
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Displaying Formatted Values in a Detail Report
a. Open p105e01 and submit. Review the output.
b. Modify the PROC PRINT step to display only Employee_ID, Salary, Birth_Date, and
Employee_Hire_Date.
c. Add a FORMAT statement to display Salary in a dollar format, Birth_Date in 01/31/2012 date
style, and Employee_Hire_Date in the 01JAN2012 date style, as shown in the report below.
Obs  Employee_ID  Salary       Birth_Date  Employee_Hire_Date
1    120101       $163,040.00  08/18/1980  01JUL2007
2    120102       $108,255.00  08/11/1973  01JUN1993
3    120103       $87,975.00   01/22/1953  01JAN1978
...
423  121147       $29,145.00   05/28/1973  01SEP1991
424  121148       $52,930.00   01/01/1973  01JAN2002
Level 2
2. Displaying Formatted Values in a Detail Report
a. Write a PROC PRINT step to display the report below using orion.sales as input. Subset the
observations and variables to produce the report shown below. Include titles, labels, and formats.
The results contain 13 observations.
US Sales Employees
Earning Under $26,000
Employee_ID  First Name  Last Name  Title         Salary   Date Hired
121036       Teresa      Mesley     Sales Rep. I  $25,965  OCT2007
121038       David       Anstey     Sales Rep. I  $25,285  AUG2010
121044       Ray         Abbott     Sales Rep. I  $25,660  AUG1979
...
121106       James       Hilburger  Sales Rep. I  $25,880  FEB2000
121108       Libby       Levi       Sales Rep. I  $25,930  NOV2010
Challenge
3. Exploring Formats by Category
a. Display orion.sales as shown in the report below. Refer to SAS Help or product documentation to
explore the Dictionary of Formats and investigate SAS Formats by Category. Identify and use
the character format that displays values in uppercase and a format that displays a character value
in quotation marks. The results contain 165 observations.
Employee_ID  First_Name  Last_Name    Job_Title
120102       TOM         ZHOU         "Sales Manager"
120103       WILSON      DAWES        "Sales Manager"
120121       IRENIE      ELVISH       "Sales Rep. II"
...
121144       RENEE       CAPACHIETTI  "Sales Manager"
121145       DENNIS      LANSBERRY    "Sales Manager"
Business Scenario
Display country names instead of country codes in a
report.
Current Report (partial output)
Obs  Employee_ID  Salary    Country  Birth_Date  Hire_Date
1    120102       $108,255  AU       AUG1973     JUN1993
2    120103       $87,975   AU       JAN1953     JAN1978
3    120121       $26,600   AU       AUG1948     JAN1978

Desired Report (partial output)
Obs  Employee_ID  Salary    Country    Birth_Date  Hire_Date
1    120102       $108,255  Australia  AUG1973     JUN1993
2    120103       $87,975   Australia  JAN1953     JAN1978
3    120121       $26,600   Australia  AUG1948     JAN1978
p105d02
PROC FORMAT;
   VALUE format-name range1='label'
                     range2='label'
                     ...;
RUN;

proc format;
   value $ctryfmt 'AU'='Australia'
                  'US'='United States'
                  other='Miscoded';
run;
p105d03
Obs  Employee_ID  Salary    Country    Birth_Date  Hire_Date
1    120102       $108,255  Australia  AUG1973     JUN1993
2    120103       $87,975   Australia  JAN1953     JAN1978
3    120121       $26,600   Australia  AUG1948     JAN1978
4    120122       $27,475   Australia  JUL1958     JUL1982
5    120123       $26,190   Australia  SEP1968     OCT1989
VALUE Statement
VALUE format-name range1='label '
range2='label '
...;
A format name
can be up to 32 characters in length
for character formats, must begin with a dollar sign ($),
followed by a letter or underscore
for numeric formats, must begin with a letter or
underscore
cannot end in a number
cannot be given the name of a SAS format
cannot include a period in the VALUE statement.
Enclosing labels in quotation marks is a best practice, and it is required if a label contains internal
blanks.
$stfmt
$3levels
_4years
salranges
dollar
discrete
character
values
proc format;
value $ctryfmt 'AU'='Australia'
'US'='United States'
other='Miscoded';
run;
keyword
labels
p105d03
Applying a Format
User-defined and SAS formats can be applied in a single
FORMAT statement.
proc print data=orion.sales label;
var Employee_ID Job_Title Salary
Country Birth_Date Hire_Date;
format Salary dollar10.
Birth_Date Hire_Date monyy7.
Country $ctryfmt.;
run;
p105d03
Idea Exchange
The formatting examples shown in this section are
sometimes referred to as translating values.
Can you give an example of where this type of application
might be useful?
Business Scenario
An Orion Star manager wants a report showing employee
salaries collapsed into three user-defined groups or tiers.
Current Report
Obs  Employee_ID  Last_Name  Salary
1    120102       Zhou       108255
2    120103       Dawes      87975
3    120121       Elvish     26600
4    120122       Ngan       27475

Desired Report
Obs  Employee_ID  Last_Name  Salary
1    120102       Zhou       Tier 3
2    120103       Dawes      Tier 2
3    120121       Elvish     Tier 1
4    120122       Ngan       Tier 1
Salary Value          Label (assigned by PROC FORMAT)
20,000 to 49,999      Tier 1
50,000 to 99,999      Tier 2
100,000 to 250,000    Tier 3

proc format;
   value tiers 20000-49999='Tier 1'
               50000-99999='Tier 2'
               100000-250000='Tier 3';
run;
(tiers is the numeric format name; the quoted strings are the labels.)
p105d04
proc format;
value tiers 20000-49999 ='Tier 1'
50000-99999 ='Tier 2'
100000-250000='Tier 3';
run;
data work.salaries;
input Name $ Salary;
Original_Salary=Salary;
datalines;
Abi 50000
Mike 65000
Jose 50000.00
Joe 37000.50
Ursula 142000
Lu 49999.99
;
proc print data=work.salaries;
format Salary tiers.;
run;
1. Program p105d04 includes a PROC FORMAT step to create the TIERS format. The DATA step reads
data lines within the program to create a data set containing names and salaries. The PROC PRINT
step displays the new data set, applying the TIERS format. Notice in the program that Salary is
assigned to Original_Salary so that both can be included in the report. The assignment statement is
covered in a later chapter.
2. Look at the data values and predict what label will be displayed when the TIERS format is applied to
Salary. Lu's salary falls within a gap.
Name    Salary    Coded Salary
Abi     50000
Mike    65000
Jose    50000.00
Joe     37000.50
Ursula  142000
Lu      49999.99
3. Submit the program. What Salary value is displayed for Lu? ___________________
When a value does not match any of the ranges, PROC PRINT attempts to display the actual value. In
this case, the column width was determined by the width of the formatted values, which is 6. As
mentioned earlier, if the format width is not large enough to accommodate a numeric value, the
displayed value is automatically adjusted to fit in the width. What can you do to correct this?
4. Specify a width of 8 on the TIERS format in the FORMAT statement and resubmit the program.
proc print data=work.salaries;
format Salary tiers8.;
run;
Now what Salary value is displayed for Lu? ___________________
Range               Starting Value    Ending Value
50000 - 100000      Includes 50000    Includes 100000
50000 -< 100000     Includes 50000    Excludes 100000
50000 <- 100000     Excludes 50000    Includes 100000
50000 <-< 100000    Excludes 50000    Excludes 100000
The < symbol is used to define an exclusive range. The > symbol is not permitted in a VALUE statement.
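The gap that catches a value such as 49999.99 can be closed with exclusive ranges. The following is a minimal sketch, assuming the same tier boundaries as the TIERS format and the work.salaries data set from p105d04. The format name TIERSX is hypothetical and is used only so that the original TIERS format is not overwritten.

```sas
proc format;
   /* tiersx is a hypothetical name; -< excludes the ending value of a range */
   value tiersx 20000-<50000  ='Tier 1'
                50000-<100000 ='Tier 2'
                100000-250000 ='Tier 3';
run;

proc print data=work.salaries;
   /* Original_Salary stays unformatted for comparison */
   format Salary tiersx.;
run;
```

With these ranges, a salary of 49999.99 is labeled Tier 1 and a salary of exactly 50000 is labeled Tier 2.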
5.04 Quiz
How will a value of 50000 be displayed if the TIERS
format below is applied to the value?
a. Tier 1
b. Tier 2
c. 50000
d. a missing value
proc format;
value tiers
run;
40
p105d05
proc format;
value tiers
run;
42
Part 1
run;
43
p105d06
Obs  Employee_ID  Job_Title       Salary  Country  Birth_Date  Hire_Date
  1       120102  Sales Manager   Tier 3  AU       AUG1973     JUN1993
  2       120103  Sales Manager   Tier 2  AU       JAN1953     JAN1978
  3       120121  Sales Rep. II   Tier 1  AU       AUG1948     JAN1978
  4       120122  Sales Rep. II   Tier 1  AU       JUL1958     JUL1982
  5       120123  Sales Rep. I    Tier 1  AU       SEP1968     OCT1989
value tiers
'AU'='Australia'
'US'='United States'
other='Miscoded';
run;
p105d07
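A character format is named with a leading dollar sign, its data values are quoted, and the keyword OTHER labels every value that is not listed. The following is a minimal sketch of such a format applied to Country in orion.sales. The format name $CTRY is hypothetical and is not necessarily the name used in p105d07.

```sas
proc format;
   /* $ctry is a hypothetical name; the $ prefix marks a character format */
   value $ctry 'AU'='Australia'
               'US'='United States'
               other='Miscoded';
run;

proc print data=orion.sales;
   var Employee_ID Job_Title Salary Country;
   format Country $ctry.;
run;
```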
46
Obs  Employee_ID  Job_Title       Salary  Country    Birth_Date  Hire_Date
  1       120102  Sales Manager   Tier 3  Australia  AUG1973     JUN1993
  2       120103  Sales Manager   Tier 2  Australia  JAN1953     JAN1978
  3       120121  Sales Rep. II   Tier 1  Australia  AUG1948     JAN1978
  4       120122  Sales Rep. II   Tier 1  Australia  JUL1958     JUL1982
  5       120123  Sales Rep. I    Tier 1  Australia  SEP1968     OCT1989
p105d07
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
4. Creating User-Defined Formats
a. Retrieve the starter program p105e04.
b. Create a character format named $GENDER that displays gender codes as follows:
   F    Female
   M    Male
c. Create a numeric format named MNAME that displays month numbers as follows:
   1    January
   2    February
   3    March
d. Add a PROC PRINT step to display the data set, applying these two user-defined formats to the
Employee_Gender and BirthMonth variables, respectively.
e. Submit the program to produce the following report. The results contain 113 observations.
Employees with Birthdays in Q1

Obs  Employee_ID  Employee_Gender  BirthMonth
  1       120103  Male             January
  2       120107  Female           January
  3       120108  Female           February
...
112       121142  Male             February
113       121148  Male             January
Level 2
5. Defining Ranges in User-Defined Formats
a. Retrieve the starter program p105e05.
b. Create a character format named $GENDER that displays gender codes as follows:
   F                  Female
   M                  Male
   Any other value    Invalid code
c. Create a numeric format named SALRANGE that displays salary ranges as follows:
   At least 20,000 but less than 100,000    Below $100,000
   At least 100,000 and up to 500,000       $100,000 or more
   missing                                  Missing salary
   Any other value                          Invalid salary
d. In the PROC PRINT step, apply these two user-defined formats to the Gender and Salary
variables, respectively. Submit the program to produce the following report:
Partial PROC PRINT Output
Salary and Gender Values
for Non-Sales Employees
Obs
Employee_ID
1
2
3
4
5
6
7
8
9
10
11
12
13
120101
120104
120105
120106
120107
120108
120108
120110
120111
120112
120113
120114
120115
Job_Title
Director
Administration Manager
Secretary I
Office Assistant II
Office Assistant III
Warehouse Assistant II
Warehouse Assistant I
Warehouse Assistant III
Security Guard II
Security Guard II
Security Manager
Service Assistant I
Salary
$100,000 or more
Below $100,000
Below $100,000
Missing salary
Below $100,000
Below $100,000
Below $100,000
Below $100,000
Below $100,000
Below $100,000
Below $100,000
Below $100,000
Invalid salary
Gender
Male
Female
Female
Male
Female
Female
Female
Male
Male
Female
Female
Invalid code
Male
Challenge
6. Exploring Format Storage Options
User-defined formats are stored in the formats catalog in the work library, work.formats. Use the
SAS Help Facility or product documentation to explore permanent format catalogs in PROC
FORMAT.
What option enables you to store the formats in a permanent library? _______________________
What option causes SAS to look for formats in permanent libraries? _______________________
5.3 Solutions
Solutions to Exercises
1. Displaying Formatted Values in a Detail Report
proc print data=orion.employee_payroll;
var Employee_ID Salary Birth_Date Employee_Hire_Date;
format Salary dollar11.2 Birth_Date mmddyy10.
Employee_Hire_Date date9.;
run;
a. Submit p105e01
b. Add a VAR statement.
c. Add a FORMAT statement.
2. Displaying Formatted Values in a Detail Report
title1 'US Sales Employees';
title2 'Earning Under $26,000';
proc print data=orion.sales label noobs;
where Country='US' and Salary<26000;
var Employee_ID First_Name Last_Name Job_Title Salary Hire_Date;
label First_Name='First Name'
Last_Name='Last Name'
Job_Title='Title'
Hire_Date='Date Hired';
format Salary dollar10. Hire_Date monyy7.;
run;
title;
footnote;
3. Exploring Formats by Category
proc print data=orion.sales noobs;
var Employee_ID First_Name Last_Name Job_Title;
format First_Name Last_Name $upcase. Job_Title $quote.;
run;
4. Creating User-Defined Formats
data Q1Birthdays;
set orion.employee_payroll;
BirthMonth=month(Birth_Date);
if BirthMonth le 3;
run;
proc format;
value $gender
'F'='Female'
'M'='Male';
value mname
1='January'
2='February'
3='March';
run;
title 'Employees with Birthdays in Q1';
proc print data=Q1Birthdays;
var Employee_ID Employee_Gender BirthMonth;
format Employee_Gender $gender.
BirthMonth mname.;
run;
title;
5. Defining Ranges in User-Defined Formats
proc format;
value $gender
'F'='Female'
'M'='Male'
other='Invalid code';
value salrange .='Missing salary'
20000-<100000='Below $100,000'
100000-500000='$100,000 or more'
other='Invalid salary';
run;
title1 'Salary and Gender Values';
title2 'for Non-Sales Employees';
proc print data=orion.nonsales;
var Employee_ID Job_Title Salary Gender;
format Salary salrange. Gender $gender.;
run;
title;
6. Exploring Format Storage Options
What option enables you to store the formats in a permanent library? LIBRARY=
What option causes SAS to look for formats in permanent libraries? FMTSEARCH=
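The two options can be combined as in the following minimal sketch. It assumes that the orion libref is already assigned (for example, by libname.sas) and that you have write access to that library. The choice of orion as the storage library is an assumption made for illustration.

```sas
/* Store the format in the permanent catalog orion.formats */
proc format library=orion;
   value $gender 'F'='Female'
                 'M'='Male'
                 other='Invalid code';
run;

/* Tell SAS to search orion.formats when it looks up a format */
options fmtsearch=(orion);

proc print data=orion.nonsales;
   var Employee_ID Gender;
   format Gender $gender.;
run;
```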
14
Birth_Date
Hire_Date
Term_Date
21/05/1969
15/10/1992
MAR2007
18
$stfmt
$3levels
_4years
salranges
dollar
Tier 1
Tier 2
50000
a missing value
proc format;
value tiers
run;
41
p105d05
3
3
Business Scenario
Information about Orion Star sales employees resides in
several input sources.
4
4
Considerations
Management wants a series of reports for Australian
sales employees. You will read data from various input
sources to create a SAS data set that can be analyzed
and presented.
5
5
6
6
DATA Step
work.subset1
Rep
7
7
DATA output-SAS-data-set;
SET input-SAS-data-set;
WHERE WHERE-expression;
RUN;
p106d01
DATA Statement
The DATA statement begins a DATA step and provides
the name of the SAS data set to create.
DATA output-SAS-data-set;
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
run;
9
9
SET Statement
The SET statement reads observations from an existing
SAS data set for further processing in the DATA step.
SET input-SAS-data-set;
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
run;
10
10
WHERE Statement
The WHERE statement selects observations from
a SAS data set that meet a particular condition.
WHERE WHERE-expression;
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
run;
p106d01
11
11
Using a WHERE statement might improve the efficiency of your SAS programs because SAS only
processes the observations that meet the condition or conditions in the WHERE expression.
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
run;
NOTE: There were 61 observations read from the data set ORION.SALES.
WHERE (Country='AU') and Job_Title contains 'Rep';
NOTE: The data set WORK.SUBSET1 has 61 observations and 9 variables.
12
12
First_Name  Last_Name   Gender  Salary  Job_Title       Country  Birth_Date  Hire_Date
Irenie      Elvish      F        26600  Sales Rep. II   AU            -4169       6575
Christina   Ngan        F        27475  Sales Rep. II   AU             -523       8217
Kimiko      Hotstone    F        26190  Sales Rep. I    AU             3193      10866
Lucian      Daymond     M        26480  Sales Rep. I    AU             1228       8460
Fong        Hofmeister  M        32040  Sales Rep. IV   AU             -391       8460
p106d01
13
13
14
p106a01
Considerations
Subsetting is based on Hire_Date, which contains
a SAS date value. How can you compare a SAS date
value to a calendar date?
Use a SAS date constant.
19
19
Date Constant
A date constant can be used in any SAS expression,
including a WHERE expression.
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep' and
Hire_Date<'01jan2000'd;
run;
20
p106d02
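A date constant is written as a quoted date followed by the letter d, and SAS stores it as a numeric SAS date value (the number of days since January 1, 1960). The following minimal sketch writes the value behind the constant to the log. The variable name Cutoff is hypothetical.

```sas
data _null_;
   /* Cutoff is a hypothetical variable name used only for this illustration */
   Cutoff='01jan2000'd;
   put Cutoff=;          /* the stored numeric SAS date value */
   put Cutoff= date9.;   /* the same value displayed as a calendar date */
run;
```

Because the constant is an ordinary number, it can be compared directly with Hire_Date in a WHERE expression.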
20
Considerations
Create a data set that includes the new variable, Bonus,
which represents a 10% bonus.
orion.sales
work.subset1
21
21
Assignment Statement
The assignment statement evaluates an expression
and assigns the result to a new or existing variable.
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep' and
Hire_Date<'01jan2000'd;
Bonus=Salary*.10;
run;
variable=expression;
22
p106d02a
22
Assignment Statement
The expression consists of operands and operators.
variable=expression;
Operands: character constants, numeric constants, date constants, character variables, numeric variables
Operators: + - * / ** ||, SAS functions
23
23
The operators can be character or arithmetic operators or SAS functions. A function is a routine that
accepts arguments, performs a calculation or manipulation using the arguments, and returns a single
value.
Example                    Type
Salary=26960;              Numeric constant
Gender='F';                Character constant
Hire_Date='21JAN1995'd;    Date constant
Bonus=Salary*.10;          Arithmetic expression
24
24
The MONTH function accepts a SAS date and returns the month portion of the date as an integer between
1 and 12. You investigate this and other SAS functions in a later chapter.
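As a minimal sketch of a function used in an assignment statement, the step below applies MONTH to a date constant and writes the result to the log. The variable name BonusMonth is hypothetical.

```sas
data _null_;
   Hire_Date='21JAN1995'd;
   /* BonusMonth is a hypothetical name; MONTH returns an integer from 1 to 12 */
   BonusMonth=month(Hire_Date);
   put BonusMonth=;
run;
```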
Arithmetic Operators
If any operand in an arithmetic expression has a missing
value, the result is a missing value.
Symbol    Definition        Priority
**        Exponentiation    I
*         Multiplication    II
/         Division          II
+         Addition          III
-         Subtraction       III
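The priority rules and the handling of missing values can be checked with a short step that writes to the log. This is a minimal sketch; the variable names A, B, and Rate are hypothetical, and the use of parentheses to override priority is standard SAS behavior rather than something shown in the table.

```sas
data _null_;
   Salary=26600;
   Rate=.;                /* numeric missing value */
   A=2+3*4**2;            /* 4**2 is evaluated first, then *3, then +2 */
   B=(2+3)*4**2;          /* the expression in parentheses is evaluated first */
   Bonus=Salary*Rate;     /* an operand is missing, so the result is missing */
   put A= B= Bonus=;
run;
```

SAS also writes a note to the log reporting that missing values were generated for the Bonus calculation.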
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep' and
Hire_Date<'01jan2000'd;
Bonus=Salary*.10;
run;
NOTE: There were 29 observations read from the data set ORION.SALES.
WHERE (Country='AU') and Job_Title contains 'Rep' and
(Hire_Date<'01JAN2000'D);
NOTE: The data set WORK.SUBSET1 has 29 observations and 10 variables.
The input data set has 9 variables, and the new data set
has 10 variables.
First_Name  Last_Name   Salary  Job_Title       Bonus   Hire_Date
Irenie      Elvish       26600  Sales Rep. II   2660.0  01JAN1978
Christina   Ngan         27475  Sales Rep. II   2747.5  01JUL1982
Kimiko      Hotstone     26190  Sales Rep. I    2619.0  01OCT1989
Lucian      Daymond      26480  Sales Rep. I    2648.0  01MAR1983
Fong        Hofmeister   32040  Sales Rep. IV   3204.0  01MAR1983
p106d02a
No format was specified for Bonus, so PROC PRINT uses the BESTw. format. One decimal position is
sufficient to display the values on this page.
6.03 Quiz
Evaluate the assignment statements below given the
values shown in the PDV.
x
y
.
a. num=y+z/2;
b. num=x+z/2;
z
10
28
28
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Creating a SAS Data Set
a. Retrieve and submit the starter program p106e01.
What is the name of the variable that contains gender values? ____________________________
What are the two observed gender values? ____________________________________________
b. Add a DATA step before the PROC PRINT step to create a new data set named work.youngadult
using the data set orion.customer_dim as input. Include a WHERE statement to select only
female customers.
Submit the program and confirm that work.youngadult was created with 30 observations and 11
variables.
c. Modify the program to select female customers whose age is between 18 and 36. Submit the
program and confirm that work.youngadult was created with 15 observations and 11 variables.
d. Modify the program to select 18- to 36-year-old female customers who have the word Gold in
their Customer_Group value. Submit the program and confirm that work.youngadult was
created with 5 observations and 11 variables.
e. Add an assignment statement to the DATA step to create a new variable, Discount, and assign it a
value of .25.
f. Modify the PROC PRINT step to print the new data set as shown below. Use an ID statement to
display Customer_ID instead of the Obs column. Results should contain five observations.
Customer_ID  Customer_Name       Customer_Age  Customer_Gender  Customer_Group            Discount
          5  Sandrina Stephano             28  F                Orion Club Gold members       0.25
          9  Cornelia Krahl                33  F                Orion Club Gold members       0.25
         45  Dianne Patchin                28  F                Orion Club Gold members       0.25
         49  Annmarie Leveille             23  F                Orion Club Gold members       0.25
       2550  Sanelisiwe Collier            19  F                Orion Club Gold members       0.25
Level 2
2. Creating a SAS Data Set
a. Write a DATA step to create a new data set named work.assistant using the data set orion.staff as
input.
b. The work.assistant data set should contain only the observations where Job_Title contains
Assistant and Salary is less than $26,000.
Job_Title              Salary      Increase   New_Salary
Warehouse Assistant I  $25,130.00  $2,513.00  $27,643.00
Warehouse Assistant I  $25,905.00  $2,590.50  $28,495.50
Warehouse Assistant I  $25,185.00  $2,518.50  $27,703.50
Service Assistant I    $25,195.00  $2,519.50  $27,714.50
Service Assistant I    $25,735.00  $2,573.50  $28,308.50
Challenge
3. Using the SOUNDS-LIKE Operator to Select Observations
a. Write a DATA step to create a new data set named work.tony using orion.customer_dim as
input.
b. Include a WHERE statement in the DATA step to select observations in which the
Customer_FirstName value sounds like Tony.
Documentation on the SOUNDS-LIKE operator can be found in the SAS Help facility or
product documentation by searching for sounds-like operator.
Customer_FirstName  Customer_LastName
Tonie               Asmussen
Tommy               Mcdonald
33
33
work.subset1
orion.sales
Employee_ID
Gender
Country
Birth_Date
34
34
DROP Statement
The DROP statement specifies the variables to exclude
from the output data set.
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
Bonus=Salary*.10;
drop Employee_ID Gender Country
Birth_Date;
run;
DROP variable-list;
Partial SAS Log
NOTE: There were 61 observations read from the data set ORION.SALES.
WHERE (Country='AU') and Job_Title contains 'Rep';
NOTE: The data set WORK.SUBSET1 has 61 observations and 6 variables.
p106d03
35
35
Obs  First_Name  Last_Name   Salary  Job_Title       Hire_Date   Bonus
  1  Irenie      Elvish       26600  Sales Rep. II        6575  2660.0
  2  Christina   Ngan         27475  Sales Rep. II        8217  2747.5
  3  Kimiko      Hotstone     26190  Sales Rep. I        10866  2619.0
  4  Lucian      Daymond      26480  Sales Rep. I         8460  2648.0
  5  Fong        Hofmeister   32040  Sales Rep. IV        8460  3204.0
p106d03
KEEP Statement
The KEEP statement specifies all variables to include in
the output data set.
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
Bonus=Salary*.10;
keep First_Name Last_Name Salary
Job_Title Hire_Date Bonus;
run;
KEEP variable-list;
37
37
When you use a KEEP statement, be sure to name every variable to be written to the new SAS data set,
including any variables created within the step, such as Bonus.
38
38
Obs  First_Name  Last_Name   Salary  Job_Title       Hire_Date   Bonus
  1  Irenie      Elvish       26600  Sales Rep. II        6575  2660.0
  2  Christina   Ngan         27475  Sales Rep. II        8217  2747.5
  3  Kimiko      Hotstone     26190  Sales Rep. I        10866  2619.0
  4  Lucian      Daymond      26480  Sales Rep. I         8460  2648.0
  5  Fong        Hofmeister   32040  Sales Rep. IV        8460  3204.0
p106d04
Execution Phase
42
42
Compilation Phase
Salary
43
43
Compilation
data work.subset1;
set orion.sales;
where Country='AU' and
Job_Title contains 'Rep';
Bonus=Salary*.10;
drop Employee_ID Gender Country
Birth_Date;
run;
p106d03
...
44
44
Compilation
The SET statement adds the variables in orion.sales to the PDV.
PDV
Variable      Type and Length
Employee_ID   N 8
First_Name    $ 12
Last_Name     $ 18
Gender        $ 1
Salary        N 8
Job_Title     $ 25
Country       $ 2
Birth_Date    N 8
Hire_Date     N 8
Compilation
The assignment statement adds the new variable Bonus to the PDV.
PDV
Variable      Type and Length
Employee_ID   N 8
First_Name    $ 12
Last_Name     $ 18
Gender        $ 1
Salary        N 8
Job_Title     $ 25
Country       $ 2
Birth_Date    N 8
Hire_Date     N 8
Bonus         N 8
Compilation
The DROP statement sets a drop flag (D) on each variable that it names.
PDV
Variable      Type and Length   Drop Flag
Employee_ID   N 8               D
First_Name    $ 12
Last_Name     $ 18
Gender        $ 1               D
Salary        N 8
Job_Title     $ 25
Country       $ 2               D
Birth_Date    N 8               D
Hire_Date     N 8
Bonus         N 8
Compilation
At the end of compilation, the descriptor portion of work.subset1 is created from the PDV variables that are not flagged to be dropped.
PDV
Variable      Type and Length   Drop Flag
Employee_ID   N 8               D
First_Name    $ 12
Last_Name     $ 18
Gender        $ 1               D
Salary        N 8
Job_Title     $ 25
Country       $ 2               D
Birth_Date    N 8               D
Hire_Date     N 8
Bonus         N 8
Descriptor portion of work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Execution Phase
Compile the step
Compilation Phase
Success?
No
Next step
Yes
Execution Phase
Yes
Next step
No
Execution
Initialize PDV
data work.subset1;
   set orion.sales;
   where Country='AU' and
         Job_Title contains 'Rep';
   Bonus=Salary*.10;
   drop Employee_ID Gender Country
        Birth_Date;
run;
Partial orion.sales
Employee_ID  ...  Hire_Date
     120121  ...       6575
     120122  ...       8217
     120123  ...      10866
     120124  ...       8460
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
          .  ...       .  ...                    .          .      .
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Execution
The SET statement reads the first selected observation into the PDV.
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
     120121  ...   26600  ...  AU            -4169       6575      .
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Execution
The assignment statement computes Bonus and stores it in the PDV.
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
     120121  ...   26600  ...  AU            -4169       6575   2660
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Execution
At the RUN statement, an implicit OUTPUT writes the observation to work.subset1 without the dropped variables, and an implicit RETURN sends processing back to the top of the DATA step.
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
     120121  ...   26600  ...  AU            -4169       6575   2660
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Irenie      Elvish     ...     ...             6575   2660
Execution
Reinitialize PDV
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
     120121  ...   26600  ...  AU            -4169       6575      .
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Irenie      Elvish     ...     ...             6575   2660
Only the new variables are reinitialized. The variables that come from the input data set are not
reinitialized because they are overwritten when the next observation is read into the PDV. Values in the
PDV are overwritten even if values in the next observation are missing.
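The reinitialization described above can be observed by writing PDV values to the log before and after the assignment statement. The following is a minimal sketch, assuming orion.sales is available. DATA _NULL_ is used so that no data set is created, and the _N_ condition simply limits the log output to the first three iterations.

```sas
data _null_;
   set orion.sales;
   where Country='AU' and
         Job_Title contains 'Rep';
   /* Bonus is created in this step, so it is missing at the top of each iteration */
   if _n_ le 3 then put 'Before: ' _n_= Employee_ID= Salary= Bonus=;
   Bonus=Salary*.10;
   if _n_ le 3 then put 'After:  ' _n_= Employee_ID= Salary= Bonus=;
run;
```

In each iteration the Before line shows a missing Bonus, while Employee_ID and Salary already hold the values read by the SET statement.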
Execution
The SET statement reads the next selected observation into the PDV.
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
     120122  ...   27475  ...  AU             -523       8217      .
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Irenie      Elvish     ...     ...             6575   2660
Execution
The assignment statement computes Bonus for the second observation.
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date   Bonus
     120122  ...   27475  ...  AU             -523       8217  2747.5
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date  Bonus
Irenie      Elvish     ...     ...             6575   2660
Execution
At the RUN statement, the implicit OUTPUT writes the second observation to work.subset1, and the implicit RETURN sends processing back to the top of the DATA step.
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date   Bonus
     120122  ...   27475  ...  AU             -523       8217  2747.5
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date    Bonus
Irenie      Elvish     ...     ...             6575  2660.00
Christina   Ngan       ...     ...             8217  2747.50
Execution
Continue until EOF
PDV
Employee_ID  ...  Salary  ...  Country  Birth_Date  Hire_Date  Bonus
     120122  ...   27475  ...  AU             -523       8217
work.subset1
First_Name  Last_Name  Salary  Job_Title  Hire_Date    Bonus
Irenie      Elvish     ...     ...             6575  2660.00
Christina   Ngan       ...     ...             8217  2747.50
First_Name  Last_Name   Salary  Job_Title       Hire_Date   Bonus
Irenie      Elvish       26600  Sales Rep. II        6575  2660.0
Christina   Ngan         27475  Sales Rep. II        8217  2747.5
Kimiko      Hotstone     26190  Sales Rep. I        10866  2619.0
Lucian      Daymond      26480  Sales Rep. I         8460  2648.0
Fong        Hofmeister   32040  Sales Rep. IV        8460  3204.0
p106d03
orion.sales
work.auemps
Bonus
61
61
Selecting Observations
Subsetting is based on the new variable, Bonus, that is
created with an assignment statement.
data work.auemps;
set orion.sales;
where Country='AU';
Bonus=Salary*.10;
drop Employee_ID Gender Country
Birth_Date;
run;
A WHERE statement is used to subset observations when
the selected variables exist in the input data set.
62
p106d03
6.04 Quiz
Open and submit p106a03. Is the output data set created
successfully?
data work.usemps;
set orion.sales;
Bonus=Salary*.10;
where Country='US' and Bonus>=3000;
run;
p106a03
63
Subsetting IF
The subsetting IF statement tests a condition to determine
whether the DATA step should continue processing the
current observation.
data work.auemps;
set orion.sales;
where Country='AU';
Bonus=Salary*.10;
if Bonus>=3000;
run;
IF condition;
In this program, processing will reach the bottom of the
DATA step and output an observation only if the condition
is true.
65
p106d05
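The same subset can also be produced by deleting the observations that fail the condition. The following is a minimal sketch of that alternative using an IF-THEN statement with DELETE. The output name work.auemps_del is hypothetical and is used so that work.auemps is not overwritten.

```sas
data work.auemps_del;   /* hypothetical output name */
   set orion.sales;
   where Country='AU';
   Bonus=Salary*.10;
   /* selects the same observations as: if Bonus>=3000; */
   if Bonus<3000 then delete;
run;
```

DELETE stops processing the current observation without writing it, so only observations with a Bonus of at least 3000 reach the implicit OUTPUT. A missing Bonus compares as less than 3000, so those observations are excluded in both forms.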
65
data work.auemps;
set orion.sales;
where Country='AU';
Bonus=Salary*.10;
if Bonus>=3000;
run;
NOTE: There were 63 observations read from the data set ORION.SALES.
WHERE Country='AU';
NOTE: The data set WORK.AUEMPS has 12 observations and 10 variables.
First_Name  Last_Name     Salary    Bonus
Tom         Zhou          108255  10825.5
Wilson      Dawes          87975   8797.5
Fong        Hofmeister     32040   3204.0
Monica      Kletschkus     30890   3089.0
Alvin       Roebuck        30070   3007.0
Alexei      Platts         32490   3249.0
Viney       Barbis         30265   3026.5
Caterina    Hayawardhana   30490   3049.0
Daniel      Pilgrim        36605   3660.5
Lynelle     Phoumirath     30765   3076.5
Rosette     Martines       30785   3078.5
Fadi        Nowd           30660   3066.0
Read an Observation -> IF Expression
   False: return to read the next observation
   True:  Continue Processing the Observation -> Output Observation to SAS Data Set
A subsetting IF statement is valid only in a DATA step.
Idea Exchange
File p106a04 contains two versions of the previous
program. Submit both programs and compare the output
and number of observations read. What do you notice
about the results?
data work.auemps;
set orion.sales;
Bonus=Salary*.10;
if Country='AU' and Bonus>=3000;
run;
69
69
                                   WHERE   IF
PROC step                          Yes     No
DATA step (source of variable)
   SET statement                   Yes     Yes
   assignment statement            No      Yes
work.subset1
formats
LABEL Statement
The LABEL statement assigns descriptive labels to
variables.
DATA Step
LABEL
statement
Sales Title
Date Hired
work.subset1
First_Name
Last_Name
Salary
Job_Title
Hire_Date
Bonus
73
p106d06
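A LABEL statement in the DATA step stores the labels in the descriptor portion of the new data set. The following is a minimal sketch that is consistent with the labels shown for work.subset1 (Sales Title and Date Hired). It is an assumption about how those labels are assigned, not a copy of p106d06.

```sas
data work.subset1;
   set orion.sales;
   where Country='AU' and
         Job_Title contains 'Rep';
   Bonus=Salary*.10;
   /* permanent labels, stored with the data set */
   label Job_Title='Sales Title'
         Hire_Date='Date Hired';
   drop Employee_ID Gender Country Birth_Date;
run;

/* the Label column of the PROC CONTENTS report shows the stored labels */
proc contents data=work.subset1;
run;
```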
#  Variable    Type  Len  Label
6  Bonus       Num     8
1  First_Name  Char   12
5  Hire_Date   Num     8  Date Hired
4  Job_Title   Char   25  Sales Title
2  Last_Name   Char   18
3  Salary      Num     8
p106d06
First_Name  Last_Name   Salary  Sales Title     Date Hired   Bonus
Irenie      Elvish       26600  Sales Rep. II         6575  2660.0
Christina   Ngan         27475  Sales Rep. II         8217  2747.5
Kimiko      Hotstone     26190  Sales Rep. I         10866  2619.0
Lucian      Daymond      26480  Sales Rep. I          8460  2648.0
Fong        Hofmeister   32040  Sales Rep. IV         8460  3204.0
p106d06
                                Sales            Date
First_Name  Last_Name   Salary  Title           Hired   Bonus
Irenie      Elvish       26600  Sales Rep. II    6575  2660.0
Christina   Ngan         27475  Sales Rep. II    8217  2747.5
Kimiko      Hotstone     26190  Sales Rep. I    10866  2619.0
Lucian      Daymond      26480  Sales Rep. I     8460  2648.0
Fong        Hofmeister   32040  Sales Rep. IV    8460  3204.0
p106d06
PROC PRINT is the only SAS procedure that requires either the LABEL option or the SPLIT= option to
use custom labels.
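The two options can be compared with a pair of PROC PRINT steps. This is a minimal sketch, assuming work.subset1 already has permanent labels such as Sales Title and Date Hired. The choice of a blank as the split character is an assumption made for illustration.

```sas
/* LABEL option: each label is displayed as the column heading */
proc print data=work.subset1 label;
run;

/* SPLIT= option: labels are displayed and broken at each blank */
proc print data=work.subset1 split=' ';
run;
```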
6.05 Quiz
What column heading will be displayed for Job_Title in the
program below?
data work.us;
set orion.sales;
where Country='US';
Bonus=Salary*.10;
label Job_Title='Sales Title';
drop Employee_ID Gender Country
Birth_Date;
run;
proc print data=work.us label;
label Job_Title='Title';
run;
78
p106a05
FORMAT Statement
The FORMAT statement associates formats with
variables.
DATA Step
FORMAT
statement
commax8.
ddmmyy10.
commax8.2
work.subset1
First_Name
Last_Name
Irenie
Elvish
Salary
Job_Title
Hire_Date
6575
Bonus
2660.0
80
p106d07
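A FORMAT statement in the DATA step stores the formats in the descriptor portion of the new data set, so every later procedure uses them by default. The following is a minimal sketch that is consistent with the formats reported for work.subset1 (COMMAX8., DDMMYY10., and COMMAX8.2). It is an assumption about the program, not a copy of p106d07.

```sas
data work.subset1;
   set orion.sales;
   where Country='AU' and
         Job_Title contains 'Rep';
   Bonus=Salary*.10;
   label Job_Title='Sales Title'
         Hire_Date='Date Hired';
   /* permanent formats, stored with the data set */
   format Salary commax8.
          Hire_Date ddmmyy10.
          Bonus commax8.2;
   drop Employee_ID Gender Country Birth_Date;
run;
```

The COMMAX format reverses the roles of the comma and the period, which is why a salary of 26600 is displayed as 26.600.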
#  Variable    Type  Len  Format     Label
6  Bonus       Num     8  COMMAX8.2
1  First_Name  Char   12
5  Hire_Date   Num     8  DDMMYY10.  Date Hired
4  Job_Title   Char   25             Sales Title
2  Last_Name   Char   18
3  Salary      Num     8  COMMAX8.
p106d07
First_Name  Last_Name   Salary  Sales Title     Date Hired     Bonus
Irenie      Elvish      26.600  Sales Rep. II   01/01/1978  2.660,00
Christina   Ngan        27.475  Sales Rep. II   01/07/1982  2.747,50
Kimiko      Hotstone    26.190  Sales Rep. I    01/10/1989  2.619,00
Lucian      Daymond     26.480  Sales Rep. I    01/03/1983  2.648,00
Fong        Hofmeister  32.040  Sales Rep. IV   01/03/1983  3.204,00
p106d07
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
4. Subsetting Observations Based on Two Conditions
a. Retrieve the starter program p106e04.
b. Modify the DATA step to select only the observations with Emp_Hire_Date values on or after
July 1, 2010. Subset the observations as they are being read into the program data vector.
c. In the DATA step, write another statement to select only the observations that have an increase
greater than 3000.
d. The new data set should contain only the following variables: Employee_ID, Emp_Hire_Date,
Salary, Increase, and NewSalary.
e. Add permanent labels for Employee_ID, Emp_Hire_Date, and NewSalary as shown in the
report below.
f. Add permanent formats to display Salary and NewSalary with dollar signs, commas, and two
decimal places, and Increase with commas and no decimal places.
g. Submit a PROC CONTENTS step to verify that the labels and formats are stored in the descriptor
portion of the new data set, work.increase.
Partial PROC CONTENTS Output
Alphabetic List of Variables and Attributes
#  Variable       Type  Len  Format      Informat  Label
3  Emp_Hire_Date  Num     8  DATE9.      DATE9.    Hire Date
1  Employee_ID    Num     8  12.                   Employee ID
4  Increase       Num     8  COMMA5.
5  NewSalary      Num     8  DOLLAR10.2            New Annual Salary
2  Salary         Num     8  DOLLAR10.2            Employee Annual Salary
h. Some variables have labels and formats that were not defined in this program. How were these
created? ____________________________________________________________________
i. Submit the program to create the PROC PRINT report below, with labels split over multiple lines.
Results should contain 10 observations.
                   Employee                               New
       Employee    Annual        Hire                     Annual
Obs    ID          Salary        Date         Increase    Salary
1      120128      $30,890.00    01NOV2010    3,089       $33,979.00
2      120144      $30,265.00    01OCT2010    3,027       $33,291.50
3      120161      $30,785.00    01OCT2010    3,079       $33,863.50
...
9      121085      $32,235.00    01JAN2011    3,224       $35,458.50
10     121107      $31,380.00    01JUL2010    3,138       $34,518.00
Level 2
5. Subsetting Observations Based on Three Conditions
#    Variable         Type    Len    Format       Label
2    Customer_ID      Num     8      12.          Customer ID
4    Delivery_Date    Num     8      MMDDYY10.    Date Delivered
1    Employee_ID      Num     8      12.          Employee ID
3    Order_Date       Num     8      MMDDYY10.    Date Ordered
5    Order_Month      Num     8                   Month Ordered
h. Write a PROC PRINT step to create the report below. Results should contain nine observations.
Obs    Employee_ID    Customer_ID    Order_Date    Delivery_Date    Order_Month
1      99999999       70187          08/13/2007    08/18/2007       8
2      99999999       52             08/20/2007    08/26/2007       8
3      99999999       16             08/27/2007    09/04/2007       8
4      99999999       61             08/29/2007    09/03/2007       8
5      99999999       2550           08/10/2008    08/15/2008       8
6      99999999       70201          08/15/2008    08/20/2008       8
7      99999999       9              08/10/2009    08/15/2009       8
8      99999999       71             08/30/2010    09/05/2010       8
9      99999999       70201          08/24/2011    08/29/2011       8
Challenge
6. Using an IF-THEN/DELETE Statement to Subset Observations
a. Write a DATA step to create work.bigdonations using orion.employee_donations as input.
b. Use the SUM function to create a new variable, Total, which holds the sum of the four quarterly
donations.
c. Use the N function to create a new variable, NumQtrs, which holds the count of nonmissing
values in Qtr1, Qtr2, Qtr3, and Qtr4. Explore the N function in the SAS Help facility or online
documentation.
d. The new data set should not include the charities or method of payment.
e. The final data set should contain only observations meeting the following two conditions:
Total values greater than or equal to 50
NumQtrs value equal to 4
Use an IF-THEN/DELETE statement to eliminate the observations where the conditions are not
met. Explore the use of IF-THEN/DELETE in the SAS Help facility or online documentation.
f. Store permanent labels in the new data set as shown in the report below.
g. Create the following report to verify that the labels were stored:
Alphabetic List of Variables and Attributes
#    Variable       Type    Len    Format    Label
1    Employee_ID    Num     8      12.       Employee ID
7    NumQtrs        Num     8
2    Qtr1           Num     8                First Quarter
3    Qtr2           Num     8                Second Quarter
4    Qtr3           Num     8                Third Quarter
5    Qtr4           Num     8                Fourth Quarter
6    Total          Num     8
First      Second     Third      Fourth                Num
Quarter    Quarter    Quarter    Quarter    Total      Qtrs
15         15         15         15         60         4
20         20         20         20         80         4
20         20         20         20         80         4
15         15         15         15         60         4
25         25         25         25         100        4
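The SUM function, the N function, and the IF-THEN/DELETE statement named in this exercise can be tried on a small in-line data set before working with orion.employee_donations. The following is a minimal sketch, not the exercise solution. The data set work.demo, the variable ID, and all data values are made up for illustration.

```sas
/* Hypothetical in-line data: work.demo, ID, and the values are illustrative only */
data work.demo;
   input ID Qtr1 Qtr2 Qtr3 Qtr4;
   Total=sum(Qtr1,Qtr2,Qtr3,Qtr4);   /* SUM ignores missing arguments */
   NumQtrs=n(Qtr1,Qtr2,Qtr3,Qtr4);   /* N counts nonmissing arguments */
   /* Delete the observation when either condition is not met */
   if Total<50 or NumQtrs<4 then delete;
   datalines;
1 15 15 15 15
2 10 . 20 5
3 5 5 5 5
;
run;

proc print data=work.demo;
run;
```

In this sketch the condition on the IF-THEN/DELETE statement describes the observations to remove, which is the opposite of a subsetting IF statement, where the condition describes the observations to keep.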
6.3 Solutions
Solutions to Exercises
1. Creating a SAS Data Set
a. What is the name of the variable that contains gender values? Customer_Gender
What are the possible values of this variable? M or F
data work.youngadult;
set orion.customer_dim;
where Customer_Gender='F' and
Customer_Age between 18 and 36 and
Customer_Group contains 'Gold';
Discount=.25;
run;
proc print data=work.youngadult;
var Customer_Name Customer_Age
Customer_Gender Customer_Group Discount;
id Customer_ID;
run;
2. Creating a SAS Data Set
data work.assistant;
set orion.staff;
where Job_Title contains 'Assistant' and
Salary<26000;
Increase=Salary*.10;
New_Salary=Salary+Increase;
run;
proc print data=work.assistant;
id Employee_ID;
var Job_Title Salary Increase New_Salary;
format Salary Increase New_Salary dollar10.2;
run;
3. Using the SOUNDS-LIKE Operator to Select Observations
data work.tony;
set orion.customer_dim;
where Customer_FirstName=* 'Tony';
run;
proc print data=work.tony;
var Customer_FirstName Customer_LastName;
run;
4. Subsetting Observations Based on Two Conditions
data work.increase;
set orion.staff;
where Emp_Hire_Date>='01JUL2010'd;
Increase=Salary*0.10;
if Increase>3000;
NewSalary=Salary+Increase;
label Employee_ID='Employee ID'
Salary='Annual Salary'
Emp_Hire_Date='Hire Date'
NewSalary='New Annual Salary';
format Salary NewSalary dollar10.2 Increase comma5.;
keep Employee_ID Emp_Hire_Date Salary Increase NewSalary;
run;
proc print data=work.increase split=' ';
run;
The existing labels and formats were inherited from the input data set.
Given x=. (missing), y=4, and z=10:
a. num=y+z/2;      4+10/2    4+5    9
   num=(y+z)/2;    14/2      7
b. num=x+z/2;      .+10/2    .+5    .
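The evaluation order in this quiz can be checked in a short DATA step. This is a minimal sketch that assumes the values x=., y=4, and z=10 implied by the worked expressions; the variable names a1, a2, and b are introduced here for illustration.

```sas
data _null_;
   x=.;           /* missing value */
   y=4;
   z=10;
   a1=y+z/2;      /* division is performed before addition: 4+5 */
   a2=(y+z)/2;    /* parentheses are evaluated first: 14/2 */
   b=x+z/2;       /* an arithmetic operator with a missing operand returns missing */
   put a1= a2= b=;
run;
```

The PUT statement writes the three results to the SAS log. The log also contains a note that missing values were generated, because the expression for b uses a missing operand.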
Business Scenario
The Sales Manager has requested a report about Orion
Star sales employees from Australia and the United
States.
The input data is in an Excel workbook.
sales.xls
Business Scenario
Use SAS/ACCESS Interface to PC Files to read the
worksheets within the sales.xls workbook as if they were
SAS data sets.
sales.xls
The sales.xls workbook contains two worksheets.
SAS/ACCESS provides data connectivity and integration between SAS and third-party data sources,
including Microsoft Excel workbooks and various databases. SAS/ACCESS uses data access engines to
read, write, and update data regardless of the data source or platform.
Both SAS and Microsoft Office offer 32-bit and 64-bit versions. Different SAS/ACCESS engines are
needed based on the products' bitness. If the bitness of both products is the same, use the default
SAS/ACCESS Excel engine. If the bitness differs, use the PC Files Server engine and specify PATH=
in front of workbook-name.
The table below summarizes the possible bit combinations and the appropriate engine to use for each.
SAS           Microsoft Office    SAS/ACCESS Engine
9.4           32-bit              PC Files Server
9.4           64-bit              Default
9.3 32-bit    32-bit              Default
9.3 32-bit    64-bit              PC Files Server
9.3 64-bit    32-bit              PC Files Server
9.3 64-bit    64-bit              Default
              32-bit              Default
              64-bit              PC Files Server
For more information, see the usage note Installing SAS 9.3 PC Files Server and using it to convert
32-bit Microsoft Office files to SAS 64-bit files: http://support.sas.com/kb/43/802.html.
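The two engine choices described above can be sketched as two alternative LIBNAME statements. This is an illustration only: it assumes that the macro variable path points to the data folder, as in the other programs in these notes, and only one of the two statements would be submitted in a given session.

```sas
/* Same bitness: default SAS/ACCESS Excel engine */
libname orionx excel "&path\sales.xls";

/* Different bitness: PC Files Server engine, with PATH= in front of the workbook name */
libname orionx pcfiles path="&path\sales.xls";
```

Either statement assigns the libref orionx, so the steps that follow the LIBNAME statement do not need to change when the engine changes.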
The named ranges might or might not exist, depending on how the Excel worksheets were created. They
are included here to show the difference between a named range and a worksheet name.
CONTENTS Procedure
proc contents data=orionx._all_;
run;
The CONTENTS Procedure
Directory
Libref           ORIONX
Engine           PCFILES
Physical Name    s:\workshop\sales.xls
Schema/Owner     .
#    Name             Member Type    DBMS Member Type
1    Australia        DATA           TABLE
2    Australia$       DATA           SYSTEM TABLE
3    UnitedStates     DATA           TABLE
4    UnitedStates$    DATA           SYSTEM TABLE
p107d01
Data Set Name          ORIONX.'Australia$'n    Observations            .
Member Type            DATA                    Variables               9
Engine                 PCFILES                 Indexes                 0
Created                .                       Observation Length      0
Last Modified          .                       Deleted Observations    0
Protection                                     Compressed              NO
Data Set Type                                  Sorted                  NO
Label
Data Representation    Default
Encoding               Default
#    Variable       Type    Len    Format    Informat    Label
8    Birth_Date     Num     8      DATE9.    DATE9.      Birth Date
7    Country        Char    2      $2.       $2.         Country
1    Employee_ID    Num     8                            Employee ID
2    First_Name     Char    10     $10.      $10.        First Name
4    Gender         Char    1      $1.       $1.         Gender
9    Hire_Date      Num     8      DATE9.    DATE9.      Hire Date
6    Job_Title      Char    14     $14.      $14.        Job Title
3    Last_Name      Char    12     $12.      $12.        Last Name
5    Salary         Num     8                            Salary
The column headings are used to create variable names. In the SAS windowing environment, embedded
spaces in column names are replaced with underscores. In SAS Enterprise Guide, the column headings
are used without modification because special characters are allowed in variable names. Set the
VALIDVARNAME=V7 option in SAS Enterprise Guide to cause Enterprise Guide to behave the same as
the windowing environment.
Some fields are missing in the PROC CONTENTS output because Excel metadata is incomplete.
Data Set Name          ORIONX.'UnitedStates$'n    Observations            .
Member Type            DATA                       Variables               9
Engine                 PCFILES                    Indexes                 0
Created                .                          Observation Length      0
Last Modified          .                          Deleted Observations    0
Protection                                        Compressed              NO
Data Set Type                                     Sorted                  NO
Label
Data Representation    Default
Encoding               Default
#    Variable       Type    Len    Format    Informat    Label
8    Birth_Date     Num     8      DATE9.    DATE9.      Birth Date
7    Country        Char    2      $2.       $2.         Country
1    Employee_ID    Num     8                            Employee ID
2    First_Name     Char    10     $10.      $10.        First Name
4    Gender         Char    1      $1.       $1.         Gender
9    Hire_Date      Num     8      DATE9.    DATE9.      Hire Date
6    Job_Title      Char    14     $14.      $14.        Job Title
3    Last_Name      Char    12     $12.      $12.        Last Name
5    Salary         Num     8                            Salary
orionx.'Australia$'n is a SAS name literal.
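A name literal is a quoted string followed immediately by the letter n. It lets a step refer to a worksheet whose name contains a character, such as the dollar sign, that is not otherwise valid in a SAS name. A minimal sketch, assuming that the orionx libref is already assigned to the sales.xls workbook:

```sas
/* The quotation marks and the trailing n are both required */
proc print data=orionx.'Australia$'n;
run;
```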
Employee_ID    First_Name    Last_Name    Salary    Job_Title        Country    Birth_Date    Hire_Date
120102         Tom           Zhou         108255    Sales Manager    AU         11AUG1973     01JUN1993
120103         Wilson        Dawes        87975     Sales Manager    AU         22JAN1953     01JAN1978
120121         Irenie        Elvish       26600     Sales Rep. II    AU         02AUG1948     01JAN1978
120122         Christina     Ngan         27475     Sales Rep. II    AU         27JUL1958     01JUL1982
120123         Kimiko        Hotstone     26190     Sales Rep. I     AU         28SEP1968     01OCT1989
p107d01
Last_Name     Job_Title        Salary
Hofmeister    Sales Rep. IV    32040
Kletschkus    Sales Rep. IV    30890
Platts        Sales Rep. IV    32490
Phoumirath    Sales Rep. IV    30765
Nowd          Sales Rep. IV    30660
Disassociating a Libref
If SAS has a libref assigned to an Excel workbook, the
workbook cannot be opened in Excel. To disassociate the
libref, use a LIBNAME statement with the CLEAR option.
p107d01
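Clearing the libref follows the same pattern as assigning it, with the CLEAR option in place of the engine and path. A minimal sketch, assuming that the libref to release is orionx:

```sas
/* Release the workbook so that it can be opened in Excel again */
libname orionx clear;
```

The SAS log reports whether the libref was deassigned.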
7.01 Quiz
Which PROC PRINT step displays the worksheet
containing employees from the United States?
a. proc print data=orionx.'UnitedStates';
run;
4. In the Server List window, select Libraries and click Refresh to refresh the list. Verify that
ORIONX is displayed as an active library. Expand ORIONX to see its contents. The named ranges
might not exist.
5. In the Server List window, right-click Australia$ and select Properties. Then click the Column tab
to see the variable information. Notice that the variable names contain embedded blanks. SAS
Enterprise Guide allows special characters in variable names without the need for a name literal.
7. The VALIDVARNAME= option instructs SAS Enterprise Guide to follow the same rules as the SAS
windowing environment in regard to variable names. To set this option, return to the Program
window, and uncomment and submit the OPTIONS statement.
options validvarname=v7;
8. Resubmit the PROC CONTENTS step and observe that the embedded blanks have been replaced
with underscores.
9. In the Server List window, double-click Australia$ to see the data displayed in a data grid.
11. Return to the Program window and submit the final LIBNAME statement to clear the libref, releasing
the spreadsheet. Check the log or the Server List window for success. You might need to refresh the
Server List window.
To restore the default variable name behavior, submit another OPTIONS statement to set
VALIDVARNAME=ANY.
4. Use the Explorer window to verify that orionx is active. Double-click orionx to drill into it to see its
contents. The named ranges might not exist.
5. In the Explorer window, right-click Australia$ and select View Columns to see the variable
information. Notice that the variable names contain underscores in place of embedded blanks, and the
Label column contains the original spreadsheet column headings.
#    Variable       Type    Len    Format    Informat    Label
8    Birth_Date     Num     8      DATE9.    DATE9.      Birth Date
7    Country        Char    2      $2.       $2.         Country
1    Employee_ID    Num     8                            Employee ID
2    First_Name     Char    12     $12.      $12.        First Name
4    Gender         Char    1      $1.       $1.         Gender
9    Hire_Date      Num     8      DATE9.    DATE9.      Hire Date
6    Job_Title      Char    20     $20.      $20.        Job Title
3    Last_Name      Char    18     $18.      $18.        Last Name
5    Salary         Num     8                            Salary
6. In the Explorer window, double-click Australia$ to see the data portion. The labels are displayed as
column headings. Close the table.
7. Return to the Editor and submit the PROC PRINT step to see the same information in a report.
Employee_ID    Last_Name     Job_Title        Salary
120125         Hofmeister    Sales Rep. IV    32040
120128         Kletschkus    Sales Rep. IV    30890
120135         Platts        Sales Rep. IV    32490
120159         Phoumirath    Sales Rep. IV    30765
120166         Nowd          Sales Rep. IV    30660
8. Submit the final LIBNAME statement to clear the libref, releasing the spreadsheet. Check the log or
Explorer window for success.
Business Scenario
Create a SAS data set using a Microsoft Excel workbook
as input.
sales.xls
Considerations
Use a SAS/ACCESS LIBNAME statement to read the
Australia$ worksheet and create a temporary data set.
Australia$ worksheet -> DATA step -> work.subset2 (employees with Rep in their job title)
The new data set should include the following:
only the employees with Rep in their job title
a Bonus variable that is 10% of Salary
permanent labels and formats
7.02 Poll
A DROP or KEEP statement can be used to control which
worksheet columns are written to the new data set.
True
False
If you are using SAS Enterprise Guide, remember to uncomment and submit the OPTIONS
statement to set VALIDVARNAME=V7.
libname orionxls pcfiles path="&path\sales.xls";
*options validvarname=v7;
/* Needed for SAS Enterprise Guide */
data work.subset2;
   set orionxls.'Australia$'n;
   where Job_Title contains 'Rep';
   Bonus=Salary*.10;
   keep First_Name Last_Name Salary Bonus
        Job_Title Hire_Date;
   label Job_Title='Sales Title'
         Hire_Date='Date Hired';
   format Salary comma10. Hire_Date mmddyy10.
          Bonus comma8.2;
run;
proc contents data=work.subset2;
run;
proc print data=work.subset2 label;
run;
libname orionxls clear;
Job_Title is not found in Australia$. Job Title is found instead, because SAS Enterprise Guide
permits blanks and special characters in variable names. Setting the VALIDVARNAME=v7 option
causes SAS Enterprise Guide to use the same variable naming rules as the SAS windowing
environment. Be sure to set the option as explained in the note above.
5. Run the PROC CONTENTS step. Verify that the formats and labels were stored in the descriptor
portion of work.subset2. Notice that all columns have labels, not just the columns that were assigned
labels in the DATA step.
Alphabetic List of Variables and Attributes
#    Variable      Type    Len    Format       Informat    Label
6    Bonus         Num     8      COMMA8.2
1    First_Name    Char    10     $10.         $10.        First Name
5    Hire_Date     Num     8      MMDDYY10.    DATE9.      Date Hired
4    Job_Title     Char    14     $14.         $14.        Sales Title
2    Last_Name     Char    12     $12.         $12.        Last Name
3    Salary        Num     8      COMMA10.                 Salary
6. Submit the PROC PRINT step and verify that the results contain 61 observations and that the labels
were displayed and formats applied.
First Name    Last Name    Salary    Sales Title      Date Hired    Bonus
Irenie        Elvish       26,600    Sales Rep. II    01/01/1978    2,660.00
Christina     Ngan         27,475    Sales Rep. II    07/01/1982    2,747.50
Kimiko        Hotstone     26,190    Sales Rep. I     10/01/1989    2,619.00
...
Alban         Kingston     28,830                     10/01/1996    2,883.00
Alena         Moody        26,205                     09/01/2010    2,620.50
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Accessing an Excel Worksheet
a. Retrieve the starter program p107e01.
b. Add a LIBNAME statement before the PROC CONTENTS step to create a libref named
CUSTFM that references the Excel workbook, custfm.xls.
If you are using SAS Enterprise Guide, you need to set the VALIDVARNAME=V7
option if not already set in the session.
c. Submit the LIBNAME statement and the PROC CONTENTS step to create the following partial
PROC CONTENTS report:
Part 1 of 3
The CONTENTS Procedure
Directory
Libref           CUSTFM
Engine           EXCEL
Physical Name    custfm.xls
User             Admin
#    Name        Member Type    DBMS Member Type
1    Females$    DATA           TABLE
2    Males$      DATA           TABLE
d. Add a SET statement in the DATA step to read the worksheet containing the male data.
e. Add a KEEP statement in the DATA step to include only the First_Name, Last_Name, and
Birth_Date variables in the new data set.
f. Add a FORMAT statement in the DATA step to display Birth_Date as a four-digit year.
g. Add a LABEL statement to change the column heading of Birth_Date to Birth Year.
h. Submit the program including the final LIBNAME statement and create the report below. Results
should contain 47 observations.
i. Verify that the worksheet was released.
Partial PROC PRINT Output
Obs    First Name    Last Name    Birth Year
1      James         Kvarniq      1974
2      David         Black        1969
3      Markus        Sepke        1988
4      Ulrich        Heyde        1939
5      Jimmie        Evans        1954
Level 2
2. Accessing an Excel Worksheet
a. Write a LIBNAME statement to create a libref named PROD that references the Excel workbook
products.xls.
b. Write a PROC CONTENTS step to view all of the contents of PROD.
c. Submit the program to determine the names of the four worksheets in products.xls.
d. Write a DATA step that reads the worksheet containing sports data and creates a new data set
named work.golf.
The data set work.golf should
include only the observations where Category is equal to Golf
not include the Category variable
include a label of Golf Products for the Name variable.
e. Write a LIBNAME statement to clear the PROD libref.
f. Create the report below. The results should contain 56 observations.
Partial PROC PRINT Output
Obs    Golf Products
1      Ball Bag
2      Red/White/Black Staff 9 Bag
3      Tee Holder
4      Bb Softspikes - Xp 22-pack
5      Bretagne Performance Tg Men's Golf Shoes L.
Challenge
3. Creating an Excel Spreadsheet
a. Open p107e03. Insert a LIBNAME statement to associate the libref out with the Excel workbook
employees.xls in the default data folder.
b. Modify the program so that it creates a spreadsheet named salesemps in the employees.xls
workbook.
c. Submit the SAS program and verify that it created the data set out.salesemps with 71
observations and 4 variables as shown in the partial SAS log below.
NOTE: 71 records were read from the infile "s:\workshop\newemps.csv".
The minimum record length was 28.
The maximum record length was 47.
NOTE: The data set OUT.salesemps has 71 observations and 4 variables.
The program will fail if the workbook already exists. Use Windows Explorer to navigate
to the data folder and delete employees.xls.
Business Scenario
The Northeast Sales Manager requested a report listing
supervisors from New York and New Jersey. The input
data is in an Oracle database.
Business Scenario
Use SAS/ACCESS to read the tables within the database
as if they were SAS data sets.
The engine name, such as Oracle or DB2, is the SAS/ACCESS component that reads and writes to your
DBMS. The engine name is required.
USER= specifies an optional Oracle user name. USER= must be used with PASSWORD=.
PASSWORD= (or PW=) specifies an optional Oracle password associated with the Oracle user name.
PATH= specifies the Oracle driver, node, and database. SAS/ACCESS uses the same Oracle path
designation that you use to connect to Oracle directly.
SCHEMA= enables you to read database objects, such as tables and views, in the specified schema. If this
option is omitted, you connect to the default schema for your DBMS.
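The options described above can be combined in one LIBNAME statement. The following is a sketch only: the libref oralib and every option value (myuser, mypassword, mypath, myschema) are placeholders invented for illustration, not connection details from this course.

```sas
/* All option values below are hypothetical placeholders */
libname oralib oracle
        user=myuser password=mypassword
        path=mypath schema=myschema;

/* List the tables that are visible through the libref */
proc contents data=oralib._all_ nods;
run;

libname oralib clear;
```

The NODS option limits the PROC CONTENTS output to the directory listing, which is usually enough to confirm that the connection works before any table is read.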
EMPID    STATE    JOBCATEGORY
1834     NY       BC
1433     NJ       FA
1983     NY       FA
1420     NJ       ME
1882     NY       ME
7.3 Solutions
Solutions to Exercises
1. Accessing an Excel Worksheet
libname custfm pcfiles path="&PATH\custfm.xls";
*options validvarname=v7;
/* needed for SAS Enterprise Guide */
Explore the IMPORT and EXPORT procedures. Describe the chief differences from using the
SAS/ACCESS LIBNAME statement.
When you use the SAS/ACCESS LIBNAME statement, you are
accessing the most recent data in the workbook or database.
not necessarily making a SAS copy of the data. You can use PROC PRINT (or other
procedures) directly on a sheet or table through a dynamic "pipeline."
When you import, you are
duplicating storage, in a sense
creating a static SAS copy of the data, which might get out of date unless you remember to
re-import
allowing for easier programming (for example, no name literals).
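For comparison with the LIBNAME approach, an import of one worksheet might look like the sketch below. It is an illustration under stated assumptions: DBMS=XLS, the sheet name Australia, and the output data set name work.australia are choices made here, and the macro variable path is assumed to point to the data folder.

```sas
/* Assumptions: DBMS=XLS, sheet name, and output name are illustrative choices */
proc import datafile="&path\sales.xls"
            out=work.australia
            dbms=xls
            replace;
   sheet="Australia";
run;

/* The result is an ordinary SAS data set, so no name literal is needed */
proc print data=work.australia;
run;
```

Because work.australia is a static copy, later changes to the workbook do not appear in it until the PROC IMPORT step is submitted again.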
Chapter 8 Reading Raw Data Files
8.1 Introduction to Reading Raw Data Files
Objectives
Identify types of raw data files and input styles.
Define the terms standard and nonstandard data.
delimited
fixed column
both delimited and fixed column
I do not read raw data files.
8.2 Reading Standard Delimited Data
List Input
Use list input to read delimited raw data files.
Partial sales.csv
120102,Tom,Zhou,M,108255,Sales Manager,AU,11AUG1973,06/01/1993
120103,Wilson,Dawes,M,87975,Sales Manager,AU,22JAN1953,01/01/1978
120121,Irenie,Elvish,F,26600,Sales Rep. II,AU,02AUG1948,01/01/1978
120122,Christina,Ngan,F,27475,Sales Rep. II,AU,27JUL1958,07/01/1982
120123,Kimiko,Hotstone,F,26190,Sales Rep. I,AU,28SEP1968,10/01/1989
8.02 Quiz
Which fields in this file can be read as standard numeric
values?
Partial sales.csv
120102,Tom,Zhou,M,108255,Sales Manager,AU,11AUG1973,06/01/1993
120103,Wilson,Dawes,M,87975,Sales Manager,AU,22JAN1953,01/01/1978
120121,Irenie,Elvish,F,26600,Sales Rep. II,AU,02AUG1948,01/01/1978
120122,Christina,Ngan,F,27475,Sales Rep. II,AU,27JUL1958,07/01/1982
120123,Kimiko,Hotstone,F,26190,Sales Rep. I,AU,28SEP1968,10/01/1989
data work.subset;
   infile "&path\sales.csv" dlm=',';
   input Employee_ID First_Name $ Last_Name $
         Gender $ Salary Job_Title $ Country $;
run;
NOTE: The infile "s:\workshop\sales.csv" is:
      Filename=s:\workshop\sales.csv,
      RECFM=V,LRECL=256,File Size (bytes)=11340
NOTE: 165 ...
      The ...
      The ...
NOTE: The ...
p108d01
Compilation
data work.subset;
   infile "&path\sales.csv" dlm=',';
   input Employee_ID First_Name $ Last_Name $
         Gender $ Salary Job_Title $ Country $;
run;
p108d01
The default length of the input buffer depends on the operating system. It can be modified using the
LRECL= option in the INFILE statement.
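A sketch of the LRECL= option on the INFILE statement follows. The value 1000 is an arbitrary number chosen for illustration; it is not a recommendation from the course.

```sas
data work.subset;
   /* LRECL=1000 is an arbitrary illustrative record length */
   infile "&path\sales.csv" dlm=',' lrecl=1000;
   input Employee_ID First_Name $ Last_Name $
         Gender $ Salary Job_Title $ Country $;
run;
```

The LRECL value that was used is reported in the note that SAS writes to the log about the input file.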
There is usually no delimiter after the last field in a record, so SAS stops reading when it encounters an
end-of-record marker.
Execution
Here is the output data set after the first iteration of the DATA step.
work.subset
Employee_ID    First_Name    Last_Name    Gender    Salary    Job_Title    Country
120102         Tom           Zhou         M         108255    Sales Ma     AU
First_Name    Last_Name    Gender    Salary    Job_Title    Country
Tom           Zhou         M         108255    Sales Ma     AU
Wilson        Dawes        M         87975     Sales Ma     AU
Irenie        Elvish       F         26600     Sales Re     AU
Christin      Ngan         F         27475     Sales Re     AU
Kimiko        Hotstone     F         26190     Sales Re     AU
Lucian        Daymond      M         26480     Sales Re     AU
Fong          Hofmeist     M         32040     Sales Re     AU
p108d01
Some character values contain unnecessary trailing blanks, although this is not obvious from a
PROC PRINT report.
The LENGTH statement is used primarily for character variables.
The name, type, and length of a variable are determined at the variable's first use. These specifications
can be in a LENGTH statement or the INPUT statement, whichever appears first in the DATA step. The
name is used exactly as specified at first use, including the case.
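The effect of first use can be seen by reading the same record with and without a LENGTH statement. This is a minimal sketch with in-line data; the data set names work.nolength and work.withlength are made up for illustration.

```sas
/* Without LENGTH: list input defines each character variable with length 8 */
data work.nolength;
   infile datalines dlm=',';
   input First_Name $ Job_Title $;
   datalines;
Christina,Sales Rep. II
;
run;

/* With LENGTH first: the lengths are set before the INPUT statement is compiled */
data work.withlength;
   length First_Name $ 12 Job_Title $ 25;
   infile datalines dlm=',';
   input First_Name $ Job_Title $;
   datalines;
Christina,Sales Rep. II
;
run;

proc contents data=work.nolength;
run;

proc contents data=work.withlength;
run;
```

In the first data set the values are truncated to eight characters, which matches the truncated names and job titles shown earlier in this section. In the second data set the full values fit within the declared lengths.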
Compilation
data work.subset;
   length First_Name $ 12 Last_Name $ 18
          Gender $ 1 Job_Title $ 25
          Country $ 2;
   infile "&path\sales.csv" dlm=',';
   input Employee_ID First_Name $ Last_Name $
         Gender $ Salary Job_Title $ Country $;
run;
PDV
First_Name    Last_Name    Gender    Job_Title    Country    Employee_ID    Salary
$ 12          $ 18         $ 1       $ 25         $ 2        N 8            N 8
First_Name    Last_Name    Gender    Job_Title        Country    Employee_ID    Salary
Tom           Zhou         M         Sales Manager    AU         120102         108255
Wilson        Dawes        M         Sales Manager    AU         120103         87975
Irenie        Elvish       F         Sales Rep. II    AU         120121         26600
Christina     Ngan         F         Sales Rep. II    AU         120122         27475
Kimiko        Hotstone     F         Sales Rep. I     AU         120123         26190
p108d02
8.05 Quiz
Suppose you want the order of the variables to match the
order of the fields. You can include the numeric variables
in the LENGTH statement. Which of the following
produces the correct results?
a. length Employee_ID First_Name $ 12
Last_Name $ 18 Gender $ 1
Salary Job_Title $ 25
Country $ 2;
b. length Employee_ID 8 First_Name $ 12
Last_Name $ 18 Gender $ 1
Salary 8 Job_Title $ 25
Country $ 2;
data work.subset;
   length Employee_ID 8 First_Name $ 12
          Last_Name $ 18 Gender $ 1
          Salary 8 Job_Title $ 25
          Country $ 2;
   infile "&path\sales.csv" dlm=',';
   input Employee_ID First_Name Last_Name
         Gender Salary Job_Title Country;
run;
p108d03
#    Variable       Type    Len
1    Employee_ID    Num     8
2    First_Name     Char    12
3    Last_Name      Char    18
4    Gender         Char    1
5    Salary         Num     8
6    Job_Title      Char    25
7    Country        Char    2
First_Name    Last_Name     Gender    Salary    Job_Title        Country
Tom           Zhou          M         108255    Sales Manager    AU
Wilson        Dawes         M         87975     Sales Manager    AU
Irenie        Elvish        F         26600     Sales Rep. II    AU
Christina     Ngan          F         27475     Sales Rep. II    AU
Kimiko        Hotstone      F         26190     Sales Rep. I     AU
Lucian        Daymond       M         26480     Sales Rep. I     AU
Fong          Hofmeister    M         32040     Sales Rep. IV    AU
8.06 Quiz
What problems do you see with the data values for the last two data fields, Salary and Country?
Partial sales3inv.csv
120102,Tom,Zhou,Manager,108255,AU
120103,Wilson,Dawes,Manager,87975,AU
120121,Irenie,Elvish,Rep. II,26600,AU
120122,Christina,Ngan,Rep. II,n/a,AU
120123,Kimiko,Hotstone,Rep. I,26190,AU
120124,Lucian,Daymond,Rep. I,26480,12
120125,Fong,Hofmeister,Rep. IV,32040,AU
p108d04
Employee_ID    First        Last          Job_Title    Salary    Country
120102         Tom          Zhou          Manager      108255    AU
120103         Wilson       Dawes         Manager      87975     AU
120121         Irenie       Elvish        Rep. II      26600     AU
120122         Christina    Ngan          Rep. II      .         AU
120123         Kimiko       Hotstone      Rep. I       26190     AU
120124         Lucian       Daymond       Rep. I       26480     12
120125         Fong         Hofmeister    Rep. IV      32040     AU
data work.sales;
infile "&path\sales3inv.csv" dlm=',';
input Employee_ID First $ Last $
Job_Title $ Salary Country $;
run;
A data error occurs when a data value does not match the
field specification.
Even though these are referred to as data errors, they generate notes, not error messages. Syntax errors
stop the DATA step, whereas data errors allow processing to continue.
Data Errors
When this kind of data error occurs, the following
information is written to the SAS log:
a note describing the error
a column ruler
the input record
the contents of the PDV
NOTE: Invalid data for Salary in line 4 31-33.
RULE:      ----+----1----+----2----+----3----+----4----+----5
4          120122,Christina,Ngan,Rep. II,n/a,AU 36
Employee_ID=120122 First=Christin Last=Ngan Job_Title=Rep. II Salary=.
Country=AU _ERROR_=1 _N_=4
Data Errors
Two temporary variables are created during the
processing of every DATA step:
_N_ is the DATA step iteration counter.
_ERROR_ indicates data error status.
0 indicates that no data error occurred on that
record.
1 indicates that one or more data errors occurred
on that record.
NOTE: Invalid data for Salary in line 4 31-33.
RULE:      ----+----1----+----2----+----3----+----4----+----5
4          120122,Christina,Ngan,Rep. II,n/a,AU 36
Employee_ID=120122 First=Christin Last=Ngan Job_Title=Rep. II Salary=.
Country=AU _ERROR_=1 _N_=4
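The two automatic variables can also be tested in program logic. The sketch below is one possible way to flag problem records. It reuses the sales3inv.csv DATA step from this section and assumes that the macro variable &path points to the course data folder. The message text is illustrative only.

```sas
/* Sketch: write a short message to the log for each record */
/* that had a data error.                                   */
data work.sales;
   infile "&path\sales3inv.csv" dlm=',';
   input Employee_ID First $ Last $
         Job_Title $ Salary Country $;
   /* _ERROR_ is 1 when the current record had a data error. */
   /* _N_ counts the DATA step iterations.                   */
   if _error_ = 1 then
      putlog 'Check input record: ' _n_= Employee_ID=;
run;
```

Because _N_ and _ERROR_ are temporary, neither variable is written to work.sales.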
Use the SAS system option ERRORS=n to specify the maximum number of observations for which error
messages about data input errors are printed.
options errors=5;
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Reading a Comma-Delimited Raw Data File
a. Open p108e01. Add the appropriate LENGTH, INFILE, and INPUT statements to read the
comma-delimited raw data file named the following:
Windows        "&path\newemps.csv"
UNIX           "&path/newemps.csv"
z/OS (OS/390)  "&path..rawdata(newemps)"

Name    Type       Length
First   Character  12
Last    Character  18
Title   Character  25
Salary  Numeric
c. Submit the program to create the report below. The results should contain 71 observations.
Partial PROC PRINT Output
Obs  First     Last        Title           Salary
1    Satyakam  Denny       Sales Rep. II    26780
2    Monica    Kletschkus  Sales Rep. IV    30890
3    Kevin     Lyon        Sales Rep. I     26955
4    Petrea    Soltau      Sales Rep. II    27440
5    Marina    Iyengar     Sales Rep. III   29715
Level 2
2. Reading a Space-Delimited Raw Data File
a. Write a DATA step to create a new data set named work.qtrdonation, reading the space-delimited raw data file named the following:
Windows        "&path\donation.dat"
UNIX           "&path/donation.dat"
z/OS (OS/390)  "&path..rawdata(donation)"
120265 . . . 25
120267 15 15 15 15
120269 20 20 20 20
120270 20 10 5 .
120271 20 20 20 20
Name   Type       Length
IDNum  Character
Qtr1   Numeric
Qtr2   Numeric
Qtr3   Numeric
Qtr4   Numeric
c. Write a PROC PRINT step to create the report below. The results contain 124 observations.
Partial PROC PRINT Output
Obs  IDNum   Qtr1  Qtr2  Qtr3  Qtr4
1    120265     .     .     .    25
2    120267    15    15    15    15
3    120269    20    20    20    20
4    120270    20    10     5     .
5    120271    20    20    20    20
Challenge
3. Reading a Tab-Delimited Raw Data File
a. Create a temporary data set, managers2, using the tab-delimited raw data file named the
following:
Windows        "&path\managers2.dat"
UNIX           "&path/managers2.dat"
z/OS (OS/390)  "&path..rawdata(managers2)"
120102  Tom     Zhou         M  108255  Sales Manager
120103  Wilson  Dawes        M   87975  Sales Manager
120261  Harry   Highpoint    M  243190  Chief Sales Officer
121143  Louis   Favaron      M   95090  Senior Sales Manager
121144  Renee   Capachietti  F   83505  Sales Manager
121145  Dennis  Lansberry    M   84260  Sales Manager
Name    Type
ID      Numeric
First   Character
Last    Character
Gender  Character
Salary  Numeric
Title   Character
c. The new data set should contain only First, Last, and Title.
d. Generate the report below. The results should contain six observations.
Obs  First   Last         Title
1    Tom     Zhou         Sales Manager
2    Wilson  Dawes        Sales Manager
3    Harry   Highpoint    Chief Sales Officer
4    Louis   Favaron      Senior Sales Manager
5    Renee   Capachietti  Sales Manager
6    Dennis  Lansberry    Sales Manager
8.3 Reading Nonstandard Delimited Data
Considerations
Use modified list input to read all the fields from sales.csv.
Store the date fields as SAS dates.
Partial sales.csv
120102,Tom,Zhou,M,108255,Sales Manager,AU,11AUG1973,06/01/1993
120103,Wilson,Dawes,M,87975,Sales Manager,AU,22JAN1953,01/01/1978
120121,Irenie,Elvish,F,26600,Sales Rep. II,AU,02AUG1948,01/01/1978
120122,Christina,Ngan,F,27475,Sales Rep. II,AU,27JUL1958,07/01/1982
120123,Kimiko,Hotstone,F,26190,Sales Rep. I,AU,28SEP1968,10/01/1989
8.08 Quiz
A format is an instruction that tells SAS how to display
data values. What formats would you specify to display a
SAS date in the styles shown below?
a) 01JAN2000
b) 01/16/2000
SAS Informats
Selected SAS Informats for Nonstandard Numeric Values
Informat
COMMA.
DOLLAR.
Definition
Reads nonstandard numeric data and removes
embedded commas, blanks, dollar signs, percent signs,
and dashes.
$CHAR.
SAS Informats
Informats are used to read and convert raw data.
Informat  Raw Data Value  SAS Data Value
COMMA.    $12,345         12345
DOLLAR.
COMMAX.   $12.345         12345
DOLLARX.
EUROX.    12.345          12345
$CHAR.    ##Australia     ##Australia
$UPCASE.  au              AU
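The conversions in the informat table can be tried without a raw data file. The sketch below is only an illustration: it applies three of the informats to literal values with the INPUT function, and the variable names Amount, EuroAmt, and Code are hypothetical.

```sas
/* Sketch: apply informats to literal values and write the results to the log. */
data _null_;
   Amount  = input('$12,345', comma7.);   /* dollar sign and comma are removed      */
   EuroAmt = input('$12.345', commax7.);  /* period is treated as a digit separator */
   Code    = input('au', $upcase2.);      /* value is converted to uppercase        */
   putlog Amount= EuroAmt= Code=;
run;
```

The log should show 12345 for both numeric variables and AU for Code, matching the table.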
SAS Informats
Use date informats to read and convert dates to SAS date
values.
Informat  Raw Data Value  SAS Date Value
MMDDYY.   010160          0
          01/01/60
          01/01/1960
          1/1/1960
DDMMYY.   311260          365
          31/12/60
          31/12/1960
DATE.     31DEC59         -1
          31DEC1959
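A SAS date value is the number of days since January 1, 1960, which is why the date informats return small integers for dates near 1960. The following sketch is an illustration only: it uses the INPUT function on literal values instead of a raw data file, and the names d1, d2, and d3 are hypothetical.

```sas
/* Sketch: convert date strings to SAS date values and display them two ways. */
data _null_;
   d1 = input('01/01/1960', mmddyy10.);
   d2 = input('31/12/1960', ddmmyy10.);
   d3 = input('31DEC1959', date9.);
   putlog d1= d2= d3=;                        /* stored values: 0, 365, -1    */
   putlog d1= date9. d2= date9. d3= date9.;   /* same values, shown as dates  */
run;
```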
8.09 Quiz
Use the SAS Help facility or documentation to investigate
the DATEw. informat and answer the following questions:
a) What does the w represent?
data work.sales;
   infile "&path\sales.csv" dlm=',';
   input Employee_ID First_Name :$12. Last_Name :$18.
         Gender :$1. Salary Job_Title :$25. Country :$2.
         Birth_Date :date. Hire_Date :mmddyy.;
run;

NOTE: The infile "s:\workshop\sales.csv" is:
      Filename=s:\workshop\sales.csv,
      RECFM=V,LRECL=256,File Size (bytes)=11340,
NOTE: 165 records were read from the infile "s:\workshop\sales.csv".
NOTE: The data set WORK.SALES has 165 observations and 9 variables.
p108d06
First_Name  Last_Name  Sales Title    Salary    Date Hired
Tom         Zhou       Sales Manager  $108,255  JUN1993
Wilson      Dawes      Sales Manager   $87,975  JAN1978
Irenie      Elvish     Sales Rep. II   $26,600  JAN1978
Christina   Ngan       Sales Rep. II   $27,475  JUL1982
Kimiko      Hotstone   Sales Rep. I    $26,190  OCT1989
p108d07
                      WHERE  IF
PROC step             Yes    No
SET statement         Yes    Yes
assignment statement  No     Yes
INPUT statement       No     Yes
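The comparison of WHERE and IF can be illustrated with one DATA step. This is a minimal sketch, assuming the orion.sales data set used elsewhere in these notes. The output name work.december_hires and the variable HireMonth are hypothetical.

```sas
/* Sketch: WHERE works on variables that come from the SET data set.      */
/* A variable created by an assignment statement needs a subsetting IF.   */
data work.december_hires;
   set orion.sales;
   where Country = 'AU';              /* Country is read by the SET statement */
   HireMonth = month(Hire_Date);      /* created by an assignment statement   */
   if HireMonth = 12;                 /* WHERE cannot reference HireMonth     */
run;
```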
Using List Input: Importance of Colon Format Modifier
p108a02
1. Open p108a02 and examine the INPUT statement.
   The INFILE statement does not contain DLM= because the file is space delimited.
   HireDate and Salary are nonstandard numeric fields, so an informat is needed.
2. In the SAS windowing environment, select File > Open Program, change the value for Files of type to Data Files (*.dat), and select salary.dat.
Partial salary.dat
Donny 5MAY2008 25 FL $43,132.50
Margaret 20FEB2008 43 NC 65,150
Dan 1JUN2008 27 FL $40,000.00
Subash 2FEB2008 45 NC 75,750
Antonio 25MAY2008 35 FL $43,500.50
3. Submit part 1 and view the log and output. The expected output is shown below.
/* Part 1 - using colon format modifiers */
data work.salaries;
   infile "&path\salary.dat";
   input Name $ HireDate :date. Age State $ Salary :comma.;
run;
proc print data=work.salaries;
run;

Name      HireDate  Age  State   Salary
Donny        17657   25  FL     43132.5
Margaret     17582   43  NC     65150.0
Dan          17684   27  FL     40000.0
Subash       17564   45  NC     75750.0
Antonio      17677   35  FL     43500.5
4. Now let's see what happens when a colon format modifier is omitted from Salary. Submit part 2.
/* Part 2 - omit the colon format modifier for Salary */
data work.salaries;
   infile "&path\salary.dat";
   input Name $ HireDate :date. Age State $ Salary comma.;
run;
proc print data=work.salaries;
run;
5. Examine the log. There are no errors or warnings.
6. Examine the output.
Obs  Name      HireDate  Age  State  Salary
1    Donny        17657   25  FL          .
2    Margaret     17582   43  NC          6
3    Dan          17684   27  FL          .
4    Subash       17564   45  NC          7
5    Antonio      17677   35  FL          .
The Salary values are incorrect. Why are the values either missing or only one digit in length?
The comma. informat has a default width of 1, so SAS reads 1 column from the input file. This
reads the first character of the Salary value.
When the first column contains a dollar sign, a missing value is assigned to the numeric variable.
This does not cause a data error because the comma informat removes non-numeric characters,
including the dollar sign. When the first column contains a digit, that digit becomes the value of the
variable.
In some cases, omitting a colon results in missing or invalid values. In other cases, it results
in data errors.
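The colon format modifier can also be tried without the salary.dat file. The sketch below is an illustration under stated assumptions: it reads three of the salary.dat records as in-stream data, the data set name work.salaries_demo is hypothetical, and the FORMAT statement is added only to make the printed values easier to read.

```sas
/* Sketch: the colon modifier reads each field up to the next delimiter, */
/* so the informat width does not cut the value short.                   */
data work.salaries_demo;
   infile datalines;
   input Name $ HireDate :date. Age State $ Salary :comma.;
   format HireDate date9. Salary dollar12.2;
   datalines;
Donny 5MAY2008 25 FL $43,132.50
Margaret 20FEB2008 43 NC 65,150
Dan 1JUN2008 27 FL $40,000.00
;
run;

proc print data=work.salaries_demo;
run;
```

Removing the colon in front of comma. in this sketch should reproduce the missing and one-digit Salary values described in the demonstration.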
Obs  First_Name  Last_Name  Job_Title  Salary
1    Steven      Worton     Auditor     40450
2    Merle       Hieds      Trainee     24025
3    Marta       Bamberge   Manager     32000
p108d08
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
4. Reading Nonstandard Data from a Comma-Delimited Raw Data File
a. Open p108e04. Add the appropriate LENGTH, INFILE, and INPUT statements to read the
comma-delimited raw data file named the following:
Windows        "&path\custca.csv"
UNIX           "&path/custca.csv"
z/OS (OS/390)  "&path..rawdata(custca)"

Name       Type       Length
First      Character  20
Last       Character  20
ID         Numeric
Gender     Character
BirthDate  Numeric
Age        Numeric
AgeGroup   Character  12
b. Use FORMAT and DROP statements in the DATA step to create a data set that results in the
report below when displayed with a PROC PRINT step. Include an appropriate title. The results
should contain 15 observations.
Partial PROC PRINT Output
Canadian Customers
Obs  First    Last       Gender  AgeGroup     BirthDate
1    Bill     Cuddy      M       15-30 years  OCT1986
2    Susan    Krasowski  F       46-60 years  JUL1959
3    Andreas  Rennie     M       61-75 years  JUL1934
4    Lauren   Krasowski  F       15-30 years  OCT1986
5    Lauren   Marx       F       31-45 years  AUG1969
Level 2
5. Reading a Delimited Raw Data File with Nonstandard Data Values
a. Write a DATA step to create a temporary data set, prices, reading the delimited raw data file
named the following:
Windows        "&path\pricing.dat"
UNIX           "&path/pricing.dat"
z/OS (OS/390)  "&path..rawdata(pricing)"
                                                   Sales
Obs  ProductID     StartDate   EndDate      Cost   Price
1    210200100009  06/09/2011  12/31/9999  15.50   34.70
2    210200100017  01/24/2011  12/31/9999  17.80   22.80
3    210200200023  07/04/2011  12/31/9999   8.25   19.80
4    210200600067  10/27/2011  12/31/9999  28.90   47.00
5    210200600085  08/28/2011  12/31/9999  17.85   39.40
Challenge
6. Reading In-Stream Delimited Data
a. Open p108e06. Write a DATA step to read the delimited in-stream data shown below.
An INFILE statement is required. Use SAS Help or online documentation to explore the
use of DATALINES as a file specification in an INFILE statement.
120102/Tom/Zhou/M/108,255/Sales Manager/01Jun1993
120103/Wilson/Dawes/M/87,975/Sales Manager/01Jan1978
120261/Harry/Highpoint/M/243,190/Chief Sales Officer/01Aug1991
121143/Louis/Favaron/M/95,090/Senior Sales Manager/01Jul2001
121144/Renee/Capachietti/F/83,505/Sales Manager/01Nov1995
121145/Dennis/Lansberry/M/84,260/Sales Manager/01Apr1980
First   Last         Title                 ID      Gender  Salary  HireDate
Tom     Zhou         Sales Manager         120102  M       108255  06/01/1993
Wilson  Dawes        Sales Manager         120103  M        87975  01/01/1978
Harry   Highpoint    Chief Sales Officer   120261  M       243190  08/01/1991
Louis   Favaron      Senior Sales Manager  121143  M        95090  07/01/2001
Renee   Capachietti  Sales Manager         121144  F        83505  11/01/1995
Dennis  Lansberry    Sales Manager         121145  M        84260  04/01/1980
phone2.csv
         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
James Kvarniq,(704) 293-8126,(701) 281-8923
Sandrina Stephano,, (919) 271-4592
Cornelia Krahl,(212) 891-3241,(212) 233-5413
Karen Ballinger,, (714) 644-9090
Elke Wallstab,(910) 763-5561,(910) 545-3421
8.11 Quiz
data work.contacts;
length Name $ 20 Phone Mobile $ 14;
infile "&path\phone2.csv" dlm=',';
input Name $ Phone $ Mobile $;
run;
proc print data=work.contacts noobs;
run;
p108a03
Consecutive Delimiters in List Input
List input treats two or more consecutive delimiters as a
single delimiter and not as a missing value.
phone2.csv
         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
James Kvarniq,(704) 293-8126,(701) 281-8923
Sandrina Stephano,, (919) 271-4592
Cornelia Krahl,(212) 891-3241,(212) 233-5413
Karen Ballinger,, (714) 644-9090
Elke Wallstab,(910) 763-5561,(910) 545-3421
8.4 Handling Missing Data
Name               Phone           Mobile
James Kvarniq      (704) 293-8126  (701) 281-8923
Sandrina Stephano                  (919) 271-4592
Cornelia Krahl     (212) 891-3241  (212) 233-5413
Karen Ballinger                    (714) 644-9090
Elke Wallstab      (910) 763-5561  (910) 545-3421
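The program that produces a missing Phone value for the records with two consecutive commas does not appear in the surviving text. One way to get that result is the DSD option in the INFILE statement. The sketch below rests on that assumption and reuses the quiz program for phone2.csv.

```sas
/* Sketch: DSD sets the default delimiter to a comma and treats */
/* two consecutive delimiters as a missing value.               */
data work.contacts;
   length Name $ 20 Phone Mobile $ 14;
   infile "&path\phone2.csv" dsd;
   input Name $ Phone $ Mobile $;
run;

proc print data=work.contacts noobs;
run;
```

DLM=',' is not required here because a comma is the default delimiter when DSD is in effect.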
phone.csv
         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
James Kvarniq,(704) 293-8126,(701) 281-8923
Sandrina Stephano,(919) 871-7830
Cornelia Krahl,(212) 891-3241,(212) 233-5413
Karen Ballinger,(714) 344-4321
Elke Wallstab,(910) 763-5561,(910) 545-3421
Records 2 and 4 have missing values at the end of the record.
The DSD option is not appropriate because the missing
data is not marked by consecutive delimiters.
PROC PRINT Output
Name               Phone           Mobile
James Kvarniq      (704) 293-8126  (701) 281-8923
Sandrina Stephano  (919) 871-7830
Cornelia Krahl     (212) 891-3241  (212) 233-5413
Karen Ballinger    (714) 344-4321
Elke Wallstab      (910) 763-5561  (910) 545-3421
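The program behind this output is not shown in the surviving text. A common way to handle missing values at the end of a record is the MISSOVER option in the INFILE statement, which stops SAS from reading the next record to fill the remaining variables. The sketch below assumes that approach. The data set name work.contacts2 is hypothetical.

```sas
/* Sketch: MISSOVER sets Mobile to missing when a record ends early, */
/* instead of loading the next record to find a value.               */
data work.contacts2;
   length Name $ 20 Phone Mobile $ 14;
   infile "&path\phone.csv" dlm=',' missover;
   input Name $ Phone $ Mobile $;
run;

proc print data=work.contacts2 noobs;
run;
```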
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
7. Reading a Comma-Delimited File with Missing Values
a. Open p108e07 and insert INFILE and INPUT statements to read the comma-delimited raw data
file named the following:
Windows        "&path\donation.csv"
UNIX           "&path/donation.csv"
z/OS (OS/390)  "&path..rawdata(donation)"
b. There might be missing data in the middle or at the end of a record. Read the following fields
from the raw data file:
Name         Type
Employee ID  Numeric
Quarter 1    Numeric
Quarter 2    Numeric
Quarter 3    Numeric
Quarter 4    Numeric
c. Write a PROC PRINT step to generate the report below. The results should include 124
observations.
Obs  EmpID   Q1  Q2  Q3  Q4
1    120265   .   .   .  25
2    120267  15  15  15  15
3    120269  20  20  20  20
4    120270  20  10   5   .
5    120271  20  20  20  20
Level 2
8. Reading a Delimited File with Missing Values
a. Write a DATA step to create a temporary data set, prices, using the asterisk-delimited raw data
file named the following:
Windows        "&path\prices.dat"
UNIX           "&path/prices.dat"
z/OS (OS/390)  "&path..rawdata(prices)"
There might be missing data at the end of some records. Read the following fields from the raw
data file:
Name            Type     Length
ProductID       Numeric
StartDate       Numeric
EndDate         Numeric
UnitCostPrice   Numeric
UnitSalesPrice  Numeric
b. Define labels and formats in the DATA step to create a data set that generates the following output
when used in the PROC PRINT step. The results should contain 259 observations.
Partial PROC PRINT Output
2007 Prices
                   Start of    End of      Cost Price  Sales Price
Obs  Product ID    Date Range  Date Range  per Unit    per Unit
1    210200100009  06/09/2007  12/31/9999       15.50        34.70
2    210200100017  01/24/2007  12/31/9999       17.80            .
3    210200200023  07/04/2007  12/31/9999        8.25        19.80
4    210200600067  10/27/2007  12/31/9999       28.90            .
5    210200600085  08/28/2007  12/31/9999       17.85        39.40
Challenge
9. Reading a Delimited File with Missing Values and Embedded Delimiters
a. Write a DATA step to create a temporary data set, salesmgmt, using the raw data file named the
following:
Windows        "&path\managers.dat"
UNIX           "&path/managers.dat"
z/OS (OS/390)  "&path..rawdata(managers)"
b. ID is a numeric value. The salesmgmt data set should contain only the variables shown in the
report below.
c. Write a PROC PRINT step to generate the report below. The results should contain six
observations.
PROC PRINT Output
Orion Star Managers
Obs  ID      Last         Title                 HireDate   Salary
1    120102  Zhou         Sales Manager         01JUN1989       .
2    120103  Dawes        Sales Manager         01JAN1974   87975
3    120261  Highpoint                          01AUG1987  243190
4    121143  Favaron      Senior Sales Manager  01JUL1997   95090
5    121144  Capachietti  Sales Manager                 .   83505
6    121145  Lansberry    Sales Manager         01APR1976   84260
8.5 Solutions
Solutions to Exercises
1. Reading a Comma-Delimited Raw Data File
data work.newemployees;
   length First $ 12 Last $ 18 Title $ 25;
   infile "&path\newemps.csv" dlm=',';
   input First $ Last $ Title $ Salary;
run;
proc print data=work.newemployees;
run;
2. Reading a Space-Delimited Raw Data File
data work.qtrdonation;
   length IDNum $ 6;
   /* DLM= is omitted because a space is the default delimiter. */
   infile "&path\donation.dat";
   input IDNum $ Qtr1 Qtr2 Qtr3 Qtr4;
run;
proc print data=work.qtrdonation;
run;
3. Reading a Tab-Delimited Raw Data File
data work.managers2;
   length First Last $ 12 Title $ 25;
   /* '09'x is the hexadecimal code for the tab character in ASCII. */
   infile "&path\managers2.dat" dlm='09'x;
   input ID First $ Last $ Gender $ Salary Title $;
   keep First Last Title;
run;
proc print data=work.managers2;
run;
4. Reading Nonstandard Data from a Comma-Delimited Raw Data File
data work.canada_customers;
   length First Last $ 20 Gender $ 1 AgeGroup $ 12;
   infile "&path\custca.csv" dlm=',';
   input First $ Last $ ID Gender $
         BirthDate :ddmmyy. Age AgeGroup $;
   format BirthDate monyy7.;
   drop ID Age;
run;
title 'Canadian Customers';
proc print data=work.canada_customers;
run;
title;
83505
84260
MMDDYY10.
Name               Phone           Mobile
James Kvarniq      (704) 293-8126  (701) 281-8923
Sandrina Stephano  (919) 871-7830  Cornelia Krahl
Karen Ballinger    (714) 344-4321  Elke Wallstab
Business Scenario
Orion Star management plans to give a $500 bonus to
each employee in his or her hire month.
Considerations
Create a new data set with three new variables:
   Bonus, which is a constant 500
   Compensation, which is the sum of Salary and Bonus
   BonusMonth, which is the month in which the employee was hired
[Diagram: orion.sales is read to create work.comp, which adds Bonus, Compensation, and BonusMonth.]
Considerations
Partial orion.sales
Employee_ID  First_Name  Last_Name  Gender  Salary  Job_Title  Country  Birth_Date  Hire_Date
120102       Tom         Zhou                                  AU             3510      10744
120103       Wilson      Dawes                                 AU            -3996       5114
120121       Irenie      Elvish                                AU            -5630       5114

Partial work.comp
Employee_ID  First_Name  Last_Name  Bonus  Compensation  BonusMonth
120102       Tom         Zhou         500        108755
120103       Wilson      Dawes        500         88475
120121       Irenie      Elvish       500         27100
SAS Functions
SAS functions can be used in an assignment statement. A
function is a routine that accepts arguments and returns a
value.
variable=function-name(argument1, argument2, ...);
Some functions manipulate character values, compute
descriptive statistics, or manipulate SAS date values.
Arguments are enclosed in parentheses and separated
by commas.
A function can return a numeric or character result.
SUM Function
Use the SUM function to create Compensation. The
SUM function is a descriptive statistics function that
returns the sum of its arguments.
Compensation=sum(Salary,Bonus);
SUM(argument1,argument2, ...)
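One reason to prefer the SUM function over the + operator is that SUM ignores missing arguments, whereas an arithmetic expression returns a missing value if any operand is missing. The sketch below illustrates the difference with literal values. The names WithSum and WithPlus are hypothetical.

```sas
/* Sketch: compare SUM with the + operator when Salary is missing. */
data _null_;
   Salary = .;
   Bonus  = 500;
   WithSum  = sum(Salary, Bonus);   /* missing argument is ignored: 500 */
   WithPlus = Salary + Bonus;       /* missing operand gives missing: . */
   putlog WithSum= WithPlus=;
run;
```

The second assignment also writes a note to the log saying that missing values were generated.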
MONTH Function
Use the MONTH function to extract the month of hire from
Hire_Date.
BonusMonth=month(Hire_Date);
MONTH(SAS-date)
Other date functions can do the following:
extract information from SAS date values
create SAS date values
Description
YEAR(SAS-date)
QTR(SAS-date)
MONTH(SAS-date)
DAY(SAS-date)
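The four extraction functions can be compared on a single date. This is a minimal sketch that uses a date constant instead of a data set. The variable names are hypothetical.

```sas
/* Sketch: pull the year, quarter, month, and day out of one SAS date. */
data _null_;
   BirthDate = '11AUG1973'd;
   Yr  = year(BirthDate);    /* 1973 */
   Q   = qtr(BirthDate);     /* 3    */
   Mon = month(BirthDate);   /* 8    */
   D   = day(BirthDate);     /* 11   */
   putlog Yr= Q= Mon= D=;
run;
```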
Description
TODAY()
DATE()
MDY(month,day,year)
Examples
CurrentDate=today();
y2k=mdy(01,1,2000);
NewYear=mdy(Mon,Day,2013);
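The functions that create SAS date values return ordinary numbers, so the results can be used in arithmetic. The sketch below is an illustration only. DaysSince is a hypothetical variable, and its value depends on the day the program runs.

```sas
/* Sketch: build a date with MDY, get the current date with TODAY, */
/* and subtract the two to count the days between them.            */
data _null_;
   y2k = mdy(1, 1, 2000);
   CurrentDate = today();
   DaysSince = CurrentDate - y2k;
   putlog y2k= y2k= date9.;            /* stored value, then the same value as a date */
   putlog CurrentDate= date9. DaysSince=;
run;
```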
BonusMonth=month(Hire_Date);
AnnivBonus=mdy(BonusMonth,15,2008);
A function call can be part of any SAS expression.
if month(Hire_Date)=12;
A function call can be an argument to another function.
AnnivBonus=mdy(month(Hire_Date),15,2012);
data work.comp;
set orion.sales;
Bonus=500;
Compensation=sum(Salary,Bonus);
BonusMonth=month(Hire_Date);
run;
175  data work.comp;
176     set orion.sales;
177     Bonus=500;
178     Compensation=sum(Salary,Bonus);
179     BonusMonth=month(Hire_Date);
180  run;
NOTE: There were 165 observations read from the data set ORION.SALES.
NOTE: The data set WORK.COMP has 165 observations and 12 variables.
orion.sales has nine variables.
First_Name  Last_Name  Bonus  Compensation  BonusMonth
Tom         Zhou         500        108755           6
Wilson      Dawes        500         88475           1
Irenie      Elvish       500         27100           1
Christina   Ngan         500         27975           7
Kimiko      Hotstone     500         26690          10
p109d01
9.02 Quiz
A DROP statement has been added to this DATA step.
Will the program calculate Compensation and
BonusMonth correctly?
data work.comp;
set orion.sales;
drop Gender Salary Job_Title Country
Birth_Date Hire_Date;
Bonus=500;
Compensation=sum(Salary,Bonus);
BonusMonth=month(Hire_Date);
run;
p109a01
First_Name  Last_Name  Bonus  Compensation  BonusMonth
Tom         Zhou         500        108755           6
Wilson      Dawes        500         88475           1
Irenie      Elvish       500         27100           1
Christina   Ngan         500         27975           7
Kimiko      Hotstone     500         26690          10
p109a01
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Creating New Variables
a. Retrieve the starter program p109e01.
b. In the DATA step, create three new variables:
Increase, which is Salary multiplied by 0.10
NewSalary, which is Salary added to Increase
BdayQtr, which is the quarter in which the employee was born
c. The new data set should include only Employee_ID, Salary, Birth_Date, and the three new
variables.
d. Store permanent formats to display Salary, Increase, and NewSalary with commas.
e. Modify the program to create the report below, including labels. The results should contain 424
observations.
                  Employee
     Employee     Annual    Employee                            Bday
Obs  ID           Salary    Birth Date  Increase  NewSalary     Qtr
1    120101      163,040    18AUG1980     16,304    179,344       3
2    120102      108,255    11AUG1973     10,826    119,081       3
3    120103       87,975    22JAN1953      8,798     96,773       1
4    120104       46,230    11MAY1958      4,623     50,853       2
5    120105       27,110    21DEC1978      2,711     29,821       4
Level 2
2. Creating New Variables
a. Write a DATA step that reads orion.customer to create work.birthday.
b. In the DATA step, create three new variables: Bday2012, BdayDOW2012, and Age2012.
Bday2012 is the combination of the month of Birth_Date, the day of Birth_Date, and the
constant of 2012 in the MDY function.
BdayDOW2012 is the day of the week of Bday2012.
Age2012 is the age of the customer in 2012. Subtract Birth_Date from Bday2012 and divide
the result by 365.25.
c. Include only the following variables in the new data set: Customer_Name, Birth_Date,
Bday2012, BdayDOW2012, and Age2012.
d. Format Bday2012 to display in the form 01Jan2012. Age2012 should be formatted to display
with no decimal places.
e. Write a PROC PRINT step to create the report below. The results should contain 77 observations.
Partial PROC PRINT Output
Obs  Customer_Name      Birth_Date  Bday2012   BdayDOW2012  Age2012
1    James Kvarniq      27JUN1978   27JUN2012            4       34
2    Sandrina Stephano  09JUL1983   09JUL2012            2       29
3    Cornelia Krahl     27FEB1978   27FEB2012            2       34
4    Karen Ballinger    18OCT1988   18OCT2012            5       24
5    Elke Wallstab      16AUG1978   16AUG2012            5       34
Challenge
3. Using the CATX and INTCK Functions to Create Variables
a. Write a DATA step that reads orion.sales to create work.employees.
In the DATA step, create a new variable, FullName, which is the combination of First_Name, a
space, and Last_Name. Use the CATX function. Documentation on CATX can be found in the
SAS Help facility or in the online documentation.
In the DATA step, create a new variable, Yrs2012, which is the number of years between
January 1, 2012, and Hire_Date. Use the INTCK function. Documentation on INTCK
can be found in the SAS Help facility or in the online documentation.
b. Format Hire_Date to display in the form 01/31/2012.
c. Give Yrs2012 a label of Years of Employment in 2012.
d. Create the report shown below. The results should contain 165 observations.
Partial PROC PRINT Output
                                  Years of
                                  Employment
Obs  FullName         Hire_Date   in 2012
1    Tom Zhou         01/06/1993          19
2    Wilson Dawes     01/01/1978          34
3    Irenie Elvish    01/01/1978          34
4    Christina Ngan   01/07/1982          30
5    Kimiko Hotstone  01/10/1989          23
Business Scenario
Orion Star management plans to give each sales
employee a bonus based on his or her job title.
Considerations
Create a new data set, work.comp, using orion.sales as
input. Include a new variable, Bonus, with a value based
on Job_Title.
Job_Title             Bonus
Sales Rep. IV         1000
Sales Manager         1500
Senior Sales Manager  2000
Chief Sales Officer   2500
IF-THEN Statements
The IF-THEN statement executes a SAS statement for
observations that meet a specific condition.
data work.comp;
set orion.sales;
if Job_Title='Sales Rep. IV' then
Bonus=1000;
...
run;
IF expression THEN statement;
Conditional Processing
The value assigned to Bonus is determined by testing for
various values of Job_Title.
[Flowchart: Job_Title is compared with each title in turn. Yes assigns the matching Bonus (1000 for Sales Rep. IV, 1500 for Sales Manager, then 2000 and 2500). No moves on to the next comparison.]
Conditional Processing
The first observation moves through the four IF-THEN statements as follows (p109d02).

PDV at the start of the iteration
Employee_ID  Last_Name  Job_Title      Bonus
120102       Zhou       Sales Manager  .

Step                      Result  Bonus
First IF-THEN statement   false   .
Second IF-THEN statement  true    1500
Third IF-THEN statement   false   1500
Fourth IF-THEN statement  false   1500

At the end of the DATA step:
   Implicit OUTPUT;
   Implicit RETURN;
Continue until EOF
Obs  Last_Name   Job_Title       Bonus
1    Zhou        Sales Manager    1500
2    Dawes       Sales Manager    1500
3    Elvish      Sales Rep. II       .
4    Ngan        Sales Rep. II       .
5    Hotstone    Sales Rep. I        .
6    Daymond     Sales Rep. I        .
7    Hofmeister  Sales Rep. IV    1000
8    Denny       Sales Rep. II       .
9    Clarkson    Sales Rep. II       .
10   Kletschkus  Sales Rep. IV    1000
11   Roebuck     Sales Rep. III      .
12   Lyon        Sales Rep. I        .
p109d02
Conditional Processing
When an expression is true, the associated statement is
executed and subsequent ELSE statements are skipped.
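The behavior described here can be tried on a few in-stream records. This is a minimal sketch under stated assumptions: the data set name work.comp_demo and the three input records are illustrative, and the final ELSE statement, which assigns 0 to every other job title, is an assumption that is not part of the course program.

```sas
/* Sketch: once a condition is true, the remaining ELSE statements are skipped. */
data work.comp_demo;
   infile datalines dlm=',';
   length Last_Name $ 18 Job_Title $ 25;
   input Last_Name $ Job_Title $;
   if Job_Title='Sales Rep. IV' then Bonus=1000;
   else if Job_Title='Sales Manager' then Bonus=1500;
   else if Job_Title='Senior Sales Manager' then Bonus=2000;
   else if Job_Title='Chief Sales Officer' then Bonus=2500;
   else Bonus=0;   /* assumption: no bonus for any other title */
   datalines;
Zhou,Sales Manager
Hofmeister,Sales Rep. IV
Elvish,Sales Rep. II
;
run;

proc print data=work.comp_demo;
run;
```

For Zhou, the second condition is true, so the last three statements are never evaluated.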
IF-THEN Statements
data work.comp;
   set orion.sales;
   if Job_Title='Sales Rep. IV' then
      Bonus=1000;
   else if Job_Title='Sales Manager' then
      Bonus=1500;
   else if Job_Title='Senior Sales Manager'
      then Bonus=2000;
   else if Job_Title='Chief Sales Officer'
      then Bonus=2500;
run;
p109d03
PDV
Employee_ID  Last_Name  Job_Title      Bonus
120102       Zhou       Sales Manager  .
IF-THEN Statements
Execution trace of the same DATA step for the first observation (Employee_ID 120102, Zhou, Sales Manager):
1. Job_Title='Sales Rep. IV' is false. Bonus is still missing (.).
2. Job_Title='Sales Manager' is true.
3. Bonus=1500 executes. The remaining ELSE IF statements are skipped.
4. Implicit OUTPUT; Implicit RETURN;
5. Continue until EOF.
PDV after step 3
Employee_ID  Last_Name  Job_Title      Bonus
120102       Zhou       Sales Manager  1500
Obs  Last_Name   Job_Title      Bonus
  1  Zhou        Sales Manager   1500
  2  Dawes       Sales Manager   1500
  3  Elvish      Sales Rep. II      .
  4  Ngan        Sales Rep. II      .
  5  Hotstone    Sales Rep. I       .
  6  Daymond     Sales Rep. I       .
  7  Hofmeister  Sales Rep. IV   1000
  8  Denny       Sales Rep. II      .
  9  Clarkson    Sales Rep. II      .
 10  Kletschkus  Sales Rep. IV   1000
p109d03
Job_Title
Bonus
1000
Sales Rep. IV
1000
Sales Manager
1500
2000
2500
500
Conditional Processing
An optional final ELSE statement gives an alternative
action if none of the conditions are true.
Sales Rep. III or IV?   Yes: Bonus=1000   No: test the next condition
Sales Manager?          Yes: Bonus=1500   No: test the next condition
Senior Sales Manager?   Yes: Bonus=2000   No: test the next condition
Chief Sales Manager?    Yes: Bonus=2500   No: Bonus=500
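The flow above could be written roughly as follows. This is a sketch, not the course program: it assumes the same orion.sales input as the earlier IF-THEN/ELSE example, uses the IN operator for the first test, and takes the title 'Chief Sales Officer' from the earlier program rather than from the chart label.

```sas
data work.comp;
   set orion.sales;
   /* Each condition is tested only if all previous ones were false */
   if Job_Title in ('Sales Rep. III','Sales Rep. IV') then Bonus=1000;
   else if Job_Title='Sales Manager' then Bonus=1500;
   else if Job_Title='Senior Sales Manager' then Bonus=2000;
   else if Job_Title='Chief Sales Officer' then Bonus=2500;
   /* Final ELSE without a condition: every other Job_Title */
   else Bonus=500;
run;
```

Because the last ELSE has no condition, no observation is left with a missing Bonus.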
Obs  Last_Name   Job_Title      Bonus
  1  Zhou        Sales Manager   1500
  2  Dawes       Sales Manager   1500
  3  Elvish      Sales Rep. II    500
  4  Ngan        Sales Rep. II    500
  5  Hotstone    Sales Rep. I     500
  6  Daymond     Sales Rep. I     500
  7  Hofmeister  Sales Rep. IV   1000
  8  Denny       Sales Rep. II    500
  9  Clarkson    Sales Rep. II    500
 10  Kletschkus  Sales Rep. IV   1000
p109d04
Business Scenario
Orion Star managers are considering a country-based
bonus. Create a new SAS data set named work.bonus
using orion.sales as input. The value of the new variable,
Bonus, is based on Country.
Bonus amounts: $500 and $300, depending on Country.
IF-THEN/ELSE Statements
If orion.sales has been validated and only includes the
Country values US and AU, the conditional clause can be
omitted from the ELSE statement.
data work.bonus;
   set orion.sales;
   if Country='US' then Bonus=500;
   else Bonus=300;
run;
p109d05
This technique should be used only when you know that the final ELSE statement must be executed for
all other observations.
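One way to check that assumption before relying on an unconditional ELSE is to list the distinct values of Country. The step below is a sketch that assumes the orion library is already assigned; PROC FREQ with the MISSING option is used here only as a convenient validation tool.

```sas
/* List every distinct Country value, including missing values, */
/* to confirm that only US and AU occur in the input data.      */
proc freq data=orion.sales;
   tables Country / missing;
run;
```

If the table shows any value other than US and AU (for example a lowercase variant), the final ELSE would assign 300 to those observations as well.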
Obs  First_Name  Last_Name  Country  Bonus
 60  Billy       Plested    AU         300
 61  Matsuoka    Wills      AU         300
 62  Vino        George     AU         300
 63  Meera       Body       AU         300
 64  Harry       Highpoint  US         500
 65  Julienne    Magolan    US         500
 66  Scott       Desanctis  US         500
 67  Cherda      Ridley     US         500
 68  Priscilla   Farren     US         500
 69  Robert      Stevens    US         500
p109d05
9.04 Quiz
Program p109a02 reads orion.nonsales, a non-validated
data set. Open and submit the program and review the
results. Why is Bonus set to 300 in observations 125,
197, and 200?
data work.bonus;
   set orion.nonsales;
   if Country='US' then Bonus=500;
   else Bonus=300;
run;
p109a02s
data work.bonus;
   set orion.nonsales;
   Country=upcase(Country);
   if Country='US' then Bonus=500;
   else Bonus=300;
run;
It is a best practice to clean the data at the source,
but in some cases, that is not possible. With this
method, you are creating a clean data set.
p109d06
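When the input cannot be fully trusted, a more defensive variant tests each expected value explicitly and reports anything else. The sketch below is an illustration only: the PUTLOG message text is made up, and it assumes orion.nonsales contains Employee_ID, as the challenge exercise in this chapter indicates.

```sas
data work.bonus;
   set orion.nonsales;
   Country=upcase(Country);            /* standardize case first */
   if Country='US' then Bonus=500;
   else if Country='AU' then Bonus=300;
   /* Anything else is written to the log and Bonus stays missing */
   else putlog 'NOTE: Unexpected Country value: ' Employee_ID= Country=;
run;
```

With this layout an unexpected value never silently receives the AU bonus.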
Business Scenario
Orion Star employees will receive a bonus once or twice a
year. In addition to Bonus, add a new variable, Freq, that
is equal to the following:
   Once a Year for United States employees
   Twice a Year for Australian employees
United States: $500, Once a Year
Australia: $300, Twice a Year
IF-THEN/ELSE Statements
Only one executable statement is allowed in IF-THEN
and ELSE statements.
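if Country='US' then Bonus=500;
Freq='Once a Year';
Here only Bonus=500 belongs to the IF-THEN statement. The Freq assignment is a separate statement and executes for every observation.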
DO Group
Multiple statements are permitted in a DO group.
data work.bonus;
   set orion.sales;
   if Country='US' then do;            /* DO group */
      Bonus=500;
      Freq='Once a Year';
   end;
   else if Country='AU' then do;       /* DO group */
      Bonus=300;
      Freq='Twice a Year';
   end;
run;
Each DO group ends with an END statement.
p109d07
Obs  First_Name  Last_Name  Country  Bonus  Freq
 60  Billy       Plested    AU         300  Twice a Yea
 61  Matsuoka    Wills      AU         300  Twice a Yea
 62  Vino        George     AU         300  Twice a Yea
 63  Meera       Body       AU         300  Twice a Yea
 64  Harry       Highpoint  US         500  Once a Year
 65  Julienne    Magolan    US         500  Once a Year
 66  Scott       Desanctis  US         500  Once a Year
 67  Cherda      Ridley     US         500  Once a Year
 68  Priscilla   Farren     US         500  Once a Year
 69  Robert      Stevens    US         500  Once a Year
Note the truncation of Freq for the AU observations.
p109d07
Compilation
data work.bonus;
   set orion.sales;
   if Country='US' then do;
      Bonus=500;
      Freq='Once a Year';
   end;
   else if Country='AU' then do;
      Bonus=300;
      Freq='Twice a Year';
   end;
run;
p109d07
The PDV is built in the order in which variables are encountered:
1. set orion.sales;         Employee_ID N 8, First_Name $ 12, ..., Hire_Date N 8
2. Bonus=500;               Bonus N 8 is added
3. Freq='Once a Year';      11 characters, so Freq $ 11 is added
4. Freq='Twice a Year';     12 characters, but the length does not change (Freq stays $ 11)
9.05 Quiz
How would you prevent Freq from being truncated?
data work.bonus;
   set orion.sales;
   length Freq $ 12;
   if Country='US' then do;
      Bonus=500;
      Freq='Once a Year';
   end;
   else if Country='AU' then do;
      Bonus=300;
      Freq='Twice a Year';
   end;
run;
p109d08
LENGTH variable(s) <$> length;
It is a good practice to use a LENGTH statement any time you create a new character variable.
The LENGTH statement is usually placed at or near the top of the DATA step.
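To confirm that the declared length took effect, the attributes of the new data set can be inspected. The sketch below repeats the program above and adds a PROC CONTENTS step; using PROC CONTENTS for this check is a suggestion, not part of the course program.

```sas
data work.bonus;
   set orion.sales;
   length Freq $ 12;        /* must appear before Freq is first assigned */
   if Country='US' then do;
      Bonus=500;
      Freq='Once a Year';
   end;
   else if Country='AU' then do;
      Bonus=300;
      Freq='Twice a Year';
   end;
run;

/* The variable list should report Freq as character with length 12 */
proc contents data=work.bonus;
run;
```

If the LENGTH statement were placed after the first assignment to Freq, the length would already be fixed at 11 by that assignment.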
Compilation
data work.bonus;
   set orion.sales;
   length Freq $ 12;
   if Country='US' then do;
      Bonus=500;
      Freq='Once a Year';
   end;
   else if Country='AU' then do;
      Bonus=300;
      Freq='Twice a Year';
   end;
run;
p109d08
The PDV is built in the order in which variables are encountered:
1. set orion.sales;         Employee_ID N 8, First_Name $ 12, ..., Hire_Date N 8
2. length Freq $ 12;        Freq $ 12 is added
3. Bonus=500;               Bonus N 8 is added
4. Assignments to Freq:     the length does not change (Freq stays $ 12)
Obs  First_Name  Last_Name  Country  Bonus  Freq
 60  Billy       Plested    AU         300  Twice a Year
 61  Matsuoka    Wills      AU         300  Twice a Year
 62  Vino        George     AU         300  Twice a Year
 63  Meera       Body       AU         300  Twice a Year
 64  Harry       Highpoint  US         500  Once a Year
 65  Julienne    Magolan    US         500  Once a Year
 66  Scott       Desanctis  US         500  Once a Year
 67  Cherda      Ridley     US         500  Once a Year
 68  Priscilla   Farren     US         500  Once a Year
 69  Robert      Stevens    US         500  Once a Year
p109d08
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
4. Using Conditional Processing
a. Retrieve the starter program p109e04.
b. In the DATA step, create a new variable, Method, and assign a value based on Order_Type.
If Order_Type is equal to 1 then Method equals Retail.
If Order_Type is equal to 2 then Method equals Catalog.
If Order_Type is equal to 3 then Method equals Internet.
For any other values of Order_Type, Method equals Unknown.
c. Modify the PROC PRINT step to display the report below. The results should contain 490
observations.
Obs  Order_ID    Order_Type  Method
  1  1230058123           1  Retail
  2  1230080101           2  Catalog
  3  1230106883           2  Catalog
  4  1230147441           1  Retail
  5  1230315085           1  Retail
Supplier_Name                Country  Region             Discount  DiscountType
Scandinavian Clothing A/S    NO       Not North America      0.05  Optional
Petterson AB                 SE       Not North America      0.05  Optional
Prime Sports Ltd             GB       Not North America      0.05  Optional
Top Sports                   DK       Not North America      0.05  Optional
AllSeasons Outdoor Clothing  US       North America          0.10  Required
Level 2
6. Creating Multiple Variables in Conditional Processing
a. Write a DATA step that reads orion.customer_dim to create work.season.
b. Create two new variables: Promo and Promo2.
The value of Promo is based on the quarter in which the customer was born.
If the customer was born in the first quarter, then Promo is equal to Winter.
If the customer was born in the second quarter, then Promo is equal to Spring.
If the customer was born in the third quarter, then Promo is equal to Summer.
If the customer was born in the fourth quarter, then Promo is equal to Fall.
The value of Promo2 is based on the customer's age:
For young adults, whose age is between 18 and 25, set Promo2 equal to YA.
For seniors, aged 65 or older, set Promo2 equal to Senior.
Promo2 should have a missing value for all other customers.
c. The new data set should include only Customer_FirstName, Customer_LastName,
Customer_BirthDate, Customer_Age, Promo, and Promo2.
d. Create the report below. The results should include 77 observations.
Partial PROC PRINT Output
Obs  Customer_FirstName  Customer_LastName  Customer_BirthDate  Promo   Customer_Age  Promo2
  1  James               Kvarniq            27JUN1978           Spring            33
  2  Sandrina            Stephano           09JUL1983           Summer            28
  3  Cornelia            Krahl              27FEB1978           Winter            33
  4  Karen               Ballinger          18OCT1988           Fall              23  YA
  5  Elke                Wallstab           16AUG1978           Summer            33
  6  David               Black              12APR1973           Spring            38
  7  Markus              Sepke              21JUL1992           Summer            19  YA
  8  Ulrich              Heyde              16JAN1943           Winter            68  Senior
Obs  Order_ID    Order_Date  Delivery_Date  Type          SaleAds  DayOfWeek
  1  1230058123  11JAN2007   11JAN2007      Retail Sale                    5
  2  1230080101  15JAN2007   19JAN2007      Catalog Sale  Mail             2
  3  1230106883  20JAN2007   22JAN2007      Catalog Sale  Mail             7
  4  1230147441  28JAN2007   28JAN2007      Retail Sale                    1
  5  1230315085  27FEB2007   27FEB2007      Retail Sale                    3
Challenge
8. Using WHEN Statements in a SELECT Group to Create Variables Conditionally
a. Write a DATA step that reads orion.nonsales to create work.gifts.
b. Create two new variables, Gift1 and Gift2, using a SELECT group with WHEN statements.
Documentation about the SELECT group with WHEN statements can be found in the SAS Help
facility or in the online documentation.
If Gender is equal to F,
Gift1 is equal to Scarf
Gift2 is equal to Pedometer.
If Gender is equal to M,
Gift1 is equal to Gloves
Gift2 is equal to Money Clip.
If Gender is not equal to F or M,
Gift1 is equal to Coffee
Gift2 is equal to Calendar.
c. The new data set should include only Employee_ID, First, Last, Gender, Gift1, and Gift2.
d. Create the report below. The results should contain 235 observations.
Partial PROC PRINT Output
Employee_ID  First    Last        Gender  Gift1   Gift2
120101       Patrick  Lu          M       Gloves  Money Clip
120104       Kareen   Billington  F       Scarf   Pedometer
120105       Liz      Povey       F       Scarf   Pedometer
120106       John     Hornsey     M       Gloves  Money Clip
120107       Sherie   Sheedy      F       Scarf   Pedometer
9.3 Solutions
Solutions to Exercises
1. Creating New Variables
data work.increase;
set orion.staff;
Increase=Salary*0.10;
NewSalary=sum(Salary,Increase);
/* alternate statement is */
/* NewSalary=Salary+Increase; */
BdayQtr=qtr(Birth_Date);
keep Employee_ID Birth_Date Salary Increase NewSalary BdayQtr;
Discount=0.10;
DiscountType='Required';
Region='North America';
end;
else do;
Discount=0.05;
DiscountType='Optional';
Region='Not North America';
end;
keep Supplier_Name Country
Discount DiscountType Region;
run;
proc print data=work.region;
run;
6. Creating Multiple Variables in Conditional Processing
data work.season;
   set orion.customer_dim;
   length Promo2 $ 6;
   Quarter=qtr(Customer_BirthDate);
   if Quarter=1 then Promo='Winter';
   else if Quarter=2 then Promo='Spring';
   else if Quarter=3 then Promo='Summer';
   else if Quarter=4 then Promo='Fall';
   if Customer_Age>=18 and Customer_Age<=25 then Promo2='YA';
   else if Customer_Age>=65 then Promo2='Senior';
   keep Customer_FirstName Customer_LastName Customer_BirthDate
        Customer_Age Promo Promo2;
run;
proc print data=work.season;
   var Customer_FirstName Customer_LastName Customer_BirthDate Promo
       Customer_Age Promo2;
run;
7. Creating Variables Unconditionally and Conditionally
data work.ordertype;
set orion.orders;
length Type $ 13 SaleAds $ 5;
DayOfWeek=weekday(Order_Date);
if Order_Type=1 then
Type='Retail Sale';
else if Order_Type=2 then do;
Type='Catalog Sale';
SaleAds='Mail';
end;
else if Order_Type=3 then do;
Type='Internet Sale';
SaleAds='Email';
end;
Partial work.comp
Bonus
Month
Bonus
Compensation
500
108755
500
88475
500
27100
Business Scenario
You have been asked to combine the data sets containing
information about Orion Star employees from Denmark
and France into a new data set.
Input data sets: empsdk, empsfr
Output data set: empsall1
Considerations
Concatenate like-structured data sets empsdk and
empsfr to create a new data set named empsall1.
empsdk
First   Gender  Country
Lars    M       Denmark
Kari    F       Denmark
Jonas   M       Denmark

empsfr
First   Gender  Country
Pierre  M       France
Sophie  F       France

empsall1
First   Gender  Country
Lars    M       Denmark
Kari    F       Denmark
Jonas   M       Denmark
Pierre  M       France
Sophie  F       France

data empsall1;
   set empsdk empsfr;
run;
p110d01
Compilation
data empsall1;
   set empsdk empsfr;
run;
p110d01
During compilation, SAS builds the PDV and the descriptor portion of empsall1 from the variables in the input data sets: First, Gender, Country.

Execution
The PDV is initialized, and empsdk is read one observation at a time:

Iteration  PDV (First, Gender, Country)  Action at the bottom of the DATA step
1          Lars, M, Denmark              Implicit OUTPUT; Implicit RETURN;
2          Kari, F, Denmark              Implicit OUTPUT; Implicit RETURN;
3          Jonas, M, Denmark             Implicit OUTPUT; Implicit RETURN;

EOF is reached for empsdk. At this point empsall1 contains:
First   Gender  Country
Lars    M       Denmark
Kari    F       Denmark
Jonas   M       Denmark
When concatenating data sets, SAS reinitializes the entire PDV before it begins reading the next data set
listed in the SET statement.
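The walkthrough can be reproduced without the course data. The sketch below types the slide values into two temporary data sets with DATALINES (an assumption made so the example is self-contained; the course reads existing data sets) and then concatenates them.

```sas
/* Build the two small input data sets shown on the slides */
data empsdk;
   input First $ Gender $ Country $;
   datalines;
Lars M Denmark
Kari F Denmark
Jonas M Denmark
;
run;

data empsfr;
   input First $ Gender $ Country $;
   datalines;
Pierre M France
Sophie F France
;
run;

/* Concatenate: all observations from empsdk, then all from empsfr */
data empsall1;
   set empsdk empsfr;
run;

proc print data=empsall1;
run;
```

The log for the last DATA step should report 3 observations read from empsdk and 2 from empsfr.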
Execution (continued)
SAS continues with empsfr, the next data set in the SET statement:

Iteration  PDV (First, Gender, Country)  Action at the bottom of the DATA step
4          Pierre, M, France             Implicit OUTPUT; Implicit RETURN;
5          Sophie, F, France             Implicit OUTPUT; Implicit RETURN;

EOF is reached for empsfr. empsall1 now contains:
First   Gender  Country
Lars    M       Denmark
Kari    F       Denmark
Jonas   M       Denmark
Pierre  M       France
Sophie  F       France
data empsall1;
   set empsdk empsfr;
run;
NOTE: There were 3 observations read from the data set WORK.EMPSDK.
NOTE: There were 2 observations read from the data set WORK.EMPSFR.
NOTE: The data set WORK.EMPSALL1 has 5 observations and 3 variables.
10.01 Quiz
How many variables will be in empsall2 after concatenating empscn and empsjp?

empscn
First  Gender  Country
Chang  M       China
Li     M       China
Ming   F       China

empsjp
First  Gender  Region
Cho    F       Japan
Tomi   M       Japan

data empsall2;
   set empscn empsjp;
run;
p110d02
Compilation
data empsall2;
   set empscn empsjp;
run;
p110d02
1. SAS reads the descriptor portion of empscn and adds First, Gender, and Country to the PDV.
2. SAS reads the descriptor portion of empsjp. First and Gender are already in the PDV, so only Region is added.
Final Results
empsall2
First  Gender  Country  Region
Chang  M       China
Li     M       China
Ming   F       China
Cho    F                Japan
Tomi   M                Japan
Business Scenario
Rename variables in one or more data sets to align
columns.
empscn
First  Gender  Country
Chang  M       China
Li     M       China
Ming   F       China

empsjp
First  Gender  Region
Cho    F       Japan
Tomi   M       Japan
10.02 Quiz
Which statement has correct syntax?
a. set empscn(rename(Country=Location))
       empsjp(rename(Region=Location));
b. set empscn(rename=(Country=Location))
       empsjp(rename=(Region=Location));
c. set empscn rename=(Country=Location)
       empsjp rename=(Region=Location);
Compilation
data empsall2;
   set empscn empsjp(rename=(Region=Country));
run;
p110d03
1. SAS reads the descriptor portion of empscn and adds First, Gender, and Country to the PDV.
2. SAS reads the descriptor portion of empsjp. Region is renamed to Country as it is read, so no new variable is added. The PDV contains First, Gender, and Country.
Final Results
The Region values are stored in Country.
empsall2
First  Gender  Country
Chang  M       China
Li     M       China
Ming   F       China
Cho    F       Japan
Tomi   M       Japan
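The result above can be reproduced with a self-contained sketch. The DATALINES steps are an assumption added so the example runs without the course data; the final DATA step is the one shown in the Compilation slides.

```sas
data empscn;
   input First $ Gender $ Country $;
   datalines;
Chang M China
Li M China
Ming F China
;
run;

data empsjp;
   input First $ Gender $ Region $;
   datalines;
Cho F Japan
Tomi M Japan
;
run;

/* RENAME= is a data set option: it goes in parentheses right after */
/* the data set name and maps old-name=new-name.                    */
data empsall2;
   set empscn empsjp(rename=(Region=Country));
run;

proc print data=empsall2;
run;
```

Without the RENAME= option, the same step would produce four variables, with Country missing for the empsjp observations and Region missing for the empscn observations.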
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Concatenating Like-Structured Data Sets
a. Write and submit a DATA step to concatenate orion.mnth7_2011, orion.mnth8_2011, and
orion.mnth9_2011 to create a new data set, work.thirdqtr.
How many observations in work.thirdqtr are from orion.mnth7_2011? ____________
How many observations in work.thirdqtr are from orion.mnth8_2011? ____________
How many observations in work.thirdqtr are from orion.mnth9_2011? ____________
b. Write a PROC PRINT step to create the report below. The results should contain 32 observations.
Partial PROC PRINT Output
Obs  Order_ID    Order_Type  Employee_ID  Customer_ID  Order_Date  Delivery_Date
  1  1242691897           2     99999999           90  02JUL2011   04JUL2011
  2  1242736731           1       121107           10  07JUL2011   07JUL2011
  3  1242773202           3     99999999           24  11JUL2011   14JUL2011
  4  1242782701           3     99999999           27  12JUL2011   17JUL2011
  5  1242827683           1       121105           10  17JUL2011   17JUL2011
orion.nonsales
a. Add a DATA step after the PROC CONTENTS steps to concatenate orion.sales and
orion.nonsales to create a new data set, work.allemployees.
Use a RENAME= data set option to change the names of the different variables in
orion.nonsales.
The new data set should include only Employee_ID, First_Name, Last_Name, Job_Title, and
Salary.
b. Add a PROC PRINT step to create the report below. The results should contain 400 observations.
Partial PROC PRINT Output
Obs  Employee_ID  First_Name  Last_Name  Salary  Job_Title
  1  120102       Tom         Zhou       108255  Sales Manager
  2  120103       Wilson      Dawes       87975  Sales Manager
  3  120121       Irenie      Elvish      26600  Sales Rep. II
  4  120122       Christina   Ngan        27475  Sales Rep. II
  5  120123       Kimiko      Hotstone    26190  Sales Rep. I
Level 2
3. Concatenating Data Sets with Variables of Different Lengths and Types
a. Open p110e03. Submit the PROC CONTENTS steps or explore the data sets interactively to
complete the table below by filling in attribute information for each variable in each data set.
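                     Code            Company         ContactType
                     Type   Length   Type   Length   Type   Length
orion.charities      ____   ______   ____   ______   ____   ______
orion.us_suppliers   ____   ______   ____   ______   ____   ______
orion.consultants    ____   ______   ____   ______   ____   ______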
b. Write a DATA step to concatenate orion.charities and orion.us_suppliers, creating a temporary
data set, contacts.
c. Submit a PROC CONTENTS step to examine work.contacts. From which input data set were the
variable attributes assigned? ___________________________________________
d. Write a DATA step to concatenate orion.us_suppliers and orion.charities, creating a temporary
data set, contacts2. Note that these are the same data sets as the previous program, but they are in
reverse order.
e. Submit a PROC CONTENTS step to examine work.contacts2. From which input data set were
the variable attributes assigned?
_____________________________________________________________
f. Write a DATA step to concatenate orion.us_suppliers and orion.consultants, creating a
temporary data set, contacts3.
Why did the DATA step fail? ________________________________________________
Match-Merging
One-to-One
A single observation in one data set is related to exactly one observation in another data set based on the values of one or more selected variables.
(Diagram: data sets with variables A B C and C D E; values shown 1, 2, 3 and 1, 2, 3.)

One-to-Many
A single observation in one data set is related to more than one observation in another data set based on the values of one or more selected variables.
(Diagram: data sets with variables A B C and C D E; values shown 1, 2 and 1, 1, 2.)

Non-matches
At least one observation in one data set is unrelated to any observation in another data set based on the values of one or more selected variables.
(Diagram: data sets with variables A B C and C D E; values shown 1, 2, 4 and 2, 3, 4.)
Business Scenario
Merge the Australian employee data set with a phone
data set to obtain each employee's home phone number,
storing the results in a new data set.

empsau (First, Gender, EmpID) + phoneh (EmpID, Phone)
   -> empsauh (First, Gender, EmpID, Phone)
Match-Merging
The MERGE statement in a DATA step joins observations
from two or more SAS data sets into single observations.
data empsauh;
   merge empsau phoneh;
   by EmpID;
run;
MERGE SAS-data-set1 SAS-data-set2 . . .;
BY <DESCENDING> BY-variable(s);
A BY statement indicates a match-merge and lists the
variable or variables to match.
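
A match-merge relies on each input data set being in BY-variable order. The sketch below is a minimal, self-contained illustration of that requirement. It builds small stand-ins for empsau and phoneh from the values shown in these notes, sorts both by EmpID as a precaution, and then merges them. The DATALINES steps, the PROC SORT steps, and the PROC PRINT step are illustrative additions and are not taken from program p110d04.

```sas
/* Small stand-ins for the course data sets (values from the slides) */
data empsau;
   input First $ Gender $ EmpID;
   datalines;
Togar M 121150
Kylie F 121151
Birin M 121152
;
run;

data phoneh;
   input EmpID Phone :$15.;
   datalines;
121150 +61(2)5555-1793
121151 +61(2)5555-1849
121152 +61(2)5555-1665
;
run;

/* Put each input data set in EmpID order before merging */
proc sort data=empsau;
   by EmpID;
run;

proc sort data=phoneh;
   by EmpID;
run;

/* Match-merge: observations with the same EmpID are combined */
data empsauh;
   merge empsau phoneh;
   by EmpID;
run;

proc print data=empsauh noobs;
run;
```

If the data sets are already sorted by the BY variable, as they are here, the PROC SORT steps can be omitted.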
p110d04
One-to-One Merge
One observation in empsau matches exactly one
observation in phoneh.

empsau
First   Gender   EmpID
Togar   M        121150
Kylie   F        121151
Birin   M        121152

phoneh
EmpID    Phone
121150   +61(2)5555-1793
121151   +61(2)5555-1849
121152   +61(2)5555-1665

Final Results
empsauh
First   Gender   EmpID    Phone
Togar   M        121150   +61(2)5555-1793
Kylie   F        121151   +61(2)5555-1849
Birin   M        121152   +61(2)5555-1665
10.04 Quiz
Business Scenario
Merge the Australian employee information data set with
the phone data set to obtain the phone numbers for each
employee.
empsau + phones -> empphones
Considerations
In this one-to-many merge, one observation in empsau
matches one or more observations in phones.

empsau
First   Gender   EmpID
Togar   M        121150
Kylie   F        121151
Birin   M        121152

phones
EmpID    Type   Phone
121150   Home   +61(2)5555-1793
121150   Work   +61(2)5555-1794
121151   Home   +61(2)5555-1849
121152   Work   +61(2)5555-1850
121152   Home   +61(2)5555-1665
121152   Cell   +61(2)5555-1666
Match-Merging
Merge the two data sets by EmpID and create a new data
set named empphones.
data empphones;
   merge empsau phones;
   by EmpID;
run;
p110d05
Match-Merging
BY group: the observations in empsau and phones that share
the same EmpID value.

Execution

data empphones;
   merge empsau phones;
   by EmpID;
run;

 1. Initialize PDV.
       First, Gender, Type, and Phone are blank; EmpID is missing (.).
 2. Read the first observation from each data set (EmpID 121150).
       PDV: Togar   M   121150   Home   +61(2)5555-1793
 3. Implicit OUTPUT; implicit RETURN.
 4. The next phones observation is in the same BY group (EmpID 121150).
    Read it. The values read from empsau remain in the PDV.
       PDV: Togar   M   121150   Work   +61(2)5555-1794
 5. Implicit OUTPUT; implicit RETURN.
 6. The next EmpID value (121151) starts a different BY group.
    Reinitialize PDV.
 7. Read one observation from each data set.
       PDV: Kylie   F   121151   Home   +61(2)5555-1849
 8. Implicit OUTPUT; implicit RETURN.
 9. The next EmpID value (121152) starts a different BY group.
    Reinitialize PDV.
10. Read one observation from each data set.
       PDV: Birin   M   121152   Work   +61(2)5555-1850
11. Implicit OUTPUT; implicit RETURN. empsau reaches end of file (EOF).
12. The next phones observation is in the same BY group. Read it.
       PDV: Birin   M   121152   Home   +61(2)5555-1665
13. Implicit OUTPUT; implicit RETURN.
14. The last phones observation is in the same BY group. Read it.
       PDV: Birin   M   121152   Cell   +61(2)5555-1666
15. Implicit OUTPUT; implicit RETURN.
16. Both data sets are at EOF. The DATA step ends.

SAS reinitializes the entire program data vector before processing a different BY group.

p110d05
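
The execution steps above can be watched in the SAS log by adding a PUT statement to the DATA step. The following is a minimal sketch, assuming small stand-in copies of empsau and phones built from the values on the slides. The PUT statement and the DATALINES steps are illustrative additions, not part of program p110d05.

```sas
/* Stand-ins for the course data sets (values from the slides) */
data empsau;
   input First $ Gender $ EmpID;
   datalines;
Togar M 121150
Kylie F 121151
Birin M 121152
;
run;

data phones;
   input EmpID Type $ Phone :$15.;
   datalines;
121150 Home +61(2)5555-1793
121150 Work +61(2)5555-1794
121151 Home +61(2)5555-1849
121152 Work +61(2)5555-1850
121152 Home +61(2)5555-1665
121152 Cell +61(2)5555-1666
;
run;

data empphones;
   merge empsau phones;
   by EmpID;
   /* Write the PDV values to the log just before the implicit OUTPUT */
   put _N_= First= Gender= EmpID= Type= Phone=;
run;
```

Each log line corresponds to one observation written to empphones, so the values retained from empsau within a BY group are visible from one line to the next.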
Final Results
empphones
First   Gender   EmpID    Type   Phone
Togar   M        121150   Home   +61(2)5555-1793
Togar   M        121150   Work   +61(2)5555-1794
Kylie   F        121151   Home   +61(2)5555-1849
Birin   M        121152   Work   +61(2)5555-1850
Birin   M        121152   Home   +61(2)5555-1665
Birin   M        121152   Cell   +61(2)5555-1666
10.05 Quiz
In a one-to-many merge, does it matter which data set is
listed first in the MERGE statement?
Open p110a02 and submit it.
Reverse the order of the data sets and submit again.
Observe the results. How are they different?
Many-to-One Merge
One or more rows in one data set match exactly one row
in the other data set.

phones
EmpID    Type   Phone
121150   Home   +61(2)5555-1793
121150   Work   +61(2)5555-1794
121151   Home   +61(2)5555-1849
121152   Work   +61(2)5555-1850
121152   Home   +61(2)5555-1665
121152   Cell   +61(2)5555-1666

empsau
EmpID    First   Gender
121150   Togar   M
121151   Kylie   F
121152   Birin   M

data phones;
   merge phones empsau;
   by EmpID;
run;
p110d06

Obs   EmpID    Type   Phone             First   Gender
1     121150   Home   +61(2)5555-1793   Togar   M
2     121150   Work   +61(2)5555-1794   Togar   M
3     121151   Home   +61(2)5555-1849   Kylie   F
4     121152   Work   +61(2)5555-1850   Birin   M
5     121152   Home   +61(2)5555-1665   Birin   M
6     121152   Cell   +61(2)5555-1666   Birin   M
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
4. Merging Two Sorted Data Sets in a One-to-Many Merge
a. Retrieve the starter program p110e04.
b. Submit the two PROC CONTENTS steps or explore the data sets interactively to determine the
common variable among the two data sets.
c. Add a DATA step after the two PROC CONTENTS to merge orion.orders and orion.order_item
by the common variable to create a new data set, work.allorders. A sort is not required because
the data sets are already sorted by the common variable.
d. Submit the program and confirm that work.allorders was created with 732 observations
and 12 variables.
e. Add a statement to subset the variables. The new data set should contain six variables: Order_ID,
Order_Item_Num, Order_Type, Order_Date, Quantity, and Total_Retail_Price.
f. Write a PROC PRINT step to create the report below. Include only observations with a value for
Order_Date in the fourth quarter of 2011. The results should contain 35 observations.

Order_ID     Order_Type   Order_Date   Order_Item_Num   Quantity   Total_Retail_Price
1243515588   1            01OCT2011    1                1          $251.80
1243515588   1            01OCT2011    2                1          $114.20
1243568955   1            07OCT2011    1                1          $172.50
1243643970   1            16OCT2011    1                1          $101.50
1243644877   3            16OCT2011    1                1          $14.60
Level 2
5. Merging a Sorted Data Set and an Unsorted Data Set in a One-to-Many Merge
a. Sort orion.product_list by Product_Level to create a new data set, work.product_list.
b. Merge orion.product_level with the sorted data set. Create a new data set, work.listlevel, which
includes only Product_ID, Product_Name, Product_Level, and Product_Level_Name.
c. Create the report below, including only observations with Product_Level equal to 3. The results
should contain 13 observations.
Partial PROC PRINT Output

Product_Level   Product_Level_Name   Product_ID     Product_Name
3               Product Category     210100000000   Children Outdoors
3               Product Category     210200000000   Children Sports
3               Product Category     220100000000   Clothes
3               Product Category     220200000000   Shoes
3               Product Category     230100000000   Outdoors
Challenge
6. Using the MERGENOBY Option
a. Use the SAS Help facility or online documentation to explore the MERGENOBY system option.
b. What is the purpose of this option and why is it used? __________________________________
_________________________________________________________________________________
c. Complete the following table to include the values that this option can assume:
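
Value          Description                              Default (Y/N)
__________     ______________________________________   _____________
__________     ______________________________________   _____________
__________     ______________________________________   _____________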
Objectives
Business Scenario
An Orion Star manager in Australia requested an inventory
of company phone numbers.

empsau
First   Gender   EmpID
Togar   M        121150
Kylie   F        121151
Birin   M        121152

phonec
EmpID    Phone
121150   +61(2)5555-1795
121152   +61(2)5555-1667
121153   +61(2)5555-1348
Match-Merging
Merge empsau and phonec by EmpID to create
a new data set named empsauc.
data empsauc;
   merge empsau phonec;
   by EmpID;
run;
p110d07
Execution

data empsauc;
   merge empsau phonec;
   by EmpID;
run;

 1. Initialize PDV.
       First, Gender, and Phone are blank; EmpID is missing (.).
 2. Do the EmpID values match? Yes (121150 in both data sets).
    Read one observation from each data set.
       PDV: Togar   M   121150   +61(2)5555-1795
 3. Implicit OUTPUT; implicit RETURN.
 4. Do the EmpID values match? No (121151 in empsau, 121152 in phonec).
 5. Reinitialize PDV.
 6. Read the empsau observation only.
       PDV: Kylie   F   121151   (Phone is blank)
 7. Implicit OUTPUT; implicit RETURN.
 8. Do the EmpID values match? Yes (121152 in both data sets).
 9. Reinitialize PDV.
10. Read one observation from each data set.
       PDV: Birin   M   121152   +61(2)5555-1667
11. Implicit OUTPUT; implicit RETURN. empsau reaches end of file (EOF).
12. Reinitialize PDV.
13. Read the remaining phonec observation.
       PDV: (First and Gender are blank)   121153   +61(2)5555-1348
14. Implicit OUTPUT; implicit RETURN.
15. Both data sets are at EOF. The DATA step ends.
Final Results
empsauc
First   Gender   EmpID    Phone
Togar   M        121150   +61(2)5555-1795
Kylie   F        121151
Birin   M        121152   +61(2)5555-1667
                 121153   +61(2)5555-1348
10.06 Quiz
Which data set contributed information to the last
observation in the output data set?
a. empsau
b. phonec
c. both empsau and phonec
d. There is insufficient information.

empsauc
First   Gender   EmpID    Phone
Togar   M        121150   +61(2)5555-1795
Kylie   F        121151
Birin   M        121152   +61(2)5555-1667
                 121153   +61(2)5555-1348
Business Scenario
An Orion Star manager requested three phone inventory
reports.

merge empsau(in=E)
      phonec(in=P);

merge empsau(in=AU)
      phonec;
Execution

data empsauc;
   merge empsau(in=Emps)
         phonec(in=Cell);
   by EmpID;
run;
p110d08

PDV contents as each observation is built:

First   Gender   EmpID    Emps   Phone             Cell
Togar   M        121150   1      +61(2)5555-1795   1      match
Kylie   F        121151   1                        0      non-match
Birin   M        121152   1      +61(2)5555-1667   1      match
10.07 Quiz
What are the values of Emps and Cell?

empsau is at end of file (EOF). The remaining phonec observation is read.

data empsauc;
   merge empsau(in=Emps)
         phonec(in=Cell);
   by EmpID;
run;

PDV
First   Gender   EmpID    Emps   Phone             Cell
                 121153   ___    +61(2)5555-1348   ___

PDV Results
First   Gender   EmpID    Emps   Phone             Cell
Togar   M        121150   1      +61(2)5555-1795   1
Kylie   F        121151   1                        0
Birin   M        121152   1      +61(2)5555-1667   1
                 121153   0      +61(2)5555-1348   1

The variables created with the IN= data set option are
only available during DATA step execution.
They are not written to the SAS data set.
Their value can be tested using conditional logic.
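
Because IN= variables are dropped from the output data set, one way to inspect them is to copy their values into ordinary variables with assignment statements. The sketch below assumes small stand-in copies of empsau and phonec built from the values on the slides. The names FromEmps and FromCell, and the output data set name empsauc_flags, are hypothetical and chosen only for illustration.

```sas
/* Stand-ins for the course data sets (values from the slides) */
data empsau;
   input First $ Gender $ EmpID;
   datalines;
Togar M 121150
Kylie F 121151
Birin M 121152
;
run;

data phonec;
   input EmpID Phone :$15.;
   datalines;
121150 +61(2)5555-1795
121152 +61(2)5555-1667
121153 +61(2)5555-1348
;
run;

data empsauc_flags;
   merge empsau(in=Emps)
         phonec(in=Cell);
   by EmpID;
   /* Emps and Cell exist only during execution; keep copies of them */
   FromEmps = Emps;
   FromCell = Cell;
run;

proc print data=empsauc_flags noobs;
run;
```

The printed FromEmps and FromCell columns should correspond to the Emps and Cell values shown in the PDV for each observation.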
Matches Only
Add a subsetting IF statement to select the employees
that have company phones.
data empsauc;
   merge empsau(in=Emps)
         phonec(in=Cell);
   by EmpID;
   if Emps=1 and Cell=1;
run;
p110d08

empsauc
First   Gender   EmpID    Phone
Togar   M        121150   +61(2)5555-1795
Birin   M        121152   +61(2)5555-1667

Employee in empsau with no match in phonec:
First   Gender   EmpID    Phone
Kylie   F        121151

Phone in phonec with no match in empsau:
First   Gender   EmpID    Phone
                 121153   +61(2)5555-1348
All Non-Matches
data empsauc;
   merge empsau(in=Emps)
         phonec(in=Cell);
   by EmpID;
   if Emps=0 or Cell=0;
run;
p110d08

empsauc
First   Gender   EmpID    Phone
Kylie   F        121151
                 121153   +61(2)5555-1348
Alternate Syntax
When checking a variable for a value of 1 or 0 as in the
previous scenario, you can use the following syntax:
Instead of
if Emps=0 or Cell=0;
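
Because an IN= variable holds only 1 or 0, it can be used directly as a true/false condition, so the test above can also be written as if not Emps or not Cell;. The sketch below illustrates the shorter form by routing matches and the two kinds of non-matches to separate data sets in one DATA step. It assumes small stand-in copies of empsau and phonec built from the values on the slides; the output data set names matched, emps_only, and cell_only are hypothetical.

```sas
/* Stand-ins for the course data sets (values from the slides) */
data empsau;
   input First $ Gender $ EmpID;
   datalines;
Togar M 121150
Kylie F 121151
Birin M 121152
;
run;

data phonec;
   input EmpID Phone :$15.;
   datalines;
121150 +61(2)5555-1795
121152 +61(2)5555-1667
121153 +61(2)5555-1348
;
run;

data matched emps_only cell_only;
   merge empsau(in=Emps)
         phonec(in=Cell);
   by EmpID;
   /* Same as: if Emps=1 and Cell=1 */
   if Emps and Cell then output matched;
   /* Same as: if Emps=1 and Cell=0 */
   else if Emps and not Cell then output emps_only;
   /* Same as: if Emps=0 and Cell=1 */
   else if not Emps and Cell then output cell_only;
run;
```

Either spelling selects the same observations; the shorter form is a matter of style.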
Alternate Syntax
Both programs create a report of employees without cell
phones.
data empsphone;
   merge empsact(in=inEmps)
         phoneact(in=inCell);
   by EmpID;
   if inEmps=1 and inCell=0;
run;

data empsphone;
   merge empsact(in=inEmps)
         phoneact(in=inCell);
   by EmpID;
   if inEmps and not inCell;
run;
p110d09
Business Scenario
Merge Orion Star customer information with customer
type data to obtain a customer description. The new data
set should include only US customers.
orion.customer + orion.customer_type -> customers

Considerations
The orion.customer data set is not sorted by
Customer_Type_ID, the common variable. The subsetting
variable, Country, is defined in only one data set.

orion.customer
Customer_ID   Country   Customer_Name   Birth_Date   Customer_Type_ID

orion.customer_type
Customer_Type_ID   Customer_Type   Customer_Group_ID   Customer_Group
10.08 Quiz
Open and submit p110a03. Correct the program and
resubmit. What change is needed to correct the error?
proc sort data=orion.customer
          out=cust_by_type;
   by Customer_Type_ID;
run;

data customers;
   merge cust_by_type orion.customer_type;
   by Customer_Type_ID;
   where Country='US';
run;
Subsetting IF
Use a subsetting IF when the subsetting variable is not in
all data sets named in the MERGE statement.
proc sort data=orion.customer
          out=cust_by_type;
   by Customer_Type_ID;
run;

data customers;
   merge cust_by_type orion.customer_type;
   by Customer_Type_ID;
   if Country='US';
run;
p110a03s

data customers;
   merge cust_by_type orion.customer_type;
   by Customer_Type_ID;
   if Country='US';
run;

NOTE: There were 77 observations read from the data set WORK.CUST_BY_TYPE.
NOTE: There were 8 observations read from the data set ORION.CUSTOMER_TYPE.
NOTE: The data set WORK.CUSTOMERS has 28 observations and 15 variables.
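
A possible alternative, not shown in the course program, is to attach a WHERE= data set option to the one data set that contains Country. The sketch below uses tiny hypothetical stand-ins (cust and cust_type, with invented values) rather than the orion data sets. Note that WHERE= filters only its own data set, so an IN= check is added to drop customer types that have no US customer; without it, those types would appear in the output with missing customer values.

```sas
/* Hypothetical miniature stand-ins for orion.customer and orion.customer_type */
data cust;
   input Customer_ID Country $ Customer_Type_ID;
   datalines;
4 US 1020
5 US 2020
9 DE 2020
;
run;

data cust_type;
   input Customer_Type_ID Customer_Type :$20.;
   datalines;
1020 Club_member
2020 Gold_member
3010 Internet_customer
;
run;

proc sort data=cust out=cust_by_type;
   by Customer_Type_ID;
run;

data customers_where;
   /* WHERE= applies only to cust_by_type, which contains Country */
   merge cust_by_type(where=(Country='US') in=inCust)
         cust_type;
   by Customer_Type_ID;
   /* Keep only observations that a US customer contributed to */
   if inCust;
run;
```

For this stand-in data the result should hold the same two US customers that the subsetting IF approach selects. The subsetting IF shown above remains the simpler choice when the condition does not need to be applied before the merge.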

                                              WHERE   IF
PROC step                                     Yes     No
DATA step (source of variable)
   SET statement                              Yes     Yes
   assignment statement                       No      Yes
   INPUT statement                            No      Yes
   SET/MERGE (variable in all data sets)      Yes     Yes

Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
7. Merging Using the IN= Option
a. Retrieve the starter program p110e07. Add a DATA step after the PROC SORT step to merge
work.product and orion.supplier by Supplier_ID to create a new data set, work.prodsup.
b. Submit the program and confirm that work.prodsup was created with 556 observations
and 10 variables.
c. Modify the DATA step to output only observations that are in work.product but
not orion.supplier. A subsetting IF statement that references IN= variables in the MERGE
statement must be added.
d. Submit the program and confirm that work.prodsup was created with 75 observations
and 10 variables.
e. Submit the PROC PRINT step to create the report below. The results should contain 75
observations.
Obs   Product_ID     Product_Name           Supplier_ID   Supplier_Name
  1   210000000000   Children                         .
  2   210100000000   Children Outdoors                .
  3   210100100000   Outdoor things, Kids             .
  4   210200000000   Children Sports                  .
  5   210200100000   A-Team, Kids                     .
Level 2
8. Merging Using the IN= and RENAME= Options
a. Write a PROC SORT step to sort orion.customer by Country to create a new data set,
work.customer.
b. Write a DATA step to merge the resulting data set with orion.lookup_country
by Country to create a new data set, work.allcustomer.
In the orion.lookup_country data set, rename Start to Country and rename Label to
Country_Name.
Include only four variables: Customer_ID, Country, Customer_Name, and Country_Name.
c. Create the report below. The results should contain 308 observations.
Partial PROC PRINT Output
Obs   Customer_ID   Country   Customer_Name   Country_Name
  1             .   AD                        Andorra
  2             .   AE                        United Arab Emirates
...
306          3959   ZA        Rita Lotz       South Africa
307             .   ZM                        Zambia
308             .   ZW                        Zimbabwe
d. Modify the DATA step to store only the observations that contain both customer information
and country information. A subsetting IF statement that references IN= variables in the MERGE
statement must be added.
e. Submit the program to create the report below. The results should contain 77 observations.
Partial PROC PRINT Output
Obs   Customer_ID   Country   Customer_Name       Country_Name
  1            29   AU        Candy Kinsey        Australia
  2            41   AU        Wendell Summersby   Australia
  3            53   AU        Dericka Pockran     Australia
  4           111   AU        Karolina Dokter     Australia
  5           171   AU        Robert Bowerman     Australia
Challenge
9. Merging and Outputting to Multiple Data Sets
a. Write a PROC SORT step to sort orion.orders by Employee_ID to create a new data set,
work.orders.
b. Write a DATA step to merge orion.staff and work.orders by Employee_ID and create two new
data sets: work.allorders and work.noorders.
work.allorders should include all observations from work.orders, regardless of matches or
non-matches from the orion.staff data set.
work.noorders should include only the observations from orion.staff that do not have a
match in work.orders.
Both new data sets should include only Employee_ID, Job_Title, Gender, Order_ID,
Order_Type, and Order_Date.
c. Submit the program and confirm that work.allorders was created with 490 observations
and 6 variables and work.noorders was created with 324 observations and 6 variables.
d. Create a detailed listing report for each new data set with an appropriate title.
10.5 Solutions
Solutions to Exercises
1. Concatenating Like-Structured Data Sets
How many observations in work.thirdqtr are from orion.mnth7_2011? 10 observations
How many observations in work.thirdqtr are from orion.mnth8_2011? 12 observations
How many observations in work.thirdqtr are from orion.mnth9_2011? 10 observations
data work.thirdqtr;
   set orion.mnth7_2011 orion.mnth8_2011 orion.mnth9_2011;
run;

proc print data=work.thirdqtr;
run;
2. Concatenating Unlike-Structured Data Sets
What are the names of the two variables that are different in the two data sets?
orion.sales    orion.nonsales
First_Name     First
Last_Name      Last

   set orion.us_suppliers orion.consultants;
run;
c. Submit a PROC CONTENTS step to examine work.contacts. From which input data set were the
variable attributes assigned? The first data set in the SET statement, orion.charities.
e. Submit a PROC CONTENTS step to examine work.contacts2. From which input data set were
the variable attributes assigned? The first data set in the SET statement, orion.us_suppliers.
f. Write a DATA step to concatenate orion.us_suppliers and orion.consultants, creating a
temporary data set, contacts3.
Why did the DATA step fail? ContactType has been defined as both character and numeric.
4. Merging Two Sorted Data Sets in a One-to-Many Merge
proc contents data=orion.orders;
run;
proc contents data=orion.order_item;
run;
data work.allorders;
merge orion.orders
orion.order_item;
by Order_ID;
keep Order_ID Order_Item_Num Order_Type
Order_Date Quantity Total_Retail_Price;
run;
proc print data=work.allorders noobs;
where Order_Date between '01Oct2011'd and '31Dec2011'd;
run;
/* alternate solution */
proc print data=work.allorders noobs;
where Order_Date>='01Oct2011'd and Order_Date<='31Dec2011'd;
run;
proc print data=work.allorders noobs;
where qtr(Order_Date)=4 and year(Order_Date)=2011;
run;
5. Merging a Sorted Data Set and an Unsorted Data Set in a One-to-Many Merge
proc sort data=orion.product_list
out=work.product_list;
by Product_Level;
run;
data work.listlevel;
merge orion.product_level work.product_list ;
by Product_Level;
keep Product_ID Product_Name Product_Level Product_Level_Name;
run;
proc print data=work.listlevel noobs;
where Product_Level=3;
run;
6. Using the MERGENOBY Option
b. What is the purpose of this option and why is it used? This option is used to issue a warning or
an error when a BY statement is omitted from a merge. Performing a merge without a BY
statement merges the observations based on their position. This is almost never done
intentionally and can lead to unexpected results.
c. Complete the following table to include the values that this option can assume.
Value     Description                                                        Default (Y/N)
NOWARN
WARN
ERROR     Writes an error message to the log and the DATA step terminates.
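As a brief illustration of the ERROR value, the sketch below sets the option and then submits a merge that deliberately omits the BY statement. The output data set name work.bad_merge is hypothetical.

```sas
/* Treat a MERGE without a BY statement as an error. */
options mergenoby=error;

data work.bad_merge;
   /* BY statement deliberately omitted: with MERGENOBY=ERROR, */
   /* an error is written to the log and the step terminates.  */
   merge orion.orders orion.order_item;
run;

/* Restore the default setting. */
options mergenoby=nowarn;
```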
empscn
Gender   Country
M        China
M        China
F        China

empsjp
First    Gender   Region
Cho      F        Japan
Tomi     M        Japan
empsjp(rename(Region=Location));
b. Answer
set empscn(rename=(Country=Location))
empsjp(rename=(Region=Location));
c.
p110a02s
empsau
phonec
both empsau and phonec
There is insufficient information.
empsauc
First   Gender   EmpID    Phone
Togar   M        121150   +61(2)5555-1795
Kylie   F        121151
Birin   M        121152   +61(2)5555-1667
                 121153   +61(2)5555-1348
empsau (EOF reached)
First   Gender   EmpID
Togar   M        121150
Kylie   F        121151
Birin   M        121152

phonec
EmpID    Phone
121150   +61(2)5555-1795
121152   +61(2)5555-1667
121153   +61(2)5555-1348
data empsauc;
merge empsau(in=Emps)
phonec(in=Cell);
by EmpID;
run;
PDV (non-match)
First   Gender   EmpID    Phone             Emps   Cell
                 121153   +61(2)5555-1348      0      1
Business Scenario
Orion Star management wants to know the number of
male and female sales employees in Australia.
Considerations
Use the FREQ procedure to analyze the Gender variable
in a subset of orion.sales.
The FREQ Procedure

Gender   Frequency   Percent
F               XX     XX.XX
M               XX     XX.XX
FREQ Procedure
The FREQ procedure produces a one-way frequency
table for each variable named in the TABLES statement.
p111d01
PROC FREQ displays a report by default. The output can be saved in a SAS data set.
The procedure can compute the following:
chi-square tests for one-way to n-way tables
tests and measures of agreement for contingency tables
tests and measures of association for contingency tables and more
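A minimal sketch of the scenario follows: a one-way frequency table for Gender, limited to Australian sales employees, with the counts also saved to a data set. The program p111d01 itself is not reproduced in these notes, so this is an assumption about its general shape, and the output data set name work.gender_counts is hypothetical.

```sas
proc freq data=orion.sales;
   /* One-way table for Gender; OUT= also saves the counts */
   /* and percentages in a SAS data set.                   */
   tables Gender / out=work.gender_counts;
   where Country='AU';
run;

/* The OUT= data set holds one row per Gender value. */
proc print data=work.gender_counts;
run;
```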
                                 Cumulative    Cumulative
Gender   Frequency    Percent     Frequency       Percent
F               27      42.86            27         42.86
M               36      57.14            63        100.00
Description
NOCUM
NOPERCENT
Cumulative frequencies and percentages are useful when there are at least three levels of a variable in an
ordinal relationship. When this is not the case, the NOCUM option produces a simpler, less confusing
report.
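Because Gender has only two levels here, the cumulative columns add little. A short sketch of the same one-way table with NOCUM specified as a TABLES statement option:

```sas
proc freq data=orion.sales;
   /* NOCUM suppresses the cumulative frequency and */
   /* cumulative percent columns.                   */
   tables Gender / nocum;
   where Country='AU';
run;
```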
NOPERCENT
suppresses
11.01 Quiz
Open and submit p111a01. Review the log to determine
the cause of the error. Correct the program and resubmit.
What change was needed?
Idea Exchange
This step creates a table for every variable in the data set:
proc freq data=orion.sales;
run;
Employee_ID
First_Name
Last_Name
Gender
Salary
Job_Title
Country
Birth_Date
Hire_Date
Business Scenario
Orion Star management wants to know how many sales
employees are in each country, as well as the count of
males and females.
TABLES Statement
You can list multiple variables in a TABLES statement. A
separate table is produced for each variable.
proc freq data=orion.sales;
tables Gender Country;
run;
PROC FREQ Output
The FREQ Procedure

                                  Cumulative    Cumulative
Gender    Frequency    Percent     Frequency       Percent
F                68      41.21            68         41.21
M                97      58.79           165        100.00

                                  Cumulative    Cumulative
Country   Frequency    Percent     Frequency       Percent
AU               63      38.18            63         38.18
US              102      61.82           165        100.00
p111d02
BY Statement
The BY statement is used to request separate analyses
for each BY group.
proc sort data=orion.sales out=sorted;
by Country;
run;
proc freq data=sorted;
tables Gender;
by Country;
run;
The data set must be sorted or indexed by the variable (or
variables) named in the BY statement.
p111d02

-------------------------- Country=AU --------------------------

The FREQ Procedure

                                 Cumulative    Cumulative
Gender   Frequency    Percent     Frequency       Percent
F               27      42.86            27         42.86
M               36      57.14            63        100.00

-------------------------- Country=US --------------------------

The FREQ Procedure

                                 Cumulative    Cumulative
Gender   Frequency    Percent     Frequency       Percent
F               41      40.20            41         40.20
M               61      59.80           102        100.00
Crosstabulation Table
An asterisk between two variables generates a two-way
frequency table, or crosstabulation table.
proc freq data=orion.sales;
tables Gender*Country;
run;
The values of the first variable, Gender, form the rows of the table. The values of the second
variable, Country, form the columns.
p111d02
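Gender     Country

Frequency|
Percent  |
Row Pct  |
Col Pct  |AU      |US      |  Total
---------+--------+--------+
F        |     27 |     41 |     68
         |  16.36 |  24.85 |  41.21
         |  39.71 |  60.29 |
         |  42.86 |  40.20 |
---------+--------+--------+
M        |     36 |     61 |     97
         |  21.82 |  36.97 |  58.79
         |  37.11 |  62.89 |
         |  57.14 |  59.80 |
---------+--------+--------+
Total          63      102      165
            38.18    61.82   100.00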
Option
Description
NOROW
NOCOL
NOPERCENT
NOFREQ
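Each of these TABLES statement options removes one of the four statistics from every cell of a crosstabulation table: NOROW suppresses the row percentages, NOCOL the column percentages, NOPERCENT the overall percentages, and NOFREQ the frequency counts. A short sketch with two of them:

```sas
proc freq data=orion.sales;
   /* Each cell shows only Frequency and Percent; the */
   /* Row Pct and Col Pct lines are suppressed.       */
   tables Gender*Country / norow nocol;
run;
```

Adding NOPERCENT to the same statement would leave only the frequency count in each cell.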
LIST option

                                            Cumulative    Cumulative
Gender   Country   Frequency    Percent      Frequency       Percent
F        AU               27      16.36             27         16.36
F        US               41      24.85             68         41.21
M        AU               36      21.82            104         63.03
M        US               61      36.97            165        100.00

Table of Gender by Country

                                                 Row      Column
Gender   Country   Frequency    Percent      Percent     Percent
F        AU               27      16.36        39.71       42.86
         US               41      24.85        60.29       40.20
         Total            68      41.21       100.00
----------------------------------------------------------------
M        AU               36      21.82        37.11       57.14
         US               61      36.97        62.89       59.80
         Total            97      58.79       100.00
----------------------------------------------------------------
Total    AU               63      38.18                   100.00
         US              102      61.82                   100.00
         Total           165     100.00
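The two listings above are alternative layouts for the same Gender by Country counts. The first is labeled as LIST option output. The second resembles the layout that the CROSSLIST option produces, although the notes do not name that option at this point, so treat it as an assumption. A sketch that requests both layouts in one step:

```sas
proc freq data=orion.sales;
   /* LIST: one row per Gender/Country combination, with */
   /* cumulative statistics.                             */
   tables Gender*Country / list;
   /* CROSSLIST: one row per combination, with row and   */
   /* column percentages and subtotals.                  */
   tables Gender*Country / crosslist;
run;
```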
Business Scenario
A new data set, orion.nonsales2, must be validated. It
contains information on non-sales employees and might
include invalid and missing values.
Partial orion.nonsales2
Employee_ID   First     Last         Gender   Salary   Job_Title        Country
120101        Patrick   Lu           ...      163040   Director         AU
120104        Kareen    Billington   ...      ...      ...              au
120105        Liz       Povey        ...      27110    Secretary I      AU
120106        John      Hornsey      ...      .        Office Asst II   AU
120107        Sherie    Sheedy       ...      ...      ...              AU
120108        Gladys    Gromek       ...      ...      ...              AU
Considerations
Use the FREQ procedure to screen for invalid, missing,
and duplicate data values.
Requirements of non-sales employee data:
Employee_ID values must be unique and not missing.
Gender must be F or M.
Job_Title must not be missing.
Country must have a value of AU or US.
Salary values must be in the numeric range of
24000 to 500000.
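The requirements above can be sketched in two steps: PROC FREQ to screen the values of the categorical variables, and PROC PRINT with a WHERE statement to list the observations that break a rule. This is one possible approach rather than the course program. It does not check whether Employee_ID values are unique.

```sas
/* Screen the values of the categorical variables. */
proc freq data=orion.nonsales2;
   tables Gender Country / nocum nopercent;
run;

/* List the observations that violate a requirement. */
proc print data=orion.nonsales2;
   where Gender not in ('F','M') or
         Country not in ('AU','US') or
         Job_Title is missing or
         Employee_ID is missing or
         Salary not between 24000 and 500000;
run;
```

A missing Salary is also selected, because a missing numeric value sorts below any number and therefore falls outside the range.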
11.02 Quiz
What problems exist with the data in this partial data set?
p111d03
Gender    Frequency
F               110
G                 1
M               123
Frequency Missing = 1

Country   Frequency
AU               33
US              196
au                3
us                3
NLEVELS Option
The NLEVELS option displays a table that provides the
number of distinct values for each analysis variable.
proc freq data=orion.nonsales2 nlevels;
tables Gender Country / nocum nopercent;
run;
PROC FREQ DATA=SAS-data-set NLEVELS;
TABLES variable(s) ;
RUN;
p111d03
Number of Variable Levels
                       Missing    Nonmissing
Variable    Levels      Levels        Levels
Gender           4           1             3
Country          4           0             4

Gender    Frequency
F               110
G                 1
M               123
Frequency Missing = 1

Country   Frequency
AU               33
US              196
au                3
us                3
p111d04
Employee_ID   Frequency
120108                2
120101                1
120104                1
120105                1
120106                1
...
121134                1
121141                1
121142                1
121146                1
121147                1
121148                1
Frequency Missing = 1
p111d04
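The listing above shows the duplicated Employee_ID value first, which is what the ORDER=FREQ option produces. The step that created it is not reproduced in these notes, so the following is a sketch of one way to get a listing in that order:

```sas
proc freq data=orion.nonsales2 order=freq;
   /* ORDER=FREQ sorts the levels by descending count, so */
   /* any value with a frequency above 1 rises to the top. */
   tables Employee_ID / nocum nopercent;
run;
```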
NLEVELS Option
NLEVELS can also be used to identify duplicates, when
the number of distinct values is known.
proc freq data=orion.nonsales2 nlevels;
tables Employee_ID / noprint;
run;
This example uses the NOPRINT option to suppress the
frequency table. Only the Number of Variable Levels table
is displayed.
p111d04
Number of Variable Levels
                          Missing    Nonmissing
Variable       Levels      Levels        Levels
Employee_ID       234           1           233

The data set contains 235 observations, but Employee_ID has only 234 levels, and one of those
levels is the missing value. The 234 nonmissing observations share 233 distinct values.
Therefore, there is one duplicate value and one missing value for Employee_ID.
NLEVELS Option
The _ALL_ keyword with the NOPRINT option displays
the number of levels for all variables without displaying
frequency counts.
proc freq data=orion.nonsales2 nlevels;
tables _all_ / noprint;
run;
p111d04
Number of Variable Levels
                          Missing    Nonmissing
Variable       Levels      Levels        Levels
Employee_ID       234           1           233
First             204           0           204
Last              228           0           228
Gender              4           1             3
Salary            230           1           229
Job_Title         125           1           124
Country             4           0             4
11.03 Quiz
Modify p111a02 to analyze Job_Title. Display the NLEVELS table and list the frequency counts
in decreasing order.
How many unique, nonmissing job titles exist?
Which job title occurs most frequently?
What is the frequency of missing job titles?
The Obs column shows the original observation numbers.

Obs   Employee_ID   First       Last         Gender   Salary   Job_Title                   Country
  2        120104   Kareen      Billington   F         46230   Administration Manager      au
  6        120108   Gladys      Gromek       F         27660   Warehouse Assistant II      AU
  7        120108   Gabriele    Baker        F         26495   Warehouse Assistant I       AU
 10        120112   Ellis       Glattback    F         26550                               AU
 12        120114   Jeannette   Buddery      G         31285   Security Manager            AU
 14             .   Austen      Ralston      M         29250   Service Assistant II        AU
 84        120695   Trent       Moffat       M         28180   Warehouse Assistant II      au
 87        120698   Geoff       Kistanna     M         26160   Warehouse Assistant I       au
101        120723   Deanna      Olsen                  33950   Corp. Comm. Specialist II   US
125        120747   Zashia      Farthing     F         43590   Financial Controller I      us
197        120994   Danelle     Sergeant     F         31645   Office Administrator I      us
200        120997   Mary        Donathan     F         27420   Shipping Administrator I    us
Business Scenario
The manager of Human Resources has requested
a report showing the number and percent of sales
employees hired each year.
Hire_Date    Frequency    Percent
01JAN1978           17      10.30
01FEB1978            2       1.21
01APR1978            1       0.61
01JUL1978            1       0.61
01AUG1978            1       0.61
p111d05
Hire_Date    Frequency    Percent
1978                23      13.94
1979                 2       1.21
1980                 4       2.42
1981                 3       1.82
1982                 7       4.24
p111d05
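The difference between the two listings above comes from the format applied to the date variable: PROC FREQ counts formatted values, so a year format collapses all dates in the same year into one level. A sketch of the grouped report, assuming the analysis variable is Hire_Date in orion.sales (p111d05 is not reproduced in these notes):

```sas
proc freq data=orion.sales;
   tables Hire_Date / nocum;
   /* YEAR4. displays only the year, so all hire dates in */
   /* the same year are counted as a single level.        */
   format Hire_Date year4.;
run;
```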
11.04 Quiz
Open and submit p111a03 and view the output. Add a
statement to apply the TIERS format to Salary and
resubmit.
Can user-defined formats be used to group data?
FORMAT Statement
User-defined formats can also be used to display levels
with alternate text in a frequency table.
proc freq data=orion.sales;
tables Gender*Country;
format Country $ctryfmt.
Gender $gender.;
run;
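The $CTRYFMT. and $GENDER. formats must exist before this step runs. Their definitions are not shown in this section, so the PROC FORMAT step below is a hypothetical sketch whose labels are chosen to match the text displayed in the output (Australia, United States, Female, Male).

```sas
proc format;
   /* Hypothetical definitions; the course definitions may */
   /* include additional values.                           */
   value $ctryfmt 'AU'='Australia'
                  'US'='United States';
   value $gender  'F'='Female'
                  'M'='Male';
run;
```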
49
p111d06
Labels are wrapped.

           Country
Frequency|
Percent  |
Row Pct  |
Col Pct  |Australi|United S|  Total
         |a       |tates   |
---------+--------+--------+
Female   |     27 |     41 |     68
         |  16.36 |  24.85 |  41.21
         |  39.71 |  60.29 |
         |  42.86 |  40.20 |
---------+--------+--------+
Male     |     36 |     61 |     97
         |  21.82 |  36.97 |  58.79
         |  37.11 |  62.89 |
         |  57.14 |  59.80 |
---------+--------+--------+
Total          63      102      165
            38.18    61.82   100.00
The default format for cell statistics is 8.d, where d is 0 for frequencies and 2 for percentages. Eight print
positions are sometimes not enough for column headings.
FORMAT= Option
Use the FORMAT= option in the TABLES statement to
format the frequency value and to change the width of the
column.
proc freq data=orion.sales;
tables Gender*Country / format=12.;
format Country $ctryfmt.
Gender $gender.;
run;
51
p111d06
Columns are 12 characters wide.

            Country
Frequency |
Percent   |
Row Pct   |
Col Pct   |Australia     |United States |        Total
----------+--------------+--------------+
Female    |           27 |           41 |           68
          |        16.36 |        24.85 |        41.21
          |        39.71 |        60.29 |
          |        42.86 |        40.20 |
----------+--------------+--------------+
Male      |           36 |           61 |           97
          |        21.82 |        36.97 |        58.79
          |        37.11 |        62.89 |
          |        57.14 |        59.80 |
----------+--------------+--------------+
Total                 63            102            165
                   38.18          61.82         100.00
The FORMAT= option applies only to the frequency. Percentage values are always displayed with two
decimal places.
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
1. Counting Levels of a Variable with PROC FREQ
a. Retrieve the starter program p111e01.
b. Submit the program without making changes to analyze Customer_ID and Employee_ID in
orion.orders. Would you expect to see frequencies of 1 for customers and sales employees? ____
c. Modify the program to produce two separate reports.
1) Display the number of distinct levels of Customer_ID and Employee_ID for retail orders.
a) Use a WHERE statement to limit the report to retail sales (Order_Type=1).
b) Do not display the frequency count tables.
c) Display the title Unique Customers and Salespersons for Retail Sales.
d) Submit the program to produce the following report:
Variable       Label          Levels
Customer_ID    Customer ID        31
Employee_ID    Employee ID       100
2) Display the number of distinct levels for Customer_ID for catalog and Internet orders.
a) Use a WHERE statement to limit the report to catalog and Internet sales by selecting
observations with Order_Type values other than 1.
b) Specify an option to display the results in decreasing frequency order.
c) Specify an option to suppress the cumulative statistics.
d) Display the title Catalog and Internet Customers.
e) Submit the program to produce the following report:
Partial PROC FREQ Output
Catalog and Internet Customers
The FREQ Procedure

Customer ID
Customer_ID    Frequency    Percent
16                    15       6.52
29                     9       3.91
5                      8       3.48
...
26148                  1       0.43
70059                  1       0.43
Level 2
3. Producing Frequency Reports with PROC FREQ
a. Retrieve the starter program p111e03.
b. Add statements to the PROC FREQ step to produce three frequency reports.
1) Number of orders in each year: Apply a format to the Order_Date variable to combine all
orders within the same year.
2) Number of orders of each order type: Apply the ORDERTYPES. format defined in the starter
program to the Order_Type variable. Suppress the cumulative frequency and percentages.
3) Number of orders for each combination of year and order type: Suppress all percentages that
normally appear in each cell of a two-way table.
c. Submit the program to produce the following output:
PROC FREQ Output
Order Summary by Year and Type
The FREQ Procedure

Date Order was placed by Customer
                                      Cumulative    Cumulative
Order_Date    Frequency    Percent     Frequency       Percent
2007                104      21.22           104         21.22
2008                 87      17.76           191         38.98
2009                 70      14.29           261         53.27
2010                113      23.06           374         76.33
2011                116      23.67           490        100.00

Order Type
Order_Type    Frequency    Percent
Retail              260      53.06
Catalog             132      26.94
Internet             98      20.00

Order_Date    Retail    Catalog    Internet    Total
2007              45         41          18      104
2008              51         20          16       87
2009              27         23          20       70
2010              67         33          13      113
2011              70         15          31      116
Total            260        132          98      490
Challenge
5. Creating an Output Data Set with PROC FREQ
Write a program to perform a frequency analysis on Product_ID in orion.order_fact.
a. Create an output data set containing the frequency counts based on Product_ID. Explore the SAS
Help facility or online documentation for information about creating an output data set of counts
from PROC FREQ results.
b. Combine the output data set with orion.product_list to obtain the Product_Name value for each
Product_ID code. Output only products that have been ordered.
c. Sort the combined data so that the most frequently ordered products appear first in the resulting
data set. Print the first five observations, that is, those that represent the five products ordered
most often. Use the OBS= data set option to limit the number of observations displayed.
d. Submit the program to produce the following report:
PROC PRINT Output
Top Five Products by Number of Orders

Orders   Product Number   Product
     6   230100500056     Knife
     6   230100600030     Outback Sleeping Bag, Large,Left,Blue/Black
     5   230100600022     Expedition10,Medium,Right,Blue Ribbon
     5   240400300035     Smasher Shorts
     4   230100500082     Lucky Tech Intergal Wp/B Rain Pants
Objectives
Business Scenario
The payroll manager would like to see the average salary
for all employees.
MEANS Procedure
The MEANS procedure produces summary reports with
descriptive statistics.
proc means data=orion.sales;
run;
Variable         N          Mean        Std Dev       Minimum       Maximum
Employee_ID    165     120713.90    450.0866939     120102.00     121145.00
Salary         165      31160.12       20082.67      22710.00     243190.00
Birth_Date     165       3622.58        5456.29      -5842.00      10490.00
Hire_Date      165      12054.28        4619.94       5114.00      17167.00
VAR Statement
The VAR statement identifies the analysis variable
(or variables) and their order in the output.
proc means data=orion.sales;
var Salary;
run;
VAR variable(s);
  N          Mean       Std Dev       Minimum       Maximum
165      31160.12      20082.67      22710.00     243190.00
p111d07
60
Business Scenario
Analyze Salary by Country within Gender.
CLASS Statement
The CLASS statement identifies variables whose values
define subgroups for the analysis.
proc means data=orion.sales;
var Salary;
class Gender Country;
run;
CLASS classification-variable(s);
p111d08
Gender   Country   N Obs    N        Mean     Std Dev     Minimum      Maximum
F        AU           27   27    27702.41     1728.23    25185.00     30890.00
         US           41   41    29460.98     8847.03    25390.00     83505.00
M        AU           36   36    32001.39    16592.45    25745.00    108255.00
         US           61   61    33336.15    29592.69    22710.00    243190.00
11.05 Quiz
For a given data set, there are 63 observations with
a Country value of AU. Of those 63 observations,
only 61 observations have a value for Salary.
Which output is correct?
a.
Country   N Obs    N
AU           63   61

b.
Country   N Obs    N
AU           61   63
Business Scenario
Analyze Salary by Country within Gender. Generate a
report that includes the number of missing Salary values,
as well as the minimum, maximum, and sum of salaries.
p111d09
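The report in this scenario can be requested by listing statistic keywords in the PROC MEANS statement. When keywords are listed, only those statistics are displayed in place of the defaults. The program p111d09 is not reproduced in these notes, so the following is a sketch of one way to request the report:

```sas
proc means data=orion.sales nmiss min max sum;
   /* Statistics appear in the order in which the keywords */
   /* are listed: N Miss, Minimum, Maximum, Sum.           */
   var Salary;
   class Gender Country;
run;
```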
Gender   Country   N Obs   N Miss     Minimum      Maximum          Sum
F        AU           27        0    25185.00     30890.00    747965.00
         US           41        0    25390.00     83505.00   1207900.00
M        AU           36        0    25745.00    108255.00   1152050.00
         US           61        0    22710.00    243190.00   2033505.00
Description
MAXDEC=
NONOBS
MAXDEC= Option

MAXDEC=0
The MEANS Procedure
Analysis Variable : Salary

Country   N Obs     N       Mean    Std Dev    Minimum    Maximum
AU           63    63      30159      12699      25185     108255
US          102   102      31778      23556      22710     243190

MAXDEC=1
The MEANS Procedure
Analysis Variable : Salary

Country   N Obs     N       Mean    Std Dev    Minimum    Maximum
AU           63    63    30159.0    12699.1    25185.0   108255.0
US          102   102    31778.5    23555.8    22710.0   243190.0
p111d10
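MAXDEC= is specified in the PROC MEANS statement and sets the maximum number of decimal places used to display the statistics. A sketch that would produce output like the MAXDEC=0 report:

```sas
proc means data=orion.sales maxdec=0;
   /* Statistics are displayed with no decimal places. */
   var Salary;
   class Country;
run;
```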
NONOBS Option

N Obs included by default
The MEANS Procedure
Analysis Variable : Salary

Country   N Obs     N        Mean     Std Dev     Minimum      Maximum
AU           63    63    30158.97    12699.14    25185.00    108255.00
US          102   102    31778.48    23555.84    22710.00    243190.00

NONOBS option
The MEANS Procedure
Analysis Variable : Salary

Country     N        Mean     Std Dev     Minimum      Maximum
AU         63    30158.97    12699.14    25185.00    108255.00
US        102    31778.48    23555.84    22710.00    243190.00
p111d10
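NONOBS is also a PROC MEANS statement option. It takes effect only when a CLASS statement is present, because the N Obs column is displayed only for classification groups. A short sketch:

```sas
proc means data=orion.sales nonobs;
   /* NONOBS suppresses the N Obs column for each group. */
   var Salary;
   class Country;
run;
```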
PROC MEANS statistic keywords

CSS     CV         LCLM     MAX      MEAN   MIN      MODE   NMISS   KURTOSIS
RANGE   SKEWNESS   STDDEV   STDERR   SUM    SUMWGT   UCLM   USS     VAR

P1   P5   P10   Q1 | P25   Q3 | P75   P90   P95   P99   QRANGE
Idea Exchange
Which PROC MEANS statistics would you request when
validating numeric variables?
Business Scenario
Validate salary data in orion.nonsales2. Salary must be
in the numeric range of 24000 to 500000.
Partial orion.nonsales2
Employee_ID   First     Last         Gender   Salary   Job_Title        Country
120101        Patrick   Lu           ...      163040   Director         AU
120104        Kareen    Billington   ...      ...      ...              au
120105        Liz       Povey        ...      27110    Secretary I      AU
120106        John      Hornsey      ...      .        Office Asst II   AU
120107        Sherie    Sheedy       ...      ...      ...              AU
120108        Gladys    Gromek       ...      ...      ...              AU
UNIVARIATE Procedure
PROC UNIVARIATE displays extreme observations,
missing values, and other statistics for the variables
included in the VAR statement.
proc univariate data=orion.nonsales2;
var Salary;
run;
PROC UNIVARIATE DATA=SAS-data-set;
<VAR variable(s);>
RUN;
p111d11
Extreme Observations

-----Lowest-----      -----Highest----
 Value       Obs       Value       Obs
  2401        20      163040         1
  2650        13      194885       231
 24025        25      207885        28
 24100        19      268455        29
 24390       228      433800        27
NEXTROBS= Option
The NEXTROBS= option specifies the number of extreme
observations to display.
proc univariate data=orion.nonsales2
nextrobs=3;
var Salary;
run;
Partial PROC UNIVARIATE Output
The UNIVARIATE Procedure
Variable: Salary

Extreme Observations

-----Lowest-----      -----Highest----
 Value       Obs       Value       Obs
  2401        20      207885        28
  2650        13      268455        29
 24025        25      433800        27
p111d11
The default value for NEXTROBS=n is 5, and n can range between 0 and half the maximum number of
observations. Specify NEXTROBS=0 to suppress the table of extreme observations.
ID Statement
The ID statement displays the value of the identifying
variable (or variables) in addition to the observation
number.
proc univariate data=orion.nonsales2;
var Salary;
id Employee_ID;
run;
ID variable(s);
79
p111d11
Extreme Observations

-------------Lowest------------      ------------Highest------------
 Value    Employee_ID       Obs       Value    Employee_ID       Obs
  2401         120191        20      163040         120101         1
  2650         120115        13      194885         121141       231
 24025         120196        25      207885         120260        28
 24100         120190        19      268455         120262        29
 24390         121132       228      433800         120259        27

            -----Percent Of-----
                         Missing
Count       All Obs          Obs
...            0.43       100.00
11.06 Quiz
PROC UNIVARIATE identified two observations with
Salary values less than 24,000.
What procedure can be used to display the observations
containing the invalid values?
Exercises
If you restarted your SAS session since the last exercise, open and submit the libname.sas program
found in the data folder.
Level 1
6. Creating a Summary Report with PROC MEANS
a. Retrieve the starter program p111e06.
b. Display only the SUM statistic for the Total_Retail_Price variable.
c. Add a CLASS statement to display separate statistics for each combination of Order_Date and
Order_Type.
d. Apply the ORDERTYPES format so that the order types are displayed as text descriptions.
Apply the YEAR4. format so that order dates are displayed as years.
e. Submit the program to produce the following report:
Partial PROC MEANS Output
Revenue from All Orders
The MEANS Procedure
Analysis Variable : Total_Retail_Price Total Retail Price for This Product
Date Order was
placed by         Order          N
Customer          Type         Obs          Sum
2007              Retail        53      7938.80
                  Catalog       52     10668.08
                  Internet      23      4124.05
2008              Retail        63      9012.22
                  Catalog       23      3494.60
                  Internet      22      3275.70
Level 2
8. Analyzing Missing Numeric Values with PROC MEANS
a. Retrieve the starter program p111e08.
b. Display the number of missing values and the number of nonmissing values present in the
Birth_Date, Emp_Hire_Date, and Emp_Term_Date variables.
c. Add a CLASS statement to display separate statistics for each value of Gender.
d. Suppress the column that displays the total number of observations in each classification group.
e. Submit the program to produce the following report:
PROC MEANS Output
Number of Missing and Non-Missing Date Values
The MEANS Procedure
Employee                                                       N
Gender     Variable         Label                           Miss      N
F          Birth_Date       Employee Birth Date                0    191
           Emp_Hire_Date    Employee Hire Date                 0    191
           Emp_Term_Date    Employee Termination Date        139     52
M          Birth_Date       Employee Birth Date                0    233
           Emp_Hire_Date    Employee Hire Date                 0    233
           Emp_Term_Date    Employee Termination Date        169     64
Challenge
10. Creating an Output Data Set with PROC MEANS
a. Retrieve the starter program p111e10.
b. Modify the PROC MEANS step to create an output data set containing the sum of
Total_Retail_Price values for each value of Product_ID. Creating an output data set from
PROC MEANS results is discussed in the SAS Help facility and in the online documentation.
c. Combine the output data set with orion.product_list to obtain the Product_Name value for each
Product_ID code.
d. Sort the combined data so that the products with higher revenues appear at the top of the resulting
data set.
e. Apply the OBS= data set option in a PROC PRINT step to display the first five observations,
that is, those that represent the five products with the most revenue.
f. Display the revenue values with a leading euro symbol (€), a period that separates every three
digits, a comma that separates the decimal fraction, and two decimal places.
g. Submit the program to produce the following report:
PROC MEANS Output
Top Five Products by Revenue

                   Product
Obs    Revenue     Number        Product
1      €3.391,80   230100700009  Family Holiday 6
2      €3.080,30   230100700008  Family Holiday 4
3      €2.250,00   230100700011  Hurricane 4
4      €1.937,20   240200100173  Proplay Executive Bi-Metal Graphite
5      €1.796,00   240200100076  Expert Men's Firesole Driver
11. Selecting Only the Extreme Observations Output from the UNIVARIATE Procedure
a. Write a PROC UNIVARIATE step to validate Product_ID in orion.shoes_tracker.
b. Before the PROC UNIVARIATE step, add the following ODS statement:
ods trace on;
c. After the PROC UNIVARIATE step, add the following ODS statement:
ods trace off;
d. Submit the program and notice the trace information in the SAS log.
What is the name of the last output added in the SAS log? ______________________________
e. Add an ODS SELECT statement immediately before the PROC UNIVARIATE step to select only
the Extreme Observation output object. Documentation about the ODS TRACE and ODS
SELECT statements can be found in the SAS Help facility and in the online documentation.
f. Submit the program to create the following PROC UNIVARIATE report:
The UNIVARIATE Procedure
Variable: Product_ID (Product ID)

Extreme Observations
--------Lowest-------    -------Highest------
Value          Obs       Value         Obs
2.20200E+10      4       2.2020E+11      6
2.20200E+11      1       2.2020E+11      7
2.20200E+11      2       2.2020E+11      9
2.20200E+11      3       2.2020E+11     10
2.20200E+11      5       2.2020E+12      8
Business Scenario
Generate reports in various formats for distribution within
Orion Star.
First_Name  Last_Name   Job_Title       Salary
Satyakam    Denny       Sales Rep. II    26780
Monica      Kletschkus  Sales Rep. IV    30890
Kevin       Lyon        Sales Rep. I     26955
Petrea      Soltau      Sales Rep. II    27440
Marina      Iyengar     Sales Rep. III   29715
Shani       Duckett     Sales Rep. I     25795
Fang        Wilson      Sales Rep. II    26810
Michael     Minas       Sales Rep. I     26970
Amanda      Liebman     Sales Rep. II    27465
Vincent     Eastley     Sales Rep. III   29695
ODS Statements
Open an ODS destination, submit one or more
procedures that generate output, and then close the
destination.
ods html file="&path\myreport.html";
proc freq data=orion.sales;
tables Country;
run;
ods html close;
p111d12
The Output Delivery System works in all operating environments. Use the RS=NONE option when you
create HTML and RTF files on z/OS (OS/390).
/* z/OS example */
ods html file="&path..report(example)" rs=none;
proc freq data=orion.sales;
tables Country;
run;
ods html close;
ODS Destinations
ODS creates various types of output based on the
specified destinations and file types.
Destination  Type of File               Extension  Viewed In
LISTING      Plain text
HTML         Hypertext Markup Language  html
PDF          Portable Document Format   pdf
RTF          Rich Text Format           rtf
No Open Destinations
Be sure to have at least one destination open. If no destination is open, output is generated
but not displayed.
ods listing;
proc freq data=orion.sales;
tables Country;
run;
After the LISTING destination is opened, output is displayed.
NOTE: There were 165 observations read from the data set ORION.SALES.
p111d15
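The following sketch reproduces the situation described above. It assumes that the orion library is already assigned, as elsewhere in these notes, and it uses ODS _ALL_ CLOSE only as a convenient way to force the "no open destinations" condition. It is an illustration, not the course program.

```sas
/* Close every open destination. The next step still runs, */
/* but its output is not displayed anywhere.               */
ods _all_ close;

proc freq data=orion.sales;
   tables Country;
run;

/* Reopen the LISTING destination so that output is displayed again. */
ods listing;

proc freq data=orion.sales;
   tables Country;
run;
```

After the first step, the log typically contains a warning that no output destinations are active. The second step writes its report to the Output window.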
11.07 Quiz
What is the problem with this program?
ods html file="&path\myreport.html";
proc print data=orion.sales;
run;
ods close;
p111a05
STYLE= Option
Use a STYLE= option in the ODS statement to specify
a style definition.
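As a minimal sketch, the STYLE= option is placed in the ODS statement that opens the destination. The file name and the &path macro variable follow the earlier ODS HTML example. SASWEB is used here only as one possible style name; any style that is available at your site could be substituted.

```sas
/* STYLE= is specified when the destination is opened. */
ods html file="&path\myreport.html" style=sasweb;

proc freq data=orion.sales;
   tables Country;
run;

/* Close the destination so that the file is complete. */
ods html close;
```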
HTML Examples
[Sample output: STYLE=DEFAULT and STYLE=SASWEB (p111d12)]
[Sample output: STYLE=RTF and STYLE=OCEAN (p111d17)]
Available Styles
Use the TEMPLATE procedure to see the available styles.
proc template;
list styles;
run;
Partial Output
Listing of: SASHELP.TMPLMST
Path Filter is: Styles
Sort by: PATH/ASCENDING

Obs  Path                 Type
1    Styles               Dir
2    Styles.Analysis      Style
3    Styles.Astronomy     Style
4    Styles.Banker        Style
5    Styles.BarrettsBlue  Style
6    Styles.Curve         Style
7    Styles.Default       Style
8    Styles.Dtree         Style
9    Styles.EGDefault     Style
10   Styles.Education     Style
p111d13
Business Scenario
Create SAS reports that can be opened in Microsoft Excel
using the CSVALL, MSOFFICE2K, and EXCELXP
destinations.
Destination      Type of File                Extension  Viewed In
CSVALL           Comma-Separated Value       csv        Editor or Microsoft Excel
MSOFFICE2K       Hypertext Markup Language   html       Web browser or Microsoft Word or Microsoft Excel
TAGSETS.EXCELXP  Extensible Markup Language  xml        Microsoft Excel
11.08 Quiz
Complete the ODS statements below to send the output
to a CSVALL destination.
ods ________ file="&path\myexcel.____";
ods ________ close;
CSVALL Destination
CSVALL does not include any style information.
MSOFFICE2K Destination
MSOFFICE2K keeps the style information, including
spanning headers.
EXCELXP Destination
EXCELXP keeps the style information. Output from each
procedure is on a separate sheet.
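The three destinations can be open at the same time, so one program can produce all three files for comparison. The sketch below assumes the file names myexcel.csv, myexcel.html, and myexcel.xml in the &path folder and uses two procedures so that the EXCELXP result has more than one sheet. The choice of procedures is illustrative only.

```sas
/* Open the three Excel-compatible destinations. */
ods csvall          file="&path\myexcel.csv";
ods msoffice2k      file="&path\myexcel.html";
ods tagsets.excelxp file="&path\myexcel.xml";

proc freq data=orion.sales;
   tables Country;
run;

proc means data=orion.sales;
   var Salary;
run;

/* Close each destination so that its file is complete. */
ods csvall close;
ods msoffice2k close;
ods tagsets.excelxp close;
```

Opening the three files in Excel shows the differences described above: no styling in the CSV file, styled output in the HTML file, and one sheet per procedure in the XML file.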
Keep in Mind
The file that you create with the CSVALL, MSOFFICE2K, or EXCELXP destination is not an Excel file.
It is a CSV, HTML, or XML file that Excel can open.
This demo can be run in SAS Enterprise Guide and in the SAS windowing environment.
Open and submit p111d18. It creates HTML, PDF, and RTF output.
ods listing close;
ods html file="&path\myreport.html";
ods pdf file="&path\myreport.pdf";
ods rtf file="&path\myreport.rtf";
title 'Report 1';
proc freq data=orion.sales;
tables Country;
run;
title 'Report 2';
proc means data=orion.sales;
var Salary;
run;
title 'Report 3';
proc print data=orion.sales;
var First_Name Last_Name
Job_Title Country Salary;
where Salary > 75000;
run;
ods _all_ close;
ods listing;
title;
For UNIX, the following ODS statements are used:
ods listing close;
ods html file="&path/myreport.html";
ods pdf file="&path/myreport.pdf";
ods rtf file="&path/myreport.rtf";
For z/OS (OS/390), the following ODS statements are used:
ods listing close;
ods html file="&path..report(myhtml)" rs=none;
ods pdf file="&path..report(mypdf)" rs=none;
ods rtf file="&path..report(myrtf)" rs=none;
Click the appropriate tab to see the results within Enterprise Guide. If a download prompt is displayed,
select the download.
HTML Results
PDF Results
RTF Results
After the RTF file is downloaded, Microsoft Word opens automatically to display the RTF results.
The RTF tab is removed from the workspace.
3. The program created three output files. They are listed in the Results window with an appropriate
icon based on file type. Double-click an icon to view the results.
This demo can be run in SAS Enterprise Guide and in the SAS windowing environment.
3. Launch Microsoft Excel to view the three files created by this program.
4. Within Excel, select File > Open and navigate to your data folder. In the Filename box, type
myexcel and select myexcel.csv to view the CSV file created through the CSVALL destination.
5. Within Excel, open myexcel.html to view the HTML file created through the MSOFFICE2K
destination.
6. Within Excel, open myexcel.xml to view the XML file created through the EXCELXP destination.
Exercises
Level 1
12. Directing Output to PDF and RTF Destinations
Open p111e12. Create a PDF version of the PROC PRINT report by adding ODS statements. Use the
following naming convention when creating the PDF file:
Windows
"&path\p111s12p.pdf"
UNIX
"&path/p111s12p.pdf"
z/OS (OS/390)
"&path..report(p111s12p)"
Compare this PDF output to the equivalent report that appears in the Output window.
Modify your ODS statements to create the RTF version of the PROC PRINT report. Use the
following naming convention when creating the RTF file.
Windows
"&path\p111s12r.rtf"
UNIX
"&path/p111s12r.rtf"
z/OS (OS/390)
"&path..report(p111s12r)"
b. Submit the program and view the RTF output in Microsoft Word:
What happens if the RUN statement is moved to the end of the program?
c. Add the STYLE= option to the ODS RTF statement to use the Curve style definition.
d. Submit the program and view output in Microsoft Word.
Level 2
13. Creating ODS Output Compatible with Microsoft Excel
a. Open p111e13. Add ODS statements to send the report to a file that can be viewed in Microsoft
Excel. Choose the ODS destination and file extension based on whether you want the following:
1) style information stored in the report output
2) the reports in a single worksheet or multiple worksheets
If selecting a destination that supports style information, specify the LISTING style definition.
Use the following naming convention when creating the file. For Windows or UNIX, choose
an appropriate extension for the file depending on the type of file that is created.
Windows
"&path\p111s13.xxx"
UNIX
"&path/p111s13.xxx"
z/OS (OS/390)
"&path..report(p111s13)"
c. Open the file with Microsoft Excel. The report should resemble the following results. Your
output will look different depending on the ODS destination that you choose.
Challenge
14. Adding HTML-Specific Features to ODS Output
a. Open p111e14. Create an HTML version of the PROC PRINT report by adding ODS statements.
Use the following naming convention when creating the HTML file:
Windows
"&path\p111s14.html"
UNIX
"&path/p111s14.html"
z/OS (OS/390)
"&path..report(p111s14)"
b. Customize the title so that it becomes a clickable hyperlink when displayed in a web browser.
The hyperlink should point to the URL http://www.sas.com (the SAS home page).
An option must be added to the TITLE statement to make it an active hyperlink. Explore
TITLE statement options in the SAS Help facility or online documentation.
"&path\p111e15c.css"
UNIX
"&path/p111e15c.css"
z/OS (OS/390)
"&path..report(p111e15c)"
Explore the ODS HTML statement in the SAS Help facility or online documentation to
investigate the syntax required to reference an existing CSS file. Follow the
documentation links for the STYLESHEET= option and its URL= suboption.
11.4 Solutions
Solutions to Exercises
1. Counting Levels of a Variable with PROC FREQ
Would you expect to see frequencies of 1 for customers and sales employees? No. This file contains
all customer orders, and the ID of the employee who helped with the sale. Employee_ID is likely
to have higher frequency counts, as would frequent customers.
title1 'Unique Customers and Salespersons for Retail Sales';
proc freq data=orion.orders nlevels;
where Order_Type=1;
tables Customer_ID Employee_ID / noprint;
run;
title;
title1 'Catalog and Internet Customers';
proc freq data=orion.orders order=freq;
where Order_Type ne 1;
tables Customer_ID / nocum;
run;
title;
2. Validating orion.shoes_tracker with PROC FREQ
proc freq data=orion.shoes_tracker nlevels;
tables Supplier_Name Supplier_ID;
run;
Product_Name='Product'
Count='Orders';
run;
title;
6. Creating a Summary Report with PROC MEANS
proc format;
value ordertypes
1='Retail'
2='Catalog'
3='Internet';
run;
title 'Revenue from All Orders';
proc means data=orion.order_fact sum;
var Total_Retail_Price;
class Order_Date Order_Type;
format Order_Date year4. Order_Type ordertypes.;
run;
title;
7. Validating orion.price_current with the UNIVARIATE Procedure
proc univariate data=orion.price_current;
var Unit_Sales_Price Factor;
run;
Find the Extreme Observations output.
How many values of Unit_Sales_Price are over the maximum of 800? one (5730)
How many values of Factor are under the minimum of 1? one (0.01)
How many values of Factor are over the maximum of 1.05? two (10.20 and 100.00)
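Once the Extreme Observations output has identified suspect values, a WHERE statement can list the observations that contain them. The sketch below uses the limits stated in the questions above (800 for Unit_Sales_Price, 1 and 1.05 for Factor). It is one possible follow-up, not part of the course solution.

```sas
title 'Values Outside the Expected Ranges';
proc print data=orion.price_current;
   /* In SAS, a missing numeric value compares as less than any number, */
   /* so observations with a missing Factor are also listed.            */
   where Unit_Sales_Price > 800
      or Factor < 1
      or Factor > 1.05;
run;
title;
```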
8. Analyzing Missing Numeric Values with PROC MEANS
title 'Number of Missing and Non-Missing Date Values';
proc means data=orion.staff nmiss n nonobs;
var Birth_Date Emp_Hire_Date Emp_Term_Date;
class Gender;
run;
title;
9. Validating orion.shoes_tracker with the UNIVARIATE Procedure
proc univariate data=orion.shoes_tracker;
var Product_ID;
run;
How many values of Product_ID are too small? one (2.20200E+10)
How many values of Product_ID are too large? one (2.2020E+12)
10. Creating an Output Data Set with PROC MEANS
proc means data=orion.order_fact noprint nway;
class Product_ID;
var Total_Retail_Price;
output out=product_orders sum=Product_Revenue;
run;
data product_names;
merge product_orders orion.product_list;
by Product_ID;
keep Product_ID Product_Name Product_Revenue;
run;
proc sort data=product_names;
by descending Product_Revenue;
run;
title 'Top Five Products by Revenue';
proc print data=product_names(obs=5) label;
var Product_Revenue Product_ID Product_Name;
label Product_ID='Product Number'
Product_Name='Product'
Product_Revenue='Revenue';
format Product_Revenue eurox12.2;
run;
title;
11. Selecting Only the Extreme Observations Output from the UNIVARIATE Procedure
ods trace on;
ods select ExtremeObs;
proc univariate data=orion.shoes_tracker;
var Product_ID;
run;
ods trace off;
What is the name of the last output added in the SAS log? ExtremeObs
12. Directing Output to PDF and RTF Destinations
ods pdf file="&path\p111s12p.pdf";
ods rtf file="&path\p111s12r.rtf" style=curve;
title 'July 2011 Orders';
proc print data=orion.mnth7_2011;
run;
ods pdf close;
ods rtf close;
13. Creating ODS Output Compatible with Microsoft Excel
ods csvall file="&path\p111s13.csv";
ods msoffice2k file="&path\p111s13.html" style=Listing;
ods tagsets.excelxp file="&path\p111s13.xml" style=Listing;
title 'Customer Type Definitions';
proc print data=orion.customer_type;
run;
ods csvall close;
ods msoffice2k close;
ods tagsets.excelxp close;
AU
63
US
102
p111a01s
p111a02s
                              Cumulative  Cumulative
       Frequency   Percent     Frequency     Percent
Tier1          1      0.61             1        0.61
Tier2        158     95.76           159       96.36
Tier3          4      2.42           163       98.79
Tier4          2      1.21           165      100.00
p111a03s
AU
63
61
b.
AU
61
63
65
Obs  Employee_ID  First    Last         Gender  Salary  Job_Title            Country
4    120106       John     Hornsey      M       .       Office Assistant II  AU
13   120115       Hugh     Nichollas    M       2650    Service Assistant I  AU
20   120191       Jannene  Graham-Rowe  F       2401    Trainee              AU
p111a04s
p111a01s
p111a06s
12.1 Introduction
Objectives
Customer Support
SAS provides a variety of resources to help customers.
http://support.sas.com/resourcekit/
Education
SAS Education provides comprehensive training, including
http://support.sas.com/training/
http://support.sas.com/certify/
Networking
Social media channels and user group organizations
enable you to
Icons are from top left to right, and then bottom left to right:
Twitter, RSS, Myspace, YouTube, Facebook, Technorati, and sasCommunity.org
SAS Publishing
SAS Publications offers a complete selection
of publications, including
e-books
CD-ROM
hard-copy books
books written by outside authors.
http://support.sas.com/publishing/
Next Steps
To learn more about this:
Creating reports using the REPORT and TABULATE          SAS Report Writing 1: Essentials
procedures, plus the Output Delivery System (ODS)
Performing statistical analysis using SAS/STAT software