SAS E Miner Cloud-Based Software - Tutorial 1
Chapter 1 • Introduction to SAS Enterprise Miner 15.2
squares, LARS and LASSO, nearest neighbor, and importing models defined by
other users or even outside SAS Enterprise Miner.
• Assess the data by evaluating the usefulness and reliability of the findings from the
data mining process. This step includes the use of tools for comparing models and
computing new fit statistics, cutoff analysis, decision support, report generation, and
score code management.
You might or might not include all of the SEMMA steps in an analysis, and it might be
necessary to repeat one or more of the steps several times before you are satisfied with
the results.
After you have completed the SEMMA steps, you can apply a scoring formula from one
or more champion models to new data that might or might not contain the target
variable. Scoring new data that is not available at the time of model training is the goal
of most data mining problems.
Furthermore, advanced visualization tools enable you to quickly and easily examine
large amounts of data in multidimensional histograms and to graphically compare
modeling results.
SAS Enterprise Miner includes tools for generating and testing complete score code for
the entire process flow diagram as SAS code, C code, and Java code, as well as tools for
interactively scoring new data and examining the results. You
can register your model to a SAS Metadata Server to share your results with users of
applications such as SAS Enterprise Guide and SAS Data Integration Studio that can
integrate the score code into reporting and production processes. SAS Model Manager
complements the data mining process by providing a structure for managing projects
through development, testing, and production environments and is fully integrated with
SAS Enterprise Miner.
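To make the idea of scoring concrete, here is a minimal Python sketch of applying a fitted formula to new records that do not contain the target variable. It is not score code produced by SAS Enterprise Miner. The logistic form, the coefficient values, and the variable names (gift_count, months_since_last_gift) are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients, for illustration only. A real champion model
# supplies its own score code (exported, for example, as SAS, C, or Java).
INTERCEPT = -1.2
COEFFICIENTS = {"gift_count": 0.08, "months_since_last_gift": -0.05}


def score(record):
    """Return a predicted probability for one record.

    The record only needs the input variables; the target may be absent.
    """
    linear = INTERCEPT + sum(
        weight * record[name] for name, weight in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# New data collected after training, with no target column.
new_data = [
    {"gift_count": 10, "months_since_last_gift": 6},
    {"gift_count": 1, "months_since_last_gift": 30},
]
scores = [score(row) for row in new_data]
```

Each entry in scores is a probability between 0 and 1 that can be ranked or compared against a cutoff.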
The graphical user interface (GUI) is designed so that a business analyst who has little
statistical expertise can navigate through the data mining methodology, while a
quantitative expert can explore each node in depth to fine-tune the analytical process.
SAS Enterprise Miner automates the scoring process and supplies complete scoring code
for all stages of model development in SAS, C, Java, and PMML. The scoring code can
Accessibility Features of SAS Enterprise Miner 15.2
If you have questions or concerns about the accessibility of SAS products, send email to
[email protected].
1. Toolbar Shortcut Buttons — Use the toolbar shortcut buttons to perform common
computer functions and frequently used SAS Enterprise Miner operations. Move the
mouse pointer over any shortcut button to see the text name. Click a shortcut button
to use it.
2. Project Panel — Use the Project Panel to manage and view data sources, diagrams,
results, and project users.
3. Properties Panel — Use the Properties Panel to view and edit the settings of data
sources, diagrams, nodes, and users.
4. Property Help Panel — The Property Help Panel displays a short description of any
property that you select in the Properties Panel. Extended help is available from the
Help main menu.
5. Toolbar — The Toolbar is a graphic set of node icons that you use to build process
flow diagrams in the Diagram Workspace. Drag a node icon into the Diagram
Workspace to use it. The icon remains in place in the Toolbar, and the node in the
Diagram Workspace is ready to be connected and configured for use in the process
flow diagram.
6. Diagram Workspace — Use the Diagram Workspace to build, edit, run, and save
process flow diagrams. In this workspace, you graphically build, order, sequence,
and connect the nodes that you use to mine your data and generate reports.
7. Diagram Navigation Toolbar — Use the Diagram Navigation Toolbar to organize
and navigate the process flow diagram.
TIP The book “Predictive Modeling with SAS Enterprise Miner: Practical Solutions
for Business Applications” provides examples of saving and exporting SAS code and
offers additional discussion about the SAS Enterprise Miner graphical user interface.
Chapter 2 • Learning by Example: Building and Running a Process Flow
TIP “Predictive Modeling with SAS Enterprise Miner: Practical Solutions for
Business Applications” provides several additional example process flow diagrams
for you to create and run.
information about the structure of the sample data, see Sample Data Reference on page
63.
Chapter 3
TIP For organizational purposes, it is a good idea to create a separate project for each
major data mining problem that you want to investigate.
To create the project that you will use in this example:
1. Open SAS Enterprise Miner.
2. In the Welcome to Enterprise Miner window, click New Project. The Create New
Project Wizard opens.
3. Proceed through the steps below to complete the wizard. Contact your system
administrator if you need to be granted directory access or if you are unsure about
the details of your site's configuration.
a. Select the logical workspace server to use. Click Next.
b. Enter Getting Started Charitable Giving Example as the Project
Name.
The SAS Server Directory is the directory on the server machine in which SAS
data sets and other files that the project generates are stored. The default path
for your site is likely appropriate for this example. Click Next.
c. The SAS Folder Location is the directory on the server machine in which the
project itself is stored. The default path for your site is likely appropriate for
the example project that you are about to create. Click Next.
Note: If you complete this example over multiple sessions, then this is the
location to which you should navigate after you select Open Project in the
Welcome to Enterprise Miner window.
d. Click Finish.
Create a Library
To access the sample data sets in SAS Enterprise Miner, you must create a SAS library
that tells SAS where the data sets are stored. When you create a library, you give SAS
a shortcut name and a pointer to the storage location in your operating environment
where you store SAS files.
To create a new SAS library for the sample data:
1. On the File menu, select New → Library. The Library Wizard opens.
2. Proceed through the steps below to complete the wizard. Contact your system
administrator if you need to be granted directory access or if you are unsure about
the details of your site's configuration.
a. The Create New Library option button is automatically selected. Click Next.
b. Enter Donor as the Name.
Then enter the Path to the directory on the server machine that contains the
sample data that you downloaded from the web. For example, if the sample data
is located on your desktop on the C drive of the server machine, then you could
enter C:\Users\<username>\Desktop, where <username> is your user name on the
server machine. Click Next.
c. Click Finish.
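A library is a short name mapped to a storage location, as the following minimal Python sketch shows. It is not how SAS Enterprise Miner stores libraries. It assumes that the standard SAS libref rules apply to the Name field: 1 to 8 characters, beginning with a letter or underscore, followed by letters, digits, or underscores, and not case-sensitive. The function names and the registry dictionary are hypothetical.

```python
import re

# Assumed standard SAS libref rule: 1-8 characters, first character a letter
# or underscore, remaining characters letters, digits, or underscores.
LIBREF_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")

# Hypothetical in-memory registry: shortcut name -> storage location.
libraries = {}


def is_valid_libref(name):
    """Return True if name satisfies the assumed libref naming rule."""
    return LIBREF_PATTERN.fullmatch(name) is not None


def register_library(name, path):
    """Record a shortcut name for a storage location."""
    if not is_valid_libref(name):
        raise ValueError(f"invalid library name: {name!r}")
    # Library names are treated as case-insensitive, so store them uppercased.
    libraries[name.upper()] = path


# <username> is the placeholder from the step above, left unfilled on purpose.
register_library("Donor", r"C:\Users\<username>\Desktop")
```

Under this rule, Donor is an acceptable name, but a longer name such as CharitableGiving is rejected because it exceeds eight characters.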
Create a Data Source
To change an attribute, click the value of that attribute and select from the drop-
down menu that appears.
Note: SAS Enterprise Miner automatically assigns the role Target to any
variable whose name begins with the prefix TARGET_. For more
information about the rules that SAS Enterprise Miner uses to automatically
assign roles, see the SAS Enterprise Miner Help.
f. Select the Yes option button to indicate that you want to build models based on
the values of decisions. Click Next.
• On the Prior Probabilities tab, select the Yes option button to indicate that
you want to enter new prior probabilities. In the Adjusted Prior column of the
table, enter 0.05 for Level 1 and 0.95 for Level 0.
The values in the Prior column reflect the proportions of observations in the
data set for which TARGET_B is equal to 1 and 0 (0.25 and 0.75,
respectively). However, as the business analyst, you know that these
proportions resulted from over-sampling of donors from the 97NK
solicitation. In fact, you know that the true proportion of donors for the
solicitation was closer to 0.05 than 0.25. For this reason, you adjust the prior
probabilities.
• On the Decision Weights tab, the Maximize option button is automatically
selected, which indicates that you want to maximize profit in this analysis.
Enter 14.5 as the Decision 1 weight for Level 1, -0.5 as the Decision 1
weight for Level 0, and 0.0 as the Decision 2 weight for both levels. Click
Next.
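The prior probabilities and decision weights entered above can be combined in a short Python sketch. It is not SAS Enterprise Miner's internal code. It assumes the standard prior-correction formula, which rescales each posterior probability by the ratio of adjusted prior to sample prior and then renormalizes, and it assumes that the decision with the highest expected profit is chosen. The function names are hypothetical.

```python
# Values taken from the Prior Probabilities and Decision Weights tabs above.
SAMPLE_PRIORS = {1: 0.25, 0: 0.75}    # proportions in the over-sampled data
ADJUSTED_PRIORS = {1: 0.05, 0: 0.95}  # proportions believed to be true
DECISION_WEIGHTS = {
    "Decision 1": {1: 14.5, 0: -0.5},
    "Decision 2": {1: 0.0, 0: 0.0},
}


def adjust_posterior(p_level1):
    """Rescale a model posterior for Level 1 to the adjusted priors."""
    w1 = p_level1 * ADJUSTED_PRIORS[1] / SAMPLE_PRIORS[1]
    w0 = (1.0 - p_level1) * ADJUSTED_PRIORS[0] / SAMPLE_PRIORS[0]
    return w1 / (w1 + w0)


def expected_profits(p_level1):
    """Expected profit of each decision, given P(Level 1) = p_level1."""
    return {
        decision: w[1] * p_level1 + w[0] * (1.0 - p_level1)
        for decision, w in DECISION_WEIGHTS.items()
    }


def best_decision(model_posterior):
    """Adjust the posterior, then pick the most profitable decision."""
    profits = expected_profits(adjust_posterior(model_posterior))
    return max(profits, key=profits.get)
```

For example, a model posterior of 0.25, which equals the sample proportion, adjusts to 0.05. With these weights the expected profit of Decision 1 is 14.5 × 0.05 − 0.5 × 0.95 = 0.25, which is greater than the 0.0 of Decision 2, so Decision 1 is selected.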
TIP Refer to “Predictive Modeling with SAS Enterprise Miner: Practical Solutions
for Business Applications” for more examples about creating new projects, creating
data sources, creating diagrams, and adding nodes to your diagram workspace. The
book also discusses your metadata options.
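The automatic role assignment described in the note earlier in this section can be sketched in Python as a simple prefix check. Only the TARGET_ rule comes from the text. The case-insensitive match, the function name auto_role, and the variable name EXAMPLE_INPUT are assumptions made for illustration. The remaining assignment rules are documented in the SAS Enterprise Miner Help and are not modeled here.

```python
def auto_role(variable_name):
    """Return "Target" for names that begin with TARGET_, otherwise None.

    None means that this sketch makes no claim about the role; other
    rules, not modeled here, would decide it.
    """
    # Assumption: the prefix check ignores case.
    if variable_name.upper().startswith("TARGET_"):
        return "Target"
    return None


# TARGET_B appears in this example; EXAMPLE_INPUT is a hypothetical name.
roles = {name: auto_role(name) for name in ["TARGET_B", "EXAMPLE_INPUT"]}
```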