CCPR Computing Services: Workshop 1: Programming Basics, Unix, Remote Computing October 13, 2004
Workshop 1:
Programming Basics, Unix, Remote Computing
October 13, 2004
Part 1: Programming Basics
Motivation
Before you start coding
Programming Conventions
Documentation
Names, comments
Directory Structure
Basic Constructs
Miscellaneous (debugging, cross-checking results)
Motivation
Facilitate research
Save time
Cleaner code
Easily share programs
Basic Concepts
MUCH better programming
Programming Conventions
What are conventions?
Examples
Who cares?
Readability of code
Organization
Transferring code to others
Apply conventions consistently to:
variable names and function names
comments
directory structure
Before you start coding
THINK
WRITE down the problem
WRITE down the algorithm in English (not code)
Modularity
Comments
Create test (if reasonable)
TRANSLATE one section to code
TEST the SECTION thoroughly
Translate/Test next section, etc.
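The steps above can be sketched in Python. This is only an illustration under assumed names: the problem, `drop_missing`, and `share_under_30` are hypothetical and not part of the workshop material.

```python
# Problem (written down first): what share of respondents is younger than 30?
# Algorithm in English:
#   1. Drop missing ages.
#   2. Count the ages below 30.
#   3. Divide by the number of non-missing ages.

def drop_missing(ages):
    """Step 1: keep only the ages that are not missing (None)."""
    return [age for age in ages if age is not None]

def share_under_30(ages):
    """Steps 2 and 3, built on the already tested step 1."""
    valid = drop_missing(ages)
    if not valid:
        raise ValueError("no non-missing ages")
    return sum(1 for age in valid if age < 30) / len(valid)

# Test each section before translating the next one.
assert drop_missing([25, None, 41]) == [25, 41]
assert share_under_30([25, None, 41, 29, 60]) == 0.5
```

Each function is one module of the algorithm, so a failing test points at a single section of code.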
Documentation - File Header
#Laura Piersol ([email protected])
#HRS project
#/u/socio/laurapiersol/HRS/
#October 11, 2004
#Python version 2.4
#Stata version 8
#Purpose: Create and merge two datasets in Stata,
# then convert data to SAS
#Input programs:
# HRS/staprog/H2002.do,
# HRS/staprog/x2002.do,
# HRS/staprog/mergeFiles.do
#Output:
# HRS/stalog/H2002.log,
# HRS/stalog/x2002.log,
# HRS/stalog/mergeFiles.log
# HRS/stadata/Hx2002.dta
# HRS/sasdata/Hx2002.sas
#Special instructions: Check log files for errors
# check for duplicates upon new data release
File header includes:
Name
Project
Project location
Date
Version of software
Purpose
Inputs
Outputs
Special Instructions
Naming Conventions
Not a detail!
Good names clarify your code
Portray meaning/purpose
Adopt a convention and BE CONSISTENT
Naming Conventions, cont.
Use language standard (if it exists)
If no standard, pick one and BE CONSISTENT
Functions: getStats, calcBetas, showResults
Scalar variables: scPi, scGravity, scWorkHours
String variables: stName, stCareer
Global variables: _Country, _Nbhd
Be aware of language-specific rules
Max length, underscore, case, reserved words
Naming Conventions, cont.
Differentiating log files:
Programs: MergeHH.sas, MergeHH.do
Log files: MergeHHsas.log, MergeHHsta.log
Meaningful variable names:
LogWt vs. var1
AgeLt30 vs. x
Procedure that cleans missing values of Age:
fixMissingAge
Matrix multiplication X transpose times X
matXX
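As a rough illustration, the last two names could label Python functions like the ones below. The bodies are assumptions made for this sketch: the slides give only the names, and the missing-value code -9 is hypothetical.

```python
def fixMissingAge(ages, missingCode=-9):
    """Replace the (assumed) missing-value code with None."""
    return [None if age == missingCode else age for age in ages]

def matXX(X):
    """Return X transpose times X for a matrix stored as a list of rows."""
    nCols = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(nCols)]
            for i in range(nCols)]

print(fixMissingAge([34, -9, 51]))
print(matXX([[1, 2], [3, 4]]))
```

A reader can guess what each function does from its name alone, which is the point of the convention.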
Commenting Code
Good code is SELF-COMMENTING
Naming conventions, structure, header explain 95%
Comments explain
PURPOSE, not every detail
TRICKS
(good) reasons for unusual coding
Comments DO NOT
fix sloppy code
translate syntax
Commenting Code - Stata example
SAMPLE 1
program def function1
foreach v of varlist _all {
local x = lower("`v'")
if `"`v'"' != `"`x'"' {
rename `v' `=lower("`v'")'
}
}
end
No conventions, comments, structure
SAMPLE 2
*Convert names in dataset to lowercase.
program def lowerVarNames
    foreach v of varlist _all {
        local LowName = lower("`v'")
        *If variable is already lowercase,
        *rename statement throws error.
        if `"`v'"' != `"`LowName'"' {
            rename `v' `=lower("`v'")'
        }
    }
end
Comments: succinct and not overdone
Names:
lowerVarNames
-action word for program
-distinct use of case
LowName
-descriptive
-distinct use of case
v
-looping variable and short scope
-non-descriptive, but does not detract from meaning!
Structure: indentation, braces lined up!
Directory Structure
A project consists of many different types of files
Use folders to SEPARATE files in a logical way
Be consistent across projects if possible
ATTIC folder for older versions
HOME
  PROJECT NAME
    DATA
    RESULTS
    LOG
    PROGRAMS
    ATTIC
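A small script can create the same skeleton for every new project, which helps keep projects consistent. This is a minimal sketch: `make_project` is a hypothetical helper, and the demonstration builds the folders in a temporary directory rather than a real home directory.

```python
import tempfile
from pathlib import Path

# Folder names taken from the layout above.
SUBFOLDERS = ["DATA", "RESULTS", "LOG", "PROGRAMS", "ATTIC"]

def make_project(home, project_name):
    """Create HOME/PROJECT NAME with one subfolder per file type."""
    project = Path(home) / project_name
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as home:
    project = make_project(home, "HRS")
    created = sorted(p.name for p in project.iterdir())
    print(created)
```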
Miscellaneous Tips
BACKUP! Weekly zip file stored externally
README.txt file to describe folder
BE ORGANIZED
CROSS-VERIFY results
Something not working?
Remember: the computer is following YOUR directions. Go back to your code.
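The backup tip can be automated. The sketch below zips a project folder into a dated archive. `backup_project` is a hypothetical helper, the paths are made up for the demonstration, and copying the archive to external storage is still a separate step.

```python
import datetime
import tempfile
import zipfile
from pathlib import Path

def backup_project(project_dir, backup_dir):
    """Zip every file under project_dir into backup_dir/<name>_<date>.zip."""
    project_dir = Path(project_dir)
    stamp = datetime.date.today().isoformat()
    archive = Path(backup_dir) / f"{project_dir.name}_{stamp}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(project_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the project's parent folder.
                arcname = path.relative_to(project_dir.parent).as_posix()
                zf.write(str(path), arcname)
    return archive

# Demonstration on a throwaway project; keep backup_dir outside project_dir.
with tempfile.TemporaryDirectory() as work:
    project = Path(work) / "HRS"
    (project / "PROGRAMS").mkdir(parents=True)
    (project / "PROGRAMS" / "merge.do").write_text("* merge\n")
    (project / "README.txt").write_text("HRS project files\n")
    backups = Path(work) / "backups"
    backups.mkdir()
    archive = backup_project(project, backups)
    with zipfile.ZipFile(archive) as zf:
        archived_names = sorted(zf.namelist())
    print(archived_names)
```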
Programming Constructs
Tools to simplify and clarify your coding
Available in virtually all languages
Constructs - Looping
Repeat section of code
START value, INCREMENT, STOP value
Example: convert uppercase to lowercase for each variable in a dataset
Constructs - Looping Examples
C for loop: Start with x=1, Increment = x+1, Stop when x==10
for(x=1; x<10; x++) {
    code
}
PERL while loop: Start with count=1, Increment = count+1, Stop when count==11
$count=1;
while ($count<11) {
    print "$count\n";
    $count++;
}
STATA foreach loop: Start = first variable in varlist, Increment = next variable in varlist, Stop = last variable in varlist
foreach v of varlist _all {
    local LowName = lower("`v'")
    if `"`v'"' != `"`LowName'"' {
        rename `v' `=lower("`v'")'
    }
}
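For comparison, the same three loops might look like this in Python. The variable names in `var_names` are made up for the illustration.

```python
# for loop: start at 1, add 1 each pass, stop before 10 (as in the C example).
for_values = []
for x in range(1, 10):
    for_values.append(x)

# while loop: start at 1, add 1 each pass, stop when count reaches 11.
count = 1
while_values = []
while count < 11:
    while_values.append(count)
    count += 1

# foreach-style loop: visit each (hypothetical) variable name in turn.
var_names = ["HHID", "AgeLt30", "logwt"]
lower_names = []
for name in var_names:
    lower_names.append(name.lower())

print(for_values)
print(while_values)
print(lower_names)
```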
Constructs - If/then/else
Execute section of code if condition is true:
if condition then
{execute this code if condition true}
end
Execute one of two sections of code:
if condition then
{execute this code if condition true}
else
{execute this code if condition false}
end
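In Python the two forms above could be written as follows. The function names and the age cutoff are hypothetical choices for this sketch.

```python
def flag_missing(age):
    """Plain if: run the extra code only when the condition is true."""
    note = ""
    if age is None:
        note = "age is missing"
    return note

def describe_age(age):
    """if/else: run exactly one of the two sections."""
    if age < 30:
        label = "under 30"
    else:
        label = "30 or older"
    return label

print(flag_missing(None))
print(describe_age(25), "/", describe_age(30))
```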
Constructs - Elseif/case
Elseif - Execute one of many sections of code:
if condition1 then
{execute this code if condition1 true}
elseif condition2 then
{execute this code if condition2 true}
else
{execute this code if condition1 and condition2 are both false}
end
Case- same idea, different name
case condition1 then
{execute this code if condition1 true}
case condition2 then
{execute this code if condition2 true}
etc.
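A Python sketch of both ideas follows. Python spells elseif as `elif`. Older Python versions have no case statement, so a dictionary lookup stands in for it here. The age cutoffs and career codes are assumptions made for the example.

```python
def age_group(age):
    """elseif chain: the first true condition wins, else covers the rest."""
    if age < 18:
        return "child"
    elif age < 65:
        return "working age"
    else:
        return "retirement age"

# Case-style selection: one label per (hypothetical) career code.
CAREER_LABELS = {1: "employed", 2: "unemployed", 3: "retired"}

def career_label(code):
    """Look up the matching case; fall back to 'unknown' if none matches."""
    return CAREER_LABELS.get(code, "unknown")

print(age_group(10), age_group(40), age_group(70))
print(career_label(2), career_label(9))
```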
Constructs - And, or, xor
AND - True only when BOTH conditions are true
OR - True when AT LEAST ONE condition is true
XOR - True when EXACTLY ONE condition is true
1 AND 1 = True      1 OR 1 = True       1 XOR 1 = False
1 AND 0 = False     1 OR 0 = True       1 XOR 0 = True
0 AND 0 = False     0 OR 0 = False      0 XOR 0 = False
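The table can be checked with a few lines of Python. Python has `and` and `or` keywords but no `xor` keyword, so this sketch uses `a != b`, which for two booleans is true exactly when one of them is true.

```python
# Build a truth table for AND, OR and XOR over every pair of conditions.
truth_table = []
for a in (True, False):
    for b in (True, False):
        truth_table.append((a, b, a and b, a or b, a != b))

print("a      b      AND    OR     XOR")
for row in truth_table:
    print("".join(f"{str(value):<7}" for value in row))
```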
Constructs - Break
Stop execution of program
Examples:
Debugging. If particular error occurs then break.
Parameters in function call are nonsensical. Print error and break.
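Both examples can be sketched in Python, with one caveat: in Python `break` leaves only the enclosing loop, while raising an error (or calling `sys.exit`) is what stops the program. The function `mean_age` and the data are hypothetical.

```python
import sys

def mean_age(ages):
    """Average of ages; refuses nonsensical input instead of guessing."""
    if len(ages) == 0:
        raise ValueError("mean_age needs at least one age")
    return sum(ages) / len(ages)

# Debugging aid: stop looping as soon as an impossible value appears.
checked = []
for age in [34, 51, -7, 28]:
    if age < 0:
        print("Invalid age", age, "- stopping the loop", file=sys.stderr)
        break
    checked.append(age)

print(checked, mean_age(checked))
```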
Constructs - keywords
Looping - for, foreach, do, while
If statements - if, then, else, case
And/Or/Xor - logical, and, or, xor, &, |
Break - exit, break
PART 2: Unix
Motivation
Basic Commands
Job submission and management
Pipes
Unix Shell
Script files
Unix
Motivation
A quick history
Unix variants
(AIX, Solaris, FreeBSD, Linux)
Where?
Nicco (SSC's server)
CCPR's Linux cluster (coming soon)
CCPR's data server (coming soon)
Unix Basic Commands
Getting Help
man command        list help for command (man is short for manual)
man -k keyword     search the man pages for keyword
whatis command     give a brief description of command
apropos keyword    list commands with keyword in the NAME section of their man page
Unix Basic Commands
File Management
ls                              list files (options: -l for long, -a for all)
mv filename1 filename2          rename filename1 to filename2
cp filename1 filename2          make a copy of filename1 and call it filename2
rm filename                     delete filename
more filename                   print contents of filename to the screen, one page at a time
cat filename                    print contents of filename to the screen
cat filename1 >> filename2      append contents of filename1 to the file filename2
cat part1 part2 >> bothparts    append the contents of files part1 and part2 to the file bothparts
head filename                   show first 10 lines on screen
tail filename                   show last 10 lines on screen
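To make the behavior of head, tail, and `cat >>` concrete, here is a rough Python imitation. The helper functions are hypothetical stand-ins written for this sketch, not replacements for the real commands, and the demonstration files live in a temporary directory.

```python
import tempfile
from collections import deque
from pathlib import Path

def head(path, n=10):
    """Return the first n lines of a file, like `head filename`."""
    with open(path) as f:
        return [line.rstrip("\n") for _, line in zip(range(n), f)]

def tail(path, n=10):
    """Return the last n lines of a file, like `tail filename`."""
    with open(path) as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]

def append_file(source, target):
    """Append source to target, like `cat source >> target`."""
    with open(source) as src, open(target, "a") as dst:
        dst.write(src.read())

with tempfile.TemporaryDirectory() as work:
    part1 = Path(work) / "part1"
    part2 = Path(work) / "part2"
    both = Path(work) / "bothparts"
    part1.write_text("".join(f"line {i}\n" for i in range(1, 13)))
    part2.write_text("extra\n")
    first_lines = head(part1)
    last_lines = tail(part1)
    append_file(part1, both)
    append_file(part2, both)
    combined_count = len(both.read_text().splitlines())

print(first_lines[0], last_lines[0], combined_count)
```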
Unix Basic Commands
Directory Management
pwd              show current directory
mkdir dirname    create new directory dirname
rmdir dirname    remove directory dirname
cd dirname       change to directory dirname
cd               change to your home directory
cd ..            move one directory up
cd ../..         move two directories up
~                home directory
cd ~/scripts     change to the scripts folder in your home directory
Unix Basic Commands
Using Previous Commands
!!               repeat last command
!v               repeat the last command that started with v
!-2              repeat second to last command
arrow up/down    scroll through list of previous commands
history          list history of commands used
Unix Basic Commands
Other Useful Unix Tools
*                            wildcard (matches any number of characters)
?                            wildcard (matches single character)
grep word filename           list lines of filename containing word
diff filename1 filename2     show differences between filename1 and filename2
wc filename                  count number of lines, words, and characters in filename
sort < filename              sort the lines of filename
who                          list users currently on the system
cal                          display current month's calendar
date                         display date
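As an illustration of what grep and wc report, the sketch below imitates both on a small made-up text. The functions are simplified assumptions: the real grep matches patterns, not just literal words, and the real wc counts bytes by default.

```python
def grep(word, text):
    """Return the lines of text that contain word, like `grep word filename`."""
    return [line for line in text.splitlines() if word in line]

def wc(text):
    """Return (lines, words, characters), roughly like `wc filename`."""
    return text.count("\n"), len(text.split()), len(text)

sample = "alpha beta\ngamma\nbeta delta\n"
matches = grep("beta", sample)
counts = wc(sample)
print(matches)
print(counts)
```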
Unix Basic Commands
Disk Usage
du -s    Total kilobytes used in current directory
du -a    Same as above, but more detail
ls -l    Gives individual file sizes in bytes
Editing Files in Unix
Vi Editor and Emacs
Neither is user-friendly for beginners
Look at CCPR internet when you start
Best way to learn is to start editing a document
Once you get used to them, they're easy and fast to use.
Being nice
Always run your jobs nicely
Prevents interfering with other users
Precede command with nice +19 (no quotes)
user@nicco% nice +19 stata -b do jobfile.do
Job Submission in Unix
Interactive
user@nicco% stata
Foreground jobs
user@nicco% nice +19 stata -b do jobfile.do
user@nicco% nice +19 sas jobfile.sas
Background jobs: &
user@nicco% nice +19 stata -b do jobfile.do &
user@nicco% nice +19 sas jobfile.sas &
Background jobs that keep running after logoff: nohup
user@nicco% nohup nice +19 stata -b do jobfile.do &
user@nicco% nohup nice +19 sas jobfile.sas &
Job Management
Ctrl-c               cancel a foreground job
Ctrl-z               suspend a job in the foreground
bg                   move a suspended foreground job to the background
ps u                 list information for processes you own (under current shell)
ps ux                list information for all processes you own (including those under other shells)
ps aux               list information for all processes (including root, bin, etc.)
ps aux | more        display output of ps aux one page at a time
ps aux | grep PID    list lines from command ps aux containing PID
kill PID             kill (cancel) process number PID
top                  list the 15 processes using the most CPU
q                    quit top
Pipes
| redirects the output of command 1 to the input of command 2
Command               Output from    Input to        Result
top | grep piersol    top            grep piersol    Jobs in top 15 containing piersol
ps -aux | more        ps -aux        more            Jobs listed one page at a time
ls | wc -l            ls             wc -l           Count of files and directories in current directory
Extends to more than 2 commands
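The same wiring can be done from a script. The sketch below reproduces `ls | wc -l` with Python's standard `subprocess` module. It assumes a Unix-like system where `ls` and `wc` are on the PATH. `count_entries` is a hypothetical helper and the files are created only for the demonstration.

```python
import subprocess
import tempfile
from pathlib import Path

def count_entries(directory):
    """Run `ls directory | wc -l` and return the count as an integer."""
    ls = subprocess.Popen(["ls", str(directory)], stdout=subprocess.PIPE)
    # The pipe: the output of ls becomes the input of wc.
    wc = subprocess.Popen(["wc", "-l"], stdin=ls.stdout, stdout=subprocess.PIPE)
    ls.stdout.close()  # wc now holds the only reading end of the pipe
    output, _ = wc.communicate()
    ls.wait()
    return int(output.decode().strip())

with tempfile.TemporaryDirectory() as work:
    for name in ["H2002.do", "x2002.do", "mergeFiles.do"]:
        (Path(work) / name).write_text("* placeholder\n")
    entry_count = count_entries(work)

print(entry_count)
```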
Unix Shell
What's a Unix Shell?
The Unix shell is the program that provides the interface between the user and the kernel
What's a shell script?
A list of commands put into a file that can be interpreted by the Unix Shell.
What are scripting languages?
Generally easier to code but less efficient
Shell scripts, Perl, Python
Remote Computing
Windows to Whitney
Remote Login
  Remote Desktop Connection
    Windows XP or later: included*
    Earlier than XP: download*
  Web-based Remote Desktop Connection*
File Transfer
  Map a Drive via Windows Explorer
*See http://www.ccpr.ucla.edu/asp/compserv.asp, XP users get latest version
Remote Computing
Windows to Unix
Remote Login
  SSH Secure Shell Client*
File Transfer
  SSH Secure File Transfer Client*
  Map a Drive via Windows Explorer (need Samba account**)
* http://computing.sscnet.ucla.edu/training/tutorial_SSH.htm
** http://computing.sscnet.ucla.edu/training/tutorial_samba.htm
Finally
Questions and Feedback