Perl Programming
An Introduction
September 2005
Notes:
Structure Of This Course
All of the material in this course comes from “Programming Perl 3rd edition”
and the Perl Cookbook.
If there’s anything which is not clear then ask as we go.
An assumption is that everyone has some programming experience. This course isn’t
going to teach programming.
Some parts of Perl are not going to be covered - Ties and DBM, Formats, Many
system functions. This is all reference material which you can find in any of the
standard texts - or in the man pages.
Agenda - Day 1/2
Each day runs from 09:00 to 16:00, with 30 minutes for lunch and a 15-minute break in both
the morning and the afternoon.
Agenda is flexible. If there are specific areas which I haven’t covered in which you
have an interest, then ask.
There are lots of LABS and exercises - most are small to start with and get more
detailed as we get to the end of the course.
By the time we get to the end of the overview everyone should be capable of writing
simple scripts which manipulate files and do simple pattern matching and substitution.
We will be largely learning by example - lots of the examples in this course come from
the Perl Cookbook.
The Pursuit Of Happiness (Or The Hard Sell)
Notes:
What Is Perl?
Any language can be used to write code which is not maintainable. Perl isn’t an
exception to that rule.
3. One-liners.
As part of the course notes there is a style guide for Perl. Follow it (or something like
it).
Some of the things we won’t have time to cover on this course are OO Perl and
Advanced Data Structures. If this is something which interests you, then let me
know since a follow-up course is possible/likely.
The History Of Perl
Notes:
More About Perl
A misconception:
Perl is interpreted and so it’s slow!
Perl compiles to an intermediate format (like Java bytecode or Pascal P-Code).
Once it is compiled it is passed to the interpreter for execution.
Hence:
You can write faster code in C but you can write code faster in Perl.
Unix:
Available on-site. See:
/pd/perl/5.005_503/bin/perl
/pd/perl/5.8.6/bin/perl
/usr/local/bin/perl
Windows:
ActiveState Perl (version 5.8.6) from www.activestate.com
Linux:
Included as part of all standard Linux distributions (version 5.8.6)
Mac OS X:
Included as part of OS X (version 5.8.1 on OS X 10.3.9)
Internet:
www.perl.com (The Perl homepage)
www.perl.org (The Perl mongers homepage)
www.oreilly.com
search.cpan.org (Go here to find Perl modules)
Comp.lang.perl newsgroup hierarchy:
comp.lang.perl.misc
comp.lang.perl.moderated
comp.lang.perl.modules
comp.lang.perl.tk
man perl from a Unix command line:
Gives all the perl help topics
Ask
All the news groups listed above are available in this building.
Perl is probably the most widely used and understood programming language in
Bristol. People can always come and ask me a question if they have a problem.
Places To Get Useful Information - II
Books:
Programming Perl (3rd edition)
Larry Wall & Tom Christiansen & Jon Orwant - ISBN 0-596-00027-8
Learning Perl (3rd edition)
Randal Schwartz and Tom Phoenix - ISBN 0-596-00132-0
Perl Cookbook (2nd edition).
Tom Christiansen & Nathan Torkington - ISBN 1-56592-243-3
Mastering Algorithms With Perl
Jon Orwant, Jarkko Hietaniemi & John Macdonald - ISBN 1-56592-398-7
Advanced Perl Programming
Sriram Srinivasan - ISBN 1-56592-220-4
If you only buy one book make it the camel book (a.k.a. Programming Perl),
followed by the Perl Cookbook. If you do buy Programming Perl make sure it’s the 3rd
edition and NOT the 2nd edition.
There are two Perl in 21 Days books, one of which is available on-line at the CR&D
bookshelf web-site.
(The on-line version can be found in the tutorial areas as a series of PDF files).
Since a lot of this course is going to be Perl by example, I’ve placed a few programs
into the various tutorial areas which all can be used (reused) as you wish. There’s
also a copy of a Perl module (Netlist_Functions.pm) which contains a lot of useful
functions which can be imported into your own programs. Hey, why bother
programming when you can steal! (This really is the philosophy you should be
adopting in your own work.)
(Some of) The Perl Manpages
Manpage     Covers
perl        What perl manpages are available
perldata    Data types
perlsyn     Syntax
perlop      Operators and precedence
perlre      Regular expressions
perlvar     Predefined variables
perlsub     Subroutines
perlfunc    Built-in functions
perlmod     How to make modules work
perlref     References
perlobj     Objects
perlipc     Inter-process communications
perlrun     How to run Perl commands, plus switches
perldebug   Debugging
perldiag    Diagnostic messages
Notes:
(More About) The Perl Manpages
See also:
perlfaq1 to perlfaq9
As of Perl version 5.6.1 you can search individual Perl manpages by using the
name of the manpage as a command and passing a Perl regular expression
as the search pattern.
Examples:
perlop comma
perlfunc split
perlvar ARG
perldiag 'assigned to typeglob'
When you don’t know where something is in the documentation, search all
the FAQs:
perlfaq round
Some Terminology
Idiomatic Perl:
Widespread and accepted ways of doing certain things in Perl.
if ( $variable != 56 ) { print "Your variable ($variable) did not equal 56\n"; }
print "Your variable ($variable) did not equal 56\n" unless ( $variable == 56 );
Interpolation:
Replacing a variable with the variable’s value.
Regexp’s:
Regular expressions.
CPAN:
The Comprehensive Perl Archive Network.
The place to go to get modules written and contributed by other Perl
programmers.
Don’t reinvent the wheel, or if you do then make sure it’s a better wheel.
Share code within your office/group/site/business unit.
Idiomatic Perl is one of the most confusing bits of Perl since there are so many
different ways of doing things. This can be both useful (you can program in the way
which suits you) and a drawback (reading other people’s code isn’t always easy).
Interpolation will be mentioned a lot by people who use Perl a lot - it’s just a fancy
computer science term.
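The two idioms and the interpolation rule above can be sketched as follows. This is only an illustration: the value 42 and the array @messages are made up for the example.

```perl
use strict;
use warnings;

my $variable = 42;    # hypothetical value, chosen for the example
my @messages;

# Block form: the condition comes first and the braces are required.
if ( $variable != 56 ) {
    push @messages, "block form: $variable is not 56";
}

# Modifier form: the statement comes first and the condition follows it.
push @messages, "modifier form: $variable is not 56" unless $variable == 56;

# Single quotes switch interpolation off, so the name is kept literally.
push @messages, 'no interpolation: $variable';

print "$_\n" for @messages;
```

Both forms test the same condition; which one reads better is largely a matter of taste.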
Regexps - these are not exactly the same as regular expressions in other UNIX
applications - so be careful.
CPAN - pretty light on EDA type code. Maybe we should start a forum!
Account Details
Notes:
A Standard Header
There are other binary invocations that use “eval” with some “magic”.
PREVIEW - Examples Of sprintf()
Field Meaning
%% A percent sign
%s A string
Be careful - sprintf() in Perl does its own formatting - it is NOT calling the
underlying sprintf() function in the C library.
PREVIEW - Examples Of sprintf()
Field Meaning
%n A special: stores the number of characters output so far into the next variable in the
argument list.
In addition to the formats on the previous slide, Perl also supports the following
conversions.
%i - a synonym for %d
%D - a synonym for %ld
%U - a synonym for %lu
%O - a synonym for %lo
%F - a synonym for %f
PREVIEW - Examples Of sprintf()
Flag Meaning
.number “Precision”: digits after the decimal point for floating-point numbers, maximum length
for a string, minimum length for an integer.
l Interpret integer as a C type long or unsigned long
h Interpret integer as C type short or unsigned short (if no flags are supplied interpret
integer as C type int or unsigned)
Perl allows the following flags between the % and the conversion character.
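A few of the fields and the precision flag from the tables above, as a minimal sketch. The values and variable names are invented for the example.

```perl
use strict;
use warnings;

# %% produces a literal percent sign.
my $percent = sprintf( "%d%%", 50 );

# Precision on a floating-point number: digits after the decimal point.
my $float = sprintf( "%.2f", 3.14159 );

# Precision on a string: maximum length.
my $clipped = sprintf( "%.3s", "camel" );

# Precision on an integer: minimum number of digits.
my $digits = sprintf( "%.5d", 42 );

print "$percent $float $clipped $digits\n";    # prints: 50% 3.14 cam 00042
```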
PREVIEW - Examples Of chop() And chomp()
$last_char = chop($var);
chop() always returns the character it removes. If you chop() a list, then every
item in the list is chopped. The thing which ends up in $answer in the question on
the slide is the character which was removed from the string $tmp. The thing you
probably wanted was $tmp.
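The difference between the return value and the modified variable can be sketched like this (variable names are made up for the example):

```perl
use strict;
use warnings;

# chop() removes the last character, whatever it is, and returns it.
my $line    = "hello\n";
my $removed = chop($line);     # $removed is "\n", $line is now "hello"

# chomp() only removes a trailing newline and returns how many
# characters it removed.
my $word      = "hello";
my $none_gone = chomp($word);  # 0, $word is unchanged

my $text     = "hello\n";
my $one_gone = chomp($text);   # 1, $text is now "hello"

print "line=$line word=$word text=$text\n";
```

In both cases the useful result is left in the variable, not in the return value.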
$number = hex("ffff12c0");
sprintf "%lx", $number; # (That's an ell, not a one.)
sprintf uses the same conventions as C’s sprintf.
Note that you can always set the value of any variable with a hex value just by doing
this:
$h_number = 0xffdd;
print $h_number;
The hex() function is interpreting a string as a hex number, not a value. If the string
begins with “0x”, this is ignored. To do a reverse conversion use sprintf() as
shown.
Hex strings can only represent integers. Strings which would cause integer overflow
will trigger a warning.
oct() will interpret a string as an octal value. If the string starts with “0” it will be
interpreted as octal. If the string starts with “0x” it will be interpreted as a hex
value. If it begins with “0b” it will be interpreted as a binary value.
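A short sketch of the conversions described above, in both directions. The values are chosen only for illustration.

```perl
use strict;
use warnings;

# String to number.
my $from_hex    = hex("ff");       # 255
my $from_hex_0x = hex("0xff");     # 255, the leading 0x is ignored
my $from_oct    = oct("755");      # 493
my $oct_as_hex  = oct("0xff");     # 255, oct() recognises the 0x prefix
my $oct_as_bin  = oct("0b101");    # 5, and the 0b prefix

# Number back to string.
my $to_hex = sprintf( "%x", 255 ); # "ff"
my $to_oct = sprintf( "%o", 493 ); # "755"

print "$from_hex $from_oct $oct_as_bin $to_hex $to_oct\n";
```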
Try this:
Notes:
Getting Started
For many programming tasks you’d like a language in which you can say:
There are many slides in this course which have this symbol in the top left
corner of the slide. All such slides are gathered together into a single
document called “How-to.pdf” in your labs and exercises directory.
This is a minimal (and complete) Perl program, but it illustrates some important
points.
1. You don’t have to say much before you say what you want to say.
2. You don’t have to say much after you’ve said what you want to say either. Unlike
many languages, Perl thinks it’s okay that you just fall off the end of your
program. You may use the exit() function to end a program (actually, you
should use the exit() function to end a program) just as you may force yourself to
pre-declare variables before you use them (actually …) It’s up to you!
LAB1 - HELLO_1
Variables, Arrays & Lists, Hashes
Notes:
Variables And Their Syntax
These are the two fundamental data types in Perl. One of a thing, and more than one
of a thing.
We call a variable which contains more than one thing, either an array/list or an
associative array/hash.
Variables And Their Syntax
We can write a different version of our first example (in the getting started section)
like this:
Perl has some other variable types with names like hash and handle and typeglob.
Later we’ll see that it is a good idea to force yourself to predefine variables before you
use them (using my()).
Hash and handle we’ll cover later. Typeglob won’t be covered in this course.
LAB1 - HELLO_2
LAB1 - HELLO_3
LAB1 - HELLO_4
Variables And Their Syntax
Tips:
We’ll cover subroutines in detail later in the course. Typeglob won’t be covered in this
course.
Variables And Their Syntax
Construct      Meaning
$days          Simple scalar value of $days
$days[28]      29th element of @days
$days{'Feb'}   "Feb" value from hash %days
Construct Meaning
Review:
Note that the range operator (..) has made an appearance. So 1 .. 20 will give you all
the integers between 1 and 20 inclusive. We’ll talk more about the range operator
later.
In the quiz example we’ve introduced a lot of new stuff. qw (think of this as quote-
word) lets you use Barewords to create lists. This whole example is an illustration of
context - the value of $days after the example has run is ?
Numeric Literals
$x = 12345; # integer
$x = 12345.67; # floating point
$x = 6.02e23; # scientific notation
$x = 4_294_967_296; # underline for legibility
$x = 0377; # octal
$x = 0xffff; # hexadecimal
$x = 0b11000000; # binary
You can’t use “,” in numbers since in Perl the , is an operator - so we use _ instead.
The “=” symbol does assignment. Be careful because the “==” symbol is used for
equality. At some point in your life you’ll accidentally confuse the two.
Scalars can also hold references to data structures, subroutines and objects.
$ary = \@myarray; # reference to a named array
$hsh = \%myhash; # reference to a named hash
$sub = \&mysub; # reference to a named subroutine
Variable interpolation:
$pet = "Camel";
$sign = "I love my $pet";
print $sign;
References will be covered extensively when we get to the in-depth look at Perl.
References are the key to writing efficient Perl code with subroutines, and the only
way to do OO programming.
In the example:
the => is the same as a comma “,” - this is a convenience which lets us see easily
where the keys are and where the values are. (Often known as syntactic sugar).
Variables Types - Scalars
If you use a variable which has never been assigned a value then:
The uninitialized variable springs into existence.
Is created with the null value - either 0 or “”.
Depending on how you use them variables will be interpreted as:
Strings.
Numbers.
True or False, i.e. boolean.
Context - suppose you said this:
$camels = '123';
print $camels + 1, "\n";
Answer: $camels is a string containing the text ‘123’. When Perl tries to add 1 to a
string it first converts the string containing the text ‘123’ into the number 123. It then
adds 1 and (hopefully) gets 124. This is then converted back into a string containing
the text ‘124’ which is then printed. A newline is then printed.
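The same scalar can be pushed into each of the three interpretations listed above. A minimal sketch (the extra variable names are made up):

```perl
use strict;
use warnings;

my $camels = '123';

# Numeric context: the string is converted to the number 123 first.
my $sum = $camels + 1;

# String context: the dot operator joins strings, so 1 becomes "1".
my $joined = $camels . 1;

# Boolean context: any string other than "" and "0" is true.
my $truth = $camels ? "true" : "false";

print "$sum $joined $truth\n";    # prints: 124 1231 true
```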
LAB2 - VARIABLES1
LAB2 - VARIABLES2
LAB2 - VARIABLES3
LAB2 - VARIABLES4_A
LAB2 - VARIABLES4_B
Arrays are also called lists - the distinction is blurred - when an array is used with
subscripts it’s generally regarded as an array, when it’s used as an ordered list and
used with push() pop() shift() and unshift() it’s generally regarded as a list.
It also depends upon context as well as how you think about a particular problem.
TMTOWTDI.
Variables Types - Arrays
($alpha,$omega) = ($omega,$alpha);
$home[0] = "couch";
$home[1] = "chair";
$home[2] = "table";
$home[3] = "stove";
The list can contain numbers, strings, or a mixture of both. It can also contain
references to variables and references to objects or references to other arrays or
references to other hashes.
To assign a list value to an array you simply group the values together with “(“ and
“)”.
If you use @home in a list context (on the right side of a list assignment) you’ll get
the list back. So you could set 4 scalar variables as shown.
List assignments happen in parallel so you can swap two scalar variables as shown in
the third example.
Arrays are 0 based (as in C) so while the list contains 4 elements the elements are
numbered 0 to 3.
Array subscripts are enclosed in “[“ and “]” so an individual element is referred to as
$home[n]. Since the element is a scalar (a single thing) it is preceded by $.
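The points in these notes can be tried out with a sketch like the following. The four scalar names are invented for the example.

```perl
use strict;
use warnings;

# Assign a list value to an array.
my @home = ( "couch", "chair", "table", "stove" );

# Use the array in list context to set four scalars at once.
my ( $potato, $lift, $tennis, $pipe ) = @home;

# List assignment happens in parallel, so this swaps the two values.
my ( $alpha, $omega ) = ( "first", "last" );
( $alpha, $omega ) = ( $omega, $alpha );

# Subscripts start at 0, so the fourth element is $home[3].
my $fourth = $home[3];

print "$potato $pipe $alpha $omega $fourth\n";
```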
Variables Types - Arrays
Examples:
1: @stuff = ("one", "two", "three");
2: $stuff = ("one", "two", "three");
4: @x = (@stuff,@nonsense,funkshun());
5: @releases = (
"alpha",
"beta",
"gamma",
);
6: @froots = qw(
apple banana carambola
coconut guava kumquat
mandarin nectarine peach
pear persimmon plum
);
Review: an array variable is able to store a series of values with each uniquely
identified by an integer known as its index. The contents of an array are accessed
collectively by giving the array name prefixed by an @.
Examples:
3: If you don’t want some of the things returned in a list, throw them away by
undef’ing them.
4: Here we take $a and $b from the list and then the rest of the list goes into @rest.
Here’s an important principle - the first list in the list (so to speak) gets everything
else in the list! In the next example $a and $b get the first two values from
@arg_list and then the hash %rest gets everything else. There’s an issue here
concerning how many items are left in the list before it’s assigned to the hash %rest -
the length of the list needs to be a multiple of 2.
The last two examples show how you can force things into scalar context - the
scalar() function is one way.
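Since the slide examples are not reproduced in these notes, here is a hedged sketch of the same ideas. The contents of @arg_list and the variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @arg_list = ( "alpha", "beta", "colour", "red", "size", "big" );

# Throw away the second value by putting undef in its place.
my ( $first, undef, $third ) = @arg_list;

# The first array in the list gets everything that is left.
my ( $p, $q, @rest ) = @arg_list;

# A hash does the same, so an even number of items must be left over.
my ( $r, $s, %rest ) = @arg_list;

# Two ways of forcing scalar context: both give the number of elements.
my $count_implicit = @arg_list;
my $count_explicit = scalar(@arg_list);

print "$first $third ", scalar(@rest), " $rest{colour} $count_explicit\n";
```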
List And Array Examples
Examples:
Note: lists grow dynamically, so you can have a 4 element list like this:
my @list = qw( fred barney wilma betty );
and say this:
$list[656] = "dino";
And Perl will create all the intervening array slots for you (they will all have the value
undef).
If you create a big array and you’d later like to delete it (to save on memory perhaps)
then you can do this:
my @big_array = (); # create the array
@big_array = <SOME_FILE>; # load a ton of stuff into it
undef @big_array; # delete the array (@big_array = undef would leave one undef element)
If you want to remove all the entries in an array without undef’ing it and then
recreating it, then just do this:
@my_array = ();
The same works for hashes as well - to empty a hash just do this:
%my_hash = ();
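A small sketch of growing, emptying and releasing an array. One thing worth checking for yourself: undef @array frees the array, whereas assigning undef to an array leaves a one-element list behind. The name @oops is made up for the example.

```perl
use strict;
use warnings;

my @list = qw( fred barney wilma betty );
$list[656] = "dino";                     # the array grows to 657 elements
my $size = scalar(@list);
my $gap  = defined $list[100] ? "defined" : "undef";

@list = ();                              # empty the array, keep the variable
my $after_empty = scalar(@list);

my @big_array = ( 1 .. 1000 );
undef @big_array;                        # release the array's storage
my $after_undef = scalar(@big_array);

my @oops = ( 1, 2, 3 );
@oops = undef;                           # a list containing one undef value
my $oops_size = scalar(@oops);

print "$size $gap $after_empty $after_undef $oops_size\n";
```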
Variables Types - Arrays
Since arrays are ordered you can do useful operations on them, such as:
Stack operations:
push()
pop()
shift()
unshift()
shift and unshift work at the start of the array; push and pop work at the end.
Example:
@home = ( "go", "where" , "no", "one" , "has" , "gone" );
Perl regards an array as an ordered list. The end of the array (i.e. the right-hand part
of the list) is considered the top of the stack. push() and pop() work on the top of
the stack.
shift() and unshift() work on the other end of the stack. shift() takes one element
from the start of a list, unshift puts a new element at the start of the list.
What does the list @home contain once the example has been run?
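The operations from the slide are not reproduced in these notes, so the sequence below is an illustrative one of my own rather than the course's answer. It shows which end of the list each function works on.

```perl
use strict;
use warnings;

my @home = ( "go", "where", "no", "one", "has", "gone" );

my $top = pop(@home);             # takes "gone" from the end
push( @home, "before" );          # adds to the end

my $front = shift(@home);         # takes "go" from the start
unshift( @home, "boldly", "go" ); # adds to the start

print "@home\n";                  # prints: boldly go where no one has before
```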
How Do I … Specify A List In A Program?
# Use qw() if you have a lot of single-word elements
@a = qw( Why are you bugging me? );

# Use something like this if you want to read a list from a file
@bigarray = ();
open(DATA, "< mydatafile")
or die "Couldn't read from datafile: $!\n";
while (<DATA>) {
chomp;
push(@bigarray, $_);
}
More info: See The Perl Cookbook, section 4.1 Page 91.
How Do I … Specify A List In A Program?
# A different quoting character
@banner = qw|The vertical bar (\|) looks and behaves like a pipe.|;
More info: See The Perl Cookbook, section 4.1 Page 91.
qx() and backticks behave the same way: both interpolate Perl variables before the
command is run. If you don’t want Perl variables to be expanded then
you can use a single-quote delimiter on qx() to stop this.
q(), qq() and qx() quote single strings. qw() quotes a list of single word strings
by splitting its argument on whitespace without variable interpolation.
If you don’t want to change the quoting character, use a backslash to escape the
delimiter in the string.
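A sketch of the quoting operators side by side. qx() is left out here because it runs an external command; the strings and variable names are invented for the example.

```perl
use strict;
use warnings;

my $name = "camel";

my $single = q(no $name here);     # like single quotes: no interpolation
my $double = qq(a $name here);     # like double quotes: interpolates
my @words  = qw(one two $name);    # split on whitespace, no interpolation

# Any delimiter will do, which saves escaping quotes inside the string.
my $quote = q!He said "don't"!;

# Or keep the delimiter and escape it with a backslash.
my @bar = qw|a \| b|;

print "$single\n$double\n$words[2]\n$quote\n$bar[1]\n";
```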
How Do I … Change The Size Of an Array?
$ARRAY[$NEW_LAST_ELEMENT_INDEX_NUMBER] = $VALUE;
More info: See The Perl Cookbook, section 4.3 Page 95.
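A minimal sketch of both directions, using the $#array form (the index of the last element) to shrink and an assignment past the end to grow. The array contents are made up for the example.

```perl
use strict;
use warnings;

my @people = qw(Crosby Stills Nash Young);

# Shrink: setting the last index to 1 leaves two elements.
$#people = 1;
my $after_shrink = scalar(@people);

# Grow: assigning to index 4 creates the slots in between as undef.
$people[4] = "Young";
my $after_grow = scalar(@people);

print "$after_shrink $after_grow\n";    # prints: 2 5
```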
How Do I … Swap Values Without Using
A Temporary Variable?
You want to exchange the values of two variables, but don’t want to use a
temporary variable.
# Normally you would do something like this (say in C)
$temp = $a;
$a = $b;
$b = $temp;
# You can swap more than two things at a time
($alpha, $beta, $production) = qw(January March August);
# move beta to alpha,
# move production to beta,
# move alpha to production
($alpha, $beta, $production) = ($beta, $production, $alpha);
Most programming languages require you to use a temporary variable when swapping
two variables’ values. Perl however will track both sides of the assignment and
guarantees that you won’t accidentally clobber any of your values. This lets you
eliminate the temporary variable.
You want to join two arrays together by adding all the items of one to the end of
the other.
Solution: Use push()
push(@ARRAY1, @ARRAY2);
This is output:
Time Flies Like An Arrow
Fruit Flies Like A Banana
More info: See The Perl Cookbook, section 4.9 Page 108.
If you use list flattening beware that this takes more memory and is slower.
If you want to insert elements of one array into the middle of another, use
splice().
We’ve already seen push, pop, shift and unshift. They are all examples of a generic
function called splice(). The splice function takes four arguments: an array to be
modified, the index at which it is to be modified, the number of elements to be
removed (starting at the index specified in the previous argument), and a list of extra
elements to be inserted at the index (after the previous elements are removed). The
function returns a list of the elements which are removed.
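The following sketch is one way to arrive at the two output lines shown above, using push() to append and splice() to insert and replace. The array names are assumptions for the illustration.

```perl
use strict;
use warnings;

my @members   = ( "Time", "Flies" );
my @initiates = ( "An",   "Arrow" );

# Append one array to the end of the other.
push( @members, @initiates );

# splice(ARRAY, INDEX, HOW_MANY_TO_REMOVE, NEW_ELEMENTS)
splice( @members, 2, 0, "Like" );                # insert at index 2
my $first_line = join( " ", @members );

my @removed = splice( @members, 0, 1, "Fruit" ); # replace the first element
splice( @members, -2, 2, "A", "Banana" );        # replace the last two
my $second_line = join( " ", @members );

print "$first_line\n$second_line\n";
```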
List Flattening
This doesn’t produce a hierarchical list of three elements where the third element is
itself a two-element list.
Each element of a list must be a scalar, not another list.
Above example is actually the same as:
@virtues = ( "Faith" , "Hope" , "Love" , "Charity" );
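A small sketch of flattening, with a nested version for contrast. The nested form uses an array reference, which is covered later in the course; @nested and @pair are names made up for the example.

```perl
use strict;
use warnings;

# The inner parentheses disappear: this is a flat list of four scalars.
my @virtues = ( "Faith", "Hope", ( "Love", "Charity" ) );
my $flat_count = scalar(@virtues);

# To keep the grouping, store a reference to an array instead.
my @pair   = ( "Love", "Charity" );
my @nested = ( "Faith", "Hope", \@pair );
my $nested_count = scalar(@nested);

print "$flat_count $nested_count $nested[2][1]\n";    # prints: 4 3 Charity
```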
LAB3 - ARRAYS_1
LAB3 - ARRAYS_2
LAB3 - ARRAYS_3
LAB3 - ARRAYS_4
LAB3 - ARRAYS_5
Pick Your Own Quotes
Some of these forms are syntactic sugar which allow you to not put lots of formatting
in strings (which might be confusing and lead to mistakes).
In the first example we’ve used ! as the quote mark, which means we can freely use “
and ‘ in the text string we wish to build. We could have used our normal quotes and
escaped the “ and ‘ quotes inside the string, but it would have been very hard to
read.
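[Figure: an array drawn as four numbered slots holding Couch, Chair, Table and Stove, beside a hash whose keys Sun, Mon, Tue, Wed, Thu, Fri and Sat point to Sunday, Monday, Tuesday, Wednesday, Thursday, Friday and Saturday.]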
The % character is used to mark hash names.
Hash keys are not automatically implied by their position. In fact the concept of
position has no meaning for a hash. (And as we will see later, this means that you
can’t rely on foreach to visit the things in a hash in any particular order; loop over the keys instead).
You can assign a list to a hash (just like an array) but pairs of items from the list will
be interpreted as key/value pairs in the hash. So you can say this:
%hash = ( "Sat" => "Saturday" , "Sun" => "Sunday" , etc , "Fri" => "Friday" );
Variables Types - Hashes
%longday = (
"Sun" => "Sunday",
"Mon" => "Monday",
"Tue" => "Tuesday",
"Wed" => "Wednesday",
"Thu" => "Thursday",
"Fri" => "Friday",
"Sat" => "Saturday",
);
As in the example from the previous slide - suppose you wanted to translate
abbreviated days names to their corresponding full names. You could write the list
assignment as shown in the top box.
This is visually noisy, so Perl provides the => (comma operator) so that with a bit of
creative formatting the same statement can be written as shown in the second
example.
Remember - Hashes have no order to them - all accessing is done via the keys. Do
not expect foreach to return the values in a hash in any particular order.
Variables Types - Hashes
$wife{"Tony"} = "Cherie";
You can assign a list to a hash - see our previous examples - each pair of items in the
list is taken as (respectively) a key and a value.
You can assign a hash to a list. If you do then it’ll convert the hash into a list of
key/value pairs.
Often we use:
It is generally true that things don’t come back out of a hash in the same order that
they go in (if say, you get all the keys back out with the keys() function). Do not try
to use push(), pop(), shift() or unshift() with hashes. They don’t work -
remember, position in a hash has no meaning.
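Because the order of keys() is not the insertion order, a common approach is to sort the keys before looping. A minimal sketch with a shortened version of %longday:

```perl
use strict;
use warnings;

my %longday = (
    "Sun" => "Sunday",
    "Mon" => "Monday",
    "Tue" => "Tuesday",
);

# Sorting the keys gives a predictable order for the loop.
my @report;
foreach my $abbr ( sort keys %longday ) {
    push @report, "$abbr = $longday{$abbr}";
}

# Assigning the hash to a list gives key/value pairs in no fixed order.
my @flat = %longday;

print join( ", ", @report ), "\n";
print scalar(@flat), " items in the flattened list\n";
```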
Functions Which Work With Hashes
In the same way that an array can be deleted with undef, so can a
hash. So to delete a hash, do this:
undef %my_hash;
If however you just want to remove all the entries in the hash without undef’ing it
and then recreating it, then just do this:
%my_hash = ();
More info: See The Perl Cookbook, section 5.0 Page 129.
Solution: assign a list of pairs of items to the hash. You can also use the => operator
to do the same thing - it is visually easier to see what is happening and where the
key/value pairs are located in the list.
Single-word hash keys are also automatically quoted, so you can write
$hash{"somekey"} as $hash{somekey}.
Hashes are stored in an order which is convenient for the implementation of hashes,
which means that the extraction order is not the same as the insertion order.
How Do I … Add An Element To A Hash?
More info: See The Perl Cookbook, section 5.1 Page 130.
Solving this problem is easy - just add any new entry as shown. Perl will take care of
all memory management for you, and just as with arrays and lists, you don’t need to
worry about overflow.
If you use undef as a hash key it will be turned into the empty string “”.
If you try to get a value for a key which isn’t in the hash you’ll also get undef, so you
can’t simply use if $hash{key} to see if a key exists. You need to use
exists($hash{key}) to test whether the key is in the hash,
defined($hash{key}) to see if it is or is not undef, and if($hash{key}) to
test it for true or false.
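The three tests give different answers for the same hash. In this sketch the hash contents and the helper describe() are hypothetical, made up to show each case.

```perl
use strict;
use warnings;

my %hash = (
    zero    => 0,
    nothing => undef,
    camel   => "dromedary",
);

# Hypothetical helper: report the result of all three tests for one key.
sub describe {
    my ($key) = @_;
    my $exists  = exists $hash{$key}  ? "exists"  : "missing";
    my $defined = defined $hash{$key} ? "defined" : "undef";
    my $truth   = $hash{$key}         ? "true"    : "false";
    return "$exists, $defined, $truth";
}

print "$_: ", describe($_), "\n" for qw(camel zero nothing absent);
```

Only exists() separates a key that holds undef from a key that was never stored.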
Hashes
%map = ('red',0xff0000,'green',0x00ff00,'blue',0x0000ff);
%map = ( red => 0xff0000, green => 0x00ff00, blue => 0x0000ff,
);
$field = radio_group(
NAME => 'animals',
VALUES => ['camel', 'llama', 'ram', 'wolf'],
DEFAULT => 'camel',
LINEBREAK => 'true',
LABELS => \%animal_names,
);
The => operator has the nice side effect of quoting anything on its left, so we can
leave the quotes off red, green, blue in the third example. The value on the right of
=> will still need quotes if it is a character string.
The hash when it’s initialized, is done in some order. The values generally don’t come
back out in the order they went in.
You can’t use scalar( %hash ) (or even use %hash in scalar context) to find out how
many things are in the hash. If you want to know that, use:
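One common way to count the entries is to put keys() in scalar context. A minimal sketch using the %map hash from above:

```perl
use strict;
use warnings;

my %map = ( red => 0xff0000, green => 0x00ff00, blue => 0x0000ff );

# keys() in scalar context returns the number of keys in the hash.
my $count = scalar( keys %map );

print "The hash holds $count entries\n";    # prints: The hash holds 3 entries
```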
LAB4 - HASH_1
LAB4 - HASH_2
LAB4 - HASH_3
LAB4 - HASH_4
LAB4 - HASH_5
Array And Hash Slices
Slicing an array:
print @tragedy[3,4,5]
These are equivalent
Note: [ and ]
Slicing a hash:
Slicing an array:
The things in the array slice are not copies - they are the same elements. So
assigning to the array slice is also assigning to the original array elements. (The same
is also true for a hash slice).
The slice is a list (hence the @) and the brackets are [ and ].
Slicing a hash:
The values() function returns hash values in an apparently random order, so to create
a list of values from a hash with a specific order we often have to do something
similar to what is shown in the example. Instead of putting a single key in the curly
braces, we put a list of keys in the curly braces.
The slice is a list (hence the @ and NOT a $ or a %) and the brackets are { and }.
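The slide examples are not reproduced here, so this is a sketch with made-up contents for @tragedy and %longday.

```perl
use strict;
use warnings;

my @tragedy = qw(zero one two three four five six);

# An array slice: @ prefix, square brackets, a list of indexes.
my @middle = @tragedy[ 3, 4, 5 ];
my @same   = @tragedy[ 3 .. 5 ];      # the range gives the same indexes

# Assigning to a slice changes the original array.
@tragedy[ 0, 1 ] = ( "ZERO", "ONE" );

my %longday = ( Sat => "Saturday", Sun => "Sunday", Mon => "Monday" );

# A hash slice: @ prefix, curly braces, a list of keys in the order wanted.
my @weekend = @longday{ "Sat", "Sun" };

print "@middle | @tragedy[0,1] | @weekend\n";
```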
Scalar And List Context
Examples:
The second set of examples are all evaluated in list context - even if the assignment
only picks out a single value from such a list.
The rules don’t change when using my to force ourselves to declare variables.
A well designed function can figure out what context it’s been called in (using
wantarray) and return what is appropriate.
if (wantarray)
{
return @an_array;
}
else
{
return $a_scalar;
}
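A complete, runnable sketch of the same idea. The subroutine name context_aware() and its return values are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical subroutine that returns a list or a scalar
# depending on the context it was called in.
sub context_aware {
    my @an_array = ( 1, 2, 3 );
    my $a_scalar = "just one thing";
    return wantarray ? @an_array : $a_scalar;
}

my @list   = context_aware();    # list context
my $single = context_aware();    # scalar context

print scalar(@list), " items, or '$single'\n";
```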
Variables Types - Simple Data Structures
Once this is done you can refer to individual elements like this:
$wife{"Jacob"}[0] = "Leah";
$wife{"Jacob"}[1] = "Rachel";
$wife{"Jacob"}[2] = "Bilhah";
$wife{"Jacob"}[3] = "Zilpah";
Sometimes you need to build not-so-lovely and not-so-simple data structures. Perl lets
you do this by pretending that complicated values are really simple ones.
You can see in the second example how this looks like a multi-dimensional array with
one string subscript and one numeric subscript.
We’ll discuss this in more detail tomorrow … This example (and the one on the
following page) are here to demonstrate that making complex data structures is easy.
Variables Types - Simple Data Structures
Example:
$kids_of_wife{"Jacob"} = {
"Leah" => ["Reuben","Simeon","Levi","Judah","Issachar","Zebulun"],
"Rachel" => ["Joseph","Benjamin"],
"Bilhah" => ["Dan","Naphtali"],
"Zilpah" => ["Gad","Asher"],
};
$kids_of_wife{"Jacob"}{"Leah"}[0] = "Reuben";
$kids_of_wife{"Jacob"}{"Leah"}[1] = "Simeon";
$kids_of_wife{"Jacob"}{"Leah"}[2] = "Levi";
$kids_of_wife{"Jacob"}{"Leah"}[3] = "Judah";
$kids_of_wife{"Jacob"}{"Leah"}[4] = "Issachar";
$kids_of_wife{"Jacob"}{"Leah"}[5] = "Zebulun";
$kids_of_wife{"Jacob"}{"Rachel"}[0] = "Joseph";
$kids_of_wife{"Jacob"}{"Rachel"}[1] = "Benjamin";
$kids_of_wife{"Jacob"}{"Bilhah"}[0] = "Dan";
$kids_of_wife{"Jacob"}{"Bilhah"}[1] = "Naphtali";
$kids_of_wife{"Jacob"}{"Zilpah"}[0] = "Gad";
$kids_of_wife{"Jacob"}{"Zilpah"}[1] = "Asher";
Suppose we not only wanted to know the names of Jacob’s wives, but also the names
of all sons of all his wives.
In this case we want to treat a hash as a scalar - we use { and } for that. Now we
have an array in a hash in a hash.
Adding another level to a nested data structure is like adding another dimension to a
multi-dimensional array. The important point is that Perl lets you pretend that
something which is complex is a simple scalar.
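Reading values back out of a nested structure follows the same subscript pattern. A sketch using a shortened version of the data (the loop and the variable names are my own):

```perl
use strict;
use warnings;

my %kids_of_wife;
$kids_of_wife{"Jacob"} = {
    "Leah"   => [ "Reuben", "Simeon", "Levi", "Judah", "Issachar", "Zebulun" ],
    "Rachel" => [ "Joseph", "Benjamin" ],
};

# One element: hash key, hash key, array index.
my $first_born = $kids_of_wife{"Jacob"}{"Leah"}[0];

# A whole inner array: wrap the expression in @{ ... }.
my @rachels_sons = @{ $kids_of_wife{"Jacob"}{"Rachel"} };

# Walk the inner hash and count the sons of each wife.
my @summary;
foreach my $wife ( sort keys %{ $kids_of_wife{"Jacob"} } ) {
    my $count = scalar @{ $kids_of_wife{"Jacob"}{$wife} };
    push @summary, "$wife: $count";
}

print "$first_born | @rachels_sons | @summary\n";
```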
Perl’s whole object oriented structure is built upon this kind of encapsulation.
Packages are a way of splitting up your code. They are roughly equivalent to
C/Spice/Verilog .include statements.
package Matrix;
The effect of this is that from this point onwards any global name in Matrix.pm will
be prefixed by Matrix::
So if you say:
package Matrix;
$result = &print_me();
Then the real name of $result is $Matrix::result and the real name of
&print_me() is &Matrix::print_me()
If we don’t use a package declaration in our program then the default name is
“main::”. This means that the previous example will work since print_me() in
Matrix.pm is really &Matrix::print_me() while print_me() in Solve.pm is really
&main::print_me(). {We would be better off in Solve.pm using a declaration like
package Solve; - what would the &print_me() subroutine be called then?}
Code which is brought into a program like this with a use command, is also called a
module. The standard is to name the module with the same name as the package it
contains (but with an initial uppercase letter) and with a .pm filename suffix. Thus the
code for package Matrix; would be contained in a file called Matrix.pm
The nice thing about Perl is that there are a *lot* of packages “out there” that you
can use to solve all sorts of problems.
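The naming rule can be seen in a single file. This is only a sketch: in practice the Matrix code would live in its own Matrix.pm file, and the bodies of the two print_me() subroutines are made up.

```perl
use strict;
use warnings;

package Matrix;

our $result;                        # really $Matrix::result
sub print_me { return "Matrix::print_me" }
$result = print_me();               # calls Matrix::print_me

package main;                       # back to the default package

sub print_me { return "main::print_me" }

my $mine   = print_me();            # calls main::print_me
my $theirs = Matrix::print_me();    # the full name reaches the other one

print "$mine\n$theirs\n$Matrix::result\n";
```

The two subroutines share a short name but never clash, because each lives in its own package.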
Variables Types - Pragmas
In the previous section we used the “use” command to load in some new code (a
module).
Some of the built-in modules in Perl don’t add code. Rather they change the way
that the language behaves.
These special modules are called pragmas.
Example:
use strict;
Pragmas change the way the language works. In the example shown, it tightens up
on some of the rules which Perl uses by default and requires the programmer to be
explicit. This example would require that you predeclare all your variable names - this
is usually a good thing - see the section on style in about five minutes’ time.
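A sketch of what strict catches. The two snippets are compiled with string eval purely so that the failure can be captured and shown instead of stopping the program; $total and the misspelt $totl are invented names.

```perl
use strict;
use warnings;

# A declared variable compiles and runs.
my $ok = eval q{ my $total = 1; $total + 1 };

# A misspelt, undeclared variable is rejected at compile time.
my $bad   = eval q{ $totl = 1; 1 };
my $error = $@;

print "ok gave $ok\n";
print "bad failed with: $error";
```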
How Do I … Round Floating-Point Numbers?
Unrounded: 0.255
Rounded: 0.26
Unrounded: 0.255
Rounded: 0.26
More info: See The Perl Cookbook, section 2.4 Page 46.
The “f” argument in sprintf will let you specify how many decimal places the
argument should be rounded to. Perl looks at the next digit in the number, rounds it
up if it is 5 or greater, or down otherwise.
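A minimal sketch of rounding with sprintf(). The sample values are my own; keep in mind that floating-point numbers are stored in binary, so values that sit exactly on a boundary can occasionally round differently from what decimal arithmetic suggests.

```perl
use strict;
use warnings;

my $unrounded = 3.14159;

# %.2f keeps two digits after the decimal point.
my $rounded = sprintf( "%.2f", $unrounded );

# %.0f rounds to a whole number.
my $whole = sprintf( "%.0f", 2.7 );

print "Unrounded: $unrounded\nRounded: $rounded\nWhole: $whole\n";
```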
How Do I … Compare Floating-Point Numbers?
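sub equal {
my ($A, $B, $dp) = @_;
# compare the two numbers as strings formatted to $dp significant digits
return sprintf("%.${dp}g", $A) eq sprintf("%.${dp}g", $B);
}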
More info: See The Perl Cookbook, section 2.2 Page 45.
Alternatively use a large multiplier on both numbers (like 1000000), turn that result
into an integer and then use “==”, but this demands that you have some idea of the
magnitude of the numbers before you start. If the number of decimal places is fixed
this makes the latter solution easier.
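The multiplier alternative might look like the sketch below. The name scaled_equal() is hypothetical, and int() truncates rather than rounds, so values very close to a boundary need more care than this shows.

```perl
use strict;
use warnings;

# Hypothetical helper: scale both numbers up, truncate to integers, compare.
sub scaled_equal {
    my ( $x, $y, $multiplier ) = @_;
    return int( $x * $multiplier ) == int( $y * $multiplier );
}

# Equal to three decimal places.
my $close = scaled_equal( 1.2345678, 1.2345679, 1000 ) ? "equal" : "different";

# Clearly different.
my $apart = scaled_equal( 1.25, 1.75, 1000 ) ? "equal" : "different";

print "$close $apart\n";    # prints: equal different
```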
How Do I … Convert Binary And Decimal Numbers?
You have an integer whose binary representation you would like to print out, or a
binary number which you would like to print as an integer.
sub dec2bin {
my $str = unpack("B32", pack("N", shift));
$str =~ s/^0+(?=\d)//; # otherwise you'll get leading zeros
return $str;
}
sub bin2dec {
return unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
}
More info: See The Perl Cookbook, section 2.3 Page 48.
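Before Perl 5.6 you couldn’t solve either problem with sprintf since it didn’t have a
“print in binary” format; from 5.6 onwards sprintf has a %b format, and oct() accepts a
“0b” prefix for the reverse conversion. The solution shown uses pack and unpack for
manipulating strings of data. Both the pack and unpack functions take a template
argument which specifies what they should do with their remaining arguments.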
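For comparison, a sketch of the conversion done both ways on Perl 5.6 or later. The value 54 is arbitrary.

```perl
use strict;
use warnings;

# Decimal to binary with sprintf's %b format.
my $binary = sprintf( "%b", 54 );

# Binary to decimal with oct() and a 0b prefix.
my $decimal = oct("0b110110");

# The pack/unpack route: "N" packs a 32-bit big-endian integer,
# "B32" unpacks it as a string of 32 bits with leading zeros.
my $padded = unpack( "B32", pack( "N", 54 ) );

print "$binary $decimal $padded\n";
```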
How Do I … Control Case?
if (uc($a) eq uc($b)) {
print "a and b are the same\n";
}
More info: See The Perl Cookbook, section 1.9 Page 19.
The two ways of doing the conversions (functions and string escapes) look different,
but do the same thing. You can set the case of either the first character or the whole
word.
The use locale directive tells the Perl case conversion functions and pattern matching
engine to respect your language environment, allowing for languages with umlauts,
accent marks, cedillas and other diacritics used in many languages.
You can also use the case conversion functions and pattern matching to do case
insensitive string comparisons.
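The functions and the string escapes side by side, as a sketch with made-up strings:

```perl
use strict;
use warnings;

# Functions.
my $big    = uc("bo peep");        # whole string to upper case
my $little = lc("JOHN");           # whole string to lower case
my $capped = ucfirst("camel");     # first character only

# String escapes inside double quotes do the same job.
my $beast   = "dromedary";
my $escaped = "\u$beast";          # \u upper-cases the next character
my $shout   = "\Uhello\E world";   # \U ... \E upper-cases a stretch

# Two ways to compare without regard to case.
my $by_function = ( lc("Camel") eq lc("CAMEL") ) ? "same" : "different";
my $by_pattern  = ( "CAMEL" =~ /^camel$/i )      ? "same" : "different";

print "$big $little $capped $escaped $shout $by_function $by_pattern\n";
```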
How Do I … Find Out Today’s Date?
You need to find out the year, month and day values for today’s date.
use Time::localtime;
$tm = localtime;
($DAY, $MONTH, $YEAR) = ($tm->mday, $tm->mon, $tm->year);
More info: See The Perl Cookbook, section 3.1 Page 73.
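Solution - use localtime() and extract the information you want from the value it
returns. With Time::localtime loaded, localtime returns an object with named methods;
the built-in localtime returns a list.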
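The same information is available from the built-in localtime without loading a module. The sketch below uses a list slice; the call to gmtime(0) is only there because a fixed time makes the two offsets easy to see.

```perl
use strict;
use warnings;

# Built-in localtime in list context returns
# (sec, min, hour, mday, mon, year, wday, yday, isdst)
my ($day, $month, $year) = (localtime)[3, 4, 5];
$month += 1;       # mon runs from 0 to 11
$year  += 1900;    # year is the number of years since 1900
printf("Today is %04d-%02d-%02d\n", $year, $month, $day);

# Time 0 is 1 January 1970 (UTC), which comes back as (1, 0, 70)
my ($d0, $m0, $y0) = (gmtime(0))[3, 4, 5];
print "Raw values for the epoch: $d0 $m0 $y0\n";
```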
Notes:
Running Perl Programs And Scripts
For longer scripts put the code into a file and say this:
% perl grading
The most convenient way is to make the file executable and ensure this line is at
the top of the file:
#!/usr/local/bin/perl -w
% grading
Useful tip - never just use this at the top of your file to invoke Perl:
#!/usr/local/bin/perl
#!/usr/local/bin/perl -w
use strict;
use Netlist_Functions;
# Define a constant (TRUE is used below)
use constant TRUE => 1;
my @args = ();
my $flag = TRUE;
exit 0;
A more extensive version of this template can be found in the tutorial area and in
your notes.
Note: Once you “use strict;” all your variables will have to be declared like this:
my $variable;
Or
my $variable = 56;
You’ll get compile time errors if you don’t use my. With -w, Perl will also warn about
a global variable whose name appears only once in the program, which is usually a typo.
For any programs other than one-liners, ALWAYS use a methodology like this - it will
save you lots of time in debugging applications. We’ll talk more about strict later.
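To see what strict catches, the sketch below compiles two small pieces of code at run time with eval, so that the compile time error can be printed instead of stopping the script. The variable names are invented for the illustration.

```perl
use strict;
use warnings;

# A declared variable compiles and runs
my $good = eval q{ my $total = 56; $total + 1; };
print "declared variable gives $good\n";

# A misspelt, undeclared variable is rejected before anything runs
my $bad   = eval q{ $totl = 56; 1; };
my $error = $@;
print "undeclared variable gives: $error" unless defined $bad;
```

The message printed for the second case starts with: Global symbol "$totl" requires explicit package name.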
Style Guidelines
Note that there’s a complete style guide included in the course notes. There’s also a
separate style presentation later in the course.
Filehandles
Notes:
Filehandles
Using open()
open(SESAME, "filename") # read from existing file
open(SESAME, "<filename") # (same thing, explicitly)
open(SESAME, ">filename") # create file and write to it
open(SESAME, ">>filename") # append to existing file
open(SESAME, "| output-pipe-command") # set up an output filter
open(SESAME, "input-pipe-command |") # set up an input filter
You can use open to create filehandles for a variety of purposes (input, output,
piping).
Once opened the filehandle can be used to access the file or device until it is closed
with …
Using open with a filehandle which is already open will implicitly close it first.
Once a file is open it can be read from using the line reading operator <>. An empty
<> will read from the files named on the command line, or from STDIN if there are none.
What is STDOUT doing with the print statement in the second example? Since it’s the
default - you don’t need it.
The last two examples do the same thing - you’ll most frequently see the first - this is
one of Perl’s common idioms.
Note that when you do use a filehandle with a print statement, there’s no “,” between
the print, the filehandle and the text.
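A complete round trip (open for writing, print, close, open for reading, read, close) might look like the sketch below. The scratch file comes from the core File::Temp module, which is an assumption made so that the example cleans up after itself; any writable filename would do.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file to work on; File::Temp removes it when the script exits
my ($tmp, $filename) = tempfile(UNLINK => 1);
close($tmp);

open(SESAME, ">$filename") or die "Can't write $filename: $!";
print SESAME "first line\n";       # no comma after the filehandle
print SESAME "second line\n";
close(SESAME) or die "Can't close $filename: $!";

open(SESAME, "<$filename") or die "Can't read $filename: $!";
my @lines;
while (my $line = <SESAME>) {      # one line per pass, undef at end of file
    chomp($line);
    push(@lines, $line);
}
close(SESAME);

print STDOUT "Read ", scalar(@lines), " lines\n";   # STDOUT is the default
```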
How Do I … Process All The Files In A Directory
Example: Read all the files and add on the directory path at the front of the filenames
$dir = "/usr/local/bin";
print "Text files in $dir are:\n";
opendir(BIN, $dir) or die "Can't open $dir: $!";
while( defined ($file = readdir BIN) ) {
    print "$file\n" if -T "$dir/$file";
}
closedir(BIN);
More info: See The Perl Cookbook, section 9.5 Page 318.
The opendir, readdir and closedir functions operate on directories the same
way that open, close and <> operate on files. Both use handles, but the handles
used by the directory functions are different from those used by files.
In scalar context readdir returns the next filename from a directory until it runs out of
names, at which point it returns undef.
In list context it returns the rest of the filenames in a directory or an empty list if
there are no filenames left.
Operators - Arithmetic
You can always use ( and ) to force the order of evaluation you want.
Operators - String
There’s also a “multiply” operator for strings, called the repeat operator.
$a = 123;
$b = 3;
print $a * $b; # prints 369
print $a x $b; # prints 123123123
Note in the above how Perl is converting from numbers to strings as needed.
Of the three different ways of printing shown above, interpolation is the easiest to
understand.
Operators - Assignment
Assignment:
$a = $b;
$a = $b + 5;
$a = $a * 3;
$a *= 3;
chop($number = <STDIN>);
The third and fourth examples do the same thing; the fourth uses the op= syntax, which
works for most of Perl’s binary operators.
Operators - Unary Arithmetic
If you place the operator in front of the variable it is known as pre-increment or pre-
decrement.
The value is changed before it is used.
If you place the operator after the variable it is known as post-increment or post-
decrement.
The value is changed after it is used.
$count = 3;
$limit = $count++;
print "Count=$count and Limit=$limit\n";
Count=4 and Limit=3
or
$count = 3;
$limit = ++$count;
print "Count=$count and Limit=$limit\n";
Count=4 and Limit=4
Operators - Unary Arithmetic
Example:
$a = 5; # $a is assigned 5
$b = ++$a; # $b is assigned the incremented value of $a, 6
$c = $a--; # $c is assigned 6, then $a is decremented to 5
Notes:
Operators - Logical
The bottom example is from our grading program. Perl tries to open the file called
“grades”. If it succeeds then the program continues with statements which follow this
line, otherwise Perl issues an error message via the die() function and stops.
Note that this code is visually easy on the eye and the important thing which the line
is trying to do is the first thing on the line - secondary actions are off to the right of
the code.
Operators - Numeric And String Comparison
There are two sets of operators - one for numbers and one for strings.
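The difference matters most when the values look like numbers. A minimal sketch, with values chosen for the illustration:

```perl
use strict;
use warnings;

# Numeric operators compare values; string operators compare text
my $numeric_less = (2 < 10)        ? "true" : "false";   # true
my $string_less  = ("2" lt "10")   ? "true" : "false";   # false: "2" sorts after "1"
my $numeric_same = (10 == 10.0)    ? "true" : "false";   # true
my $string_same  = ("10" eq "10.0") ? "true" : "false";  # false: different text

print "2 <  10 is $numeric_less\n";
print "2 lt 10 is $string_less\n";
print "10 == 10.0 is $numeric_same\n";
print "10 eq 10.0 is $string_same\n";
```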
Notes:
Operators - File Test
File test operators let you find out information about files before you blindly muck
about with them.
Here are a few of the file test operators.
There are a lot more operators not listed - see the Perl man pages or Programming
Perl etc.
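A few of the common tests in use. This sketch creates its own scratch file with the core File::Temp module (an assumption made to keep the example self-contained) and then asks questions about it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "some text\n";
close($fh) or die "Can't close $file: $!";

print "$file exists\n"              if -e $file;
print "$file is a plain file\n"     if -f $file;
print "$file is not a directory\n"  unless -d $file;
print "$file is readable\n"         if -r $file;

my $size = -s $file;                # size in bytes, false if the file is empty
print "$file holds $size bytes\n";
```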
More On Input Operators
$_ is the default variable which is used implicitly (when you’re not explicit).
You can use the backtick operator to execute any system command like this:
The command will undergo variable interpolation - so the $user gets converted into a
real user name, then the command is passed to the shell, and all output from the
command is passed back to Perl and put into the variable $info. The numeric
status of the command is stored in the Perl variable $?. A $ which is not escaped is
interpolated by Perl, which is why the $user in our example is seen by Perl and not the
shell; if you need to pass a $ symbol through to the shell then you’ll need to escape it
with \.
If you just use <> without a file handle, then Perl reads from the files named on the
command line, or from STDIN if no files were named. So with no command line arguments
$input = <STDIN>; and $input = <>; both do the same thing; read a line of
input from STDIN. You can use this to advantage with Perl one-liners where STDIN is
actually a pipe from a shell command like this (the $ is the shell prompt):
Normally when you use the <> operator, you use it like this:
Remember, this special “magic” requires that the only thing inside the while condition
is the <> operator; if you use the <> operator anywhere else you must assign the
result explicitly if you want to keep the value.
LAB5 - FILES_1
LAB5 - FILES_2
LAB5 - FILES_3
The Range Operator ..
Examples:
2: $#foo is the index of the last item in @foo - this is true for all arrays.
3: Using a negative subscript on an array counts backwards from the end of the
array.
If the left value is greater than the right value in a .. expression then an empty list
is returned. If what you really wanted was to count backwards then do this:
4: When used with strings we get some magic - this gives all the uppercase letters in
the English alphabet.
In scalar context the .. operator is false as long as its left operand is false. Once the
left operand is true the .. operator is true until the right operand is true, after which
the .. operator becomes false again.
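The points above can be tried out with a small array. This is a sketch with invented data; the BEGIN and END markers in the last part are just sample text.

```perl
use strict;
use warnings;

my @foo = (10, 20, 30, 40, 50);

my @first_three = @foo[0 .. 2];        # (10, 20, 30)
my @everything  = @foo[0 .. $#foo];    # $#foo is the last index, here 4
my @last_two    = @foo[-2, -1];        # (40, 50)
my @countdown   = reverse(1 .. 5);     # (5, 4, 3, 2, 1)
my @nothing     = (5 .. 1);            # left > right gives an empty list
my @letters     = ('A' .. 'Z');        # the 26 uppercase letters

# In scalar context .. switches on at the left test and off after the right
my @inside;
foreach my $line ("skip", "BEGIN", "keep", "END", "skip again") {
    push(@inside, $line) if $line =~ /BEGIN/ .. $line =~ /END/;
}
print "@inside\n";                     # BEGIN keep END
```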
The Conditional Operator ?:
The condition part is always evaluated in scalar context - for Truth or Falsity.
Question: In the example - what will the value of $result be if $count is 12?
How Do I … Establish Default Values?
You would like to give a default value to a variable, but only if it doesn’t already
have one.
The difference between the two types of solution is what they test for - something
being defined, or something being true.
Three values which are defined are nevertheless false: 0, “0” and “”. If a variable
already held one of those values and you wanted to keep that value then || won’t work.
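The difference between testing for truth and testing for definedness shows up as soon as the existing value is 0. A minimal sketch, with variable names invented for the illustration:

```perl
use strict;
use warnings;

my $default = 10;
my $count   = 0;      # defined, but false

# || tests for truth, so the 0 is thrown away
my $by_truth = $count || $default;                        # 10

# defined tests for definedness, so the 0 is kept
my $by_defined = defined($count) ? $count : $default;     # 0

# ||= assigns only when the variable is currently false
my $name;
$name ||= "nobody";                                       # nobody

print "by truth: $by_truth, by defined: $by_defined, name: $name\n";
```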
The first expression which is true is the result which is assigned to $user.
# find the user name on Unix systems
$user = $ENV{USER}
     || $ENV{LOGNAME}
     || getlogin()
     || (getpwuid($<))[0]
     || "Unknown uid number $<";
LAB5 - FILE_4
Control Structures
Notes:
Control Structures - Truth
Notes
Loop Statements
LABEL BLOCK
LABEL BLOCK continue BLOCK
Continue BLOCKS are always optional
The while statements execute as long as EXPR is true. If while is replaced with until,
then the sense of the test is reversed. Note that unlike some languages which have
do - until loops, in Perl the until test is made at the start of the loop and not the end.
The while and until statement can have an optional continue block. This block
is executed every time the block is continued either by falling off the end of the first
block or by an explicit next (a loop-control operator which goes to the next iteration
of the loop).
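A small sketch of a while loop with a continue block, followed by an until loop. The counters are invented for the illustration.

```perl
use strict;
use warnings;

my $i = 0;
my @odd;
while ($i < 10) {
    next if $i % 2 == 0;    # skip even numbers; the continue block still runs
    push(@odd, $i);
}
continue {
    $i++;                   # runs after every pass, including after next
}
print "@odd\n";             # 1 3 5 7 9

# until reverses the test, which is still made before the first pass
my $n = 3;
until ($n <= 0) {
    $n--;
}
print "n is now $n\n";      # n is now 0
```

Without the continue block, the next would skip the increment and the loop would never end.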
Loop Control
The LABEL is optional - if it’s missing then the last, next or redo applies to the
innermost enclosing loop. But if you want to jump out of nested loops then the LABEL is
needed.
Even though I’ve talked about continue blocks a lot - not many people use them.
Loop Control - An Example
LABEL: while (CONDITION)
{
    # Code
    # Code
    # Code
    # Code
}
continue
{
    # Code
}
Compound Statements - If And Unless
if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ..
if (EXPR) BLOCK elsif (EXPR) BLOCK .. else BLOCK
unless simply reverses the true/false value of if. Note that unless also works with
else and elsif. There’s no such thing as elseunless.
Compound Statements - If And Unless
Examples:
unless ($x == 1) ...
if ($x != 1) ...
if (!($x == 1)) ...
These all do the same thing. TMTOWTDI
Notes:
Compound Statements - If And Unless
Examples:
unless (open(FOO, $foo)) { die "Can't open $foo: $!" }
if (!open(FOO, $foo)) { die "Can't open $foo: $!" }
In the preferred example - there’s no if and no unless - we’re relying on the short-
circuit evaluation.
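$! holds the error reported by the operating system when a system call such as open,
chdir or close fails (and likewise for lots of other operations). In string context it
gives the error message; in numeric context it gives the error number.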
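The short-circuit form and the two faces of $! can be seen in the sketch below. The path is hypothetical and assumed not to exist, and the eval is only there so that the example can show the message from die and carry on.

```perl
use strict;
use warnings;

my $foo = "/no/such/directory/file.txt";    # hypothetical, assumed missing

# Short-circuit form: die is only reached if open returns false
my $opened = eval { open(FOO, $foo) or die "Can't open $foo: $!\n"; 1 };
my $message = $@;
print $message unless $opened;

# $! is a number in numeric context and a message in string context
my ($errno, $text);
unless (open(FOO, $foo)) {
    ($errno, $text) = ($! + 0, "$!");
    print "Error number $errno means: $text\n";
}
```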
Control Structures - If And Unless
Examples:
if ($debug_level > 0) {
# Something has gone wrong. Tell the user.
print "Debug: Danger, Will Robinson, danger!\n";
}
Note - if has else and elsif. unless does not have an elseunless.
Control Structures - If And Unless
Another example of idiomatic Perl. You’ll see the interchangeability of statements like
this a lot.
Control Structures - While And Until
Perl has four main looping constructs, while & until and for & foreach.
While & until act like if and unless except that they loop repeatedly.
1. First the condition is checked.
2. If the condition is met, that is the condition is:
1. True for the while loop.
2. False for an until loop.
3. Then the block of code is executed.
while ($tickets_sold < 10000) {
$available = 10000 - $tickets_sold;
print "$available tickets are available. How many would you like: ";
$purchase = <STDIN>;
chomp($purchase);
$tickets_sold += $purchase;
}
Note: If the original condition is never met then the loop is never entered. Make sure
if you intend to leave the loop at some point that you have some code in the loop
which changes the variable which keeps you going through the loop.
The bottom example assigns the next line from the GRADES file to the variable $line
and returns the value of the line so the condition of the while statement can be
evaluated for truth. You might wonder if Perl will exit prematurely when it sees blank
lines in the file - the answer is it won’t because a blank line is a “\n” or newline
character and this is not false. When we do reach the end of the file the line input
operator returns the value undef, which always evaluates to false and so at this point
the loop does terminate. There’s no need for an explicit test because the input
operator is set up to work smoothly in a conditional context.
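The behaviour with blank lines is easy to check. In this sketch an in-memory file (available from Perl 5.8) stands in for the GRADES file, which is an assumption made so that the example needs no external data.

```perl
use strict;
use warnings;

my $data = "first\n\nthird\n";     # the second line is blank
open(my $grades, '<', \$data) or die "Can't open in-memory file: $!";

my $count = 0;
while (my $line = <$grades>) {
    $count++;                      # a blank line is "\n", which is not false
}
close($grades);

print "Read $count lines\n";       # Read 3 lines
```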
While Loops
A variable declared local to the while loop (here done with my $line) exists only
inside the loop. If you want $line to be visible after the loop has ended then declare
the variable before the loop begins. We’ll discuss scope shortly.
Also, the use of a continue block here is redundant - we could have easily put all the
statements in the continue block inside the main while loop. We’ll also discuss
last, next and redo shortly.
Control Structures - While And Until
You will often see command line arguments processed like this:
while (@ARGV) {
process(shift @ARGV);
}
The shift operator removes one element from the argument list each time through the
loop and sends it to a subroutine for processing (here called process()).
Control Structures - For And Foreach
Examples:
The for loop takes three expressions. An initial expression - set only once, a condition
to be tested every time the loop is executed and an expression to modify the loop
variable.
The foreach loop is used to iterate through the contents of an array. The foreach loop
treats the expression in ( and ) as a list (this is list context) always - even if there’s
only one element in the list. Then each element is aliased to the loop variable in turn
- IMPORTANT - MODIFYING THE LOOP VARIABLE ALSO MODIFIES THE ORIGINAL
ARRAY.
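Both loops, and the aliasing effect, in a minimal sketch. The arrays are invented for the illustration.

```perl
use strict;
use warnings;

# for: initialise once, test before every pass, modify after every pass
my $total = 0;
for (my $i = 1; $i <= 5; $i++) {
    $total += $i;
}
print "Total is $total\n";         # Total is 15

# foreach: the loop variable is an alias for each element in turn
my @prices = (10, 20, 30);
foreach my $price (@prices) {
    $price *= 2;                   # changes @prices itself
}
print "@prices\n";                 # 20 40 60

# Build a new list if the original must be left alone
my @original = (1, 2, 3);
my @doubled  = map { $_ * 2 } @original;
print "@original / @doubled\n";    # 1 2 3 / 2 4 6
```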
For Loops
These are equivalent
{
    my $i = 1;
    LABEL: while ($i <= 10)
    {
    }
    continue { $i++; }
}
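The equivalence can be checked by running a for loop and its while/continue counterpart side by side. This is a sketch; the loop bodies simply add up the counter so that the two results can be compared.

```perl
use strict;
use warnings;

# The for loop
my $for_sum = 0;
for (my $i = 1; $i <= 10; $i++) {
    $for_sum += $i;
}

# The same loop written as while with a continue block
my $while_sum = 0;
{
    my $i = 1;
    while ($i <= 10) {
        $while_sum += $i;
    }
    continue { $i++; }
}

print "for: $for_sum, while: $while_sum\n";    # for: 55, while: 55
```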
Notes:
For Loop Examples
Examples:
for (my ($i, $bit) = (0, 1); $i < 32; $i++, $bit <<= 1) {
print "Bit $i is set\n" if $mask & $bit;
}
# loop's versions of $i and $bit now out of scope
You can do more than one thing in the three parts of the loop.
The <<= 1 part of the loop is shifting the value of $bit 1 bit to the left.
Foreach Examples
Examples:
With foreach there isn’t any way to know where you are in a list (unless you decide to
keep track of it yourself with counters etc.)
If the list contains modifiable values (i.e. variables, not constants), then you can
modify those variables by modifying the variable inside the loop. The variable in the
loop is an alias for the variable in the list.
Foreach Examples
Examples:
On the last slide we said that the variable inside the loop in a foreach loop was an
implicit alias for the variable in the list which is passed to foreach.
So when we alter the variable in the loop ($pay in the top example) we’re actually
altering the variable in the list which we are reading through.
Control Structures - Breaking Out - Next & Last
Notes:
Control Structures - Breaking Out - Next & Last
It’s possible to break out of nested loops by labeling your loops and specifying
which loop you want to break out of.
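A minimal sketch of leaving two loops at once. The label name OUTER and the data are invented for the illustration.

```perl
use strict;
use warnings;

my @xs = (1, 2, 3);
my @ys = (10, 20, 30);
my ($x_found, $y_found);
my $tests = 0;

OUTER: foreach my $x (@xs) {
    foreach my $y (@ys) {
        $tests++;
        if ($x + $y == 22) {
            ($x_found, $y_found) = ($x, $y);
            last OUTER;            # leaves both loops, not just the inner one
        }
    }
}

print "Found $x_found + $y_found after $tests tests\n";   # Found 2 + 20 after 5 tests
```

A plain last here would only leave the inner loop, and the outer loop would carry on with the next value of $x.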
A label
Would anyone care to speculate
On what this piece of code does?
Notes:
Case Statements
SWITCH: {
if (/^abc/) { $abc = 1; last SWITCH; }
if (/^def/) { $def = 1; last SWITCH; }
if (/^xyz/) { $xyz = 1; last SWITCH; }
$nothing = 1;
}
OR
SWITCH: {
/^abc/ && do { $abc = 1; last SWITCH; };
/^def/ && do { $def = 1; last SWITCH; };
/^xyz/ && do { $xyz = 1; last SWITCH; };
$nothing = 1;
}
Perl doesn’t have a case/switch structure since it is so easy to build one. The SWITCH
is a label (remember the convention that all labels are in upper-case), and not some
Perl keyword we haven’t discussed yet.
We haven’t covered do (it’s on the next page), but think of it as a dummy keyword
which enables a statement (the bit between { and }) to be written. All three lines in
the second statement are using short-circuit evaluation. The first thing on the line
(reading from left to right) which is false makes the whole line false and all the
statements following are not evaluated. Remember: in short-circuit evaluation it’s the
first thing which is false in an && statement and the first thing which is true in an ||
statement which controls the flow of the program.
while(<RESULTS>) {
/LFSR\s\=\s(\w+)/ && do { print LFSRFILE "$1\n" };
$lastfile = $1;
}
The do BLOCK executes a sequence of statements in the BLOCK and returns the
value of the last expression evaluated in the BLOCK.
The do BLOCK itself does not count as a loop, so the loop control statements next,
last, redo cannot be used to leave or restart the BLOCK.
The do (FILE) Construct
If the file compiles and runs, the value returned is the value of the last
expression evaluated.
If do can’t read the file it returns undef and sets $! to the error.
The do FILE form uses the value of FILE as a filename and executes the contents
of the file as a Perl script.
Its use is to include subroutines from a Perl subroutine library, but it has been
superseded by use. It is still useful for loading things like configuration data into your
program as shown in the example.
If the file can be read but doesn’t compile then an error is set in $@.
If the file can’t be read then an error is set in $!.
Goto
Perl does support goto - so that’s at least one thing they got wrong then!
You can:
goto LABEL
goto Expression
goto &name (subroutine)
Notes:
How Do I … Do Something With Every Element In A List?
Sometimes you need to use a function to generate the list needed by foreach
foreach $var (sort keys %ENV) {
    print "$var=$ENV{$var}\n";
}
The code in the loop can call last to jump out of the loop, next to move on to the next
element, or redo to jump back to the first statement inside the block.
foreach $user (@all_users) {
    $disk_space = get_usage($user);
    if ($disk_space > $MAX_QUOTA) {
        complain($user);
    }
}
More info: See The Perl Cookbook, section 4.4 Page 97.
The variable set to each value in the list is called the loop iterator. If no variable is
supplied then the global variable $_ will be used. $_ is the default variable used in
many of Perl’s string, list and file functions.
How Do I … Do Something With Every Element In A List?
More info: See The Perl Cookbook, section 4.4 Page 97.
IMPORTANT NOTE: The top example works the way we might hope for. The value of
$_ in the while loop is preserved when the foreach loop is executed. However, if
the while loop had been the inner loop then BAD THINGS would have happened
since the while <FH> construct clobbers the value of the global $_ (i.e. it doesn’t
localize it). Consider this to be a bug or a feature - either way it’s an accident waiting
to happen. See the full explanation on page 99 of the Perl Cookbook.
I would always recommend using lexical variables. These are localized at their point
of declaration and the risk of side-effects is much reduced.
Also note that with a foreach loop, the loop iterator is not a copy of the variable
from the list, it actually is the variable in the list - change the variable and it changes
in the list. This is important - it’s not a copy, it’s an alias.
How Do I … Find Elements In One List But Not In Another?
You want to find the elements which are in one list but not in another.
# assume @A and @B are already loaded
%seen = (); # lookup table to test membership of B
@aonly = (); # answer
Straight-forward version
More info: See The Perl Cookbook, section 4.7 Page 104.
Solution: Build a hash of the keys in @B to use as a lookup table. Then iterate through
@A looking to see if the item in @A is in the lookup table. If it is then it’s in both @A
and @B. If it’s not then it’s in @A but not in @B.
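The steps in this solution can be sketched as follows. The contents of @A and @B are sample data invented for the illustration.

```perl
use strict;
use warnings;

my @A = qw(apple banana cherry date);
my @B = qw(banana date fig);

my %seen;     # lookup table to test membership of @B
my @aonly;    # answer

# Build the lookup table
foreach my $item (@B) {
    $seen{$item} = 1;
}

# Keep the items of @A which are not in the table
foreach my $item (@A) {
    push(@aonly, $item) unless $seen{$item};
}

print "@aonly\n";    # apple cherry
```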
How Do I … Find Elements In One List But Not In Another?
You want to find the elements which are in one list but not in another.
my %seen; # lookup table
my @aonly; # answer
More info: See The Perl Cookbook, section 4.7 Page 104.
The two different answers vary in how they build the hash. The first (previous slide)
iterates over @B. This one uses a hash slice. A hash slice assigns to several keys at
once, so these two assignments:
$hash{"key1"} = 1;
$hash{"key2"} = 2;
can be written as a single slice:
@hash{"key1", "key2"} = (1, 2);
The list in {} holds the keys while the list on the right holds the values. In this second
example we say this:
@seen{@B} = ();
This uses the items in @B as keys for %seen, setting each to undef (because the list
on the right is empty). We later check for the existence of the key - not the logical
truth or the definedness of the value.
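A sketch of the hash slice version, with sample data invented for the illustration. Note the use of exists, since every value in %seen is undef.

```perl
use strict;
use warnings;

my @A = qw(red green blue);
my @B = qw(green yellow);

# A hash slice sets several keys in one statement
my %hash;
@hash{"key1", "key2"} = (1, 2);

my %seen;     # lookup table
my @aonly;    # answer

@seen{@B} = ();    # the items of @B become keys, each with the value undef

foreach my $item (@A) {
    push(@aonly, $item) unless exists $seen{$item};
}

print "@aonly\n";    # red blue
```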
How Do I … Extract Unique Elements From A List?
More info: See The Perl Cookbook, section 4.6 Page 102.
Solution: Use a hash to record which items have been seen and then use keys on the
hash to extract them.
Warning. Using a hash like this can use up a lot of memory, and the keys function
will return the keys in an unpredictable order (not the insertion
order). If this matters then you need a different solution.
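One possible version of this solution, plus a variant which keeps the order in which items were first seen. The grep form is a common idiom rather than something taken from these notes, and the list is sample data.

```perl
use strict;
use warnings;

my @list = qw(b a b c a);

# Record each item as a hash key, then read the keys back
my %seen;
foreach my $item (@list) {
    $seen{$item}++;
}
my @uniq = sort keys %seen;    # sorted, because keys alone has no useful order
print "@uniq\n";               # a b c

# Variant: grep keeps an item only the first time it is seen
my %met;
my @in_order = grep { !$met{$_}++ } @list;
print "@in_order\n";           # b a c
```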
How Do I … Extract Unique Elements From A List?
More info: See The Perl Cookbook, section 4.6 Page 102.
How Do I … Reverse An Array?
More info: See The Perl Cookbook, section 4.10 Page 109.
The for loop processes the list in reverse order but keeps the list in its
original order.
If you use reverse() to reverse a list you just sorted then make sure it’s in the
order you want.
The sort() function takes an optional code block which lets you replace the default
alphabetic comparison subroutine with your own. This function is called each time
sort() has to compare two values. The values are loaded into $a and $b which are
automatically localised, so they won’t interfere with any variables you already have
called $a or $b.
The comparison function should return a negative number if $a should appear before
$b in the output list, 0 if the order doesn’t matter and a positive number if $a should
appear after $b in the output list. Perl has two operators that behave this way: <=>
for sorting numbers in ascending order, and cmp for sorting strings in ascending
alphabetic order. By default sort() uses cmp-style comparisons. Of course, you can
always provide your own comparison subroutine.
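The difference between the default comparison and a numeric one, and two ways of getting descending order, can be seen in this sketch. The numbers are sample data.

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

my @by_string  = sort @nums;                         # cmp-style: 1 10 100 9
my @ascending  = sort { $a <=> $b } @nums;           # 1 9 10 100
my @descending = reverse sort { $a <=> $b } @nums;   # 100 10 9 1
my @also_desc  = sort { $b <=> $a } @nums;           # same result, no reverse

# Walk the array backwards without changing it
my @backwards;
for (my $i = $#nums; $i >= 0; $i--) {
    push(@backwards, $nums[$i]);
}

print "@by_string\n@ascending\n@descending\n@backwards\n";
```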
How Do I … Traverse A Hash?
Solution: Use keys with a foreach loop
foreach $food (keys %food_color) {
    my $color = $food_color{$food};
    print "$food is $color.\n";
}
Banana is yellow.
Apple is red.
Carrot is orange.
Lemon is yellow.
WARNING: foreach cannot be used directly on a hash, nor can push(), pop(), shift() or unshift().
More info: See The Perl Cookbook, section 5.4 Page 135.
How Do I … Delete Something From A Hash?
More info: See The Perl Cookbook, section 5.3 Page 133.
You can’t delete a key by setting its value to undef since undef is a value which a
hash can store. You must use the delete() function.
If you want to clear a hash then simply assign it to the empty list like this:
%hash = ();
How Do I … Sort A Hash?
More info: See The Perl Cookbook, section 5.9 Page 144.
Solution: Get a list of keys and sort based on the ordering you want.
Sort by default sorts alphabetically. The optional code block passed to sort will be
called every time sort needs to compare two values in the sort function. $a and $b
are localised sort variables.
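Sorting by key and sorting by value differ only in what the code block compares. A minimal sketch with sample data:

```perl
use strict;
use warnings;

my %food_color = (Apple => "red", Banana => "yellow", Carrot => "orange");

# Sort by key (the default alphabetic comparison)
my @by_name = sort keys %food_color;
print "@by_name\n";      # Apple Banana Carrot

# Sort by value: compare the colours, but return the keys
my @by_color = sort { $food_color{$a} cmp $food_color{$b} } keys %food_color;
print "@by_color\n";     # Carrot Apple Banana

foreach my $food (@by_color) {
    print "$food is $food_color{$food}.\n";
}
```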
How Do I … Test For The Presence Of A Key In A Hash?
More info: See The Perl Cookbook, section 5.2 Page 131.
Toddler: It exists because we gave it a value in the hash, that value is defined (3) and
since it’s non-zero, it is true.
Unborn: It exists because we gave it a value in the hash, that value is defined (0) and
since it’s zero it is not true.
Phantasm: It exists because we gave it a value in the hash, that value is undefined so
it fails the defined test and since undef is false it fails the truth test as well.
Relic: It doesn’t exist since we never put it into the hash. So it fails all three tests.
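The four cases can be reproduced with a hash like the one below. The hash name %age and the way the results are collected are assumptions made for this sketch; the four keys and their values follow the notes above.

```perl
use strict;
use warnings;

my %age = (Toddler => 3, Unborn => 0, Phantasm => undef);
my %status;

foreach my $thing (qw(Toddler Unborn Phantasm Relic)) {
    my @passed;
    push(@passed, "exists")  if exists  $age{$thing};
    push(@passed, "defined") if defined $age{$thing};
    push(@passed, "true")    if         $age{$thing};
    $status{$thing} = "@passed";
    print "$thing: $status{$thing}\n";
}
```

Toddler passes all three tests, Unborn passes the first two, Phantasm only the first and Relic none.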
How Do I … Invert A Hash?
You have a hash and a value for which you want to find the corresponding key.
Solution: Use the reverse() function
# %LOOKUP maps keys to values
%REVERSE = reverse %LOOKUP;
What happens if two different keys happen to have the same value?
Result - The inverted hash will only have one of them. For a solution to this
see the “Perl Cookbook” pages 140 and 141.
More info: See The Perl Cookbook, section 5.8 Page 142.
Use reverse() to create an inverted hash whose values are the original hash’s keys
and whose keys are the original hash’s values.
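A sketch of an inversion, including the case where two keys share a value. The hash names and contents are invented for the illustration.

```perl
use strict;
use warnings;

my %color_of = (Apple => "red", Banana => "yellow", Lemon => "yellow");

# In list context a hash is a list of key/value pairs; reversing that
# list and assigning it to a new hash swaps the two roles
my %food_of = reverse %color_of;

print "red belongs to $food_of{red}\n";          # red belongs to Apple
print "yellow belongs to $food_of{yellow}\n";    # Banana or Lemon, not both

my $inverted_count = keys %food_of;              # 2, one entry was lost
print "The inverted hash has $inverted_count keys\n";
```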
or
because we can’t predict the order in which things come out of hashes. Reversing this
list (assume the first list is the one we get) gives this:
Banana is a food.
Martini is a drink.
More info: See The Perl Cookbook, section 5.2 Page 131.
exists() checks for the existence of a key in a hash. It doesn’t say anything about the
key’s value (if the key exists).
How Do I … Print A Hash?
You want to print a hash, but neither print “%hash” nor print %hash works.
Solution: Iterate using each()
while ( ($k,$v) = each %hash ) {
    print "$k => $v\n";
}
Solution: Use map to generate a list of strings
print map { "$_ => $hash{$_}\n" } keys %hash;
You can print in key order at the cost of doing a sort()
foreach $k (sort keys %hash) {
    print "$k => $hash{$k}\n";
}
More info: See The Perl Cookbook, section 5.5 Page 137.
More info: See The Perl Cookbook, section 5.3 Page 133.
You can’t delete a key by setting its value to undef since undef is a value which a
hash can store. You must use the delete() function.
delete() can also work with a hash slice to remove multiple keys from a hash, like
this:
You need to make a new hash with the entries of two existing hashes.
Solution: Treat the hashes as lists and join them as you would lists. Keys which appear
in both hashes will only appear once in the final hash (with the value from %B).
%merged = (%A, %B);
Alternative: Loop over the hashes’ elements and build a new hash.
%merged = ();
while ( ($k,$v) = each(%A) ) {
    $merged{$k} = $v;
}
while ( ($k,$v) = each(%B) ) {
    $merged{$k} = $v;
}
More info: See The Perl Cookbook, section 5.10 Page 145.
How Do I … Traverse A Hash?
Solution: Use keys with a foreach loop
foreach $key (keys %HASH) {
    $value = $HASH{$key};
    # do something with $key and $value
}
More info: See The Perl Cookbook, section 5.4 Page 135.
The each() function returns a two element list from the hash each! time it is called.
Remember, order has no meaning in hashes, so regardless of the order with which
you put values into the hash, it is very unlikely that they will come back out in that
same order. It is possible to retrieve items in insertion order, but that is beyond the
scope of this course.
How Do I … Find The Most Common Anything?
You want to know how many times a value in an array or in a hash occurs in the
array or hash.
Solution: Use a hash to count how many times each element (for an array) or key (for a
hash) occurs.
%count = ();
foreach $element (@ARRAY) {
    $count{$element}++;
}
The foreach adds one to $count{$element} for every occurrence of $element.
More info: See The Perl Cookbook, section 5.14 Page 150.
How Do I … Operate On A Series Of Integers?
Remember, for and foreach are synonyms, so that gives us another 4 variations
More info: See The Perl Cookbook, section 2.5 Page 49.
Solution: use a for loop or a foreach with the range operator (..)
When iterating over consecutive integers, the third method is most efficient.
Regular Expressions
Notes:
Regular Expressions
s/Windows/Linux/;
When you see something that looks like /foo/ you’re looking at a pattern match
operator (the / and the /).
If you can find patterns in a string then you can also replace those patterns with
something else. So when you see something like s/Windows/Linux/ you’re looking at a
substitution of Linux for Windows (which some people might say is a good thing)!
Finally patterns can also specify where something isn’t. This is used with the split
operator - see next slide.
Regular Expressions
Tip - the best way to split a string which contains lots of white space:
We haven’t covered the \s character class yet - but it stands for any white-space
character. The \s+ means any string containing one or more consecutive white-space
characters (it can be different numbers at different places on a line of text - the fields
on which the split occurs don’t all have to be the same length).
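A sketch of splitting on runs of white space, using sample strings invented for the illustration. The second half shows one detail worth knowing: with leading white space, /\s+/ produces an empty first field, while the special pattern ' ' skips it.

```perl
use strict;
use warnings;

my $line   = "alpha   beta\tgamma  delta";
my @fields = split(/\s+/, $line);
print scalar(@fields), " fields: @fields\n";    # 4 fields: alpha beta gamma delta

# Leading white space: /\s+/ gives an empty first field, ' ' does not
my @with_empty = split(/\s+/, "  x y");         # ("", "x", "y")
my @without    = split(' ',   "  x y");         # ("x", "y")
print scalar(@with_empty), " versus ", scalar(@without), "\n";   # 3 versus 2
```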
Regular Expressions
The simplest regular expressions are those which match several characters in a row:
while (<FILE>) {
print if /http:/;
print if /ftp:/;
print if /mailto:/;
# What next?
}
In the first example we’re looking for all lines containing http: exactly.
The =~ operator is called the binding operator. It’s telling Perl to look for a match in
the variable $line. If we don’t use the =~ operator then Perl by default searches the
system variable $_. This is a special scalar variable which is used in many places in
Perl - not just pattern matching.
In the second example we’re using the default value $_ (which is also set by the <>
operator).
In the third example we’re looking for lots of different types of links, http, ftp, mailto.
What happens if this later needs to be extended. Wouldn’t it be easier to look for any
number of alphabetic characters followed by a colon?
Regular Expressions
/[a-zA-Z]+:/
The [ and ] define a character class. The a-z and A-Z represent all the alphabetic
characters (the - means all characters between the starting and ending character
inclusive).
Example: /a./ will match any string containing an “a” that is not the last character in
a string. Why?
So this will match “at” or “am” or “a!” but not “a” since there’s nothing after the “a”
for the dot (any character) to match with.
It’ll also match “camel” and “oasis”, but not “sheba”. It matches “caravan” on
the first “a”.
Regular Expressions - Quantifiers
The character classes we’ve seen so far all match one character.
You can match a word with \w+ and the “+” is one kind of quantifier.
General quantifiers are like this:
{min,max}
Example     Matches
\d{6,8}     Any number of between 6 and 8 digits
\d{5,5}     A number of exactly 5 digits (can also be written \d{5})
\d{5,}      A number of 5 digits or more
\d{0,5}     A number of 5 digits or less (older Perls do not accept \d{,5})
Code        Meaning
+           {1,}
*           {0,}
?           {0,1}
Exercise: What does this do, i.e. what will be in $line after the substitution?
To apply a quantifier to more than one character, use ( and ) like this:
One other thing to note: quantifiers in Perl are greedy by default - Perl will match as
much as it can.
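Greedy matching and grouped quantifiers can be seen in the sketch below. The sample text is invented, and the minimal form +? is mentioned only for contrast; it is not covered in the notes above.

```perl
use strict;
use warnings;

my $text = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ takes as much as it can, so the match runs to the last >
my ($greedy) = $text =~ /<(.+)>/;
print "$greedy\n";      # b>bold</b> and <i>italic</i

# Adding ? after a quantifier makes it match as little as it can
my ($minimal) = $text =~ /<(.+?)>/;
print "$minimal\n";     # b

# ( and ) let a quantifier apply to more than one character
my $three_times = ("abcabcabc" =~ /^(abc){3}$/) ? "yes" : "no";
my $two_times   = ("abcabc"    =~ /^(abc){3}$/) ? "yes" : "no";
print "$three_times $two_times\n";    # yes no
```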
Regular Expressions - Anchors
Examples:
/\bFred\b/ would match in Answer And Reason
"The Great Fred" Yes
"Fred The Great" Yes
"Frederick The Great" No - Fred is not followed by a non-word character.
So when we said:
next LINE if $line =~ /^#/;
When you try to pattern match, Perl will try to match in every location until it
succeeds. An anchor allows you to specify where a pattern can match.
The special symbol \b matches on a word boundary which is defined as the “nothing”
which exists between a word character “\w” and a non-word character “\W”.
Answer: Go to the next iteration of the loop if the first character on a line is the “#”
character.
Also, when we said that the sequence \d{6,8} would match a number of between 6
and 8 digits - that wasn’t quite true, since it would also match any number containing
9 or more digits as well. To get the desired result we would have to combine
quantifiers with anchors.
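The effect of adding anchors to \d{6,8} can be checked with a few sample numbers. This is a sketch; the hash names are invented for the illustration.

```perl
use strict;
use warnings;

my (%unanchored, %anchored);
foreach my $number ("12345", "123456", "12345678", "123456789") {
    $unanchored{$number} = ($number =~ /\d{6,8}/)   ? "yes" : "no";
    $anchored{$number}   = ($number =~ /^\d{6,8}$/) ? "yes" : "no";
    print "$number: unanchored $unanchored{$number}, anchored $anchored{$number}\n";
}

# \b anchors a pattern to word boundaries
my $fred      = ("Fred The Great"      =~ /\bFred\b/) ? "yes" : "no";
my $frederick = ("Frederick The Great" =~ /\bFred\b/) ? "yes" : "no";
print "Fred: $fred, Frederick: $frederick\n";    # Fred: yes, Frederick: no
```

Only the 9 digit number is treated differently: without anchors its first 8 digits are enough for a match.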
Exercise: write a pattern which will match a number of 5 or 6 digits - but will fail to
match one of more than 6 digits.
Regular Expressions - Back References
/\d+/
/(\d+)/
Both these patterns match the same thing - a number. But the second one remembers what
was matched.
When you match patterns you can use “(“ and “)” to remember the bits of a string
which did match.
The “(“ and “)” don’t change what matches.
How you remember what was matched depends on where you want to remember it
from. Inside the same pattern the bits of pattern which match are stored in variables
\1 \2 \3 etc. The match from the first pair of “(“ and “)” is in \1 and so on.
Outside the pattern the bits of pattern which match are stored in $1 $2 $3 etc.
Be careful - once you start a new pattern match the old values of $1 $2 $3 etc. are all
wiped out, so if you want to remember them long-term then copy $1 $2 $3 etc. into
new variables.
By the way - there’s no limit to how many bits of the pattern can be remembered,
once you get to \9 or $9 Perl continues with \10 and $10 and so on.
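Both kinds of back reference in a minimal sketch: \1 inside the pattern, and $1 $2 $3 afterwards, copied into ordinary variables straight away. The sample strings are invented for the illustration.

```perl
use strict;
use warnings;

# \1 inside the pattern: find a word which is immediately repeated
my $text = "the the quick brown fox";
my $doubled = "";
if ($text =~ /\b(\w+)\s+\1\b/) {
    $doubled = $1;
}
print "Repeated word: $doubled\n";     # Repeated word: the

# $1, $2, $3 outside the pattern: copy them before the next match
my $date = "2005-09-15";
my ($year, $month, $day);
if ($date =~ /^(\d{4})-(\d\d)-(\d\d)$/) {
    ($year, $month, $day) = ($1, $2, $3);
}
print "$day/$month/$year\n";           # 15/09/2005
```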
Whoops - no easy answer here this time - you’ll have to work it out.
Regular Expressions - List Processing
Examples:
@array = (1 + 2, 3 - 4, 5 * 6, 7 / 8);
Lots of Perl operators can produce either scalar results or list results.
It depends on how they are used. They just “know” what is expected of them.
In the second example each of @dudes, @chicks and other() returns a list, all
the lists are then joined together to produce a single (big) list and that is passed to
sort().
Some operators produce lists (like keys), while some consume them (like print).
You can stack up several list operators in a row - see example 3. This takes all
the keys from %hash, turns them all into lower-case by applying the lc operator (via
map { }), passes that list to the sort function and then passes the sorted list to the
reverse function, whose result is (finally) printed by print.
If you do a pattern match in list context then all the back-references are pulled out as
a list - see example 4 and example 5. TMTOWTDI.
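The slide examples are not reproduced in these notes, so here is a minimal sketch of the two ideas, assuming a made-up %hash and date string: a stack of list operators read from right to left, and a pattern match in list context handing back its captures.

```perl
use strict;
use warnings;

my %hash = (Apple => 1, cherry => 2, Banana => 3);

# Stacked list operators, read right to left: keys -> map lc -> sort -> reverse.
my @names = reverse sort map { lc($_) } keys %hash;
print "@names\n";

# A pattern match in list context returns the captured groups as a list.
my $stamp = "2005-09-14";
my ($year, $month, $day) = $stamp =~ /^(\d+)-(\d+)-(\d+)$/;
print "year=$year month=$month day=$day\n";
```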
How Do I … Parse Comma-Separated Data?
You have a file containing comma-separated values that you need to read in, but
these data fields may have quoted commas or escaped quotes in them.
sub parse_csv {
    my $text = shift; # record containing comma-separated values
    my @new = ();
    push(@new, $+) while $text =~ m{
        # the first part groups the phrase inside the quotes.
        # see explanation of this pattern in MRE
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?
        | ([^,]+),?
        | ,
    }gx;
    push(@new, undef) if substr($text, -1,1) eq ',';
    return @new; # list of values that were comma-separated
}
This procedure is from “Mastering Regular Expressions”.
More info: See The Perl Cookbook, section 1.15 Page 31.
Text::ParseWords hides all this complexity from you. Pass its quotewords()
function two arguments and a CSV string. The first argument is the separator (in this
case a comma); the second is a value which is true or false, and which controls
whether the strings returned keep the quotes around them.
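A minimal sketch of the Text::ParseWords approach, assuming a made-up input line. The second argument is 0 here, so the quotes are stripped from the returned fields.

```perl
use strict;
use warnings;
use Text::ParseWords;

my $line = 'alpha,"beta, gamma",delta';

# Separator, keep-quotes flag, then the string(s) to split.
my @fields = quotewords(',', 0, $line);

print scalar(@fields), " fields\n";
print "$_\n" for @fields;
```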
How Do I … Check If A String Is A Valid Number?
General solution:
if ($string =~ /PATTERN/) {
    # is a number
} else {
    # is not
}
Specific solutions: replace PATTERN with a pattern for the kind of number you expect.
More info: See The Perl Cookbook, section 2.1 Page 44.
This is something which is common when validating input as part of a CGI script.
The solution is easy as long as you can decide what you mean by a number, and can
then write a regular expression (or series of expressions) to look for the pattern you
desire.
If numbers can have leading or trailing space then a substitution to remove that
space should occur, like this:
$probable_number =~ s/^\s+|\s+$//g;
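What counts as a number is your decision, so the following is only a sketch of two possible definitions. The helper names is_integer and is_decimal are hypothetical, not functions from the Cookbook.

```perl
use strict;
use warnings;

# Hypothetical helpers: each one encodes a single definition of "number".
sub is_integer { return $_[0] =~ /^[+-]?\d+$/ ? 1 : 0 }
sub is_decimal { return $_[0] =~ /^[+-]?(?:\d+(?:\.\d*)?|\.\d+)$/ ? 1 : 0 }

for my $candidate ("42", "-7", "3.14", ".5", "1e5", "abc") {
    printf "%-5s integer=%d decimal=%d\n",
        $candidate, is_integer($candidate), is_decimal($candidate);
}
```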
How Do I … Copy And Substitute Simultaneously?
You want an easy way in pattern matching of copying and substituting at the same
time.
You want to avoid this:
$dst = $src;
$dst =~ s/this/that/;
More info: See The Perl Cookbook, section 6.1 Page 164.
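One common idiom is to put the assignment in parentheses and bind the substitution to its result, so the copy is the thing that gets changed. A small sketch with made-up strings:

```perl
use strict;
use warnings;

my $src = "this is the source";

# The assignment happens first; s/// then works on $dst, not on $src.
(my $dst = $src) =~ s/this/that/;

print "src: $src\n";
print "dst: $dst\n";
```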
How Do I … Match Only Letters When Pattern Matching?
More info: See The Perl Cookbook, section 6.2 Page 165.
The obvious way of doing this isn’t good enough in the general case since it doesn’t
respect a user’s locale setting. If you need to match letters with diacritical marks, then
use something like the second example which matches against a negated character
class.
The \w regular expression matches one alphabetic character, one numeric character
or _. Therefore \W matches anything that is not one of those. The negated character
class [^\W\d_] specifies a character which must not be a non-word character, a digit,
or an underscore. That leaves nothing but alphabetics.
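The two slide examples are not reproduced here, so this sketch assumes the obvious way is [A-Za-z] and compares it with the negated class. The test words are made up; the second is a name written with its diacritical marks as Unicode escapes.

```perl
use strict;
use warnings;

# "Dvo\x{159}\x{e1}k" is the name Dvorak written with its diacritical marks.
my @words = ("Perl", "Dvo\x{159}\x{e1}k", "abc123");

for my $word (@words) {
    my $ascii_only = $word =~ /^[A-Za-z]+$/ ? "yes" : "no";
    my $any_letter = $word =~ /^[^\W\d_]+$/ ? "yes" : "no";
    print "ASCII letters only: $ascii_only, any letters: $any_letter\n";
}
```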
How Do I … Match Only Words When Pattern Matching?
You need to decide what you want a word to be, and then
write a pattern to detect it.
More info: See The Perl Cookbook, section 6.3 Page 167.
What you mean by a word varies between languages. Perl doesn’t have a built-in
definition of what a word is. You must make them from character classes and
quantifiers.
More info: See The Perl Cookbook, section 6.4 Page 168.
Use the /x modifier. This will cause the regular expression engine to ignore most
whitespace inside a regular expression and will also allow for the insertion of
comments. The allowed whitespace is space, tabs, and newlines.
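A short sketch of a commented pattern, assuming a made-up date string. Because of /x the spaces, newlines and # comments inside the pattern are ignored by the engine.

```perl
use strict;
use warnings;

my $date = "2005-09-14";

my ($year, $month, $day) = $date =~ m{
    ^ (\d{4})    # year
    - (\d{2})    # month
    - (\d{2})    # day
    $
}x;

print "year=$year month=$month day=$day\n";
```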
How Do I … Find The Nth Occurrence Of A Match?
You want to find the Nth match in a string, not just the first one.
Input: One fish two fish red fish blue fish
Example: Find the word preceding the third occurrence of “fish”.
Use the /g modifier in a while loop and keep count of the number of matches.
$WANT = 3;
$count = 0;
while (/(\w+)\s+fish\b/gi) {
    if (++$count == $WANT) {
        print "The third fish is a $1 one.\n";
        # Warning: don't `last' out of this loop
    }
}
More info: See The Perl Cookbook, section 6.5 Page 170.
The /g modifier creates a progressive match which can be used in a while loop. To
find the Nth match, it’s easiest to keep your own counter and then whenever you
reach the count you want, do whatever is appropriate.
How Do I … Read Records With A Pattern Separator?
# .Ch, .Se and .Ss divide chunks of STDIN
{
    local $/ = undef;
    @chunks = split(/^\.(Ch|Se|Ss)$/m, <>);
}
print "I read ", scalar(@chunks), " chunks.\n";
Create a localised copy of $/ which will be restored after the code finishes. By
using split with () we also get the captured separators returned in the final array.
More info: See The Perl Cookbook, section 6.7 Page 176.
You want to read all lines from one starting pattern to an ending pattern.
while (<>) {
    if (/BEGIN PATTERN/ .. /END PATTERN/) {
        # line falls between BEGIN and END in the
        # text, inclusive.
    }
}
More info: See The Perl Cookbook, section 6.8 Page 177.
Solution: Use the range operator (..), either with patterns or with line numbers.
Here’s a very interesting Perl one-liner which makes use of this feature:
More info: See The Perl Cookbook, section 6.14 Page 190.
If you use the /g pattern modifier, the Perl regular expression engine keeps track of
its position when it finishes matching. The next time you match with /g the engine
starts looking for a match from the remembered position. This lets you use a while
loop to extract the information you want from the string.
How Do I … Match From Where The Last Pattern Left Off?
You want to match again from where the last pattern left off.
$_ = "The year 1752 lost 10 days on the 3rd of September";
More info: See The Perl Cookbook, section 6.14 Page 190.
By default, when your match fails (say when you run out of numbers in the example
above), the remembered position is reset to the start. If you don’t want this to
happen because you want to carry on matching then use the /c modifier with /g.
This pattern:
/\G(\S+)/g
will find whatever non-whitespace characters follow the last number (rd, in this case).
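Putting the pieces together, a minimal sketch using the string from the slide. The @numbers and $suffix variables are names chosen for this example.

```perl
use strict;
use warnings;

$_ = "The year 1752 lost 10 days on the 3rd of September";

# /g walks through the string; /c keeps the position when the match finally fails.
my @numbers;
while (/(\d+)/gc) {
    push @numbers, $1;
}

# \G anchors the next match at the remembered position (just after the last number).
my $suffix;
if (/\G(\S+)/g) {
    $suffix = $1;
}

print "Numbers: @numbers\n";
print "Found $suffix after the last number.\n";
```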
How Do I … Expand And Compress Tabs?
You want to convert the tabs in a string into the appropriate number of spaces, or
vice-versa.
# Example 1
while ($string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {
    # spin in empty loop until substitution finally fails
}
# Example 2
use Text::Tabs;
@expanded_lines = expand(@lines_with_tabs);
@tabulated_lines = unexpand(@lines_without_tabs);
# Example 3
while (<>) {
    1 while s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
    print;
}
# Example 4
use Text::Tabs;
$tabstop = 4;
while (<>) { print expand($_) }
More info: See The Perl Cookbook, section 1.7 Page 15.
LAB6 - REGEXP_1
LAB6 - REGEXP_2
Scope, Pragmas, Modules, Subroutines, References
Notes:
Scope
By default (if you do nothing at all) Perl’s variables are global and permanent (later
we’ll see that these are called package variables). This makes writing short programs very
easy, but such programs can be difficult to debug.
In both cases we have forced all variables to be declared before they are used (using
my) - that doesn’t affect the code. The point is that in the left example $pw and
$pw_length only exist in this piece of code. In the right example the same two
variables exist after the code is finished executing.
Subroutine declarations are global declarations - wherever you place them they are
visible to all code in your package.
Pragmas
Notes:
Pragmas
use constant;
use integer;
use integer;
$x = 10/3;
# $x is now 3, not 3.33333333333333333
use integer;
$x = 1.8;
$y = $x + 1;
$z = -1.8;
This pragma tells the compiler to use integer arithmetic only from now to the end of
the enclosing block.
In the second example you’ll be left with $x == 1.8, $y == 2 and $z == -1. The case
for $z is special since the - sign in front of the 1.8 counts as an operation (unary
minus) so the value of 1.8 is truncated to 1 before its sign bit is flipped.
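A small sketch of the scoping rule: the pragma below is in effect only inside the bare block, so the division outside it is still done in floating point. The variable names are made up for this example.

```perl
use strict;
use warnings;

my $float_result = 10 / 3;    # ordinary floating-point division

my ($int_result, $y);
{
    use integer;              # in effect only to the end of this block
    $int_result = 10 / 3;     # integer division
    my $x = 1.8;
    $y = $x + 1;              # $x is truncated to 1 before the addition
}

print "float: $float_result, integer: $int_result, y: $y\n";
```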
Pragmas
use lib;
#!/usr/bin/perl -w
use Mosfet;       # loads Mosfet.pm
use Capacitor;    # loads Capacitor.pm
use Resistor;     # loads Resistor.pm
use Diode;        # loads Diode.pm
use Instance;     # loads Instance.pm
This is used to modify the list of places in which Perl will look to find library modules.
It’s roughly equivalent to adding to your Unix $path variable.
The strict, Carp and English modules are all standard Perl modules. Perl always knows
how to find these.
use strict;
use strict; # Install all three strictures.
This pragma changes what Perl considers to be legal code. Sometimes these
strictures seem too strict for casual programming - until you spend an hour looking
for a bug which wouldn’t have happened if you’d used this pragma.
There are three things we can be strict about: subs, vars, and refs.
Symbolic references are suspect for a lot of reasons - it’s pretty easy to use one even
when you don’t mean to. With this stricture in effect you can only use real or hard
references. So, what are symbolic references?
Strict vars will trigger a compile time error if you attempt to access a variable which
has not met one of the following criteria:
Carp lets you report errors from the perspective of a user, so if a user fails to use
your modules correctly, the error messages will show up not as problems in your code
(which of course you’ve thoroughly debugged), but in the user’s code. In other words
this is a blame shifter.
Cwd is a module which lets you find out the current working directory - for Unix this
isn’t too useful since you can always use $cwd = `pwd`; however, Cwd is guaranteed
to work on all systems where Perl is installed, even when they don’t have a shell
command which will let them do $cwd = `pwd`;
English lets you use English names instead of the standard Perl names for built-in
variables.
Exporter is used with modules to determine what subroutines can be seen from the
outside of the module.
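A brief sketch showing three of these modules side by side. The half() subroutine is a hypothetical helper invented for this example; getcwd(), croak() and the English variable names come from the modules themselves.

```perl
use strict;
use warnings;
use Cwd;
use English;
use Carp;

my $dir = getcwd();
print "Current directory: $dir\n";
print "Running as: $PROGRAM_NAME (process $PROCESS_ID)\n";   # $0 and $$

# Hypothetical helper: croak reports the error from the caller's point of view.
sub half {
    my ($n) = @_;
    croak "half() needs a number" unless defined $n;
    return $n / 2;
}
print half(10), "\n";
```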
Subroutines
Syntax:
To declare a named subroutine without defining it do one of these.
sub NAME
sub NAME PROTO
sub NAME ATTRS
sub NAME PROTO ATTRS
say_hello();
Subroutines can be defined anywhere in your program, loaded in from other files via
do, require or use, or generated at run time with eval. You can call a subroutine
directly, indirectly through a variable containing either its name or a reference to the
subroutine, or through an object letting the object determine which subroutine should
really be called.
To create an anonymous subroutine just leave out the name. PROTO and ATTRS
stand for prototype and attributes respectively - they’re not so important. NAME and
BLOCK are essential even when they’re missing. For forms without the name you
need to have some way to call the subroutine, so do this:
&$subref;
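For instance, a minimal sketch of creating an anonymous subroutine and calling it through its reference, both with the & form and with the arrow form. The greeting text is made up.

```perl
use strict;
use warnings;

# No NAME, so keep the reference that sub { } returns.
my $subref = sub {
    my ($name) = @_;
    return "Hello, $name";
};

print &$subref("Steve"), "\n";     # call with the & prefix
print $subref->("Steve"), "\n";    # same call using the arrow notation
```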
Subroutines
Just as in previous examples, the lists passed to a subroutine are all flattened. So the
third call to dictionary_order would contain the contents of the array @sheep,
followed by the contents of the array @goats, the value of “shepherd” and finally
the scalar value stored in $goatherd.
It is possible to pass two or more arrays to a subroutine and have them maintain their
integrity (i.e. keep them unflattened).
If the subroutine does not require arguments then it can be passed an empty
argument list. The list can also be missed completely as long as Perl knows it’s a
subroutine.
Like variables, subroutines have a leading symbol which indicates what they are. The
name of a subroutine is preceded by an & which may be used when calling it. It must
be used when calling a subroutine in certain contexts (we’ll see these in a minute). It
can’t be used when defining the subroutine however. So this won’t work:
Subroutines which have been defined earlier can be called without “(“ and “)”.
Example 1: A subroutine already defined can be called without the “(“ and “)” around
the argument list.
Example 2: Another way to call a subroutine is to use the & prefix but without passing
any arguments. In this case the subroutine has the value of the @_ array passed to it
instead. This is used to call subroutines from within other subroutines. This is almost
never used in new code but may be present in old code. Always use subroutines as
shown in the style section of this course.
Named Subroutine Arguments
sub ls
{
%arg = @_; # convert a list to a hash
#etc
}
Example 1: You don’t want to pass 9 arguments to this subroutine when only a few
are going to change.
Example 2: You could arrange that passing undef as a parameter chooses a default
value but we’d still have to write a long piece of code as shown.
In the first example we set up some default values for some arguments.
In the third example we use a default set of arguments and then override some of
that standard set as well.
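The slide examples are only partly reproduced above, so the following is a sketch of the idea rather than the original code. The option names dir and long are hypothetical; defaults come first in the hash and the caller's pairs in @_ override them.

```perl
use strict;
use warnings;

sub ls {
    my %arg = (
        dir  => ".",    # default values (hypothetical option names)
        long => 0,
        @_,             # the caller's name => value pairs override the defaults
    );
    return "dir=$arg{dir} long=$arg{long}";
}

print ls(), "\n";
print ls(long => 1), "\n";
print ls(dir => "/tmp", long => 1), "\n";
```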
Aliasing Of Parameters - Pass By Reference
#!/usr/bin/perl -w
use strict;
exit;
return 0;
}
In this code we pass the parameters in @_ (this is always true) and use them in the
subroutine as aliases. Therefore when we change the value of one or more of the
parameters in the subroutine we are actually changing them in the calling code as
well.
Therefore
$_[1] = "dog";
my $animal = "dog";
on line 6.
#!/usr/bin/perl -w
use strict;
exit;
In this code we pass the parameters in @_ (this is always true) and use them in the
subroutine as values by copying them into local variables. Therefore when we change
the value of one or more of the parameters in the subroutine the change is restricted
to the values of the local variables in the subroutine. Therefore the assignment:
$animal = "dog";
my $dv_by_dt = $delta_v/$delta_t;
Elements of the @_ array are special. They are not copies of the actual arguments.
They are aliases to the actual arguments.
If values $_[0], $_[1] etc. are changed then the argument in the calling routine is
changed, i.e. the parameters in this case are passed by reference.
We would usually prefer to pass by value - this is the more usual form - so explicitly
copy the @_ array into a new array, and to be doubly safe make the receiving array a
my() array.
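The difference can be seen in a small sketch. Both subroutine names are made up for this example: one writes through the alias in @_, the other copies its argument into a my variable first.

```perl
use strict;
use warnings;

# Writes through the alias: the caller's variable changes.
sub rename_alias { $_[0] = "dog"; return }

# Copies the argument first: only the private copy changes.
sub rename_copy {
    my ($animal) = @_;
    $animal = "dog";
    return $animal;
}

my $pet = "cat";
rename_copy($pet);
print "after rename_copy:  $pet\n";
rename_alias($pet);
print "after rename_alias: $pet\n";
```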
use Carp;
It corresponds to die().
Subroutine Calling Context
The information about the calling context is obtained from the wantarray function.
We could use this information to decide what value a subroutine needs to return.
Subroutine Prototypes
sub add_two_param ( $$ )
{
return( $_[0] + $_[1] );
}
Notes:
How Do I … Access Subroutine Arguments
You have written a function and want to access the arguments passed by its caller.
Solution:
sub hypotenuse {
    return sqrt( ($_[0] ** 2) + ($_[1] ** 2) );
}
More info: See The Perl Cookbook, section 10.1 Page 335.
All values passed as arguments are in the special array @_. So the first argument is in
$_[0] and so on. The number of arguments is scalar(@_).
Subroutines should always start by copying the arguments into a new private array.
To return a value from a subroutine use the return function. If there is no return
statement, then the value returned by the subroutine is the value of the last
statement executed by the subroutine.
How Do I … Make Variables Private To A Function
More info: See The Perl Cookbook, section 10.2 Page 337.
When you declare many private variables you must do so inside a list, like this:
Variables declared with my have lexical scope, which means that they only exist
within a certain textual area of your code. Such a variable is destroyed when the body
of code is ended. Usually the body of code is a block with braces around it like this:
{
# Your Code Here
}
Since a lexical scope is usually a block you will often hear the phrase lexical variables
being only visible within their block.
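A minimal sketch of both points, using a hypothetical area() function: several private variables declared in one parenthesised list, and a variable that exists only inside a bare block.

```perl
use strict;
use warnings;

sub area {
    my ($width, $height) = @_;    # the parentheses make both variables private
    return $width * $height;
}

{
    my $scratch = area(3, 4);     # $scratch exists only inside this block
    print "area is $scratch\n";
}
# $scratch is out of scope here; under strict, using it would not compile.
```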
How Do I … Create Persistent Private Variables
You want a variable to retain its value between calls to a subroutine but not to be
visible outside that subroutine.
Solution: Wrap the function in another block and declare my variables in the block’s
scope rather than the function’s.
{
    my $variable;
    sub mysub {
        # ... accessing $variable
    }
}
More info: See The Perl Cookbook, section 10.3 Page 339.
Lexical variables don’t need to vanish when their scope ends. If something more
permanent is still aware of the lexical then it will be maintained. (Perl does this by
reference counting).
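As a sketch of the technique, the counter below is invisible outside the block but keeps its value between calls because next_counter() still refers to it. Both names are made up for this example.

```perl
use strict;
use warnings;

{
    my $counter = 0;                          # private to this block
    sub next_counter { return ++$counter }    # the sub keeps $counter alive
}

print next_counter(), "\n";
print next_counter(), "\n";
```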
How Do I … Detect Return Context
You want to return a value that depends upon the calling context.
More info: See The Perl Cookbook, section 10.6 Page 344.
Solution: Use wantarray() which returns one of three things depending on how the
function was called.
A function can decide what context it was called in and then return something which
is appropriate to that context.
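The three cases are: true in list context, false but defined in scalar context, and undef in void context. A minimal sketch with a hypothetical context() function:

```perl
use strict;
use warnings;

sub context {
    return "list"   if wantarray;            # true: list context
    return "scalar" if defined wantarray;    # false but defined: scalar context
    return "void";                           # undef: void context
}

my @in_list   = context();
my $in_scalar = context();

print "list call saw:   $in_list[0]\n";
print "scalar call saw: $in_scalar\n";
```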
$ref_to_scalar = \$my_scalar;
$ref_to_array = \@my_array;
$ref_to_hash = \%my_hash;
$ref_to_sub = \&my_sub;
We are going to discuss hard references here and symbolic references (only in
passing) at the end of this section. When we say references we will always mean a
hard reference.
Once we have a reference, we can get at the thing it refers to by prefixing the
reference (optionally in { and }) with the appropriate symbol.
${\$my_scalar};
$$ref_to_scalar;
${$ref_to_scalar};
@{\@my_array};
@$ref_to_array;
@{$ref_to_array};
and so on. If you prefix a reference by the wrong symbol then you’ll get an error.
References
The arrow operator takes a reference on its left and either an array index in [] or a
hash key in {} on its right. It locates the array or hash that the reference refers to
and then accesses the appropriate element.
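A short sketch comparing the prefix forms with the arrow form. The data values are made up; the variable names follow the ones used earlier in this section.

```perl
use strict;
use warnings;

my @my_array = (10, 20, 30);
my %my_hash  = (colour => "red");

my $ref_to_array = \@my_array;
my $ref_to_hash  = \%my_hash;

# Three ways to reach the same array element.
print ${$ref_to_array}[1], "\n";
print $$ref_to_array[1], "\n";
print $ref_to_array->[1], "\n";

# Hash element via the arrow, and the whole array via the @ prefix.
print $ref_to_hash->{colour}, "\n";
print scalar(@$ref_to_array), " elements\n";
```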
References And The ref() Function
Object references are missing from the above list because what ref() returns
for a reference to an object is the name of the object’s class. This, of course,
changes as you use different objects.
Because dereferencing a reference with the wrong prefix can cause errors it’s
sometimes necessary to be able to figure out what kind of referent a specific
reference is referring to.
The built-in ref() function takes a scalar value and returns a description of the kind of
reference it contains.
If a reference is used where a string is expected then the ref function is called
automatically to produce a string and a unique hex address representing the internal
memory address of the referent is appended. This means that printing out a reference
usually produces something like:
HASH(0x10027588)
my $graphics_object = Polygon->new( 0, 0, 5, 5, 10, 32, 70, 10, 12, 18
); # Polygon coordinates
print ref( $graphics_object ); # Will print “Polygon”
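A minimal sketch of what ref() reports for the built-in kinds of referent, and of what a reference looks like when it is used as a string. All the values are made up.

```perl
use strict;
use warnings;

my $scalar = 42;
my @kinds  = (
    ref(\$scalar),            # reference to a scalar
    ref([1, 2]),              # reference to an array
    ref({ a => 1 }),          # reference to a hash
    ref(sub { 1 }),           # reference to a subroutine
    ref("plain string"),      # not a reference at all: empty string
);
print join(",", @kinds), "\n";

# Used as a string, a reference shows its type and an address.
my $href    = { a => 1 };
my $as_text = "$href";
print "$as_text\n";
```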
References And Anonymous Arrays
@table = (
    ( 1 , 2 , 3 ) ,
    ( 4 , 5 , 6 ) ,    # This won’t work!
    ( 7 , 8 , 9 ) ,
);
@table = ( 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 );    # the flattened equivalent

@row1 = ( 1 , 2 , 3 );
@row2 = ( 4 , 5 , 6 );
@row3 = ( 7 , 8 , 9 );
@cols = ( \@row1 , \@row2 , \@row3 );
$table = \@cols;
[Diagram: $table refers to @cols, whose elements \@row1, \@row2 and \@row3 refer to
@row1 (1 2 3), @row2 (4 5 6) and @row3 (7 8 9).]
The first example doesn’t work because of list flattening. So we need to use
references to solve this problem.
Each element in a Perl array can store a scalar, and a reference is a scalar (albeit a
special kind of scalar).
The bottom half of the slide shows how to set this up using references. The elements
of the rows can be accessed using the arrow -> notation.
$table->[1]->[2];
This means: find the array referred to by the reference in $table (i.e. @cols) and then
get the element at index 1. That element stores a reference (a reference to @row2),
then get the element at index 2.
This is a popular way of creating data structures so Perl provides some simple
assistance. If we place the list values in [] instead of () we create a reference to a
nameless (or anonymous) array. The array is automatically initialised to the specified
values.
References And Anonymous Arrays
@table = (
    ( 1 , 2 , 3 ) ,
    ( 4 , 5 , 6 ) ,    # This won’t work!
    ( 7 , 8 , 9 ) ,
);
$table = [
    [ 1 , 2 , 3 ] ,
    [ 4 , 5 , 6 ] ,    # But this will!
    [ 7 , 8 , 9 ] ,
];
The bottom example is identical to the data structure we set up on the previous page
except that all the internal arrays are anonymous - so you can’t access @cols or
@rows. The only access to the array elements is via the reference to the overall table.
print $table->[$x]->[$y];
Any arrow between a closing square or curly bracket and an opening square or curly
bracket can be removed. So the above can be rewritten like this:
print $table->[$x][$y];
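A runnable sketch of the anonymous version of the table, showing that both arrow spellings reach the same element and how to walk the rows.

```perl
use strict;
use warnings;

my $table = [
    [ 1, 2, 3 ],
    [ 4, 5, 6 ],
    [ 7, 8, 9 ],
];

print $table->[1]->[2], "\n";    # row index 1, column index 2
print $table->[1][2], "\n";      # same element, arrow between brackets dropped

# Each row is itself an array reference.
for my $row (@$table) {
    print "@$row\n";
}
```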
$association = { cat => "nap" , dog => "gone" , mouse => "ball" };
$behave =
{
    cat => { nap => "lap" , eat => "meat" } ,
    dog => { prowl => "growl" , pool => "drool" } ,
    mouse => { nibble => "cheese" } ,
};
Like the [] array constructor the {} hash constructor creates a reference which must
be assigned to a scalar variable ($association), not to a hash (%association). Like the
array reference, the values in the hash are only accessible via the hash reference:
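A minimal sketch of such access, reusing the $behave structure from the slide. The loop at the end is an addition for illustration.

```perl
use strict;
use warnings;

my $behave = {
    cat   => { nap    => "lap",    eat  => "meat"  },
    dog   => { prowl  => "growl",  pool => "drool" },
    mouse => { nibble => "cheese" },
};

print $behave->{cat}->{eat}, "\n";
print $behave->{dog}{pool}, "\n";    # the arrow between the braces is optional

for my $animal (sort keys %$behave) {
    my @actions = sort keys %{ $behave->{$animal} };
    print "$animal: @actions\n";
}
```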
# ...
sub fn {
.....
return (\%a, \%b, \%c); # or
return \(%a, %b, %c); # same thing
}
More info: See The Perl Cookbook, section 10.9 Page 347.
Just as all lists are flattened when multiple lists are passed to a function, the same
happens with lists returned from functions with the return statement. Therefore to
maintain the integrity of the arrays and hashes which are returned from a function,
the arrays and hashes must be returned as references.
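For example, a sketch with a hypothetical two_lists() function. Returning the two arrays directly would flatten them into one list of seven values; returning references keeps them apart.

```perl
use strict;
use warnings;

sub two_lists {
    my @evens = (2, 4, 6);
    my @odds  = (1, 3, 5, 7);
    return (\@evens, \@odds);    # two references, not seven flattened values
}

my ($evens, $odds) = two_lists();
print "evens: @$evens\n";
print "odds:  @$odds\n";
```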
Creating Data Structures
Perl creates a hash called %sue, gives it a new hash element indexed by the string
children, points that to a newly allocated array whose second entry is made to
refer to a newly allocated hash which gets an entry indexed by the string age.
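The slide statement is not reproduced in these notes. A single assignment of the following shape would trigger the steps described; the value 6 is an assumption made for this sketch.

```perl
use strict;
use warnings;

my %sue;
$sue{children}[1]{age} = 6;    # every intermediate structure is created on demand

print ref($sue{children}), "\n";        # an array was allocated
print ref($sue{children}[1]), "\n";     # its second entry refers to a hash
print $sue{children}[1]{age}, "\n";
```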
References To Subroutines
The above is useless since there’s no way to execute the subroutine, so do this:
$sub_ref->( "Steve" );
Notes: The “;” at the end of the second example is required since the whole line is a
statement.
The third example executes the code in the subroutine reference. We need to pass a
parameter to the subroutine and this is done by enclosing it between “(” and “)”.
Passing Subroutine Arguments As References
sub mysub
{
# Arrays are references, counts are scalars
# (Might be useful to prefix references with ref_)
my ( $array1 , $count1 , $array2 , $count2 ) = @_;
# Call the above like this (assumes arrays and counts already set up)
# prints 15 and 36
In this code we are expecting four parameters to be passed to mysub, two arrays,
and two scalars which will be interpreted as an index into those arrays. The arrays are
passed by reference, the scalars by value. Note that we can return more than one
value from a subroutine - in this case we return 2.
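The body of mysub is not reproduced above, so here is one possible complete version as a sketch. The array contents are made up, chosen so that the call prints 15 and 36.

```perl
use strict;
use warnings;

sub mysub {
    # Arrays arrive as references, counts as plain scalars.
    my ($array1, $count1, $array2, $count2) = @_;
    return ($array1->[$count1], $array2->[$count2]);    # two return values
}

my @first  = (5, 10, 15, 20);
my @second = (12, 24, 36);

my ($x, $y) = mysub(\@first, 2, \@second, 2);
print "$x and $y\n";
```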
Returning Subroutine Results As References
sub make_random_list
{
# Counts are scalars
Subroutines can return references as well as receiving them. This example shows a
subroutine which generates a large list of random numbers and then copies that list
back to the code which called the subroutine. As shown above the list is copied back
by value, i.e. a big copy of the list is passed back to the calling code as a large array.
This means that in the program code there exists:
1 copy of the array in the subroutine, and once the subroutine ends and the array
@new_array goes out of scope, that array is destroyed by Perl.
1 copy of the array is brought into existence in the main program as the end of
subroutine is reached and each of the internal values in new_array is copied back into
big_random_array. TINTWTDI.
Returning Subroutine Results As References
sub make_random_list
{
# Counts are scalars
In this code there is only ever one copy of the list - and it’s the one defined in the
subroutine. When the subroutine ends and returns a reference to the list, normally
Perl would arrange for the list to be destroyed (since it’s local to the subroutine and
it’s about to go out of scope). However, since the subroutine is passing back a
reference to an array, Perl arranges for the array to remain in existence. Only if the
reference to the array is ever made to cease to exist, will Perl then delete the array
which was defined inside the subroutine.
Perl does this using a mechanism called reference counting. Basically it means that all
Perl’s garbage collection is done for you.
If you wanted to force Perl to delete the array inside the subroutine (to save on
memory, say) then all you need to do is:
undef $big_random_array;
Perl will reduce the reference count on the variable, and if it is zero then the array
created by the subroutine will be deleted.
Also, since only one thing (a scalar which is a reference) is passed back from the
subroutine to the calling code, it’s very quick and efficient.
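The slide code is truncated above, so this is a sketch of the by-reference version under the assumption that the subroutine takes a count and fills @new_array with rand() values.

```perl
use strict;
use warnings;

sub make_random_list {
    my ($count) = @_;
    my @new_array = map { rand() } 1 .. $count;
    return \@new_array;    # only one scalar (the reference) is passed back
}

my $big_random_array = make_random_list(1000);
print scalar(@$big_random_array), " numbers\n";

# Dropping the last reference lets Perl free the array.
undef $big_random_array;
```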
Symbolic References
Examples:
$name = "bam";
$$name = 1; # Sets $bam
$name->[0] = 4; # Sets the first element of @bam
$name->{X} = "Y"; # Sets the X element of %bam to Y
@$name = (); # Clears @bam
keys %$name; # Yields the keys of %bam
&$name; # Calls &bam
With symbolic references Perl is using the value of one variable as the name of
another variable. This can be error prone and confusing, so I tend not to use this type
of reference. You can force Perl to make all of the above examples into errors by
using:
use strict;
Which I would recommend. If you then have a desperate need to use a symbolic
reference for a while you can then always countermand the stricture with:
no strict 'refs';
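A small sketch of both behaviours. Under use strict the symbolic dereference is a run-time error, which is caught here with eval; inside the no strict 'refs' block the same dereference is allowed. The package variable $bam matches the name used in the examples above.

```perl
use strict;
use warnings;

our $bam  = 1;
my  $name = "bam";

# With strict refs in effect, using a string as a reference dies at run time.
my $ok    = eval { my $value = $$name; 1 };
my $error = $ok ? "" : $@;
print "strict stopped it: $error";

{
    no strict 'refs';                        # countermand the stricture for this block
    print "symbolic lookup: ${$name}\n";     # reads the package variable $bam
}
```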
Packages
sub call
{
    ( $sub_ref , @args ) = @_;
    $sub_ref->( @args );
}
package phone;
package poker;
sub call
{
    $pot = 21;
    deal();
}
package main;
call( $ref , @args );
This defines three completely distinct subroutines named call.
We would all like to use popular variable names like $count, $filename, $i. If
we did this there wouldn’t be any way to use other people’s code, since they would
have used the same variable names. Perl solves this problem by assigning each
named variable and each named subroutine to a particular family, known as a
package.
Each package maintains its own symbol table or namespace. So two different
packages may each have different variables and subroutines with identical names in
their own namespace.
By default Perl assumes that code is written in the namespace of the main package
(which is called, appropriately enough, “main”). You can change that default by
using the package keyword. A package declaration changes the namespace until
another package declaration is made or until the end of the current enclosing block,
eval, subroutine or file. See example:
The example defines three subroutines called “call” in three different packages. The
first, since its package isn’t explicitly named, is in the main package. If we wanted to
call one of the other subroutines called call, we could either switch to the package or
we can call the subroutine version explicitly by prefixing the subroutine name by the
package name like this:
poker::call();
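A self-contained sketch of three distinct subroutines sharing one name. The return strings are made up so that each call shows which package answered.

```perl
use strict;
use warnings;

sub call { return "main call" }      # lives in package main

package phone;
sub call { return "phone call" }

package poker;
sub call { return "poker call" }

package main;
print call(), "\n";                  # unqualified: the current package
print phone::call(), "\n";           # qualified with the package name
print poker::call(), "\n";
```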
Package Variables
$i is created when it is first referenced and it exists until it goes out of scope, in
this case at the end of the program, since it isn’t a lexical variable - it belongs to the
current package. We can force the use of a variable in another package by prefixing the
name of the variable with the name of the package followed by a ::
Lexical Variables
Lexical variables:
Lexical variables are declared explicitly with the keyword my.
package main;
my $i;                              # A lexical variable
for ( $i = 0 ; $i < 100 ; $i++ )
{
    my $time = localtime();         # A lexical variable
    print "$i at time=$time\n";
}
1 They don’t belong to any package, so you can’t prefix them with a package name.
2 They can only be accessed within the physical boundaries of the code block or file
scope in which they are declared. In the code shown, the variable $time is only
accessible to code physically located in the for loop and not to code appearing before
or after the loop.
3 They usually cease to exist each time the program leaves the code block in which
they were declared. In the example the variable $time ceases to exist at the end of
each iteration of the for loop (it is recreated at the beginning of each iteration of the
loop).
Modules
A Perl module is a text file with a suffix .pm containing some Perl code.
It’s placed in a “standard” place.
You can add to the “standard” places with a use lib; statement.
When the compiler encounters a use statement in a program it searches through
the standard directories, locates the file, and loads the code.
When you have created a module you can control what is visible to a user with the
Exporter module. See the example at the end of this section.
An example of exporting a module interface with symbols follows on the next slide.
An example of exporting a module’s interface with method calls will be shown when
we come to Object Oriented Perl. (Generally Object oriented modules export nothing,
since the whole idea of methods is that Perl finds them for you automatically based
on the type of the object).
An Example Of Building A Module
To build a module called Bestiary, create a file called Bestiary.pm that looks like
this:
package Bestiary;
require Exporter;
$weight = 1024;
1;
use Bestiary;
to be able to access the camel function (but not the weight variable), and:
When you use a module, the module usually makes some variables or functions
available to your program - some symbols are exported from your module. Most
modules use Exporter to do this.
When modules are loaded they must return a TRUE value to indicate that the loading
was successful. This is usually represented by returning the TRUE value as shown on
the last line of the example.
An Example Of Building A Module
require Exporter;           # These two lines make the module inherit from the
our @ISA = ("Exporter");    # Exporter class (described in object-oriented Perl).
# Bestiary can now export symbols into other packages with lines like this.
our @EXPORT = qw($camel %wolf ram); # Export by default
our @EXPORT_OK = qw(leopard @llama $emu); # Export by request
our %EXPORT_TAGS = ( # Export as group
camelids => [qw($camel @llama)],
critters => [qw(ram $camel %wolf)],
);
The first two lines make the module inherit from the Exporter class.
The second set of lines tells Bestiary what it is allowed to export into packages which
use it.
The third set of lines can all be used in any program which uses Bestiary to determine
what is and what is not imported into the current package.
Leaving a symbol off the export lists does not render that symbol inaccessible to the
program using the module. The program will always be able to access the contents of
the module’s package by fully qualifying the package name, like this:
$Bestiary::number_of_lambs;
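To try the export mechanism without creating a separate file, the sketch below puts a cut-down Bestiary package in the same file as the program and calls import by hand. That is an assumption made for this example: with a real Bestiary.pm you would write use Bestiary; instead. The body of camel() is made up.

```perl
use strict;
use warnings;

package Bestiary;
require Exporter;
our @ISA       = ("Exporter");
our @EXPORT    = qw(camel);       # exported by default
our @EXPORT_OK = qw($weight);     # exported only on request
our $weight    = 1024;
sub camel { return "one-hump dromedary" }    # hypothetical body

package main;
Bestiary->import;                 # what "use Bestiary;" does once the file is loaded

print camel(), "\n";              # imported into main by default
print $Bestiary::weight, "\n";    # not imported, but reachable when fully qualified
```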
POD, Special Variables, Internal Perl Functions
Command Line Switches, Perl One-liners
Notes:
POD
=item snazzle
The snazzle() function will behave in the most spectacular form possible
=cut
sub snazzle {
my $arg = shift;
....
}
If you ever download CPAN modules you’ll find that a lot of them have POD
documentation included within the code. This is confusing at first until you realise that
the compiler just skips over all the POD.
Perl ships with tools to convert files containing POD into various printable file formats:
Or
For a complete overview of POD see Chapter 26 of Programming Perl 3rd edition.
This is not an exhaustive list - see Chapter 28 of Programming Perl, 3rd edition.
Items without a short name don’t need the use English; pragma.
Some Perl Functions (By Category)
Scalar manipulation:
chomp, chop, hex, lc, length, oct, reverse, sprintf, substr, tr///, uc, y///.
Regular expressions:
m//, s///, split.
Numeric functions:
abs, atan2, cos, exp, hex, int, log, oct, rand, sin, sqrt, srand.
Array processing:
pop, push, shift, unshift.
Hash processing:
delete, each, exists, keys, values.
Filehandles, files and directories:
chdir, chmod, chown, chroot, link, mkdir, open, opendir, rename, rmdir, stat,
umask, unlink, utime.
Notes:
Some Perl Functions (By Category)
$last_char = chop($var);
chop() always returns the character it removes. If you chop() a list, then every
item in the list is chopped. The thing which ends up in $answer in the question on
the slide is the character which was removed from the string $tmp. The thing you
probably wanted was $tmp.
chomp() returns the number of characters it deleted - not the characters themselves.
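The difference between the two return values can be seen in a short sketch. The variable names and the string "camel" are only illustrative.

```perl
use strict;
use warnings;

my $tmp     = "camel\n";
my $removed = chomp( $tmp );    # $removed is 1 (one newline deleted), $tmp is "camel"

my $answer  = chop( $tmp );     # $answer is "l" (the character removed), $tmp is "came"

print "removed=$removed answer=$answer tmp=$tmp\n";
```

In both cases the string you probably want is left in $tmp, not in the return value.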
Examples Of hex() And oct()
$number = hex("ffff12c0");
sprintf "%lx", $number; # (That's an ell, not a one.)
sprintf uses the same conventions as C’s sprintf.
Note that you can always set the value of any variable with a hex value just by doing
this:
$h_number = 0xffdd;
print $h_number;
The hex() function is interpreting a string as a hex number, not a value. If the string
begins with “0x”, this is ignored. To do a reverse conversion use sprintf() as
shown.
Hex strings can only represent integers. Strings which would cause integer overflow
will trigger a warning.
oct() will interpret a string as an octal value. If the string starts with “0” it will be
interpreted as octal. If the string starts with “0x” it will be interpreted as a hex
value. If it begins with “0b” it will be interpreted as a binary value.
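A few conversions, as a sketch of the rules described above (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $from_hex    = hex( "ff" );       # 255
my $from_prefix = hex( "0xff" );     # 255, the leading 0x is ignored
my $from_octal  = oct( "755" );      # 493
my $oct_as_hex  = oct( "0x1f" );     # 31, because the string starts with 0x
my $oct_as_bin  = oct( "0b101" );    # 5, because the string starts with 0b

# The reverse conversion goes through sprintf.
my $back_to_hex = sprintf( "%x", $from_hex );    # "ff"

print "$from_hex $from_octal $oct_as_hex $oct_as_bin $back_to_hex\n";
```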
Try this:
Field Meaning
%% A percent sign
%s A string
Be careful - sprintf() in Perl does its own formatting - it is NOT calling the
underlying sprintf() function in the C library.
Examples Of sprintf()
Field Meaning
%n A special: stores the number of characters output so far into the next variable in the
argument list.
In addition to the formats on the previous slide, Perl also supports the following
conversions.
%i - a synonym for %d
%D - a synonym for %ld
%U - a synonym for %lu
%O - a synonym for %lo
%F - a synonym for %f
Examples Of sprintf()
Flag Meaning
.number “Precision”: digits after the decimal point for floating-point numbers, maximum length
for a string, minimum length for an integer.
l Interpret integer as a C type long or unsigned long
h Interpret integer as C type short or unsigned short (if no flags are supplied interpret
integer as C type int or unsigned)
Perl allows the following flags between the % and the conversion character.
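The precision and width flags are easiest to understand by trying them. This is a small sketch with made-up values, not a complete tour of the flags.

```perl
use strict;
use warnings;

my $float     = sprintf( "%05.2f", 3.14159 );    # "03.14"  width 5, zero padded, 2 decimals
my $left      = sprintf( "%-6s|", "ab" );        # "ab    |" left-justified in 6 columns
my $truncated = sprintf( "%.3s", "abcdef" );     # "abc"    precision is a maximum for strings
my $min_digit = sprintf( "%.3d", 7 );            # "007"    precision is a minimum for integers
my $octal     = sprintf( "%04o", 493 );          # "0755"

print join( ' ' , $float , $left , $truncated , $min_digit , $octal ), "\n";
```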
Examples Of split()
@chars = split //, $word;
@fields = split /:/, $line;
@words = split " ", $paragraph;
@lines = split /^/, $buffer;
Question: What does this produce?
split /([-,])/, "1-10,20"; # Produces the list (1, '-', 10, ',', 20);
split /(-)|(,)/, "1-10,20"; # Produces the list (1, '-', undef, 10, undef, ',', 20)
Syntax:
split /PATTERN/ , EXPR , LIMIT
split /PATTERN/ , EXPR
split /PATTERN/
split
split() scans a string and splits the string into lots of sub-strings, returning the
resulting list in list context, or the count of sub-strings in scalar context. The
separator is determined by pattern matching using the regular expression given as
part of the split() function - so the separators need not be the same size and need
not be the same string, on every match. Normally the separators are not returned
(but if the pattern contains () then the substring matched by each pair of () IS
included in the resulting list, interspersed with the fields which are normally returned).
If more than one pair of () is used then one substring is returned for each pair (some
may be undef, so be careful).
If the pattern doesn’t match at all then split() returns the original string.
If a limit is supplied then Perl will not return more than that number of sub-strings.
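The behaviours described above can be checked with a short sketch. The input strings are invented for illustration.

```perl
use strict;
use warnings;

my @fields   = split /:/, "root:x:0:0";               # four fields
my @limited  = split /:/, "root:x:0:0", 2;            # ("root", "x:0:0") - never more than 2
my @words    = split " ", "  the quick  brown fox ";  # leading whitespace is skipped
my @chars    = split //, "perl";                      # one element per character
my @captured = split /([-,])/, "1-10,20";             # separators are kept because of the ()
my @no_match = split /;/, "a:b";                      # pattern never matches: ("a:b")

print scalar( @fields ), " fields, ", scalar( @words ), " words\n";
```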
while (<>) {
foreach $word (split) {
$count{$word}++;
}
}
Both examples make use of defaults. In both cases the input text is extracted with
the <> operator and thus the splitting occurs on “$_”.
In the second case split() is passed no string (so it uses “$_”) and no pattern (so it
strips all leading whitespace and then splits on whitespace).
Examples Of stat() And unlink()
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks) = stat $filename;
$mode = (stat($filename))[2];
printf "Permissions are %04o\n", $mode & 07777;
use File::stat;
$sb = stat($filename);
printf "File is %s, size is %s, perm %04o, mtime %s\n",
$filename, $sb->size, $sb->mode & 07777,
scalar localtime $sb->mtime;
The stat() function returns a 13 element list giving statistics for a file. If a file stat
isn’t supported on a particular file system then the corresponding entry will be zero.
See page 801 of “Programming Perl, 3rd edition” for more details.
The unlink() function is used to delete a list of files. The function returns the number
of files which were successfully deleted. BE CAREFUL - this is ‘rm’ in disguise.
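Here is a minimal sketch that uses stat() and unlink() on a scratch file so that nothing important gets deleted. It assumes a Unix-like system (one byte per newline), and the use of File::Temp is my choice for the illustration, not something the slide prescribes.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Create a scratch file containing six bytes.
my ( $fh , $filename ) = tempfile();
print {$fh} "hello\n";
close( $fh ) or die "cannot close $filename: $!";

my $size = ( stat( $filename ) )[7];    # element 7 is the size in bytes
my $mode = ( stat( $filename ) )[2];    # element 2 is the mode
printf "Size is %d bytes, permissions are %04o\n", $size, $mode & 07777;

# unlink() returns the number of files it actually deleted.
my $deleted = unlink( $filename );
print "Deleted $deleted file(s)\n";
```

Always check the value returned by unlink() rather than assuming that every file in the list was removed.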
gmtime And localtime
#  0    1    2     3     4    5     6     7     8
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime;
#  0    1    2     3     4    5     6     7     8
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime;
$thisday = (qw(Sun Mon Tue Wed Thu Fri Sat))[(localtime)[6]];
All elements of the lists returned by gmtime() and localtime() are numeric, so January
is month 0, Sunday is day 0.
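Because the month counts from 0 and the year is the number of years since 1900, both need adjusting before you print them. The sketch below uses gmtime( 0 ), the start of the Unix epoch, so that the result doesn't depend on the clock; it assumes a platform whose epoch is 1 January 1970.

```perl
use strict;
use warnings;

my @day_name = qw( Sun Mon Tue Wed Thu Fri Sat );

my ( $sec , $min , $hour , $mday , $mon , $year , $wday , $yday , $isdst ) = gmtime( 0 );

# $year is years since 1900 and $mon counts from 0, so adjust both.
my $date = sprintf( "%s %04d-%02d-%02d",
                    $day_name[ $wday ], $year + 1900, $mon + 1, $mday );

print "$date\n";
```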
The system() and exec() functions execute any program on your system for you and
return that program’s exit status - not the program’s output. (Strictly, system() returns the
wait status; the exit code is that value shifted right by eight bits.) To capture the output
from a program you must use backticks or qx//.
The difference between the two functions is that system() will do a fork first and then
wait for the executed program to finish. That is, it runs your program for you and
returns when it is done. exec() replaces your running program with the new one,
so it never returns if the replacement succeeds (which makes the return of the exit
status a bit redundant).
See “Programming Perl”, 3rd Edition, page 811, for more details.
In the last example on the slide we use backticks to figure out what our current
directory is. This is an example of how you can capture the output of an external
program - but a bad example: what will happen if you put this script on your
web-page and someone downloads it, only to find that it doesn’t run because their
system doesn’t have a pwd command? The Cwd module is the portable way to find the current directory.
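The difference between capturing output and collecting an exit status can be sketched as follows. To avoid depending on any particular external command, the example runs the Perl interpreter itself (its path is in $^X); it assumes that path contains no spaces, since the backtick form goes through the shell.

```perl
use strict;
use warnings;

my $perl = $^X;    # path of the Perl interpreter running this script

# Backticks capture what the child program prints.
my $output = `$perl -e "print 42"`;

# system() runs the child and waits; the exit code is in the top byte of $?.
system( $perl , '-e' , 'exit 3' );
my $exit_code = $? >> 8;

print "output=$output exit_code=$exit_code\n";
```

The list form of system() used here passes the arguments straight to the program without involving a shell.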
Command Line Switches And Writing Perl One-Liners
The -e switch allows you to write scripts directly on the command line.
Perl one-liners fit the whole of a Perl program onto one line (a command line). See
the accompanying article in the second edition of the Perl Review (contained as a .pdf
file in the Examples directory). Also see the whole of Chapter 19 of “Programming
Perl”, 3rd edition, Pages 486-503 inclusive.
In the second example the pipe operator | takes the output of cat and makes it the
standard input to the Perl program. The diamond operator <> takes lines from
standard input, so this example prints the contents of the file “myfile” and executes
the pattern match shown (which throws away all comments - as long as comments
start with a #).
The third example does the same as the second but uses the file redirection operator
(<).
The fourth example uses the fact that the diamond operator can also open and
redirect the contents of a file specified on a command line. So this example is exactly
equivalent to both examples 2 and 3.
Switch        Effect
-e            Used to enter one or more lines of a script.
-i            Specifies that files processed by <> are to be edited in place, with no backup.
-iEXTENSION   Specifies that files processed by <> are to be edited in place, keeping a
              backup copy of the original named using EXTENSION.
-mMODULE      Loads MODULE as if you had executed a use.
-n            Causes Perl to assume a loop around your code which makes it iterate over
              filename arguments, without printing each line. See Example.
-p            Causes Perl to assume a loop around your code which makes it iterate over
              filename arguments, printing each line. See Example.
Use the -i option with care. It renames the input file, opens an output file with the
original name and then selects that output file for all print, printf and write
statements.
If you use only the -i option then NO BACKUP COPY OF YOUR ORIGINAL FILE IS
MADE. The original file will be overwritten. If you do specify EXTENSION then the
original file is backed up using the extension to supply a new name.
Here’s an example:
perl -p -i'.orig' -e 's/foo/bar/' xyz
This will load the file called xyz, rename the original to xyz.orig as a backup, open a new
version of xyz for output and run the substitution on the original file contents, placing
the result of the substitutions into the new file (still called xyz).
An Example Of A Perl One-Liner
#!/usr/bin/perl
$extension = '.orig';
LINE: while (<>) {
    if ($ARGV ne $oldargv) {
        if ($extension !~ /\*/) {
            $backup = $ARGV . $extension;
        }
        else {
            ($backup = $extension) =~ s/\*/$ARGV/g;
        }
        unless (rename($ARGV, $backup)) {
            warn "cannot rename $ARGV to $backup: $!\n";
            close ARGV;
            next;
        }
        open(ARGVOUT, ">$ARGV");
        select(ARGVOUT);
        $oldargv = $ARGV;
    }
    s/foo/bar/;
}
continue {
    print; # this prints to original filename
}
select(STDOUT);
This does exactly the same as this:
perl -p -i'.orig' -e 's/foo/bar/' xyz
The example from the previous slide is expanded here as the minimum needed to
replace the functionality of the one-liner.
The Perl -n And -p Command Line Switches
The -n switch causes Perl to assume the following loop around your script, which
makes it iterate over the filename arguments much as sed -n or awk do.
LINE:
while (<>) {
... # your script goes here
}
The -p switch causes Perl to assume the following loop around your script, which
makes it iterate over the filename arguments much as sed does.
LINE:
while (<>) {
... # your script goes here
}
continue {
print or die "-p destination: $!\n";
}
In both cases you can use LINE as a loop label from within your script, even though
you can’t actually see it in your file.
With the -n switch, lines are not printed by default. With the -p switch, lines are
printed automatically.
In both cases BEGIN and END blocks may be used to capture control before or after
the implicit loop - just like awk.
Other Perl Command Line Switches
Switch Effect
-c Causes Perl to check the syntax of the script and then exit without executing what has
just been compiled.
-d Runs the script under control of the Perl debugger.
-h Prints a summary of Perl’s command line options.
-T Turns on “taint” checks - an extra form of security useful for running CGI scripts.
-v Prints the version number and patch level of the Perl executable.
-w Prints warnings about variables which are used only once, and variables which are
used before being set. See Chapter 33 of “Programming Perl” 3rd edition.
Everyone should always run Perl with the -w option, either as part of the
command line or, more generally, as part of the #! line:
#!/usr/local/bin/perl -w
There are many more command line switches than those listed. See the whole of
Chapter 19 of “Programming Perl”, 3rd edition for a complete description.
Command Line Arguments etc.
Item Description
ARGV The special filehandle that iterates over command line filenames in @ARGV.
$ARGV Contains the name of the current file when reading from the ARGV handle using <>.
@ARGV The array containing the command-line arguments intended for the script. $#ARGV is
the number of arguments minus one. $ARGV[0] is the first argument, not the
command name. Use scalar @ARGV for the number of program arguments.
@ARG The long name for @_, available with the use English; pragma.
@_ Within a subroutine, this array holds the argument list passed to that subroutine.
Notes:
Adding Command Line Arguments To Your Own Programs
while ( $numargs-- )
{
$next_arg = shift( @$ref_arguments );
SWITCH: {
if ( $next_arg =~ m/^\-i/i ) { $main::infile = shift( @$ref_arguments ); $numargs-- ; last SWITCH; }
if ( $next_arg =~ m/^\-o/i ) { $main::outfile = shift( @$ref_arguments ); $numargs-- ; last SWITCH; }
if ( $next_arg =~ m/^\-d/i ) { $main::debug = TRUE; last SWITCH; }
if ( $next_arg =~ m/^\-/i ) { croak( "Unknown command line switch $next_arg" ); }
}
}
return TRUE;
}
Note that the input arguments are passed via a reference. You should also include some code
to look for something like -h or -help, print out something useful and then exit the
program.
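A complete, runnable version of this idea might look like the sketch below. The name parse_arguments, the option hash and the -h handling are my own choices for illustration; the slide shows only the middle of the original routine, so this is not that routine.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical helper: parse -i FILE, -o FILE, -d and -h from an array reference.
sub parse_arguments
{
    my ( $arguments_r ) = @_;
    my %option = ( infile => undef , outfile => undef , debug => 0 , help => 0 );

    while ( @$arguments_r )
    {
        my $next_arg = shift( @$arguments_r );

        if    ( $next_arg =~ m/^-i/i ) { $option{infile}  = shift( @$arguments_r ); }
        elsif ( $next_arg =~ m/^-o/i ) { $option{outfile} = shift( @$arguments_r ); }
        elsif ( $next_arg =~ m/^-d/i ) { $option{debug}   = 1; }
        elsif ( $next_arg =~ m/^-h/i ) { $option{help}    = 1; }
        elsif ( $next_arg =~ m/^-/ )   { croak( "Unknown command line switch $next_arg" ); }
    }
    return \%option;
}

# A fixed list stands in for \@ARGV so that the example always behaves the same way.
my $option_r = parse_arguments( [ qw( -i in.txt -d -o out.txt ) ] );
printf "Reading %s, writing %s\n", $option_r->{infile}, $option_r->{outfile};
```

In a real script you would call parse_arguments( \@ARGV ) and, if the help flag is set, print a usage message and exit.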
Conclusion
Notes:
1 Introduction
This document presents guidelines for anyone who writes Perl scripts for design support tasks. The
aim is to introduce a common style and understanding for the benefit of anyone who either writes
new programs, or has to debug and/or maintain old ones.
2 Program Structure
Structure your program in the same way you would structure a C program. Have one section of
code that is the equivalent of C’s main(), and as long as the total program size is anything other than
trivially small, put code into subroutines that are called from the main program body.
Don’t structure the top-level of a program in file-scope since any variables declared there are visible
in all following subroutines (even if they’re lexical, or my, variables)– instead create the top-level
of your program as a code block (if you like to think in C terms, even label it as MAIN if this helps you)
and put all code there. Also, don’t use global variables at all (i.e., outside the code block), since this
allows variables to have side-effects in different subroutines. To achieve both of these features
structure your code like this:
#!/usr/local/bin/perl
use strict;
use warnings;
use diagnostics;
MAIN:
{
my $variable_1 = 27;
}
exit;
use strict and use warnings are never optional, while use diagnostics gives readable
error messages that are useful for new users (and old ones).
The loop with the label MAIN: is where the main body of the program is written. A code block like
this is the equivalent of a loop that runs exactly once, but has the feature that all the lexical
variables declared within its scope are restricted to that scope, i.e., subroutine_1 can’t see the
values of any lexical variables like $variable_1 unless they are passed to subroutine_1 as an
argument of a subroutine call to subroutine_1 (which is basically how you’d hope a program
would behave). Also note that the label (MAIN:) is optional, and can be omitted.
Note that subroutine_1 is declared before the main body of the program. This is only needed if
the subroutine definitions follow the main program – if they precede it then the forward declarations
aren’t needed since the declaration is also the definition. Also note that subroutines can optionally
be declared with prototypes (the $$$ in ( $$$ ) which here declares that the subroutine is
expecting three scalar arguments). This check is performed at compile time so there’s no run-time
overhead for doing this.
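Putting those pieces together gives something like the following sketch. The body of subroutine_1 (adding its three arguments) is invented purely so that there is something to run, and the final exit; is left out here only so that the fragment can be pasted into a larger file.

```perl
use strict;
use warnings;

sub subroutine_1($$$);    # forward declaration: the prototype asks for three scalars

MAIN:
{
    my $variable_1 = 27;    # visible only inside this block
    print subroutine_1( $variable_1 , 2 , 3 ), "\n";
}

sub subroutine_1($$$)
{
    my ( $var_1 , $var_2 , $var_3 ) = @_;    # copy the arguments into lexicals
    return $var_1 + $var_2 + $var_3;
}
```

Because $variable_1 is lexical to the MAIN block, subroutine_1 only ever sees the copy that is passed to it.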
If you must use a global variable (you really shouldn’t) then make it explicit that this is what you’re
doing by referring to it as a package variable like this:
#!/usr/local/bin/perl
sub subroutine_1();
$main::count = 56;
MAIN:
{
$main::count = 27;
subroutine_1();
}
exit;
sub subroutine_1()
{
print "The value of count is $main::count\n";
}
Here we’ve declared a global variable called $main::count (it’s a variable named $count in
package main, the default package name, which is why its name is $main::count). This code
prints the value 27 when executed since the initial value of 56 is overwritten in the main body of
code and this is the value seen in subroutine_1 when it is executed. Note that the value of
$main::count wasn’t passed to subroutine_1 as a parameter, but subroutine_1 can still see
its value (it can change its value as well – this is what I mean by having a side-effect).
# Subroutine code goes here. $var_1 etc are private to this code
}
This is a common Perl idiom where all the variables from the @_ array are copied into lexical
variables in the subroutine. This makes those variables local to the subroutine – changing them in
the subroutine will NOT change them in the calling code. This is normally how you would expect
programs to behave.
If you do want a variable in a subroutine to be changed in the calling code then pass the variables
to the subroutine by reference instead. This is done like this:
MAIN:
{
my $a = 56;
subroutine_1( $a );
print "A=$a\n";
}
exit;
sub subroutine_1( $ )
{
$_[0] = 99; # Alter the first element of the @_ array
}
The elements of the @_ array are aliases for the variables in the calling code, so changing the
value of $_[0] will change the variable $a in the example above. Therefore the value printed will
be A=99. This form is not recommended since it’s confusing and inconsistent with normal usage.
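The three possibilities can be compared side by side. This is a sketch with invented subroutine names; the third form, passing an explicit reference with a backslash, at least makes the intention visible in the calling code.

```perl
use strict;
use warnings;

sub by_value     { my ( $n ) = @_;   $n = 99;    return $n; }    # changes a private copy
sub by_alias     {                   $_[0] = 99; return;    }    # changes the caller's variable
sub by_reference { my ( $n_r ) = @_; $$n_r = 99; return;    }    # changes it via an explicit reference

my ( $x , $y , $z ) = ( 56 , 56 , 56 );
by_value( $x );
by_alias( $y );
by_reference( \$z );

print "x=$x y=$y z=$z\n";
```

Only $x keeps its original value of 56.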
MAIN:
{
my @list_1 = qw( Alpha Baker Charlie Delta );
my @list_2 = qw( Zulu Yankee Xray Whisky );
sub subroutine_1( $$ )
{
my ( $list_1_r , $list_2_r ) = @_;
The two arguments (which are themselves scalars) are references to the original lists so the
subroutine can access the individual elements of the lists. Therefore the above example prints out
“Baker Whisky”.
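The middle of the listing above is missing, so here is a sketch of what a complete version could look like. The subroutine name pick_items and the choice of elements 1 and 3 are assumptions, made only so that the output matches the “Baker Whisky” quoted in the text.

```perl
use strict;
use warnings;

sub pick_items
{
    my ( $list_1_r , $list_2_r ) = @_;    # two scalars, each a reference to a list
    return join( ' ' , $list_1_r->[ 1 ] , $list_2_r->[ 3 ] );
}

my @list_1 = qw( Alpha Baker Charlie Delta );
my @list_2 = qw( Zulu Yankee Xray Whisky );

# The backslash passes a reference to each list instead of flattening them into @_.
my $result = pick_items( \@list_1 , \@list_2 );
print "$result\n";
```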
MAIN:
{
my @list = subroutine_1();
my $scalar = subroutine_1();
}
exit;
sub subroutine_1()
{
if ( wantarray )
{
return qw( one two three four five );
}
else
{
return( "once I caught a fish alive\n" );
}
}
The first call to subroutine_1 is in list context (the calling program expects a list to be returned).
In subroutine_1 the wantarray function is evaluated and for this first call it will be TRUE,
therefore subroutine_1 sends back a list of five things (the textual representation of the numbers
one to five inclusive). The second call to subroutine_1 is in scalar context (the calling program
expects a single thing to be returned). Now when the wantarray function is evaluated it is FALSE, so a single
thing is returned (a string consisting of the text “once I caught a fish alive”).
Note that you can also return information from a subroutine that is expected to be interpreted as a
hash. If this is true then you should make sure that you return an even number of scalars (each pair
of scalars will be used as a key/value pair in the resulting hash).
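Both ideas, returning by context and returning key/value pairs, fit in a short sketch. The subroutine names fish and settings and the hash contents are invented for illustration.

```perl
use strict;
use warnings;

sub fish
{
    # wantarray is true in list context and false in scalar context.
    return wantarray ? qw( one two three four five )
                     : "once I caught a fish alive";
}

sub settings
{
    return ( colour => 'red' , size => 'large' );    # an even number of scalars
}

my @list   = fish();        # list context
my $scalar = fish();        # scalar context
my %config = settings();    # pairs become keys and values

print scalar( @list ), " items; $scalar; colour is $config{colour}\n";
```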
MAIN:
{
my @values = qw ( 6.32 7.88 9.54 12.83 17.99 31.36 18.25 );
my ( $mean , $median , $mode , $variance ) = statistics( @values );
exit;
sub statistics
{
# Code to compute mean, median, mode, variance
The subroutine returns four scalar values in a list, and the calling code assigns that list
to four scalar variables in the same order.
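The body of statistics isn't shown, so the following is one possible implementation rather than the original. It assumes a non-empty list of numbers, uses the population variance, and breaks ties for the mode by taking the smallest value; the sample data is chosen so that the answers are easy to check by hand.

```perl
use strict;
use warnings;
use List::Util qw( sum );

sub statistics
{
    my @sorted = sort { $a <=> $b } @_;
    my $count  = @sorted;

    my $mean   = sum( @sorted ) / $count;

    my $middle = int( $count / 2 );
    my $median = $count % 2
               ? $sorted[ $middle ]
               : ( $sorted[ $middle - 1 ] + $sorted[ $middle ] ) / 2;

    # Count how often each value occurs, then take the most frequent one.
    my %seen;
    $seen{$_}++ foreach @sorted;
    my ( $mode ) = sort { $seen{$b} <=> $seen{$a} || $a <=> $b } keys %seen;

    my $variance = sum( map { ( $_ - $mean ) ** 2 } @sorted ) / $count;

    return ( $mean , $median , $mode , $variance );
}

my ( $mean , $median , $mode , $variance ) = statistics( 2 , 4 , 4 , 4 , 5 , 5 , 7 , 9 );
print "mean=$mean median=$median mode=$mode variance=$variance\n";
```

The order of the variables on the left of the assignment has to match the order in the return statement; Perl won't check this for you.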
MAIN:
{
my $tmp;
exit;
BEGIN
{
my $count_value = 0;
sub count()
{
$count_value++;
return $count_value;
}
}
Place the subroutine definition(s) in a code block (subroutines are visible from everywhere
regardless of how you “hide” them). The lexical variable $count_value is locally scoped to the
July 31, 2005 5 / 18
code block it’s defined in and is therefore available to the subroutine count(). However, while
normally a lexical variable will be destroyed once a code block finishes execution, in this case the
compiler arranges for it to continue to exist since something is still referring to it (in technical terms
the subroutine count() has incremented $count_value’s reference count, and that stops Perl
from destroying it).
The only problem is how to get an initial value of zero into the value of $count_value. This is
done by placing all the code in a BEGIN block. Perl guarantees to execute all BEGIN blocks as soon
as they are compiled, thus ensuring that the single line of code “my $count_value = 0” is
executed before any call to the subroutine is made. The above code therefore prints out Tmp = 1
followed by Tmp = 2.
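The listing above has lost the two calls to count(), so here is a complete sketch of the same idea. The prototype is dropped from count so that it can be called before its definition without a warning, and the final exit; is omitted so the fragment can sit inside a larger file.

```perl
use strict;
use warnings;

MAIN:
{
    my $tmp;
    $tmp = count();
    print "Tmp = $tmp\n";
    $tmp = count();
    print "Tmp = $tmp\n";
}

BEGIN
{
    my $count_value = 0;    # initialised at compile time, before MAIN runs

    sub count
    {
        $count_value++;     # the sub keeps $count_value alive between calls
        return $count_value;
    }
}
```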
Of course, there’s no reason why several subroutines cannot share a variable in this way to provide
a globally accessed variable that cannot suffer from unintended side-effects. Here’s how:
MAIN:
{
my $tmp;
initialize( 37 );
exit;
BEGIN
{
my $value = 0;
sub initialize( $ )
{
$value = shift @_;
}
sub increment()
{
$value++; return $value;
}
sub decrement()
{
$value--; return $value;
}
}
This is a very secure way to create something that can be accessed from anywhere in a controlled
and predictable manner. The variable $value is secure from any unintended side-effects (or even
intended ones) and can be initialized/incremented/decremented from anywhere (you could of course
also add a read subroutine to just return the value). We’ve almost strayed into OO land here since
we’ve created something that is encapsulated (the variable value) and can only be accessed via
subroutine calls (equivalent of OO methods).
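A complete version, including the read subroutine suggested above, could look like this sketch. The name current_value is invented, and the prototypes are dropped so the subs can be called before their definitions without warnings.

```perl
use strict;
use warnings;

MAIN:
{
    initialize( 37 );
    increment();
    increment();
    decrement();
    print "Value = ", current_value(), "\n";
}

BEGIN
{
    my $value = 0;    # shared by the four subs below and by nothing else

    sub initialize    { $value = shift @_; return $value; }
    sub increment     { $value++;          return $value; }
    sub decrement     { $value--;          return $value; }
    sub current_value {                    return $value; }
}
```

Nothing outside the BEGIN block can name $value, so the only way to change it is through these four subroutines.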
SWITCH:
{
if ( $condition == TRUE)
{
# Run some code
next SWITCH;
}
if ( $some_other_condition == TRUE)
{
# Run some other code
last SWITCH;
}
Here, SWITCH is a label (so each switch statement needs a different label and this is a drawback)
while the last SWITCH piece of code is the equivalent of C’s break. Since a bare block is a loop that
runs exactly once, next SWITCH also leaves the block rather than repeating it; if you really do want to run the block again, use redo SWITCH.
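Here is the idiom in a runnable form. The subroutine classify and its two tests are invented for illustration; the final value 'other' plays the part of C’s default clause.

```perl
use strict;
use warnings;

sub classify
{
    my ( $value ) = @_;
    my $kind = 'other';    # the "default" case

    SWITCH:
    {
        if ( $value =~ m/^\d+$/ )     { $kind = 'integer'; last SWITCH; }
        if ( $value =~ m/^[a-z]+$/i ) { $kind = 'word';    last SWITCH; }
    }
    return $kind;
}

print classify( '42' ), ' ', classify( 'camel' ), ' ', classify( '4 2' ), "\n";
```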
OUTER:
{
foreach my $item ( @item_list )
{
INNER:
{
foreach my $object ( @object_list )
{
# Code
# Code
/design/rmc/tools/
/design/rmc/tools/Perl_Modules/tool/dev/
and in both cases release them. Don’t forget to write documentation, ideally as POD (Perl has
translators to generate man pages, html and PDF). Don’t reinvent the wheel.
Since a lot of what we do involves reading and parsing files, and then writing some new file(s), use
Netlist_Tools.pm in the Perl_Modules directory. These routines are debugged and work quite
happily with files that are gigabytes in size and they’ll transparently gunzip any files that are
gzipped even if you don’t know they’re gzipped. Don’t reinvent the wheel.
Also, before you write a mega-thingy widget that will revolutionize human-kind, look on CPAN
just in case someone else has beaten you to it (they probably have)! Don’t reinvent the wheel.
If you’re writing code that makes several different tests on some data, put the most common tests
before the less common ones. For example, if you’re testing a string in a loop like this:
next;
}
}
I’ve rendered it in a small font size to illustrate a point: the formatting has been preserved exactly as
it was written, and this is a small fragment of a much larger code-base of well over 5000 lines of
code just like this. And my point? I absolutely guarantee to you that one week after the above code
was written, the original author will not know all the nuances that went into its authorship.
Any debugging exercise will be very difficult for that author, let alone someone who comes fresh to
the task with responsibility to maintain this code once the originator has moved on.
my $lef_filename = undef;
my $log_filename = undef;
my $default_log_filename = "lefPortStrip.log";
my $pin_names_r = [];
my $layer_names_r = [];
my $lef_filename         = undef;
my $log_filename         = undef;
my $default_log_filename = "lefPortStrip.log";
my $pin_names_r          = [];
my $layer_names_r        = [];
run_lef_import( $lef_filename ,
                $log_filename ,
                $default_log_filename ,
                $pin_names_r ,
                $layer_names_r );
Put the opening curly brace on the line after a keyword and lined up with the start of the keyword.
A one-line BLOCK may be put on one line, including left- and right-brace.
Don’t omit the semicolon in a one-line BLOCK even though you can (in the above example it’s the
semicolon after the “E” in FALSE). At some point it’s a certainty that you’ll change that one line
block to a multi-line block by adding new commands. At that point the semicolon is needed and
you’ll have to add it anyway.
Don’t put space before the semicolon after a statement. Do put space both before and after a “,”
when separating parameters and list items.
Don’t put space between a function name and its opening parenthesis.
MAIN:
{
my $radius = 2.0;
my $area = PI * $radius * $radius;
}
$array_r->[ 56 ] = PI;
While in the following example it should be obvious that something has gone wrong because the
dereference operator is not being used on a reference (the _r is missing).
my $number = 56;
$number->[ 0 ] = get_random_integer();
foreach ( @l )
{
Which doesn’t tell you much about what’s going on and why, whereas the far more readable:
tells you exactly what was/is intended. This will be clearer to others when they read your code
and will be clearer to you when you come back to debug your code in a year’s time.
And here are two examples of how you should not use them:
Name variables using my (i.e., use lexical variables). Never use global variables and don’t be
tempted in the heat of debugging to insert just one or two to get around a problem.
When in doubt use parentheses. Just because you can omit them doesn’t mean you should omit
them.
If your program is running for more than a few seconds, give your users some feedback. If you’re
programming a GUI in PerlTk, use a progress bar.
If your program is a command line driven program then always program a -help parameter to give
users some idea of what the program does and what to type. Make the invocation of the program
with no parameters display some help information. Give a user the option to get more help with a
-help parameter.
Allow default options. Make sure a user knows what they are, when he/she asks for help.
Make error messages clear so a user knows what to fix when things don’t run the way they expect.
Since many programs are often chained together or are run within a single controlling program,
make sure all scripts return an error or success code. The exit code for success is always 0 (zero). If
programs are designed to be chained together in a shell script, then follow the Unix philosophy of
having programs that complete successfully return no output at all (i.e., they are silent).
# Exit codes :
exit( EXIT_OKAY );
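One way to keep exit codes readable is to name them with constants, as in this sketch. EXIT_ERROR, its value of 1 and the check_input subroutine are assumptions for illustration; only EXIT_OKAY appears in the original fragment.

```perl
use strict;
use warnings;
use constant EXIT_OKAY  => 0;    # success is always zero
use constant EXIT_ERROR => 1;    # hypothetical code for "input not found"

sub check_input
{
    my ( $filename ) = @_;
    return EXIT_ERROR unless defined $filename && -e $filename;
    return EXIT_OKAY;
}

my $status = check_input( '.' );    # the current directory always exists
print "Status is $status\n";

# A real program would finish with: exit( $status );
```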
Always return a value from both your program and any subroutines in that program. If you don’t
use an explicit return statement then the value returned is the result of the last statement evaluated. This
will change as you modify your code, and in particular since most code is added at the end of a
program, the return value from what you’re currently writing will be changing what is seen by
whatever wrapper is running your code.
If it’s vital that your code not return a value, because, say, you want to indicate that an error
occurred but it wasn’t a fatal error, then return undef. In Perl undef is a value that represents not
defined.
When you write Modules, remember that a module must always return a value of TRUE, so the last
line of a Module should look like this.
1;
4 Testing
If your code is destined to be used by others then you must test it. In particular keep a directory or
folder with files that are read by your code, and write some scripts to run common cases. When you
add new features or debug problems, make sure all the old tests are run so that you can prove that
the modifications or additions haven’t caused unintended side-effects that cause old code to stop
working correctly (in computer science parlance this is called regression testing).
5 Traps For The Unwary (Or, Things That Catch Everyone Out Eventually)
Remember to use == for numeric tests and eq for string tests. Don’t fall into the C trap of using =
(assignment) when you mean == (comparison).
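The difference matters because the same two values can be equal as numbers and different as strings. A short sketch:

```perl
use strict;
use warnings;

my $numerically_equal = ( "1.0" == 1 )   ? 1 : 0;    # 1: both sides are the number one
my $stringwise_equal  = ( "1.0" eq "1" ) ? 1 : 0;    # 0: the strings differ

print "== gives $numerically_equal, eq gives $stringwise_equal\n";
```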
use warnings;
use strict;
use diagnostics;
All arrays count from 0, not 1. An array of size 20 has elements [0] to [19] inclusive. There isn’t an
array item [20].
Hashes have no order, so you can’t rely on the order in which for or foreach visits their contents. You also can’t index into them
with []. If you need to iterate over a hash you’ll need to use keys, values or each.
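If you need a predictable order, sort the keys yourself. The hash contents below are invented for illustration.

```perl
use strict;
use warnings;

my %lambs = ( mary => 1 , bo_peep => 12 , shepherd => 40 );

# Sorting the keys gives a repeatable order; the hash itself has none.
my @report;
foreach my $owner ( sort keys %lambs )
{
    push( @report , "$owner=$lambs{$owner}" );
}

# values gives every value, again in no particular order.
my $total = 0;
$total += $_ foreach values %lambs;

print join( ', ' , @report ), " (total $total)\n";
```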
Second solution: Write your own routine. Here’s a template for it:
my $numargs = @$arguments_r;
my $argument = undef;
my $next_arg = undef;
my $input_filename = undef;
my $output_filename = undef;
my $print_flag = FALSE;
while ( $numargs-- )
{
$next_arg = shift( @$arguments_r );
SWITCH:
{
if ( $next_arg =~ m/^\-input/i )
{
$input_filename = shift( @$arguments_r ); $numargs-- ;
last SWITCH;
}
if ( $next_arg =~ m/^\-output/i )
{
$output_filename = shift( @$arguments_r ); $numargs--;
last SWITCH;
}
if ( $next_arg =~ m/^\-print_flag/i )
{
$print_flag = TRUE;
last SWITCH;
}
if ( $next_arg =~ m/^\-/i )
{
croak( "Unknown command line switch $next_arg" ); }
my ( $input_filename ,
$output_filename ,
$print_flag ) = Parse_Command_Line_Arguments( \@ARGV );
#!/usr/local/bin/perl
use strict;
use warnings;
use diagnostics;
use Carp;
use Cwd;
use Config;
use Netlist_Tools;
MAIN:
{
my $file_r = Read_File( "BigFile.txt" );
This code will load one of (in this order) BigFile.txt, BigFile.txt.gz,
BigFile.txt.gzip. If you specify an output filename in Write_File that is suffixed in either
.gz or .gzip then the file will be compressed (with gzip) before it is written.
A major advantage of Read_File is that not only will it transparently read in the file via gzip if
necessary, all the lines are then formatted so that every line is in a list that can be iterated, and every
line is guaranteed to have no white-space before the first non-white-space character. There will also
be no white-space at the end of the line and all “words” on a line will be separated by exactly one
space.
If, alternatively, you want to create a new file based on some or all of the contents of an input file,
you can re-write the body of the code in the previous program like this:
use Netlist_Tools;
MAIN:
{
my $in_file_r = Read_File( "BigFile.txt" );
my $out_file_r = [];
exit 0;
}
This will write out the contents of a list (@$out_file_r) which you build up piece-meal based on
some or all of what you read from the original input file.
# This example shows how to use ELDO for which we have 4 licenses.
# We’ll limit ourselves to use 2 of them. While we’re limiting ourselves
# here because of scarce license resource, the same code can be used to
# stop queues being flooded with jobs that are pending but consuming
# queue slots (and making yourself pretty damn unpopular).
File_2.cir
.
.
.
File_98.cir
File_99.cir )
{
# Test the queue
system( $command );
exit;
Style
September 2005
A Standard Header
There are other binary invocations that use “eval” with some “magic”.
The magic #! line works for all machines on-site, regardless of whether they are
SunOS (Solaris) or Linux based.
We always use strict and warnings. Diagnostics are useful for less experienced
programmers but if omitted can be added on a command line invocation with
-Mdiagnostics.
Carp is the standard blame shifter (makes errors show up in client code rather than in
your code). Cwd is a platform independent way of finding the current working directory.
Config is used to allow programs to transparently load precompiled code (C, C++ etc.)
on different binary platforms.
FindBin allows a program to find out from what directory it is being run and to add that
directory to Perl’s path.
Structure your program in the same way you would structure a C program.
#!/usr/local/bin/perl
MAIN:
{
my $variable_1 = 27;
# Main Program: program code - equivalent of C's main()
}
exit; # Exit
By placing all the code for your program into subroutines and one top-level code block
(here called “Main Program”), we can enforce the scope of all variable declarations
and reduce or eliminate side-effects. Note that the top-level code block is headed by a
label (MAIN:) but this is optional, and the name of the block can be anything (I’ve called
it MAIN to lull C programmers into a false sense of security).
If You Must Use Global variables
MAIN:
{
$main::count = 27; # Write
subroutine_1();
}
exit;
sub subroutine_1()
{
print "The value of count is $main::count\n"; # Read
}
There really isn’t any good reason to use global variables in the sense shown above.
The problem is that the global variable is seen by all the subroutines that follow it
because its scope is file scope. Therefore any subroutine can modify it and cause
other subroutines that also see the variable to change their behavior - this isn’t usually
what is intended.
Subroutine Parameters - I
# The Right Way (Value)
sub subroutine_1( $$$ )
{
my ( $var_1 , $var_2 , $var_3 ) = @_;
# Subroutine code goes here. $var_1 etc are private to this code
}
# The Wrong Way (Reference)
MAIN:
{
my $a = 56;
subroutine_1( $a );
print "A=$a\n";
}
exit;
sub subroutine_1( $ )
{
$_[0] = 99; # Alter the first element of the @_ array
}
Of the ways to pass parameters to a subroutine, the best way (the correct way) is to
pass them by value. This is done by copying all the parameters into local variables
(lexical variables) at the start of the subroutine. Make this the first thing that any
subroutine does. Then, if you change the value of any of the variables then it doesn’t
affect the value of that variable in the code that called your subroutine. If you do want
to change the value of one of the input parameters then you can pass by reference
(option 1), or you can return a new value for the variable as a return value from the
subroutine and assign it back to the corresponding variable in the calling code (option
2). Option 1 corresponds to the way you might choose to do this in C. Option 2 is the
correct way to do this in Perl. Note: option 1 and option 2 DO NOT refer to the two
sections of code above.
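As a minimal sketch of option 2, the code below copies its parameter into a lexical, works on the copy, and hands the new value back for the caller to assign. The subroutine name add_ten and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical subroutine: works on a private copy and returns the new value
sub add_ten( $ )
{
    my ( $number ) = @_;      # Copy the parameter into a lexical
    $number = $number + 10;   # Changes only the private copy
    return $number;
}

my $original = 100;
my $result   = add_ten( $original );   # $original is untouched here
my $counter  = 5;
$counter     = add_ten( $counter );    # Option 2: assign the result back

print "original=$original result=$result counter=$counter\n";
```

The caller decides whether the variable changes, so nothing is altered behind its back.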
Subroutine Parameters - II
exit;
sub subroutine_1( $$ )
{
    # Copy To Lexicals
    my ( $list_1_r , $list_2_r ) = @_;
Note a subtlety: We’re passing references here to make our program fast. If the lists
that the references point to are large, then we don’t end up copying those large lists
via the stack. We localise the references into subroutine_1 with the my statement, but
we can still change any value in the lists that the references point to, by simply running
the code as shown with $list_1_r->[ 1 ] on the left-hand side of an assignment. In this
respect we’ve exactly emulated C where we’ve called a subroutine with a const pointer
- you can’t change the pointer but you can change the thing it’s pointing at. We’ve also
violated our “Option 2” rule from the earlier slide “Subroutine Parameters - I”.
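The effect described above can be sketched as follows. The subroutine name set_second_element and the sample data are assumptions made for this example. Re-pointing the private copy of the reference has no effect on the caller, but assigning through it does.

```perl
use strict;
use warnings;

# Hypothetical subroutine: receives a list reference and a new value
sub set_second_element( $$ )
{
    my ( $list_1_r , $new_value ) = @_;   # Copies the reference, not the list
    $list_1_r->[ 1 ] = $new_value;        # Alters the caller's list
    $list_1_r = [];                       # Only re-points the private copy
    return;
}

my @numbers   = ( 10 , 20 , 30 );
my $numbers_r = \@numbers;
set_second_element( $numbers_r , 99 );
print "Numbers: @numbers\n";
```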
Returning Results From Subroutines - I
exit;
sub subroutine_1( $$ )
{
    if ( wantarray )
    {
        return qw( one two three four five );
    }
    else
    {
        return( "once I caught a fish alive\n" );
    }
}
Subroutines can return data in context, that is, subroutines can be made to know how
they were called: in list context or in scalar context.
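A small sketch of calling such a subroutine in both contexts. The name fish_rhyme is hypothetical; the body follows the slide.

```perl
use strict;
use warnings;

sub fish_rhyme
{
    if ( wantarray )
    {
        return qw( one two three four five );
    }
    else
    {
        return( "once I caught a fish alive\n" );
    }
}

my @words    = fish_rhyme();   # List context: wantarray is true
my $sentence = fish_rhyme();   # Scalar context: wantarray is false

print "List context returned " . scalar( @words ) . " words\n";
print "Scalar context returned: $sentence";
```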
Returning Results From Subroutines - II
exit;
sub statistics
{
# Code to compute mean, median, mode, variance
If you want to return more than one thing from a subroutine, then return a list. You can
then assign that list to another list in the calling code. Note that this can be error prone
(you need to get the right number and order of variables with no language assistance).
You could return a hash with named results, but you then run the risk (especially if you
pass subroutine parameters in a hash as well) of turning each subroutine call into
something with more overhead than code.
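A minimal sketch of returning a list and assigning it to a list of lexicals in the caller. The subroutine simple_statistics is a hypothetical, cut-down stand-in (mean, minimum, maximum only) and not the statistics subroutine from the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: returns ( mean , minimum , maximum ) in that order
sub simple_statistics
{
    my ( @values ) = @_;
    my ( $sum , $minimum , $maximum ) = ( 0 , $values[ 0 ] , $values[ 0 ] );
    foreach my $value ( @values )
    {
        $sum += $value;
        $minimum = $value if ( $value < $minimum );
        $maximum = $value if ( $value > $maximum );
    }
    return ( $sum / scalar( @values ) , $minimum , $maximum );
}

# The caller must list the variables in the same order as the return statement
my ( $mean , $minimum , $maximum ) = simple_statistics( 2 , 4 , 6 , 8 );
print "mean=$mean min=$minimum max=$maximum\n";
```

Swapping two names on the left-hand side would still run without complaint, which is the error-prone part the notes warn about.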
The Equivalent Of C Static Variables - I
exit;
BEGIN
{
my $count_value = 0;
sub count()
{
$count_value++;
return $count_value;
}
}
Subroutine names are globally visible, so even though count() is buried one level down
everything/anything that needs to call it can do so. However, with code written as
shown, count() can access the variable named $count_value but nothing else in the
program can. It’s a lexical variable and not a package variable (so you can’t say
$main::count_value because that isn’t the way to access this particular variable) and
the fact that count() is referring to it will make sure that perl keeps its reference count
non-zero (so it is persistent and exists for the lifetime of the program). As long as we
make the block in which it is defined a BEGIN block then it will be initialised by Perl
before any of your code starts to run.
The Equivalent Of C Static Variables - II
initialize( 37 );
exit;
BEGIN
{
my $value = 0;
This is a very secure way to create something that can be accessed from anywhere in
a controlled and predictable manner. The variable $value is secure from any
unintended side-effects (or even intended ones) and can be
initialized/incremented/decremented from anywhere (you could of course also add a
read subroutine to just return the value). We’ve almost strayed into OO land here since
we’ve created something that is encapsulated (the variable value) and can only be
accessed via subroutine calls (equivalent of OO methods).
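The slide's code is only partly reproduced here, so the following is a sketch of the idea rather than the original. initialize() appears on the slide; increment(), decrement() and read_value() are assumed names for the accessors the notes describe.

```perl
use strict;
use warnings;

BEGIN
{
    my $value = 0;    # Visible only to the subroutines in this block

    sub initialize( $ )
    {
        my ( $new_value ) = @_;
        $value = $new_value;
        return $value;
    }
    sub increment()  { $value++; return $value; }
    sub decrement()  { $value--; return $value; }
    sub read_value() { return $value; }
}

initialize( 37 );
increment();
increment();
decrement();
my $current_value = read_value();
print "Value is $current_value\n";
```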
Implementing A SWITCH Statement
next SWITCH;
}
if ( $some_other_condition == TRUE)
{
# Run some other code
last SWITCH;
}
Here, SWITCH is a label (so each switch statement needs a different label and this is
a drawback) while the last SWITCH piece of code is the equivalent of C’s break. Since
this is a loop, you can repeat it with next (all clauses except the last) , and end it with
last (the last clause only).
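The slide's code is truncated above, so here is a self-contained sketch of the idiom using a labelled bare block, where last SWITCH leaves the block. The subroutine classify and its tests are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical classifier built on the labelled-block idiom
sub classify( $ )
{
    my ( $line ) = @_;
    my $kind = undef;

    SWITCH:
    {
        if ( $line =~ m/^#/ )
        {
            $kind = "comment";
            last SWITCH;          # Equivalent of C's break
        }
        if ( $line =~ m/^\s*$/ )
        {
            $kind = "blank";
            last SWITCH;
        }
        $kind = "code";           # Equivalent of C's default
    }
    return $kind;
}

my $first  = classify( "# a note" );
my $second = classify( "" );
my $third  = classify( "exit;" );
print "$first $second $third\n";
```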
Labels - Use Them/Don’t Use Them
# Code
Use labels to be explicit about where the commands next and last transfer you (and
goto, but you’re never going to use goto, are you!).
If you use labels it is always clear where you are transferring control to, but it is never clear at
the transfer point (i.e., the actual label) where transfer of control has come from, and this
makes it very hard to debug code – next and last with labels are just synonyms for goto (and
you’re never going to use goto, are you!) On balance, use labels for SWITCH and one level
loop operations.
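A short sketch of a one-level loop where the label makes the target of next and last explicit. The label LINE and the sample data are assumptions.

```perl
use strict;
use warnings;

my @lines      = ( "alpha" , "# comment" , "beta" , "END" , "gamma" );
my @kept_lines = ();

LINE:
foreach my $line ( @lines )
{
    next LINE if ( $line =~ m/^#/ );    # Skip comments
    last LINE if ( $line eq "END" );    # Stop at the end marker
    push( @kept_lines , $line );
}
print "Kept: @kept_lines\n";
```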
Writing Efficient, Maintainable And Useable Code - I
If you’re writing code that makes several different tests on some data, put the most common
tests before the less common ones.
If you run the code on the slide with a file containing 10 million lines, of which 99.99% are
neither comments, nor blank, nor start with white-space, then you’ll end up executing
approximately 40 million tests. If you put the bottom-most test (the test for lines without
leading white-space) first, then the same code will execute only about 10 million tests.
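The arithmetic can be checked on a small scale. This sketch is not the slide's code: the four regular expressions, the 1000-line sample and the helper count_tests are assumptions chosen so that exactly one test matches each line.

```perl
use strict;
use warnings;

# Hypothetical data: 1 comment, 1 blank, 1 indented and 997 ordinary lines
my @lines = ( "# comment" , "" , "  indented" , ( "ordinary line" ) x 997 );

# Counts how many tests run before each line is recognised
sub count_tests
{
    my ( $lines_r , $tests_r ) = @_;
    my $tests_run = 0;
    foreach my $line ( @$lines_r )
    {
        foreach my $test ( @$tests_r )
        {
            $tests_run++;
            last if ( $line =~ $test );
        }
    }
    return $tests_run;
}

my @rare_first   = ( qr/^#/ , qr/^$/ , qr/^\s/ , qr/^[^#\s]/ );
my @common_first = ( qr/^[^#\s]/ , qr/^#/ , qr/^$/ , qr/^\s/ );

my $rare_first_count   = count_tests( \@lines , \@rare_first );
my $common_first_count = count_tests( \@lines , \@common_first );
print "Rare tests first: $rare_first_count tests\n";
print "Common test first: $common_first_count tests\n";
```

With this data the rare-first ordering runs 3994 tests and the common-first ordering runs 1006, roughly the four-to-one ratio described in the notes.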
Hints For Readable Code - I
But this is …
my $lef_filename         = undef;
my $log_filename         = undef;
my $default_log_filename = "lefPortStrip.log";
my $pin_names_r          = [];
my $layer_names_r        = [];
run_lef_import( $lef_filename         ,
                $log_filename         ,
                $default_log_filename ,
                $pin_names_r          ,
                $layer_names_r        );
Note how much easier it would be to spot the missing opening quote and the missing $
sign on lines 3 and 8 of the upper example.
Hints For Readable Code - II
Do put space both before and after a “,” when separating parameters and list items.
Do put space around most (all) operators.
Do put space around complicated subscripting code.
Don’t forget the semicolon in the one-line block case (the semicolon after the E in
FALSE). It is optional, but it shouldn’t be.
Hints For Readable Code - III
than this:
if ( ( $day == 6 ) &&
     ( $full_moon == 1 ) &&
     ( $spring_equinox == 1 ) )
{
    print "It's Easter Sunday\n";
}
pragma.
For example:
$array_r->[ 56 ] = PI;
my $number = 56;
$number->[ 0 ] = get_random_integer();
If your code uses references, make sure that the variable names that are used are tagged with
something that makes it obvious they’re references, like _r. If you do this consistently it then
becomes obvious when you try to use something that is/is not a reference in a dereference
operation. For example, in the code above it’s obvious that you should only be using the
dereference operator (the ->) on a reference.
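A sketch of why the naming convention helps, assuming the program runs under use strict. The variable names are illustrative. Dereferencing the plain scalar is caught at run time, and the missing _r suffix makes the mistake visible before that.

```perl
use strict;
use warnings;

my $scores_r = [ 10 , 20 , 30 ];   # _r suffix: this is a reference
my $number   = 56;                 # No suffix: this is a plain scalar

$scores_r->[ 0 ] = 11;             # Fine: dereferencing a reference

# Dereferencing a plain number is a run-time error under use strict,
# so trap it with eval to keep this example running
my $succeeded = eval { $number->[ 0 ] = 1; 1; };
print "Dereferencing \$number failed: $@" if ( ! $succeeded );
```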
Avoid Using Default Values
foreach ( @_ )
{
print; # By default this statement will print $_
}
When using a loop construct like foreach, don’t use the defaults allowed by Perl. It is
allowable to say remarkably little, but that doesn’t tell the reader much about what’s going on
and why. The second example tells you exactly what was/is intended.
Using default values leads to concise code that can be very difficult to read (even if
*you* wrote it). Don’t assume that your code will be debugged by you or that the
person debugging it will know what all the default values are. Keep it clear. Keep it
simple. It’s not the obfuscated Perl contest.
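As an illustration of the explicit style (a sketch, not the slide's own second example), the loop below names its loop variable and says what it prints. The subroutine print_all_names is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical subroutine: prints every name and returns how many were printed
sub print_all_names
{
    my ( @names ) = @_;
    my $printed = 0;
    foreach my $name ( @names )
    {
        print( "$name\n" );    # Says exactly what is printed
        $printed++;
    }
    return $printed;
}

my $names_printed = print_all_names( "Larry" , "Randal" , "Tom" );
```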
Distinguish Between For And Foreach
The Perl keywords, for and foreach are synonyms, so you can use either one to index through
lists or index through values. However, you will confuse others if you use them the wrong way
around (foreach with an index or for with a list).
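A brief sketch of the convention, with made-up data: foreach for walking a list, for when counting through index values.

```perl
use strict;
use warnings;

my @layer_names = ( "metal1" , "metal2" , "via1" );
my @report      = ();

# foreach: walk the items of a list
foreach my $layer_name ( @layer_names )
{
    push( @report , "Layer: $layer_name" );
}

# for: count through index values, C style
for ( my $index = 0 ; $index < scalar( @layer_names ) ; $index++ )
{
    my $layer_name = $layer_names[ $index ];
    push( @report , "Layer $index is $layer_name" );
}

foreach my $report_line ( @report )
{
    print "$report_line\n";
}
```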
Use Common Sense - I
Use meaningful variable and subroutine names. Don’t use variables with the names $a and
$b. See the man page for sort() to understand why.
Declare variables using my (i.e., use lexical variables). Never use global variables and don’t
be tempted in the heat of debugging to insert just one or two to get around a problem.
Use lots of comments. You’ll be amazed how quickly you’ll forget just what it was you were
trying to express in your code a day, a week, a month, a year ago.
When in doubt use parentheses. Just because you can omit them doesn’t mean you should
omit them.
If your program is running for more than a few seconds, give your users some feedback. If
you’re programming a GUI in PerlTk, use a progress bar.
If your program is a command line driven program then always program a -help parameter to
give users some idea of what the program does and what to type. Make the invocation of the
program with no parameters display some help information, and give the user the option to
get more help with the -help parameter.
Make error messages clear so a user knows what to fix when things don’t run the way they
expect.
Use Common Sense - II
exit( EXIT_OKAY );
Since many programs are chained together or are run within a single controlling
program, make sure all scripts return an error or success code. The exit code for success is
always 0 (zero). If programs are designed to be chained together in a shell script, then follow
the Unix philosophy of having programs that complete successfully return no output at all (i.e.,
they are silent).
Always return a value from both your program and any subroutines in that program. If you
don’t use an explicit return statement then the value returned is the result of the last statement
evaluated. This will change as you modify your code, and in particular, since most code is
added at the end of a program, the return value from what you’re currently writing will
change what is seen by whatever wrapper is running your code.
If it’s vital that your code not return a value, because, say, you want to indicate that an error
occurred but it wasn’t a fatal error, then return undef. In Perl undef is a value that
represents not defined.
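A sketch of both points: what a wrapper sees when a script exits, and a subroutine with an explicit return on every path. EXIT_OKAY is the name used on the slide; EXIT_ERROR, run_child and parse_count are assumptions made for this example.

```perl
use strict;
use warnings;

use constant EXIT_OKAY  => 0;
use constant EXIT_ERROR => 3;    # Hypothetical failure code

# Hypothetical helper: runs a child perl that exits with the given code
# and returns the exit code that a wrapper script would see
sub run_child( $ )
{
    my ( $exit_code ) = @_;
    system( $^X , "-e" , "exit $exit_code" );
    return ( $? >> 8 );
}

# Explicit return on every path; undef signals a non-fatal error
sub parse_count( $ )
{
    my ( $text ) = @_;
    return undef if ( $text !~ m/^\d+$/ );
    return ( $text + 0 );
}

my $okay_code  = run_child( EXIT_OKAY );
my $error_code = run_child( EXIT_ERROR );
my $good_count = parse_count( "42" );
my $bad_count  = parse_count( "forty-two" );

print "Exit codes seen by the wrapper: $okay_code and $error_code\n";
print "Good count: $good_count\n";
print "Bad count is undefined\n" if ( ! defined( $bad_count ) );
```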
Common Traps
use IO::Handle;
use IO::File;
MAIN:
{
    my $logfile = "log.log";
    # Create the filehandle ( < = read , > = write , >> = append )
    # Then later . . .
    # Then later
    # Note: no comma
    exit;
}
The code shows how we can create a lexical variable that is a filehandle. We can pass
this to any subroutine at any stack depth and print information to it as shown. Note that
as with normal filehandles, there is no comma between the filehandle and the thing
that is being printed to it.
Unlike normal filehandles this filehandle is a lexical variable. You can explicitly close
the handle with close, or, you can just let the handle go out of scope at which point it
will be automatically closed.
In early versions of Perl (when machine speeds were 66MHz) there was a
considerable time overhead in loading the vast amount of code that is hidden behind
IO::Handle and IO::File. With modern machine speeds this is no longer an issue.
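Most of the slide's code did not survive, so the following is a sketch of the technique rather than the original. The subroutine log_message and the use of File::Temp for a scratch file are assumptions. IO::File->new takes a filename and a mode, and the resulting handle is an ordinary lexical that can be passed around.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw( tempfile );

# Hypothetical subroutine: receives the filehandle like any other scalar
sub log_message( $$ )
{
    my ( $log_fh , $message ) = @_;
    print $log_fh "$message\n";      # Note: no comma after the filehandle
    return 1;
}

# Scratch file for the demonstration (removed when the program ends)
my ( $temp_fh , $logfile ) = tempfile( UNLINK => 1 );
close( $temp_fh );

my $log_fh = IO::File->new( $logfile , ">" ) or die "Cannot open $logfile: $!";
log_message( $log_fh , "Run started" );
log_message( $log_fh , "Run finished" );
$log_fh->close();

my $read_fh = IO::File->new( $logfile , "<" ) or die "Cannot open $logfile: $!";
my @logged_lines = $read_fh->getlines();
$read_fh->close();
print "Logged " . scalar( @logged_lines ) . " lines\n";
```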
Add A Command Line To Your Program - I
Here’s a template:
sub Parse_Command_Line_Arguments( $ )
{
    # Assumes croak() (from Carp) and the TRUE/FALSE constants are defined elsewhere in the program
    my ( $arguments_r ) = @_;

    my $usage = "my_prog -input <input filename>
                         -output <output filename>
                         [-print_flag]";

    my $numargs = @$arguments_r;
    my $argument = undef;

    foreach $argument ( @$arguments_r )
    {
        if ( $argument =~ m/\-help/i )
        {
            # Help requested
            exit 0;
        }
    }

    if ( $numargs < 1 )
    {
        print ( "\nUsage: $usage\n" );
        print ( "\nUse my_prog -help to get more help\n\n" );
        exit 0;
    }

    my $next_arg = undef;
    my $input_filename = undef;
    my $output_filename = undef;
    my $print_flag = FALSE;

    # Process all arguments (the > 0 test stops the loop if a switch is missing its value)
    while ( $numargs-- > 0 )
    {
        $next_arg = shift( @$arguments_r );
        SWITCH:
        {
            if ( $next_arg =~ m/^\-input/i )
            {
                $input_filename = shift( @$arguments_r ); $numargs--;
                last SWITCH;
            }
            if ( $next_arg =~ m/^\-output/i )
            {
                $output_filename = shift( @$arguments_r ); $numargs--;
                last SWITCH;
            }
            if ( $next_arg =~ m/^\-print_flag/i )
            {
                $print_flag = TRUE;
                last SWITCH;
            }
            if ( $next_arg =~ m/^\-/i )
            {
                croak( "Unknown command line switch $next_arg" );
            }
        }
    }
    return ( $input_filename , $output_filename , $print_flag );
}
All of this code is in the file perl.template in the release directory of this course.
Add A Command Line To Your Program - III
All of this code is in the file perl.template in the release directory of this course.
Parsing Files - I
use Netlist_Tools;
MAIN:
{
    my $file_r = Read_File( "BigFile.txt" );
    exit 0;
}
Note that each time through the foreach loop, $line is an alias for the list element, not a
copy, so changing $line changes the list.
You want to load and loop through all the lines of a file performing some programming tasks
on some or all of the lines. You then want to write out a new file containing whatever
manipulations you’ve done.
This code will load one of (in this order) BigFile.txt, BigFile.txt.gz,
BigFile.txt.gzip. If you specify an output filename in Write_File that is suffixed in
either .gz or .gzip then the file will be compressed (with gzip) before it is written.
A major advantage of Read_File is that not only will it transparently read in the file via
gzip if necessary, all the lines are then formatted so that every line is in a list that can be
iterated, and every line is guaranteed to have no white-space before the first non-white-space
character. There will also be no white-space at the end of the line and all “words” on a line
will be separated by exactly one space.
If you don’t want the formatting that Read_File imposes then use
Read_File_Without_Formatting to get at the raw unaltered data. This will make the
regular expressions that detect the information you’re interested in finding, more
complicated, but, if the formatting is important, it will be preserved.
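The formatting described above can be sketched in plain Perl. This is NOT the Netlist_Tools implementation: normalise_lines is a hypothetical stand-in and the two sample lines are made up. It also relies on the foreach loop variable being an alias for each list element, so the list is changed in place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the formatting described above
sub normalise_lines( $ )
{
    my ( $lines_r ) = @_;
    foreach my $line ( @$lines_r )     # $line aliases each list element
    {
        $line =~ s/^\s+//;             # No leading white-space
        $line =~ s/\s+$//;             # No trailing white-space
        $line =~ s/\s+/ /g;            # Exactly one space between words
    }
    return $lines_r;
}

my $file_r = [ "  .subckt   inv  a   y \n" , "\tM1 y a   vdd vdd pch\n" ];
normalise_lines( $file_r );
foreach my $line ( @$file_r )
{
    print "[$line]\n";
}
```

After this, a test such as m/^M1 y / can rely on single spaces instead of allowing for arbitrary white-space.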
Parsing Files - II
This is a standard Perl idiom for reading, and then writing, a new file:
use Netlist_Tools;
MAIN:
{
my $in_file_r = Read_File( "BigFile.txt" );
my $out_file_r = [];
exit 0;
}
If, alternatively, you want to create a new file based on some or all of the contents of an input
file, you can re-write the body of the code in the previous program like this:
In this code we still read an input file, but, rather than altering the information in that file
(and thus destroying the original) we make a new file on-the-fly and then write that to
disk under a new name.
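The loop body from the slide is missing from these notes, so this is only a sketch of the pattern. The in-memory list stands in for what Read_File would return, and the filtering rule (drop comments, upper-case the rest) is invented for illustration. Writing the result to disk is left out.

```perl
use strict;
use warnings;

# Stand-in for the list reference that Read_File would return
my $in_file_r  = [ "keep this" , "# drop this" , "keep that" ];
my $out_file_r = [];

foreach my $line ( @$in_file_r )
{
    next if ( $line =~ m/^#/ );            # Leave comments out of the new file
    push( @$out_file_r , uc( $line ) );    # uc() returns a copy; $line is untouched
}

print "Input lines: " . scalar( @$in_file_r ) . "\n";
print "Output lines: " . scalar( @$out_file_r ) . "\n";
```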
Interacting With A Compute Farm (The LSF Queue)
# This example shows how to use ELDO for which we have 4 licenses. We’ll limit ourselves to use 2 of them.
# While we’re limiting ourselves here because of scarce license resource, the same code can stop queues being
# flooded with jobs that are pending but consuming queue slots (and making yourself pretty damn unpopular).
my ( $running_jobs , $jobs_limit ) = ( 0 , 2 );
exit;
The LSF queuing system allows CPU intensive jobs to use the shared CPU resource of most of
the machines in this building. Here’s how to interface to that queuing system, while limiting
yourself to a predetermined number of jobs and adding new jobs to the queue as old jobs
complete:
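The slide's LSF code is not reproduced in these notes. The sketch below illustrates only the throttling idea, and it does not talk to LSF: fork and wait stand in for submitting a job and waiting for one to finish, and the job names are made up.

```perl
use strict;
use warnings;

my ( $running_jobs , $jobs_limit ) = ( 0 , 2 );
my @job_names      = qw( job_1 job_2 job_3 job_4 job_5 );   # Hypothetical jobs
my $completed_jobs = 0;

foreach my $job_name ( @job_names )
{
    # At the limit: block until one running job finishes
    if ( $running_jobs >= $jobs_limit )
    {
        wait();
        $running_jobs--;
        $completed_jobs++;
    }

    my $pid = fork();
    die "Cannot fork: $!" if ( ! defined( $pid ) );
    if ( $pid == 0 )
    {
        # Child process: stand-in for running the real job
        exit( 0 );
    }
    $running_jobs++;
}

# Drain the jobs that are still running
while ( $running_jobs > 0 )
{
    wait();
    $running_jobs--;
    $completed_jobs++;
}
print "Completed $completed_jobs jobs\n";
```

At no point are more than $jobs_limit children running, which is the property the notes ask for.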
Using Object Oriented (OO) Modules - I
For OO modules you really do need to read the documentation and look at examples.
OO programming is quite different from procedural programming. In OO you create
objects and then rather than call functions and procedures to manipulate (potentially
shared) data, you have objects send messages to each other to achieve the same
thing. If you’ve never done this before it can all seem a little weird.
Using Object Oriented (OO) Modules - II
#!/usr/local/bin/perl
use Bitmap;
use Palette;
use Colour;
MAIN:
{
    # Start
    my $palette_r = Palette->new( "Palette_1024.pal" );
    my $colours_in_palette = $palette_r->get_palette_colour_count();
    my $bitmap_r = Bitmap->new( $colours_in_palette , BMP_HEIGHT );
    # Do work
    foreach my $x ( 0 .. ( $colours_in_palette - 1 ) )
    {
        my $colour_r = $palette_r->get_indexed_colour( $x );
        $bitmap_r->vline( 0 , ( BMP_HEIGHT - 1 ) , $x , $colour_r );
    }
    # End
    $bitmap_r->save( "Scale.bmp" );
    exit;
}
So, we create a new colour palette by sending Palette the new message with the
name of a file that contains a description of our colour palette. In return we get a
palette object that we store in a variable called $palette_r. We can count how many
colours are in the palette we’ve just created by sending the newly created $palette_r
object the get_palette_colour_count() message - this returns a number, the number of
colours in the palette.
We next create a bitmap object by sending Bitmap the new message. The two
parameters that new() requires are the X and Y dimensions of the bitmap.
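To show what sits behind a call such as Palette->new(), here is a sketch of a tiny class. Simple_Palette is a hypothetical stand-in and not the course's Palette module. It takes a list of colour names rather than a file, and only the method name get_palette_colour_count is borrowed from the slide.

```perl
use strict;
use warnings;

package Simple_Palette;    # Hypothetical class for illustration only

sub new
{
    my ( $class , @colours ) = @_;
    my $self = { colours => [ @colours ] };
    return bless( $self , $class );     # Turn the hash reference into an object
}

sub get_palette_colour_count
{
    my ( $self ) = @_;                  # The object the message was sent to
    return scalar( @{ $self->{ colours } } );
}

package main;

my $palette_r = Simple_Palette->new( "red" , "green" , "blue" );
my $colours_in_palette = $palette_r->get_palette_colour_count();
print "Palette holds $colours_in_palette colours\n";
```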
Using Object Oriented (OO) Modules - III
Have you noticed that we’re not talking in Perl any more!
Using The Debugger
This file exists as a stand-alone .PDF file in the release area for this course.
Using Regular Expressions
This file exists as a stand-alone .PDF file in the release area for this course.
Perl For Beginners - September 2005 - Course Feedback Form
Yes No