Introduction to Perl language, with step by step examples

Perl Tutorial

(C) S Projects. Any distribution of any parts of this site is strictly prohibited, unless explicitly stated in text.

All contents of this site is provided strictly on the AS IS basis. The author should not be held responsible for any negative effects that may result from reading or applying the information or using software provided here. The author does not make any medical, financial or other claims - all statements are author's opinions ONLY. Use your own judgment.

Introduction

What is this book?

This is a "practical guide" to Perl. In part 1 I am going to provide some Perl basics, and then, in part 2, we will use them to create a few practical applications, such as web counters.

The book will help you to get hands-on experience, as well as some working code that you can use on your own sites and for your own projects.

When you read different online tutorials (on Perl or anything else), you might notice that they are incomplete: one tutorial is missing something, while another has this "something" and is missing something else.

It is inevitable, and I am not going to tell you that this tutorial is complete because it is not. However in this book I pulled together the bits and pieces that I consider most important, so that - being incomplete - it is nevertheless SUFFICIENT for 99 (OK, 95) percent of the tasks the average Perl programmer is solving daily.

What is Perl?

I cannot think of a language that would allow you to create a web site "engine" faster than Perl. You can, literally, create a CGI-powered site in a day. A matchmaking site? A day or two. A counter for the web site that has all the features YOU need? Two days. An email autoresponder? A few days. I am going to use some of these examples later in this book.

Perl was created as a simple alternative to lower-level languages, such as C++, and it is important to keep in mind its limitations. Perl is NOT a computational language (one day it may become fast enough, and this statement will not be true anymore). But, unlike C++ or other general-purpose languages, it was created with certain tasks in mind.

I prefer to think of Perl as a slow language with powerful and very well-designed libraries. When you perform tasks that Perl was intended for, it is a great tool. When you do something else, it is not that great anymore.

Comments in Perl program

You can comment lines in a Perl program, which means that the Perl interpreter will ignore the commented information. To comment a line, use the pound sign: everything from the "#" to the end of the line becomes a comment:

# This line is commented

Variables

Scalars

Perl stores single values in so-called scalar variables. For example, if we want to introduce a variable "income" and to set a value for it, for example 10,000, we write: $income = 10000;

The $ sign is used to distinguish scalar variables from other Perl constructions. The "=" sign is used for assignment. Finally, the semicolon is used to close the statement. You can use multiple statements in a Perl program, on the same line or on different lines, separated with semicolons:

$income = 10000; $x = 2;
$double_income = $x * $income;

Perl stores all values in the same internal format, and then - depending on what is stored in the variable and what is the context, it can interpret it as a string or number:

$x = 3;
$y = "Hi";	# always a string
$str = "$y$x";	# $str equals Hi3, x treated as string
$z = 2 * $x;	# x treated as number

More about Strings

General information

Strings are very important in Perl, and we will find many advancements in the syntax used for handling strings.

Strings can be enclosed in single or double quotes. In single quotes, the string is presented AS IS: no values are substituted for the variables, and no special characters are inserted instead of sequences like \n or \t.

When double quotes are used, Perl tries to perform all possible substitutions:

$str = '$x\n';	# $str equals $x plus \ plus n
$str = "$x\n";	# $str equals 3 plus a single "new line" symbol

To insert quotes inside the string, we have to precede them with backslash:

$str = "Hello, \"user\"";

Alternative quotation

Sometimes this is inconvenient, for example if the string contains many quoted substrings. In this case we can use the q operator (behaves like single quotes) or the qq operator (behaves like double quotes) to specify a delimiter different from the quote:

$str = q[Hello, "user"];
$str = q{Hello, "user"};

Long strings

If the string is long, we can use a special format, specifying that the string is considered to be "everything from here until this separator":

$str = << "EOF";
Hello, $name,
Welcome to our
Perl tutorial
EOF

Note the use of the $name in the example above. This is a variable that will be substituted with its current value, for example: "Hello, John".

String functions

substr

The substr function is used to retrieve part of a string.

$subs = substr($str, $offset, $count);

If $count is omitted, the substring includes everything from $offset to the end of the string.

Note: we can do similar things using patterns (below), but if we know the exact position, the substr is much faster.

The substr can also be used to change the string:

# Replace "John Smith" with "Billy Smith"
substr($str, 0, 4) = "Billy";

substr($str, 0, 1) = "";	# remove first character

substr($str, -2) = "";		# remove the last 2 characters

unpack

The unpack function can be used to retrieve several substrings at once:

# Get 4 characters, skip 2, get 4, skip 1, get the rest
($str1, $str2, $str3) = unpack("A4 x2 A4 x1 A*", $str);

# Get strings, 10 characters each
@result = unpack("A10" x (length($str) / 10), $str);

x (lowercase) skips characters to the right, X (uppercase) - to the left.
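The two functions can be tried together on one string. This is a minimal sketch: the fixed-width record and the variable names are made up for illustration, and the four-argument form of substr is used for the replacement instead of the assignment form shown above.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: 4 chars, 2 spaces, 5 chars, 1 space, rest
my $record = "John  Smith 1975";

# Get 4 characters, skip 2, get 5, skip 1, get the rest
my ($first, $last, $year) = unpack("A4 x2 A5 x1 A*", $record);

# The same first field with substr: offset 0, count 4
my $first_again = substr($record, 0, 4);

# Four-argument substr replaces the selected part in place
substr($record, 0, 4, "Billy");

print "$first|$last|$year|$first_again|$record\n";
# prints: John|Smith|1975|John|Billy  Smith 1975
```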

Control sequences

In Perl strings, you can use CONTROL SEQUENCES: special sequences preceded by a backslash. Some important ones:

\n (new line)
\r (carriage return)
\t (tab)
\f (form feed, new page)
\b (backspace)
\a (bell)
\e (escape)
\cC (control symbol, CTRL-C in this case)
\\ (backslash)
\l (make next character lowercase)
\L (make all characters until \E lowercase)
\u (make next character uppercase)
\U (make all characters until \E uppercase)
\E (end of \L, \U and \Q)

undef

When a variable has never been assigned a value, or when a value makes no sense, Perl uses "undef". As in many other cases in Perl, "undef" can be used in calculations without stopping the program: it will be treated as 0 or "" (empty string), depending on the context. The "undef" is very useful in hashes (see below), when you need to find out if a hash contains particular data or not.

$_

This variable is used by some functions if arguments are not provided. For example:

foreach (@income)
{
	# do something with the next element
	# of @income, stored in $_
	print "$_\n";
}

Operations

Numbers

Let's list main Perl operations for numbers:

+, -, *, /, ** (2 ** 3 means 2 * 2 * 2), % (remainder), <, <=, == (equal), >=, >, != (not equal).

Strings

String operations are different, and as a general rule, you should not use the operation listed above with strings.

To JOIN two strings, we can either place them in the variables and then put these variables inside quotes:

$str1 = "Hello ";
$str2 = "world!";
$result = "$str1$str2";

Alternatively, we can use the "." operation:

$result = $str1.$str2;

What if we want to join the "wel" and "come", where "wel" comes from a variable and "come" is provided as a string? Can we do:

$x = "wel";
$y = "$xcome";

The answer is no. Perl will look for the $xcome variable, not for $x! To solve this kind of conflicts, we can use the "." operation:

$y = $x."come";

Or we can use the {} separator:

$y = "${x}come";

It is worth mentioning that when Perl receives a number instead of a string, it treats it as a string, as in one of the examples above. Also, when Perl receives a string instead of a number, it tries to convert it to a number, using the leading part of the string: "125" and "125world" will be converted to 125 (the suffix is ignored). If the string does not start with a number (like "Hello125" or "Hello world!"), the number will be 0.
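The conversion can be checked with a short script. This is only a sketch: the variable names are made up, and the warnings Perl would normally print for non-numeric strings are switched off so that the script runs quietly. Perl looks only at the leading part of a string when it needs a number.

```perl
use strict;
use warnings;
no warnings 'numeric';   # silence "isn't numeric" warnings for this demo

my $full   = "125" + 0;        # the whole string is a number
my $prefix = "125world" + 0;   # the leading digits are used
my $none   = "Hello125" + 0;   # no leading number, so 0
my $text   = 3 . "";           # a number used as a string

print "$full $prefix $none $text\n";
# prints: 125 125 0 3
```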

Arrays

General information

Arrays are a special kind of data storage variable. Let's say we need to keep incomes for 12 months together, and we do not want to use 12 variables for this purpose, because in the future we may need to add more months. We use an array:

@income = (1000, 1200, 800, 1000, 800, 1000, 700, 800, 2000, 1000, 1000, 600);

You can initialize an array with sequence:

@days = (1..20, 25, 32);

Now the @days contains elements 1, 2, 3 and so on until 20, 25 and 32.

There is a special function qw (quotation) that allows us to automatically place quotes around elements of an array:

@array = ("Hello", "and", "welcome");
# same as
@array = qw(Hello and welcome);

Array variables begin with @:

@income

To access array elements, we use the following syntax:

$n_th_element = $income[$n];

Notice, that there is a $ sign, not an @ in front of "income". This is because n-th element of an array is, after all, a scalar type of a variable.

Also notice, that arrays are 0-based, it means that the first element in our little example will be $income[0], and not $income[1]. The last (12-th) element will be $income[11].

To find the index of the last element of an array, we use a special variable, $#income (in our case, 11). The number of elements is therefore $#income + 1. Another way of getting the size of an array is by using @income in so-called scalar context, as if it was a scalar variable:

$size = @income; # $size equals 12

You can merge arrays together, or with new elements, by simply using:

@income = (@income, 1000); 	# Add 13-th element

# Insert 0-th element, and push all elements to the right
@income = (1000, @income);	

# Append @new_array to @income
@income = (@income, @new_array);	

There are also functions that allow you to work with an array as with the stack. In a stack, we can add elements to the right (add 13-th element) or to the left (insert element No 0, make former 0-th element 1st, 1st element - 2nd and so on).

# Same as @income = (@income, $new_element);
push(@income, $new_element);	

# Same as @income = ($new_element, @income);
unshift(@income, $new_element);	

There are also opposite operations:

# 12-th element removed and stored in $last_element
$last_element = pop(@income);	

# remove 0-th element and move the rest to the left
$first_element = shift(@income);	

It is interesting to mention, that you can use more than one element in push and unshift. For example:

# same as @income = (1000, 800, 1200, @income);
unshift(@income, 1000, 800, 1200);	

It is a general rule in Perl that if something can be done in theory, it is usually implemented practically (no unnecessary limitations). Sometimes it is not that obvious, especially for someone who had experience with other, much more restricting programming languages. Example:

($x, @income) = @income;	# same as $x = shift(@income);

Array can be reversed using reverse() function:

@reversed_income = reverse(@income);

Imagine that you have stock prices downloaded from Yahoo. They BEGIN with last day price and go backwards in time. If you want to plot this information, you will have to reverse the array, or your chart will go right to left!

Arrays are automatically adjusting their size. If we assign value to the element of an array outside the current size of an array, the array size will increase, and all intermediate values will be set to "undef".

Sorting

You can sort an array using sort() function. However there are some limitations. First of all, your array does not stay sorted. When you add a new element, it will go to the end or to the beginning, and your array is not sorted anymore!

Second, sort() compares strings, not numbers, so if we sort (1, 2, 5, 7, 12), we will get (1, 12, 2, 5, 7).

Note that sort() returns sorted array, without changing the original one.

What if we want to perform a sort operation based on numeric values? First of all, we need to provide our own sorting function (more about functions in the "subroutines" section):


sub by_number
{
	if($a < $b)
	{
		return -1;
	}
	elsif($a == $b)
	{ 
		return 0;
	}
	else
	{
		return 1;
	} 

}

There is no naming convention in Perl sorting; we use by_ in the function name for convenience only, so that a call such as sort by_number @unsorted reads naturally. Inside the function, $a and $b are the two elements that sort() is currently comparing.

The by_number function above can be improved using a very unusual Perl construction:

sub by_number
{ return $a <=> $b; } 

or even shorter:

@sorted = sort { $a <=> $b } @unsorted; 

The cmp operator returns -1, 0 or 1 for strings, the same way as <=> does it for numbers:

@sorted = sort { $a cmp $b } @unsorted; 

# or, with the by_number function defined above:

@sorted = sort by_number @unsorted; 
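The difference between the default string sort and a numeric sort is easy to see on a small list. A minimal sketch with made-up variable names; the descending variant simply swaps $a and $b.

```perl
use strict;
use warnings;

my @unsorted = (12, 1, 7, 5, 2);

my @as_strings = sort @unsorted;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @unsorted;   # numeric, ascending
my @descending = sort { $b <=> $a } @unsorted;   # numeric, descending

print "@as_strings\n";   # prints: 1 12 2 5 7
print "@as_numbers\n";   # prints: 1 2 5 7 12
print "@descending\n";   # prints: 12 7 5 2 1
```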

If an array contains ONLY variables, we can use it at the left side of an assignment. There will be no array NAME at the left, so you should probably consider this approach as a way of multiple assignment, or as a way of modifying array at the right. It sounds complicated, but examples below will help:

($x, $y, $z) = (1, "hi", 2);	# multiple assignment
($x, $y) = ($y, $x);	# swap $x and $y
($x, @income) = @income;	# remove first element of @income
(@income, $x) = @income;	# @income not changed, $x == undef

The last example above is interesting. We should place an array at the rightmost position in the list of variables, as it will absorb ALL remaining values.

Slices

In Perl, you can work with more than one array element at once. The following examples illustrate this feature:

@income[0, 1]; # same as ($income[0], $income[1]);
@income[1, 2] = (100, 120);
@income[3, 4] = @income[4, 3];	# swap

@indexes = (0, 1, 2);
@income[@indexes] = (100, 120, 130);
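Here is a small runnable sketch of slices, with made-up data. It also uses negative indexes, which count from the end of the array.

```perl
use strict;
use warnings;

my @income = (1000, 1200, 800, 900, 700);

my @first_two = @income[0, 1];     # copy two elements at once
@income[3, 4] = @income[4, 3];     # swap the last two elements
my @last_two  = @income[-2, -1];   # negative indexes count from the end

print "@first_two\n";   # prints: 1000 1200
print "@income\n";      # prints: 1000 1200 800 700 900
print "@last_two\n";    # prints: 700 900
```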

Hashes

General information

Think of a hash as two arrays combined: one holds the keys, the other holds the values associated with these keys. The keys are not stored in any particular order, but given a key you can find the corresponding value fast.

Name of the hash variable begins with %:

%income

To access value, associated with a particular key, use the following syntax:

$value = $hash{$key};

Same way, you can assign values to the elements of a hash:

$hash{$key} = "alpha";

Hash can be represented as an array, containing combinations of key and value. Using this representation, we can assign one hash variable to another:

@hash = %hash;
%hash_1 = @hash;

Same operation can be performed directly:

%hash_1 = %hash;

Also, we can use the "hash as an array" approach to initialize the hash:

%hash = ($key_1, $value_1, $key_2, $value_2);

Practical example: we have a hash, some of the VALUES of which are duplicated. We need to remove duplicated values.

Solution: use the reverse() function TWICE. First time the keys will become values and values will become keys. As duplicated keys are not allowed, the duplications will be discarded. The second call to reverse() will recreate the original hash, without duplicated values:

%r_hash = reverse(%hash);
%hash = reverse(%r_hash);
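The double reverse can be tried on a tiny hash. This is a sketch with made-up data; note that when two keys share a value, which of the two keys survives is not predictable, so the script only checks the values.

```perl
use strict;
use warnings;

# Hypothetical data: keys "a" and "c" share the value 1
my %hash = (a => 1, b => 2, c => 1);

my %r_hash = reverse(%hash);   # values become keys, duplicates collapse
%hash = reverse(%r_hash);      # back to key => value

my $size   = keys(%hash);
my @values = sort values(%hash);

print "$size pairs, values: @values\n";
# prints: 2 pairs, values: 1 2
```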

Functions used with hashes

keys

Returns the list of keys:

@list = keys(%hash);

Same function can be used to retrieve number of key-value pairs in the hash - obviously, as we have an array, and we already know that in scalar context array variable is interpreted as the size of an array:

$size = keys(%hash);

values

Returns an array of all values of the hash.

each

This function can be used in cycle to access all elements of the hash.

while(($key, $value) = each(%hash))
{
	... do something with $key and $value
}

For more information on using cycles, see corresponding section.

delete

Removes the key-value pair from the hash:

delete $hash{$key};

Slices

Filling the hash like this:

$hash{"first"} = "John";
$hash{"second"} = "Bill";
...

is very time-consuming, not to mention error-prone. In Perl, there is a more compact way:

@hash{"first", "second"} = ("John", "Bill");

In more general case, we can write:

@new_hash{keys %hash} = values %hash;

In the example above we took the existing %new_hash and added to it the keys and values from %hash, using a hash slice. We can also do it directly:

%new_hash = (%new_hash, %hash);

but here both hashes are first flattened into one list and %new_hash is rebuilt from scratch, so this operation will be slower.
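A runnable sketch of hash slices, with made-up names and data. A slice is written with @ in front (several values at once) and curly brackets (because it is a hash).

```perl
use strict;
use warnings;

my %hash;
@hash{"first", "second"} = ("John", "Bill");   # fill two keys at once

my %new_hash = (third => "Ann");
@new_hash{keys %hash} = values %hash;          # merge %hash into %new_hash

my $listing = join(",", map { "$_=$new_hash{$_}" } sort keys %new_hash);
print "$listing\n";
# prints: first=John,second=Bill,third=Ann
```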

print

There are two ways (more, but these two are the most important) for you to communicate with your program. First, you run "perl program_name" from the command line, and you read the program's output.

Second, program's output is sent to Web browser, and you read it there.

In both cases, you will probably use the "print" command for the output.

print "Hello\n";

The most common sources of errors:

1. Make sure the quotes around the string are balanced: either both present or both absent (you don't have to put quotes around a single variable):

print "Hello, $name\n";	# correct
print $name;			# correct
print "Hello, $name!\n	# incorrect

2. Make sure there is a semicolon at the end.

3. Make sure you use the escape character (\...) for the reserved symbols, for example, \n, \t and so on.

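You can output more than one line; Perl will understand it, and the Web browser will absorb unnecessary spaces, too. Note that an expression such as $t2 - $t1 is not calculated inside quotes, so we pass it to print as a separate argument:

print "Hello, $name!\nIt's been a long time\n
	since your last login, almost ", $t2 - $t1, " milliseconds!\n";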

You can create complete Web pages using the print function, and that's exactly how some of the examples below work.

Logical operators

First of all, there are two groups of comparison operators in Perl, one for strings and one for numbers:

$x != 2;	# $x not equal 2
$x == 2;	# $x equal 2
$str ne "hi";	# $str not equal "hi"
$str eq "hi"; 	# $str equal "hi"

Other than that, comparison operations are very straightforward.

Logical "or" (||) is used in expression that is considered to be true, if at least one of parts, separated by || is true:

if($x == 2 || $str eq "hi")

Logical and (&&) is used in expression that is considered to be true, if all parts, separated by && are true:

if($x == 1 && $str ne "HI")

In case of || operator, the expression is considered true AS SOON AS one of the parts, separated by the || is true. It means that the evaluation of the remaining part or the expression is not performed.

In the complex logical expressions, we can use brackets to make sure our logic is preserved:

if($x == 2 && ($y == 3 || $str ne "who"))

It is possible to use || and ||= operators to assign a value to the variable if it is not already assigned:

$x = $y || $z;
$x ||= $y;

In both cases we use the fact, that the "or" condition is "lazy" - as soon as we have a match (part of a condition is true), the rest is simply ignored. Consider the

$x = $y || $z;

If $y is 0 (or undefined, or an empty string), we have to check $z, and $z is assigned to $x.

If, on the other hand, $y is not 0, we will have $x equal $y, and $z never checked!

If you only want assignment to happen if the value is defined (and don't care about 0 or ""), use defined function:

$x = defined($y) ? $y : $z;

The condition ? expression : expression1 is the same as

if(condition)
{
	expression;
}
else
{
	expression1;
}

Note that both the || and && operators return the operand that decided the result (instead of just true, as in other languages). This is why we can perform assignment in the || examples above. Once again, the result will be equal to the FIRST expression that is true:

$x = $a || $b || $c || "unknown";

The && returns first operand if failed and second, if successful.
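The difference between || and defined matters when 0 is a legitimate value. A minimal sketch with made-up variable names:

```perl
use strict;
use warnings;

my $count = 0;   # a real value that happens to be false

my $with_or      = $count || 10;                     # 0 is false, default wins
my $with_defined = defined($count) ? $count : 10;    # 0 is defined, value kept

my $first_true = "" || 0 || "unknown";   # first true operand
my $and_ok     = 5 && "yes";             # first is true: second is returned
my $and_failed = 0 && "yes";             # first is false: first is returned

print "$with_or $with_defined $first_true $and_ok $and_failed\n";
# prints: 10 0 unknown yes 0
```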

Logical operations with strings include:

eq (equal), ne (not equal), lt (less than), gt (greater than), le (less or equal), ge (greater or equal).

Flow control

This group of operations is used to alter the flow control of a program, which means that depending on certain conditions, the program will go a different way, form cycles or even be terminated.

if

The "if" operator is used to run block of code only if certain condition is true. The syntax is:

if(condition_1)
{
	...do something
}
elsif(condition_2)	# (optional)
{
	...
}
else	# (optional)
{
	...
}

For example:

if($x < 0)
{
	$x = 0;
}
elsif($x == 0 && $y == 0)
{
	exit;
}
else
{
	$x--;
}

The "exit" function is used to unconditionally terminate the program.

The $x-- is the same as $x = $x - 1. There are postfix operations of this type in Perl, for example $x++ and $x--, and prefix operations: ++$x, --$x. There are also compound assignments such as $x -= 2.

The difference is illustrated in the following example:

$y = $x++;	# assign, then increase. $x equals $y + 1
$y = ++$x;	# increase $x, then assign. $x equals $y
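A short script makes the order of operations visible. The variable names are chosen only for this sketch: the postfix form returns the old value, the prefix form returns the new one.

```perl
use strict;
use warnings;

my $x = 5;

my $post = $x++;   # $post gets the old value 5, then $x becomes 6
my $pre  = ++$x;   # $x becomes 7 first, then $pre gets 7

print "$x $post $pre\n";
# prints: 7 5 7
```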

unless

As you already know, "if" allows an access to a particular block of code if the condition is true. The "unless" works the opposite way:

unless($x == 3)
{ 
	... 
}

The code in brackets will be executed only if $x NOT equal 3. Obviously, you can use

if(!($x == 3)) # if not $x equal 3

instead, or simply

if($x != 3)

for

The "if" statement executes some code once, if the condition is true. The "for" statement executes a block of code many times, as long as particular condition is true. As a condition, a simple counter is usually used:

for($i = 0; $i <= 100; $i++)
{ 
	... 
}

Using the "for" cycle, we can access all elements of an array, sequentially:

for($i = 0; $i <= $#array; $i++)
{ 
	do something with $array[$i] 
}

while

The "for" cycle works well with counter-like condition. With complex conditions, as well as with the string conditions it is more convenient to use the "while" cycle:

while($str ne "")
{ 
	... 
}

It is assumed in the example above, that the $str variable is modified somewhere in the cycle, because otherwise the cycle will never exit.

until

Same way as the "if" has a counterpart "unless", which is not really required, but sometimes convenient, the "while" has a counterpart "until":

until(expression)
{ 
	do something as many cycles as it takes 
	for an expression to become true 
}

last, next, redo and continue

The next keyword is used to go to the next cycle, without completing the rest of the current cycle:

for($i = 0; $i < 100; $i++)
{
	if($i < 20)
	{
		next;
	}
	... do something
}

The last keyword is used to exit the cycle, regardless of what the value of the counter is. Note that last ignores if and do constructions; it only works with for, foreach, while and until.

Unlike in C, the continue keyword does not jump anywhere in Perl. It introduces an optional block after the cycle body, which runs at the end of every cycle (including cycles cut short by next), just before the condition is checked again.

The redo operator moves the execution to the beginning of the block, WITHOUT checking the cycle condition:

while(condition)
{
		# redo goes here
	...
		# next goes here
}
continue
{
		# runs after every cycle
}
		# last goes here

labels

What if we want to exit two nested cycles at once? We can use last, next and redo with labels:

my_label_1: while($x < 100)
{
	while($y < 120)
	{
		if($z == 3)
		{
			next my_label_1;
		}
	}
}

Note: you cannot go INSIDE the block using labels, as in this case you will have uninitialized cycle variables.
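Labels can be sketched on two nested cycles. The label name OUTER, the data and the conditions are made up for illustration; uppercase labels are only a common convention.

```perl
use strict;
use warnings;

my @found;

OUTER: for my $row (1..3)
{
	for my $col (1..3)
	{
		next OUTER if $col > $row;        # go to the next $row
		last OUTER if $row * $col == 4;   # leave both cycles at once
		push(@found, "$row$col");
	}
}

print "@found\n";
# prints: 11 21
```

With row 1, only column 1 passes; with row 2, column 1 passes and column 2 gives 2 * 2 == 4, so both cycles end and row 3 is never reached.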

do...while

In the example above (while) we need to make sure the $str is not equal "" at the beginning, or the cycle will be skipped. Generally, we need some initialization before the "while" cycle. Sometimes this initialization is very similar to what happens inside the cycle, but we still have to place it outside, creating duplicated code.

Instead we can use the do { ... } while(expression); cycle, that works the same way as "while", except the condition is being evaluated at the end, so the cycle is guaranteed to be executed at least once.

Once again, the do...while construction has a do...until counterpart.

foreach

This type of cycle works with the array, it iterates through it, allowing us to access all elements, sequentially:

foreach $i (@income)
{ 
	# $i now contains the next element of the array
}

Alternative notation for conditions

There is an alternative syntax for if that can sometimes make your code easier to read:

expression if condition;

# for example:

print("end") if($x == 0);

# exit cycle when "end" line encountered:

lab_1: while (<>)
{
	last lab_1 if /end/;
}

Similar notation exists for other conditions:

expression unless condition;
expression while condition;
expression until condition;

# for example:

$x++ until $x == 20;

Date and time

The time function returns time, in seconds, since some moment (the epoch) that can differ between operating systems. To transform it to a more convenient format, use gmtime (Greenwich) and localtime (local). Note that $month is counted from 0 (January is 0) and $year is the number of years since 1900:

($sec, $min, $hours, $mday, $month, $year, $wday, $yday, 
    $is_summer_time) = 
	localtime();

($day, $month, $year) = (localtime)[3, 4, 5];
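To print a readable date, the month and the year have to be adjusted, because the month is counted from 0 and the year from 1900. The sketch below uses gmtime with a fixed timestamp of 0 so that the result does not depend on the current time or time zone; it assumes a system where the epoch is January 1, 1970 (Unix-like systems), and the variable names are made up.

```perl
use strict;
use warnings;

# Timestamp 0 is the epoch itself; assumed to be 1970-01-01 00:00:00 UTC
my ($mday, $month, $year) = (gmtime(0))[3, 4, 5];

# $month is 0-based and $year counts from 1900, so adjust both
my $date = sprintf("%04d-%02d-%02d", $year + 1900, $month + 1, $mday);

print "$date\n";
# prints: 1970-01-01
```

For the current local date, replace gmtime(0) with localtime().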

Strings and patterns

Working with strings is one of the most powerful sides of Perl. You will be surprised when you realize how simple string search and replace operations are in this language, compared to even most powerful languages like C++. Of course, we should remember, that it is achieved through the libraries, and libraries are written in C or C++... ;)

Comparison and replacement

Let's say we need to find out, if particular string ($str) contains the 'a' character. In Perl we can write:

if($str =~ m/a/)

where 'm' stands for "match" and can be omitted:

if($str =~ /a/)

If we need to make sure that the $str variable DOES NOT contain 'a', we will use !~ instead of =~:

if($str !~ /a/)

Let's say we need to find out, if the $str contains the sequence 'abc'. We can use:

if($str =~ /abc/)

There are some special symbols that we can use, to make our search more powerful:

/a*bc/ - any number of a's (including none), then bc

/a.bc/ - a, ONE character (except for new line), bc

/a?bc/ - zero or one a, then bc

/a[ef]bc/ - aebc or afbc, any one of the characters inside [] will do. [0123456789] stands for any digit, we can write it in a compact form: [0-9]. [a-zA-Z0-9_\-] stands for any small letter, any capital letter, any digit, underscore or dash (minus).

In the example above, we used \- instead of the simple -. The \ is used to precede the character, that otherwise would have a special meaning, like /, ?, * and so on.

[^0-9] - the ^ symbol stands for "not", the pattern will match any single character that is NOT 0, 1...9

\d - digit, same as [0-9]

\D - non-digit, same as [^0-9]

\w - word character, same as [a-zA-Z0-9_]

\W - non-word character, same as [^a-zA-Z0-9_]

\s - space, same as [ \r\t\n\f], note, the first character in the brackets is space.

\S - non-space, [^ \r\t\n\f]

/a+bc/ - abc or aabc, or aaabc... + stands for "one or more" preceding symbols.

/a{2,15}bc/ - sequence of 2 to 15 a's: aa, aaa, aaaa...

/a{2,}bc/ - a repeated 2 or more times

/a{3}bc/ - aaabc

Consider the following string: "a bbb c bbbb c d"

As a general rule that we need to keep in mind, when we use search pattern like

/a.*c.*d/

the match will go as far as SECOND c character (maximum match). If we want a minimum match, we will use

/a.*?c.*d/
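The difference between the maximum and the minimum match can be seen by capturing the part matched up to the c character. A small sketch on the string from the text; the brackets around the first part of each pattern are added here only to capture the match.

```perl
use strict;
use warnings;

my $str = "a bbb c bbbb c d";

my ($greedy) = $str =~ /(a.*c).*d/;    # .* takes as much as possible
my ($lazy)   = $str =~ /(a.*?c).*d/;   # .*? takes as little as possible

print "[$greedy]\n";   # prints: [a bbb c bbbb c]
print "[$lazy]\n";     # prints: [a bbb c]
```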

\b - matches at a word boundary, that is, between a word character (\w) and a non-word character (or the beginning or end of the string):

/alpha\b/ matches "alpha one" but not "alphabeta"

/\balpha\b/ matches "alpha", but not "betaalphabeta"

As you can guess, \B requires that there is no word boundary at that position.

^ stands for the beginning of the line

$ stands for the end of the line:

/^alpha.*beta$/

Using the "g" (global) key, we can iterate through the string, symbol by symbol:

while($str =~ /(.)/g)
{
	# we now have the next character in the $1 variable
}

# same thing:
@chars = split(//, $str);

# similar, but returns character codes instead of characters:
@codes = unpack("C*", $str);

Replacement

In addition to the pattern matching, we can replace the sub string we found, with another string:

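$str =~ s/abc/efg/; # replace the first 'abc' with 'efg'

With the "e" key, the replacement is evaluated as a Perl expression, so we can, for example, specify a number of character repetitions:

$str =~ s/\t/' ' x 4/e; # replace the first tab with 4 spaces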

Substitution

To replace all occurrences of a particular character with another one, use the tr operation:

tr/ab/ba/;	# replace a with b and b with a
$s =~ tr/a-z/A-Z/;	# capitalize
tr/a-z/s/;	# replace all small letters with s
tr/a-z/s/d;	# replace a with s, remove the rest
$n = tr/a/a/;	# count a characters 
tr/a-z/_/c;	# replace everything EXCEPT small letters with _
tr/abc/ddd/s;	# replace, then squeeze dd and ddd into a single d

Note that the capitalization is safer to do using either uc (lc for "decapitalization"), or \U flag, because it will work with symbols from non-English language, such as umlaut:

$str = uc($str);
$str = "\U$str";

Brackets in patterns

Using brackets makes our search and replace tools even more powerful. Consider this:

/alpha_(\d\d\d)\s\1/

We are looking for something like "alpha_002 002" here. The (\d\d\d) evaluates exactly like \d\d\d (without brackets) except that the substring found (002) is now stored and we can refer to it using \1. \1 stands for the first string in brackets (or to be more specific, for the string with the leftmost opening bracket). If we have more strings in brackets, we can use \2, \3...

Using the \1 above, we can match alpha_002 002, and not alpha_002 003.

Choosing one of alternative patterns

Consider this:

/color (blue|red)/

This pattern will match either color blue or color red.

Ignoring case

/abc/i - match abc or ABC (ignore case)

Special variables

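When we use brackets to remember parts of the string, the resulting values are stored in the $1, $2 and so on variables. They can be used in the replacement part of a substitution or elsewhere in the program (inside the pattern itself, use \1, \2...). To calculate the replacement from them, add the "e" key:

$str =~ s/abc_(\d)/"efg_" . ($1 * 2)/e; # replace abc_1 with efg_2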
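The sketch below puts captured values to work in three ways: reading them after a match, computing a replacement with the "e" key, and referring back with \1 inside a pattern. The strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $str = "abc_4 abc_7";

# A match in list context returns the captured parts
my ($digit) = $str =~ /abc_(\d)/;

# "e" evaluates the replacement as code, "g" repeats it for every match
(my $doubled = $str) =~ s/abc_(\d)/"efg_" . ($1 * 2)/ge;

# \1 inside the pattern must repeat what the first brackets matched
my $same      = ("alpha_002 002" =~ /alpha_(\d\d\d)\s\1/) ? "yes" : "no";
my $different = ("alpha_002 003" =~ /alpha_(\d\d\d)\s\1/) ? "yes" : "no";

print "$digit | $doubled | $same | $different\n";
# prints: 4 | efg_8 efg_14 | yes | no
```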

Global replacement

# replace all occasions of abc, not just the first one
$str =~ s/abc/efg/g;	
$str =~ s/abc/efg/gi;	# same, ignore case

split

This function can be used to extract parts of the string. Let's say we have a string $str "abc,1,12". Then

($abc, $x, $y) = split(/,/, $str);

$abc equals "abc", $x equals 1, $y equals 12. The /,/ is a pattern containing a separator.

join

This function does exactly the opposite.

$str = join(",", @income);

$str contains elements of the @income array, separated with commas.
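Because split and join are opposites, they are often used as a pair, for example to change a separator. A minimal sketch using the string from the split example; the other names are made up.

```perl
use strict;
use warnings;

my $str = "abc,1,12";

my @parts  = split(/,/, $str);    # ("abc", 1, 12)
my $count  = @parts;              # number of parts
my $joined = join(";", @parts);   # same parts, new separator

print "$count parts: $joined\n";
# prints: 3 parts: abc;1;12
```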

Functions

Perl has functions, that's pretty much all I am going to tell you in this chapter. The list is too large to be explained or even quoted in this chapter, besides there are many tutorials and reference lists online. If you have Perl installed on your system, try "perldoc -f function_name", or perform an Internet search.

In many cases, Perl functions can be used with arguments, and there are two types of syntax for it - with or without brackets. Compare exit; and exit();

I personally prefer brackets, because they make (in my opinion) the text more readable.

Perl comes with some functions by default, like "print" function. Some other functions, as well as class libraries, require using additional modules. For example, if you want to process the user input through HTML form, you may want (just for your convenience) to use CGI module:

use CGI;

Then you can call functions, constructors and so on:

$query = new CGI;

Some functions you already know, like join(), exit, and few others.

die and warn

The die function can be used as a convenient safety tool in case your program is misbehaving (for example, a file cannot be opened), and it also has some nice error reporting features, so you don't have to guess what went wrong:

unless(... whatever ...)
{
	die "Error: $!\n";
}

Two important things: first, if we do NOT put \n at the end of the error message, die prints not only our text, but also the program name and the line number where the error was encountered. With \n at the end, as in the example above, only our text is printed.

Second, $! contains the description of the error.

If you don't want your program to be terminated after the error, use the function called warn instead.

Subroutines

In addition to functions that come with Perl or additional modules, you can create your own functions. Let's say, you are writing the email autoresponder, that you are planning to offer as a free service from your site. The user can set up his own messages, and his users can subscribe to them. So the autoresponder will email messages to "users of your user", it will also email statistics to the user, and some extra statistics to you.

Do you have to repeat the code that does emailing few times? No. You can move this code to subroutine and save a lot of time, space, not to mention that your code will become more reliable.

To call the subroutine, precede its name with the & sign:

$z = &sub_sum($x, $y);

After the main part of your program, define the subroutine:

sub sub_sum
{
	my ($x, $y) = @_;
	return $x + $y;
}

First, we created temporary (local to the subroutine) copies of the arguments that were passed to the function. This is done using "my". The @_ array is a Perl variable that is used to pass arguments to the subroutine. Note: @_ is local to the current subroutine, therefore you can call one subroutine from another, and each one will work with its own @_.

We don't have to do it. Keeping in mind, that variables, unless you use "my" to define them, are global, available anywhere in the program, we can use them and not their copies:

sub sub_sum
{
	return $x + $y;
}

First disadvantage of this approach is the fact that sometimes $x and $y will be altered by the subroutine. Consider:

sub sub_sum
{
	return ++$x;
}

Now, if you don't keep this in mind, you may end up with unexpected results:

$x = 2;
$z = &sub_sum($x);
print $z, $x;

(prints 33: the subroutine changed the global $x, so both $z and $x are now 3)

Sometimes this kind of behavior is exactly what you want...

Another disadvantage is name conflicts. They are especially unpleasant when your programming style includes using $i as a loop variable. If you call a subroutine from within the loop, and there is another loop in your subroutine that also uses $i, you have a mess.

The subroutine may return values, and therefore you can use the subroutine as part of an expression:

$z = $x * &sub_sum($x);

Perl does not really care about number of arguments that you pass to the subroutine, as all arguments are stored in a single @_ array. Therefore you can use a function (subroutine) with variable number of arguments, provided, of course, that it can handle any number you provide:

sub sum
{
	my $sum = 0;
	for(my $i = 0; $i < @_; $i++)
	{
		$sum += $_[$i];	# the i-th argument
	}

	return $sum;
}

Now you will get correct result, when you call

$x = &sum();
$y = &sum(1, 2);
$z = &sum(1..5);

and so on.
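
The same idea can be written with foreach and my, so that the subroutine does not touch any global variables. This is only a sketch; the name sum_list is made up for this example:

```perl
use strict;
use warnings;

# Add up any number of arguments
sub sum_list
{
	my $total = 0;

	foreach my $value (@_)
	{
		$total += $value;
	}

	return $total;
}

print sum_list(), "\n";	# no arguments at all
print sum_list(1, 2), "\n";
print sum_list(1..5), "\n";
```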

my and local

So far we had global variables, available to all functions of a module, and the my declaration, that makes a variable local to a function. Another useful scope is a variable that is available in the function AND in all functions (subroutines) called from it, until the enclosing block ends. This kind of variable can be declared using local:

local ($x) = 0;
local $_;

Note the last line,

local $_;

It makes $_ local, which means that the current value of $_ is saved and then restored automatically when the block ends, so we don't have to worry about changing its value outside the block. Sometimes it can be considered a good programming style.
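
The difference between my and local is easier to see in a small experiment. The sketch below uses made-up names ($mode, report_mode, try_local, try_my); the only point is which value the called subroutine sees:

```perl
use strict;
use warnings;

our $mode = "global";

sub report_mode
{
	return $mode;	# reads the package (global) variable
}

sub try_local
{
	local $mode = "local";	# visible in called subroutines too
	return report_mode();
}

sub try_my
{
	my $mode = "my";	# visible only inside this block
	return report_mode();
}

print try_local(), "\n";	# local
print try_my(), "\n";	# global
print $mode, "\n";	# global (the old value is restored)
```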

Files

Files can be used as permanent data storage, so that your information is safe between sessions or in the case of server problems. The Perl program can access files for reading, for writing, or for reading and writing at the same time.

File access

A Perl program can read input from the so called Standard Input and it can send output to the Standard Output:

$a = <STDIN>;
@income = <STDIN>;
print "$a, @income";

We can determine if there are more lines available in the STDIN, by comparing the input with undef:

while (defined($x = <STDIN>))
{
	...
}

We can also use a simplified notation, where information is stored in the "invisible" variable $_:

while (<STDIN>)
{
	...
}

What if we want to read information from the file, specified in the program's command line? Once again, there is a simplified notation in Perl:

while (<>)
{
	do something with $_
}

If we specify more than one file in a command line, the program will read all lines from the first one, then from the second one and so on:

myprog file1 file2 file3...

If there are no files in command line, the program will read the standard input.

Note, that we do not care at this point, what the "file" means. This allows us to read data from the file on disk, or from standard input, or even from the output of another program.
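
Because the reading code does not care where the data comes from, it can be moved to a subroutine that accepts any open handle. This is a sketch under a few assumptions: the name read_lines is made up, the handle is stored in a "my" variable instead of a bare name like FILE_HANDLE, and for the demonstration the "file" is simply a string in memory:

```perl
use strict;
use warnings;

# Read all lines from an already opened handle,
# removing the trailing \n from each one
sub read_lines
{
	my ($handle) = @_;
	my @lines = ();

	while (defined(my $line = <$handle>))
	{
		chomp($line);
		push(@lines, $line);
	}

	return @lines;
}

# For the demonstration, the "file" is a string in memory
my $text = "first\nsecond\nthird\n";
open(my $in, '<', \$text) or die "Cannot open: $!";
my @lines = read_lines($in);
close($in);

print scalar(@lines), " lines, last one is '$lines[-1]'\n";
```

The same subroutine can be called as read_lines(\*STDIN) to read the standard input.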

printf, sprintf

We are already familiar (a little) with the print function. Sometimes we need to create output that is formatted in a particular way. We use the printf function. The function receives a format string as the first argument, and list of variables:

printf "%s %10s %8.2f %d\n", $str, $str1, $x, $y;
# same as
printf("%s %10s %8.2f %d\n", $str, $str1, $x, $y);

The sprintf function works similarly, except the output is placed in string, not in file:

$str = sprintf("Hello, %s", $name); 
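
A short sketch of what the most common format fields do (the values are arbitrary): %-10s pads a string on the right to 10 characters, %8.2f prints a number with two decimal places in a field 8 characters wide, and %05d pads an integer with zeros:

```perl
use strict;
use warnings;

my $row = sprintf("%-10s|%8.2f|%05d", "abc", 3.14159, 42);
print "$row\n";
```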

Files on disk

We can open a file using the open() function. This function returns a false value (undef) if the file could not be opened, and a non-zero value otherwise:

if(open(FILE_HANDLE, $file_name)) 
{ 
	... 
	close(FILE_HANDLE);
}

The FILE_HANDLE can be used to access file for as long as it is open (until we call close(FILE_HANDLE)).

Perl is using a special syntax to open file in a different mode. For example:

$file_name = "my_file.txt";
open(FILE_R, $file_name);	# open file for reading
open(FILE_W, ">$file_name");	# open file for writing
open(FILE_A, ">>$file_name");	# appending at the end
open(FILE_RW, "+<$file_name");	# random reading and writing

Use the following list as a reference:

< : read
> : write, create, overwrite if exists
>> : append, create
+< : read, write
+> : read, write, create, overwrite
+>> : read, append, create
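
The most common modes can be tried out with a short sketch. Note the assumptions: it uses the three-argument form of open, where the mode is passed separately from the file name, it keeps handles in "my" variables, and it works in a temporary directory created by the standard File::Temp module, so that no real file is touched:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file_name = "$dir/my_file.txt";

# > : create (or overwrite) and write
open(my $out, '>', $file_name) or die "Cannot write $file_name: $!";
print $out "first\n";
close($out);

# >> : append at the end
open(my $app, '>>', $file_name) or die "Cannot append $file_name: $!";
print $app "second\n";
close($app);

# < : read
open(my $in, '<', $file_name) or die "Cannot read $file_name: $!";
my @lines = <$in>;
close($in);

print @lines;
```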

Cursor position

By definition, each file access mode will set a cursor to a specific position in file, so that the next read or write operation will begin at that position. Except for the append mode, it should be the beginning of the file. To move the file cursor, use the seek(FILE_HANDLE, position, from) function, where position means the offset, in bytes, and "from" can have one of three values: 0 - beginning of the file, 1 - current position and 2 - end of file.
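
Instead of the numbers 0, 1 and 2 we can use the named constants SEEK_SET, SEEK_CUR and SEEK_END from the standard Fcntl module. The sketch below is an illustration only; it writes ten letters to a temporary file (created with File::Temp) and then jumps around in it. The tell() function returns the current cursor position:

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Temp qw(tempfile);

# tempfile() returns a handle that is open for reading and writing
my ($fh, $file_name) = tempfile(UNLINK => 1);
print $fh "abcdefghij";

seek($fh, 3, SEEK_SET);	# 3 bytes from the beginning
read($fh, my $middle, 2);	# reads "de"
my $position = tell($fh);	# the cursor is now at 5

seek($fh, -1, SEEK_END);	# 1 byte before the end
read($fh, my $last, 1);	# reads "j"
close($fh);

print "$middle $position $last\n";
```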

File reading and writing

We can read and write using file handle same way we use standard input and output. In the program, we can write:

$x = <>;

Here <> stands for the "empty angle brackets", and $x will be retrieved from the program input (for example, you have to type it at program's prompt). When you use arrays:

@array = <>;

the input will be retrieved, line-by-line, from the standard input. To end the input cycle, type CTRL-D (if you are using standard shell).

Same way, to read data from the file, we place file handle inside the angle brackets:

$x = <FILE_HANDLE>;
@array = <FILE_HANDLE>;

In this case, the array will contain all lines from the current cursor position till the end of file (file mode must support reading, of course).

The output can be produced in a similar way:

print FILE_HANDLE "x equals $x\n";

Note: ALL information is retrieved, including the \n - the so called end of line symbol. The problem is that "1" and "1\n" are different strings: they will not compare as equal with eq, and the extra line break will show up in your output. To remove the \n character, use chomp($x) (it removes the trailing \n ONLY). You can also use chomp(@array), in which case all array elements are processed.

Note: a more generic chop() function will remove ANY last character, not just \n.
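
A small sketch of the difference (the values are arbitrary). Note that chomp returns the number of characters it removed:

```perl
use strict;
use warnings;

my $line = "42\n";
my $removed = chomp($line);	# removes the \n, returns 1
my $again = chomp($line);	# nothing to remove, returns 0

my $word = "Perl";
chop($word);	# removes the last character, whatever it is

my @rows = ("a\n", "b\n", "c");
chomp(@rows);	# processes every element

print "$line $removed $again $word @rows\n";
```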

End of file

Let's say we want to remove 5th line from the file, containing 10 lines, and let's assume for the sake of simplicity, that all lines are the same length:

a
b
c
d
e
f
g
h
i
j

To do it, we need to:

if(open(FILE, "+<my_file.txt"))	# open for random writing and reading
{
	@array = <FILE>;	# read the data
	seek(FILE, 0, 0);	# move the cursor
	for($i = 0; $i <= $#array; $i++)
	{
		if($i != 4)	# 5th element
		{
			print FILE $array[$i];	
		}
	}
	close(FILE);	# close the file
}

However, when we examine the resulting file, we will see an extra line at the end:

a
b
c
d
f
g
h
i
j
j

The reason is - we never changed the size of a file! To fix this problem, we can do one of two things. First, we can do writing and then reduce the file size to the new value. But in this case we have to do extra calculations to determine what this value should be.

Or we can make the file size equal 0 BEFORE we begin writing, so that the size grows automatically:

if(open(FILE, "+<my_file.txt"))
{
	@array = <FILE>;
	seek(FILE, 0, 0);	# move the cursor to the beginning
	truncate(FILE, 0);	# reduce size to zero
	for($i = 0; $i <= $#array; $i++)
	{
		if($i != 4)
		{
			print FILE $array[$i];	
		}
	}
	close(FILE);
}
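
The first option (write, then reduce the file size) does not really need extra calculations, because tell() already knows where the writing stopped. Here is a sketch of this approach; the remove_line name and the temporary file are assumptions made for the demonstration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Remove the line with the given index (0 based) from a file
sub remove_line
{
	my ($file_name, $index) = @_;

	open(my $fh, '+<', $file_name)
		or die "Cannot open $file_name: $!";

	my @lines = <$fh>;
	seek($fh, 0, 0);	# back to the beginning

	for my $i (0 .. $#lines)
	{
		print $fh $lines[$i] unless $i == $index;
	}

	# cut the file at the current cursor position
	truncate($fh, tell($fh)) or die "Cannot truncate: $!";
	close($fh);
}

# Demonstration on a temporary file with lines a..j
my $dir = tempdir(CLEANUP => 1);
my $file_name = "$dir/my_file.txt";

open(my $out, '>', $file_name) or die "Cannot create file: $!";
print $out "$_\n" foreach ('a' .. 'j');
close($out);

remove_line($file_name, 4);	# removes the 5th line, "e"

open(my $in, '<', $file_name) or die "Cannot read file: $!";
my $content = join('', <$in>);
close($in);
print $content;
```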

Locks

What if you are writing to a file, and someone else is trying to read - right when you set the file size to zero? What if you and someone else are writing at the same time?

To solve this type of conflict, Perl is using LOCKS, via the flock function. For example, the code above can be made safe, by a simple modification:

if(open(FILE, "+<my_file.txt"))
{
	flock(FILE, 2);	# lock the file exclusively
	@array = <FILE>;	# read the data
	seek(FILE, 0, 0);	# move the cursor to the beginning
	truncate(FILE, 0);	# reduce size to zero
	for($i = 0; $i <= $#array; $i++)
	{
		if($i != 4)	# 5th element
		{
			print FILE $array[$i];
		}
	}
	close(FILE);	
}
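
The number 2 in flock(FILE, 2) means "exclusive lock". The standard Fcntl module provides readable names for these numbers (LOCK_SH, LOCK_EX, LOCK_UN). Below is a sketch of a locked append; the append_line name and the temporary log file are made up for the example, and it assumes that your file system supports flock:

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempdir);

# Append one line to a file while holding an exclusive lock
sub append_line
{
	my ($file_name, $line) = @_;

	open(my $fh, '>>', $file_name)
		or die "Cannot open $file_name: $!";
	flock($fh, LOCK_EX) or die "Cannot lock $file_name: $!";

	# someone may have written while we waited for the lock
	seek($fh, 0, SEEK_END);
	print $fh "$line\n";

	close($fh);	# closing the file releases the lock
}

my $dir = tempdir(CLEANUP => 1);
my $file_name = "$dir/log.txt";
append_line($file_name, "first");
append_line($file_name, "second");

open(my $in, '<', $file_name) or die "Cannot read $file_name: $!";
my @lines = <$in>;
close($in);
print @lines;
```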

Making sure file can be accessed

To make sure the file exists, use the -e file test:

if(-e $file_name)
{ ... }

There are many things we can learn about file:

we can read: -r
we can write: -w
we can execute: -x
file exists and has zero size: -z
file exists and has non-zero size: -s, returns size in bytes
it is a file: -f
it is a directory: -d
time since modified, days: -M

and so on.

We can get even more information about a file, using the stat function:

($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size, 
    $atime, $mtime, $ctime, 
	$blksize, $blocks) = stat($file_name);

# for example:

($size, $mtime) = (stat($file_name))[7, 9];
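
To see the file tests and stat side by side, here is a small sketch. It creates a 5-byte file in a temporary directory (the file name and the File::Temp module are assumptions of the example) and then asks a few questions about it:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file_name = "$dir/sample.txt";

open(my $out, '>', $file_name) or die "Cannot create file: $!";
print $out "hello";
close($out);

my $exists  = (-e $file_name) ? 1 : 0;
my $is_file = (-f $file_name) ? 1 : 0;
my $is_dir  = (-d $dir) ? 1 : 0;
my $size    = -s $file_name;	# size in bytes

# elements 7 and 9 of the stat list: size and modification time
my ($stat_size, $mtime) = (stat($file_name))[7, 9];

print "exists=$exists file=$is_file dir=$is_dir size=$size\n";
```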

select

We can change the default file handle that print and write use for output (initially it is the standard output) by calling the select() function:

$old_descriptor = select(NEW_DESCRIPTOR);

It is a VERY good idea to restore the old descriptor after you are done with the new one.
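
A minimal sketch of the select / restore pattern. To keep it self-contained, the "new" output here is a handle that writes into a string in memory (an assumption of the example; any open file handle would do):

```perl
use strict;
use warnings;

my $captured = "";
open(my $buffer, '>', \$captured) or die "Cannot open buffer: $!";

my $old_handle = select($buffer);	# print now goes to $buffer
print "This line goes to the buffer\n";
select($old_handle);	# restore the previous handle
close($buffer);

print "Captured: $captured";
```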

Formatted output

As Perl is used to produce documents, reports and HTML pages, it has a very well developed set of tools for a formatted output. The output format is specified like:


format format_name =
fields,
values,
fields,
values...
.

Think of the formats as of enhanced version of format string used in the sprintf() function: it has information about the data layout and about variables to insert into this layout.

Let's consider a simple example of a format:

format full_name = 
First name: @<<<<<<<<, Last name @<<<<<<<<
$first_name, $last_name
.

Formats can be located in any place of the program, for example, after the subroutines.

To perform the actual output, we call the write function, providing new values through the variables that the format is using:


format INCOME = 
First name: @<<<<<<<<
$first_name
Last name:  @<<<<<<<<
$last_name
Income:     @<<<<<<<<
$income
.

open(INCOME, ">income.txt") || die "cannot open file";
open(PEOPLE, "people.txt") || die "cannot open file";

while(<PEOPLE>)		# John,Smith,10000\n
{
	chomp;
	($first_name, $last_name, $income) = split(/,/);

	write(INCOME);
}

If we need to left justify the output field, we use the @<<<<<...

To have right justified output, we use @>>>>>...

To have centered output, we use @|||||...

To have formatted numeric output, we use pound sign, like @#####.##

To have a string, containing new line characters, use @* instead of @<<<...:


format MULTILINE = 
@*
$multiline_str
.

Format description ends with line containing a single dot.
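
If you want to experiment with field types without creating files, you can use the built-in formline function, which fills a format "picture" and stores the result in the special variable $^A. The helper below (format_line is a made-up name) is only a sketch for such experiments:

```perl
use strict;
use warnings;

# Fill a format picture with values and return the result as a string
sub format_line
{
	my ($picture, @values) = @_;

	$^A = "";	# clear the format accumulator
	formline($picture, @values);
	my $result = $^A;
	$^A = "";

	return $result;
}

# left justified, right justified and centered fields
print format_line("[@<<<<<] [@>>>>>] [@|||||]\n", "ab", "cd", "ef");

# numeric field with two decimal places
print format_line("Income: @####.##\n", 1234.5);
```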

What if we want to have long string (a paragraph), nicely formatted, for example, with lines broken at spaces between words? In this case we can use ^ instead of @:


format PARAGRAPH = 
Name: @<<<<<<<<
$name
Information: ^<<<<<<<<<<<<<<<<<<<<<<<<<
    $comment
    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    $comment
~   ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    $comment
~   ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    $comment
.

In the example above, text is copied from $comment, then the $comment is modified (the text that is already copied is removed), and then the process continues at the next line.

Note the ~ symbol, specifying that the string should only be printed if it is not empty.

Finally, if we have two tildes (~~), the string repeats until $comment has no more characters, therefore we don't have to repeat the last two lines twenty times.

What if we want to print our report on paper and we only want 60 lines per page, and we do not want our report to share pages with other reports AND we want page numbers?

We use page header format, that by default has the same name as a file descriptor, plus "_TOP":


format INCOME_TOP = 
Income report, Page No @<<
$%
.

Here $% is a Perl variable containing page number.

The default name of the format is the same as for the file used for output. This name is stored in a variable called $~ and can be changed:

$old_file = select NEW_FILE;
$~ = "NEW_FORMAT";
select($old_file);
write(NEW_FILE);

The operations above make it possible to write to NEW_FILE using the NEW_FORMAT format.

The format of the page header can be changed through the $^ variable, $= contains the page length (default is 60 lines), and finally, the number of lines left on the current page is stored in $-.

Passing parameters to Perl programs

First of all, we can use CGI module, that already has a function called param(). This function works with arguments, passed in the following format: http://your_site/program.cgi?arg1=value1&arg2=value2...

To use CGI module we have to "include" it in our program:

use CGI;

# or if we only need a param():

use CGI qw(param);

$name = param("name");

Of course, the example above expects one of the parameters passed to our program to be "name":

name="John"

Another approach is using the environment string, containing a command line:


$command_line = $ENV{"QUERY_STRING"};

@parameters = split(/[&=]/, $command_line);

Finally, sometimes parameters are passed as part of the @ARGV array.
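
The split above gives us a flat list of names and values. It is often more convenient to have a hash, and to decode the special characters (+ for a space, %XX for other symbols) at the same time. Here is a sketch of such a parser; parse_query is a made-up name, and the sample string stands for the contents of $ENV{"QUERY_STRING"}:

```perl
use strict;
use warnings;

# Turn "a=1&b=2" into a hash, decoding + and %XX
sub parse_query
{
	my ($query_string) = @_;
	my %params = ();

	foreach my $pair (split(/&/, $query_string))
	{
		my ($name, $value) = split(/=/, $pair, 2);
		$value = "" unless defined($value);

		foreach ($name, $value)
		{
			tr/+/ /;
			s/%([a-fA-F0-9]{2})/chr(hex($1))/eg;
		}

		$params{$name} = $value;
	}

	return %params;
}

my %params = parse_query("name=John+Smith&note=100%25+sure");
print "$params{name}, $params{note}\n";
```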

Examples

Important !

In this section we will apply our knowledge of Perl programming to the real life tasks, creating the Perl scripts, step by step.

Please note, that these scripts are copyrighted. You can use them free of charge ONLY if you place the link to http://snowcron.com from your page(s) that use these scripts AS WELL as from all pages generated by these scripts. Depending on the contents of your web site and your generosity, you can decide what text to use for the link, but it should be visible and reasonably attractive. For example, let's say your web site is about NLP. There is an NLP section at http://snowcron.com, so the link should be something like NLP and Hypnosis tutorials. If, on the other hand, your web site has something that is not part of http://snowcron.com (for example, accounting), you might use "recommended link", "web site enhancement" (scripts ARE web site enhancements) or something like this.

You can, if you prefer, link directly to one of the sub domains of http://snowcron.com, like http://nlp.snowcron.com, http://trader.snowcron.com and so on (see http://snowcron.com for the full list).

Note that placing a link like this is a common practice in the Net. People link to their Webmasters, sponsors and so on.

Note also that linking to a high traffic site, especially if you have a site with a similar topic, will increase your site's ranking in major search engines as well. For more information, check out the eCommerce information Web site.

Web counter

Designing a counter we need

We need a web counter that is represented as a small piece of code included in the HTML page (via SSI, server side includes) and is dynamically replaced by the image or text representation of a number.

The CGI program (counter itself) is a Perl program running at our site.

The counter collects some additional data. It can be rather broad, but for this project let's just keep track of UNIQUE visitors, where UNIQUE means that there was no visitor with the same IP number within the last 24 hours.

The statistics (number of unique visits) is collected separately for each file within the web site, but the information is stored in one file, so that we don't have multiple files floating around.

Also, as we want to have historical data and we cannot monitor our statistics file constantly (we can go on vacations), we want to receive weekly emails with statistics.

Counter setup

First of all, we want our SSI to work with HTM and HTML files. If the defaults on your web server are different, you may need to create an .htaccess file, containing the following two lines:

AddHandler server-parsed .html
AddHandler server-parsed .htm

and to place it in the same directory.

Second, in the HTML file that you want to be "counted", place the following code:

<!--#exec cgi="/counter.cgi" -->

provided the counter file is called counter.cgi and its location is at /counter.cgi. Note: SSI will not allow you to access higher level directories, so it is better to place the counter at the root.

Counter Perl code

The following line should be included at the beginning of your program, to specify that it is a Perl program. If your Perl is located at a different place, you should edit this line:

#!/usr/bin/perl

Next line is included to let the system know, that this program will produce HTML at the browser's request:

print "Content-type:text/html\n\n";

The next few lines are used to fine-tune the counter: how many digits to display, where to create statistics file, and should graphical or text counter be used (or should it be invisible):

$number_of_digits = "";
$end = ".gif";

$pathtocounter = "/home/mydomain/counter_data";
$pathtoimages = "http://subdomain.mydomain.com/images/";

$graphics = "no";
$hidden = 1;

Get information about the visitor and about the file that called a counter:

$ref = $ENV{'DOCUMENT_NAME'};	# index.html
$addr = $ENV{'REMOTE_ADDR'};	# 64.230.106.19

# /home/mydomain/subdomain.mydomain.com
$root = $ENV{'DOCUMENT_ROOT'};	

# /subfolder/ or / if no http://.../, 
# like /subfolder/index.html if http://.../index.html

$uri = $ENV{'REQUEST_URI'};		

Get visitors and counter file names for this counter

$counter_file_name = "$pathtocounter/counter.txt";

# /subfolder/ or /subfolder/index.html ?
$file_name_pos = index($uri, "/$ref");	

There are two scenarios, in case the user is visiting the index.htm(l) file. First, he can type http://subdomain.mydomain.com/subfolder/index.html and second, he can type http://subdomain.mydomain.com/subfolder/ and let the server substitute the default index file name.

In both cases, we are going to create a temporary file, called zzz_file_name.visitors to keep there list of IP addresses for the last 24 hours. For example, for the file perl_tutorial.htm, the name for the temporary file will be zzz_perl_tutorial.htm.visitors

The zzz_ prefix is very convenient, if you use FTP client to access your site - when sorted alphabetically, zzz_ files are grouped at the end. The .visitors extension will close the obvious security hole, when someone decides to create a .cgi file on your site.

if($file_name_pos == -1)	# /calendar/
{
	$counter_id = "$root$uri$ref";
	$visitors_file_name = "$root$uri" . "zzz_$ref.visitors";
}
else	# /calendar/index.html
{
	# extract /calendar/
    $subdir = substr($uri, 0, $file_name_pos);	
	
	$counter_id = "$root$subdir/$ref";
	$visitors_file_name = 
        "$root$subdir" . "/zzz_$ref.visitors";
}

There are faster ways of determining how old is a file, but we may need a nicely formatted date at some point, so let's get it:


@shortmonths = ("Jan","Feb","Mar","Apr","May","Jun",
    "Jul","Aug",
	"Sep","Oct","Nov","Dec");
($sec,$min,$hr,$mday,$mon,$year,$wday,$yday,$isdst) 
    = localtime(time);
$longyr = $year + 1900;
		
# 03-Oct-1999
$today = "";
$today = $mday."-".$shortmonths[$mon]."-".$longyr;

($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,
    $mtime,$ctime,
	$blksize,$blocks) = stat($visitors_file_name);
($sec,$min,$hr,$mday,$mon,$year,$wday,$yday,$isdst) 
    = localtime($mtime);
$longyr = $year + 1900;

$modified = "";
$modified = $mday."-".$shortmonths[$mon]."-".$longyr;
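
Note that the code above repeats the same date formatting twice, and that the day is printed without a leading zero (3-Oct-1999, not 03-Oct-1999). If this matters to you, the formatting can be moved to a small subroutine. This is only a sketch: format_date is a made-up name, and the standard Time::Local module is used here just to build a known date for the demonstration:

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

my @shortmonths = ("Jan","Feb","Mar","Apr","May","Jun",
	"Jul","Aug","Sep","Oct","Nov","Dec");

# Convert a time value (seconds) to a string like 03-Oct-1999
sub format_date
{
	my ($time) = @_;
	my ($mday, $mon, $year) = (localtime($time))[3, 4, 5];

	return sprintf("%02d-%s-%d",
		$mday, $shortmonths[$mon], $year + 1900);
}

# noon of October 3, 1999 (months are counted from 0)
my $sample = timelocal(0, 0, 12, 3, 9, 1999);
print format_date($sample), "\n";

# today's date, same format
print format_date(time), "\n";
```

With a subroutine like this, the comparison of the two dates becomes format_date(time) eq format_date($mtime).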

Next, we need to open an existing statistics file, and to read the counter value. The statistics file looks like this:

/home/mydomain/subdomain.mydomain.com/subfolder/file_name.htm 405
... data for other files ...

A little explanation. There are two ways to specify a path to file. First, via the http://... if the file is accessible from the Internet. And second, through the Unix-type file path. It is important to keep in mind, that file i/o functions only work with the local Unix-type path. The advantage of this limitation is the fact, that the counter, located at subdomain.domain.com can write to domain.com. We use it, by keeping all statistics, for all sub domains in one file located at counter_data directory, which, by the way, is not accessible from the Internet.

It is possible, that your server has different path defaults, so you may need to modify the "home/... " line.


@counters = ();
$count = 0;
$counter_num = 0;
$counter_exists = 0;

if(open(COUNTER_F, "+<$counter_file_name"))
{
	&lock(COUNTER_F);

	@counters = <COUNTER_F>;
	chomp(@counters);

	for($i = 0; $i <= $#counters; $i++)
	{
		$space = index($counters[$i], " ");

		# you can use the split() instead
		$cnt_id = substr($counters[$i], 0, $space);	

		if($cnt_id eq $counter_id)
		{
			$counter_exists = 1;
			$count = substr($counters[$i], $space + 1);
			$counter_num = $i;
			last;
		}
	}
}
# if no file - create one
elsif(open(COUNTER_F, ">$counter_file_name"))	
{
	&lock(COUNTER_F);
}
else		# failed completely
{
	exit;
}

if($counter_exists == 0)
{
	push(@counters, "$counter_id 0");
	$counter_num = $#counters;
}

What we just did: first, we read the file and split the lines to path and counter value.

Second, if we found a record for the file that called a counter, we memorize its counter, otherwise we insert a new record for it. The counter is not yet increased, as we have to make sure this visitor is unique.
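
As the comment in the code says, split() can be used instead of index() and substr(). If we also store the result in a hash, the search loop is not needed at all. Below is a sketch of this alternative; parse_counters and the sample records are made up, and, like the original code, it assumes that the path contains no spaces:

```perl
use strict;
use warnings;

# Turn "path count" records into a hash: path => count
sub parse_counters
{
	my %count_for = ();

	foreach my $line (@_)
	{
		my $record = $line;
		chomp($record);
		next if $record eq "";

		my ($id, $count) = split(/ /, $record, 2);
		$count_for{$id} = $count;
	}

	return %count_for;
}

my @records = (
	"/home/mydomain/subdomain.mydomain.com/index.htm 405\n",
	"/home/mydomain/subdomain.mydomain.com/links.htm 17\n",
);

my %count_for = parse_counters(@records);
my $counter_id = "/home/mydomain/subdomain.mydomain.com/links.htm";

# a missing record simply means the counter starts at 0
my $count = exists($count_for{$counter_id})
	? $count_for{$counter_id} : 0;

print "$counter_id: $count\n";
```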

Get the daily (unique) visitors


$is_unique = 1;

@visitors = ();
if($modified eq $today and open(VISITORS_F, 
    "+<$visitors_file_name"))
{
	&lock(VISITORS_F);
	@visitors = <VISITORS_F>;

	chomp(@visitors);

	for($i = 0; $i <= $#visitors; $i++)
	{
		if($visitors[$i] eq $addr)
		{
			$is_unique = 0;
			last;
		}
	}
}
elsif(open(VISITORS_F, ">$visitors_file_name"))
{
	&lock(VISITORS_F);
}
else
{
	exit;
}

At this point we know if the visitor was unique. Then we can increase the counter.


if($is_unique == 1)
{
	push(@visitors, $addr);

	# Increase Count

	$count += 1;

	$counters[$counter_num] = "$counter_id $count";

	seek(COUNTER_F, 0, 0);
	for($i = 0; $i <= $#counters; $i++)
	{
		print COUNTER_F "$counters[$i]\n";
	}

	seek(VISITORS_F, 0, 0);
	for($i = 0; $i <= $#visitors; $i++)
	{
		print VISITORS_F "$visitors[$i]\n";
	}
}

Close files and release all locks.


close(VISITORS_F);
close(COUNTER_F);

If the counter is visible, we need to display it.


if($hidden == 0)
{
	# count existing digits (like 3 in 101, 5 in 10102...
	@digits = split(//, $count);

	if ($number_of_digits eq "") 
	{
		$n_digits = @digits;
	} 
	else 
	{
		$n_digits = $number_of_digits;
	}

	# Fill the remaining empty spaces

	$format = '%0' . $n_digits . 'd';
	$count = sprintf("$format", $count);

	@images = split(//, $count);

	if($graphics eq "yes") 
	{
		foreach $image (@images) 
		{
			$str_image = 
                "<img src=\"$pathtoimages$image$end\"> ";
			print ("$str_image");
		} 
	}
	else 
	{
		print ("$count");
	}
}	# if not hidden
	

Finally, the lock() subroutine:


sub lock
{
	my($file) = @_;

	# exclusive lock
	flock($file, 2);

	# Make sure we are at the beginning	
    # seek($file, 0, 0);
}

Cron job

A cron job is a Unix task that is performed regularly by the timer. If we want to receive counter statistics by email, weekly, then we have to create a program that will do the mailing and set it up as one of our cron jobs. I am not going to explain the details of setting up a cron job here, as it is very simple and there are many tutorials on the Internet.

Let's take a look at the code, instead. We presume, that the program is located in the cron_job directory on our server. It is better, of course, to make it not accessible from the Internet.

Standard Perl header:


#!/usr/bin/perl
print "Content-type:text/html\n\n";

Specify the path to the directory, no trailing slash "/", to the file containing your email, and to the mail program:


$dir = "/home/snowcron/counter_data";
$email_file = "$dir/counter_mail.txt";
$mailprog = '/usr/sbin/sendmail';

Next, we need to read the statistics file, to open a pipe (a data feeding connection) to the Unix mailer, called sendmail, and to print data to this pipe. The moment we close the pipe, the sendmail will send a message to our mailbox.


if(open(MAIL_F, $email_file))
{
	$email = <MAIL_F>;
	close(MAIL_F);

	chomp($email);

	$subject = "Counter report for my counters";

	$body = "";

	$counter_file = "$dir/counter.txt";

	if(open(COUNTER_F, $counter_file))
	{
		@counters = <COUNTER_F>;
		close(COUNTER_F);

		for($i = 0; $i <= $#counters; $i++)
		{
			$body = $body.$counters[$i];		
		}

		&sendmail($email, $subject, $body);
	}
}

Finally, the mailer itself:


sub sendmail
{
	my($recipient, $subject, $body) = @_;

	open(MAIL, "|$mailprog -t -rSnow") or return 0;

	print MAIL "To: $recipient\n";

	# Subject line. Note that the header section 
    # ends with two \n\n's

	print MAIL "Subject: $subject\n\n";

	print MAIL $body;

	close(MAIL);

	return 1;
}

That's it! Now, every time you call this program (for example, from the cron job), you will receive an email with your up to date statistics.

Simple link manager

In the following chapter I am going to create a very simple link manager. It does not work with reciprocal links, as that would require using Perl modules for reading files from the Internet, which is beyond the scope of this book. Instead, the visitor can use a simple form to add his URL to your list of "useful links".

With only slight modification, the same code will work for the guest book or Internet forum.

Design

There is a file links.htm somewhere on your site. In this file, you can use any HTML decorations you want. The only requirement is that you place the <!-- links_section --> text. The program will read the text, insert a new link (received from the form) after the "link section" tag, thank the user and exit.

The HTML form itself looks like this (you may want to take a look at the source code as well). Note, that more than one sub domain can be specified, and therefore, you can manage more than one links.htm files, like http://nlp.snowcron.com/links.htm for NLP - related links, http://trader.snowcron.com/links.htm for stock trading related links and so on.

Perl code

Standard Perl header:


#!/usr/bin/perl
print "Content-type:text/html\n\n";

Path information


$site_path = "mydomain.com";
$data_path = "/home/mydomain";

As we receive data from the form, we need to load it into a hash called FORM for further use. Note that some minimum security is also implemented.


read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs) 
{
	($name, $value) = split(/=/, $pair);
	$value =~ tr/+/ /;  
	$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/
        pack("C", hex($1))/eg;
	$FORM{$name} = $value;
}

Remove suspicious characters.


$link = &verify_url($FORM{'link'});

# something was removed during verification
if($link ne $FORM{'link'})	
{
	&wrong_url($link);
	exit;
}

$backlink = &verify_url($FORM{'backlink'});

# something was removed during verification
if($backlink ne $FORM{'backlink'})	
{
	&wrong_url($backlink);
	exit;
}

$label = &verify_text($FORM{'label'});

# something was removed during verification
if($label ne $FORM{'label'})	
{
	&wrong_text("text");
	exit;
}

$legend = &verify_text($FORM{'legend'});

# something was removed during verification
if($legend ne $FORM{'legend'})	
{
	&wrong_text("additional text");
	exit;
}

Now we need to read the file (one of the links.htm files) from a particular sub domain (specified by "category"), and write it back, together with the new link that the visitor provided:


$category = $FORM{'category'};
$links_file_name = "$data_path/$category.$site_path/links.htm";

if(open(LINKS_F, "+<$links_file_name"))
{
	&lock(LINKS_F);

	# lock() (see below) leaves the cursor at the end 
	# of the file, so move it back before reading
	seek(LINKS_F, 0, 0);

	@links = <LINKS_F>;

	chomp(@links);

	@new_links = ();

	for($i = 0; $i <= $#links; $i++)
	{
		push(@new_links, $links[$i]);

		if($links[$i] eq "<!-- links_section -->")
		{
			$new_link = 
				"<p><a href=\"$link\">$label</a> $legend";
			push(@new_links, $new_link);
		}
	}

	seek(LINKS_F, 0, 0);
	for($i = 0; $i <= $#new_links; $i++)
	{
		print LINKS_F "$new_links[$i]\n";
	}

	close(LINKS_F);

	print "<html>\n<head>\n</head>\n<body>\n
		Link added successfully\n
		<p>Use your browser's BACK button to return\n
		</body>\n</html>\n";
}
else
{
	&no_url($links_file_name);
	exit;
}

Finally, subroutines to lock files, verify data and so on. Notice, that the verification subroutines are simply removing all "dangerous" characters from the input.


sub lock
{
	my($file) = @_;

	# This locks the file so no other CGI can 
    # write to it at the same time...
	flock($file, 2);

	# Reset the file pointer to the end of the 
    # file, in case 
	# someone wrote to it while we waited for the lock...
	seek($file, 0, 2);
}

####### verify_url($url)

sub verify_url
{
	my($url) = @_;

	$url =~ s/[\<\>\"\'\%\;\)\(\s\&\+]//g;

	return $url;
}

####### verify_text($text)

sub verify_text
{
	my($text) = @_;

	$text =~ s/[\<\>\"\'\%\;\)\(\&\+]//g;

	return $text;
}

########## Quit if wrong url

sub wrong_url
{
	my($url) = @_;

	print "<head>\n</head>\n<body>\n
		Unrecognized characters in URL: $url\n
		<p>Please try again.\n
		<p><a href=\"http://$site_path/link_ex.htm\">
        Link Exchange Page</a>\n
		</body>\n";
}

########## Quit if wrong text

sub wrong_text
{
	my($text) = @_;

	print "<head>\n</head>\n<body>\n
		Unrecognized characters in $text\n
		<p>Please try again.\n
		<p><a href=\"http://$site_path/link_ex.htm\">
        Link Exchange Page</a>\n
		</body>\n";
}

########## Quit if file not found

sub no_url
{
	my($url) = @_;

	print "<head>\n</head>\n<body>\n
		Cannot find URL: $url\n
		<p>Please try again.\n
		<p><a href=\"http://$site_path/link_ex.htm\">
        Link Exchange Page</a>\n
		</body>\n";
}

About adding a reciprocal link

In real life you would probably want to place someone else's link on your page only if they link to you first. To do so you need to modify the code above, so that it loads this "someone else's" links page and looks for the link to your site. Perl has modules to read files using the HTTP protocol, so this task is not that difficult. However, as this tutorial does not cover Perl modules, it is not dealing with reciprocal links either.

Uploading files

No matter if you are working with a resume processing center, or designing a dating web site, at some point you might want to allow the visitor to upload files to your site. Using Perl together with HTML forms allows you to do it.

Design issues

First of all, we need a form of a particular type, one that allows us to send data as binary. To do it, use something like:

<form action="http://subdomain.yourdomain.com/upload.cgi" method="POST" ENCTYPE="multipart/form-data">

where the ENCTYPE="multipart/form-data" does the trick.

Let's say we need to upload an image. In the form, among other fields, include the following:

Photo 1 (there can be photo 2 and so on): <input type="file" name="photo_1" value="" size=50>

The input type = "file" will create a text field together with the file lookup button.

Let's take a look at the Perl code that works with this form.

Perl code

Standard Perl header:


#!/usr/bin/perl
print "Content-type:text/html\n\n";

We are going to use a Perl module called CGI - it makes working with forms easier. We also need to make sure that the visitor does not upload a 10-megabyte file, so we set the limit to 30 K and we also set the $CGI::POST_MAX variable (the CGI:: simply means that we look in the CGI module for this variable) to this limit.

Let's also provide some useful path information:


use CGI;

$MAX_SIZE_UPLOAD = 30; # Max size of image, Kb
$CGI::POST_MAX = 1024 * $MAX_SIZE_UPLOAD;

$site_path = "http://subdomain.mydomain.com";
$data_path = "/home/mydomain/data";
$data_site_path = "/home/mydomain/subdomain.mydomain.com";
$site_url = "mydomain.com";

$FORM_ADD_URL = 'http://subdomain.mydomain.com/add.htm';

# upload only these extensions
$extensions = '(\.gif|\.jpg|\.png)';	# single quotes keep the backslashes, so "\." matches a literal dot
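
To make the arithmetic of the limit concrete, here is a small sketch. The helper post_too_large is hypothetical and is not part of the CGI module; it only compares a request size with the limit, where the size would normally come from the CONTENT_LENGTH environment variable that the web server sets for POST requests.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $MAX_SIZE_UPLOAD = 30;                      # limit in kilobytes
my $max_bytes       = 1024 * $MAX_SIZE_UPLOAD; # 30720 bytes

# Hypothetical helper: returns 1 if the request body is larger
# than the limit, 0 otherwise.
sub post_too_large
{
	my($content_length, $limit_bytes) = @_;

	# A missing CONTENT_LENGTH is treated as an empty request
	$content_length = 0 unless defined $content_length;
	return $content_length > $limit_bytes ? 1 : 0;
}

print "Limit: $max_bytes bytes\n";
print "10 MB upload rejected\n" if post_too_large(10 * 1024 * 1024, $max_bytes);
print "20 KB upload accepted\n" unless post_too_large(20 * 1024, $max_bytes);
```

In a CGI script the first argument would be $ENV{CONTENT_LENGTH}. The value covers the whole request, so a file slightly under 30 K can still push the form over the limit.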

Now we need to create an object of type CGI, called $query. By doing so we gain access to the functions of this object. The syntax for this access is different from what you already know: it is $query->function() instead of simply function(). Other than that, there is no difference:

my $query = CGI->new;
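
If the arrow syntax is new to you, the following sketch may help. It does not use the CGI module at all; Counter is a hypothetical class written only to show how an object is created and how its functions are called.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class, used only to show the arrow syntax.
package Counter;

sub new
{
	my($class) = @_;
	my $self = { count => 0 };     # the data that belongs to the object
	return bless $self, $class;
}

sub increment
{
	my($self) = @_;                # the object itself arrives as the first argument
	$self->{count}++;
	return $self->{count};
}

package main;

my $counter = Counter->new;        # create the object
$counter->increment();             # call a function of the object
print $counter->increment(), "\n"; # prints 2
```

The object is passed to the function automatically, which is why increment() finds its own counter without any global variables.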

"Action" is specified in the form, in the HTML file; it is associated with the submit button.

First of all, we do not want our CGI to process data from the wrong form. Second, sometimes it is convenient to process more than one form in the same CGI (for example, add image and delete image). We need a way to distinguish.

In our form, the submit button looks like this:

< input type="submit" Name="action" value="add">

The value equals "add".


$action = $query->param('action');

if($action eq "add")	# only image upload code provided
{
	### Uploading image with the name stored in $photo_1

	# File name as sent by the browser, without any directory part
	$photo_1 = $query->param('photo_1');
	$photo_1 =~ s/.*[\/\\]//;

	if(($photo_1_ex = &verify_file_name($photo_1)) ne "")
	{
		&upload($query, 'photo_1', 
            "$data_site_path/photos/$photo_1");
	}

	#### End of image upload

}	# end of if("add")
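
When one CGI has to serve several forms, a chain of if statements grows quickly. One possible alternative, shown here only as a sketch, is a hash that maps each action to the subroutine handling it. The handler names and the "delete" action are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handlers; a real script would upload or delete a file here.
sub add_image    { return "adding image"; }
sub delete_image { return "deleting image"; }

# Maps the value of the "action" field to the subroutine that handles it.
my %handlers = (
	add    => \&add_image,
	delete => \&delete_image,
);

sub dispatch
{
	my($action) = @_;
	$action = "" unless defined $action;

	# Unknown or missing actions are refused instead of being guessed
	return "unknown action" unless exists $handlers{$action};
	return $handlers{$action}->();
}

print dispatch("add"), "\n";
print dispatch("rename"), "\n";
```

In the CGI itself the argument would be $query->param('action'). Data coming from the wrong form falls into the "unknown action" branch.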

What happens here? First, we make sure that the file name has a proper extension. Then we call the upload() subroutine. Both subroutines are listed below:


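sub verify_file_name
{
	my($src_file_name) = @_;

	# No file was selected in the form
	return "" unless defined $src_file_name;

	# The last 4 characters: the dot and the 3-letter extension
	$subs = substr($src_file_name, -4);

	# Reject names with unsafe characters or with a wrong extension
	if(($src_file_name =~ /[\<\>\"\'\%\;\)\(\s\&\+]/) 
        || ($subs !~ /^$extensions$/))
	{
		return "";
	}
	else
	{
		return $subs;
	}
}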

###

sub upload  
{
	my($query, $src_file_name, $dst_file_name) = @_;
    	
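	$size = $bytes = 0;
	$_ = $query->param($src_file_name);		# file name as sent by the browser
	$file_query = $query->upload($src_file_name);	# file handle to read the uploaded data from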

	s/\w://;
	s/([^\/\\]+)$//;
	$_ = $1;
	s/\.\.+//g;
	s/\s+//g;
	$file_name = $_;
		
	if (! $file_name) 
	{
		&error("Bad file name : $file_name", 1);
	}

    	open(FILE, ">$dst_file_name") || 
		&error("Error opening file $dst_file_name, error $!", 1);
    	binmode FILE;
    	while($bytes = read($file_query, $buff, 2096)) 
	{
    		$size += $bytes;
        	print FILE $buff;
    	}
    	close(FILE);

    	if ((stat $dst_file_name)[7] <= 0) 
	{
		unlink($dst_file_name);
		return 0;
    	} 
	
	return 1;
}

###

sub error 
{
	my($errortext, $exit) = @_;
    	print "\n$errortext\nBack\n\n";

	if($exit) 
	{ 
		exit; 
	}
}
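
To see what the file name checks do, it helps to run them on a few sample names. The sketch below is a standalone variation on the code above, not a replacement for it: clean_file_name and allowed_extension are hypothetical names, the sample file names are made up, and unlike verify_file_name the extension check here ignores letter case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mirrors the cleaning steps of upload(), written as a function
# that returns the cleaned name.
sub clean_file_name
{
	my($name) = @_;

	$name =~ s/^\w://;                   # drop a drive letter such as "C:"
	my($base) = $name =~ /([^\/\\]+)$/;  # keep the part after the last / or \
	return "" unless defined $base;

	$base =~ s/\.\.+//g;                 # remove runs of two or more dots
	$base =~ s/\s+//g;                   # remove whitespace
	return $base;
}

# Returns the lower-cased extension if it is allowed, "" otherwise.
sub allowed_extension
{
	my($name) = @_;
	return $name =~ /(\.gif|\.jpg|\.png)$/i ? lc($1) : "";
}

foreach my $name ('C:\photos\me.jpg', '/tmp/../etc/passwd', 'my photo.PNG', 'script.cgi')
{
	my $clean = clean_file_name($name);
	printf "%-22s -> %-12s ext: '%s'\n", $name, $clean, allowed_extension($clean);
}
```

Only the last part of the path survives the cleaning, so a visitor cannot make the script write outside of the photos directory by sending a name with "../" in it.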

Conclusion

This was a brief introduction to the basics of Perl programming. You already know enough to create professional programs and to enhance your web site with scripts that are customized for you.

Keep in mind, however, that there is more to Perl. First of all, it supports multitasking: you can create several threads of execution within one program.

Second, it has many additional libraries and modules; some of them are very powerful, and some are very specialized. For example, a large part of Perl deals with network programming, sockets and protocols.

Finally, Perl is still evolving, and as new versions are released, more features will become available.

Good luck.
