frequencies in the string to be searched--you probably want to compare runtimes with and without
it to see which runs faster. Those loops which scan for many short constant strings (including the
constant parts of more complex patterns) will benefit most. You may have only one study active at
a time--if you study a different scalar the first is "unstudied". (The way study works is this: a
linked list of every character in the string to be searched is made, so we know, for example, where
all the k characters are. From each search string, the rarest character is selected, based on some
static frequency tables constructed from some C programs and English text. Only those places that
contain this "rarest" character are examined.)
For example, here is a loop which inserts index-producing entries before any line containing a
certain pattern:
    while (<>) {
        study;
        print ".IX foo\n" if /\bfoo\b/;
        print ".IX bar\n" if /\bbar\b/;
        print ".IX blurfl\n" if /\bblurfl\b/;
        ...
        print;
    }
In searching for /\bfoo\b/, only those locations in $_ that contain f will be looked at, because f
is rarer than o. In general, this is a big win except in pathological cases. The only question is
whether it saves you more time than it took to build the linked list in the first place.
Note that if you have to look for strings that you don't know till runtime, you can build an entire
loop as a string and eval that to avoid recompiling all your patterns all the time. Together with
undefining $/ to input entire files as one record, this can be very fast, often faster than specialized
programs like fgrep. The following scans a list of files (@files) for a list of words (@words), and
prints out the names of those files that contain a match:
    $search = 'while (<>) { study;';
    foreach $word (@words) {
        $search .= "++\$seen{\$ARGV} if /\\b$word\\b/;\n";
    }
    $search .= "}";
    @ARGV = @files;
    undef $/;
    eval $search;       # this screams
    $/ = "\n";          # put back to normal input delim
    foreach $file (sort keys(%seen)) {
        print $file, "\n";
    }
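The same idea of generating code once and compiling it with eval can be sketched in a self-contained form. This is only an illustration under stated assumptions: it uses Perl 5 syntax (my, anonymous subs), the in-memory %content hash stands in for real files, and the names %content, $code and $matcher are hypothetical. It also quotes each word with \Q...\E so that metacharacters in a word are matched literally, which the loop above does not do.

```perl
use strict;
use warnings;

# Hypothetical inputs: in-memory "files" keyed by name, so the sketch
# needs nothing on disk.
my %content = (
    'a.txt' => "the foo is here\n",
    'b.txt' => "nothing of note, only food\n",
    'c.txt' => "a bar\nand more\n",
);
my @words = qw(foo bar);

# Build the text of an anonymous subroutine: one pattern test per word.
my $code = "sub {\n  my (\$name, \$text) = \@_;\n";
for my $word (@words) {
    $code .= "  return 1 if \$text =~ /\\b\Q$word\E\\b/;\n";
}
$code .= "  return 0;\n}\n";

# Compile the generated code once; every pattern in it is then a constant.
my $matcher = eval $code or die "generated code failed to compile: $@";

# Collect the names of the "files" that contain at least one of the words.
my @seen = sort grep { $matcher->($_, $content{$_}) } keys %content;
print "$_\n" for @seen;     # prints a.txt and c.txt
```

Here b.txt is skipped because "food" does not satisfy the \b word boundaries around foo.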
tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds
Translates all occurrences of the characters found in the search list with the corresponding
character in the replacement list. It returns the number of characters replaced or deleted. If no
string is specified via the =~ operator, the $_ string is translated. (The string specified with =~
or !~ must be a scalar variable, an array element, or an assignment to one of those, i.e. an lvalue.) For
sed devotees, y is provided as a synonym for tr. If the SEARCHLIST is delimited by bracketing
quotes, the REPLACEMENTLIST has its own pair of quotes, which may or may not be bracketing
quotes, e.g. tr[A-Z][a-z] or tr(+-*/)/ABCD/.
If the c modifier is specified, the SEARCHLIST character set is complemented. If the d modifier is
specified, any characters specified by SEARCHLIST that are not found in REPLACEMENTLIST
are deleted. (Note that this is slightly more flexible than the behavior of some tr programs, which
delete anything they find in the SEARCHLIST, period.) If the s modifier is specified, sequences of
characters that were translated to the same character are squashed down to 1 instance of the
character.
If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly as specified.
Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST, the final character is
replicated till it is long enough. If the REPLACEMENTLIST is null, the SEARCHLIST is
replicated. This latter is useful for counting characters in a class, or for squashing character
sequences in a class.
Examples:
    $ARGV[1] =~ y/A-Z/a-z/;         # canonicalize to lower case
    $cnt = tr/*/*/;                 # count the stars in $_
    $cnt = tr/0-9//;                # count the digits in $_
    tr/a-zA-Z//s;                   # bookkeeper -> bokeper
    ($HOST = $host) =~ tr/a-z/A-Z/;
    y/a-zA-Z/ /cs;                  # change non-alphas to single space
    tr/\200-\377/\0-\177/;          # delete 8th bit
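To see how the c, d and s modifiers and the padding rule interact, the following short sketch applies each to a sample string and prints the results. It assumes Perl 5 syntax (my, use strict); the variable names and sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $text = "bookkeeper 2024";

# Empty REPLACEMENTLIST: nothing changes, but matching characters are counted.
my $digits = ($text =~ tr/0-9//);                   # 4

# d: SEARCHLIST characters with no counterpart are deleted.
(my $no_vowels = $text) =~ tr/aeiou//d;             # "bkkpr 2024"

# s: runs translated to the same character are squashed to one.
(my $squashed = $text) =~ tr/a-z//s;                # "bokeper 2024"

# c with s: every run of non-letters becomes a single space.
(my $words = "foo, bar!!baz") =~ tr/a-zA-Z/ /cs;    # "foo bar baz"

# Without d, a short REPLACEMENTLIST is padded with its final character.
(my $padded = "abcde") =~ tr/a-e/xy/;               # "xyyyy"

print "$digits|$no_vowels|$squashed|$words|$padded\n";
```

The parenthesized assignment form, as in ($HOST = $host) above, copies the string first so the original is left untouched.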
System Interaction Routines
alarm(SECONDS)
alarm SECONDS
Arranges to have a SIGALRM delivered to this process after the specified number of seconds
(minus 1, actually) have elapsed. Thus, alarm(15) will cause a SIGALRM at some point more than
14 seconds in the future. Only one timer may be counting at once. Each call disables the previous
timer, and an argument of 0 may be supplied to cancel the previous timer without starting a new
one. The returned value is the amount of time remaining on the previous timer.
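A common use of alarm is to put a time limit on an operation. The sketch below is one way to do that, assuming Perl 5 on a Unix-like system where SIGALRM can be caught through %SIG; run_with_timeout is a hypothetical helper name, not a built-in.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds.
# Returns 1 if $code finished, 0 if the timer fired first.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $finished = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };  # runs when SIGALRM arrives
        alarm($seconds);    # start the timer (replaces any previous one)
        $code->();
        alarm(0);           # finished in time: cancel the timer
        1;
    };
    alarm(0);               # leave no timer running if $code died
    return 1 if $finished;
    die $@ unless $@ eq "timeout\n";    # re-throw errors that are not ours
    return 0;
}

my $quick = run_with_timeout(5, sub { my $x = 1 + 1 });
my $slow  = run_with_timeout(1, sub { sleep 10 });
print "quick=$quick slow=$slow\n";      # prints "quick=1 slow=0"
```

Because only one timer may be counting at once, this helper would clobber any alarm the caller had already set; code that needs nested timeouts would have to save and restore the value returned by alarm.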
chdir(EXPR)
chdir EXPR
Changes the working directory to EXPR, if possible. If EXPR is omitted, changes to the home
directory. Returns 1 upon success, 0 otherwise. See example under die.
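Since chdir reports failure through its return value, the result should be checked before relying on the new directory. A minimal sketch, assuming Perl 5 with the standard Cwd module and a Unix-like system; the target "/" and the nonexistent path are placeholders chosen for illustration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

my $start  = getcwd();      # remember where we began
my $target = "/";           # hypothetical target directory

if (chdir($target)) {
    print "now in ", getcwd(), "\n";
} else {
    warn "cannot chdir to $target: $!\n";
}

# A failed chdir returns false and leaves the working directory unchanged.
my $ok = chdir("/no/such/directory/hopefully");
print "second chdir failed: $!\n" unless $ok;

# Go back so later relative paths behave as expected.
chdir($start) or die "cannot return to $start: $!\n";
```

The sketch tests the result only for truth rather than comparing it with 0 or 1, which works whatever exact false value a given Perl version returns.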
chroot(FILENAME)