Removing trailing and leading spaces in Perl
This was buried somewhere in my forum, and I thought I would post it here:
I have something like this:
a231aaaa 321bbbbbbbb cccccccccc123
I have designed a parser that uses the space as the delimiter and accepts 3 tokens.
The grammar is something like this:
\s*(.*)\s+(.*)\s+(.*)
The problem is that after parsing I get spaces included in my tokens (the original string had some trailing spaces). How do I remove the spaces from the tokens?
Here is a small piece of code that does the trick for both ends.
Method 1:
$string =~ s/^\s+//;   # front
$string =~ s/\s+$//;   # end
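As a quick check, Method 1 can be applied to the sample line from the question. This is only a sketch: the extra spaces around the sample and the use of split /\s+/ to get the three tokens are my assumptions, not the original poster's parser.

```perl
use strict;
use warnings;

# Sample line from the question; the surrounding spaces are added for illustration.
my $string = "   a231aaaa 321bbbbbbbb cccccccccc123   ";

$string =~ s/^\s+//;   # strip leading whitespace
$string =~ s/\s+$//;   # strip trailing whitespace

# With both ends clean, split on runs of whitespace to get the tokens.
my @tokens = split /\s+/, $string;

print scalar(@tokens), " tokens: ", join('|', @tokens), "\n";
# prints "3 tokens: a231aaaa|321bbbbbbbb|cccccccccc123"
```

Trimming first matters here because split /\s+/ on an untrimmed string would return an empty leading field.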
Some books even have this code:
Method 2:
$string =~ s[^\s*(.*?)\s*$][$1];
Method 1 is faster and much better than Method 2, and there is a very good reason for that. Method 1 does not require any backtracking and can execute very quickly. Method 2 can involve a great deal of backtracking and, in the worst case, could take a very long time indeed. As a contrived example, run this:
$string = ' a' . ' ' x 100000 . 'z ';
print "Starting first trim method\n";
$string =~ s/^ +//;   # front
$string =~ s/ +$//;   # end
print "Finished\n"; # Instantly
$string = ' a' . ' ' x 100000 . 'z ';
print "Starting second trim method\n";
$string =~ s/^ *(.*?) *$/$1/;
print "Finished\n"; # Six minutes later... zzzzzzz...
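If you would rather measure the difference than wait for it, something like the following might do. It is a rough sketch: timed_trim is a hypothetical helper of mine, the input is deliberately much smaller than the 100000 spaces above so the slow case still finishes quickly, and the timings will vary with your machine and Perl version.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # core module, gives sub-second timestamps

# Hypothetical helper: run a trimming routine on a copy of the input and time it.
sub timed_trim {
    my ($label, $trimmer, $input) = @_;
    my $copy  = $input;
    my $start = time();
    $trimmer->(\$copy);                      # the routine edits the copy in place
    printf "%-10s %.4f s\n", $label, time() - $start;
    return $copy;
}

# Same shape as the contrived string above, but with fewer inner spaces.
my $input = ' a' . ' ' x 5000 . 'z ';

# Method 1: two anchored substitutions.
my $two_step = timed_trim('two-step', sub {
    my $s = shift;
    $$s =~ s/^ +//;
    $$s =~ s/ +$//;
}, $input);

# Method 2: one substitution with a non-greedy capture.
my $one_shot = timed_trim('one-shot', sub {
    my $s = shift;
    $$s =~ s/^ *(.*?) *$/$1/;
}, $input);

print "same result\n" if $two_step eq $one_shot;
```

Both routines should leave the same trimmed string; only the time taken to get there differs. Raising the repeat count on the inner spaces shows how each method scales.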
I understand the natural desire to express the conceptually atomic trim operation as a single line. A one-line version of Method 1 is:
s/^ +//, s/ +$// for $string;
Even better, this generalises neatly to more than one string:
s/^ +//, s/ +$// for $string1, $string2, $string3;
s/^ +//, s/ +$// for @whole_file_of_lines;
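The one-liners above change the strings in place, because for aliases $_ to each element. If you would rather keep the originals and get trimmed copies back, Method 1 can be wrapped in a small sub. This is a minimal sketch; trim is a hypothetical name of mine, not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a trimmed copy and leaves the argument untouched.
sub trim {
    my ($text) = @_;       # copy of the argument, so the caller's string is not modified
    $text =~ s/^\s+//;     # front
    $text =~ s/\s+$//;     # end
    return $text;
}

my @raw    = ("  a231aaaa ", "321bbbbbbbb  ", " cccccccccc123");
my @tokens = map { trim($_) } @raw;

print join('|', @tokens), "\n";
# prints "a231aaaa|321bbbbbbbb|cccccccccc123"
```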
Credit is duly given to all those who helped by posting the answer on other Perl forums. Thanks!