Perl Weekly Challenge 96.
My solutions (task 1 and task 2) to the The Weekly Challenge - 096.
Task 1: Reverse Words
Submitted by: Mohammad S Anwar You are given a string $S.
Write a script to reverse the order of words in the given string. The string may contain leading/trailing spaces. The string may have more than one space between words in the string. Print the result without leading/trailing spaces and there should be only one space between words. Example 1:
Input: $S = "The Weekly Challenge"
Output: "Challenge Weekly The"
Example 2:
Input: $S = " Perl and Raku are part of the same family "
Output: "family same the of part are Raku and Perl"
This seems a very simple task using the Perl’s reverse
function. Then, the task
may be performed in a oneliner.
perl -E 'say "\"", join(" ", reverse split " ", $_),"\"" foreach(@ARGV)' \
"The Weekly Challenge" \
" Perl and Raku are part of the same family "
Results:
"Challenge Weekly The"
"family same the of part are Raku and Perl"
Here I used that leading spaces are removed when splitting on a whitespace
character. Nevertheless, I guess not
only spaces, but all punctuation (non-words) ought to be removed. Thus
I guess it is better to split on /\W+/
.
perl -E 'say "\"", join(" ", reverse split /\W+/, $_),"\"" foreach(@ARGV)' \
"The Weekly Challenge" \
" Perl and Raku are part of the same family "
Results:
"Challenge Weekly The"
"family same the of part are Raku and Perl "
Unfortunately, splitting this way produces a null first element when the string starts with whitespace. That is the reason the last result above ended in a space. Thus, I filter out null elements.
perl -E 'say "\"", join(" ", grep $_, reverse split /\W+/, $_),"\"" foreach(@ARGV)' \
"The Weekly Challenge" \
" Perl and Raku are part of the same family "
Results:
"Challenge Weekly The"
"family same the of part are Raku and Perl"
Thus the complete program would be
# Perl weekly challenge 096
# Task 1: Reverse words
# Print words in reverse order separated by one space
# Remove all non-word characters
# See https://wlmb.github.io/2021/01/18/PWC096/#task-1-reverse-words
use warnings;
use strict;
use v5.10;
say "\"", join(" ", grep $_, reverse split /\W+/, $_),"\"" foreach(@ARGV);
I try it out with a few examples:
./ch-1.pl \
"The Weekly Challenge" \
" Perl and Raku are part of the same family " \
" What happens with punctuation signs? "
Results:
"Challenge Weekly The"
"family same the of part are Raku and Perl"
"signs puntuation with happens What"
Task 2: Edit Distance
Submitted by: Mohammad S Anwar You are given two strings $S1 and $S2.
Write a script to find out the minimum operations required to convert $S1 into $S2. The operations can be insert, remove or replace a character. Please check out Wikipedia page for more information.
Example 1:
Input: $S1 = "kitten"; $S2 = "sitting"
Output: 3
Operation 1: replace 'k' with 's'
Operation 2: replace 'e' with 'i'
Operation 3: insert 'g' at the end
Example 2:
Input: $S1 = "sunday"; $S2 = "monday"
Output: 2
Operation 1: replace 's' with 'm'
Operation 2: replace 'u' with 'o'
This is a very interesting and useful task. As such, there are several modules in CPAN that perform it. For example, Text::Fuzzy returns the distance and the edits encoded as a string with the letters k for keep, d for delete, r for replace and i for insert. With it, the task becomes a one liner.
perl -MText::Fuzzy=distance_edits -E '($x,$y)=@ARGV; ($distance,$edits)=distance_edits($x,$y);' \
-E 'say "Strings: $x $y\nOutput: $distance\nOperations: $edits"' kitten sitting
perl -MText::Fuzzy=distance_edits -E '($x,$y)=@ARGV; ($distance,$edits)=distance_edits($x,$y);' \
-E 'say "Strings: $x $y\nOutput: $distance\nOperations: $edits"' sunday monday
Results:
Strings: kitten sitting
Output: 3
Operations: rkkkrki
Strings: sunday monday
Output: 2
Operations: rrkkkk
Nevertheless, it is a nice exercise to implement the algorithm from
scratch. I start by reading the strings from @ARGV
, splitting them
and counting their characters.
# Perl weekly challenge 096
# Task 2: Edit distance.
# Calculate the number of editions required to edit a string and convert it into another.
# See https://wlmb.github.io/2021/01/11/PWC096/#task-2-edit-distance
use warnings;
use strict;
use v5.10;
# Get the two strings to compare from @ARGV
sub usage {
say "Usage:\n\t./ch-2.pl from_string to_string\n\tComputes the cost to edit one string and produce the other";
exit 1;
}
usage() unless @ARGV==2;
my ($from, $to)=@ARGV;
my @from=split '', $from;
my @to=split '', $to;
my $from_size=@from;
my $to_size=@to;
I follow the Wagner–Fischer algorithm from the Wikipedia. To
that end I build a 2D array $costs
each of whose entries ij
correspond to the
optimal conversion of the leading i
characters of one string to the
leading j
characters of the other. I also build a 2D array
$operations
with an indicator for the last optimal operation
performed to convert the corresponding substrings. I first initialize
these arrays with the $costs->[$i][0]~ to convert substrings of $from
to a null
string, and the $costs->[0][$j]
to convert a null string to
substrings of $to
.
my $costs;
my $operations;
# initialize the costs of $from->'' and ''->$to
$costs->[$_][0]=$_ foreach(0..$from_size); #deletions
$operations->[$_][0]='d' foreach(1..$from_size);
$costs->[0][$_]=$_ foreach(0..$to_size); #insertions
$operations->[0][$_]='i' foreach(1..$to_size);
Then I iteratively choose the optimal strategy for converting a
substring of $from
to a substring of $to
#Build costs matrix and choose best operations for each pair of substrings.
foreach my $j(1..$to_size){
foreach my $i(1..$from_size){
my $subcost=$from[$i-1] eq $to[$j-1]?0:1; # substitution cost
my($cost,$operation);
($cost,$operation)=($costs->[$i-1][$j-1]+$subcost,
$subcost?"r":"k"); #for keep or substitute
($cost,$operation)=($costs->[$i][$j-1]+1, "i")
if $costs->[$i][$j-1]+1 < $cost; #insertion
($cost,$operation)=($costs->[$i-1][$j]+1, "d")
if $costs->[$i-1][$j]+1 < $cost; #deletion
$costs->[$i][$j]=$cost;
$operations->[$i][$j]=$operation;
}
}
The next step is to find the total cost and the sequence of operations
to perform by following the $operations
array backwards using the
fact that replacements don’t alter the size of strings, while
insertions and deletions change it by plus or minus 1.
my $total_cost=$costs->[$from_size][$to_size];
my @operations;
my ($i, $j)=($from_size, $to_size);
# Obtain optimum sequence of operations by examining the $operations array
while($i>0 || $j>0){
my $operation=$operations->[$i][$j];
if($operation eq 'k'){
unshift @operations, "(Keep $from[$i-1])";
--$i;
--$j;
next;
}
if($operation eq 'r'){
unshift @operations, "Replace $from[$i-1] by $to[$j-1]";
--$i;
--$j;
next;
}
if($operation eq 'i'){
unshift @operations, "Insert $to[$j-1]";
--$j;
next;
}
if($operation eq 'd'){
unshift @operations, "Delete $from[$i-1]";
--$i;
next;
}
die "Wrong operation!"; # Shouldn't happen
}
Finally, I output the results.
say "Inputs: \"$from\" -> \"$to\"\nOutput: $total_cost\n";
say "Operation $_: $operations[$_-1]" foreach 1..@operations;
say;
Example 1:
./ch-2.pl "kitten" "sitting"
Results:
Inputs: "kitten" -> "sitting"
Output: 3
Operation 1: Replace k by s
Operation 2: (Keep i)
Operation 3: (Keep t)
Operation 4: (Keep t)
Operation 5: Replace e by i
Operation 6: (Keep n)
Operation 7: Insert g
Example 2:
./ch-2.pl "sunday" "monday"
Results:
Inputs: "sunday" -> "monday"
Output: 2
Operation 1: Replace s by m
Operation 2: Replace u by o
Operation 3: (Keep n)
Operation 4: (Keep d)
Operation 5: (Keep a)
Operation 6: (Keep y)
Other examples to check on null strings
./ch-2.pl "" ""
./ch-2.pl "abc" ""
./ch-2.pl "" "abc"
Results:
Inputs: "" -> ""
Output: 0
Inputs: "abc" -> ""
Output: 3
Operation 1: Delete a
Operation 2: Delete b
Operation 3: Delete c
Inputs: "" -> "abc"
Output: 3
Operation 1: Insert a
Operation 2: Insert b
Operation 3: Insert c