Perl Weekly Challenge 92.

My solutions (task 1 and task 2) to the The Weekly Challenge - 092.

Task 1: Isomorphic Strings

Submitted by: Mohammad S Anwar

You are given two strings $A and $B.

Write a script to check if the given strings are Isomorphic. Print 1 if they are otherwise 0.

Example 1:

Input: $A = "abc"; $B = "xyz"
Output: 1

Example 2:

Input: $A = "abb"; $B = "xyy"
Output: 1

Example 3:

Input: $A = "sum"; $B = "add"
Output: 0

This task can be solved by mapping the strings to a canonical form before comparison. For example, each letter may be mapped to its order of appearance: "x" in "xyyxzx" would be mapped to 0, as it is the first letter of the string, while "z" would be mapped to 2, as it is the third letter to appear, although it appears at position 4, so the whole string may be mapped to the canonical list (0,1,1,0,2,0). If the input string is small and/or doesn’t contain too many unique characters, I could also map it back to a string using the letters of the alphabet, so that the canonical string would be "abbaca". Two strings are isomorphic if and only if their canonical representations are identical.

First, some pragmas and packages.

# Perl weekly challenge 092
# Task 1: Isomorphic strings.
# Test if two or more strings are isomorphic
# See https:/wlmb.github.io/2020/12/22/PWC92/#task-1-isomorphic-strings
use warnings;
use strict;
use v5.10;

Take the arguments from the command line.

die "Usage ./ch-1.pl s0 s1 ... to compare two or more strings s0 s1 ..." unless @ARGV>=2;

For simplicity I assume that the string has less than 27 different letters, so that the letters a..z are enough for its representation as a canonical string. Then a simple string comparison may be used.

my $first_string=shift @ARGV;
my $first_canon=canonical_form_of($first_string);
my $first_non_isomorphic;
while(my $string=shift @ARGV){
    $first_non_isomorphic=$string, last
		unless $first_canon eq canonical_form_of($string);
}
say $first_non_isomorphic
    ? "0: $first_string and $first_non_isomorphic are not isomorphic"
    : "1: All strings are mutually isomorphic";

The canonical form is constructed through a hash that is updated whenever a new letter appears in the string.

sub canonical_form_of {
    my %map;
    my $current_letter="a";
    my $canon=join '', map {$map{$_}//($map{$_}=$current_letter++)} split //, shift;
}

Example 1:

./ch-1.pl abc xyz 123 zyx

Results:

1: All strings are mutually isomorphic

Example 2:

./ch-1.pl abc xyz 123 z3x zyy

Results:

0: abc and zyy are not isomorphic

Example 3:

./ch-1.pl abc x0z x0z1

Results:

0: abc and x0z1 are not isomorphic

Task 2: Insert Interval

Submitted by: Mohammad S Anwar

You are given a set of sorted non-overlapping intervals and a new interval.

Write a script to merge the new interval to the given set of intervals.

Example 1:

Input $S = (1,4), (8,10); $N = (2,6)
Output: (1,6), (8,10)

Example 2:

Input $S = (1,2), (3,7), (8,10); $N = (5,8)
Output: (1,2), (3,10)

Example 3:

Input $S = (1,5), (7,9); $N = (10,11)
Output: (1,5), (7,9), (10,11)

Two open intervals (a,b) and (c,d) overlap and may be merged if (a<d) and (c<b) (strict inequalities). The resulting interval is then (min(a,c), max(b,d)).

First, the usual pragmas and packages.

# Perl weekly challenge 092
# Task 2: Insert interval.
# Make a sorted list of non-overlapping intervals by adding or merging new intervals.
# See https:/wlmb.github.io/2020/12/22/PWC92/#task-2-insert-interval
use warnings;
use strict;
use v5.10;
use Scalar::Util qw(looks_like_number);
use List::Util qw(first min max);

Take the argument from a single string in the command line. Add them one by one to the list of intervals.

my @intervals;
foreach(split ' ', shift @ARGV){
    usage() unless /^\((.*),(.*)\)$/
		       and looks_like_number($1) and looks_like_number($2)
		       and $1<$2;
    add_interval([$1,$2]);
}
# print the results
print_interval($_) foreach(@intervals);
say '';

We keep an ordered array of intervals @intervals. To add a new interval (a,b) we first search where to add it by searching the first and last interval with which it overlaps. It cannot overlap any interval before the first interval (c,d) with a<d, and it cannot overlap any interval after the last interval (e,f) with b>e. This defines the $first and $last indices. If no $first was found, the interval is added at the end of the array. Likewise, if no $last was found, the interval is added before the start of the array. If $last<$first, there is no overlap and the new interval is added between those positions. Otherwise, merge all the intervals between $first and $last with (a,b). Continue until all intervals have been added/merged/ordered.

# add interval to ordered @intervals array merging overlapping intervals
sub add_interval {
    my $interval=shift;
    my $first=first {$interval->[0]<$intervals[$_][1]} (0..$#intervals);
    my $last=first {$interval->[1]>$intervals[$_][0]} reverse(0..$#intervals);
    push(@intervals, $interval), return unless defined $first;
    unshift(@intervals, $interval), return  unless defined $last;
    splice(@intervals, $first, 0, $interval), return if $last<$first;
    splice @intervals, $first, $last+1-$first,
	[min($intervals[$first][0], $interval->[0]),
	 max($intervals[$last][1], $interval->[1])];
}

A couple of simple ancillary functions.

sub print_interval {
    my $interval=shift @_;
    print "($interval->[0],$interval->[1]) ";
}
sub usage {
    say <<'END_USAGE';
Usage: ./ch-2.pl "(l0,u0) (l1,u1),..."
where ln<un are numbers and there is no space within the parenthesis.
END_USAGE
    exit 1;
}

Example 1:

./ch-2.pl "(1,4) (8,10) (2,6)"

Results:

(1,6) (8,10)

Example 2:

./ch-2.pl "(1,2) (3,7) (8,10) (5,8)"

Results:

(1,2) (3,8) (8,10)

Notice that this result differs from the result stated in the description of the task

Input $S = (1,2), (3,7), (8,10); $N = (5,8)
Output: (1,2), (3,10)

I guess this is due to an interpretation of the notation. It is common to denote open intervals, i.e., intervals that do not include their end-points, using round parenthesis. Thus (a,b) is the set of all points x such that a<x and x<b. On the other hand, closed intervals, thus that due contain the end-points, are denoted with square parenthesis. Thus [a,b] is the set of of all points x such that a≤x and x≤b. Similarly, one can defined semiopen intervals on the right and left [a,b) and (a,b]. The program could be adapted to closed intervals by changing the intialization of $first and $last above to

my $first=first {$interval->[0]<=$intervals[$_][1]} (0..$#intervals);
my $last=first {$interval->[1]>=$intervals[$_][0]} reverse(0..$#intervals);

replacing the strict inequalities < and > by their counterparts <= and >=. Similarly, the program could be extended to handle any mixture of all kinds of intervals, by deciding which inequality to use according to the kind of parenthesis used at each end.

Example 3:

./ch-2.pl "(1,5) (7,9) (10,11)"

Results:

(1,5) (7,9) (10,11)

Example 4:

./ch-2.pl "(4,5) (2,4) (1,3) (3,5)"

Results:

(1,5)
Written on December 22, 2020