Perl Weekly Challenge 90.

My solutions to the The Perl Weekly Challenge - 090, task 1 and task 2.

Task 1: DNA Sequence

DNA is a long, chainlike molecule which has two strands twisted into a double helix. The two strands are made up of simpler molecules called nucleotides. Each nucleotide is composed of one of the four nitrogen-containing nucleobases cytosine (C), guanine (G), adenine (A) and thymine (T).

You are given DNA sequence, GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG.

Write a script to print nucleiobase count in the given DNA sequence. Also print the complementary sequence where Thymine (T) on one strand is always facing an adenine (A) and vice versa; guanine (G) is always facing a cytosine (C) and vice versa.

To get the complementary sequence use the following mapping:

T => A A => T G => C C => G

To solve this task I split the DNA sequence into its nucleotides and use a couple of hashes to keep a count of each base and to map them to their complement.

The code starts with the usual headers and some useful packages.

# For sequences of DNA get nucleotide counts and its complement
# See https:/wlmb.github.io/2020/12/07/PWC90/#task-1-dna-sequence
use strict;
use warnings;
use v5.10;

use List::Util qw(sum0);

Receive sequences as strings in @ARGV.

say('Usage: ch-1.pl sequence1 sequence2 ...'), exit 1 unless @ARGV;

Initialize the map from nucleotides to their complement.

my %complement_of; # Map dna nucleotide to its complement
@complement_of{my @nucleotides=qw(T A G C)}=qw(A T C G); #initialize

For each sequence, test that it only contains valid bases, complement it, count its nucleotides and report the results.

foreach my $sequence(map {uc} @ARGV){
    say("Wrong sequence $sequence"), next unless $sequence=~/^[@nucleotides]*$/;
    say "  Sequence: $sequence";
    say "Complement: ", complement($sequence);
    my %count_for=get_count_for($sequence);
    say "Nucleotide counts:";
    say "\t$_-$complement_of{$_} $count_for{$_}" for @nucleotides;
    say "\tTotal\t", sum0 values %count_for;
}

Complement a sequence by taking it apart, complementing each nucleatide using the $complement_of map and joining them back.

sub complement { # converts string with a DNA sequence to its complement
    my $sequence=shift;
    return join "", map {$complement_of{$_}} split //, $sequence;
}

Accumulate counts for each nucleotide using a hash.

sub get_count_for { # count nucleotides of a dna sequence
    my $sequence=shift;
    my @nucleotides=split //,$sequence;
    my %count_for; # counts of nucleotides
    @count_for{qw(T A G C)}=((0) x 4); #initialize with 0's
    map {$count_for{$_}++} split //, $sequence;
    return %count_for;
}

Run it providing both valid and invalid inputs.

./ch-1.pl hithere \
   GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG \
   TGCTGGTCGCCG

Results:

Wrong sequence HITHERE
  Sequence: GTAAACCCCTTTTCATTTAGACAGATCGACTCCTTATCCATTCTCAGAGATGTGTTGCTGGTCGCCG
Complement: CATTTGGGGAAAAGTAAATCTGTCTAGCTGAGGAATAGGTAAGAGTCTCTACACAACGACCAGCGGC
Nucleotide counts:
	T-A 22
	A-T 14
	G-C 13
	C-G 18
	Total	67
  Sequence: TGCTGGTCGCCG
Complement: ACGACCAGCGGC
Nucleotide counts:
	T-A 3
	A-T 0
	G-C 5
	C-G 4
	Total	12

Task 2: Ethiopian Multiplication

You are given two positive numbers $A and $B.

Write a script to demonstrate Ethiopian Multiplication using the given numbers.

It turns out that ethiopian multiplication is just ordinary multiplication in the binary system, i.e., we multiply the multiplicand by each digit of the multiplier and shift the result left the number of places corresponding to that digit. Thus, to multiply 6x5 in binary we write

         1  1  0
x        1  0  1
-----------------
=        1  1  0
+     0  0  0
+  1  1  0
-----------------
=  1  1  1  1  0
-----------------

which reads as 110x101=1x110+0x1100+1x11000=11110, or in base 10, 6x5=1x6+0x12+1x24=30. To get the succesive digits of the multiplier we can divide by two (shifth right) and test the least significant digit, i.e., test if the result is odd or even.

As computers use binary internally, the simplest program to perform Ethiopian multiplication would be something as

perl -E 'say $ARGV[0]*$ARGV[1]' 5 6

Result:

30

Nevertheless, to display the process, I did the following program. Start with the usual headers and packages. Use integer arithmetic to avoid fractional quotients.

# Multiply two numbers using the Ethiopian Multiplication
# See https:/wlmb.github.io/2020/12/07/PWC90/#task-2-ethiopian-multiplication
use strict;
use warnings;
use v5.10;
use integer;
use List::Util qw(all);

Receive the two numbers to be multiplied in @ARGV. Verify they are positive integers.

# receive the two numbers in @ARGV
die 'Usage: ./ch2.pl number1 number2' unless @ARGV==2;
my ($x, $y)= @ARGV; #
die 'Expected two non-negative integers'   # check they are postitive integers
    unless all {int($_) eq $_ && $_>=0} ($x, $y); #Use eq to avoid implicit int conversion

Perform initializations.

my $expected_result=$x*$y;
my $result=0;
my $result_string="$x x $y = ";
my $operator="";

Perform the actual multiplication. Divide $x by 2 (shift right) and multiply $y by two (shift left) succesively. Display the partial results, flagging the lines where $x is odd and $y should be added, and construct a string with the final result.

while($x){
    if($x&1){  # $x is odd, add $y to result
	print "->"; # flag line
	$result += $y;
	$result_string .= "$operator 1 x $y";
    } else {   # $x is even, don't add y
	$result_string.="$operator 0 x $y";
    }
    say "\tx=$x\ty=$y";
    $operator=" + ";
    $x>>=1;  # Divde $x by 2
    $y<<=1;  # Multiply $y by 2
}
say " $result_string = $result (Expected: $expected_result)";

Sample run:

./ch-2.pl 11 23

Results:

->	x=11	y=23
->	x=5	y=46
	x=2	y=92
->	x=1	y=184
 11 x 23 =  1 x 23 +  1 x 46 +  0 x 92 +  1 x 184 = 253 (Expected: 253)
Written on December 7, 2020