Home > programming > Naming and Some Common String and Array Patterns

Naming and Some Common String and Array Patterns

September 25th, 2008 Leave a comment Go to comments

How many times have you had to display a list of items like this:

1
apples - oranges - ququxes - bananas

The newbie programmer (like I myself was and still am in many respects) will often hammer out this piece of code:

1
2
3
4
5
6
7
String output = "";
for( int i = 0; i < fruits.size(); i++ ) {
  output += fruits.get(i);
  output += " - ";
}
// Now remove the last separator
output = output.substring( 0, output.length() - 3 );

Hey, this is convoluted and ugly. There should be a library function that does this for us. Everybody uses arrays and strings. This is such a basic problem that someone must have already implemented it in some library somewhere, if it’s not already in our language’s standard library.

So let’s turn to Google, to find a library that can save us.

But … what search terms should we use?

In other words, what is this problem’s name?

Once you can name something, you can master it. This is a thought that stuck with me while reading the classic introductory textbook to programming, Structure and Interpretation of Computer Programs.

We learn new words, and even make up new words, because it helps us communicate and understand ideas and thoughts of ever increasing complexity.

Take a simple sentence, like The government will lower taxes. Now try to express the same idea, with the same precision, but without using the words government, or taxes. You’ll end up with something like: “The body of individuals elected by the people of a country, with a limited-time mandate to run the country, will lower the amount that each individual must contribute to support the country’s shared expenses.” Not pretty, by any measure.

Enough with the theory, let’s return to the craft of programming.

There exist a bunch of simple but very common problems, involving arrays and strings. And I want to remind you that they have names, and how important it is to know these names. It will help you not only in finding the right library call to do what you are trying to do, but it will also help you recognise these problems in your day to day programming work. You’ll be able to think Hey, this looks like a map followed by a select, I know exactly how to solve it quickly and elegantly.

join

Let’s backtrack for a minute to the start of this post. What we have there is a problem that can be summarised as:

1
2
["apples", "oranges", "ququxes", "bananas"]
=> "apples - oranges - ququxes - bananas"

In plain English: we have an array of strings and we want to join together these items into a single string, with a separator string in between the items.

This problem has a classic name, it’s called a join.

Now we know how this problem is called, we can master it. We started out this example in Java, so we’ll have to find a Java library function that does join. And sure enough, commons-lang’s StringUtils class has a helpful little method called join().

Side note: If you use one of the more friendly languages out there, like Python or Ruby, you’ve got it easier. In Ruby it’s as simple as:

1
2
irb(main):001:0> ["apples", "oranges", "ququxes", "bananas"].join(" - ")
=> "apples - oranges - ququxes - bananas"

split

1
2
"apples - oranges - ququxes - bananas"
=> ["apples", "oranges", "ququxes", "bananas"]

This is the reverse of join. Take a string of items separated by a delimiter, and return an array of the individual item strings.

There’s a split function in the standard library of Python, Ruby, PHP, even Java. Oh yeah, in Java it’s as simple as string.split(pattern).

zip

A zip operation, also called combine, means pairing each corresponding item from two arrays:

1
2
3
4
5
6
7
8
["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
[31,    28,     31,    30,     31,     30   ]
=> ["Jan", 31],
["Feb", 28],
["Mar", 31],
["Apr", 30],
["May", 31],
["Jun", 30]

It’s called array_combine() in PHP, and zip in Ruby.

flatten

Taking a nested array (aka array of arrays) and turning it into a, err, flat array:

1
2
["one", ["two", "three"], "four", ["five", ["six", "seven"]]]
=> ["one", "two", "three", "four", "five", "six", "seven"]

map

map, also called collect, transforms each item of an array through a function that you specify, and returns an array of the transformed values. Classic example: finding out the squares of the first positive integers:

1
2
[1, 2, 3, 4, 5, 6]
=> [1, 4, 9, 16, 25, 36]

To achieve this, you pass in a function that takes the argument x and returns x*x. Exactly how you do this varies according to your language. In some languages it’s braindead simple (Ruby), in others it requires a bit more work but still reasonably simple (PHP, Java)

reduce

reduce, also called inject, combines all the values of an array using a function that you specify, to create a single value. Simple example: the sum of all the elements of an array of numbers:

1
2
[1, 2, 3, 4, 5, 6]
=> 21

is a reduce operation where you pass a function of two arguments a and b that returns a+b.

Just like with map above, how you do it varies according to your language. Take a look at Ruby, PHP. I can’t find a similar function for Java, if you know any, please react in the comments section :)

select, reject

These are filters, they retain or remove items from an array according to a function you pass in.

Let’s filter only the items that start with a vowel:

1
2
["apple", "banana", "cherry", "orange", "plum"]
=> ["apple", "orange"]

All you do is call “select” and pass in a function that returns true if its argument starts with a vowel. That’s a function you’ll have to build for yourself, sorry :-P

reject works the same way but returns the items that do not match the function. Using the same function as above, a call to reject works this way:

1
2
["apple", "banana", "cherry", "orange", "plum"]
=> ["banana", "cherry", "plum"]

Of course, if your language or libraries don’t provide a reject call, you can use select and pass in a function that returns false when it was supposed to return true.

Have a look at array_filter if you’re using PHP, select or reject if you’re using Ruby, or commons-collections’ select and selectRejected if you’re using Java.

OK, kids, I hope you learned something today :-) Start recognising common patterns like join or map in the code that you write, and start thinking in terms of them.

Please feel free to suggest additional code samples and links for the languages I have not covered.

Categories: programming
  1. Dan
    September 25th, 2008 at 15:24 | #1

    The Perl way :

    1
    2
    3
    @fruits=('apples','oranges','ququxes','bananas');
    $" = '-';
    $output = "@fruits";

    using join :

    1
    $output = join '-', @fruits;

    the same thing with uppercase characters, using map {}

    1
    $output = join '-', map{ uc } @fruits;
  2. Lelela
    September 25th, 2008 at 16:33 | #2

    Hey, functional programming is very useful regarding vocabulary. I already knew the terms from that class.

  3. Iulian Dogariu
    September 25th, 2008 at 16:36 | #3

    Well yes, all these names come from FP indeed. But usually people freak out the moment they hear the words “functional programming”, and won’t continue listening on, to see how useful and intuitive FP really is.

  4. Alin
    September 25th, 2008 at 16:38 | #4

    A very simple example for *reduce* using Ruby:
    >> [1,4,5].inject{|sum,i| sum+i}

    More explanations from Iulian: This computes the sum of all the numbers in the array. The thing between curly braces is a “block”, which for all practical purposes is a function. The function has two arguments and gets called for each item of the array. The first param, called “sum”, is the value accumulated so far. That’s why they call it “accumulator” in general. The second param is the current item in the array.

  5. September 28th, 2008 at 20:58 | #5

    Hehe, ok, as much as I like the D&D analogy, I have to note that we’re talking about different things though, as /birth names/ are a given that the others *have to* accept as such (ok, even if you nick them), while these concepts don’t. For instance, I doubt there’s anyone else besides those from Ruby communities who would think that “inject” is the same thing as “reduce” ;-)

    ++ IIRC PHP’s functions are not as flexible as everyone else’s (and looking at array_combine’s spec seems to prove it).

    ++ Python has a different opinion about “zipping”, truncating the list of tuples to the length of the shortest argument.

    It’s a boring Sunday, so let’s give your readers some more examples (using your use cases)…

    $ cat ./yu_li_yan_p126.rb
    #!/usr/bin/ruby

    fruits = %w(apple banana cherry orange plum);
    fruits_str = fruits.join(‘ – ‘);
    months = %w(Jan Feb Mar Apr May Jun);
    days = [ 31, 28, 31, 30, 31, 30];
    numbers = (1..6).to_a;
    aoa = [ 'one', [ 'two', 'three' ], ‘four’, [ 'five', [ 'six', 'seven' ] ] ];

    def show(label, object)
    printf “%10s: %s\n”, label, object.inspect
    end

    show(‘join’, fruits_str);
    show(’split’, fruits_str.split(‘ – ‘));
    show(‘zip’, months.zip(days, numbers));
    show(‘flatten’, aoa.flatten);
    show(‘map’, numbers.map {|x| x**2});
    #show(‘reduce’, numbers.reduce(:+)); # needs v1.8.7 ;-)
    show(‘reduce’, numbers.inject {|x,y| x+y});
    #show(’select’, fruits.select {|x| x =~ /^[aeiou]/i});
    show(’select’, fruits.grep(/^[aeiou]/i));
    #show(‘reject’, fruits.reject {|x| x =~ /^[aeiou]/i});
    show(‘reject’, fruits.grep(/^[^aeiou]/i));

    $ ./yu_li_yan_p126.rb
    join: “apple – banana – cherry – orange – plum”
    split: ["apple", "banana", "cherry", "orange", "plum"]
    zip: [["Jan", 31, 1], ["Feb", 28, 2], ["Mar", 31, 3], ["Apr", 30, 4], ["May", 31, 5], ["Jun", 30, 6]]
    flatten: ["one", "two", "three", "four", "five", "six", "seven"]
    map: [1, 4, 9, 16, 25, 36]
    reduce: 21
    select: ["apple", "orange"]
    reject: ["banana", "cherry", "plum"]

    $ cat ./yu_li_yan_p126.py
    #!/usr/bin/python
    import re

    fruits = ['apple', 'banana', 'cherry', 'orange', 'plum']
    fruits_str = ‘ – ‘.join(fruits)
    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
    days = [ 31, 28, 31, 30, 31, 30 ]
    numbers = range(1, 7)
    aoa = [ 'one', [ 'two', 'three' ], ‘four’, [ 'five', [ 'six', 'seven' ] ] ]

    def show(label,object):
    print ‘%10s: %s’ % (label, object)

    # ok, recursion in python seems to suck big time,
    # so here’s a fucked up iterative version ;-/
    def flatten(arr):
    i = 0
    while i new( [ \@_ ] )->Indent(0)->Terse(1)->Dump;
    }

    sub flatten { # it’s your job to flatten other kinds of refs ;-)
    return map { ref $_ ? flatten(@$_) : $_ } @_;
    }

    sub zip {
    map { my $i = $_; [ map { $_->[$i] } @_ ] } 0 .. max map { $#{$_} } @_;
    }

    show(‘join’, $fruits_str);
    show(’split’, split / – /, $fruits_str);
    show(‘zip’, zip(\@months, \@days, \@numbers));
    show(‘flatten’, flatten($aoa));
    show(‘map’, map { $_**2 } @numbers);
    show(‘reduce’, reduce { $a + $b } @numbers);
    show(’select’, grep { /^[aeiou]/i } @fruits);
    show(‘reject’, grep { !/^[aeiou]/i } @fruits);

    $ ./yu_li_yan_p126.pl
    join: ['apple - banana - cherry - orange - plum']
    split: ['apple','banana','cherry','orange','plum']
    zip: [['Jan',31,1],['Feb',28,2],['Mar',31,3],['Apr',30,4],['May',31,5],['Jun',30,6]]
    flatten: ['one','two','three','four','five','six','seven']
    map: [1,4,9,16,25,36]
    reduce: [21]
    select: ['apple','orange']
    reject: ['banana','cherry','plum']

    $ cat ./yu_li_yan_p126.js
    // use JS v1.8 please, I’m too lazy ;-)
    fruits = ['apple', 'banana', 'cherry', 'orange', 'plum']
    fruits_str = fruits.join(‘ – ‘)
    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
    days = [ 31, 28, 31, 30, 31, 30]
    numbers = [ 1, 2, 3, 4, 5, 6]
    aoa = [ 'one', [ 'two', 'three' ], ‘four’, [ 'five', [ 'six', 'seven' ] ] ]

    function inspect(object) {
    if(typeof object == “undefined”) return ‘undefined’;
    if(object === null) return ‘null’;
    if(typeof object == “number”) return object
    if(object instanceof Array)
    return ‘[' + object.map(inspect).join(', ') + ']‘;
    return “‘” + object + “‘”;
    }

    function show(label, object) {
    label = new Array(11 – label.length).join(‘ ‘) + label + ‘:’;
    print(label, inspect(object));
    }

    function flatten(x) {
    var ret = [];
    x.forEach( function(i) {
    ret = ret.concat(i instanceof Array ? flatten(i) : [i]);
    });
    return ret;
    }

    function zip() {
    var arrays = Array.prototype.slice.apply(arguments, [0]);
    var max = Math.max.apply( null, arrays.map(function(x) x.length) );
    var ret = [];
    for(var i=0; i < max; i++)
    ret.push(arrays.map(function(x) x[i]));
    return ret;
    }

    show(‘join’, fruits_str)
    show(’split’, fruits_str.split(‘ – ‘))
    show(‘zip’, zip(months, days, numbers));
    show(‘flatten’, flatten(aoa))
    show(‘map’, numbers.map(function(x) x*x))
    show(‘reduce’, numbers.reduce(function(x,y) x+y))
    show(’select’, fruits.filter(function(x) x.match(/^[aeiou]/i)))
    show(‘reject’, fruits.filter(function(x) !x.match(/^[aeiou]/i)))

    $ js ./yu_li_yan_p126.js
    join: ‘apple – banana – cherry – orange – plum’
    split: ['apple', 'banana', 'cherry', 'orange', 'plum']
    zip: [['Jan', 31, 1], ['Feb', 28, 2], ['Mar', 31, 3], ['Apr', 30, 4], ['May', 31, 5], ['Jun', 30, 6]]
    flatten: ['one', 'two', 'three', 'four', 'five', 'six', 'seven']
    map: [1, 4, 9, 16, 25, 36]
    reduce: 21
    select: ['apple', 'orange']
    reject: ['banana', 'cherry', 'plum']

    Hopefully, your blogging software won’t funk up the above code too much ;-)

    Ah, and my apologies to the other languages, I’m too lazy ;-)

    cheers

  6. September 28th, 2008 at 21:06 | #6

    Bollocks, looks like Python and Perl sections were zipped somehow :) )
    After checking my cache, I can say I submitted the right thing. :(

    (removed by Iulian, same problem occurred :-/)

  7. September 28th, 2008 at 21:08 | #7

    and again, the same bullshit happened. :(

    once again, Perl only:

    $ cat ./yu_li_yan_p126.pl
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper ();
    use List::Util qw(max reduce);
    use vars qw($a $b);

    my @fruits = qw(apple banana cherry orange plum);
    my $fruits_str = join ‘ – ‘, @fruits;
    my @months = qw(Jan Feb Mar Apr May Jun);
    my @days = ( 31, 28, 31, 30, 31, 30);
    my @numbers = ( 1 .. 6 );
    my $aoa
    = [ 'one', [ 'two', 'three' ], ‘four’, [ 'five', [ 'six', 'seven' ] ] ];

    sub show {
    printf “%10s: %s\n”,
    shift,
    Data::Dumper->new( [ \@_ ] )->Indent(0)->Terse(1)->Dump;
    }

    sub flatten { # it’s your job to flatten other kinds of refs ;-)
    return map { ref $_ ? flatten(@$_) : $_ } @_;
    }

    sub zip {
    map { my $i = $_; [ map { $_->[$i] } @_ ] } 0 .. max map { $#{$_} } @_;
    }

    show(‘join’, $fruits_str);
    show(’split’, split / – /, $fruits_str);
    show(‘zip’, zip(\@months, \@days, \@numbers));
    show(‘flatten’, flatten($aoa));
    show(‘map’, map { $_**2 } @numbers);
    show(‘reduce’, reduce { $a + $b } @numbers);
    show(’select’, grep { /^[aeiou]/i } @fruits);
    show(‘reject’, grep { !/^[aeiou]/i } @fruits);

    $ ./yu_li_yan_p126.pl
    join: ['apple - banana - cherry - orange - plum']
    split: ['apple','banana','cherry','orange','plum']
    zip: [['Jan',31,1],['Feb',28,2],['Mar',31,3],['Apr',30,4],['May',31,5],['Jun',30,6]]
    flatten: ['one','two','three','four','five','six','seven']
    map: [1,4,9,16,25,36]
    reduce: [21]
    select: ['apple','orange']
    reject: ['banana','cherry','plum']

  8. September 28th, 2008 at 21:10 | #8

    and Python:

    $ cat ./yu_li_yan_p126.py
    #!/usr/bin/python
    import re

    fruits = ['apple', 'banana', 'cherry', 'orange', 'plum']
    fruits_str = ‘ – ‘.join(fruits)
    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
    days = [ 31, 28, 31, 30, 31, 30 ]
    numbers = range(1, 7)
    aoa = [ 'one', [ 'two', 'three' ], ‘four’, [ 'five', [ 'six', 'seven' ] ] ]

    def show(label,object):
    print ‘%10s: %s’ % (label, object)

    # ok, recursion in python seems to suck big time,
    # so here’s a fucked up iterative version ;-/
    def flatten(arr):
    i = 0
    while i < len(arr):
    while isinstance(arr[i], (list, tuple)):
    if not arr[i]:
    arr.pop(i)
    if not len(arr):
    break
    else:
    arr[i:i+1] = list(arr[i])
    i += 1
    return arr

    show(‘join’, fruits_str)
    show(’split’, fruits_str.split(‘ – ‘))
    show(‘zip’, zip(months, days, numbers))
    show(‘flatten’, flatten(aoa))
    show(‘map’, map(lambda x: x**2, numbers))
    show(‘reduce’, reduce(lambda x,y: x+y, numbers))
    show(’select’, filter(lambda x: re.compile(“^[aeiou]“, re.I).match(x), fruits))
    show(‘reject’, filter(lambda x: not re.compile(“^[aeiou]“, re.I).match(x), fruits))

    $ ./yu_li_yan_p126.py
    join: apple – banana – cherry – orange – plum
    split: ['apple', 'banana', 'cherry', 'orange', 'plum']
    zip: [('Jan', 31, 1), ('Feb', 28, 2), ('Mar', 31, 3), ('Apr', 30, 4), ('May', 31, 5), ('Jun', 30, 6)]
    flatten: ['one', 'two', 'three', 'four', 'five', 'six', 'seven']
    map: [1, 4, 9, 16, 25, 36]
    reduce: 21
    select: ['apple', 'orange']
    reject: ['banana', 'cherry', 'plum']

  9. Iulian Dogariu
    October 9th, 2008 at 10:53 | #9

    Wow, thanks, Marius! That is one hell of a contribution :-)

    I can’t seem to reproduce the problems you had while posting the Perl and Python examples, tho’ :-(

  1. No trackbacks yet.