Perl's versatile split function

I love Perl’s split function. Far more powerful than its feeble cousin join, split has some wonderful features that should make it a regular feature of any Perl programmer’s toolbox. Let’s look at some examples.

Split a sentence into words

To split a sentence into words, you might think about using a whitespace regex pattern like /\s+/ which splits on contiguous whitespace. Split will ignore trailing whitespace, but what if the input string has leading whitespace? A better option is to use a single space string: ' '. This is a special case where Perl emulates awk and will split on all contiguous whitespace, trimming any leading or trailing whitespace as well.

my @words = split ' ', $sentence;

Or loop through each word and do something:

use 5.010;
say for (split ' ', ' 12 Angry Men ');
# 12
# Angry
# Men

The single-space pattern is also the default pattern for split, which by default operates on $_. This can lead to some seriously minimalist code. For example if I needed to split every name in a list of full names and do something with them:

for (@full_names)
{
    for (split)
    {
        # do something
    }
}

And who says Perl looks like line noise?

Create a char array

To split a word into separate letters, just pass an empty regex // to split:

my @letters = split //, $word;

Parse a URL or filepath

It’s tempting to reach for a regex when parsing strings, but for URLs or filepaths split usually works better. For example if you wanted to get the parent directory from a filepath:

my @directories = split '/', '/home/user/documents/business_plan.ods';
my $parent_directory = $directories[-2];

Here I split the filepath on slash and use the negative index -2 to get the parent directory. The challenge with filepaths is that they can have n depth, but the parent directory of a file will always be the last but one element of a filepath, so split works well.

Extract only the first few columns from a separated file

How many times have you parsed a comma separated file, but didn’t want all of the columns in the file? Let’s say you wanted the first 3 columns from a file, you might do it like this:

while <$read_file>
{
    my @columns = split /,/;
    my $name    = $columns[0];
    my $email   = $columns[1];
    my $account = $columns[2];
    ...
}

This is all well and good, but split can return a limited number of results if you want:

while <$read_file>
{
    my ($name, $email, $account) = split /,/;
    ...
}

Or to revisit an earlier example, splitting on whitespace:

for (@full_names)
{
    my ($firstname, $lastname) = split;
    ...
}

Conclusion

These are just a few examples of Perl’s versatile split function. Check out the official documentation online or via the terminal with $ perldoc -f split.

Tags

David Farrell

David is the founder and editor of PerlTricks.com. An organizer of the New York Perl Meetup, he works for ZipRecruiter as a software developer.

Browse their articles