Introduction
Often a point of contention between developers, Perl’s default variable $_ can spark arguments about code readability and best practices. In this article, we will explore what Perl’s default variable $_ is, how it works, and the best practices when working with it.
As a developer, sooner or later you’ll reach a piece of code that doesn’t make sense. This is especially true in Perl (even if you wrote the code), because Perl gives you the freedom to code yourself into a corner. The fact that Perl comes with the motto:
There’s more than one way to do it.
should be a giveaway.
NOTE: When your options are open, the possibility for creativity, as well as disaster, is greatly increased!
Although a lot of developers consider Perl’s default variable bad practice, unreadable, and often just ugly, to me $_ is at the very core of the philosophy of Perl. It allows for great creativity but will cause you and your team pain if used flippantly. A working knowledge of $_, and of when to apply it, is essential to mastering Perl.
What Is $_?
Perl’s default variable is exactly what it sounds like: the variable Perl uses when no variable is supplied. This can be difficult to work with, especially if the concept is new to you. However, it’s well worth learning, as it will give you a deeper understanding of Perl and help you work with legacy code, where best practices are not always followed.
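As a minimal sketch of what "default" means here, the following runs a few built-ins without an argument so that each one falls back to $_. The sub name default_demo and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: calls a few built-ins with no
# explicit argument, so each one operates on $_.
sub default_demo {
    my ($text) = @_;
    my @results;
    for ($text) {                        # makes $_ refer to $text in this block
        push @results, length;           # same as length($_)
        push @results, uc;               # same as uc($_)
        push @results, /perl/i ? 1 : 0;  # a bare match tests $_
    }
    return @results;
}

print join(', ', default_demo('Perl rocks')), "\n";   # prints "10, PERL ROCKS, 1"
```

Each call would behave the same with $_ written out, which is why the implicit form is easy to miss when reading unfamiliar code.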
How Does It Work?
Many of Perl’s built-in functions and constructs use $_ when no variable is given, either reading from it or assigning to it. A basic example is a foreach loop. Let’s consider a loop without $_ and then replace the named variable with $_.
First, we assign a list of values to the array @values, then loop over the values, printing them as we go. On each iteration, the loop assigns the current element of the array to $value, which we then print directly.
my @values = (1, 2, 3);
foreach my $value (@values) {
    print $value;
}
We can replace the variable $value with $_. Removing the my $value causes the foreach loop to assign each value from the array to $_ instead. $_ is implicit: Perl assumes you want to use it when no variable is supplied. The same is true when reading a variable, so $value can also be removed from the print statement, leaving:
foreach (@values) {
    print;
}
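The jump from a named variable to a bare print skips a step, so here is a short sketch that takes it one change at a time. It reuses the @values array from above; the statement-modifier form at the end is an extra variation, not something the loops above require.

```perl
use strict;
use warnings;

my @values = (1, 2, 3);

# Step 1: drop the named loop variable; foreach now puts each element in $_.
foreach (@values) {
    print $_;
}
print "\n";   # the line so far reads "123"

# Step 2: drop the argument to print; with no argument it prints $_.
foreach (@values) {
    print;
}
print "\n";   # again "123"

# The same loop written as a statement modifier.
print for @values;
print "\n";   # and "123" once more
```

All three loops print the same thing; only the amount of $_ left visible in the source changes.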
Edge Cases
Now, this is how a single loop works. We can make it a little more complex by using a nested loop. What do you expect the output from the following to be?
my @a = (1, 2);
my @b = (3, 4);
foreach (@a) {
    foreach (@b) {
        print;
    }
}
Within the outer loop, $_ is set to the current value of @a. However, in the inner loop it’s reassigned to the current value of @b, and print only ever sees that inner value, meaning the output is 3434.
This example shouldn’t be overly difficult to understand, but it shows how using $_ can make code difficult to read and result in unintended behavior.
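One way around the shadowing, sketched below, is to name both loop variables so the outer value stays reachable inside the inner loop. The names @outer, @inner, $x, and $y are illustrative stand-ins for the arrays above. The second half shows that foreach localizes $_, so the outer value comes back once the inner loop finishes.

```perl
use strict;
use warnings;

# @outer and @inner are illustrative stand-ins for the two arrays above.
my @outer = (1, 2);
my @inner = (3, 4);

# Naming both loop variables keeps the outer value reachable inside.
my @pairs;
foreach my $x (@outer) {
    foreach my $y (@inner) {
        push @pairs, "$x$y";
    }
}
print "@pairs\n";   # prints "13 14 23 24"

# foreach localizes $_, so the outer value returns once the inner loop ends.
my @seen;
foreach (@outer) {
    foreach (@inner) { }   # $_ is 3, then 4, in here
    push @seen, $_;        # $_ is the outer element again
}
print "@seen\n";    # prints "1 2"
```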
Exceptions to the Rule
Shouldn’t we just avoid $_? In many cases, you should try to avoid $_, as it makes your code less readable. However, there are some situations where avoiding $_ isn’t possible. Functions such as map and grep set $_ to each element in turn, so you have to refer to it explicitly inside their blocks.
For example, let’s look at map. It takes a list as input, evaluates a block once for each element, and returns a new list of the results. In this example, let’s consider a map used to generate squares.
my @numbers = (1, 2, 3, 4);
my @squares = map { $_ * $_ } @numbers;
The code is straightforward: take each value in the @numbers array, multiply it by itself, and store the result in the output array @squares. Each value from @numbers is passed into the map block as $_, and the explicit form of $_ is required to calculate the square.
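grep works the same way, and since it is mentioned above without an example, here is a brief sketch. It reuses the @numbers array; @evens and @even_squares are names chosen for illustration.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);

# grep sets $_ to each element and keeps those for which the block is true.
my @evens = grep { $_ % 2 == 0 } @numbers;

# The two can be chained: filter first, then square what is left.
my @even_squares = map { $_ * $_ } grep { $_ % 2 == 0 } @numbers;

print "@evens\n";          # prints "2 4"
print "@even_squares\n";   # prints "4 16"
```

In both blocks $_ is the only handle on the current element, which is why these functions count as exceptions.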
Final Thoughts
Perl’s $_ variable is powerful and versatile. It allows for great creativity and flexibility. However, overuse will overcomplicate your code. The choice boils down to trading readability for less code clutter. If you find yourself using $_ explicitly outside of the exceptions mentioned above, you would most likely benefit from assigning the value to a named variable instead.
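To make that last point concrete, here is a small comparison under assumed sample data. @words, @noisy, and @clear are hypothetical names; both loops build the same list, but the second states what each element is.

```perl
use strict;
use warnings;

my @words = ('perl', 'default', 'variable');   # sample data, assumed

# Explicit $_ repeated through the loop body: it works, but reads as noise.
my @noisy;
foreach (@words) {
    push @noisy, ucfirst($_) . ' (' . length($_) . ')';
}

# A named loop variable says what each element is.
my @clear;
foreach my $word (@words) {
    push @clear, ucfirst($word) . ' (' . length($word) . ')';
}

print join(', ', @clear), "\n";   # prints "Perl (4), Default (7), Variable (8)"
```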