Understanding the Infamous “Just Another Perl Hacker”

As a developer, I have always been fascinated by obfuscation. Not only do its authors create a functioning piece of code, but they have such a grasp of their chosen language that the result is difficult to decipher even for a seasoned developer.

Many developers consider code obfuscation and coding tricks to be an abomination in software. However, obfuscation has its place and is not a frivolous activity. It’s true that coding in an obfuscated way or using tricks in a shared repository will quickly make you unpopular with teammates. However, gaining a greater understanding of your language can help your progression as a software developer and is invaluable when working with legacy code, where best practices aren’t always followed.

The “Just Another Perl Hacker” script first appeared on the Perl forums as a way of writing a signature hidden in an obfuscated Perl script. It later became an ongoing phenomenon within the Perl community, always built around the text “Just Another Perl Hacker”.

This now infamous “Just Another Perl Hacker” (JAPH) was posted on the Perl Monks forum by a user named Blokhead. The script contains many elements that can confuse even an experienced Perl developer, because only Perl keywords are used to generate the desired output. The immediate question most developers have is “How can this script produce the desired output when the script clearly doesn’t contain the text?”. For developers who take this question one step further and begin deciphering the code, this can be the beginning of the descent into the Perl rabbit hole.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
eval lc q m cos and print chr ord
for qw y abs ne open tied hex exp
ref y m xor scalar srand print qq
q q xor int eval lc qq y sqrt cos
and print chr ord for qw x printf
each return local x y or print qq
s s and eval q s undef or oct xor
time xor ref print chr int ord lc
foreach qw y hex alarm chdir kill
exec return y s gt sin sort split

What makes this script so impressive, even to a non-Perl developer, is not only the lack of anything but Perl keywords, but also the symmetry of the script. Every single line of the code finishes at the same position as every other line.

The first thing to note is that the script contains no statement terminators (the “;” that normally ends a Perl statement), so the whole script is a single statement. The second is that it consists of multiple expressions chained together with logical operators. Here and, or and xor are used to chain the elements together, although other operators could be used.

The only restriction is on how the operators chain together. For example, an and is only “true” when both its first and second inputs are “true”. If the first input is “false”, the output must be “false”, so Perl skips the second input expression entirely. The chain therefore has to be arranged so that every expression that needs to run is actually evaluated.

This is one of the reasons why or and xor were used. xor must always evaluate its second input, whatever the first input is, because its result depends on both. or evaluates its second input only when the first input is “false”, and and only when the first input is “true”, so wherever these two appear the first input has to produce the right value for the second to run. Swapping between or and xor also lets the line length be balanced: where a line is too short, changing or to xor adds a single character, and vice versa, provided the first input is “false” at that point so that both operators behave the same way.
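The difference between the three operators can be checked with a minimal sketch. The note helper and its labels are hypothetical names made up for this illustration; each call records that it ran and returns the value it was given.

```perl
use strict;
use warnings;

my @ran;    # labels of the operands that were actually evaluated

# Hypothetical helper for this sketch: record the label, return the value.
sub note {
    my ($label, $value) = @_;
    push @ran, $label;
    return $value;
}

# and: the right side runs only when the left side is true.
my $and = (note('and-left', 0) and note('and-right', 1));

# or: the right side runs only when the left side is false.
my $or = (note('or-left', 1) or note('or-right', 1));

# xor: both sides always run, because the result depends on both.
my $xor = (note('xor-left', 1) xor note('xor-right', 1));

print join(', ', @ran), "\n";
# prints: and-left, or-left, xor-left, xor-right
```

Only the xor line records both of its operands, which is why xor is the safest glue for a chain in which every piece has to run.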

Now that we’ve identified how the multiple statements chain together, we can pick these elements out in the code: look for the not, and, or and xor tokens.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
eval lc q m cos and print chr ord
for qw y abs ne open tied hex exp
ref y m xor scalar srand print qq
q q xor int eval lc qq y sqrt cos
and print chr ord for qw x printf
each return local x y or print qq
s s and eval q s undef or oct xor
time xor ref print chr int ord lc
foreach qw y hex alarm chdir kill
exec return y s gt sin sort split

What about Printing the Text?

The text is printed using a trick involving converting a word to an ASCII numeric value and back into an ASCII character. To convert an ASCII character into its numeric value the ord operator is used. The caveat to this operator is that if a “string” is supplied to its input it only returns the numeric value of the first character. This can then be converted back to an ASCII character using the chr operator. Therefore, if we do something like:

print chr ord "join";

The script will take the string “join” and with ord convert its first character into its numeric value “106”. Then chr will convert it back to the character “j”.
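A small sketch of the round trip described above, with the intermediate value kept in variables (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $code   = ord "join";    # ord only looks at the first character, "j"
my $letter = chr $code;     # chr turns the number back into a character

print "$code $letter\n";    # prints "106 j"
```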

How Can We Print without Any Strings?

Now that we know how to print individual letters, the next question is how strings are passed into this if everything is a keyword. This is handled with quote words. In Perl, a list of words can be stored in an array:

my @words = ('foo', 'bar');

However, this isn’t the only way to write a list of words. The qw (“quote words”) operator can be used in front of the parentheses. It tells Perl to split everything within the parentheses on whitespace and treat each word as a quoted string, so the quotes and commas are no longer required. Using this, the code can be rewritten as follows:

my @words = qw(foo bar);

What’s more interesting is that the parentheses are not required either: almost any other character can be used as the delimiter in their place, even a letter that is itself a keyword. The only restrictions are that a character with no matching closing bracket must mark both the start and the end of the quoted words, and that a letter used as a delimiter has to be separated from qw by whitespace. For example, replace the parentheses with the letter q, which is itself a Perl quote operator.

my @words = qw q foo bar q;

The script uses this to generate the list of words. The only thing remaining is to loop over each word in the list and apply the chr ord operation to reduce it to a single letter. This only requires a for placed between the chr ord and the list. As an example, take printing the string “perl”. Here we could use the letter x as the delimiter for the quote words and the keywords printf each return local as a replacement for the text.

print chr ord for qw x printf each return local x
>>> perl
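The same idea can be sketched with the result collected into a variable rather than printed directly, which makes it easier to compare the different spellings of the list. The variable names here are only for illustration.

```perl
use strict;
use warnings;

# The same list three ways: explicit quotes, qw with parentheses,
# and qw with the letter x as the delimiter.
my @plain  = ('printf', 'each', 'return', 'local');
my @parens = qw(printf each return local);
my @letter = qw x printf each return local x;

# chr(ord(...)) keeps only the first character of each word.
my $word = join '', map { chr(ord($_)) } @letter;

print "$word\n";    # prints "perl"
```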

Below is the script again. The places where printing takes place are each print that is followed by chr and ord and a for (or foreach) over a qw list. Other operators such as int (convert to an integer) and lc (make lower case) are mixed in, but in ways that have no effect on the result.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
eval lc q m cos and print chr ord
for qw y abs ne open tied hex exp
ref y m xor scalar srand print qq
q q xor int eval lc qq y sqrt cos
and print chr ord for qw x printf
each return local x y or print qq
s s and eval q s undef or oct xor
time xor ref print chr int ord lc
foreach qw y hex alarm chdir kill
exec return y s gt sin sort split

How Are Spaces Printed?

Until now we’ve created letters using the first letter of a keyword, but how can we do this for a space character if no keyword starts with a space? The solution comes from another operator, the quote operator qq. This operator can be used in place of the double quotes around a string, so assigning a string to a variable:

my $text = "hello world";

Is the same as:

my $text = qq(hello world);

Again, the parentheses can be replaced with almost anything so long as the beginning and end match, and the string can be just a space. So to produce a space we can replace the parentheses with the letter q to make:

my $text = qq q q;

The script uses this form wherever it prints a space (print qq q q and print qq s s):

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
eval lc q m cos and print chr ord
for qw y abs ne open tied hex exp
ref y m xor scalar srand print qq
q q xor int eval lc qq y sqrt cos
and print chr ord for qw x printf
each return local x y or print qq
s s and eval q s undef or oct xor
time xor ref print chr int ord lc
foreach qw y hex alarm chdir kill
exec return y s gt sin sort split
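A quick way to convince yourself that these odd-looking forms really are a single space is to compare them with an ordinary quoted space. This is only a sketch and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $plain  = " ";
my $with_q = qq q q;    # delimiter q, content is one space
my $with_s = qq s s;    # delimiter s, the form used near the end of the script

print "[$with_q][$with_s]\n";    # prints "[ ][ ]"
```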

How Does Printing Text Fit into the Code?

Printing the text using chr ord isn’t the whole story in this script. Surrounding the print elements are other operations that allow the print statements to work. First, the code surrounding the print statements contains eval. The code after each eval (such as the print statement) is wrapped within boundary markers, so Perl reads it as a string, and eval then executes that string as Perl code. Taken on its own, each print statement behaves the same way with or without this wrapper, but the wrapper is what lets a statement containing for sit in the middle of a chain of logical operators. Other sections of code within the eval are effectively no-operation code: they just evaluate to either “true” or “false” and are combined with a logical operator (and or or).

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
eval lc q m cos and print chr ord
for qw y abs ne open tied hex exp
ref y m xor scalar srand print qq
q q xor int eval lc qq y sqrt cos
and print chr ord for qw x printf
each return local x y or print qq
s s and eval q s undef or oct xor
time xor ref print chr int ord lc
foreach qw y hex alarm chdir kill
exec return y s gt sin sort split
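The role of eval can be sketched in a simplified form. This is an illustration rather than the script’s own code: it uses ordinary braces and parentheses as delimiters, leaves out the padding such as cos and lc, and appends to a hypothetical $out variable instead of printing so that the result can be inspected.

```perl
use strict;
use warnings;

my $out = '';    # collects the letters instead of printing them

# The string is ordinary text until eval compiles and runs it.
my $code = q{ $out .= chr ord for qw(abs ne open tied hex exp ref) };

# eval returns a value, so it can be one operand of xor even though the
# code inside it uses the statement modifier "for".
my $chain = (1 xor eval $code);
die $@ if $@;

print "$out\n";    # prints "another"
```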

How Does the First Print Statement Work?

The first print works slightly differently from the other print statements. You’ll notice that the eval happens after the print rather than before, and if you attempt to execute the initial piece of code before the eval, Perl will output nothing.

The code that prints the output “just” is in fact the majority of the first four lines. They are shown again here; the active part runs from the first s on line one to the eval on line four.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int

The first thing that should confuse you here is that the opening and closing markers for the quote words operator qw don’t appear to match. As described above, the markers must match for the code to execute correctly. In this case, however, the qx at the end of line three is not a single marker but two, placed together so that they look like another valid Perl operator, qx.

Here, the q is the closure of the quote word, and the x is the closure of a substitution regular expression. To understand how this works a quick primer on Perl regular expressions and the Perl default variable $_ is required.

To start with, the Perl default variable $_ is undefined at the beginning of the execution of a script. Many built-in constructs use it to hold the current value when no variable is named explicitly. So, if we were to write code that loops over an array printing each value, we could write:

@a = (1, 2, 3, 4);

for my $value (@a) {
    print $value;
}

However, using the Perl default variable we could shorten this to:

@a = (1, 2, 3, 4);

for (@a) {
    print $_;
}

This has the same functionality, as Perl assigns each array element to $_. What makes this more interesting is that $_ is not just the default variable for assignment. It’s also the default input for many functions, so we need not mention it in the code at all. The code can therefore be shortened further:

@a = (1, 2, 3, 4);

for (@a) {
    print;
}

And the print statement will use $_ by default. This on its own is a big part of Perl obfuscation and Perl golf, as it effectively hides variable assignment and use. It is used in combination with the substitution regex to print the first word, “just”.

The Perl substitution operator is the next part that needs to be understood. It takes a string, matches part or all of it against a regular expression pattern, and replaces the matched part with something else. Below we use the substitution operator s/{pattern}/{substitution}/ to replace the word “cat” with “dog”. The first input of the substitution operator is the regular expression pattern used to match text (in this case just the word “cat”), and the second input is the text that will replace the matched text within the string (in this case the word “dog”).

my $words = "Hello cat";
$words =~ s/cat/dog/;
print($words);
>>> Hello dog

This also works with $_ as the implicit input, and once again we can replace the delimiters around the operator with anything else, so long as they match. At the beginning of a script, the $_ variable has yet to be defined. We can match on this by using an empty pattern //, and with the substitution operator we can replace it with whatever we like. The script below takes the empty $_ variable, replaces it with the word “Hello”, and then uses it as the default input to the print statement.

s//Hello/;
print;
>>> Hello

We can replace the forward slashes with the letter q as the delimiter to bring it closer to the script of interest.

s qqHelloq;
print;
>>> Hello

This can appear confusing as it looks as if the marker letters are merged into the replacement string. However, Perl correctly identifies the boundaries of the operation and applies the substitution correctly.

Now that we have all the pieces, we can fit them together to see how the word “just” is printed in the main script. First, take the $_ variable and set it to a space character.

Hidden Operations

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int

Here the qx is two things: q is the end of the substitution and x is a modifier to the substitution’s regular expression. The modifier x allows white space within the regex. This can give the reader greater clarity about what’s happening within a regular expression, as it also allows for comments. In this case, however, it’s purely decorative.

Now generate the text as we have previously. Here there are a few extra operations before the print (length uc ord). These evaluate to “true” and are passed into the and, so the print always runs. They effectively do nothing except add to the confusion of the code and help pad the line to the correct length.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int

Next, perform a substitution on the $_ variable and replace its contents with the print operation.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int

As $_ at this point contains a single space, the match is done using a single space (s x x), and as the print section sits within the replacement part of the substitution, the entire print section is treated as text and not code. This means that $_ is now set to the code within the substitution, making it (whitespace aside):

$_ = 'length uc ord and print chr ord for qw q join use sub tied q';

This should make it more recognisable. Now that the code is stored within the $_ variable as a string, we need to execute it. Here again the eval function is used to execute the string as Perl code. Perl uses the $_ variable as the default input to this function, and it’s this eval that executes the print code stored within $_ and prints “just”.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
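The three steps can be run one at a time as separate statements. The sketch below keeps the script’s own delimiters but adds scaffolding that is not part of the original: an in-memory filehandle and select, so that whatever print produces ends up in a hypothetical $out variable, and a $stored copy of $_ taken before the eval. Warnings are left off because substituting into an undefined $_ would otherwise produce one.

```perl
use strict;

# Send print's output into a string so the result can be inspected afterwards.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
my $previous = select $fh;

undef $_;    # $_ is undefined at the top of a fresh script
s qq qx;     # same as s// /x: $_ becomes a single space

# Same as s/ /CODE/: the space is replaced by text that happens to be Perl.
s x x length uc ord and print chr ord for qw q join use sub tied qx;
my $stored = $_;

eval;        # eval with no argument runs the string held in $_

select $previous;
close $fh;

print "stored code:$stored\n";
print "printed: $out\n";    # printed: just
```

Nothing is printed until the final eval runs, which matches the observation that the initial piece of code on its own outputs nothing.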

What About the Remaining Code?

The remaining sections of code before and after these operations are no-operation code. They just evaluate to “true” or “false”. They are purely decorative, used to further confuse the reader and pad the line length of the code.

not exp log srand xor s qq qx xor
s x x length uc ord and print chr
ord for qw q join use sub tied qx
xor eval xor print qq q q xor int
eval lc q m cos and print chr ord
for qw y abs ne open tied hex exp
ref y m xor scalar srand print qq
q q xor int eval lc qq y sqrt cos
and print chr ord for qw x printf
each return local x y or print qq
s s and eval q s undef or oct xor
time xor ref print chr int ord lc
foreach qw y hex alarm chdir kill
exec return y s gt sin sort split
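With the padding, the eval wrappers and the unusual delimiters stripped away, what remains is roughly the following. This is a readable equivalent written for illustration, not the original author’s code, and the in-memory filehandle is only there so that the combined output can be checked in one place.

```perl
use strict;
use warnings;

# Capture everything print produces in a string.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
my $previous = select $fh;

print chr ord for qw(join use sub tied);                   # just
print " ";
print chr ord for qw(abs ne open tied hex exp ref);        # another
print " ";
print chr ord for qw(printf each return local);            # perl
print " ";
print chr ord for qw(hex alarm chdir kill exec return);    # hacker

select $previous;
close $fh;

print "$out\n";    # prints "just another perl hacker"
```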

Conclusion

There you have it, the infamous “Just Another Perl Hacker” script explained. If you found the construction of this script interesting, I highly recommend taking your interest further. Exploring the darker side of whichever programming language you use is worthwhile: obfuscation will give you a greater understanding and appreciation of your language rather than causing you to develop a “bad” style.
