My recent post on testing for negative 0 in JavaScript created a lot of interest. So today, I’m going to talk about another bit of JavaScript obscurity that was also inspired by a Twitter thread.
I recently noticed a tweet go by about this expression:
["1","2","3"].map(parseInt)
This was obviously a trick question. Presumably some programmer expected this expression to produce an array like [1, 2, 3] and it doesn’t. Why not? What does it actually produce? I didn’t cheat by immediately typing it into a browser, but I did quickly look something up in my copy of the ECMAScript 5 Specification. From the spec, it appeared clear that the answer would be:
[1, NaN, NaN]
I then typed the expression into a browser and that was exactly what I got. Before I explain why, you may want to stop here and see if you can figure it out.
OK, here is the explanation. parseInt is the built-in function that attempts to parse a string as a numeric literal and return the resulting number value. So, a function call like:
var n = parseInt("123");
should assign the numeric value 123 to the local variable n.
You might also know that if the string can’t be parsed as a numeric literal, parseInt will return the value NaN. NaN, which is an abbreviation for “Not a Number”, is a value that generally indicates that some sort of numeric computation error has occurred. So, a statement that stores the result of calling parseInt on a non-numeric string in x leaves x holding NaN.
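As a minimal sketch of the NaN case (the input strings here are illustrative choices, not the ones from the original statement), it may help to see how NaN behaves once you have it:

```javascript
// A string with no leading digits cannot be parsed at all.
var x = parseInt("hello");
// parseInt only needs a valid leading portion, so this still succeeds.
var partial = parseInt("12px");

console.log(isNaN(x));   // true
console.log(typeof x);   // "number" (NaN is still a number value)
console.log(x === x);    // false (NaN is never equal to itself)
console.log(partial);    // 12
```

Because NaN is not equal to itself, isNaN is the usual way to test for it.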
map is a built-in Array method that is in ECMAScript 5 and which has been available in many browsers for a while. map takes a function object as its argument. It iterates over each element of an array and calls the argument function once for each element, passing the element value as an argument. It accumulates the results of these function calls into a new array. Consider this example:
[1,2,3].map(function (value) {return value+1})
It will return a new array [2,3,4]. It is probably most common to see a function expression such as this passed to map, but it is perfectly valid to pass an already existing function object such as parseInt.
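Both styles can be sketched side by side. Math.sqrt is used below purely as an illustration of an existing function; it is my choice for the example, not something from the original tweet:

```javascript
// Passing a function expression written on the spot.
var plusOne = [1, 2, 3].map(function (value) { return value + 1; });

// Passing an already existing function object.
// Math.sqrt only looks at its first argument.
var roots = [1, 4, 9].map(Math.sqrt);

console.log(plusOne); // [2, 3, 4]
console.log(roots);   // [1, 2, 3]
```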
So, knowing the basics of parseInt and map, it is pretty clear that the original expression was intended to take an array of numeric strings and to return a corresponding array containing the numeric value of each string. Why doesn’t it work? To find the answer we will need to look more closely at the definition of both parseInt and map.
Looking at the specification of parseInt, you should notice that it is defined as accepting two arguments. The first argument is the string to be parsed and the second specifies the radix of the number to be parsed. So, parseInt("ffff",16) will return 65535 while parseInt("ffff",8) will return NaN because "ffff" doesn’t parse as an octal number. If the second argument is missing or 0 it defaults to 10, so parseInt("12",10), parseInt("12"), and parseInt("12",0) all produce the number 12.
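The radix behavior described above can be checked directly. The last line is an extra case I have added: when no radix is given and the string starts with "0x", parseInt treats it as hexadecimal rather than decimal.

```javascript
var hex = parseInt("ffff", 16);   // 65535
var octal = parseInt("ffff", 8);  // NaN, since "f" is not an octal digit

// A missing radix and a radix of 0 both fall back to 10 here.
var defaults = [parseInt("12", 10), parseInt("12"), parseInt("12", 0)];

// Added case: a "0x" prefix selects radix 16 when no radix is supplied.
var prefixed = parseInt("0x1f");  // 31

console.log(hex, octal, defaults, prefixed);
```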
Now look carefully at the specification of the map method. It refers to the function that is passed as the first argument to map as the callbackfn. The specification says, “the callbackfn is called with three arguments: the value of the element, the index of the element, and the object that is being traversed.” Read that carefully. It means that rather than three calls to parseInt that look like:
parseInt("1")
parseInt("2")
parseInt("3")
we are actually going to have three calls that look like:
parseInt("1", 0, theArray)
parseInt("2", 1, theArray)
parseInt("3", 2, theArray)
where theArray is the original array ["1","2","3"].
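One way to see this for yourself is to pass map a callback that simply records whatever it receives. This is only a small sketch; the calls array is a name I made up for the demonstration:

```javascript
var theArray = ["1", "2", "3"];
var calls = [];

theArray.map(function () {
  // Copy the arguments object so we can inspect it afterwards.
  calls.push(Array.prototype.slice.call(arguments));
});

console.log(calls.length);     // 3 calls, one per element
console.log(calls[1].length);  // 3 arguments on each call
console.log(calls[1][0]);      // "2" (the element value)
console.log(calls[1][1]);      // 1   (the element index)
console.log(calls[1][2] === theArray); // true (the array being traversed)
```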
JavaScript functions generally ignore extra arguments and parseInt only expects two arguments, so we don’t have to worry about the effect of the theArray argument in these calls. But what about the second argument? In the first call the second argument is 0, which we know defaults the radix to 10, so parseInt("1",0) will return 1. The second call passes 1 as the radix argument. The specification is quite clear about what happens in that case. If the radix is non-zero and less than 2 the function returns NaN without even looking at the string.
The third call passes 2 as the radix argument. This means that the string to convert is supposed to be a binary number consisting only of the digit characters "0" and "1". The parseInt specification (step 11) says it only tries to parse the substring to the left of the first character that is not a valid digit of the requested radix. The first character of the string is "3", which is not a valid base 2 digit, so the substring to parse is the empty string. Step 12 says that if the substring to parse is the empty string, the function returns NaN. So, the result of the three calls will be 1, NaN, and NaN.
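The walk-through above can be replayed one call at a time. The variable names are mine, chosen only for this sketch:

```javascript
var first = parseInt("1", 0);   // radix 0 falls back to 10, giving 1
var second = parseInt("2", 1);  // radix 1 is out of range, giving NaN
var third = parseInt("3", 2);   // "3" is not a binary digit, giving NaN

// The same three calls, made implicitly by map.
var result = ["1", "2", "3"].map(parseInt);

console.log(first, second, third); // 1 NaN NaN
console.log(result);               // [1, NaN, NaN]
```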
The programmer of the original expression made at least one of two possible mistakes that caused this bug. The first possibility is that they either forgot or never knew that parseInt accepts an optional second argument. The second possibility is that they forgot or never knew that map calls its callbackfn with three arguments. Most likely, it was a combination of both mistakes. The most common usage of parseInt passes only a single argument and most functions passed to map only use the first argument, so it would be easy to forget that additional arguments are possible in both cases.
There is a straightforward way to rewrite the original expression to avoid the problem. Use:
["1","2","3"].map(function(value) {return parseInt(value)})
instead of:
["1","2","3"].map(parseInt)
This makes it clear that the callbackfn only cares about a single argument and it explicitly calls parseInt with only one argument. However, as you can see, it is much more verbose and arguably less elegant.
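A variation on this fix, assuming decimal input is what was intended, is to pass the radix explicitly as well. This is a sketch rather than the only reasonable form; it also guards against strings with a "0x" prefix being read as hexadecimal:

```javascript
var numbers = ["1", "2", "3"].map(function (value) {
  // Always parse as decimal, whatever index map happens to pass along.
  return parseInt(value, 10);
});
console.log(numbers); // [1, 2, 3]

// Why the explicit radix can matter:
var implicitRadix = parseInt("0x10");     // 16, the prefix selects hexadecimal
var explicitRadix = parseInt("0x10", 10); // 0, parsing stops at the "x"
console.log(implicitRadix, explicitRadix);
```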
After I tweeted about this, there was an exchange about how JavaScript might be extended to avoid this problem or to at least make the fix less verbose. Angus Croll (@angusTweets) suggested the problem could be avoided simply by using the Number constructor as the callbackfn instead of parseInt. Number called in this manner will also parse a string argument as a decimal number and it only looks at one argument.
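Here is a small sketch of the Number approach. The comparison lines are my own additions: Number and parseInt are not interchangeable on every input, so it is worth knowing where they differ before swapping one for the other.

```javascript
var viaNumber = ["1", "2", "3"].map(Number);
console.log(viaNumber); // [1, 2, 3]

// Inputs where the two functions disagree:
var trailing = [Number("12px"), parseInt("12px", 10)]; // [NaN, 12]
var empty = [Number(""), parseInt("", 10)];            // [0, NaN]
var fraction = [Number("1.5"), parseInt("1.5", 10)];   // [1.5, 1]
console.log(trailing, empty, fraction);
```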
@__DavidFlanagan suggested that perhaps a mapValues method should be added which only passes a single argument to the callbackfn. However, ECMAScript 5 has seven distinct Array methods that operate similarly to map, so we would really have to add seven such methods.
I suggested the possibility of adding a method that might be defined like:
Function.prototype.only = function(numberOfArgs) {
   var self = this; //the original function
   return function() {
      //pass along only the first numberOfArgs arguments
      return self.apply(this,[].slice.call(arguments,0,numberOfArgs))
   }
};
This is a higher order function that takes a function (as its this value) and returns a new function that calls the original function but with an explicitly limited number of arguments. Using only, the original expression could have been written as:
["1","2","3"].map(parseInt.only(1))
which is only slightly more verbose and arguably retains a degree of elegance.
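If you would rather not extend Function.prototype, the same idea can be sketched as a standalone helper. The name limitArgs is hypothetical, made up for this example:

```javascript
// Hypothetical helper: returns a wrapper that forwards at most
// numberOfArgs arguments to fn.
function limitArgs(fn, numberOfArgs) {
  return function () {
    var args = Array.prototype.slice.call(arguments, 0, numberOfArgs);
    return fn.apply(this, args);
  };
}

var parsed = ["1", "2", "3"].map(limitArgs(parseInt, 1));
console.log(parsed); // [1, 2, 3]

// The wrapped function really does see only one argument.
var seen = limitArgs(function () { return arguments.length; }, 1)("a", "b", "c");
console.log(seen); // 1
```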
This led to a further discussion of curry functions (really partial function application) in JavaScript. Partial function application takes a function that requires a certain number of arguments and produces a new function that takes fewer arguments. My only method is an example of a function that performs partial function application. So is the Function.prototype.bind method that was added to ES5. Does JavaScript need such additional methods? For example, one could imagine a bindRight method that fixes the rightmost arguments rather than the leftmost. Perhaps, but what does rightmost even mean when a variable number of arguments is allowed? Probably a bindStartingAt method that took an argument position would be a better match to JavaScript.
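To make the distinction concrete, here is a rough sketch. The bind call is standard ES5. The bindStartingAt function is only one possible interpretation of the idea, written as a standalone function, and the power helper is invented for the example:

```javascript
// Standard bind fixes the leftmost arguments.
function power(base, exponent) { return Math.pow(base, exponent); }
var twoToThe = power.bind(null, 2);
console.log(twoToThe(10)); // 1024

// Hypothetical: fix the arguments from `position` onward and
// drop anything the caller supplies at or beyond that position.
function bindStartingAt(fn, position /*, fixed arguments... */) {
  var fixed = Array.prototype.slice.call(arguments, 2);
  return function () {
    var args = Array.prototype.slice.call(arguments, 0, position);
    // Pad with undefined if the caller supplied too few arguments.
    while (args.length < position) { args.push(undefined); }
    return fn.apply(this, args.concat(fixed));
  };
}

// Fix parseInt's radix (argument position 1) to 10.
var parseDecimal = bindStartingAt(parseInt, 1, 10);
var decimals = ["1", "2", "3"].map(parseDecimal);
console.log(decimals); // [1, 2, 3]
```

With this interpretation, the index and array that map passes are discarded before parseInt ever sees them.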
However, all this discussion of extensions really misses the key issue with the original problem. In order to use any of them, you first have to be aware of the optional argument mismatch between map and parseInt. If you are aware of the problem there are many ways to work around it. If you aren’t aware then none of the proposed solutions help at all. This really seems to be mostly an API design problem and raises some fundamental questions about the appropriate use of optional arguments in JavaScript.
Supporting optional arguments can simplify the design of an API by reducing the total number of API functions and by allowing many users to only have to think about the details of the most common use cases. But as we see above, this simplification can cause problems when the functions are naively combined in unexpected ways. What we are seeing in this example is that there are really two quite different use cases for optional arguments.
One use case looks at optional arguments from the perspective of the caller. The other use case is from the perspective of the callee. In the case of parseInt, its design assumes that the caller knows that it is calling parseInt and has chosen actual argument values appropriately. The second argument is optional from the perspective of the caller. If it wants to use the default radix it can ignore that argument. However, the actual specification of parseInt carefully defines what it (the callee) will do when called with either one or two arguments and with various argument values.
The other use case is more from the perspective of a different kind of function caller: a caller that doesn’t know what function it is actually calling and that always passes a fixed-size set of arguments. The specification of map clearly defines that it will always pass three arguments to any callbackfn it is provided. Because the caller doesn’t actually know the identity of the callee or what actual information the callee will need, map passes all available information as arguments. The assumption is that an actual callee will ignore any arguments that it doesn’t need. In this use case the second and third arguments are optional from the perspective of the callee.
Both of these are valid optional argument use cases, but when we combine them we get a software “impedance mismatch”. Callee optional arguments will seldom match with caller optional arguments. Higher order functions such as the bind or only methods can be used to fix such mismatches but are only useful if the programmer is aware that the mismatch exists. JavaScript API designers need to keep this in mind and every JavaScript programmer needs to take extra care to understand what exactly will be passed to a function used as a “call back”.
Update 1: Correctly credit Angus Croll for map(Number) suggestion.