PHP: Benchmark isset() or array_key_exists()?
The Twitter user caioariede posted that you should use array_key_exists() instead of isset(). He did not say why, so I checked performance first and got some interesting results...
Code
First the benchmark script:
<?php
/**
* PHP Array key exists
*/
$n = 1000000;
// First, a test with an empty array
$array = array();
$time_start = microtime(true);
$i = 0;
while($i < $n){
$devnull = isset($array[$i++]);
//var_dump($devnull);
}
$time_end = microtime(true);
$time_while1= $time_end-$time_start;
echo number_format($time_while1, 3, '.', '')
." seconds - isset(array[i]) on empty array \n";
$time_start = microtime(true);
$i = 0;
while($i < $n){
$devnull = array_key_exists($i++, $array);
//var_dump($devnull);
}
$time_end = microtime(true);
$time_while1= $time_end-$time_start;
echo number_format($time_while1, 3, '.', '')
." seconds - array_key_exists(array,i) on empty array \n";
$time_start = microtime(true);
$i = 0;
while($i < $n){
$devnull = (bool)@$array[$i++];
//var_dump($devnull);
}
$time_end = microtime(true);
$time_while1= $time_end-$time_start;
echo number_format($time_while1, 3, '.', '')
." seconds - cast array[i] on empty array \n";
// Create test array
$i = 0;
$array = array();
while($i < $n) {
$array[$i++] = true;
}
$time_start = microtime(true);
$i = 0;
while($i < $n){
$devnull = isset($array[$i++]);
//var_dump($devnull);
}
$time_end = microtime(true);
$time_while1= $time_end-$time_start;
echo number_format($time_while1, 3, '.', '')
." seconds - isset(array[i]) on full array \n";
$time_start = microtime(true);
$i = 0;
while($i < $n){
$devnull = array_key_exists($i++, $array);
//var_dump($devnull);
}
$time_end = microtime(true);
$time_while1= $time_end-$time_start;
echo number_format($time_while1, 3, '.', '')
." seconds - array_key_exists(array,i) on full array \n";
$time_start = microtime(true);
$i = 0;
while($i < $n){
$devnull = (bool)@$array[$i++];
//var_dump($devnull);
}
$time_end = microtime(true);
$time_while1= $time_end-$time_start;
echo number_format($time_while1, 3, '.', '')
." seconds - cast array[i] on full array \n";
?>
Result
0.819 seconds - array_key_exists(array,i) on empty array
3.194 seconds - cast array[i] on empty array
0.298 seconds - isset(array[i]) on full array
0.815 seconds - array_key_exists(array,i) on full array
1.760 seconds - cast array[i] on full array
Interpretation
I tested three ways to check whether an array key is in use. isset($array[$key]) returns true if the key exists and its value is not NULL. array_key_exists($key, $array) also returns a boolean, which is true if $key exists in $array. The last way is to fetch the value in $array at $key and cast it to boolean. If a key is not used, NULL is returned, which casts to false.
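The three checks do not always agree, which is easier to see side by side. Here is a minimal sketch; the sample array and the helper name check_all() are made up for illustration and are not part of the benchmark:

```php
<?php
// Hypothetical sample data: a falsy value, a NULL value and a truthy value.
$array = array('zero' => 0, 'null' => null, 'yes' => true);

// Runs all three checks from the text on one key.
function check_all(array $array, $key) {
    return array(
        'isset'            => isset($array[$key]),
        'array_key_exists' => array_key_exists($key, $array),
        'cast'             => (bool)@$array[$key],
    );
}

foreach (array('zero', 'null', 'yes', 'missing') as $key) {
    echo $key, ': ', json_encode(check_all($array, $key)), "\n";
}
// zero: {"isset":true,"array_key_exists":true,"cast":false}
// null: {"isset":false,"array_key_exists":true,"cast":false}
// yes: {"isset":true,"array_key_exists":true,"cast":true}
// missing: {"isset":false,"array_key_exists":false,"cast":false}
```

Note that the cast also reports false for stored values like 0, so it answers "is the value truthy?" rather than "is the key used?".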
I thought array_key_exists() would be much faster than isset(), but it is the other way round: isset() is about 60% faster (0.298 versus 0.815 seconds on the full array). Why? I do not know for sure; I hope some of you know more.
For some reason the cast is also damn slow. Maybe it is more work to fetch the value than to just search for the key.
Conclusion
My conclusion: use isset(), because it's faster and easier. I personally always forget whether $key or $array comes first in array_key_exists(). On top of that, isset() is less to type :D
Update
caioariede (Twitter) showed me that there IS a difference between isset() and array_key_exists(). Thanks for that hint.
isset() will return true if the value stored under that array key is not NULL. If you access an array key that is not defined yet, you always get NULL.
array_key_exists() searches the key list for the key and returns true if the key was found. This also works for something like "$array['key'] = NULL": isset() will return false, array_key_exists() will return true.
Now I can guess what is happening inside PHP.
isset() will just access the value of that array key, compare it internally with NULL and return the result. I think this can be done very fast, which would also fit my results.
array_key_exists() needs to search the list of keys until it finds a matching one. This would take longer than just accessing the value, because array_key_exists() has to compare the keys of the array with the one it is looking for.
But I still do not really know why the cast is so slow. Maybe accessing the value, returning it and casting it is a slower process than the other two techniques.
New Conclusion
I think this very small difference in behaviour is not worth giving up a ~60% performance gain. Most situations can be handled with isset(); the few scenarios where you need exactly this behaviour should be rare, but now you know what to use: array_key_exists().
Code 2
Code to prove:
<?php
// Empty array
$array = array();
var_dump(isset($array['key'])); // false
var_dump(array_key_exists('key', $array)); // false
// Key is true
$array['key'] = true;
var_dump(isset($array['key'])); // true
var_dump(array_key_exists('key', $array)); // true
// Key is unset()
unset($array['key']);
var_dump(isset($array['key'])); // false
var_dump(array_key_exists('key', $array)); // false
// Key is NULL
$array['key'] = NULL;
var_dump(isset($array['key'])); // false
var_dump(array_key_exists('key', $array)); // true
?>
As I posted on Twitter now, the problem is this:
$ php -r '$foo["bar"]=NULL;var_dump(isset($foo["bar"]));'
When value is NULL, isset returns FALSE. But it's set! see?
That's correct, Caio.
It also explains why array_key_exists() is slower than isset().
But I honestly cannot remember a scenario where I would have needed exactly the behaviour that only array_key_exists() has.
I checked the documentation and it is mentioned there: http://php.net/array_key_exists. I need to look closer next time :D
I usually check whether the value is usable. The MySQL definition of NULL, as I know it, reads “a missing unknown value”.
If I care about the existence of a value, I can surely use isset(). If the keys themselves matter, that is, what counts is that the key is there, array_key_exists() is the function to use.
Thanks for the benchmarks. I thought the difference would be much larger than your results show, since one is a function that does many things and the other is a language construct.
http://dev.mysql.com/doc/refman/5.0/en/working-with-null.html
You wrote: "But i still do not really know why the cast is so slow".
The answer is pretty simple: it is not the cast that is slow, but the error reporting switch.
Using @ works roughly like:
$tmp = error_reporting();
error_reporting(0);
// do cast
error_reporting($tmp);
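One way to check this claim is to time the same cast with and without @ on a full array, where no notice is raised either way. This is only a sketch under that assumption; the variable names $with_at and $without_at are made up, and the timings will vary by PHP version and machine:

```php
<?php
// Same size as the benchmark in the post; every key 0..$n-1 exists.
$n = 1000000;
$array = array_fill(0, $n, true);

// Cast with error suppression, as in the original benchmark.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $devnull = (bool)@$array[$i];
}
$with_at = microtime(true) - $start;

// The same read and cast, without the @ operator.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $devnull = (bool)$array[$i];
}
$without_at = microtime(true) - $start;

echo number_format($with_at, 3, '.', '') . " seconds - cast with @\n";
echo number_format($without_at, 3, '.', '') . " seconds - cast without @\n";
```

If the explanation above holds, the first loop should come out noticeably slower than the second.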
I have proposed a method by combining the isset() and array_key_exists() so that its performance is very close to isset() while providing the same result as array_key_exists() does.
If you're interested, just check it out here: http://www.zomeoff.com/php-fast-way-to-determine-a-keyelements-existance-in-an-array/
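A minimal sketch of such a combination, assuming it tries isset() first and only falls back to array_key_exists() when isset() returns false. The helper name fast_key_exists() is made up for illustration and is not taken from the linked article:

```php
<?php
// isset() answers quickly for every key with a non-NULL value;
// array_key_exists() is only reached for missing keys and NULL values.
function fast_key_exists($key, array $array) {
    return isset($array[$key]) || array_key_exists($key, $array);
}

$array = array('a' => 1, 'b' => null);
var_dump(fast_key_exists('a', $array)); // bool(true)
var_dump(fast_key_exists('b', $array)); // bool(true)
var_dump(fast_key_exists('c', $array)); // bool(false)
```

The result always matches array_key_exists(); how much time it saves depends on how often the keys hold non-NULL values.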
In fact, isset() is not a function. It is handled by the Zend parser and executed internally. It does not need to convert the name to lower case or scan the function table, but array_key_exists() does.
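The difference is visible from PHP itself. A short sketch; the variable $fn is just an illustrative name:

```php
<?php
// array_key_exists is a real function and is listed in the function table.
var_dump(function_exists('array_key_exists')); // bool(true)

// isset is a language construct, so no function of that name exists.
var_dump(function_exists('isset'));            // bool(false)

// A real function can be called through a variable holding its name.
$fn = 'array_key_exists';
var_dump($fn('key', array('key' => null)));    // bool(true)

// Doing the same with $fn = 'isset' would fail with
// "Call to undefined function isset()".
```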
[...] If you said “isset”, you’d be right! It is approximately 60% faster, as others have noted. [...]
Not sure you have explained very well why isset() is faster. You say isset() only needs to access it whilst array_key_exists() needs to search the entire array. How does isset() access without searching the entire array? Why doesn't array_key_exists() simply access?
The benchmark times are skewed by including the time it takes to run the loop itself. If you take that out, you will find that isset() is much faster, not just 60%.
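One way to take the loop out is to time an otherwise identical loop without any key check and subtract that baseline. This is a rough sketch; the names $baseline, $isset_time and $ake_time are made up, and with this much noise the net values are only an estimate:

```php
<?php
$n = 1000000;
$array = array_fill(0, $n, true);

// Baseline: the loop, the increment and the assignment, but no key check.
$start = microtime(true);
$i = 0;
while ($i < $n) {
    $devnull = $i++;
}
$baseline = microtime(true) - $start;

$start = microtime(true);
$i = 0;
while ($i < $n) {
    $devnull = isset($array[$i++]);
}
$isset_time = microtime(true) - $start;

$start = microtime(true);
$i = 0;
while ($i < $n) {
    $devnull = array_key_exists($i++, $array);
}
$ake_time = microtime(true) - $start;

// Net cost of each check after removing the loop overhead.
echo number_format($isset_time - $baseline, 3, '.', '') . " seconds net - isset\n";
echo number_format($ake_time - $baseline, 3, '.', '') . " seconds net - array_key_exists\n";
```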
Regarding your wondering why casting is so slow:
Casting is slow in nearly all programming languages, as this is a reflection. Depending on the language it is a class or object reflection. In most languages it is an object reflection as you reflect over the object to cast it.
You should avoid any kind of reflection when it comes to performance optimization. Typical (PHP) reflections: call_user_func, instanceof, new $classname, casting
Just in case you're curious, the actual reason why array_key_exists() is slower is that it's a function call, while isset() is a language construct.
isset() being a language construct means it can bypass the internal function wrapping. That wrapping is not exactly slow, but it is a constant time added to any function call.
The cast is by far the slowest because of your implementation. Every time you write @something, PHP makes a note of the current stack location, takes a copy of its error handling settings, turns error handling off, evaluates the code, checking with every expression whether it has reached the point it recorded earlier, and once it is back at that point it turns error handling back on. Without the @ symbol it will run in less than half that time, and with error handling turned off explicitly before the loop it is almost as fast as array_key_exists(). It can even be made almost as fast as isset() by replacing the cast with a double negation:
$devnull = !!$array[$i++];
A small comment:
On your point (and Steffen Haugk's) that array_key_exists() has to search the whole array and isset() does not:
Arrays (the keys) are organized as hash tables internally. Hash tables have an average access time of O(1), not O(n) (read: "look at every key to see if it fits") as you said. A hash table "knows" where the key would be if it exists and therefore does not have to search the whole array. Conclusion: this cannot be the cause of the performance difference between isset() and array_key_exists().
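This can be checked by timing the same number of lookups on a small and a large array. It is a sketch only: the helper name time_lookups() and the sizes are made up for illustration, and string keys are used so that PHP has to hash them:

```php
<?php
// Builds an array with $size string keys, then times $lookups calls to
// array_key_exists() for the key that was inserted last.
function time_lookups($size, $lookups = 100000) {
    $array = array();
    for ($i = 0; $i < $size; $i++) {
        $array['k' . $i] = true;
    }
    $last = 'k' . ($size - 1);

    $start = microtime(true);
    for ($i = 0; $i < $lookups; $i++) {
        $devnull = array_key_exists($last, $array);
    }
    return microtime(true) - $start;
}

foreach (array(1000, 200000) as $size) {
    echo number_format(time_lookups($size), 3, '.', '') . " seconds - $size keys\n";
}
```

If every lookup had to walk the key list, the second line would be roughly 200 times slower than the first; with a hash table the two should be in the same range.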