PHP, Arrays and URL matching

Well, Dave Shea just posted some PHP code that begs to be worked on. It also raises a general issue that most beginning programmers face. If programming is not your cup of tea, then skip this ;)

The problem with much of the code aspiring programmers write is that it is needlessly inefficient. While I agree that most optimization is unnecessary, especially in this day and age, a basic knowledge of efficient programming practices is a big step forward. Arrays in PHP are an extremely good example of this.

PHP handles all arrays as ordered maps, so indexing by some defining factor is much more efficient than creating an array without meaningful indexes and searching for a given value by traversing the whole array. To illustrate this concept, let's first take a look at Dave's code (my apologies for ripping it, but it makes a good example):

$item[0] = "1976design.com";
$item[1] = "andybudd.com";
$item[2] = "7nights.com";
$item[3] = "wholelottanothing.org";
$item[4] = "cssvault.com";
$item[5] = "designbyfire.com";

$itemCount = 6;

…
for ($i = 1; $i < $itemCount; $i++) {
    if (eregi(($item[$i]), $authorURL)) {
        if ($authorURL != "") return " voce";
    }
}
…

He does other things in the code as well, but they aren't important for this discussion. The main issue here is that for every comment that is posted, the array $item is traversed from beginning to end unless a match is found first. (As quoted, the loop also starts at $i = 1, so $item[0] is never tested.) The following code is cleaner but still not as efficient as it could be.

$item = array('mezzoblue.com' => ' dave', 
                        '1976design.com' => ' voce', 
                        'andybudd.com' => ' voce', 
                        …
                        'http://none/' => ' troll');

foreach ($item as $uri => $class) {
    if (eregi($uri, $authorURL))
        return $class;
}
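
For anyone trying this on a current PHP version, here is a self-contained sketch of the same loop. eregi() was removed in PHP 7, so this version assumes stripos() as a stand-in for the case-insensitive match; the function name classForAuthor, the variable $classesByHost and the sample comment URL are made up for illustration and are not part of Dave's code.

```php
<?php
// Hypothetical helper: returns the CSS class for a commenter's URL,
// or '' when none of the known host names appears in it.
function classForAuthor(string $authorURL, array $classesByHost): string
{
    foreach ($classesByHost as $host => $class) {
        // Case-insensitive substring test. This is still a linear scan:
        // every entry may be checked before a match is found.
        if (stripos($authorURL, $host) !== false) {
            return $class;
        }
    }
    return '';
}

$classesByHost = [
    'mezzoblue.com'  => ' dave',
    '1976design.com' => ' voce',
    'andybudd.com'   => ' voce',
];

echo classForAuthor('http://www.AndyBudd.com/blog/', $classesByHost); // prints " voce"
```

Note that a plain substring test treats the dot in a host name literally, whereas in a regular expression it would match any character.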

But an even better solution is to use the absolute URL that an author is likely to enter as the key and simply look that key up.

$item = array('http://mezzoblue.com/' => ' dave', 
                        'http://1976design.com/' => ' voce', 
                        'http://andybudd.com/' => ' voce', 
                        …
                        'http://none/' => ' troll');

if (array_key_exists($authorURL, $item))
    return $item[$authorURL];

Since the URIs that authors will use are known (some simple parsing and fixing can be done before the comparison, e.g. adding or removing trailing slashes), the last example is the most efficient. When you have a site with many readers and often large numbers of comments, thinking about efficiency does save a lot of work.
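
The "simple parsing and fixing" step could look something like the sketch below. The rules it applies are my own assumptions about what a sensible cleanup might be: lowercase everything, drop a leading "www.", ignore any path, force the http scheme, and always end with a single slash. The names normalizeAuthorUrl and $classesByUrl are hypothetical as well.

```php
<?php
// Hypothetical normalizer: reduces a commenter's URL to the form
// "http://host/" so it can be used directly as an array key.
// Returns '' when the input has no recognizable host (for example,
// when the scheme is missing).
function normalizeAuthorUrl(string $url): string
{
    $parts = parse_url(strtolower(trim($url)));
    if ($parts === false || !isset($parts['host'])) {
        return '';
    }
    // Assumption: "www.example.com" and "example.com" are the same author.
    $host = preg_replace('/^www\./', '', $parts['host']);
    return 'http://' . $host . '/';
}

$classesByUrl = [
    'http://mezzoblue.com/'  => ' dave',
    'http://1976design.com/' => ' voce',
    'http://andybudd.com/'   => ' voce',
];

$key = normalizeAuthorUrl('HTTP://www.AndyBudd.com');
// One keyed lookup replaces the loop over every entry.
$class = $classesByUrl[$key] ?? '';
echo $class; // prints " voce"
```

The cost of the normalization is paid once per comment, no matter how many entries the array holds.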

As a final note, variable names should be more descriptive than those used in the previous examples. $item as an array's name is in fact quite misleading. A better approach would be to use the plural form to imply that the array may hold more than one value, e.g. $items. In this case, a more descriptive name would improve the readability of the code when Dave decides to work on it sometime in the future.