Checking information entered by users into a form is referred to as form validation. There are many different forms of validation, but the traditional pattern match function in PHP was eregi, the case-insensitive version of ereg. However, eregi was deprecated in PHP 5.3 and removed entirely in PHP 7, in favour of preg_match, strstr, and related functions, for technical reasons I won’t go into here. I’ll use preg_match for the examples: the function works in both earlier and later versions of PHP, and you don’t want to be caught short if your hosting provider upgrades PHP on their servers. Essentially there are two differences. Patterns used with preg_match must be surrounded by delimiter characters (conventionally / /), whereas eregi used none; and preg_match is case-sensitive unless you add the i modifier after the closing delimiter.
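The case-sensitivity difference is easy to trip over when moving old eregi code across, so here is a minimal sketch of it. The sample strings and variable names are made up for illustration:

```php
<?php
// preg_match returns 1 on a match and 0 on no match (false on error).
$caseSensitive   = preg_match("/bob/", "Bob");   // no match: "b" is not "B"
$caseInsensitive = preg_match("/bob/i", "Bob");  // the trailing i ignores case

// Delimiters need not be slashes; "#" is handy when the pattern contains "/".
$altDelimiter = preg_match("#bob#i", "Bob");

var_dump($caseSensitive, $caseInsensitive, $altDelimiter);
```

Adding the i modifier is the closest equivalent to what eregi did by default.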

No matter which version of the function you choose to use, it is usually employed as the condition in an if statement.

For a very simple example, let’s say that we only wanted to allow people named Bob to be accepted on a form. Using the basic form we have previously introduced, let’s test the value the user enters for their first name, using preg_match, by writing the following code inside the body of formhandler.php:

<p>
<?php
$pattern = "/Bob/";
if (preg_match($pattern, $_POST['firstname'])) {
	echo "Welcome, Bob!";
} else {
	echo "You’re not Bob!";
}
?>
</p>

(Note that both of our responses to someone entering their first name should be wrapped in a paragraph tag, so rather than placing the tags inside each echo() statement, I have placed the opening and closing paragraph tag around the PHP. The best way to understand this approach is to use View Source in your browser when looking at formhandler.php.)

You will find that “Bob” entered as a first name in form.php receives a positive response from formhandler.php when they press the submit button, but “Robert” receives a negative one.

This is good, but now try entering “Bobby” as the first name.

You will find that formhandler.php still greets us as “Bob”. The reason for that is simple: preg_match looks for the pattern of characters defined by $pattern anywhere in the submitted value, rather than requiring the whole string to match. formhandler.php will also respond positively to “robdomBobudob”, because by default the pattern may appear at any position in the string. (It is case-sensitive, though: “bob” will be rejected.)
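If you did want to accept only the exact name, anchoring the pattern is one way to do it. The following is a minimal sketch, with a made-up list of names standing in for the submitted form value:

```php
<?php
// Hypothetical sample inputs, standing in for $_POST['firstname'].
$names = array("Bob", "Bobby", "robdomBobudob", "bob");

// ^ anchors the match to the start of the string and $ to the end
// ($ also tolerates a single trailing newline), so nothing else may
// appear before or after "Bob".
$pattern = "/^Bob$/";

$results = array();
foreach ($names as $name) {
    $results[$name] = preg_match($pattern, $name);
}
print_r($results);
```

Only the first entry matches; the other three are rejected.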

This is, of course, a silly example; let’s make it better. Rather than trying to match a set of characters, let’s make it a range:

<?php
$pattern = "/[a-zA-Z]/";
if (preg_match($pattern, $_POST['firstname'])) {
	echo "You have entered your first name";
} else {
	echo "This is not a valid first name!";
}
?>

As a validation routine, this is better: now the user’s first name must contain at least one letter, which may be either uppercase or lowercase. “7” as a first name will not pass, but “7z” will.
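A quick way to convince yourself of this is to run the pattern over a few values outside the form. This is only a sketch; the sample inputs are invented:

```php
<?php
// Hypothetical sample inputs, standing in for $_POST['firstname'].
$samples = array("7", "7z", "Al", "");

$results = array();
foreach ($samples as $sample) {
    // One letter anywhere in the string is enough to satisfy the pattern.
    $results[$sample] = preg_match("/[a-zA-Z]/", $sample);
}
print_r($results);
```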

Still better:

<?php
$pattern = "/[a-zA-Z]{2}/";
if (preg_match($pattern, $_POST['firstname'])) {
	echo "You have entered your first name";
} else {
	echo "This is not a valid first name!";
}
?>

Now the first name must contain at least two consecutive letters: “Al”, for instance. But “Al3” will still pass, because the pattern says nothing about the rest of the string. We could reverse the logic:

<?php
$pattern = "/[0-9]/";
if (preg_match($pattern, $_POST['firstname'])) {
	echo "This is not a valid first name";
} else {
	echo "You have entered your first name correctly.";
}
?>

Now if the first name contains any numerals it will not pass, but other unwanted characters still will. Realistically, we only want letters in the first name, plus a few punctuation marks such as hyphens. Note that in this case we negate the test (the "!" for "not" in front of preg_match), so the error branch comes first:

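<?php
// The POSIX class is written [:alpha:], with a colon on both sides.
// A plain apostrophe is used so the pattern works without the u modifier.
$pattern = "/^[[:alpha:].' -]{2,15}$/";
if (!preg_match($pattern, $_POST['firstname'])) {
	echo "This is not a valid first name";
} else {
	echo "You have entered your first name correctly.";
}
?>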

Our pattern, as used in the if statement, could be translated as “Your first name must be between two and fifteen characters long, and may contain only letters, periods, apostrophes, spaces, and hyphens.” (Generally speaking the upper range value, 15 in this case, is redundant, as the number of characters that can be typed into a text field should be limited by a maxlength attribute (or a pattern) on the input. But it is not wrong in any way to double check, and from a security standpoint it’s a good idea to do so, since anything enforced in the browser can be bypassed.)
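To see that translation in action, the check can be wrapped in a small function and run against a handful of names. This is a sketch rather than a finished validator: the function name isValidFirstName and the sample names are my own inventions, and it assumes a plain apostrophe in the character class.

```php
<?php
// Hypothetical helper; the name isValidFirstName is not part of PHP.
function isValidFirstName($name)
{
    // Letters, period, apostrophe, space and hyphen only; 2 to 15 characters.
    // Without the u modifier, [:alpha:] generally means unaccented letters,
    // so names such as "Zoë" would be rejected by this sketch.
    return preg_match("/^[[:alpha:].' -]{2,15}$/", $name) === 1;
}

// Invented sample names: three that should pass, three that should not.
$samples = array("Al", "Mary-Jane", "O'Neil", "Al3", "A", "Bartholomew-James");
foreach ($samples as $sample) {
    echo $sample, ": ", isValidFirstName($sample) ? "valid" : "not valid", "\n";
}
```

“Al3” fails on the digit, “A” is too short, and “Bartholomew-James” runs past the fifteen-character limit.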
