Working with Forms

The following is Chapter 4 from Wicked Cool PHP by William Steinmetz with Brian Ward. Reprinted with permission.

Security Measures: Forms Are Not Trustworthy

A common mistake that novices make is to trust the data provided by an HTML form. If you have a drop-down menu that only allows the user to enter one of three values, you must still check those values. You also cannot rely on JavaScript to stop people from sending whatever they like to your server.

Your site’s users can write their own form in HTML to use against your server; users can also bypass the browser entirely and use automatic tools to interact with web scripts. You should assume that people will mess around with parameters when you put a script on the Web, because they might be trying to discover an easier way to use your site (though they could be attempting something altogether less beneficial).

To ensure that your server is safe, you must verify all data that your scripts receive.

Verification Strategies

There are two approaches to checking form data: blacklisting and whitelisting.

Blacklisting is the process of trying to filter out all bad data by assuming that form submissions are valid and then explicitly seeking out bad data. In general, this technique is ineffective and inefficient. For example, let’s say that you’re trying to eliminate all “bad” characters from a string, such as quotes. You might search for and replace quotation marks, but the problem is that there will always be bad characters you didn’t think of. In general, blacklisting assumes that most of the data you receive is friendly.

A better assumption to make about form data you’re receiving is that it’s inherently malicious; thus, you should filter your data in order to accept only valid data submissions. This technique is called whitelisting. For example, if a string should consist of only alphanumeric characters, then you can check it against a regular expression that matches only an entire string of A-Za-z0-9. Whitelisting may also include forcing data to a known range of values or changing the type of a value. Here is an overview of a few specific tactics:

  • If the value should be a number, use the is_numeric() function to verify the value. You can force a value to an integer using the intval() function.
  • If the value should be an array, use is_array().
  • If the value should be a string, use is_string(). To force it, use strval().
  • If the value should be null, use is_null().
  • If the value should be defined, use isset().
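
As a rough illustration of the whitelisting idea, here is a minimal sketch that checks a string against an alphanumeric pattern and checks a drop-down value against the options the form actually offered. The function names is_alphanumeric() and is_allowed_option() and the $sizes list are assumptions made for this example; they are not part of PHP or of the original text.

```php
<?php
// Hypothetical helper: accept only a non-empty string of A-Z, a-z, and 0-9.
function is_alphanumeric($value) {
    // The D modifier keeps $ from matching just before a trailing newline.
    return is_string($value) && preg_match('/^[A-Za-z0-9]+$/D', $value) === 1;
}

// Hypothetical helper: accept only a value that appears in a known list.
function is_allowed_option($value, array $allowed) {
    // The third argument requests strict comparison (no type juggling).
    return in_array($value, $allowed, true);
}

$sizes = array('small', 'medium', 'large');  // assumed drop-down values

var_dump(is_alphanumeric('User42'));            // bool(true)
var_dump(is_alphanumeric("User42\n"));          // bool(false)
var_dump(is_alphanumeric('User 42'));           // bool(false)
var_dump(is_allowed_option('medium', $sizes));  // bool(true)
var_dump(is_allowed_option('huge', $sizes));    // bool(false)
```

Both checks say what is allowed rather than what is forbidden, so input you never anticipated is rejected by default.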

WHITELISTING INTEGERS

Here’s a typical example of how you might whitelist for a numeric value. If the data is not numeric, then you use a default value of zero (of course, this assumes that zero is an acceptable value):

if (! is_numeric($data)) {
   // Use a default of 0.
   $data = 0;
}

In the case of integers, there is an alternative if you know that all integer values are safe. Using $data = intval($data); forces $data to its integral value. This technique is called typecasting.
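
The difference between checking and forcing is easier to see side by side. The sketch below compares is_numeric() with intval(), then shows one possible way to force an integer into a known range using PHP’s built-in filter_var() function, which the text above does not cover. The helper name whitelist_int() and the range 1 to 100 are assumptions for illustration only.

```php
<?php
// is_numeric() accepts any numeric string, not only integers.
var_dump(is_numeric('42'));     // bool(true)
var_dump(is_numeric('4.2'));    // bool(true)
var_dump(is_numeric('1e3'));    // bool(true)
var_dump(is_numeric('42abc'));  // bool(false)

// intval() never fails: it keeps leading digits and discards the rest.
var_dump(intval('42abc'));  // int(42)
var_dump(intval('abc'));    // int(0)
var_dump(intval('3.9'));    // int(3)

// Hypothetical helper: accept an integer only if it lies in [$min, $max];
// otherwise fall back to $default.
function whitelist_int($data, $min, $max, $default) {
    $options = array(
        'options' => array('min_range' => $min, 'max_range' => $max),
    );
    // filter_var() returns the integer on success and false on failure.
    $value = filter_var($data, FILTER_VALIDATE_INT, $options);
    return ($value === false) ? $default : $value;
}

var_dump(whitelist_int('7', 1, 100, 0));      // int(7)
var_dump(whitelist_int('500', 1, 100, 0));    // int(0)
var_dump(whitelist_int('42abc', 1, 100, 0));  // int(0)
```

Note that intval() silently turns "42abc" into 42, whereas the range check rejects it outright. Which behavior you want depends on whether a partly valid value should be salvaged or refused.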

Using $_POST, $_GET, $_REQUEST, and $_FILES to Access Form Data

In Chapter 2, we showed you how to turn off the register_globals setting that automatically sets global variables based on form data.

To shut down this dangerous setting, refer to “#14: Turning Off Registered Global Variables” on page 25. How do you use $_POST, $_FILES, and $_GET to retrieve form data? Read on.

#25: Fetching Form Variables Consistently and Safely

You should pull form data from predefined server variables. All data passed on to your web page via a posted form is automatically stored in a large array called $_POST, and all GET data is stored in a large array called $_GET. File upload information is stored in a special array called $_FILES (see “#54: Uploading Images to a Directory” on page 97 for more information on files). In addition, there is a combined variable called $_REQUEST.

To access the username field from a POST method form, use $_POST['username']. Use $_GET['username'] if the username is in the URL. If you don’t care where the value came from, use $_REQUEST['username'].

<?php

$post_value = $_POST['post_value'];
$get_value = $_GET['get_value'];
$some_variable = $_REQUEST['some_value'];

?>

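$_REQUEST is a union of the $_GET, $_POST, and $_COOKIE arrays. If two or more of these sources contain the same parameter name, be careful about which one PHP uses. By default, a cookie value takes precedence over a POST value, which in turn takes precedence over a GET value; the variables_order and request_order settings in php.ini can change this.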

There has been some debate on how safe $_REQUEST is, but there shouldn’t be. Because all of its sources come from the outside world (the user’s browser), you need to verify everything in this array that you plan to use, just as you would with the other predefined arrays. The only problems you might have are confusing bugs that might pop up as a result of cookies being included.
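
One way to fetch form values consistently is to route every read through a small helper that handles missing fields and unexpected types in one place. The following is only a sketch: fetch_string() is a hypothetical function name, and the $_POST array is filled in by hand so that the example runs from the command line.

```php
<?php
// Hypothetical helper: read one string field from a source array such as
// $_POST or $_GET, returning $default if it is missing or not a string.
function fetch_string(array $source, $name, $default = '') {
    if (!isset($source[$name]) || !is_string($source[$name])) {
        return $default;
    }
    return trim($source[$name]);
}

// Simulated request data (an assumption for this example).
$_POST = array(
    'username' => '  alice ',
    'tags'     => array('a', 'b'),  // an array where a string is expected
);

$username = fetch_string($_POST, 'username');       // "alice"
$email    = fetch_string($_POST, 'email', 'none');  // "none" (field missing)
$tags     = fetch_string($_POST, 'tags');           // "" (wrong type)

print "$username / $email / [$tags]\n";  // prints "alice / none / []"
```

Passing the source array explicitly keeps it obvious whether a value came from $_POST or $_GET, which sidesteps the cookie-related surprises that $_REQUEST can cause.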

#26: Trimming Excess Whitespace

Excess whitespace is a constant problem when working with form data. The trim() function is usually the first tool a programmer turns to, because it removes whitespace (spaces, tabs, and newlines) from the beginning and end of a string. For example, “Wicked Cool PHP                  ” becomes “Wicked Cool PHP.” In fact, it’s so handy that you may find yourself using it on almost every available piece of user-supplied, non-array data:

$user_input = trim($user_input);

But sometimes you have excessive whitespace inside a string, for instance when someone copies and pastes information from an email. In that case, you can replace multiple spaces and other whitespace with a single space by using the preg_replace() function. The preg prefix marks PHP’s Perl-compatible regular expression functions; regular expressions are a powerful form of pattern matching that you will see several times in this chapter.

<?php
function remove_whitespace($string) {
   $string = preg_replace('/\s+/', ' ', $string);
   $string = trim($string);
   return $string;
}
?>

You’ll find many uses for this script outside of form verification. It’s great for cleaning up data that comes from other external sources.
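
To see the effect, here is a short usage sketch. It repeats the function so that the example is self-contained; the sample string is made up for illustration.

```php
<?php
function remove_whitespace($string) {
    // Collapse every run of spaces, tabs, and newlines into one space.
    $string = preg_replace('/\s+/', ' ', $string);
    // Then strip the single space that may remain at either end.
    return trim($string);
}

// Text as it might arrive when pasted from an email.
$pasted = "  Wicked \t Cool\r\n\r\n  PHP  ";

print '[' . remove_whitespace($pasted) . "]\n";  // prints "[Wicked Cool PHP]"
```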

#27: Importing Form Variables into an Array

One of the handiest tricks you can use in PHP is not actually a PHP trick but an HTML trick. When a user fills out a form, you’ll frequently check the values of several checkboxes. For example, let’s say you’re taking a survey to see what sorts of movies your site’s visitors like, and you’d like to automatically insert those values into a database called customer_preferences. The hard way to do that is to give each checkbox a separate name on the HTML form, as shown here:

<p>What movies do you like?</p>
<input type="checkbox" name="action"  value="yes"> Action
<input type="checkbox" name="drama"   value="yes"> Drama
<input type="checkbox" name="comedy"  value="yes"> Comedy
<input type="checkbox" name="romance" value="yes"> Romance

Unfortunately, when you process the form on the next page, you’ll need a series of if statements to check the data: one to check the value of $_POST['action'], one to check the value of $_POST['drama'], and so forth. Adding a new checkbox to the HTML form means adding yet another if statement to the processing page.

A great way to simplify this procedure is to store all of the checkbox values in a single array by adding [] after the name, like this:

<form action="process.php" method="post">
<p>What is your name?</p>
<p><input type="text" name="customer_name"></p>

<p>What movies do you like?</p>
<p>
   <input type="checkbox" name="movie_type[]" value="action">  Action
   <input type="checkbox" name="movie_type[]" value="drama">   Drama
   <input type="checkbox" name="movie_type[]" value="comedy">  Comedy
   <input type="checkbox" name="movie_type[]" value="romance"> Romance
</p>
<input type="submit">
</form>

When PHP gets the data from a form like this, it stores the checked values in a single array. You can loop through the array this way:

<?php
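// Fall back to safe defaults if a field is missing from the submission.
$movie_type = isset($_POST["movie_type"]) ? $_POST["movie_type"] : array();
$customer_name = isset($_POST["customer_name"]) ? strval($_POST["customer_name"]) : "";

if (is_array($movie_type)) {
   foreach ($movie_type as $key => $value) {
      // Escape user-supplied values before echoing them into HTML.
      print htmlspecialchars("$customer_name likes $value movies.") . "<br>";
   }
}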

?>
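
Because a user can submit any values in movie_type[], not just the four that the form offers, it makes sense to whitelist the array as well. The sketch below shows one possible approach; whitelist_choices() is a hypothetical helper, and the $_POST data is simulated so that the example runs on its own.

```php
<?php
// Hypothetical helper: keep only the submitted values that the form offered.
function whitelist_choices($submitted, array $allowed) {
    if (!is_array($submitted)) {
        return array();
    }
    // Discard anything that is not a plain string (nested arrays, for example).
    $strings = array_filter($submitted, 'is_string');
    // Keep known values only, then renumber the keys from zero.
    return array_values(array_intersect($strings, $allowed));
}

$allowed_types = array('action', 'drama', 'comedy', 'romance');

// Simulated submission containing one value the form never offered.
$_POST = array('movie_type' => array('drama', '<script>', 'comedy'));

$submitted = isset($_POST['movie_type']) ? $_POST['movie_type'] : null;
$movie_types = whitelist_choices($submitted, $allowed_types);

foreach ($movie_types as $type) {
    print "Likes $type movies.\n";
}
// prints "Likes drama movies." and then "Likes comedy movies."
```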

Not only does this technique work for checkboxes, but it’s extremely handy for processing arbitrary numbers of rows. For example, let’s say we have a shopping menu where we want to show all the items in a given category. Although we may not know how many items will be in a category, the customer should be able to enter a quantity into a text box for all items he wants to buy and add all of the items with a single click. The menu would look like Figure 4-1.

Figure 4-1: A form with an array of checkboxes

Let’s access product name and ID data in the product_info MySQL table described in the appendix to build the form as follows:

<?php
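/* Insert code for connecting to $db here. Note that the mysql_* functions
   used below were removed in PHP 7; on current PHP, use mysqli or PDO. */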

$category = "shoes";
/* Retrieve products from the database. */
$sql = "SELECT product_name, product_id FROM product_info
   WHERE category = '$category'";

$result = @mysql_query($sql, $db) or die;

/* Initialize variables. */
$order_form = ""; /* Will contain product form data */
$i = 1;

print '<form action="addtocart.php" method="post">';

while($row = mysql_fetch_array($result)) {
   // Loop through the results from the MySQL query.
   $product_name = stripslashes($row['product_name']);
   $product_id = $row['product_id'];

   // Add the row to the order form.
   print "<input type=\"hidden\" name=\"product_id[$i]\"
      value=\"$product_id\">";
   print "<input type=\"text\" name=\"quantity[$i]\"
      size=\"2\" value=\"0\"> $product_name<br />";

   // Move to the next row index.
   $i++;
}
print '<input type="submit" name="add" value="Add to Cart"></form>';

?>
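
The row-building step can also be separated from the database code, which makes it easier to escape each value before it reaches the page. The following sketch works from an array of rows that has already been fetched. The function name build_order_rows(), the sample products, and the assumption that product IDs are integers are all illustrative and do not come from the product_info table definition.

```php
<?php
// Hypothetical helper: build the order-form rows from already-fetched products.
function build_order_rows(array $products) {
    $html = '';
    $i = 1;
    foreach ($products as $product) {
        // Assumes integer IDs; intval() guarantees a safe attribute value.
        $id = intval($product['product_id']);
        // Escape the name so characters such as < and & cannot break the page.
        $name = htmlspecialchars($product['product_name'], ENT_QUOTES, 'UTF-8');

        $html .= "<input type=\"hidden\" name=\"product_id[$i]\" value=\"$id\">";
        $html .= "<input type=\"text\" name=\"quantity[$i]\" size=\"2\" value=\"0\">";
        $html .= " $name<br />\n";
        $i++;
    }
    return $html;
}

// Made-up sample rows standing in for the query results.
$products = array(
    array('product_id' => 101, 'product_name' => 'Trail Runner'),
    array('product_id' => 102, 'product_name' => 'Heels & Flats <new>'),
);

print build_order_rows($products);
```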

The processing script addtocart.php is as follows:

<?php

$product_id = $_POST["product_id"];
$quantity = $_POST["quantity"];

if (is_array($quantity)) {
   foreach ($quantity as $key => $item_qty) {
      $item_qty = intval($item_qty);
      if ($item_qty > 0) {
         $id = $product_id[$key];
         print "You added $item_qty of Product ID $id.<br>";
      }
   }
}

?>

As you can see, this script depends wholly on using the index from the $quantity array ($key) for the $product_id array.
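
Because the two arrays arrive separately, a tampered submission could include a quantity whose index has no matching product ID. The sketch below shows one way to guard against that while pairing the arrays; collect_cart_items() is a hypothetical helper, and it assumes that product IDs are positive integers.

```php
<?php
// Hypothetical helper: pair each positive quantity with its product ID,
// skipping any row that is incomplete or malformed.
function collect_cart_items($product_ids, $quantities) {
    $items = array();
    if (!is_array($product_ids) || !is_array($quantities)) {
        return $items;
    }
    foreach ($quantities as $key => $item_qty) {
        // Skip rows with no matching product ID or with non-scalar values.
        if (!isset($product_ids[$key])
            || !is_scalar($item_qty)
            || !is_scalar($product_ids[$key])) {
            continue;
        }
        $item_qty = intval($item_qty);
        $id = intval($product_ids[$key]);  // assumes integer product IDs
        if ($item_qty > 0 && $id > 0) {
            $items[$id] = $item_qty;
        }
    }
    return $items;
}

// Simulated submission: row 4 has a quantity but no product ID.
$product_ids = array(1 => '101', 2 => '102', 3 => '103');
$quantities  = array(1 => '2', 2 => '0', 3 => 'abc', 4 => '5');

foreach (collect_cart_items($product_ids, $quantities) as $id => $qty) {
    print "You added $qty of Product ID $id.\n";
}
// prints "You added 2 of Product ID 101."
```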
