So, you've got your web site ready: The layout is jazzed up (round corners, pastel reflections, obnoxiously large text, and the obligatory "Beta" in the logo), all the functionality is in place (configurable in 88 different ways, with several different APIs for others to use), and there's a telephone hotline to your local Nescafé plant for an uninterrupted supply of power.
Before you give it that big push, though, it's always good to review your code one more time. And by review here, I really don't mean test it. Testing exists in all sorts of guises and forms; it's even one of the voices of your conscience. What I'm referring to are the little, often-overlooked quirks and salient points that usually get left out of web site building for the sake of expediency. No doubt your web site will still work without them, but it won't be the best it can be.
In this article, I will go over a few of what I consider the most important sanity checks for your web site. It doesn't have to be an ASP.NET web site, but some of the code in here will assume that you have a working knowledge of ASP.NET. The topics I will cover are:
- Lower-case URLs and canonicalization
- The Request Referrer Check
- AJAX and SEO
- Inline ASP.NET
Lower-Case URLs and Canonicalization
Although Google may be the best thing since sliced bread and the Ctrl+Shift+B key combo for your Visual Studio IDE, it unfortunately isn't perfect. Google's crawler treats URLs that differ only in letter case as different addresses, even when your server returns the same page for all of them. It's a strange thing if you think about it.
If you're told to go to the Microsoft Headquarters in Redmond and upon arrival you see, in big bold letters, "MICROSOFT HEADQUARTERS," would you think that you've arrived at the wrong place? (Hint: no.) A crawler would. To it, every address on the Internet is a unique series of characters, and the path part of a URL is case-sensitive by specification, so /Products.aspx and /products.aspx count as two pages with identical content even though IIS happily serves the same file for both. And, because it's the world's largest search engine, you must work around this behavior by settling on one preferred form of each URL. This is known as canonicalization.
There are two ways to canonicalize your URLs:
- Have it lead a pious life and make it perform several miracles before it dies and then lodge a petition with the Holy See of the Vatican. This option is ruled out because it's called canonization, not canonicalization.
- Use ASP.NET.
It's easier to do this in ASP.NET than by the first option, or even than in other languages such as PHP, because it gives you the global.asax file and allows you to hook into the request processing pipeline. Perform a 301 redirect in the Application_BeginRequest event. Here is an example:
Dim currentURL As String = Request.RawUrl.ToLowerInvariant()
If currentURL <> Request.RawUrl Then
    ' Send a permanent redirect to the lower-case URL, then stop processing
    Response.Status = "301 Moved Permanently"
    Response.AddHeader("Location", currentURL)
    Response.End()
End If
You're simply comparing the requested URL with its lower-case counterpart and, if they're the same, you allow processing to continue; otherwise, you return a 301, telling the crawler that the lowercase version of the URL is the one you prefer. Remember, don't do a Response.Redirect because this returns a 302 Temporary Redirect to the crawler, which isn't the same as a 301.
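One caveat with lower-casing the whole raw URL is that it also lower-cases the query string, which can corrupt case-sensitive values such as tokens or encoded IDs. The following is a minimal sketch of one way around that, not part of the original technique: a hypothetical helper, LowerCasePath, that lower-cases only the part before the "?" and leaves the query string alone. The module name and the sample URL are assumptions made for illustration.

```vbnet
Imports System

Module CanonicalUrlSketch
    ' Hypothetical helper (not from the article): lower-cases only the path
    ' portion of a raw URL and leaves any query string untouched.
    Function LowerCasePath(rawUrl As String) As String
        Dim queryStart As Integer = rawUrl.IndexOf("?"c)
        If queryStart < 0 Then
            ' No query string, so the whole value is the path.
            Return rawUrl.ToLowerInvariant()
        End If
        Return rawUrl.Substring(0, queryStart).ToLowerInvariant() &
               rawUrl.Substring(queryStart)
    End Function

    Sub Main()
        ' Prints: /products/list.aspx?Token=AbC
        Console.WriteLine(LowerCasePath("/Products/List.aspx?Token=AbC"))
    End Sub
End Module
```

In Application_BeginRequest, you would compare LowerCasePath(Request.RawUrl) with Request.RawUrl instead of comparing against a fully lower-cased copy, and issue the 301 only when the two differ.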
In addition, there's also the difference between www.yourdomain.com, yourdomain.com, and www.yourdomain.com/default.aspx, which trips up several search engines. All three lead to the same page, but a search engine may index them as separate addresses.
Again, in global.asax's Application_BeginRequest event, you can perform a check and force your visitors to use either one, but not both.
Dim currentHost As String = Request.Url.Host
If currentHost.Contains(".com") Then
    ' Match "www." only at the start of the host name
    If Not currentHost.StartsWith("www.") Then
        Response.Status = "301 Moved Permanently"
        Response.AddHeader("Location", "http://www." & _
            currentHost & Request.RawUrl)
        Response.End()
    End If
End If
This particular code tells the crawler, "I want you to use this new URL from now on. So, don't use yourdomain.com; use www.yourdomain.com instead." And it should.
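The host check above hard-codes "http://", so a visitor arriving over HTTPS would be redirected to the plain HTTP address. As a rough sketch under that assumption, the target URL can be built from the request's own scheme instead. WithWwwHost is a hypothetical helper introduced here for illustration, and the sample domain is a placeholder.

```vbnet
Imports System

Module WwwHostSketch
    ' Hypothetical helper (not from the article): returns the "www." form of
    ' a URL while keeping the original scheme, port, path, and query string.
    Function WithWwwHost(requestUrl As Uri) As String
        If requestUrl.Host.StartsWith("www.", StringComparison.OrdinalIgnoreCase) Then
            ' Already on the preferred host; nothing to change.
            Return requestUrl.AbsoluteUri
        End If
        ' Authority is the host, plus the port when it is not the scheme default.
        Return requestUrl.Scheme & Uri.SchemeDelimiter & "www." &
               requestUrl.Authority & requestUrl.PathAndQuery
    End Function

    Sub Main()
        Console.WriteLine(WithWwwHost(New Uri("https://yourdomain.com/default.aspx?id=1")))
    End Sub
End Module
```

In Application_BeginRequest, the result of WithWwwHost(Request.Url) could be used as the Location header value whenever it differs from the URL that was requested.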