Wednesday, November 9, 2011

Complex Email Validation with Regular Expressions for Fun and Profit

a@b.com is a valid email address.

Well, not according to the regular expression pattern I've used to validate email addresses for the last several years. And with new domain name changes coming in 2013, I thought it was time to reexamine and revise my email validation best practices.

Here's my existing pattern, which works just fine about 90% of the time:
\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*

After reading an excellent (albeit ancient) blog post by Phil Haack (whose short and sweet pattern didn't accommodate a@b.com either), I found a good (and very legible) solution in a comment to said post by SeanG. What I like best about his solution is that he actually broke down the pattern into its component parts, instead of just slapping it down in onelongunintelligiblesinglestring. The commenting may seem a little excessive to some, but when it comes to regular expressions (which I don't tinker with very often), I prefer to be spoon fed.

The end result is a new static C# class based on that code.

One nice feature of this class is that not only can it be used for server-side validation, but it also exposes the pattern so it can be used for the ValidationExpression property on a RegularExpressionValidator control. And I've also handled the case where the email address is not required.


So now I've centralized the pattern in one place, and when those new domain names start showing up in 2013 I can make changes to accommodate them in a single place.


using System;
using System.Text.RegularExpressions;

namespace Web.Business.Validators
{
    public static class EmailValidator
    {
        #region Properties

        public static string RegexPattern
        {
            get
            {

                // <any CHAR excepting <">, "\" & CR, and including linear-white-space>
                string qtext = "[^\\x0d\\x22\\x5c\\x80-\\xff]";

                // <any CHAR excluding "[", "]", "\" & CR, & including linear-white-space>
                string dtext = "[^\\x0d\\x5b-\\x5d\\x80-\\xff]";
                // *<any CHAR except specials, SPACE and CTLs>
                string atom = "[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+";
                // "\" CHAR 
                string quoted_pair = "\\x5c[\\x00-\\x7f]";
                // <"> *(qtext/quoted-pair) <">
                string quoted_string = string.Format("\\x22({0}|{1})*\\x22", qtext, quoted_pair);
                // atom / quoted-string
                string word = string.Format("({0}|{1})", atom, quoted_string);
                // "[" *(dtext / quoted-pair) "]"
                string domain_literal = string.Format("\\x5b({0}|{1})*\\x5d", dtext, quoted_pair);

                // atom
                string domain_ref = atom; 
                // domain-ref / domain-literal
                string sub_domain = string.Format("({0}|{1})", domain_ref, domain_literal);
                // sub-domain *("." sub-domain)
                string domain = string.Format("{0}(\\x2e{0})*", sub_domain);
                // word *("." word) 
                string local_part = string.Format("{0}(\\x2e{0})*", word);
                // local-part "@" domain
                string addr_spec = string.Format("{0}\\x40{1}", local_part, domain);
                // add starting position and ending position
                string regexPattern = string.Format("^{0}$", addr_spec);

                return regexPattern;
            }
        }

        #endregion

        #region Public Methods

        public static bool IsValid(string emailAddress)
        {
            return IsValid(emailAddress, true);
        }

        /// <summary>
        /// RFC 822 compliant email address validation.
        /// See http://iamcal.com/publish/articles/php/parsing_email/ for an explanation.
        /// </summary>
        /// <param name="emailAddress">Email address to check.</param>
        /// <param name="isRequired">Is email address required?</param>
        /// <returns><c>false</c> if not valid email address, otherwise <c>true</c>.</returns>
        public static bool IsValid(string emailAddress, bool isRequired)
        {
            // A missing or blank address is valid only when the address is not required.
            // (Checking for null first avoids a NullReferenceException on Trim().)
            if (emailAddress == null || emailAddress.Trim().Length == 0)
            {
                return !isRequired;
            }

            return new Regex(RegexPattern).IsMatch(emailAddress);
        }

        #endregion
    }
}
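
To see how the pieces snap together, here is a minimal sketch that builds just the quoted-string part of the pattern from the same two building blocks and anchors it. The class and method names (QuotedStringDemo, BuildPattern) are made up for illustration and are not part of the class above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class: composes one sub-pattern the same way EmailValidator does.
public static class QuotedStringDemo
{
    public static string BuildPattern()
    {
        // any char except CR, double quote, backslash and non-ASCII
        string qtext = "[^\\x0d\\x22\\x5c\\x80-\\xff]";
        // a backslash followed by any ASCII char
        string quotedPair = "\\x5c[\\x00-\\x7f]";
        // <"> *(qtext / quoted-pair) <">, anchored so the whole input must match
        return string.Format("^\\x22({0}|{1})*\\x22$", qtext, quotedPair);
    }

    public static bool IsQuotedString(string candidate)
    {
        return Regex.IsMatch(candidate, BuildPattern());
    }
}
```

With this in place, QuotedStringDemo.IsQuotedString("\"Fred Bloggs\"") and IsQuotedString("\"Abc\\@def\"") should both come back true, while the unquoted "Fred Bloggs" should come back false. That is the same reason the quoted local parts in the unit test below are expected to pass.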


And just for additional yucks, here's a unit test with some pretty wacky examples of both valid and invalid email addresses. (Test cases come from above blog post and Wikipedia.)

/// <summary>
/// A test for IsValid
/// </summary>
[TestMethod()]
public void IsValidTest()
{
    // Test valid email addresses
    Assert.AreEqual(true, EmailValidator.IsValid(null, false));
    Assert.AreEqual(true, EmailValidator.IsValid(string.Empty, false));
    Assert.AreEqual(true, EmailValidator.IsValid("a@b.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("a@b.co"));
    Assert.AreEqual(true, EmailValidator.IsValid("a@b.c"));
    Assert.AreEqual(true, EmailValidator.IsValid("a.b.c'@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"""Abc\@def""@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"""Fred Bloggs""@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"""Joe\\Blow""@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"""Abc@def""@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("customer/department=shipping@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("$A12345@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("!def!xyz%abc@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("_somename@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("niceandsimple@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("a.little.unusual@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid("a.little.more.unusual@dept.example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"much.""more\ unusual""@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"very.unusual.""@"".unusual.com@example.com"));
    Assert.AreEqual(true, EmailValidator.IsValid(@"very.""(),:;<>[]"".VERY.""very\\\ \@\""very"".unusual@strange.example.com"));

    // Test invalid email addresses

    // character @ is missing
    Assert.AreEqual(false, EmailValidator.IsValid("Abc.example.com"));

    // only one @ is allowed outside quotation marks
    Assert.AreEqual(false, EmailValidator.IsValid("A@b@c@example.com"));

    // none of the characters before the @ in this example is allowed outside quotation marks
    Assert.AreEqual(false, EmailValidator.IsValid(@"""(),:;<>[\]@example.com"));

    // quoted strings must be dot separated or the only element making up the local-part
    Assert.AreEqual(false, EmailValidator.IsValid(@"just""not""right@example.com"));

    // spaces, quotes and backslashes may only exist when within quoted strings and preceded by a backslash
    Assert.AreEqual(false, EmailValidator.IsValid(@"this\ is\""really\""not\\allowed@example.com"));
}


Share & Enjoy!

Thursday, August 11, 2011

Open XML SDK and Excel Pivot Tables - Pulling Data

As you may already know, the new Office document formats are pretty much "ZIP" files with XML files inside. These XML documents are formatted according to the Open XML spec, and Microsoft has released version 2.0 of an SDK for working with them through strongly typed classes in .NET.

If you are interested in downloading the SDK it is here:
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=5124
The documentation for the SDK is here (though it is rather sparse):
http://msdn.microsoft.com/en-us/library/bb448854.aspx

Now, if all you want to do is read data out of particular cells, the best article I have found thus far explaining this is here:
There is also a useful MSDN article that has a method XLGetCellValue() that you can copy-and-paste into a class:

However, if you are working with pivot tables, reading cells is probably not what you want: you want the underlying data, and the presentation of that data in the cells may change if the user, say, reorders the spreadsheet.

First you need to get the PivotCacheDefinition from the WorkbookPart. A sample implementation of this might be...
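
A minimal sketch of that first step, assuming the Open XML SDK 2.0 assemblies are referenced and that the workbook's first pivot cache is the one you want. The class and method names (PivotCacheLocator, GetFirstPivotCacheDefinition) are hypothetical.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper: locates a pivot cache definition inside an open workbook.
public static class PivotCacheLocator
{
    // Returns the definition of the first pivot cache in the workbook,
    // or null when the workbook contains no pivot cache at all.
    public static PivotCacheDefinition GetFirstPivotCacheDefinition(SpreadsheetDocument document)
    {
        WorkbookPart workbookPart = document.WorkbookPart;
        PivotTableCacheDefinitionPart cachePart =
            workbookPart.PivotTableCacheDefinitionParts.FirstOrDefault();
        return cachePart == null ? null : cachePart.PivotCacheDefinition;
    }
}
```

You would typically call this inside a using block around SpreadsheetDocument.Open(path, false), where the second argument opens the file read-only. A workbook with several pivot tables can have several caches, so taking the first one is a simplification.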

The PivotCacheDefinition has an enumeration of CacheFields, which define each field in the dataset and also contain the values of a particular field in any given row. Since you will need to reference these often, I found it helpful to put them into a dictionary so that I could reference them by name and have the index handy.
I created a class to store the CacheElement and the index...


...and populated a dictionary from the PivotCacheDefinition:
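
A sketch of what the class and the dictionary might look like, assuming the Open XML SDK 2.0 and field names that are unique within the cache. CacheElement here is only a guess at the shape described above, and PivotFieldLookup and Build are made-up names.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Spreadsheet;

// Assumed shape: pairs a cache field with its position in the field list.
public sealed class CacheElement
{
    public CacheField Field { get; set; }
    public int Index { get; set; }
}

// Hypothetical helper that indexes the cache fields by name.
public static class PivotFieldLookup
{
    public static Dictionary<string, CacheElement> Build(PivotCacheDefinition definition)
    {
        var lookup = new Dictionary<string, CacheElement>();
        int index = 0;
        foreach (CacheField field in definition.CacheFields.Elements<CacheField>())
        {
            // The position of the field matters later: each record stores
            // its values in the same order as the fields are declared.
            lookup[field.Name.Value] = new CacheElement { Field = field, Index = index };
            index++;
        }
        return lookup;
    }
}
```

After this, something like lookup["Address"].Index gives the column position to read from each record, where "Address" stands in for whatever your field is actually called.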

To get at the rows themselves, you enumerate the PivotCacheRecords, which you reach through the PivotTableCacheDefinitionPart (by way of its PivotTableCacheRecordsPart). Be careful not to confuse this with the "PivotCacheDefinition" we used earlier. Here is some sample code that would use the dictionary above to pull the address value for each row:
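
A rough sketch of walking the records, assuming the Open XML SDK 2.0. It only returns the shared-item index stored for one field in each row, and leaves turning that index into a value as a separate step. PivotRecordWalker and GetSharedItemIndexes are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper: reads one "column" out of the pivot cache records.
public static class PivotRecordWalker
{
    // Yields, for every cached row, the shared-item index stored for the field
    // at fieldIndex, or null when the row stores the value inline instead.
    public static IEnumerable<uint?> GetSharedItemIndexes(PivotCacheRecords records, int fieldIndex)
    {
        foreach (PivotCacheRecord record in records.Elements<PivotCacheRecord>())
        {
            // Each record holds one child element per cache field, in field order.
            FieldItem item = record.ElementAt(fieldIndex) as FieldItem;
            yield return item == null ? (uint?)null : item.Val.Value;
        }
    }
}
```

The fieldIndex argument would come from the dictionary (the Index stored for the address field), and the records from the cache part's PivotTableCacheRecordsPart.PivotCacheRecords.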


This can all be greatly simplified by writing a method that accepts a CacheElement and a PivotCacheRecord and extracts the value for us. The result will always be a string (since this is XML), which we will have to convert into whatever type of value we want.
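
One way such a method could look, assuming the Open XML SDK 2.0. To keep the sketch self-contained it takes the CacheField and its index directly rather than a CacheElement; PivotValueReader and GetValue are made-up names.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper: resolves the string value of one field in one record.
public static class PivotValueReader
{
    public static string GetValue(CacheField field, int fieldIndex, PivotCacheRecord record)
    {
        OpenXmlElement item = record.ElementAt(fieldIndex);

        // An <x v="N"/> item is only a pointer to the N-th shared item of the field.
        FieldItem reference = item as FieldItem;
        if (reference != null)
        {
            item = field.SharedItems.ElementAt((int)reference.Val.Value);
        }

        return ReadValAttribute(item);
    }

    // Reads the "v" attribute, which carries the value on string, number and date items.
    private static string ReadValAttribute(OpenXmlElement item)
    {
        foreach (OpenXmlAttribute attribute in item.GetAttributes())
        {
            if (attribute.LocalName == "v")
            {
                return attribute.Value;
            }
        }
        return null; // items without a "v" attribute (such as missing values) have nothing to return
    }
}
```

The caller still has to convert the returned string, for example with int.Parse or DateTime.Parse, depending on what the field holds.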

Not all values are stored in the CacheField elements since some of them are strings that are stored in the row/field itself. For these you can use something like...
PivotCacheRecord.ElementAt(x).GetAttributes().First(a => a.LocalName == "v").Value;

I found it very helpful to look inside the XLSX file itself and dig around to make sure I was pulling the correct data and learn where everything is stored. Just use your favorite ZIP extractor and unzip the XLSX. Here is what I found:

(root)\xl\sharedStrings.xml – An index of strings
(root)\xl\worksheets\sheet1.xml – Sheet 1’s cells, which may reference strings in sharedStrings.xml
(root)\xl\pivotCache\pivotCacheRecords1.xml – Each record which indexes values in pivotCacheDefinition1.xml, unless the particular field is of special String type (then it has the value itself)
(root)\xl\pivotCache\pivotCacheDefinition1.xml – The columns in the pivot table data (not what is shown in the Excel file) and the actual values that are indexed

Friday, April 15, 2011

Forcing Line-Breaks in a Cross-Browser Manner

This works in IE versions 6-9 as well as Firefox and Chrome:

word-wrap:break-word;

word-break:break-all;



This will force line breaks in strings of really-really-really-really long text so that they do not overflow their container (e.g. a DIV).

Thursday, April 14, 2011

AJAX CDNs

Both Google and Microsoft have AJAX/JavaScript/jQuery CDNs.

Google works directly with the key stake holders for each library effort and accepts the latest stable versions as they are released. Once we host a release of a given library, we are committed to hosting that release indefinitely.
The Libraries API takes the pain out of developing mashups in JavaScript while using a collection of libraries. We make it easy for you to host the libraries, correctly set cache headers, and stay current with the most recent bug fixes.

By taking advantage of the Microsoft Ajax CDN, you can significantly improve the performance of your Ajax applications. The contents of the Microsoft Ajax CDN are cached on servers located around the world. In addition, the Microsoft Ajax CDN enables browsers to reuse cached JavaScript files for Web sites that are located in different domains.

Both Microsoft and Google’s CDNs support HTTPS as well as HTTP, so they can be included in both secure and non-secure pages. They also include both minified and full versions of the jQuery scripts.

Documentation on Microsoft’s AJAX CDN is here:
http://www.asp.net/ajaxlibrary/cdn.ashx
An example of loading jQuery from Microsoft’s CDN:
http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.5.2.min.js

Documentation on Google’s AJAX CDN is here:
http://code.google.com/apis/libraries/
An example of loading jQuery from Google’s CDN:
http://ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js
Google also has a “libraries API” so that JavaScript files can be loaded using code rather than inserting additional script tags:
http://code.google.com/apis/libraries/devguide.html

Thursday, January 6, 2011

Android "Hello World": a Tale of Woe

I consider myself a fairly patient man, but if there's one thing on this good green earth that should be stupid simple to do for a development platform, it's to find the development tools, install them, and be able to run a well documented "Hello World" example. I've done this any number of times in the past with everything from FORTRAN (on punch cards) to Objective-C, and it's generally been a positive experience.

And then there's Android.

I work every day with Visual Studio and the .NET Framework, and I've dabbled with iPhone development, so perhaps I'm spoiled...the stuff "just works" (at least for simple stuff). But my attempts to install the Android tools were an exercise in frustration.

The Android developer site provides lots of great, well organized information...so I knew that I had to install 3 different items to get up and running:
It started to get interesting when I had to figure out exactly which version of each of these I needed (since they all need to play nicely together), and found that the current version of Eclipse (Helios v3.6) is incompatible with the Android SDK, so I needed to install the previous version (Galileo v3.5). No biggie. Install a different version of Eclipse.

And I needed to install the ADT Plugin for Eclipse. OK, done.

And I need to configure Eclipse so it knows where the Android SDK lives. Gotcha. (C:\Program Files (x86)\Android\android-sdk-windows was my location, for those of you who may be playing along at home.)

Things got a little more interesting when I fired up the Android SDK and AVD Manager to create my first virtual device (i.e. the phone emulator). Ah, but which of the 5 different available versions of Android should I target? The latest and greatest is Gingerbread (v2.3), so I figured "Cool, I'll go with that!" I gave it the very clever device name of "gingerbread", and all seemed right with the world. But when I tried to start my shiny new virtual device, I got an error: "could not find virtual device named 'gingerbread'".

Grrrr.

A little digging, and I found that for some inexplicable reason, the virtual device had been created under a different machine user account, so the AVD Manager couldn't find the files it needed. Well, I can just drag and drop the files to my user directory.

No joy.

Well, I can just change the default location the AVD Manager uses when it creates an AVD. Dig, dig, dig. Ah, I can specify the location of the AVD files...by using a command line utility. Seriously? A command line utility?!? It's 2011...the year AFTER we make contact. WTH?

I find the syntax, and after much frustration (apparently it's case-sensitive, and the order of the switches must also matter...because the example I copied and pasted from the page didn't work) I get it to work. I also figure that maybe the latest and greatest Android version isn't quite ready for prime time, so I revert back to the previous version (Froyo v2.2). So I now have an AVD (called "froyo") created in the location I wanted using the following:


android create avd -t 4 -n froyo -p c:\users\wloescher\.android\avd\ --force

(And don't get me started about TargetID vs. Version Number...sheesh!)

But of course this AVD doesn't show up in the AVD Manager, so I can't just click a button to fire it up. So by this time I'm calling in some help from my co-worker (who actually owns an Android phone and has gotten all this stuff working just fine on his machine). He comes up with the following command line to actually launch the virtual device:


emulator -data c:\users\wloescher\.android\avd\userdata.img

A minute or two later, and the virtual device is booted up and I have a virtual Android phone screen. Great! Now all I have to do is write a little code, and I'm done...right?

Wrong. There seems to be a disconnect between Eclipse and my AVD. The code seems to run fine (i.e. no errors), but nothing happens. Nothing. Not an electronic sausage. Just this error in the console:

[2011-01-06 10:43:11 - HelloAndroid] ------------------------------
[2011-01-06 10:43:11 - HelloAndroid] Android Launch!
[2011-01-06 10:43:11 - HelloAndroid] adb is running normally.
[2011-01-06 10:43:11 - HelloAndroid] Performing com.example.helloandroid.HelloAndroid activity launch
[2011-01-06 10:43:11 - HelloAndroid] Automatic Target Mode: launching new emulator with compatible AVD 'froyo'
[2011-01-06 10:43:11 - HelloAndroid] Launching a new emulator with Virtual Device 'froyo'
[2011-01-06 10:43:11 - Emulator] emulator: ERROR: unknown virtual device name: 'froyo'
[2011-01-06 10:43:11 - Emulator] emulator: could not find virtual device named 'froyo'


So...what have we learned?

Well, first off I'd like to thank the fine folks at Microsoft and Apple for making development tools that have a low cost of entry. Secondly, I hate the command line. Thirdly, I picked up a few new (to me) command line tricks from my (very patient) coworker Michael Sneade. Fourthly, don't get involved in a land war in Asia. (Or was it don't bring a knife to a gunfight...I can never remember. Maybe it's both. Probably both, yeah.)

And fifthly and finally...if you want people to embrace and be excited about your development platform, don't frustrate the hell out of them right out of the gate. Goodbye (at least for now) to the world of Android...let's see how quickly I can get up and running on Windows Phone 7 and XNA Game Studio.

------------------------------------------------------------

UPDATE:
I was finally able to get the "Hello World" sample app to run in the emulator by adding a new Windows user environment variable:
1. Start Menu > Control Panel > System > Advanced System Settings (on the left) > Environment Variables
2. Add a new user variable (at the top):
     Variable name: ANDROID_SDK_HOME
     Variable value: C:\Users\WLoescher

A good friend reminded me about Google's App Inventor (still in beta), and I'll do a post about that experience soon...it's very cool!