Well, not according to the regular expression pattern I've used to validate email addresses for the last several years. And with new domain name changes coming in 2013, I thought it was time to reexamine and revise my email validation best practices.
Here's my existing pattern, which works just fine about 90% of the time:
\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
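To get a feel for where this pattern falls short, here is a minimal sketch that runs it against a single address. The class name LegacyPatternCheck and the ^...$ anchors are my own additions for illustration. I'm assuming a whole-string match is wanted, which is how I understand the RegularExpressionValidator control to behave.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the original code): runs the older
// pattern against a single address.
public static class LegacyPatternCheck
{
    // The existing pattern from above, wrapped in ^...$ so the whole
    // string has to match. The anchors are an assumption added here.
    public const string LegacyPattern =
        @"^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$";

    public static bool Matches(string emailAddress)
    {
        return Regex.IsMatch(emailAddress, LegacyPattern);
    }
}
```

With this in place, LegacyPatternCheck.Matches("niceandsimple@example.com") returns true, while "$A12345@example.com" and "\"Fred Bloggs\"@example.com" both come back false, because neither a leading $ nor a quoted string fits \w+.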
After reading an excellent (albeit ancient) blog post by Phil Haack (whose short and sweet pattern didn't accommodate a@b.com either), I found a good (and very legible) solution in a comment to said post by SeanG. What I like best about his solution is that he actually broke down the pattern into its component parts, instead of just slapping it down in onelongunintelligiblesinglestring. The commenting may seem a little excessive to some, but when it comes to regular expressions (which I don't tinker with very often), I prefer to be spoon-fed.
The end result is a new static C# class based on that code.
One nice feature of this class is that not only can it be used for server-side validation, but it also exposes the pattern so it can be used for the ValidationExpression property on a RegularExpressionValidator control. And I've also handled the case where the email address is not required.
So now I've centralized the pattern in one place, and when those new domain names start showing up in 2013 I can make changes to accommodate them in a single place.
using System;
using System.Text.RegularExpressions;
namespace Web.Business.Validators
{
public static class EmailValidator
{
#region Properties
public static string RegexPattern
{
get
{
// <any CHAR excepting <">, "\" & CR, and including linear-white-space>
string qtext = "[^\\x0d\\x22\\x5c\\x80-\\xff]";
// <any CHAR excluding "[", "]", "\" & CR, & including linear-white-space>
string dtext = "[^\\x0d\\x5b-\\x5d\\x80-\\xff]";
// *<any CHAR except specials, SPACE and CTLs>
string atom = "[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+";
// "\" CHAR
string quoted_pair = "\\x5c[\\x00-\\x7f]";
// <"> *(qtext/quoted-pair) <">
string quoted_string = string.Format("\\x22({0}|{1})*\\x22", qtext, quoted_pair);
// atom / quoted-string
string word = string.Format("({0}|{1})", atom, quoted_string);
// "[" *(dtext / quoted-pair) "]"
string domain_literal = string.Format("\\x5b({0}|{1})*\\x5d", dtext, quoted_pair);
// atom
string domain_ref = atom;
// domain-ref / domain-literal
string sub_domain = string.Format("({0}|{1})", domain_ref, domain_literal);
// sub-domain *("." sub-domain)
string domain = string.Format("{0}(\\x2e{0})*", sub_domain);
// word *("." word)
string local_part = string.Format("{0}(\\x2e{0})*", word);
// local-part "@" domain
string addr_spec = string.Format("{0}\\x40{1}", local_part, domain);
// add starting position and ending position
string regexPattern = string.Format("^{0}$", addr_spec);
return regexPattern;
}
}
#endregion
#region Public Methods
public static bool IsValid(string emailAddress)
{
return IsValid(emailAddress, true);
}
/// <summary>
/// RFC 822 compliant email address validation.
/// See http://iamcal.com/publish/articles/php/parsing_email/ for an explanation.
/// </summary>
/// <param name="emailAddress">Email address to check.</param>
/// <param name="isRequired">Is email address required?</param>
/// <returns><c>false</c> if not valid email address, otherwise <c>true</c>.</returns>
public static bool IsValid(string emailAddress, bool isRequired)
{
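// Treat null, empty and whitespace-only input as "no address supplied"
// (calling Trim() on a null string would throw a NullReferenceException)
if (string.IsNullOrEmpty(emailAddress) || emailAddress.Trim().Length == 0)
{
// Valid only when the email address is not required
return !isRequired;
}
return new Regex(RegexPattern).IsMatch(emailAddress);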
}
#endregion
}
}
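One thing worth noting about the class above: RegexPattern reassembles the string, and IsValid constructs a new Regex, on every call. Below is a minimal sketch of one way to build both only once. The name CachedEmailValidator and the stand-in pattern are hypothetical and only there to keep the example short; in practice the string would be the one produced by EmailValidator.RegexPattern. It also assumes .NET 4 or later for Lazy<T> and string.IsNullOrWhiteSpace.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant of the validator: the pattern string and the Regex
// are created once and reused, instead of being rebuilt on every call.
public static class CachedEmailValidator
{
    // Stand-in pattern for illustration only; it is NOT the RFC 822
    // pattern. Swap in the string built by EmailValidator.RegexPattern.
    private const string StandInPattern = @"^[^@\s]+@[^@\s]+$";

    // Lazy<T> defers construction until first use and is thread-safe
    // by default.
    private static readonly Lazy<Regex> Cached =
        new Lazy<Regex>(() => new Regex(StandInPattern, RegexOptions.Compiled));

    // Still exposed so it can be handed to a RegularExpressionValidator.
    public static string RegexPattern
    {
        get { return StandInPattern; }
    }

    public static bool IsValid(string emailAddress, bool isRequired)
    {
        // A missing address is valid only when it is not required.
        if (string.IsNullOrWhiteSpace(emailAddress))
        {
            return !isRequired;
        }
        return Cached.Value.IsMatch(emailAddress);
    }
}
```

Whether this is worth doing depends on how often validation runs. For the occasional form post the original approach is probably fine.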
And just for additional yucks, here's a unit test with some pretty wacky examples of both valid and invalid email addresses. (Test cases come from the blog post mentioned above and from Wikipedia.)
/// <summary>
/// A test for IsValid
/// </summary>
[TestMethod()]
public void IsValidTest()
{
ValidEmailAttribute target = new ValidEmailAttribute();
// Test valid email addresses
Assert.AreEqual(true, target.IsValid(null, false));
Assert.AreEqual(true, target.IsValid(string.Empty, false));
Assert.AreEqual(true, target.IsValid("a@b.com"));
Assert.AreEqual(true, target.IsValid("a@b.co"));
Assert.AreEqual(true, target.IsValid("a@b.c"));
Assert.AreEqual(true, target.IsValid("a.b.c'@example.com"));
Assert.AreEqual(true, target.IsValid(@"""Abc\@def""@example.com"));
Assert.AreEqual(true, target.IsValid(@"""Fred Bloggs""@example.com"));
Assert.AreEqual(true, target.IsValid(@"""Joe\\Blow""@example.com"));
Assert.AreEqual(true, target.IsValid(@"""Abc@def""@example.com"));
Assert.AreEqual(true, target.IsValid("customer/department=shipping@example.com"));
Assert.AreEqual(true, target.IsValid("$A12345@example.com"));
Assert.AreEqual(true, target.IsValid("!def!xyz%abc@example.com"));
Assert.AreEqual(true, target.IsValid("_somename@example.com"));
Assert.AreEqual(true, target.IsValid("niceandsimple@example.com"));
Assert.AreEqual(true, target.IsValid("a.little.unusual@example.com"));
Assert.AreEqual(true, target.IsValid("a.little.more.unusual@dept.example.com"));
Assert.AreEqual(true, target.IsValid(@"much.""more\ unusual""@example.com"));
Assert.AreEqual(true, target.IsValid(@"very.unusual.""@"".unusual.com@example.com"));
Assert.AreEqual(true, target.IsValid(@"very.""(),:;<>[]"".VERY.""very\\\ \@\""very"".unusual@strange.example.com"));
// character @ is missing
Assert.AreEqual(false, target.IsValid("Abc.example.com"));
// only one @ is allowed outside quotations marks
Assert.AreEqual(false, target.IsValid("A@b@c@example.com"));
// none of the characters before the @ in this example is allowed outside quotation marks
Assert.AreEqual(false, target.IsValid(@"""(),:;<>[\]@example.com"));
// quoted strings must be dot separated or the only element making up the local-part
Assert.AreEqual(false, target.IsValid(@"just""not""right@example.com"));
// spaces, quotes and slashes may only exist when within quoted strings and preceded by a slash
Assert.AreEqual(false, target.IsValid(@"this\ is\""really\""not\\allowed@example.com"));
}
Share & Enjoy!
i think this should be obvious, but i'm new to .net... how do you implement this with a custom validator? thanks!
1. Create a TextBox control on your page called "EmailTextBox".
2. Create a RegularExpressionValidator control on your page called "EmailRegularExpressionValidator".
3. Set the ControlToValidate property of EmailRegularExpressionValidator to "EmailTextBox".
4. In the Page_Load event of your code behind, set the ValidationExpression property of your RegularExpressionValidator control to the RegexPattern property of the EmailValidator class:
EmailRegularExpressionValidator.ValidationExpression = EmailValidator.RegexPattern;
5. Make sure you check the Page.IsValid property at the beginning of the Click event handler of your form submission button:
protected void SubmitImageButton_Click(object sender, EventArgs e)
{
// Check for page validation
Page.Validate();
if (!Page.IsValid)
{
throw new ApplicationException("Validation error on page.");
}
// Data entered for all controls appears to be valid!
}
I hope that helps...if not, let me know.
RFC 822 is not the standard any more. It hasn't been for a very long time. You want to look at RFC 5321.
Thanks Michael...will do, and then will post a code update. Cheers!
This is an excellent component for verifying email addresses:
http://www.kellermansoftware.com/p-37-net-email-validation.aspx
Thanks Asava - I'll check that out. Cheers!