Friday, October 24, 2008

Automatic Email-Address Control Verification with EAUT & OpenID

I recently read about an idea to accelerate OpenID adoption by turning RPs on to the notion of using OpenID as a means to automatically verify email addresses.

As one of the authors of the EAUT Specification (EAUT Draft 5), I have some ideas on how to make this work. This blog entry sums up my current thinking, so give it a read, and let me know if it could work, or if I'm just an idiot. ;)

Here goes....

Email Address Verification vs. Validation
Before diving in, it's important to clarify exactly what I'm talking about here. Please note that the definitions I'm presenting are completely made up by me, so it's possible you will have seen these terms and phrases used differently elsewhere. However, for this blog post, here's what they mean.
  • Email-address Verification merely ensures that the user[1] of a particular email address actually controls that email address.

  • Email Address Validation determines whether or not an email address works (i.e., does it properly receive email messages, and does the user respond). Validation can also provide insight into the "quality" of a particular email address, such as "Is this address controlled by a human or a robot?" and "Does a lot of spam come from this address?", etc.
To be clear, I'm mostly talking about email-address verification (of control), and not validation (though once an email address has been verified, some assumptions can be made about its validity -- more on that later).

Why is Verification Important?
Verification is important for a number of reasons. Most important, however, is that many public-facing websites (a form of Relying Party, or RP) use email to communicate with their users. These RPs know who each user is (based on the user's identifier), because that's how each user registered with the RP website. (I should be more accurate here: each RP knows, at least, that the user logging in to the RP controls the ID used to log in. Who the actual human doing the login is... well, that is a different question.)

At any rate, since many websites use email to communicate with their user base, an email address is almost universally collected at registration. At this stage of the game (directly after the user has supplied an email address), the RP has no way to know that the person controlling the login also controls the email address that was just entered into the RP's system. For example, a user named Beth might enter the email address 'gwb@whitehouse.gov', which she likely does not control. Most good RPs want to be sure they're sending emails to a person who wants to be receiving those email messages; otherwise it's spam.

Thus, RPs need to ensure with a high degree of certainty that the person who controls the user ID used to log in to the RP also controls the email address that was supplied.

The most common form of email verification today is to send an email message with a secret code, and ask the user to either click a link or enter the secret code directly to verify that the email address is indeed controlled by the user who entered it. It is important to note that this also tends to semi-validate the email address, in the sense that a successful verification tells the RP that the address in question can receive email messages (though it says nothing about the quality of the email address).
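
For reference, here is a minimal Python sketch of that "send me a secret code" approach. Everything in it is an assumption made for illustration: the in-memory `pending_codes` store and the function names `start_verification` and `confirm_verification` are hypothetical, and actually sending the email is left out.

```python
import hmac
import secrets

# Hypothetical in-memory store: email address -> pending secret code.
pending_codes = {}

def start_verification(email):
    """Generate a secret code for the address; a real RP would email it to the user."""
    code = secrets.token_urlsafe(16)
    pending_codes[email] = code
    return code  # returned here only so the sketch can be exercised

def confirm_verification(email, submitted_code):
    """Return True only if the submitted code matches the one issued for that address."""
    expected = pending_codes.get(email)
    if expected is None:
        return False
    # Constant-time comparison; bytes are used so any input string is accepted.
    if hmac.compare_digest(expected.encode(), submitted_code.encode()):
        del pending_codes[email]  # codes are single-use
        return True
    return False
```

The point of the sketch is the manual round trip: the user has to open their inbox and hand the code back before the RP learns anything, which is the step the rest of this post tries to automate.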

The Good Stuff: How to Create an Automatic Email-Address Control Verification Service

Ok, so how to verify control of an email address? Follow these simple steps.

  1. Start with OpenID.
    OpenID was created for a variety of reasons, but one important reason was to allow Relying Parties (i.e., websites that we log in to every day) to be assured that the person logging into a particular site with a particular OpenID actually controls that OpenID. With OpenID, an RP can always be assured that the person logged in controls a particular OpenID.

  2. Provide the RP with an Email Address.
    This part is not really defined for this article. There are lots of ways to do this, including "ask the user" for their email address, SREG, Attribute Exchange, and others like "make the email address a first-class OpenID". Additionally, there's the EAUT protocol, whereby a user can enter only their email address to log in to an RP. The email address is then captured by the RP, transformed into an OpenID URL, and the user logs in via OpenID per usual (click here for an example of this translation in action). With EAUT, at least, the RP automatically "knows" the email address of the user, because that's what was used to log in to the RP in the first place.

    That said, I'll mainly defer on these options for now, since the manner in which an RP gets a user's email address is tangential to verifying the email address, which is the subject of this blog entry.

  3. Utilize EAUT for Verification.
    The EAUT protocol specifies one method for mapping an email address to a URL. The idea was originally proposed as a way to utilize email addresses in the OpenID authentication process. The rationale for such an idea was (and still is) that it's much more intuitive for a user to enter their email address, as opposed to having the user enter a URL that may be hard to remember, or unfamiliar, etc.

    Ironically, EAUT is also applicable to many use-cases outside of plain-old-vanilla OpenID Authentication, as is the case here with Email-address Control Verification.

    It's key to note that EAUT offers only a one-way mapping, from email address to URL. So, one cannot take a URL and figure out what the email address is. This is a key privacy control.

    However, if an RP knows both a user's email address and a URL that the user controls (e.g., an OpenID), then EAUT can be used to verify whether or not that email address maps to the specified URL, and OpenID can be used to determine whether or not the two URLs are controlled by the same user (by checking to see if they match -- see more below).


    The basic corollary is this: If an email address maps to a URL, and I can show (via OpenID) that I control that URL, then either: 1.) I control the email address, or 2.) the person who controls that email address wants me to also control it, or 3.) the person who controls the email address accidentally mapped it to a URL that I control, thus allowing me to register the email address on an RP site (this last possibility is unlikely, especially if a mapping service is used).

    So, here's the workflow for automatic email address verification:

    1. User logs in to an RP via OpenID (using URLs in this example, but see the "assumptions" section below for XRIs). Note that EAUT could be used to complement OpenID, and in any case the end result is that the RP knows that it's dealing with a person who controls the OpenID in question.
    2. RP receives the user's email address ("how" this is done is implementation defined - with EAUT-based login, the RP will automatically know the user's email address).
    3. RP uses EAUT to transform the email address to an OpenID.
    4. If the EAUT-resulting OpenID matches the OpenID that the user logged in with (or another OpenID that the user has demonstrated control over via OID Auth), then the email address is verified to be under the control of the person who also controls the OpenID.
    5. Otherwise, follow the OpenID auth flow on the EAUT-resulting URL. If OpenID auth is successful, then the RP now knows that the user in question controls the OpenID URL resulting from EAUT, and the email address in question is thus assumed to be under this same user's control (otherwise, why would the email address map to a URL under this user's control?).
    6. If it cannot be demonstrated that the currently logged-in user controls the URL that the email address maps to, then fall back on the "send me a secret token" email methodology in wide use today.
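
Steps 3 through 6 of the workflow above can be sketched roughly as follows in Python. This is only an illustration under several assumptions of mine: the `templates` dictionary stands in for real EAUT discovery, the `{username}` placeholder is an assumed template syntax (check the EAUT draft for the actual rules), `authenticate` stands in for a complete OpenID auth flow, and `normalize` is a deliberately crude URL comparison rather than OpenID's own normalization rules.

```python
from urllib.parse import urlsplit

def eaut_transform(email, templates):
    """Map an email address to a URL via a per-domain template (hypothetical lookup).

    `templates` stands in for EAUT discovery: domain -> URL template
    containing an assumed "{username}" placeholder.
    """
    username, _, domain = email.rpartition("@")
    template = templates.get(domain.lower())
    if not username or template is None:
        return None
    return template.replace("{username}", username)

def normalize(url):
    """Crude normalization so trivially different spellings compare equal."""
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"))

def verify_email_control(email, controlled_openids, templates, authenticate):
    """Return "verified" or "fallback", following steps 3 to 6.

    controlled_openids: OpenIDs the user has already proven control over.
    authenticate: callable(url) -> bool, a stand-in for an OpenID auth flow.
    """
    mapped = eaut_transform(email, templates)          # step 3
    if mapped is None:
        return "fallback"                              # no mapping: step 6
    if any(normalize(mapped) == normalize(o) for o in controlled_openids):
        return "verified"                              # step 4
    if authenticate(mapped):
        return "verified"                              # step 5
    return "fallback"                                  # step 6
```

A result of "fallback" does not mean the address is rejected outright; it only means the RP has to use the secret-token email instead of the automatic path.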
Some Examples


Example 1: A Valid User Named Beth
  1. Assumption 1: Beth can send/receive messages with the email address 'beth@example.com'.
  2. Assumption 2: 'beth@example.com' maps (via EAUT) to the url 'http://beth.example.com'.
  3. Assumption 3: Beth can log in to an RP with the OpenID 'http://beth.example.com'.
  4. Outcome: Beth very likely controls the email address 'beth@example.com'.
Example 2: A Spammer/Attacker
  1. Assumption 1: BadGuyBob cannot send/receive messages with the email address 'steve@apple.com' (i.e., this email address is not under BadGuyBob's control).
  2. Assumption 2: 'steve@apple.com' maps (via EAUT) to the url 'http://steve.apple.com'.
  3. Assumption 3: BadGuyBob cannot log in to an RP using the OpenID URL 'http://steve.apple.com'.
  4. Assumption 4: BadGuyBob can log in to an RP with the OpenID 'http://badguybob.example.com'.
  5. Outcome: BadGuyBob logs in to the RP, but is unable to assert control over the email address 'steve@apple.com', so the RP does not allow him to enter this email address as his own on this RP.
Example 3: Should Never Happen
  1. Assumption 1: Dennis can send/receive messages with the email address 'dennis@example.com'.
  2. Assumption 2: 'dennis@example.com' maps (via EAUT) to the url 'http://dennis.example.com'.
  3. Assumption 3: Dennis does not control 'http://bill.microsoft.com'.
  4. Outcome: Dennis cannot even log in to an RP with Bill's OpenID URL, so he will never even get to an email verification step on the RP's site.
---------------------------------------------------------------------------------------------
Assumptions
  1. The DNS system is not compromised. If DNS is compromised, then the mechanism in this proposal is not guaranteed to be accurate. However, if DNS is compromised, then traditional email verification (i.e., send a message with a secret code and have the user tell the RP what that code is) could be unreliable as well. Thus, this proposal is no less safe than the current system (it's arguably safer), and is much easier on the end user (email verification is automatic, instead of the manual process it is today).

  2. XRI OpenIDs Are Not Discussed Here. The OpenID Authentication 2.0 spec allows both XRIs and URLs to be used as OpenIDs. This article does not address how to verify email address control when an XRI is used as an OpenID, but EAUT supports XRIs, so it would seem trivial to use this method with XRIs as well.
-------------------------------------------------------------------------------------
[1] Owner/User/Controller is a tricky term, too. Who owns 'beth@big-web-co.com'? Does Beth own it, since it's the email address she uses to communicate with? Does big-web-co own that email address, since it sits in their domain, and is ultimately controlled by them and their legal department? For purposes of this article, let's use the term "User", and take it to be Beth for the email address 'beth@big-web-co.com'. At some level, she both uses and controls this email address (at least until big-web-co tells her she can't).

2 comments:

  1. I also recently posted an entry on this subject, though yours is far more thorough and I like the idea of using the word "verification" to distinguish between this and actually proving a user can receive mail at the address.

    I find it interesting that your proposal requires the RP to ask for both an "OpenID Identifier" (which you imply must be an HTTP URL) and an email address, and you then prove that the email address maps to the OpenID identifier.

    The model I'm imagining collapses this all into a single step: you ask the user for the email address at the outset, and then use that as the OpenID Identifier. OpenID can actually be used with any URI scheme for which a discovery mechanism is defined -- it supports XRI as well as HTTP, for example -- so we can simply define the discovery process for email addresses as "use EAUT to get an HTTP URL and do HTTP discovery as per the OpenID 2.0 specification". After you do this, everything else "just works", and your "primary key" at the RP is your email address.

  2. Great point in paragraph two of your comment. I have a slight case of the flu, so after re-reading my blog post, I think I was less than clear on that point.

    In an ideal world, people would log in to their RP with EAUT, so the RP would have the email address in question (used to log in) and the RP could assume that the email address is verified, since it maps to an OpenID that the user controls.

    I think the underlying thing to take from my blog entry is that if a user logs in with a URL or XRI OpenID, or a username/password even, then EAUT and OpenID could be used to verify that user's email address(es).

    The more I think about it, I think our two proposals are saying and doing the same thing, except that you are also advocating that an email-address be treated as a "native" (or 1st-Class) OpenID, whereas I am advocating that email-addresses merely "map" to an OpenID.

    Regardless of how we answer that question, I think email-address verification could work the same for both of our proposals.

    (Now we just need to figure out if email-addresses should be 1st or 2nd-class OpenID citizens, but that's a different argument, I think).
