DKIM (DomainKeys Identified Mail) Explained
Want to see a fully working code example of the DKIM signature verification process? Click here.
Introduction
DKIM (DomainKeys Identified Mail) is a system that lets systems receiving email validate that the contents of a message have not been modified in transit or forged (the sender changed, or some other modification made). At a high level, the signer hashes parts of the email and signs that hash with the signing domain's private key. The verifier then recalculates the hash and checks it against the signature, using the public key published for the signing domain and selector named in the signature.
Why did I dig so deep into DKIM?
I was once asked to be an expert witness in a local court case where a bill was in dispute. The sender of the bill had been inconsistent about the price of work already done, and had sent several invoices for the work in question. The first bill was much lower than the others, but the person who sent the invoices denied having sent it. I was asked if I could find a way to prove who sent all of the invoices in dispute, as the plaintiff was saying he never sent the original invoices and only sent the most recent. The invoices were sent with an online bookkeeping system that sends emails on behalf of the user. I was able to see that the reply-to header in the email was the plaintiff’s business email, and that the reply-to header was signed via DKIM by two parties along the way. This was enough to get the plaintiff to drop the case!
I wrote up a report with the details and wrote an example program to reproduce the signature verification process so I would be able to speak to it from the position of an expert!
Anatomy of the DKIM Signature header
DKIM signatures are headers in an email that indicate that the underlying email body and the specified headers have not been modified since they were signed.
Here are some real DKIM signatures that I reviewed as an expert witness once.
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed;
d=notification.intuit.com;
h=content-type:from:mime-version:reply-to:subject:to; s=s1;
bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=XY8xuph5Sez9Vk2ZhyvMkyy96r3d3
c2ejRu2I5RL3DnrHA0cstcA1BWXQ6atnPXkBBg4ysBEityxFK/ERcDgUAsSeo96U
lJHwBT/ByRo9lVs3FZUoH/wkOQGqHlmzLQl5AX65/7b9ysQtgoHyMcPQzLPbUMLq
ekb9njkbdYogUM=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=sendgrid.info;
h=content-type:from:mime-version:reply-to:subject:to; s=smtpapi;
bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=j1XsB8i+baR1K5yHinQoFwldhBZ/j
WLjdMhXEZ8nzcvXQJJs3Ts1ZDPBj2LUzUUDdC9j/U2Pk8+4g4kG5LB85jujZv1dT
G1LhF6ojNIfsUqvuUsV7hHD9UmSEDQD0/eoEtYZB17uXNB4w71QyOfu9wpPNBZaa
sol4TWSqFSgoMM=
A single email message can contain multiple DKIM signatures from different
signers.
The DKIM header fields are a semicolon-separated list of tag=value pairs. These fields are defined in the RFC as follows:
v (REQUIRED): The version of DKIM being used. The RFC specifies that the value must be 1.
a (REQUIRED): The algorithm used for generating signatures. The RFC specifies rsa-sha1 and rsa-sha256 as possible values.
b (REQUIRED): The base64 encoded signature for the message.
bh (REQUIRED): The hash of the canonicalized body of the message. The number of bytes used to produce this hash can be limited by the l field.
c (OPTIONAL): The message canonicalization algorithm. This tells the verifier how the signer canonicalized the data of the message before signing. The format is <<header canonicalization>>/<<body canonicalization>>. If the field is absent the default is simple/simple. A single value like relaxed applies to the headers only, so it equates to relaxed/simple.
d (REQUIRED): The Signing Domain Identifier (SDID) of the entity that signed the email. This MUST be a valid DNS name under which the DKIM public key is published.
h (REQUIRED): The signed headers field. This is a colon separated list of header fields that were included in the data used to produce the email signature. This is an ordered list, and indicates the order in which the header fields were hashed. Headers specified in this list do not necessarily have to be present in the email to be used in the signature. According to RFC-6376 Page 21, nonexistent header fields are treated as null input to the signature algorithm.
i (OPTIONAL): The Agent or User Identifier (AUID) on whose behalf the SDID is signing the message. This value takes the form of an email address, user@domain.tld.
l (OPTIONAL): The body length count. This is the number of octets from the body that are used in producing the body hash. If this field is not present the entire message body is used.
q (OPTIONAL): A colon separated list of query methods used to retrieve the public key. The default value is dns/txt, and in practice I've never seen this specified.
s (REQUIRED): The selector for the public key. This is used to select the correct public key when performing signature verification. The selector combined with the value of the d field and the q field can be used to retrieve the public key for signature verification. The public key is typically stored in a txt record under the domain <<s>>._domainkey.<<d>>, where <<s>> and <<d>> are the values of the selector and SDID. The q header field specifies where the key can be retrieved, but the default is storing the public key data as a DNS txt record.
t (RECOMMENDED): The signature timestamp.
x (RECOMMENDED): The signature expiration.
z (OPTIONAL): The copied header fields. A | separated list of the selected header fields (specified in the h DKIM header field). The header and value for each are separated by a colon. Ex: z=From:foo@bar.com|To:my@email.com|Subject:demo=20subject;. The =20 in demo=20subject is an encoded space, using the DKIM quoted-printable encoding specified in the RFC.
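To make the tag list concrete, here is a minimal Python sketch that splits a DKIM-Signature header value into its tags. The function name `parse_dkim_tags` is my own for illustration, and the sketch assumes a well-formed header; it does not validate required tags or reject duplicates the way a real verifier should.

```python
def parse_dkim_tags(header_value: str) -> dict:
    """Split the value of a DKIM-Signature header into a {tag: value} dict."""
    tags = {}
    for part in header_value.split(";"):
        name, sep, value = part.partition("=")
        if not sep:
            continue  # skips the empty piece after a trailing semicolon
        name = name.strip()
        value = value.strip()
        # White space inside b=, bh= and h= is only header folding, so drop it.
        if name in ("b", "bh", "h"):
            value = "".join(value.split())
        tags[name] = value
    return tags


# The first signature shown above, folded across lines as it is in the email.
signature = (
    "v=1; a=rsa-sha1; c=relaxed/relaxed;\r\n"
    " d=notification.intuit.com;\r\n"
    " h=content-type:from:mime-version:reply-to:subject:to; s=s1;\r\n"
    " bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=XY8xuph5Sez9Vk2ZhyvMkyy96r3d3\r\n"
    " c2ejRu2I5RL3DnrHA0cstcA1BWXQ6atnPXkBBg4ysBEityxFK/ERcDgUAsSeo96U\r\n"
    " lJHwBT/ByRo9lVs3FZUoH/wkOQGqHlmzLQl5AX65/7b9ysQtgoHyMcPQzLPbUMLq\r\n"
    " ekb9njkbdYogUM="
)
tags = parse_dkim_tags(signature)
print(tags["d"], tags["s"], tags["h"].split(":"))
```

Splitting on the first `=` only matters here because base64 values such as bh= can themselves end in `=`.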
Anatomy of DKIM public key records
A fun tool for looking at DNS records is DNS Checker. You can use this to look at DNS records generally, and specifically the DKIM public key records.
As described above, the domain for the DKIM public key can be pieced together from the DKIM signature header of the email message. Let's take one of the DKIM signatures above:
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed;
d=notification.intuit.com;
h=content-type:from:mime-version:reply-to:subject:to; s=s1;
bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=XY8xuph5Sez9Vk2ZhyvMkyy96r3d3
c2ejRu2I5RL3DnrHA0cstcA1BWXQ6atnPXkBBg4ysBEityxFK/ERcDgUAsSeo96U
lJHwBT/ByRo9lVs3FZUoH/wkOQGqHlmzLQl5AX65/7b9ysQtgoHyMcPQzLPbUMLq
ekb9njkbdYogUM=
At the time of this writing, this URL should show you the DKIM public key for the DKIM signature header in the example above.
The domain used in that tool can be constructed as follows:
<<s>>._domainkey.<<d>> or in this case s1._domainkey.notification.intuit.com
The _domainkey piece is defined in RFC-6376 Section 3.6.2.1.
The txt record for this domain / selector is:
k=rsa; t=s; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA4aoZo1M5VtzeUJuFcWLOtJ0z1T97lEmNK9LSbL3T/2/2UxbG5ruLjiNxpEqDlgc/jd4uGc9BMfk6M7sRslRAb/blVx6VqgE6+wqZTg+4K4/kGtFbNQdJMwIfQ4cnhl5gGJTyAvqUQIz+TQW6s9RaauABu4pWSTAXEwvxVUMl9+M2qc+r79NXA3fZXsUzeVgy0X5s3 615DuW0zSFYkRlW7j0hp7Dnx+boFQu134PxIW0xeKN0SNwsgz+O6OF86x19JYui5Xek8pQItWNx61xhLiHiVV689C4xf8jHVmkmeThOd1VrZ9K3lUSRwvwCtCIRlcBKQBgVIvZoAC/HvKarEQIDAQAB
RFC-6376 Section 3.6.1 defines the fields in the public key record.
The public key fields are:
v (RECOMMENDED): The version of the DKIM key record. According to the RFC, if present this must be set to DKIM1.
h (OPTIONAL): Acceptable hash algorithms. A colon separated list of algorithms that might be used.
k (OPTIONAL): The key type. The default is rsa and the RFC says that any verifier must support rsa key types.
n (OPTIONAL): Human readable notes. These are not processed in any way.
p (REQUIRED): The base64 encoded public key.
s (OPTIONAL): A colon separated list of services to which this record applies. * matches all services and is the default; email is the other defined value.
t (OPTIONAL): Flags, as a colon separated list. There are two defined flags. y indicates that this domain is testing DKIM; according to the RFC, verifiers must not treat messages from a signer in testing mode differently from unsigned email, even if the signature fails to verify. s indicates that any DKIM signature header field using the i field must have the same domain after the @ symbol as its d field; a subdomain is not allowed.
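As a rough illustration of the record format, the Python sketch below builds the DNS name for a selector and domain and parses a key record into its fields. The function names and the returned dictionary keys are my own choices, and the p= value in the usage lines is a short placeholder rather than a real key. Fetching the txt record itself is left out, since that needs a DNS library.

```python
import base64


def key_record_name(selector: str, sdid: str) -> str:
    """Build the DNS name that holds the public key: <<s>>._domainkey.<<d>>."""
    return f"{selector}._domainkey.{sdid}"


def parse_key_record(txt: str) -> dict:
    """Parse a DKIM key record and apply the defaults for missing tags."""
    tags = {}
    for part in txt.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    flags = [flag.strip() for flag in tags.get("t", "").split(":") if flag.strip()]
    # Long keys are often split into several strings, so drop any white space.
    key_b64 = "".join(tags.get("p", "").split())
    return {
        "version": tags.get("v", "DKIM1"),
        "key_type": tags.get("k", "rsa"),
        "services": [s.strip() for s in tags.get("s", "*").split(":")],
        "testing": "y" in flags,
        "strict_identity": "s" in flags,
        "revoked": key_b64 == "",  # an empty p= means the key was revoked
        "public_key_der": base64.b64decode(key_b64),
    }


print(key_record_name("s1", "notification.intuit.com"))
# prints: s1._domainkey.notification.intuit.com

record = parse_key_record("k=rsa; t=s; p=aGVsbG8=")  # placeholder p=, not a real key
print(record["key_type"], record["strict_identity"], record["testing"])
```

The decoded p= bytes are a DER encoded public key; turning those bytes into an RSA modulus and exponent is a job for a crypto library and is not shown here.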
Canonicalization Algorithms
Details around the canonicalization algorithms used in DKIM are defined in RFC-6376 Section 3.4.
Different email systems can modify messages in transit, and that can break DKIM signatures. Header and body canonicalization is the method used to address this. There are two canonicalization algorithms defined in the RFC: simple and relaxed.
The canonicalization method is specified separately for the headers and the message body. The DKIM signature header field c value is in the format <<header canonicalization>>/<<body canonicalization>>. For example, c=relaxed/relaxed; indicates that both the headers and the body use the relaxed canonicalization algorithm when computing the message signature. If only one algorithm is named it applies to the headers and the body uses simple, so c=relaxed; is equivalent to c=relaxed/simple;. If the c field is missing entirely, the default is simple/simple.
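A small Python sketch of how a verifier might read the c value, following the defaults in RFC-6376 Section 3.5 (a missing field means simple/simple, and a single name covers the headers only). The function name is hypothetical.

```python
def parse_canonicalization(c_value=None):
    """Return (header_algorithm, body_algorithm) for a DKIM c= value."""
    if not c_value:
        return ("simple", "simple")  # default when c= is absent
    header_alg, _, body_alg = c_value.strip().partition("/")
    # A single name applies to the headers; the body falls back to simple.
    return (header_alg, body_alg or "simple")


print(parse_canonicalization("relaxed/relaxed"))  # ('relaxed', 'relaxed')
print(parse_canonicalization("relaxed"))          # ('relaxed', 'simple')
print(parse_canonicalization(None))               # ('simple', 'simple')
```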
- simple-header algorithm
  - Reads the headers as is. No changes are made to the headers. Any later change to a signed header, including non-material changes like header field case, will break the DKIM signature.
- simple-body algorithm
  - Only ensures that the message ends with a single CRLF sequence. This means removing any empty lines at the end of the body if present. If there is no body then a single CRLF sequence is used as the body for producing a hash.
- relaxed-header algorithm
  - Includes the following steps:
    - Make all header field names lower case. Ex. DKIM-Signature becomes dkim-signature.
    - Unfold all continuation lines: remove every CRLF inside a header value, keeping the white space that follows it. This process is called unfolding and is defined in RFC-5322 Section 2.2.3. The CRLF sequence at the end of the header value MUST be preserved.
    - Convert all instances of multiple white space characters to a single space character.
    - Delete all white space characters at the end of unfolded header field values.
    - Delete any white space characters before and after the colon separating the header field name and value. You MUST leave the colon in place.
- relaxed-body algorithm
  - Includes the following steps:
    - Remove all white space at the end of lines. The CRLF sequence at the end of the line must be retained.
    - Reduce all runs of white space within a line to a single space character.
    - Remove all empty lines at the end of the message body. If the body is not empty and does not end with a CRLF, add one, so that a non-empty body ends with exactly one CRLF sequence.
Empty line definition
“An empty line is a line of zero length after removal of the line terminator” per RFC-6376 Section 3.4.3
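With the empty line definition in hand, the two body algorithms can be sketched in a few lines of Python. The function names are my own, and the sketch assumes the body already uses CRLF line endings, as it would on the wire.

```python
import re


def canonicalize_body_simple(body: bytes) -> bytes:
    """simple-body: drop trailing empty lines, end with exactly one CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"  # an empty body becomes a single CRLF


def canonicalize_body_relaxed(body: bytes) -> bytes:
    """relaxed-body: tidy white space per line, then drop trailing empty lines."""
    cleaned = []
    for line in body.split(b"\r\n"):
        line = re.sub(rb"[ \t]+", b" ", line)  # collapse runs of white space
        cleaned.append(line.rstrip(b" "))      # drop white space at end of line
    while cleaned and cleaned[-1] == b"":
        cleaned.pop()                          # drop empty lines at the end
    if not cleaned:
        return b""                             # an empty body stays empty
    return b"\r\n".join(cleaned) + b"\r\n"


body = b" C \r\nD \t E\r\n\r\n\r\n"
print(canonicalize_body_simple(body))   # b' C \r\nD \t E\r\n'
print(canonicalize_body_relaxed(body))  # b' C\r\nD E\r\n'
```

Note the one asymmetry between the two: with no body at all, simple hashes a single CRLF while relaxed hashes zero bytes.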
Folding White Space
A folding white space is defined in RFC-5322 Section 3.2.2 as a CRLF sequence followed by white space.
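Unfolding is the heart of the relaxed-header algorithm, so here is a minimal Python sketch of it. The function name is hypothetical, and the input is assumed to be one raw header field, including any folded continuation lines, in the form "Name: value".

```python
import re


def canonicalize_header_relaxed(raw: str) -> str:
    """Apply relaxed-header canonicalization to one raw header field."""
    name, _, value = raw.partition(":")
    name = name.strip().lower()            # lower case name, no space before colon
    value = value.replace("\r\n", "")      # unfold: drop CRLFs, keep the white space
    value = re.sub(r"[ \t]+", " ", value)  # collapse runs of white space
    value = value.strip(" ")               # trim after the colon and at the end
    return f"{name}:{value}\r\n"           # exactly one CRLF ends the field


print(repr(canonicalize_header_relaxed("A: X \r\n")))
# 'a:X\r\n'
print(repr(canonicalize_header_relaxed("B : Y\t\r\n\tZ  \r\n")))
# 'b:Y Z\r\n'
```

The simple-header algorithm needs no code at all: the raw header is used exactly as it appears in the message.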
Computing Message Hashes
This algorithm is defined in RFC-6376 Section 3.7. Both signers and verifiers perform this algorithm the same way. Canonicalization is performed in preparation for generating the message hashes and in no way modifies the underlying email message.
Signers and verifiers MUST compute two hashes:
- One over the body of the message.
- One over the selected header fields of the message.
In step 1, the body of the message is run through the specified body canonicalization algorithm, truncated to the length given in the DKIM-Signature l= field if the signer specified one, and then hashed.
That hash value is converted to Base64 and either stored in the bh= DKIM-Signature field (by the signer) or compared to it (by the verifier).
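Step 1 is short enough to sketch directly in Python. The function name and parameters are my own, and the body passed in is assumed to be canonicalized already. Working on bytes rather than text means the truncation counts octets, which is what the l= field expects.

```python
import base64
import hashlib


def body_hash(canonical_body: bytes, algorithm: str = "rsa-sha256", length=None) -> str:
    """Compute the base64 value for bh= from an already canonicalized body."""
    hash_name = algorithm.split("-", 1)[1]  # "rsa-sha256" -> "sha256"
    if length is not None:
        canonical_body = canonical_body[:length]  # l= counts octets
    digest = hashlib.new(hash_name, canonical_body).digest()
    return base64.b64encode(digest).decode("ascii")


# An empty body under relaxed canonicalization hashes zero bytes.
print(body_hash(b"", "rsa-sha256"))
# 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
```

A verifier would compare the returned string with the bh= value from the signature and stop if they differ.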
Reminder on the DKIM-Signature l= field
The l= DKIM-Signature field is a number of octets, not characters. This is important to remember for proper hashing. See RFC-6376 Section 4 for details.
In step 2, the following data is hashed in the order listed, using the hash algorithm specified in the DKIM-Signature a= field.
- The headers listed in the DKIM-Signature h= field, canonicalized with the specified header canonicalization algorithm, each terminated by a CRLF sequence.
- The DKIM-Signature header itself with the b= field value set to an empty string, canonicalized with the same algorithm and without a trailing CRLF.
Note on the DKIM-Signature h= field
A DKIM-Signature header must not be listed in its own h= field. Other DKIM-Signature headers can be included in a DKIM signature.
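Step 2 can be sketched in Python as a function that assembles the exact bytes to be hashed. This is a sketch under assumptions: it handles relaxed header canonicalization only, all function names are hypothetical, and headers are passed in as raw "Name: value" strings in message order.

```python
import re


def relaxed_header(raw: str) -> str:
    """Relaxed canonicalization of one raw header field."""
    name, _, value = raw.partition(":")
    value = re.sub(r"[ \t]+", " ", value.replace("\r\n", "")).strip(" ")
    return f"{name.strip().lower()}:{value}\r\n"


def dkim_header_for_hashing(raw_dkim_header: str) -> str:
    """Empty the b= value, canonicalize, and drop the trailing CRLF."""
    name, _, value = raw_dkim_header.partition(":")
    # Only a tag named exactly "b" matches, so bh= is left untouched.
    value = re.sub(r"(^|;)(\s*b\s*=)[^;]*", r"\1\2", value)
    return relaxed_header(f"{name}:{value}")[:-2]


def header_hash_input(headers, signed_names, raw_dkim_header) -> bytes:
    """Concatenate the signed headers in h= order, then the DKIM-Signature."""
    remaining = list(headers)
    chunks = []
    for wanted in signed_names:
        # Search from the bottom of the message so repeats are used last to first.
        for index in range(len(remaining) - 1, -1, -1):
            field_name = remaining[index].partition(":")[0].strip().lower()
            if field_name == wanted.strip().lower():
                chunks.append(relaxed_header(remaining.pop(index)))
                break
        # A name with no instance left adds nothing (null input).
    chunks.append(dkim_header_for_hashing(raw_dkim_header))
    return "".join(chunks).encode("utf-8")


headers = ["From: a@example.com\r\n", "Subject:  hi \r\n"]
dkim = "DKIM-Signature: v=1; h=from:subject;\r\n bh=abc=; b=XYZ\r\n 123=\r\n"
print(header_hash_input(headers, ["from", "subject"], dkim))
```

The addresses and tag values in the usage lines are made up for the demonstration. The resulting bytes are what gets hashed with the a= algorithm and then signed or verified.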
Computing verification steps: RFC-6376 Section 6.1.3
Signer Algorithm
This algorithm is defined in RFC-6376 section 5.
- Select the private key corresponding to the public key for the chosen selector of the signature.
- Canonicalize the message body with the appropriate canonicalization algorithm.
- Truncate the canonicalized body to the length that will be stored in the DKIM signature l= field.
- Calculate the body hash for the bh= field of the DKIM signature.
- Build the list of headers used for signing the message for the h= field.
- Canonicalize the headers with the appropriate canonicalization algorithm.
  - This includes the DKIM signature being constructed as part of the signing process.
- Calculate the message hash, sign it with your private key, and put the signed hash into the DKIM signature b= field.
- Insert the DKIM header into the header section of the message.
  - There is some detail on inserting the DKIM signature in RFC 6376 5.6.
- You are done!
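The signer steps up to the point of the RSA operation can be sketched in Python as below. The function name and parameters are hypothetical, the body is assumed to be canonicalized already, and the actual private key signature is deliberately left out since it needs a crypto library. What the sketch produces is the DKIM-Signature header with an empty b= field, which is the form that gets hashed along with the selected headers.

```python
import base64
import hashlib


def build_unsigned_dkim_header(domain, selector, signed_headers, canonical_body,
                               algorithm="rsa-sha256",
                               canonicalization="relaxed/relaxed"):
    """Build a DKIM-Signature header whose b= field is still empty."""
    hash_name = algorithm.split("-", 1)[1]  # "rsa-sha256" -> "sha256"
    digest = hashlib.new(hash_name, canonical_body).digest()
    tags = [
        ("v", "1"),
        ("a", algorithm),
        ("c", canonicalization),
        ("d", domain),
        ("s", selector),
        ("h", ":".join(signed_headers)),
        ("bh", base64.b64encode(digest).decode("ascii")),
        ("b", ""),  # filled in after the header hash has been signed
    ]
    return "DKIM-Signature: " + "; ".join(f"{name}={value}" for name, value in tags)


header = build_unsigned_dkim_header(
    "example.com", "selector1", ["from", "to", "subject"], b"Hello\r\n"
)
print(header)
```

The domain and selector in the usage lines are placeholders. Once the signature is computed over the canonicalized headers plus this header, its base64 form is appended after b= and the finished header is inserted into the message.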
Verifier Algorithm
This algorithm is defined in RFC-6376 section 6.
- Extract the DKIM signatures from the message. The RFC says it is at the discretion of the implementor how many signatures they verify if multiple are present.
  - Each subsequent step applies to as many DKIM signatures as you decide to validate.
- Validate the DKIM signature header itself, including the items listed in RFC 6376 6.1.1.
- Retrieve the public key (domain key) for the signature.
- Canonicalize the body data using the appropriate canonicalization algorithm and truncate it to the length specified by the l= field.
- Compute the body hash using the algorithm specified by the a= field.
- Compare it to the bh= field. If they match, the process continues.
- Pull and canonicalize the headers listed in the h= field using the specified canonicalization algorithm.
- Canonicalize the DKIM header being validated with the contents of its b= field emptied (i.e. b=).
  - NOTE: this includes the body hash field bh=, and since we validated that first we can be sure the whole message is being validated.
- Calculate the hash of the following data, all concatenated together into one array: the canonicalized header data, then the canonicalized DKIM signature being validated.
- Verify the signature decoded from the b= field against your computed hash, using the public key.
- If the signature matches, the message is validated! Otherwise the validation failed…
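The final step is an RSA signature check, which for rsa-sha1 and rsa-sha256 uses PKCS #1 v1.5 padding. The Python sketch below shows that check using only integer math from the standard library, so the moving parts are visible. It assumes you already have the RSA modulus `n` and exponent `e` as integers; extracting them from the DER bytes in the p= field is not shown, and in real code a vetted crypto library should do both jobs. All names here are my own.

```python
import hashlib
import hmac

# ASN.1 DigestInfo prefixes for PKCS #1 v1.5 signatures (RFC 8017 Section 9.2).
DIGEST_INFO_PREFIX = {
    "sha1": bytes.fromhex("3021300906052b0e03021a05000414"),
    "sha256": bytes.fromhex("3031300d060960864801650304020105000420"),
}


def emsa_pkcs1_v15(data: bytes, hash_name: str, em_len: int) -> bytes:
    """Hash the data and wrap the digest in PKCS #1 v1.5 signature padding."""
    digest_info = DIGEST_INFO_PREFIX[hash_name] + hashlib.new(hash_name, data).digest()
    padding = b"\xff" * (em_len - len(digest_info) - 3)
    return b"\x00\x01" + padding + b"\x00" + digest_info


def rsa_verify(n: int, e: int, signature: bytes, data: bytes,
               hash_name: str = "sha256") -> bool:
    """Return True if signature is a valid PKCS #1 v1.5 signature over data."""
    key_len = (n.bit_length() + 7) // 8
    if len(signature) != key_len:
        return False
    sig_int = int.from_bytes(signature, "big")
    if sig_int >= n:
        return False
    # "Decrypt" the signature with the public key, then compare encodings.
    recovered = pow(sig_int, e, n).to_bytes(key_len, "big")
    return hmac.compare_digest(recovered, emsa_pkcs1_v15(data, hash_name, key_len))
```

In the DKIM flow, `data` is the concatenated canonicalized headers plus the DKIM-Signature header with b= emptied, `signature` is the base64 decoded b= value, and `hash_name` comes from the a= field.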
How Multiple Header Occurrences are Handled
This is described in detail in RFC 6376 5.4.2.
In short, if a signed header appears multiple times in the message, each occurrence of that header name in the signed headers list selects one instance, working from the last instance in the message headers to the first.
For example:
GIVEN: h=meta,other headers...
If there are three meta headers
meta: 1
meta: 2
meta: 3
Then the instance meta: 3 would be selected. If the meta header was in the h= list twice, then meta: 3 and then meta: 2 would be included, in that order.
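The selection rule from the example can be sketched in Python like this. The function name is hypothetical, and headers are modelled as (name, value) pairs in the order they appear in the message.

```python
def select_signed_headers(headers, signed_names):
    """Pick header instances for the names in h=, last occurrence first."""
    remaining = list(headers)
    selected = []
    for wanted in signed_names:
        # Walk upward from the bottom of the header block.
        for index in range(len(remaining) - 1, -1, -1):
            if remaining[index][0].lower() == wanted.lower():
                selected.append(remaining.pop(index))
                break
        # If no instance is left, the name contributes nothing.
    return selected


headers = [("meta", "1"), ("meta", "2"), ("meta", "3")]
print(select_signed_headers(headers, ["meta"]))
# [('meta', '3')]
print(select_signed_headers(headers, ["meta", "meta"]))
# [('meta', '3'), ('meta', '2')]
```

Listing a name more times than it appears is allowed: the extra entries select nothing, which matches the null input behaviour for nonexistent headers.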
Example
I have a working message verification app here. It would be trivial to implement signing, and one day I just might 😉