Calvin's Blog

DKIM (DomainKeys Identified Mail) Explained

∙ rfc∙ email∙ dkim

Want to see a fully working code example of the DKIM signature verification process? Click here.

Introduction

DKIM (DomainKeys Identified Mail) is a system that lets mail receivers validate that an email has not been modified in transit or forged (for example, by changing the sender). At a high level, the signer hashes parts of the email and signs that hash with the domain's private key. The verifier then recomputes the hash and checks it against the signature using the domain's public key, which is published in DNS under the selector named in the signature.

Why did I dig so deep into DKIM?

I was once asked to be an expert witness in a local court case where a bill was in dispute. The sender of the bill had been inconsistent about the price of work already done, and had sent several invoices for the work in question. The first bill was much lower than the others, but the person who sent the invoices denied sending it. I was asked whether I could find a way to prove who sent all of the disputed invoices, as the plaintiff was saying he never sent the original invoices and only sent the most recent one. The invoices were sent with an online bookkeeping system that sends emails on behalf of the user. I was able to see that the reply-to header in the email was the plaintiff’s business email, and that the reply-to header was signed via DKIM by two parties along the way. This was enough to get the plaintiff to drop the case!

I wrote up a report with the details and wrote an example program to reproduce the signature verification process so I would be able to speak to it from the position of an expert!

Anatomy of the DKIM Signature header

DKIM signatures are headers in an email that indicate that the email body and the specified headers have not been modified since they were signed by the signer.

Here are some real DKIM signatures that I reviewed as an expert witness once.

DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; 
	d=notification.intuit.com; 
	h=content-type:from:mime-version:reply-to:subject:to; s=s1; 
	bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=XY8xuph5Sez9Vk2ZhyvMkyy96r3d3
	c2ejRu2I5RL3DnrHA0cstcA1BWXQ6atnPXkBBg4ysBEityxFK/ERcDgUAsSeo96U
	lJHwBT/ByRo9lVs3FZUoH/wkOQGqHlmzLQl5AX65/7b9ysQtgoHyMcPQzLPbUMLq
	ekb9njkbdYogUM=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=sendgrid.info; 
	h=content-type:from:mime-version:reply-to:subject:to; s=smtpapi; 
	bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=j1XsB8i+baR1K5yHinQoFwldhBZ/j
	WLjdMhXEZ8nzcvXQJJs3Ts1ZDPBj2LUzUUDdC9j/U2Pk8+4g4kG5LB85jujZv1dT
	G1LhF6ojNIfsUqvuUsV7hHD9UmSEDQD0/eoEtYZB17uXNB4w71QyOfu9wpPNBZaa
	sol4TWSqFSgoMM=

A single email message can contain multiple DKIM signatures from different signers.

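The DKIM-Signature header value is a semicolon-separated list of tag=value pairs. These tags are defined in RFC-6376 Section 3.5. The ones that appear in the signatures above are v (version), a (signing algorithm), c (canonicalization), d (signing domain), h (signed header fields), s (selector), bh (body hash), and b (signature).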
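To make the tag=value layout concrete, here is a minimal Python sketch that splits the first signature above into a dictionary. The function name parse_tag_list is made up for this illustration, and the parsing is a simplified reading of the RFC syntax, not a full validator.

```python
def parse_tag_list(value: str) -> dict:
    """Split a DKIM tag=value list into a dict (simplified, no validation)."""
    tags = {}
    for item in value.split(";"):
        if not item.strip():
            continue  # tolerate an optional trailing semicolon
        # Split on the first "=" only: Base64 values may end in "=" padding.
        name, _, tag_value = item.partition("=")
        name, tag_value = name.strip(), tag_value.strip()
        if name in ("b", "bh", "h"):
            # Line folding inside these values is not significant.
            tag_value = "".join(tag_value.split())
        tags[name] = tag_value
    return tags


# The value of the first DKIM-Signature header shown above.
header_value = (
    "v=1; a=rsa-sha1; c=relaxed/relaxed; \r\n"
    "\td=notification.intuit.com; \r\n"
    "\th=content-type:from:mime-version:reply-to:subject:to; s=s1; \r\n"
    "\tbh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=XY8xuph5Sez9Vk2ZhyvMkyy96r3d3\r\n"
    "\tc2ejRu2I5RL3DnrHA0cstcA1BWXQ6atnPXkBBg4ysBEityxFK/ERcDgUAsSeo96U\r\n"
    "\tlJHwBT/ByRo9lVs3FZUoH/wkOQGqHlmzLQl5AX65/7b9ysQtgoHyMcPQzLPbUMLq\r\n"
    "\tekb9njkbdYogUM="
)

tags = parse_tag_list(header_value)
print(tags["d"], tags["s"], tags["h"].split(":"))
```

Splitting on the first "=" only matters here because both bh= and b= hold Base64 data that can itself end in "=".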

Anatomy of DKIM public key records

A fun tool for looking at DNS records is DNS Checker. You can use it to look at DNS records generally, and at DKIM public key records specifically.

As described above, the domain for the DKIM public key can be pieced together from the DKIM signature header of the email message. Let's take one of the DKIM signatures above:

DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; 
	d=notification.intuit.com; 
	h=content-type:from:mime-version:reply-to:subject:to; s=s1; 
	bh=1QvSCCr9Q9mtUf18zNMHrCPce9g=; b=XY8xuph5Sez9Vk2ZhyvMkyy96r3d3
	c2ejRu2I5RL3DnrHA0cstcA1BWXQ6atnPXkBBg4ysBEityxFK/ERcDgUAsSeo96U
	lJHwBT/ByRo9lVs3FZUoH/wkOQGqHlmzLQl5AX65/7b9ysQtgoHyMcPQzLPbUMLq
	ekb9njkbdYogUM=

At the time of this writing, this URL should show you the DKIM public key for the DKIM signature header in the example above.

https://dnschecker.org/all-dns-records-of-domain.php?query=s1._domainkey.notification.intuit.com&rtype=TXT&dns=google

The domain used in that tool can be constructed as follows:

<<s>>._domainkey.<<d>> or in this case s1._domainkey.notification.intuit.com

The _domainkey piece is defined in RFC-6376 Section 3.6.2.1.
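As a small illustration, the name to query can be built from the s= and d= values like this. The helper name dkim_query_name is hypothetical, and the actual DNS TXT lookup is left out because it needs a resolver library.

```python
def dkim_query_name(selector: str, domain: str) -> str:
    """Build the DNS name that holds the DKIM public key TXT record."""
    return f"{selector}._domainkey.{domain}"


# Values taken from the s= and d= tags of the two signatures shown earlier.
print(dkim_query_name("s1", "notification.intuit.com"))
print(dkim_query_name("smtpapi", "sendgrid.info"))
```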

The TXT record for this domain / selector is:

k=rsa; t=s; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA4aoZo1M5VtzeUJuFcWLOtJ0z1T97lEmNK9LSbL3T/2/2UxbG5ruLjiNxpEqDlgc/jd4uGc9BMfk6M7sRslRAb/blVx6VqgE6+wqZTg+4K4/kGtFbNQdJMwIfQ4cnhl5gGJTyAvqUQIz+TQW6s9RaauABu4pWSTAXEwvxVUMl9+M2qc+r79NXA3fZXsUzeVgy0X5s3 615DuW0zSFYkRlW7j0hp7Dnx+boFQu134PxIW0xeKN0SNwsgz+O6OF86x19JYui5Xek8pQItWNx61xhLiHiVV689C4xf8jHVmkmeThOd1VrZ9K3lUSRwvwCtCIRlcBKQBgVIvZoAC/HvKarEQIDAQAB

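RFC-6376 Section 3.6.1 defines the fields in the public key record.

The public key fields in this record are k (the key type, here rsa), t (flags, where s means the domain in the signature's i= field must match d= exactly and not be a subdomain), and p (the Base64-encoded public key).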
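A key record can be picked apart with the same tag=value approach. The sketch below is only an illustration: parse_key_record and public_key_der are made-up names, and the p= value in the demo is a short placeholder, not a real key. It strips white space from p= because DNS tools tend to display long TXT records as several space-separated strings.

```python
import base64


def parse_key_record(txt: str) -> dict:
    """Parse a DKIM key TXT record into a dict of tags (simplified)."""
    tags = {}
    for item in txt.split(";"):
        name, sep, value = item.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    # k= is optional and defaults to rsa (RFC 6376 Section 3.6.1).
    tags.setdefault("k", "rsa")
    # Join the pieces of a key that was displayed as several strings.
    tags["p"] = "".join(tags.get("p", "").split())
    return tags


def public_key_der(record: dict) -> bytes:
    """Return the decoded key bytes, or raise if the key was revoked."""
    if not record["p"]:
        raise ValueError("empty p= means this key has been revoked")
    return base64.b64decode(record["p"])


# Placeholder record for illustration only; the p= value is NOT a real key.
record = parse_key_record("k=rsa; t=s; p=QUJD REVG")
print(record["k"], record["t"], len(public_key_der(record)))
```

The decoded bytes would then be handed to an RSA library to build a public key object, which is outside what the standard library offers.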

Canonicalization Algorithms

Details around the canonicalization algorithms used in DKIM are defined in RFC-6376 Section 3.4.

Different email systems can modify messages in transit, and that can break DKIM signatures. Header and body canonicalization is a way to tolerate some of those changes. Two canonicalization algorithms are defined: simple and relaxed. Simple tolerates almost no modification, while relaxed tolerates common changes such as rewrapped header lines and altered white space.

The canonicalization method is specified separately for the headers and the message body. The DKIM signature header field c value is in the format <<header canonicalization>>/<<body canonicalization>>. For example, c=relaxed/relaxed; indicates that both the headers and the body should use the relaxed canonicalization algorithm when computing the message signature. If only one algorithm is named, it applies to the headers and the body defaults to simple, so c=relaxed; is equivalent to c=relaxed/simple;.
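The defaulting rules for c= are easy to get wrong, so here is a tiny sketch of them. RFC 6376 Section 3.5 says that a missing c= means simple/simple, and that a single named algorithm applies to the headers while the body falls back to simple. The function name parse_canonicalization is made up for this example.

```python
def parse_canonicalization(c_value=None):
    """Return (header_algorithm, body_algorithm) for a c= tag value."""
    if not c_value:
        return ("simple", "simple")  # c= absent: both default to simple
    header, _, body = c_value.strip().partition("/")
    return (header, body or "simple")  # body defaults to simple if omitted


print(parse_canonicalization("relaxed/relaxed"))
print(parse_canonicalization("relaxed"))
print(parse_canonicalization(None))
```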

Empty line definition

“An empty line is a line of zero length after removal of the line terminator” per RFC-6376 Section 3.4.3
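The empty line definition matters because both body canonicalization algorithms drop empty lines at the end of the body. Below is a minimal sketch of the two body algorithms as I read RFC 6376 Sections 3.4.3 and 3.4.4. It assumes the body already uses CRLF line endings, and canonicalize_body is a name chosen for this example.

```python
import re


def canonicalize_body(body: bytes, algorithm: str = "simple") -> bytes:
    """Apply simple or relaxed DKIM body canonicalization to a CRLF body."""
    if algorithm not in ("simple", "relaxed"):
        raise ValueError(f"unknown canonicalization: {algorithm}")
    lines = body.split(b"\r\n")
    if algorithm == "relaxed":
        # Collapse runs of spaces/tabs to one space, then drop white space
        # at the end of each line.
        lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    # Remove every empty line at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    if not lines:
        # An empty body becomes a single CRLF for simple, nothing for relaxed.
        return b"\r\n" if algorithm == "simple" else b""
    return b"\r\n".join(lines) + b"\r\n"


print(canonicalize_body(b" C \r\nD \t E\r\n\r\n\r\n", "relaxed"))
```

The demo input is the body example from RFC 6376 Section 3.4.5.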

Folding White Space

Folding white space is defined in RFC-5322 Section 3.2.2. In practice it is a CRLF sequence followed by white space, which is how long header lines are wrapped.
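Folding white space is what the relaxed header canonicalization has to undo. Here is a minimal sketch of that algorithm as described in RFC 6376 Section 3.4.2. The name relaxed_header is made up, and the function returns the header without its trailing CRLF so the caller can decide whether to add one.

```python
import re


def relaxed_header(raw: str) -> str:
    """Relaxed canonicalization of one raw header field (no trailing CRLF)."""
    name, _, value = raw.partition(":")
    # Unfold: remove any CRLF that is followed by white space.
    value = re.sub(r"\r\n(?=[ \t])", "", value)
    # Collapse each run of spaces/tabs into a single space.
    value = re.sub(r"[ \t]+", " ", value)
    # Lowercase the name and drop white space around the colon and at the end.
    return name.strip().lower() + ":" + value.strip(" \r\n")


print(relaxed_header("B : Y\t\r\n\tZ  \r\n"))
```

The demo input is one of the header examples from RFC 6376 Section 3.4.5.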

Computing Message Hashes

This algorithm is defined in RFC-6376 Section 3.7. Both signers and verifiers perform this algorithm the same way. Canonicalization is performed only in preparation for generating the message hashes and in no way modifies the underlying email message.

Signers and verifiers MUST compute two hashes:

  1. One over the body of the message.
  2. One over the selected header fields of the message.

In step 1 the body of the message is run through the specified body canonicalization algorithm, truncated to the length given in the DKIM-Signature l= field (if the signer set one), and then hashed.

That hash value is converted to Base64 and stored in (by the signer) or compared to (by the verifier) the bh= DKIM-Signature field.

l= DKIM-Signature Reminder

The l= DKIM-Signature field is a number of octets, not characters. This is important to remember for proper hashing. See RFC-6376 Section 3.5 for details.
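Here is a minimal sketch of the body hash step, including the l= truncation. It assumes the body has already been canonicalized and is held as bytes, which is what makes the octet count come out right: slicing bytes counts octets, while slicing a str would count characters. The function name body_hash is made up for this example.

```python
import base64
import hashlib


def body_hash(canonical_body: bytes, algorithm: str = "rsa-sha256", length=None) -> str:
    """Return the Base64 body hash that goes in (or is compared to) bh=."""
    # a= looks like "<key type>-<hash>", for example rsa-sha1 or rsa-sha256.
    hash_name = algorithm.split("-", 1)[1]
    if length is not None:
        if length > len(canonical_body):
            raise ValueError("l= is larger than the canonicalized body")
        canonical_body = canonical_body[:length]  # octets, not characters
    digest = hashlib.new(hash_name, canonical_body).digest()
    return base64.b64encode(digest).decode("ascii")


# "é" is one character but two octets in UTF-8, so l=2 keeps exactly "é".
body = "é and more\r\n".encode("utf-8")
print(body_hash(body, "rsa-sha256", length=2) == body_hash("é".encode("utf-8")))
```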

In step 2 the following data is hashed, in the order listed, using the hash algorithm specified in the DKIM-Signature a= field.

  1. The headers named in the DKIM-Signature h= field, in the order listed there, each canonicalized with the specified header canonicalization algorithm and terminated by a CRLF sequence.
  2. The DKIM-Signature header itself, canonicalized with the same algorithm, with the b= field value set to an empty string and without a trailing CRLF.

Note on DKIM-Signature h= field

A DKIM-Signature header must not be in its own h= field. Other DKIM-Signature headers can be included in a DKIM signature.
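The header hash input can be sketched as follows. This assumes the caller has already canonicalized the signed headers and the DKIM-Signature header being processed. The names blank_signature, header_hash_input and header_hash are made up for the illustration. The regular expression only matches a b tag that starts right after the header name or after a semicolon, so it does not touch bh= or a "b=" that happens to appear at the end of a Base64 value.

```python
import hashlib
import re


def blank_signature(dkim_header: str) -> str:
    """Empty the value of the b= tag while keeping the tag itself."""
    return re.sub(r"((?:^[^:]*:|;)\s*b\s*=)[^;]*", r"\1", dkim_header, count=1)


def header_hash_input(canonical_headers, canonical_dkim_header: str) -> bytes:
    """Concatenate the signed headers and the blanked DKIM-Signature header."""
    # Each signed header is terminated by CRLF, in h= order.
    signed = "".join(header + "\r\n" for header in canonical_headers)
    # The DKIM-Signature header goes last, without a trailing CRLF.
    own = blank_signature(canonical_dkim_header).rstrip("\r\n")
    return (signed + own).encode("utf-8")


def header_hash(data: bytes, algorithm: str = "rsa-sha256") -> bytes:
    """Hash the assembled data with the hash named in the a= tag."""
    return hashlib.new(algorithm.split("-", 1)[1], data).digest()


# Hypothetical, already-canonicalized inputs for illustration.
sig = "dkim-signature:v=1; a=rsa-sha1; bh=abc/b=; b=XY8x uph5; s=s1"
data = header_hash_input(["from:a@example.com", "to:b@example.com"], sig)
print(data)
```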

Computing verification steps: RFC-6376 Section 6.1.3

Signer Algorithm

This algorithm is defined in RFC-6376 section 5

  1. Select the private key corresponding to the public key published for the chosen selector.
  2. Canonicalize the message body with the appropriate canonicalization algorithm.
  3. Truncate the canonicalized body to the length that will be stored in the DKIM-Signature l= field, if one is used.
  4. Calculate the body hash for the bh= field of the DKIM signature.
  5. Build the list of headers to sign for the h= field.
  6. Canonicalize those headers with the appropriate canonicalization algorithm.
  7. Calculate the message hash, sign it with your private key, and put the Base64-encoded signature into the DKIM signature b= field.
  8. Insert the DKIM-Signature header into the header section of the message.
  9. You are done!
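The signer steps above can be sketched roughly like this. It is a simplified illustration under several assumptions: the inputs are already canonicalized with relaxed/relaxed, the algorithm is fixed to rsa-sha256, no l= field is used, and the actual private key operation is passed in as a callable named sign (a hypothetical parameter) because RSA signing needs a cryptography library that is not in the standard library.

```python
import base64
import hashlib
from typing import Callable, List


def sign_message(
    canonical_headers: List[str],
    canonical_body: bytes,
    domain: str,
    selector: str,
    sign: Callable[[bytes], bytes],
) -> str:
    """Return a DKIM-Signature header line for already-canonicalized input."""
    # Body hash for the bh= field.
    bh = base64.b64encode(hashlib.sha256(canonical_body).digest()).decode("ascii")
    # The h= list names the signed headers in the order they are hashed.
    names = ":".join(header.split(":", 1)[0] for header in canonical_headers)
    tag_list = (
        f"v=1; a=rsa-sha256; c=relaxed/relaxed; d={domain}; "
        f"s={selector}; h={names}; bh={bh}; b="
    )
    # Signed headers (each with CRLF), then our own header with an empty b=.
    data = "".join(header + "\r\n" for header in canonical_headers)
    data += "dkim-signature:" + tag_list
    signature = base64.b64encode(sign(data.encode("utf-8"))).decode("ascii")
    return "DKIM-Signature: " + tag_list + signature


# Stand-in for a real RSA signing function; it does NOT produce a signature.
fake_sign = lambda data: b"sig"
print(sign_message(["from:a@example.com", "subject:hi"], b"hello\r\n",
                   "example.com", "s1", fake_sign))
```

In a real signer the sign callable would wrap an RSA private key operation from a library of your choice, and a= would have to match the hash that operation uses.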

Verifier Algorithm

This algorithm is defined in RFC-6376 section 6

  1. Extract the DKIM signatures from the message. The RFC leaves it to the discretion of the implementer how many signatures to verify if multiple are present.
  2. Validate the DKIM signature header being verified, including the items listed in RFC 6376 Section 6.1.1.
  3. Retrieve the public key (domain key) for the signature.
  4. Canonicalize the body data using the appropriate canonicalization algorithm and truncate it to the length specified by the l= field, if present.
  5. Compute the body hash using the hash algorithm specified by the a= field.
  6. Compare it to the bh= field. If they match, the process continues.
  7. Pull the headers listed in the h= field and canonicalize them using the specified canonicalization algorithm.
  8. Canonicalize the DKIM header being validated with its b= field contents emptied (i.e. b=).
  9. Calculate the hash of the following data concatenated together: the canonicalized header data, then the canonicalized DKIM signature header being validated.
  10. Verify your computed hash against the signature stored in the b= field, using the public key.
  11. If the signature verifies, the message is validated! Otherwise the validation failed…
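The verifier flow can be sketched in the same spirit. This is an outline under stated assumptions, not the verification app mentioned in this post: the tags are already parsed into a dict, the body and the header hash input are already canonicalized and assembled, and the public key check is passed in as a callable named verify (a hypothetical parameter) since RSA verification needs a cryptography library.

```python
import base64
import hashlib
from typing import Callable


def verify_message(
    tags: dict,
    canonical_body: bytes,
    header_hash_input: bytes,
    verify: Callable[[bytes, bytes], bool],
) -> bool:
    """Check the body hash first, then hand the signature to verify()."""
    hash_name = tags["a"].split("-", 1)[1]  # for example "sha256"
    if "l" in tags:
        canonical_body = canonical_body[: int(tags["l"])]  # l= counts octets
    computed = base64.b64encode(
        hashlib.new(hash_name, canonical_body).digest()
    ).decode("ascii")
    if computed != "".join(tags["bh"].split()):
        return False  # the body no longer matches bh=
    signature = base64.b64decode("".join(tags["b"].split()))
    # verify() should check the signature over the header data with the
    # public key retrieved for the selector and domain.
    return verify(signature, header_hash_input)


# Stand-in verify function for the demo; it is NOT a real RSA check.
demo_body = b"hello\r\n"
demo_input = b"from:a@example.com\r\ndkim-signature:v=1; b="
demo_tags = {
    "a": "rsa-sha256",
    "bh": base64.b64encode(hashlib.sha256(demo_body).digest()).decode("ascii"),
    "b": base64.b64encode(hashlib.sha256(demo_input).digest()).decode("ascii"),
}
fake_verify = lambda sig, data: sig == hashlib.sha256(data).digest()
print(verify_message(demo_tags, demo_body, demo_input, fake_verify))
```

Checking bh= before touching the signature mirrors the order of the steps above and avoids the more expensive public key operation when the body has obviously changed.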

How Multiple Header Occurrences are Handled

This is described in detail in RFC 6376 5.4.2

In short, if a signed header appears multiple times in the message, each occurrence of its name in the h= list selects one instance, working from the last instance in the message headers up to the first.

For example:

GIVEN: h=meta,other headers...

If there are three meta headers

meta: 1

meta: 2

meta: 3

Then the instance meta: 3 would be selected. If the meta header was in the h= list twice, then meta: 3 and then meta: 2 would be included, in that order.
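The bottom-up selection can be sketched in a few lines of Python. The function name select_signed_headers and the (name, value) tuple layout are choices made for this illustration. A name in h= that has no remaining instance simply contributes nothing, which matches how the RFC treats nonexistent header fields.

```python
def select_signed_headers(headers, h_value):
    """Pick header instances for each name in h=, from the bottom up.

    headers is a list of (name, value) pairs in message order, top first.
    """
    remaining = list(headers)
    selected = []
    for wanted in h_value.split(":"):
        wanted = wanted.strip().lower()
        # Search from the last header toward the first.
        for index in range(len(remaining) - 1, -1, -1):
            if remaining[index][0].lower() == wanted:
                selected.append(remaining.pop(index))
                break
    return selected


message_headers = [("meta", "1"), ("meta", "2"), ("meta", "3")]
print(select_signed_headers(message_headers, "meta"))
print(select_signed_headers(message_headers, "meta:meta"))
```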

Example

I have a working message verification app here. It would be trivial to implement signing, and one day I just might 😉

Glossary