Notification of a delivery failure in Mailgun

When you use an email service provider like Gmail, you get delivery failure emails. These emails are great because they let you reach your intended recipient another way, or simply retry the email later.

Because I use Mailgun with Gmail’s Send Mail As feature for my custom domains, I needed to implement a solution that handles delivery failures.

Background

I’m cheap, so I host my domains with Cloudflare, as they don’t charge a markup: I pay what they pay. Cloudflare recently introduced email forwarding, which is great. I used to send and forward emails with Mailgun on their free plan, but they removed the forwarding feature right around the time Cloudflare introduced theirs. Lucky me.

I once sent multiple chaser emails to someone because I thought they just hadn’t got back to me. It turned out my emails weren’t getting through, and I had no way of knowing. That is what started me down this path.

Mailgun offers webhooks for many different events, so I just had to build something to handle the delivery failed event. I’d built some Lambda and Netlify functions before, but Cloudflare Workers was new at the time, so I decided to build it on that instead.

How I built it

To start, I installed Wrangler, Cloudflare’s development CLI, and got a project set up. A Cloudflare Worker runs a short snippet of code on the edge. I wanted to write it in JavaScript, the language I’m most comfortable in, and had to enable the node_compat flag on the Worker so that various methods used by the crypto functions would work. I also set up CI using a GitHub action, so each time I push changes to GitHub, the Worker gets built and deployed to the global Cloudflare network. I used the cloudflare/wrangler-action package to do this.
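
For anyone new to Workers, here is a minimal sketch of the handler shape that the later snippets sit inside, assuming the module Worker syntax. The `worker` name, the POST-only check, and the placeholder 200 reply are illustrative assumptions rather than the deployed code.

```javascript
// Minimal module Worker sketch. In a real project this object would be the
// module's default export (export default worker).
const worker = {
  // request: the incoming Request; env: variables and secrets set via Wrangler
  async fetch(request, env) {
    // Webhooks arrive as POST; anything else is rejected (illustrative choice)
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 })
    }
    // Parse the JSON body, answering 400 when it is missing or malformed
    let body
    try {
      body = await request.json()
    } catch (e) {
      return new Response('Bad request', { status: 400 })
    }
    // Signature checks and the notification email would follow here
    return new Response('OK', { status: 200 })
  }
}
```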

A simple overview of how it works: the Worker receives a request and parses its body. If the body is missing or isn’t valid JSON, the Worker returns a 400 Bad Request status code. With the body, we create a hmacDigest from the body’s timestamp and token, plus a special signing key that we get from the Mailgun dashboard. I’m using the crypto-js library to do this. We compare this calculated digest with the signature in the request body to make sure they match. This authenticates the request, so only our Mailgun account can send webhooks to this Worker. We also cache this digest, along with the URL, so that we can see whether it has been used before. Mailgun shouldn’t send the same signed event twice, so a repeated digest most likely means someone is trying to get information by replaying an old webhook event. By checking whether we’ve processed the digest before, we can prevent replay attacks.

// At the top of the module (assumed imports, using crypto-js's per-module paths)
import hmacSHA256 from 'crypto-js/hmac-sha256'
import hex from 'crypto-js/enc-hex'

// Inside the Worker's fetch handler:
// Make sure a JSON body is included
let body
try {
    body = await request.json()
} catch (e) {
    // 400 is Bad Request (405 would mean Method Not Allowed)
    return new Response("Bad request", { status: 400 })
}
// Calculate the HMAC digest from the timestamp, token, and our Mailgun signing key
const hmacDigest = hex.stringify(hmacSHA256(body.signature.timestamp + body.signature.token, env.MAILGUN_SIGNING_KEY))
// Load Cloudflare Cache
const cache = caches.default
// Set Cache Key for this signature = https://worker.domain/signature
const cacheKey = request.url + hmacDigest
// Ensure the signature has not been used already
const alreadyUsedSignature = await cache.match(cacheKey)
if (alreadyUsedSignature !== undefined) {
    return new Response("This is a replay attack. The signature has been used before", { status: 401 })
}
// Verify that the calculated digest matches the signature Mailgun sent us
if (hmacDigest !== body.signature.signature) {
    return new Response("Could not verify signature", { status: 406 })
}
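
The excerpt above only shows the cache lookup. Two further pieces usually go with it: checking that the timestamp is recent, and storing the digest once a request has been accepted. The following is a hedged sketch of both. The helper names, the 15-minute window, and the one-day cache lifetime are assumptions made for illustration, and `cache` stands in for `caches.default`.

```javascript
// Hypothetical helper: true when the webhook timestamp (seconds since the
// epoch, as a string or number) is within maxAgeSeconds of "now".
function isFreshTimestamp(timestamp, nowMs = Date.now(), maxAgeSeconds = 15 * 60) {
  const sentAt = Number(timestamp)
  if (!Number.isFinite(sentAt)) return false
  return Math.abs(nowMs / 1000 - sentAt) <= maxAgeSeconds
}

// Hypothetical helper: record a verified digest so that a later
// cache.match(cacheKey) finds it. `cache` is expected to behave like
// caches.default, and cacheKey must be a valid URL string.
async function rememberSignature(cache, cacheKey, ttlSeconds = 24 * 60 * 60) {
  const marker = new Response('used', {
    headers: { 'Cache-Control': `max-age=${ttlSeconds}` }
  })
  await cache.put(cacheKey, marker)
}
```

In the handler, `rememberSignature(caches.default, cacheKey)` would run after the signature comparison succeeds. Note that the Workers cache is local to each Cloudflare data centre, so this only catches replays that arrive at the same location.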

Now that the security is out of the way, we can process the error message received from Mailgun. First, we get the recipient, so we know who to contact again, seeing as they didn’t receive our message. Then we grab the sender, because you may manage more than one domain and need to know which address sent the email. The email subject helps too, as you may have sent more than one email to the same person. We also want to know the error message, so we get that too.
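
As a sketch of that extraction step, the helper below collects the four fields and falls back to a placeholder when one is missing. The `extractFailure` name, the placeholder strings, and the sample payload are made up for illustration; the field paths follow the webhook body used in this post.

```javascript
// Hypothetical helper: pull the fields needed for the notification out of
// the webhook body, with a placeholder for anything that is missing.
function extractFailure(body) {
  const event = body?.['event-data'] ?? {}
  const status = event['delivery-status'] ?? {}
  return {
    recipient: event.recipient ?? '(unknown recipient)',
    sender: event.envelope?.sender ?? '(unknown sender)',
    subject: event.message?.headers?.subject ?? '(no subject)',
    // Prefer the description, then the message, as the error text
    reason: status.description || status.message || '(no error message)'
  }
}

// Example with a made-up payload
const failure = extractFailure({
  'event-data': {
    recipient: 'alice@example.com',
    envelope: { sender: 'me@example.org' },
    message: { headers: { subject: 'Hello' } },
    'delivery-status': { description: '', message: 'Mailbox full' }
  }
})
// failure.reason is 'Mailbox full' because the description is empty
```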

To send mail, we use Mailgun’s HTTP API, configured using environment variables. One sets the domain the notifications are sent from, and one sets the address the notifications should go to. We then just need a Mailgun API key to allow us to send the email. This API key should match the domain you’d like to send email from.

// Set up the email to send
const mailOptions = {
    from: `Galexia Mail Reporting <info@${env.DOMAIN}>`,
    to: env.REPORTING_ADDRESS,
    subject: "New delivery failure in Mailgun",
    text: `
An email to:
${body['event-data'].recipient}

From:
${body['event-data'].envelope.sender}

With a subject of:
${body['event-data'].message.headers.subject}

Has failed.

The error message was:
${body['event-data']['delivery-status'].description || body['event-data']['delivery-status'].message}
`
}
// Convert the email JSON to FormData
const form_data = new FormData()
for (const [key, value] of Object.entries(mailOptions)) {
    form_data.append(key, value)
}
// Send the email (EU region endpoint; US region domains use api.mailgun.net)
const sendEmail = await fetch(`https://api.eu.mailgun.net/v3/${env.DOMAIN}/messages`, {
    method: 'POST',
    body: form_data,
    headers: {
        // HTTP Basic auth with the user "api"; btoa replaces the deprecated new Buffer()
        'authorization': `Basic ${btoa('api:' + env.MAILGUN_API_KEY)}`,
        'accept': 'application/json'
    }
})
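
The excerpt stops at the fetch call. One way to finish the handler is to turn Mailgun's reply into the Worker's own response, so that a failed notification shows up in the logs. This is only a sketch: the `responseForSend` name and the choice of a 502 status are assumptions, and how Mailgun treats a non-200 answer from a webhook endpoint is worth checking in its documentation.

```javascript
// Hypothetical helper: map the Response from the Mailgun send request to
// the Response the Worker returns for the webhook.
async function responseForSend(sendEmail) {
  if (sendEmail.ok) {
    return new Response('Notification sent', { status: 200 })
  }
  // Log the status and body so the failure is visible in the Worker logs
  const detail = await sendEmail.text()
  console.error(`Mailgun send failed: ${sendEmail.status} ${detail}`)
  return new Response('Could not send notification', { status: 502 })
}
```

The handler would then end with `return responseForSend(sendEmail)`.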

Setting this all up gets me delivery failure messages like this:

Of course, if the notification email itself fails to send, you could be stuck in a loop and still get silent failures, so I’ve set my recipient to a Gmail address, which has pretty good availability.

In summary, having this Worker set up allows me to catch any emails that don’t get delivered, and then dive into the Mailgun logs to debug the issue. On the Flex plan, logs are only stored for 5 days, so this webhook gives me near-instant error messages and lets me act on any issues quickly.

To build on this, I’d want to set up something to notify me of forwarding failures from Cloudflare’s free email forwarding.