From Code to Community: Sponsoring The Perl and Raku Conference 2025 Learn more

/* MIT License
*
* Copyright (c) 2024 Brad House
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* SPDX-License-Identifier: MIT
*/
/* DNS cookies are a simple form of learned mutual authentication supported by
* most DNS server implementations these days and can help prevent DNS Cache
* Poisoning attacks for clients and DNS amplification attacks for servers.
*
* A good overview is here:
*
* RFCs used for implementation are
* [RFC7873](https://datatracker.ietf.org/doc/html/rfc7873) which is extended by
*
* Though this could be used on TCP, the likelihood of it being useful is small
* and could cause some issues. TCP is better used as a fallback in case there
* are issues with DNS Cookie support in the upstream servers (e.g. AnyCast
* cluster issues).
*
* While most recursive DNS servers support DNS Cookies, public DNS servers like
* Google (8.8.8.8, 8.8.4.4) and CloudFlare (1.1.1.1, 1.0.0.1) don't seem to
* have this enabled yet for unknown reasons.
*
* The risk to having DNS Cookie support always enabled is nearly zero as there
* is built-in detection support and it will simply bypass using cookies if the
* remote server doesn't support it. The problem arises if a remote server
* supports DNS cookies, then stops supporting them (such as if an administrator
* reconfigured the server, or maybe there are different servers in a cluster
* with different configurations). We need to detect this behavior by tracking
* how much time has gone by since we received our last valid cookie reply, and
* if we exceed the threshold, reset all cookie parameters like we haven't
* attempted a request yet.
*
* ## Implementation Plan
*
* ### Constants:
* - `COOKIE_CLIENT_TIMEOUT`: 86400s (1 day)
* - How often to regenerate the per-server client cookie, even if our
* source ip address hasn't changed.
* - `COOKIE_UNSUPPORTED_TIMEOUT`: 300s (5 minutes)
* - If a server responds without a cookie in the reply, this is how long to
* wait before attempting to send a client cookie again.
* - `COOKIE_REGRESSION_TIMEOUT`: 120s (2 minutes)
* - If a server was once known to return cookies, and all of a sudden stops
* returning cookies (but the reply is otherwise valid), this is how long
* to continue to attempt to use cookies before giving up and resetting.
* Such an event would cause an outage for this duration, but since a
* cache poisoning attack should be dropping invalid replies we should be
* able to still get the valid reply and not assume it is a server
* regression just because we received replies without cookies.
* - `COOKIE_RESEND_MAX`: 3
* - Maximum times to resend a query to a server due to the server responding
* with `BAD_COOKIE`, after this, we switch to TCP.
*
* ### Per-server variables:
* - `cookie.state`: Known state of cookie support, enumeration.
* - `INITIAL` (0): Initial state, not yet determined. Used during startup.
* - `GENERATED` (1): Cookie has been generated and sent to a server, but no
* validated response yet.
* - `SUPPORTED` (2): Server has been determined to properly support cookies
* - `UNSUPPORTED` (3): Server has been determined to not support cookies
* - `cookie.client` : 8 byte randomly generated client cookie
* - `cookie.client_ts`: Timestamp client cookie was generated
* - `cookie.client_ip`: IP address client used to connect to server
* - `cookie.server`: 8 to 32 byte server cookie
* - `cookie.server_len`: length of server cookie
* - `cookie.unsupported_ts`: Timestamp of last attempt to use a cookies, but
* it was determined that the server didn't support them.
*
* ### Per-query variables:
* - `query.client_cookie`: Duplicate of `cookie.client` at the point in time
* the query is put on the wire. This should be available in the
* `ares_dns_record_t` for the request for verification purposes so we don't
* actually need to duplicate this, just naming it here for the ease of
* documentation below.
* - `query.cookie_try_count`: Number of tries to send a cookie but receive
* `BAD_COOKIE` responses. Used to know when we need to switch to TCP.
*
* ### Procedure:
* **NOTE**: These steps will all be done after obtaining a connection handle as
* some of these steps depend on determining the source ip address for the
* connection.
*
* 1. If the query is not using EDNS, then **skip any remaining processing**.
* 2. If using TCP, ensure there is no EDNS cookie opt (10) set (there may have
* been if this is a resend after upgrade to TCP), then **skip any remaining
* processing**.
* 3. If `cookie.state == SUPPORTED`, `cookie.unsupported_ts` is non-zero, and
* evaluates greater than `COOKIE_REGRESSION_TIMEOUT`, then clear all cookie
* settings, set `cookie.state = INITIAL`. Continue to next step (4)
* 4. If `cookie.state == UNSUPPORTED`
* - If `cookie.unsupported_ts` evaluates less than
* `COOKIE_UNSUPPORTED_TIMEOUT`
* - Ensure there is no EDNS cookie opt (10) set (shouldn't be unless
* requestor had put this themselves), then **skip any remaining
* processing** as we don't want to try to send cookies.
* - Otherwise:
* - clear all cookie settings, set `cookie.state = INITIAL`.
* - Continue to next step (5) which will send a new cookie.
* 5. If `cookie.state == INITIAL`:
* - randomly generate new `cookie.client`
* - set `cookie.client_ts` to the current timestamp.
* - set `cookie.state = GENERATED`.
* - set `cookie.client_ip` to the current source ip address.
* 6. If `cookie.state == GENERATED || cookie.state == SUPPORTED` and
* `cookie.client_ip` does not match the current source ip address:
* - clear `cookie.server`
* - randomly generate new `cookie.client`
* - set `cookie.client_ts` to the current timestamp.
* - set `cookie.client_ip` to the current source ip address.
* - do not change the `cookie.state`
* 7. If `cookie.state == SUPPORTED` and `cookie.client_ts` evaluation exceeds
* `COOKIE_CLIENT_TIMEOUT`:
* - clear `cookie.server`
* - randomly generate new `cookie.client`
* - set `cookie.client_ts` to the current timestamp.
* - set `cookie.client_ip` to the current source ip address.
* - do not change the `cookie.state`
* 8. Generate EDNS OPT record (10) for client cookie. The option value will be
* the `cookie.client` concatenated with the `cookie.server`. If there is no
* known server cookie, it will not be appended. Copy `cookie.client` to
* `query.client_cookie` to handle possible client cookie changes by other
* queries before a reply is received (technically this is in the cached
* `ares_dns_record_t` so no need to manually do this). Send request to
* server.
* 9. Evaluate response:
* 1. If invalid EDNS OPT cookie (10) length sent back in response (valid
* length is 16-40), or bad client cookie value (validate first 8 bytes
* against `query.client_cookie` not `cookie.client`), **drop response**
* as if it hadn't been received. This is likely a spoofing attack.
* Wait for valid response up to normal response timeout.
* 2. If a EDNS OPT cookie (10) server cookie is returned:
* - set `cookie.unsupported_ts` to zero and `cookie.state = SUPPORTED`.
* We can confirm this server supports cookies based on the existence
* of this record.
* - If a new EDNS OPT cookie (10) server cookie is in the response, and
* the `client.cookie` matches the `query.client_cookie` still (hasn't
* been rotated by some other parallel query), save it as
* `cookie.server`.
* 3. If dns response `rcode` is `BAD_COOKIE`:
* - Ensure a EDNS OPT cookie (10) is returned, otherwise **drop
* response**, this is completely invalid and likely an spoof of some
* sort.
* - Otherwise
* - Increment `query.cookie_try_count`
* - If `query.cookie_try_count >= COOKIE_RESEND_MAX`, set
* `query.using_tcp` to force the next attempt to use TCP.
* - **Requeue the query**, but do not increment the normal
* `try_count` as a `BAD_COOKIE` reply isn't a normal try failure.
* This should end up going all the way back to step 1 on the next
* attempt.
* 4. If EDNS OPT cookie (10) is **NOT** returned in the response:
* - If `cookie.state == SUPPORTED`
* - if `cookie.unsupported_ts` is zero, set to the current timestamp.
* - Drop the response, wait for a valid response to be returned
* - if `cookie.state == GENERATED`
* - clear all cookie settings
* - set `cookie.state = UNSUPPORTED`
* - set `cookie.unsupported_ts` to the current time
* - Accept response (state should be `UNSUPPORTED` if we're here)
*/
#include "ares_private.h"
/* 1 day */
#define COOKIE_CLIENT_TIMEOUT_MS (86400 * 1000)
/* 5 minutes */
#define COOKIE_UNSUPPORTED_TIMEOUT_MS (300 * 1000)
/* 2 minutes */
#define COOKIE_REGRESSION_TIMEOUT_MS (120 * 1000)
#define COOKIE_RESEND_MAX 3
static const unsigned char *
ares_dns_cookie_fetch(const ares_dns_record_t *dnsrec, size_t *len)
{
const ares_dns_rr_t *rr = ares_dns_get_opt_rr_const(dnsrec);
const unsigned char *val = NULL;
*len = 0;
if (rr == NULL) {
return NULL;
}
if (!ares_dns_rr_get_opt_byid(rr, ARES_RR_OPT_OPTIONS, ARES_OPT_PARAM_COOKIE,
&val, len)) {
return NULL;
}
return val;
}
static ares_bool_t timeval_is_set(const ares_timeval_t *tv)
{
if (tv->sec != 0 && tv->usec != 0) {
return ARES_TRUE;
}
return ARES_FALSE;
}
static ares_bool_t timeval_expired(const ares_timeval_t *tv,
const ares_timeval_t *now,
unsigned long millsecs)
{
ares_int64_t tvdiff_ms;
ares_timeval_t tvdiff;
ares_timeval_diff(&tvdiff, tv, now);
tvdiff_ms = tvdiff.sec * 1000 + tvdiff.usec / 1000;
if (tvdiff_ms >= (ares_int64_t)millsecs) {
return ARES_TRUE;
}
return ARES_FALSE;
}
static void ares_cookie_clear(ares_cookie_t *cookie)
{
memset(cookie, 0, sizeof(*cookie));
cookie->state = ARES_COOKIE_INITIAL;
}
static void ares_cookie_generate(ares_cookie_t *cookie, ares_conn_t *conn,
const ares_timeval_t *now)
{
ares_channel_t *channel = conn->server->channel;
ares_rand_bytes(channel->rand_state, cookie->client, sizeof(cookie->client));
memcpy(&cookie->client_ts, now, sizeof(cookie->client_ts));
memcpy(&cookie->client_ip, &conn->self_ip, sizeof(cookie->client_ip));
}
static void ares_cookie_clear_server(ares_cookie_t *cookie)
{
memset(cookie->server, 0, sizeof(cookie->server));
cookie->server_len = 0;
}
static ares_bool_t ares_addr_equal(const struct ares_addr *addr1,
const struct ares_addr *addr2)
{
if (addr1->family != addr2->family) {
return ARES_FALSE;
}
switch (addr1->family) {
case AF_INET:
if (memcmp(&addr1->addr.addr4, &addr2->addr.addr4,
sizeof(addr1->addr.addr4)) == 0) {
return ARES_TRUE;
}
break;
case AF_INET6:
/* This structure is weird, and due to padding SonarCloud complains if
* you don't punch all the way down. At some point we should rework
* this structure */
if (memcmp(&addr1->addr.addr6._S6_un._S6_u8,
&addr2->addr.addr6._S6_un._S6_u8,
sizeof(addr1->addr.addr6._S6_un._S6_u8)) == 0) {
return ARES_TRUE;
}
break;
default:
break; /* LCOV_EXCL_LINE */
}
return ARES_FALSE;
}
ares_status_t ares_cookie_apply(ares_dns_record_t *dnsrec, ares_conn_t *conn,
const ares_timeval_t *now)
{
ares_server_t *server = conn->server;
ares_cookie_t *cookie = &server->cookie;
ares_dns_rr_t *rr = ares_dns_get_opt_rr(dnsrec);
unsigned char c[40];
size_t c_len;
/* If there is no OPT record, then EDNS isn't supported, and therefore
* cookies can't be supported */
if (rr == NULL) {
return ARES_SUCCESS;
}
/* No cookies on TCP, make sure we remove one if one is present */
if (conn->flags & ARES_CONN_FLAG_TCP) {
ares_dns_rr_del_opt_byid(rr, ARES_RR_OPT_OPTIONS, ARES_OPT_PARAM_COOKIE);
return ARES_SUCCESS;
}
/* Look for regression */
if (cookie->state == ARES_COOKIE_SUPPORTED &&
timeval_is_set(&cookie->unsupported_ts) &&
timeval_expired(&cookie->unsupported_ts, now,
COOKIE_REGRESSION_TIMEOUT_MS)) {
ares_cookie_clear(cookie);
}
/* Handle unsupported state */
if (cookie->state == ARES_COOKIE_UNSUPPORTED) {
/* If timer hasn't expired, just delete any possible cookie and return */
if (!timeval_expired(&cookie->unsupported_ts, now,
COOKIE_REGRESSION_TIMEOUT_MS)) {
ares_dns_rr_del_opt_byid(rr, ARES_RR_OPT_OPTIONS, ARES_OPT_PARAM_COOKIE);
return ARES_SUCCESS;
}
/* We want to try to "learn" again */
ares_cookie_clear(cookie);
}
/* Generate a new cookie */
if (cookie->state == ARES_COOKIE_INITIAL) {
ares_cookie_generate(cookie, conn, now);
cookie->state = ARES_COOKIE_GENERATED;
}
/* Regenerate the cookie and clear the server cookie if the client ip has
* changed */
if ((cookie->state == ARES_COOKIE_GENERATED ||
cookie->state == ARES_COOKIE_SUPPORTED) &&
!ares_addr_equal(&conn->self_ip, &cookie->client_ip)) {
ares_cookie_clear_server(cookie);
ares_cookie_generate(cookie, conn, now);
}
/* If the client cookie has reached its maximum time, refresh it */
if (cookie->state == ARES_COOKIE_SUPPORTED &&
timeval_expired(&cookie->client_ts, now, COOKIE_CLIENT_TIMEOUT_MS)) {
ares_cookie_clear_server(cookie);
ares_cookie_generate(cookie, conn, now);
}
/* Generate the full cookie which is the client cookie concatenated with the
* server cookie (if there is one) and apply it. */
memcpy(c, cookie->client, sizeof(cookie->client));
if (cookie->server_len) {
memcpy(c + sizeof(cookie->client), cookie->server, cookie->server_len);
}
c_len = sizeof(cookie->client) + cookie->server_len;
return ares_dns_rr_set_opt(rr, ARES_RR_OPT_OPTIONS, ARES_OPT_PARAM_COOKIE, c,
c_len);
}
ares_status_t ares_cookie_validate(ares_query_t *query,
const ares_dns_record_t *dnsresp,
ares_conn_t *conn, const ares_timeval_t *now)
{
ares_server_t *server = conn->server;
ares_cookie_t *cookie = &server->cookie;
const ares_dns_record_t *dnsreq = query->query;
const unsigned char *resp_cookie;
size_t resp_cookie_len;
const unsigned char *req_cookie;
size_t req_cookie_len;
resp_cookie = ares_dns_cookie_fetch(dnsresp, &resp_cookie_len);
/* Invalid cookie length, drop */
if (resp_cookie && (resp_cookie_len < 8 || resp_cookie_len > 40)) {
return ARES_EBADRESP;
}
req_cookie = ares_dns_cookie_fetch(dnsreq, &req_cookie_len);
/* Didn't request cookies, so we can stop evaluating */
if (req_cookie == NULL) {
return ARES_SUCCESS;
}
/* If 8-byte prefix for returned cookie doesn't match the requested cookie,
* drop for spoofing */
if (resp_cookie && memcmp(req_cookie, resp_cookie, 8) != 0) {
return ARES_EBADRESP;
}
if (resp_cookie && resp_cookie_len > 8) {
/* Make sure we record that we successfully received a cookie response */
cookie->state = ARES_COOKIE_SUPPORTED;
memset(&cookie->unsupported_ts, 0, sizeof(cookie->unsupported_ts));
/* If client cookie hasn't been rotated, save the returned server cookie */
if (memcmp(cookie->client, req_cookie, sizeof(cookie->client)) == 0) {
cookie->server_len = resp_cookie_len - 8;
memcpy(cookie->server, resp_cookie + 8, cookie->server_len);
}
}
if (ares_dns_record_get_rcode(dnsresp) == ARES_RCODE_BADCOOKIE) {
/* Illegal to return BADCOOKIE but no cookie, drop */
if (resp_cookie == NULL) {
return ARES_EBADRESP;
}
/* If we have too many attempts to send a cookie, we need to requeue as
* tcp */
query->cookie_try_count++;
if (query->cookie_try_count >= COOKIE_RESEND_MAX) {
query->using_tcp = ARES_TRUE;
}
/* Resend the request, hopefully it will work the next time as we should
* have recorded a server cookie */
ares_requeue_query(query, now, ARES_SUCCESS,
ARES_FALSE /* Don't increment try count */, NULL);
/* Parent needs to drop this response */
return ARES_EBADRESP;
}
/* We've got a response with a server cookie, and we've done all the
* evaluation we can, return success */
if (resp_cookie_len > 8) {
return ARES_SUCCESS;
}
if (cookie->state == ARES_COOKIE_SUPPORTED) {
/* If we're not currently tracking an error time yet, start */
if (!timeval_is_set(&cookie->unsupported_ts)) {
memcpy(&cookie->unsupported_ts, now, sizeof(cookie->unsupported_ts));
}
/* Drop it since we expected a cookie */
return ARES_EBADRESP;
}
if (cookie->state == ARES_COOKIE_GENERATED) {
ares_cookie_clear(cookie);
cookie->state = ARES_COOKIE_UNSUPPORTED;
memcpy(&cookie->unsupported_ts, now, sizeof(cookie->unsupported_ts));
}
/* Cookie state should be UNSUPPORTED if we're here */
return ARES_SUCCESS;
}