View Issue Details

ID: 0008892
Project: GNUnet
Category: GNS
View Status: public
Last Update: 2024-06-08 10:33
Reporter: bellebaum
Assigned To:
Priority: low
Severity: minor
Reproducibility: N/A
Status: new
Resolution: open
Summary: 0008892: Design Considerations for future *KEY types
Description:
The design of GNS thus far could be improved to allow easier cryptographic analysis. Each consideration here is of minor severity, i.e. to the best of my knowledge there is no immediate consequence of not implementing the proposed changes. Nevertheless, these changes (breaking, in some cases) might be worth considering for a cleaner design.
I am currently working off RFC 9498 rather than the actual implementation, so I apologize should my analysis not reflect changes made in the meantime.

# Making Full Use of HKDF

To derive a bunch of random looking bits from a zone key (zkey below) and a label, one could take a hash function H and output

H(zkey || label || 0) || H(zkey || label || 1) || H(zkey || label || 2) || ...

where || denotes concatenation.
This is well justified if one is willing to model H as an ideal random function, whose output can be related to its input only by plugging in values and observing what comes out. (This is routinely called a Random Oracle.)
However, real algorithms are not random oracles, and in particular cannot always be trusted to generate arbitrarily many independent output bits from highly correlated inputs. For this very reason, modern key derivation functions like HKDF first derive a single key from the input keying material using a dedicated component called a strong randomness extractor ("Extract"), and then use this one key to derive independent-looking key streams using another dedicated component, a pseudorandom function ("Expand"). In HKDF, both are implemented using HMAC, which is secure almost regardless of where you plug in the parameters, IF you are willing to model HMAC or the underlying hash function as a random oracle (which is why the current implementation is "probably" fine). A modular analysis of schemes, however, requires using HKDF's extract and expand functionality in the sense above.

For this reason, it may be good practice to first derive blinding key material bkm as follows:

salt := "gns-extract-" || zone_type
bkm := HKDF-Extract(salt, zkey || label)

zkey is assumed to have a fixed length, a length prefix, or some sort of final delimiter, so that zkey || label can be parsed unambiguously.
The salt here is merely used as a domain separation tag. It includes the zone type so that the encoding of zkey may differ between zone types without having to worry about collisions.
The initial keying material (zkey || label) is chosen to cover two cases: a sufficiently unknown zkey combined with a predictable label (as required by GNS for unlinkability in a web-surfing-like scenario), and a known zkey combined with a sufficiently random label (as required for censorship resistance and privacy with re:claim). In both cases the input keying material distribution has enough entropy for the extractor to do its job and extract a bkm indistinguishable from random.
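
As a rough illustration of the extract step, here is a minimal Python sketch using only the standard library. HKDF-Extract is simply HMAC keyed with the salt (RFC 5869). The choice of SHA-512, the 4-byte big-endian encoding of zone_type, the UTF-8 encoding of the label and the 32-byte zkey are assumptions made for this example, not something fixed by the proposal; derive_bkm is a hypothetical helper name.

```python
import hashlib
import hmac
import struct


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract as in RFC 5869: PRK = HMAC-Hash(salt, IKM)."""
    # SHA-512 is an assumption of this sketch.
    return hmac.new(salt, ikm, hashlib.sha512).digest()


def derive_bkm(zone_type: int, zkey: bytes, label: str) -> bytes:
    """Derive blinding key material from a zone key and a label."""
    # Assumption: zone_type is appended as a 32-bit big-endian integer.
    salt = b"gns-extract-" + struct.pack(">I", zone_type)
    # Assumption: zkey has fixed length for this zone type, so plain
    # concatenation with the label is unambiguous.
    return hkdf_extract(salt, zkey + label.encode("utf-8"))


# Example call with an all-zero placeholder key (illustration only).
bkm = derive_bkm(65536, bytes(32), "www")
```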

bkm may then be used to derive arbitrary random data using HKDF-Expand. For example, to derive a symmetric key for (say) AES-GCM and a blinding factor for (say) EdDSA key blinding one could use

K := HKDF-Expand(bkm, "gns-encrypt-aes-gcm-key", 256/8)
h := HKDF-Expand(bkm, "gns-blind-eddsa", 512/8)

In terms of the RR data confidentiality, query privacy and censorship resistance, this should simplify the analysis a decent bit.
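
For reference, the two expansions above could look roughly like this in Python. hkdf_expand follows the HKDF-Expand definition from RFC 5869. The hash function (SHA-512) and the placeholder bkm value are assumptions of this sketch; in practice bkm would come from the extract step.

```python
import hashlib
import hmac

HASH = hashlib.sha512  # assumption of this sketch


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand as in RFC 5869: T(i) = HMAC(PRK, T(i-1) || info || i)."""
    hash_len = HASH().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


# Placeholder value for illustration only; not a real extracted key.
bkm = bytes(range(64))

K = hkdf_expand(bkm, b"gns-encrypt-aes-gcm-key", 256 // 8)
h = hkdf_expand(bkm, b"gns-blind-eddsa", 512 // 8)
```

Because the info strings differ, K and h are outputs of the same pseudorandom function on different inputs, which is what allows them to be treated as independent in the analysis.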

# Simplifying and Future-Proofing the Symmetric Encryption

Contrary to the terminology in RFC 9498, the actual nonce ("number used once") is not the NONCE derived using HKDF (which is always the same, since it is derived deterministically for each zkey-label pair), but rather the expiration time of the record. Encryption algorithms typically only provide any guarantees if this value is never reused under the same key.

There are two approaches to "guaranteeing" this. One could write a "MUST" into a specification detailing that expiration times shall be strictly increasing. This approach might however leave the not cryptographically inclined reader (or non-reader. How many people using DNS have read the specifications? :D) with the impression that violating this constraint heuristically to support distributed signing infrastructures, temporary setups and backup solutions might only lead to temporary interoperability problems, rather than a leak of confidential information. Even if all users and implementations were to understand this, it would complicate the above use cases and, as a result, make GNS less flexible.

Alternatively, one could eliminate the problem altogether, ideally by generating a random nonce of at least 64 bits for each encryption and placing it next to the ciphertext. If one is worried about accidental nonce reuse, there are AEAD schemes like AES-GCM-SIV to handle this exact case. If one does not want to store ciphertexts locally and would rather re-encrypt plaintexts on demand, one may either store the nonce values alongside the expiration timestamp or derive them deterministically by applying an independently keyed pseudorandom function to the plaintext, expiration and blinded public key.
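
A minimal sketch of the deterministic variant, assuming HMAC-SHA-256 as the pseudorandom function and a 64-bit nonce. The function name derive_nonce, the field order and the length prefix on the blinded key are choices made for this example only; prf_key stands for a key that is independent of the encryption key.

```python
import hashlib
import hmac
import struct


def derive_nonce(prf_key: bytes, plaintext: bytes, expiration: int,
                 blind_zone_key: bytes) -> bytes:
    """Derive a 64-bit nonce from plaintext, expiration and blinded key."""
    # Fixed-width expiration and a length-prefixed key keep the encoding
    # unambiguous before the variable-length plaintext is appended.
    msg = (struct.pack(">Q", expiration)
           + struct.pack(">I", len(blind_zone_key)) + blind_zone_key
           + plaintext)
    # Truncating the PRF output to 8 bytes gives the 64-bit nonce.
    return hmac.new(prf_key, msg, hashlib.sha256).digest()[:8]
```

With this approach the same nonce only recurs when plaintext, expiration and blinded key are all identical, in which case the ciphertext is identical as well.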

Speaking of AEAD schemes, with the internet moving away from giving users access to low-level cryptographic primitives, many libraries will primarily expose, review, and also optimize, AEAD schemes. It does little harm to use these for encryption, even though integrity is already protected by a digital signature. The AD may also be used to bind the ciphertext to a particular blinded key.

To illustrate the full RRBlock creation, assume a layout such as (bitlengths in brackets, "varies" depends on zone_type or data length):

size(32) || zone_type(32) || expiration(64) || blind_zone_key(varies) || nonce(64) || bdata(varies) || aead_tag(128) || signature(varies)

Here I have left size and zone_type unchanged for compatibility with existing *KEY types; expiration has moved slightly to the front to have fixed-size key types in the header as much as possible. Doing so for the nonce would make AD-calculation and deterministic nonce-calculation more cumbersome.

Creation steps:
1. Fill in size, zone_type and expiration
2. Do the KDF extract step above
3. Use the KDF expand to derive randomness for blinding zkey, and fill in blind_zone_key
4. Fill in the nonce randomly, using a stored value, or by applying a PRF to the parts already filled in
5. Use the KDF expand to derive a symmetric encryption key
6. Fill in bdata and aead_tag by encrypting RDATA with everything from size to blind_zone_key as AD and the chosen nonce
7. Fill in a signature over everything before it

If you need every last bit of speed in creating the RRBlock, keep a hash of everything you have filled in so far. You can pass this hash to the nonce-PRF, use it as an effective AD and have it ready for the signature. I do not think that this makes much of a difference, but some to-be-defined *KEY types might make use of such optimizations.
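
The creation steps can be sketched as follows. This is only an outline under several assumptions: all integers are encoded big-endian, the AEAD produces a ciphertext of the same length as the plaintext plus a separate 16-byte tag, and the signature length is known in advance so that size can be filled in first. The parameters aead_encrypt and sign are hypothetical callables standing in for whatever the zone type defines; key derivation and blinding (steps 2, 3 and 5) are expected to have happened before the call, with the derived key captured inside aead_encrypt.

```python
import os
import struct
from typing import Callable, Optional, Tuple

NONCE_LEN = 8   # nonce(64)
TAG_LEN = 16    # aead_tag(128)


def build_rrblock(zone_type: int, expiration: int, blind_zone_key: bytes,
                  rdata: bytes, sig_len: int,
                  aead_encrypt: Callable[[bytes, bytes, bytes], Tuple[bytes, bytes]],
                  sign: Callable[[bytes], bytes],
                  nonce: Optional[bytes] = None) -> bytes:
    # Step 4: random nonce unless a stored or PRF-derived one is supplied.
    if nonce is None:
        nonce = os.urandom(NONCE_LEN)
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be 64 bits")
    # Step 1: size covers the whole block, including the signature.
    size = (4 + 4 + 8 + len(blind_zone_key) + NONCE_LEN
            + len(rdata) + TAG_LEN + sig_len)
    # AD is everything from size up to and including blind_zone_key.
    ad = struct.pack(">IIQ", size, zone_type, expiration) + blind_zone_key
    # Step 6: encrypt RDATA with the chosen nonce and the AD.
    bdata, tag = aead_encrypt(nonce, rdata, ad)
    if len(bdata) != len(rdata) or len(tag) != TAG_LEN:
        raise ValueError("unexpected ciphertext or tag length")
    # Step 7: sign everything before the signature.
    unsigned = ad + nonce + bdata + tag
    signature = sign(unsigned)
    if len(signature) != sig_len:
        raise ValueError("unexpected signature length")
    return unsigned + signature
```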
Tags: lsd0001

Relationships

related to 0008915 new Use libsodiums HKDF 

Activities

bellebaum

2024-06-04 11:44

reporter   ~0022503

Correction: In the last sentence, the hash should of course also go over the plaintext before being passed to the nonce-PRF. This is usually not a problem, since you may duplicate the hash state.
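
To illustrate what duplicating the hash state means in practice, here is a small Python sketch. hashlib hash objects offer copy(), so the running hash over the header can be forked: one copy is finalized as the effective AD, another absorbs the plaintext to produce the nonce-PRF input, and the original keeps going for the signature. SHA-512 and the function name fork_digests are assumptions of this example.

```python
import hashlib


def fork_digests(header: bytes, plaintext: bytes):
    """Return (ad_digest, nonce_prf_input, running_hash)."""
    running = hashlib.sha512()
    running.update(header)
    # Fork 1: digest over the header only, usable as an effective AD.
    ad_digest = running.copy().digest()
    # Fork 2: header plus plaintext, to be passed to the nonce-PRF.
    with_plaintext = running.copy()
    with_plaintext.update(plaintext)
    nonce_prf_input = with_plaintext.digest()
    # The original state is untouched and can go on to absorb the nonce,
    # ciphertext and tag before being used for the signature.
    return ad_digest, nonce_prf_input, running
```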

bellebaum

2024-06-05 16:58

reporter   ~0022518

After thinking about it some more, the extraction phase above still attempts to derive too many bits from the same zkey.
Doing something similar to the TLS 1.3 key schedule, one may derive bkm as

salt := "gns-extract-" || zone_type
pbkm := HKDF-Extract(salt, zkey)
bkm := HKDF-Extract(pbkm, label)

Calling HKDF-Extract several times might seem strange at first, but if the first extraction yields a uniform key unknown to an adversary, the second extraction (which is HMAC internally) with pbkm acts as a pseudorandom function. On the other hand, if zkey (and thus pbkm) is known, the second extraction may still extract randomness from a high-entropy label distribution, which is the best we might hope for.
For implementation ease with HKDF libraries not distinguishing between extract and expand, one may also put an expansion between the extractions with an arbitrary (but fixed) info. This does not affect the security much, but it does potentially hurt performance.
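
A minimal Python sketch of the two-stage extraction, without the optional expansion in between. As before, SHA-512, the 4-byte big-endian zone_type encoding and the UTF-8 label encoding are assumptions of this example, and derive_bkm_two_stage is a hypothetical helper name.

```python
import hashlib
import hmac
import struct


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract as in RFC 5869: PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, hashlib.sha512).digest()


def derive_bkm_two_stage(zone_type: int, zkey: bytes, label: str) -> bytes:
    # Assumption: zone_type is appended as a 32-bit big-endian integer.
    salt = b"gns-extract-" + struct.pack(">I", zone_type)
    # First extraction: condense the zone key into pbkm.
    pbkm = hkdf_extract(salt, zkey)
    # Second extraction: pbkm takes the salt position, i.e. it keys HMAC,
    # and the label is the input keying material.
    return hkdf_extract(pbkm, label.encode("utf-8"))
```

Since zkey and label now enter through separate HMAC calls, no length encoding or delimiter for zkey is needed in this variant.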

Issue History

Date Modified Username Field Change
2024-06-04 11:39 bellebaum New Issue
2024-06-04 11:40 schanzen Tag Attached: lsd0001
2024-06-04 11:44 bellebaum Note Added: 0022503
2024-06-05 16:58 bellebaum Note Added: 0022518
2024-06-08 10:33 schanzen Relationship added related to 0008915