View Issue Details
ID  0008892
Project  GNUnet
Category  GNS
View Status  public
Date Submitted  2024-06-04 11:39
Last Update  2024-06-08 10:33
Reporter  bellebaum
Assigned To  
Priority  low
Severity  minor
Reproducibility  N/A
Status  new
Resolution  open
Summary  0008892: Design Considerations for future *KEY types  
Description  The design of GNS thus far could be improved to allow easier cryptographic analysis. Each consideration here is of minor severity, i.e. there is no immediate consequence of not implementing the proposed changes, to the best of my knowledge. Nevertheless, these (in some cases breaking) changes might be worth considering for a cleaner design. I am currently working off RFC 9498 rather than the actual implementation, so I apologize should my analysis not reflect changes made in the meantime.

# Making Full Use of HKDF

To derive a bunch of random-looking bits from a zone key (zkey below) and a label, one could take a hash function H and output

H(zkey || label || 0) || H(zkey || label || 1) || H(zkey || label || 2) || ...

where || denotes concatenation. This is well justified if one is willing to model H as an ideal random function, whose output can be correlated with its input only by plugging in values and observing what comes out. (This is routinely called a Random Oracle.) However, algorithms are not random oracles, and in particular cannot always be trusted to generate arbitrarily many independent output bits from highly correlated inputs.

For this very reason, modern key derivation functions like HKDF derive only a single key from a keying distribution using a dedicated component called a strong randomness extractor ("Extract") and then use this one key to derive independent-looking key streams using another dedicated component called a pseudorandom function ("Expand"). In HKDF, these are both implemented using HMAC, which is secure almost independent of where you plug in parameters, IF you are willing to model HMAC or the underlying hash function as a random oracle (which is why the current implementation is "probably" fine). A modular analysis of schemes, however, requires using HKDF through its Extract and Expand functionality in the sense above.
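As a rough illustration of the difference, the following Python sketch places the naive counter-mode hash next to an extract-then-expand construction built directly from HMAC as described in RFC 5869. The function names and the default of SHA-256 are assumptions made for this example; they are not taken from RFC 9498 or from the GNUnet code.

```python
import hashlib
import hmac


def counter_hash(zkey: bytes, label: bytes, blocks: int) -> bytes:
    """Naive stream: H(zkey || label || 0) || H(zkey || label || 1) || ..."""
    return b"".join(
        hashlib.sha256(zkey + label + bytes([i])).digest() for i in range(blocks)
    )


def hkdf_extract(salt: bytes, ikm: bytes, hash_name: str = "sha256") -> bytes:
    """RFC 5869 Extract: PRK = HMAC(salt, IKM)."""
    if not salt:
        # RFC 5869: an absent salt defaults to a string of HashLen zero bytes.
        salt = bytes(hashlib.new(hash_name).digest_size)
    return hmac.new(salt, ikm, hash_name).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int,
                hash_name: str = "sha256") -> bytes:
    """RFC 5869 Expand: T(i) = HMAC(PRK, T(i-1) || info || i)."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]
```

In the first function every output block depends on the raw, possibly correlated input. In the second pair, only the single extracted key is fed to the expansion step, which is the separation the modular analysis relies on.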
For this reason, it may be good practice to first derive blinding key material bkm as follows:

salt := "gnsextract" || zone_type
bkm := HKDF-Extract(salt, zkey || label)

zkey is assumed to have fixed length, a length encoding or some sort of final delimiter. The salt here is merely used as a domain separation tag. It includes the zone type so that the encoding of zkey may differ between zone types without having to worry about collisions.

The initial keying material (zkey || label) is chosen such that both the combination of a sufficiently unknown zkey with a predictable label (as required by GNS for unlinkability in a web-surfing-like scenario) as well as the combination of a known zkey with a sufficiently random label (as required for censorship resistance and privacy with re:claim) result in input keying material distributions of sufficient entropy for the extractor to do its job and extract a bkm indistinguishable from random.

bkm may then be used to derive arbitrary random data using HKDF-Expand. For example, to derive a symmetric key for (say) AES-GCM and a blinding factor for (say) EdDSA key blinding one could use

K := HKDF-Expand(bkm, "gnsencryptaesgcmkey", 256/8)
h := HKDF-Expand(bkm, "gnsblindeddsa", 512/8)

In terms of the RR data confidentiality, query privacy and censorship resistance, this should simplify the analysis a decent bit.

# Simplifying and Future-Proofing the Symmetric Encryption

Contrary to the terminology in RFC 9498, the actual nonce ("number used once") is not the NONCE derived using HKDF (which is always the same, since it is derived deterministically for each zkey-label pair), but rather the expiration time of the record. Encryption algorithms typically only provide any guarantees if this value is used only once. There are two approaches to "guaranteeing" this. One could write a "MUST" into a specification detailing that expiration times shall be strictly increasing.
This approach might however leave the not cryptographically inclined reader (or non-reader. How many people using DNS have read the specifications? :D) with the impression that violating this constraint heuristically to support distributed signing infrastructures, temporary setups and backup solutions might only lead to temporary interoperability problems, rather than a leak of confidential information. Even if all users and implementations were to understand this, it would complicate the above use cases and, as a result, make GNS less flexible.

Alternatively, one could remove the problem altogether, ideally by generating a random (at least) 64-bit nonce each time and placing it next to the ciphertext. If one is worried about accidental nonce reuse, there are AEAD schemes like AES-GCM-SIV to handle this exact case. If one does not want to store ciphertexts locally and would rather re-encrypt plaintexts on demand, one may either just store the nonce values alongside the expiration timestamp or derive them deterministically by applying an independently keyed pseudorandom function to the plaintext, expiration and blinded public key.

Speaking of AEAD schemes, with the internet moving away from giving users access to low-level cryptographic primitives, many libraries will primarily expose, review, and also optimize AEAD schemes. It does little harm to use these for encryption, even though integrity is already protected by a digital signature. The AD may also be used to bind the ciphertext to a particular blinded key.

To illustrate the full RRBlock creation, assume a layout such as (bit lengths in brackets, "varies" depends on zone_type or data length):

size(32) || zone_type(32) || expiration(64) || blind_zone_key(varies) || nonce(64) || bdata(varies) || aead_tag(128) || signature(varies)

Here I have left size and zone_type unchanged for compatibility with existing *KEY types; expiration has moved slightly to the front to have fixed-size key types in the header as much as possible.
Doing so for the nonce would make AD calculation and deterministic nonce calculation more cumbersome.

Creation steps:
1. Fill in size, zone_type and expiration
2. Do the KDF extract step above
3. Use the KDF expand to derive randomness for blinding zkey, and fill in blind_zone_key
4. Fill in the nonce randomly, using a stored value, or by applying a PRF to the parts already filled in
5. Use the KDF expand to derive a symmetric encryption key
6. Fill in bdata and aead_tag by encrypting RDATA with everything from size to blind_zone_key as AD and the chosen nonce
7. Fill in a signature over everything before it

If you need every last bit of speed in creating the RRBlock, keep a hash of everything you have filled in so far. You can pass this hash to the nonce-PRF, use it as an effective AD and have it ready for the signature. I do not think that this makes much of a difference, but some to-be-defined *KEY types might make use of such optimizations.
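The creation steps could be sketched roughly as follows in Python. This is an illustration under several assumptions, not a proposal for the wire format: `build_rrblock` and its parameters are hypothetical names, size is taken to be the total block length in bytes, integers are encoded big-endian, bdata is assumed to be as long as RDATA, and SHA-512 is an arbitrary hash choice. Because the standard library has neither key blinding, AEAD nor signatures, those three operations are passed in as callables. The info and salt strings are copied as they appear in the text above.

```python
import hashlib
import hmac
import os
import struct
from typing import Callable, Optional, Tuple

HASH = "sha512"  # assumption: hash choice for this sketch only


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def build_rrblock(
    zone_type: int,
    expiration: int,
    zkey: bytes,
    label: bytes,
    rdata: bytes,
    blind: Callable[[bytes, bytes], bytes],
    aead_encrypt: Callable[[bytes, bytes, bytes, bytes], Tuple[bytes, bytes]],
    sign: Callable[[bytes], bytes],
    sig_len: int,
    nonce: Optional[bytes] = None,
) -> bytes:
    # Step 2: extract blinding key material from zkey || label.
    bkm = hkdf_extract(b"gnsextract" + struct.pack(">I", zone_type), zkey + label)
    # Step 3: expand a blinding factor and blind the zone key.
    blind_zone_key = blind(zkey, hkdf_expand(bkm, b"gnsblindeddsa", 512 // 8))
    # Step 1: size is assumed to be the total block length in bytes.
    size = 4 + 4 + 8 + len(blind_zone_key) + 8 + len(rdata) + 16 + sig_len
    ad = struct.pack(">IIQ", size, zone_type, expiration) + blind_zone_key
    # Step 4: random 64-bit nonce unless a stored or derived one is supplied.
    if nonce is None:
        nonce = os.urandom(8)
    # Step 5: expand the symmetric encryption key.
    key = hkdf_expand(bkm, b"gnsencryptaesgcmkey", 256 // 8)
    # Step 6: encrypt RDATA with size..blind_zone_key as associated data.
    bdata, tag = aead_encrypt(key, nonce, rdata, ad)
    # Step 7: sign everything that precedes the signature.
    unsigned = ad + nonce + bdata + tag
    return unsigned + sign(unsigned)
```

A caller would supply real implementations for `blind`, `aead_encrypt` (returning the ciphertext and a 16-byte tag) and `sign`. The order of the steps in the code differs slightly from the list only because size has to be known before the AD can be assembled.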
Tags  lsd0001  
related to  0008915  new  Use libsodiums HKDF 

Correction: In the last sentence, the hash should of course also go over the plaintext before being passed to the nonce-PRF. This is usually not a problem, since you may duplicate the hash state.
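Duplicating the hash state could look like the following minimal Python sketch. The name `derive_nonce`, the use of HMAC-SHA-512 as the nonce-PRF, the truncation to 64 bits and the placeholder header bytes are all assumptions for illustration only.

```python
import hashlib
import hmac


def derive_nonce(running, plaintext: bytes, prf_key: bytes) -> bytes:
    """Derive a 64-bit nonce from a copy of the running hash plus the plaintext."""
    fork = running.copy()   # duplicate the state; `running` itself is not modified
    fork.update(plaintext)  # the fork additionally covers the plaintext
    return hmac.new(prf_key, fork.digest(), "sha512").digest()[:8]


# Running hash over the parts filled in so far (placeholder bytes, not a real header).
running = hashlib.sha512()
running.update(b"size|zone_type|expiration|blind_zone_key")
before = running.digest()

nonce = derive_nonce(running, b"example RDATA", b"independent PRF key")
```

After the call, `running` still reflects only the header fields, so it can go on to absorb the nonce, bdata and aead_tag for the signature.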

After thinking about it some more, the extraction phase above still attempts to derive too many bits from the same zkey. Doing something similar to the TLS 1.3 key schedule, one may derive bkm as

salt := "gnsextract" || zone_type
pbkm := HKDF-Extract(salt, zkey)
bkm := HKDF-Extract(pbkm, label)

Calling HKDF-Extract several times might seem strange at first, but if the first extraction yields a uniform key unknown to an adversary, the second extraction (which is HMAC internally) with pbkm acts as a pseudorandom function. On the other hand, if zkey (and thus pbkm) is known, the second extraction may still extract randomness from a high-entropy label distribution, which is the best we might hope for.

For implementation ease with HKDF libraries not distinguishing between extract and expand, one may also put an expansion between the extractions with an arbitrary (but fixed) info. This does not affect the security much, but it does potentially hurt performance.
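A minimal Python sketch of this two-stage extraction, followed by the two expansions for K and h from the description, might look as follows. The helper name `derive_bkm`, the 32-bit big-endian encoding of zone_type, the UTF-8 encoding of the label, the example inputs and the use of SHA-512 are assumptions; the salt and info strings are copied as they appear in the text.

```python
import hashlib
import hmac
import struct

HASH = "sha512"  # assumption: hash choice for this sketch only


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 Extract: PRK = HMAC(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 Expand: T(i) = HMAC(PRK, T(i-1) || info || i)."""
    if length > 255 * hashlib.new(HASH).digest_size:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_bkm(zone_type: int, zkey: bytes, label: str) -> bytes:
    salt = b"gnsextract" + struct.pack(">I", zone_type)
    pbkm = hkdf_extract(salt, zkey)                    # first extraction: zkey only
    return hkdf_extract(pbkm, label.encode("utf-8"))   # second extraction: label


# Example inputs: an all-zero 32-byte stand-in for zkey and an example zone type.
bkm = derive_bkm(65536, bytes(32), "www")
K = hkdf_expand(bkm, b"gnsencryptaesgcmkey", 256 // 8)  # symmetric key
h = hkdf_expand(bkm, b"gnsblindeddsa", 512 // 8)        # blinding factor
```

Here pbkm is used as the HMAC key of the second extraction, which is what lets it act as a pseudorandom function when zkey is unknown to the adversary.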