by Mike Hamburg (Ed448-Goldilocks)

Since Edwards’ discovery of a new elliptic curve form in 2007, implementors have produced a steady stream of implementations in this form. Edwards curves tend to be easier to implement securely than the previously used curve shapes, because they sup- port complete addition formulas. These formulas do not have exceptional cases which would lead to a division by zero. Edwards curves are also faster, and have marginally simpler formulas as well. However, curves in Edwards or twisted Edwards form always have a co-factor divisible by 4, so prime-order curves such as the NIST curves cannot be put in this form.

Here I detail the design of an Edwards curve with a 448-bit field. I hope that this curve will provide enough security to satisfy conservative users, but still be fast enough for those who are performance-conscious. It would therefore be useful as a more conservative supplement to Curve25519 and Ed25519.

Before going on, it is important to consider why a stronger elliptic curve would be desirable. In particular, why would anyone need a curve stronger than the existing 256-bit-field curves? These curves are said to require about as much work to break as AES-128 (e.g. Curve25519 has work factor $W \coloneqq \frac 1 2 \log q \approx 126$), but in strong attack models they may require more work than AES. Depending on the mode, symmetric encryption may be susceptible to multiple- target attacks, which could allow an attacker to recover the first key of n targets in time approximately $2^{128} / (n+1)$. For elliptic curves, batch attacks do not speed up the recovery of the first key, only of subsequent ones.

What sorts of attacks might break an elliptic curve with a conjectured security level near 128 bits? Here are several possibilities:

An attacker could use brute force. This is unlikely to be feasible for several decades at least, but designers might be concerned about it for very long-term security.

An attacker might build a quantum computer capable of running Shor’s algorithm. This would break every elliptic curve cryptosystem which could fit in that computer’s memory, so a larger curve would not be helpful.

A mathematical breakthrough might render all elliptic curve cryptography weak, or at least much weaker than expected. Depending on the breakthrough, a larger curve might resist attack due to its size, or it might fall along with the smaller ones.

A breakthrough might break only curves with special properties, such as complex multiplication or a certain field shape. Or it might break all curves which do not have those properties.

A protocol might have a loose security bound, and might allow an attacker to break it with only a tiny fraction of the work of solving the discrete log problem. This might enable an attack on curves which were previously out of reach, but only in that protocol.

Security bugs or side channels might compromise an implementation. The defense against this is simplicity, not field size.

Security architects might want to over-engineer a system, for marketing or just for extra confidence. Or having already done this, they might want to migrate from the NIST or Brainpool elliptic curves to an Edwards curve, without weak-ening their design.

Some of these issues can be mitigated by using a curve which is slightly bigger, such as Scott’s curve modulo $2^{336}-3$. Others favor a curve which is significantly larger, but it is difficult to evaluate exactly how large. If a significantly larger curve is chosen, the usual work factor estimate means almost nothing, since any attempt to attack such a curve would require a significant breakthrough to obsolete that estimate.

I decided to design a curve in the “significantly larger” or “overkill” level. The currently popular curves in this space include NIST and Brainpool curves at 384, 512 and 521 bits, so I aimed for something in this range. Since it is always possible to use a larger curve at the cost of worse performance, I aimed to find the best trade-off between performance and field size.

Having settled the curve shape and generation method, the remaining major choice is the field. Following Safecurves, I used a prime-order field. That leaves at least six families of desirable primes:

Random primes

Mersenne primes

Crandall primes $(2^k-c)$

Special Montgomery primes $(2^kc-1)$

Granger-Moss primes $(\Phi_n(k))$

Solinas primes $(2^k-2^l \pm \dotsm \pm 1)$

$2^{448} - 2^{224} - 1$

I chose the Solinas trinomial prime $p \coloneqq 2^{448} - 2^{224} - 1$. I call this the “Goldilocks” prime because its form defines the golden ratio $\phi \equiv 2^{224}$. Because $224 = 32 \cdot 7 = 28 \cdot 8 = 56 \cdot 4$ , this prime supports fast arithmetic in radix $2^{28}$ or $2^{32}$ (on 32-bit machines) or $2^{56}$ (on 64- bit machines). With 16, 28-bit limbs it works well on vector units such as NEON. Furthermore, $radix-2^{64}$ implementations are possible with greater efficiency than most of the NIST primes.

Karatsuba The main advantage of a golden-ratio prime is fast Karatsuba multiplication. Let $\phi = 2^{224}$ as above. Then:

$(a+b \phi) \cdot (c+d \phi) \newline = ac + (ad + bc) \phi + bd \phi^2 \newline \equiv (ac + bd) + (ad + bc + bd) \phi \space \space \space (\bmod \space p) \newline = (ac + bd) + ((a+b)(c+d)-ac) \phi$

I chose an untwisted Edwards curve, i.e. one of the form

$E_d: y^2+x^2=1+dx^2y^2$

To demonstrate that there is nothing up my sleeve, I chose d as small as possible in absolute value so that $E_d$ and its twist both have $4 \cdot prime$ order, and so that the order of the curve is less than $p$ . This last restriction was for ease of implementation, but it doesn’t matter because the least $d$ already gives a curve of order less than $p$.

The Goldilocks $d$ is $−39081$. The resulting curve satisfies the other Safecurves criteria, as shown on safecurves.cr.yp.to. In particular, because $d$ is not square in $Z / pZ$ , the strongly unified Edwards point addition formulas apply.

The key generation algorithm uses the signed all-bits-set combs algorithm. On 64-bit platforms it uses 15kiB of tables (5 comb tables, 5 teeth per comb, 3 coordinates per point, expanded to 32 bytes per coordinate). On 32-bit platforms it instead uses 12kiB of tables (8 combs, 4 teeth per comb).

The signature algorithm produces Schnorr signatures. It uses the same combset as the key generation algorithm.

The verification algorithm runs in variable time, using wNAF and a 6kiB precomputed table with multiples of the generator. Verification can in principle be batched, but I haven’t tested this.

ECDH uses the Montgomery ladder, which is marginally faster than an Edwards scalar multiplication because it doesn’t need to decompress the input point. The Montgomery ladder is enhanced to preserve sign information and to reject points on the twist, even though it is believed to be safe to allow them. It is therefore a drop-in replacement for an Edwards scalar multiplication.