EcGFp5: a Specialized Elliptic Curve

Thomas Pornin

2022 · NCC Group · eprint 2022/274

Disclaimer

This content was automatically converted from the original PDF and may have undergone post-processing. None of these steps have been reviewed or approved by the authors. Errors in formulas, definitions, proofs, or text may have been introduced during conversion. The authoritative version is the original paper on ePrint. Always cite and verify against the original publication.

Converted with: marker · 2026-02-14

Abstract

We present here the design and implementation of ecGFp5, an elliptic curve meant for a specific compute model in which operations modulo a given 64-bit prime are especially efficient. This model is primarily intended for running operations in a virtual machine that produces and verifies zero-knowledge STARK proofs. We describe here the choice of a secure curve, amenable to safe cryptographic operations such as digital signatures, that maps to such models, while still providing reasonable performance on general purpose computers.

1. EcGFp5 Definition

Let p = 2^{64} - 2^{32} + 1. This is a 64-bit prime integer; computations modulo p can be relatively efficiently implemented on a variety of platforms. It has high 2-adicity (p - 1 is a multiple of a large power of 2, here 2^{32}) which makes it convenient for STARK proofs [BBHR18]. 32-bit integer operations can also be expressed over \text{GF}(p) since, for instance, the product of two unsigned 32-bit integers is at most (2^{32} - 1)^2, which is lower than p. The Miden VM [Mid] is an open-source implementation of a virtual machine whose internal opcodes work over elements of \text{GF}(p) and can be used to generate and verify STARK proofs over arbitrary program executions.

The chosen curve has the following parameters. We first define the finite field extension \text{GF}(p^5) = \text{GF}(p)[z]/(z^5 - 3), i.e. the ring of polynomials (in the symbolic variable z) with coefficients in \text{GF}(p), and all operations performed modulo the irreducible polynomial z^5 - 3. Every element of \text{GF}(p^5) can be represented as five elements of \text{GF}(p), corresponding to the five coefficients of a polynomial of degree at most 4.

We define, over \text{GF}(p^5), an elliptic curve of equation:

y^2 = x(x^2 + 2x + 263z)

This is a double-odd curve [Por20b], with equation constants a = 2 and b = 263z; its order is 2n for the 319-bit prime integer:

n = 1067993516717146951041484916571792702745057740581727230159139685185762082554198619328292418486241

EcGFp5 is then formally defined as the group \mathbb{G} of points of that curve which are not points of n-torsion. The neutral element of the group is the point N = (0, 0) (the only point of order 2 on the curve). The sum in the group of two elements P and Q is defined as the curve point P + Q + N. As explained in [Por20b], this yields a group with the proper characteristics for defining cryptographic operations such as digital signatures or key exchange:

The group \mathbb{G} has prime order n.
Elements of \mathbb{G} can be uniquely encoded into a field element; the decoding process is unambiguous and inherently verifies that the provided encoding was valid and canonical.
Group operations can be computed with efficient complete formulas.

Several systems of coordinates can be used. In general, it is recommended to use (x, u) coordinates, in which u = x/y for element (x, y) (for the neutral N, we use u = 0). If both x and u are expressed as fractions (denoted X/Z and U/T, respectively), then general point addition formulas have a cost of 10 multiplications in the field (denoted 10M), and specialized formulas for sequences of doublings have a per-doubling cost of 2M + 5S. However, within the target compute model, it is more efficient to switch to affine Weierstrass coordinates and formulas.

In the next sections, the paper justifies the choice of a degree-5 field extension; describes the implementation of field and curve operations in the target compute model (called “in-VM”); formalizes the choice criteria for the curve parameters; and provides details on the out-of-VM implementation. A test implementation in Python and a reference implementation in Rust are provided at github.com/pornin/ecgfp5.

2. Choice of Field

Since p is a 64-bit integer, we need to work in a field extension \text{GF}(p^k) in order to have a field large enough to obtain a curve with adequate security. We aim at the usual “128-bit security” level. For such a level, we need a field with at least a 256-bit order, hence k \ge 4. Robustness of elliptic curve discrete logarithm in extension fields has been studied in various articles. A rough summary is the following:

If the extension degree k is composite, then Weil descent attacks may apply, using a tower of field extensions to turn the problem into a discrete logarithm in a higher genus curve on a smaller field [9, 3]. To avoid such issues, a prime degree is highly recommended. Diem showed that if k is prime and not lower than 11, then such attacks cannot work [Die03].
A related attack using Groebner bases was described by Gaudry [Gau09]; its complexity was further analyzed by Joux and Vitse, along with some possible variants [JV13].

For performance reasons, we would like to have k as small as possible; k = 11 would lead to a 704-bit field where computations would be too expensive. Using k = 4 would allow the known attacks on quartic extension fields [AMNS06], with complexity about O(p^{3/2}) \approx 2^{96}. Though this value is quite larger than what can practically be implemented, it still falls short of the expected “128-bit” level. Thus, we need at least k = 5.

With k = 5, Gaudry’s attack entails computing about p^{2 - 2/5} \approx 2^{102.4} systems of polynomial equations, and obtaining a Groebner basis for each of them. Each system would contain 5 equations with 5 unknowns, and a total degree 2^{k-1} = 16; obtaining the basis requires the FGLM algorithm [FGLM93] with complexity O(kD^3), where D is the degree of the underlying ideal, close to 2^{k(k-1)} = 2^{20}. This leads to a total theoretical complexity of at least 2^{142}, well beyond the target 128-bit level. Joux and Vitse’s variant has cost O(Cp^2) for some constant C that depends on k, again above the 128-bit level.

We can thus claim that a degree-5 extension field, \text{GF}(p^5), is sufficient to achieve 128-bit security against all known attacks on discrete logarithm on elliptic curves.

All finite fields with the same cardinality are isomorphic to each other; we can thus choose whatever definition of that field provides the best performance. The paper then discusses the choice of the irreducible polynomial for defining the extension. Among polynomials in \text{GF}(p)[z], none of the simpler candidates (z^5, z^5 \pm 1, z^5 \pm z^i \pm 1, z^5 \pm 2) is irreducible. The next best choices are z^5 - 3 and z^5 + 3, which are both irreducible; thus the paper chooses M = z^5 - 3.

3. In-VM Implementation

3.1 VM Opcodes

The paper assumes the following opcodes are offered by the target compute model, all with cost exactly 1 cycle:

add, sub, mul, div perform addition, subtraction, multiplication and division in \text{GF}(p). Division fails if the divisor is zero.
and, or, xor, not perform operations on Boolean values (1 for true, 0 for false). Opcodes eq and neq compare two \text{GF}(p) elements.
select applied on three values x, y, c returns x if c = 0, or y if c = 1.
add32, sub32, mul32, div32, shl32, shr32, gte32 implement operations on 32-bit values.

The compute model should be understood as a general circuit emulation in which only arithmetic gates have a cost, while data routing is free, provided that it can be resolved statically. Function calls, loop control, reading from memory and writing to memory are all free; however, data-dependent conditional jumps and array accesses at data-dependent indexes are forbidden (very expensive in Miden). The model is close to constant-time implementations, although for different reasons.

3.2 Field Operations

An element x of \text{GF}(p^5) is represented as five coefficients x_0 to x_4, such that x = x_0 + x_1 z + x_2 z^2 + x_3 z^3 + x_4 z^4. Addition and subtraction are done coefficient-wise; thus, an addition in \text{GF}(p^5) costs 5 add opcodes.

Multiplication. Multiplication in \text{GF}(p^5) (d \leftarrow a \cdot b) is done in a straightforward way:

d_0 \leftarrow a_0 b_0 + 3(a_1 b_4 + a_2 b_3 + a_3 b_2 + a_4 b_1)

d_1 \leftarrow a_0 b_1 + a_1 b_0 + 3(a_2 b_4 + a_3 b_3 + a_4 b_2)

d_2 \leftarrow a_0 b_2 + a_1 b_1 + a_2 b_0 + 3(a_3 b_4 + a_4 b_3)

d_3 \leftarrow a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0 + 3 a_4 b_4

d_4 \leftarrow a_0 b_4 + a_1 b_3 + a_2 b_2 + a_3 b_1 + a_4 b_0

The multiplications by 3 come from the reduction modulo z^5 - 3. Overall multiplication cost is 49 cycles. Squaring can be done with lower cost since symmetric products yield the same value when a = b; squaring cost is 34 cycles.

Inversion. Inversion in \text{GF}(p^5) uses the Itoh–Tsujii method [IT88]. Define r = 1 + p + p^2 + p^3 + p^4. Then:

p^5 - 1 = (p - 1)r

For any non-zero x in \text{GF}(p^5), we have x^r \in \text{GF}(p), so:

\frac{1}{x} = \frac{x^{r-1}}{x^r}

which is the product of x^{r-1} (in \text{GF}(p^5)) by the inverse of x^r (in \text{GF}(p)).

The Frobenius operator \phi_1(x) = x^p is a field automorphism. For our modulus p \equiv 1 \pmod{5}, it amounts to multiplying each coefficient by a precomputed constant with no reordering, at a cost of only 4 cycles. Using the Frobenius operator, x^{r-1} is computed with three Frobenius applications and two multiplications in \text{GF}(p^5). The overall cost of inversion in \text{GF}(p^5) is 128 cycles. If x = 0, the routine returns 0 (a feature that simplifies later operations).

Legendre Symbol. The Legendre symbol of x is x^{(p^5 - 1)/2}. Using r, this reduces to the Legendre symbol of x^r in \text{GF}(p). The overall cost is 186 cycles.

Square Root. The paper uses the Frobenius operator to speed up square roots by computing:

\sqrt{x} = \frac{\sqrt{x^r}}{x^{(r-1)/2}}

The square root in \text{GF}(p) is computed via a specialized Tonelli–Shanks algorithm that avoids data-dependent conditional jumps, at a cost of 659 cycles. The total cost of a square root in \text{GF}(p^5) is 3261 cycles.

Cost Summary. The following table summarizes in-VM costs for \text{GF}(p^5) operations:

Operation	Cost (cycles)
Addition	5
Subtraction	5
Multiplication	49
Squaring	34
Inversion	128
Division	177
Legendre symbol	186
Square root	3261

An important point is that inversions in \text{GF}(p^5) are quite inexpensive: the cost of an inversion is only about 2.57 times the cost of a multiplication. This is not the usual situation when dealing with elliptic curve implementations; it impacts the strategy for point addition formulas.

3.3 Curve Formulas

Since inversions in \text{GF}(p^5) are quite efficient (lower than three times a multiplication cost), the most efficient formulas for point additions and doublings are obtained by working with the short Weierstrass equation and affine coordinates. Any elliptic curve over \text{GF}(p^5) can be expressed as a short Weierstrass curve with a suitable change of variable; thus, no curve type will be any better or worse for in-VM computations.

Change of Variable. A double-odd curve such as ecGFp5 has equation y^2 = x(x^2 + ax + b). It can be converted to the short Weierstrass equation:

Y^2 = X^3 + AX + B

with constants:

A = (3b - a^2)/3, \quad B = a(2a^2 - 9b)/27

using the change of variable (X, Y) = (x + a/3,\; y). This is very inexpensive: a single constant addition to the x coordinate, costing 1 cycle.

Point Addition. The sum of two points (X_1, Y_1) and (X_2, Y_2) is the point (X_3, Y_3) with:

\lambda = \frac{Y_2 - Y_1}{X_2 - X_1}

X_3 = \lambda^2 - X_1 - X_2

Y_3 = \lambda(X_1 - X_3) - Y_1

These formulas are not complete; when X_1 = X_2, the doubling formula must be used instead:

\lambda = \frac{3X_1^2 + A}{2Y_1}

A complete routine that supports all cases (including point-at-infinity inputs and outputs) has a fixed cost of 387 cycles. This is about 7.9 times the cost of a single multiplication, faster than the best known inversion-free formulas.

Optimizations reduce the cost to 290 cycles when it is known a priori that the addition is not an edge case, and to 326 cycles for explicit point doublings.

Point Multiplication. Multiplication of a point P by a scalar v uses a windowed method with signed digits and window width w = 4. The overall cost of the point multiplication function, including scalar splitting and window building, was measured to be 138482 cycles.

Key Pair and Signature Generation. When multiplying the conventional generator point G, the window can be precomputed and multiple windows for precomputed multiples of G can be used conjointly to reduce the number of loop iterations and doublings.

Signature Verification. Verification of a Schnorr signature entails checking that dG + eQ = R for some scalars d and e. Straus’s algorithm [Str64] (often known as “Shamir’s trick”) can be used to mutualize point doublings, performing about 320 doublings total instead of 640.

4. Curve Parameters Selection

In-VM curve operations use a short Weierstrass equation and affine formulas. Any elliptic curve is amenable to such a representation; moreover, the specific values of the Weierstrass constants A and B have little to no incidence on in-VM performance. Thus, the in-VM compute model does not imply any constraint on the curve equation type, and we are free to select a curve type that favours out-of-VM performance while providing the required security characteristics.

For proper security in arbitrary protocols, we need a prime-order group with a unique, canonical and verifiable encoding [CJ19]. In practice, this restricts the choice to:

A curve with a prime order, using the generic short Weierstrass equation.
A double-odd curve [Por20b], with order 2n for a prime n.
A Montgomery or twisted Edwards curve, of order 4n or 8n for a prime n, along with the Decaf/Ristretto encoding process [10, 2].

Double-odd curves are chosen because they have better doubling formulas (2M + 5S per-doubling overhead) and simpler encoding/decoding than Decaf/Ristretto. With a = 2 (an element of \text{GF}(p)), b must be chosen outside of \text{GF}(p). A search process finds the first usable curve at b = 263z, yielding the curve whose parameters were given in Section 1.

The paper also verifies that the embedding degree is large. For ecGFp5, the embedding degree is e = (n-1)/5, a 317-bit integer close to the size of n itself, as expected of a randomly selected curve. The factorization of n - 1 is:

n - 1 = 2^5 \cdot 5 \cdot 163 \cdot 769 \cdot 1059871 \cdot 253243826720162431254857814100127 \cdot 198400523053184002814403536918162724916343842520561

5. Out-of-VM Implementation

EcGFp5 was designed to match the abilities of the target VM compute model, not any specific concrete CPU. Out-of-VM performance is expected to be lower than what is usually expected from fast elliptic curves with 128-bit security, for two reasons: operations in \text{GF}(p) have some overhead from the specific value of p, and \text{GF}(p^5) is a 320-bit field (1.25 times larger than 256-bit), giving a rough cost factor of 1.25^3 \approx 1.95.

The implementation is in Rust, constant-time and fully portable (no inline assembly, no architecture-specific intrinsics, no unsafe code). On an Intel i5-8259U “Coffee Lake” CPU:

Operation	Cost (cycles)
\text{GF}(p) multiplication	10.18
\text{GF}(p) inversion	737.39
\text{GF}(p^5) multiplication	94.03
\text{GF}(p^5) squaring	68.63
\text{GF}(p^5) inversion	1069.75
\text{GF}(p^5) square root	12410.38
ecGFp5 point addition	1328.50
ecGFp5 point doubling	985.37
ecGFp5 point multiplication	363168.03
ecGFp5 generator multiplication	109516.20
ecGFp5 mul+add verification	336952.88

5.1 Field Operations

The implementation uses Montgomery representation: an element x \in \text{GF}(p) is represented as 2^{64}x \bmod p, normalized in the [0; p-1] range. This is coupled with Montgomery multiplication, which given x and y computes xy / 2^{64} \bmod p. The cornerstone is the reduction function that takes a 128-bit input and returns x / 2^{64} \bmod p.

The paper provides a detailed Rust implementation of the Montgomery reduction function, exploiting the specific form of p = 2^{64} - 2^{32} + 1 to achieve reduction with minimal operations. On recent Intel x86 CPUs using the mulx opcode, Montgomery multiplication achieves about 10 cycles.

For multiplications and squarings in \text{GF}(p^5), Montgomery reductions are mutualized. The paper shows how the five products composing the low coefficient of a field product can be combined using partial reduction with the identity 2^{64} = 2^{32} - 1 \pmod{p}. For inversions, Legendre symbols and square roots, the in-VM methods still apply, but since divisions in \text{GF}(p) are much more expensive than multiplications out-of-VM, inversions in \text{GF}(p^5) cost about 11.4 times a multiplication—fast compared to prime fields, but not fast enough for affine coordinates.

5.2 Curve Operations

Using fractional (x, u) coordinates, the out-of-VM implementation achieves generic point addition in 10M and generic point doubling in 4M + 5S; optimizations for sequences of doublings make it worthwhile to organize operations accordingly.

Window optimizations similar to in-VM are used. Inversions in \text{GF}(p^5) are fast enough to allow normalization of window points to affine coordinates; mixed addition costs 8M instead of 10M. Montgomery’s trick is used to batch-invert the window with a single inversion and 3(2t - 1) multiplications. On the test x86 system, 5-bit windows seem slightly better.

For generator multiplication, eight precomputed windows are used (for G, 2^{40}G, 2^{80}G, \ldots). For signature verification, the Antipa et al. optimization [Ant+05] with Lagrange’s lattice reduction algorithm [Por20a] is applied; the lattice reduction runs in about 20400 cycles on average. Verification is not required to be constant-time, so direct array accesses are used for lookups, but w-NAF is not used because regular addition schedules favour long sequences of sequential doublings.

6. Conclusion

We presented an elliptic curve designed for a specific compute model. Although we use the Miden VM as a representative of that model, we expect this curve to be generally useful for other projects related to zero-knowledge proofs; curve design and implementation is also an interesting problem in its own right. As a general-purpose curve, ecGFp5 performance is not on par with the fastest standard curves (e.g. Curve25519), but is still decent enough: a single core on a laptop computer or a smartphone can generate or verify thousands of signatures per second.

Acknowledgements

We thank Bobbin Threadbare and Hamish Ivey-Law for providing the target context and useful discussions on optimized implementations in \text{GF}(p^5), and Pierrick Gaudry for pointers and explanations on the fine details of curve attacks in extension fields. Paul Bottinelli, Marie-Sarah Lacharité, Giacomo Pope and Javed Samuel reviewed this manuscript.

References

[Ant+05] A. Antipa, D. Brown, R. Gallant, R. Lambert, R. Struik and S. Vanstone, Accelerated Verification of ECDSA signatures, Selected Areas in Cryptography – SAC 2005, LNCS vol 3897, pp. 307–318, 2005.
[ALV] T. Arcieri, I. Lovecruft and H. de Valence, The Ristretto Group, ristretto.group.
[AMNS06] S. Arita, K. Matsuo, K. Nagao and M. Shimura, A Weil Descent Attack against Elliptic Curve Cryptosystems over Quartic Extension Fields, IEICE Transactions, vol. E89-A, issue 5, 2006.
[BBHR18] E. Ben-Sasson, I. Bentov, Y. Horesh and M. Riabzev, Scalable, transparent, and post-quantum secure computational integrity, eprint.iacr.org/2018/046.
[CJ19] C. Cremers and D. Jackson, Prime, Order Please!, IEEE 32nd CSF, 2019.
[Die03] C. Diem, The GHS attack in odd characteristic, J. Ramanujan Math. Soc., vol. 18, issue 1, 2003.
[FGLM93] J.-C. Faugère, P. Gianni, D. Lazard and T. Mora, Efficient Computation of Zero-dimensional Groebner Bases by Change of Ordering, J. Symbolic Computation, vol. 16, issue 4, pp. 329–344, 1993.
[Gau09] P. Gaudry, Index calculus for abelian varieties of small dimension and the ECDLP, J. Symbolic Computation, vol. 44, issue 12, pp. 1690–1702, 2009.
[GHS02] P. Gaudry, F. Hess and N. Smart, Constructive and destructive facets of Weil descent on elliptic curves, J. Cryptology, vol. 15, issue 1, pp. 19–46, 2002.
[Ham15] M. Hamburg, Decaf: Eliminating cofactors through point compression, CRYPTO 2015, LNCS vol. 9215, pp. 705–723, 2015.
[IT88] T. Itoh and S. Tsujii, A Fast Algorithm for Computing Multiplicative Inverses in \text{GF}(2^m) Using Normal Bases, Information and Computation, vol. 78, pp. 171–177, 1988.
[JV13] A. Joux and V. Vitse, Elliptic curve discrete logarithm problem over small degree extension fields, J. Cryptology, vol. 26, issue 1, pp. 119–143, 2013.
[Mid] Polygon Miden, github.com/maticnetwork/miden.
[Mid22] Miden Assembly, version 0.2, accessed on 2022-02-21, hackmd.io/YDbjUVHTRn64F4LPelC-NA.
[Por20a] T. Pornin, Optimized Lattice Basis Reduction In Dimension 2, and Fast Schnorr and EdDSA Signature Verification, eprint.iacr.org/2020/454.
[Por20b] T. Pornin, Double-Odd Elliptic Curves, eprint.iacr.org/2020/1558.
[RCB16] J. Renes, C. Costello and L. Batina, Complete addition formulas for prime order elliptic curves, Eurocrypt 2016, LNCS vol. 9665, pp. 403–428, 2016.
[Str64] E. Straus, Addition chains of vectors (problem 5125), American Mathematical Monthly, vol. 70, pp. 806–808, 1964.

History

2026-02-17Add disclaimer: content not author-approved, eprint is authoritative6638546
2026-02-16Add CONVERTED_DATE to existing 47 paper pages7191c14
2026-02-16Add crawler metadata to all 47 paper pagesc6638f2
2026-02-16Convert numeric citations to BibTeX-style keys across all papers71c86d3
2026-02-14Add 36 new paper pages and update papers index6e99f38