Common Multi-Party Computation Pitfalls

A collection of common mistakes made when implementing MPC protocols, intended to help developers build MPC systems securely.

Input Validation

In MPC protocols, parties exchange data encoded as bitstrings representing mathematical objects such as field elements, group elements, commitments and proofs. A corrupted party may supply anything, so the receiver must verify that each incoming value has the expected shape, decodes to a valid object of the expected algebraic type, and lies in the required domain. The pitfalls below arise when one of these checks is omitted, applied only to the encoding, or performed in the wrong algebraic domain.

Party Share Indices Not Validated as Non-Zero and Distinct

What can go wrong. Many MPC protocols build on Shamir secret sharing, a $(t, n)$-threshold scheme that recovers a secret $s = f(0)$ from $t$ shares of a sharing polynomial $f(x) = s + \sum_{i=1}^{t-1} a_i x^i$ over $\mathbb{Z}_q$, with coefficients $a_i$ drawn uniformly at random. Each party $P_i$ holds the share $(x_i, y_i = f(x_i))$ with index $x_i = i$, and any $t$ parties can reconstruct via $s = \sum_{j} y_j \, l_j(0)$ with Lagrange basis $l_j(0) = \prod_{k \ne j} \frac{x_k}{x_k - x_j}$. Both the index $x_i$ and the share value $y_i$ live in $\mathbb{Z}_q$, so every implementation must reduce them modulo $q$ before use. Two related failures arise when this reduction is skipped at the input boundary. First, if a party can choose its own index and the implementation rejects only the integer $0 \in \mathbb{Z}$, an attacker submitting $x_i = q$ (or any multiple $k \cdot q$) passes the check while evaluatePolynomial(q) ≡ evaluatePolynomial(0) = f(0) = secret, handing it the secret directly. Second, the Lagrange basis denominator $x_k - x_j$ vanishes modulo $q$ whenever two reconstruction indices coincide mod $q$, whether as the same raw integer (a naïve duplicate) or as a malicious $x_k' = x_j + q$ (distinct as big.Int, congruent in $\mathbb{Z}_q$); the subsequent modular inverse is undefined.
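
The zero-index failure is pure modular arithmetic, so it can be demonstrated in a few lines. This is a toy sketch (the prime, variable names, and `share` helper are illustrative, not tss-lib code): any index congruent to $0 \bmod q$ evaluates the sharing polynomial at its constant term, the secret.

```python
import secrets

q = 2**127 - 1  # a prime modulus (toy choice)

def share(secret, index, coeffs):
    # Evaluate f(index) = secret + a1*index + a2*index^2 + ... mod q
    acc, xpow = secret, 1
    for a in coeffs:
        xpow = (xpow * index) % q
        acc = (acc + a * xpow) % q
    return acc

secret = secrets.randbelow(q)
coeffs = [secrets.randbelow(q) for _ in range(2)]  # degree-2 polynomial, t = 3

assert q != 0                              # the naive "index != 0" check passes,
assert share(secret, q, coeffs) == secret  # yet f(q) = f(0) = secret mod q
```

Because the first polynomial term multiplies the index by itself mod $q$, every power of $q$ vanishes and only the constant term survives.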

Security implication. A party whose index reduces to $0 \bmod q$ is handed $f(0)$, the shared secret itself: the dealer evaluates the sharing polynomial at the attacker’s index and returns the result as normal. In a DKG, where every party deals a contribution, the attacker collects $f(0)$ from each dealer and reconstructs the full private key with no further interaction. The duplicate failure splits into two outcomes. In availability terms, reconstruction crashes with a nil-pointer dereference (Go’s ModInverse returns nil for a non-invertible input) or throws an unrecoverable error, DoS-ing the signing ceremony. In integrity terms, some implementations silently skip the offending term or substitute a default, producing an incorrect reconstruction the caller accepts as valid.

How to avoid. Validate indices at the protocol’s share-ingestion boundary: reduce each index modulo $q$, reject zero, and verify pairwise distinctness in a single pass.

Example: tss-lib Shamir validation (Trail of Bits Shamir disclosure & PR #149). Both failures appear in bnb-chain/tss-lib’s crypto/vss/feldman_vss.go and were disclosed together by Trail of Bits in December 2021. They were fixed in a single PR (shareid-security, merge commit c26beac, December 17 2021).

Failure 1: zero index mod $q$. Before the fix, Create checked the party index against the integer literal 0 without reducing modulo $q$ first (source):

// crypto/vss/feldman_vss.go, bnb-chain/tss-lib (vulnerable, pre-PR #149)
for i := 0; i < num; i++ {
    if indexes[i].Cmp(big.NewInt(0)) == 0 {
        return nil, nil, fmt.Errorf("party index should not be 0")
    }
    // indexes[i] == q passes the check; evaluatePolynomial(q) ≡ f(0) = secret
    share := evaluatePolynomial(ec, threshold, poly, indexes[i])
    shares[i] = &Share{Threshold: threshold, ID: indexes[i], Share: share}
}

A malicious party submits index $i = q$. The literal-zero check passes, but evaluatePolynomial(q) ≡ evaluatePolynomial(0) = f(0) = s, handing the attacker the shared secret as their share.

Failure 2: duplicate indices mod $q$. The same file’s ReConstruct performs Lagrange interpolation by inverting the index difference $x_j - x_k$ via ModInverse (source):

// crypto/vss/feldman_vss.go, bnb-chain/tss-lib (Lagrange step in ReConstruct)
sub := modN.Sub(xs[j], share.ID)
subInv := modN.ModInverse(sub)         // nil if sub ≡ 0 mod q
div := modN.Mul(xs[j], subInv)         // nil-pointer dereference
times = modN.Mul(times, div)

A malicious party submits $x_j = x_k + q$ for some honest party $k$. The raw integers differ, so any non-modular != check passes; modular reduction makes $x_j \equiv x_k$, sub is zero, ModInverse returns nil, and the next operation panics, DoS-ing the signing ceremony.
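
The duplicate-index failure can likewise be checked with plain arithmetic. A toy sketch (names are illustrative, not tss-lib's): two indices that differ as raw integers but coincide mod $q$ survive a non-modular inequality check, yet make the Lagrange denominator non-invertible.

```python
q = 2**127 - 1         # a prime modulus (toy choice)
x_k = 42
x_j = x_k + q          # distinct as raw integers, congruent mod q

assert x_j != x_k      # a non-modular inequality check passes
sub = (x_k - x_j) % q  # the Lagrange denominator, reduced mod q
assert sub == 0        # ...but the denominator vanishes
try:
    pow(sub, -1, q)    # the modular inverse Go's ModInverse would attempt
except ValueError:
    print("no modular inverse: reconstruction aborts")
```

Python raises ValueError here; Go's big.Int.ModInverse instead returns nil, which the next arithmetic call dereferences.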

Unified fix: CheckIndexes. PR #149 added a single validation helper called at the start of Create. It reduces each index modulo $q$, rejects zero, and rejects duplicates in one pass (source):

// crypto/vss/feldman_vss.go, bnb-chain/tss-lib (fixed, PR #149)
func CheckIndexes(ec elliptic.Curve, indexes []*big.Int) ([]*big.Int, error) {
    visited := make(map[string]struct{})
    for _, v := range indexes {
        vMod := new(big.Int).Mod(v, ec.Params().N)
        if vMod.Cmp(zero) == 0 {
            return nil, errors.New("party index should not be 0")
        }
        vModStr := vMod.String()
        if _, ok := visited[vModStr]; ok {
            return nil, fmt.Errorf("duplicate indexes %s", vModStr)
        }
        visited[vModStr] = struct{}{}
    }
    return indexes, nil
}

Received Sequence Has the Wrong Length

What can go wrong. MPC protocols often handle sequences with an expected length, such as Feldman VSS commitment vectors of length $t$, lists of $n-1$ peer signatures, or vectors of DLN proof iterations. Each carries a protocol-specified length that the verifier must check before using the sequence. Accepting a sequence with an unexpected shape is functionally equivalent to running a different protocol instance from the one the verifier thought it was in. The same bug also appears at the lower bound: an empty proof, signature, or participant list can make a verification loop execute zero times and return success vacuously unless the expected length is checked first.

Security implication. In the context of DKG (Distributed Key Generation), a malicious party can send a Feldman VSS commitment vector of length $t + k$ while the protocol-specified length is $t$. Honest verifiers iterate over all $t + k$ elements without noticing the mismatch, surreptitiously raising the reconstruction threshold from $t$ to $t + k$ and leaving the shared key irrecoverable from the $t$ honest shares alone, unless the DKG is restarted from scratch.

How to avoid. Each party must compare the received vector length against the protocol-specified length before using the vector and abort the protocol on any length mismatch.
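
A minimal sketch of this boundary check (the function name and error text are illustrative): the comparison rejects both oversized and undersized sequences before any element is touched.

```python
def validate_commitment_vector(commitments, threshold):
    # Compare against the protocol-specified length before any element is used.
    # This rejects oversized vectors (threshold-raise) as well as empty or short
    # ones (which would make a verification loop pass vacuously).
    if len(commitments) != threshold:
        raise ValueError(
            f"expected {threshold} commitments, got {len(commitments)}")
    return commitments
```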

Example: WSTS threshold-raise via oversized polynomial (Issue #87 & PR #88).
WSTS (Weighted Schnorr Threshold Signatures), aka WileyProofs, is based on FROST and was vulnerable to a threshold-raise attack. Before PR #88, the per-signer DKG verification in src/v1.rs only verified each commitment’s Schnorr proof of knowledge, not the commitment-vector length (source):

// src/v1.rs — Trust-Machines/wsts (vulnerable, before PR #88)
if !comm.verify() {
    bad_ids.push(*i);
}
self.group_key += comm.poly[0];

A malicious signer could append extra commitments to its poly to silently raise the reconstruction threshold. The Trail of Bits length-check fix landed in Trust-Machines/wsts as PR #88 (“Check length of polynomials”, merged Oct 1, 2024), seven months after the disclosure. It added an explicit equality check at every DKG verification site (source):

// src/v1.rs — Trust-Machines/wsts (fixed, PR #88)
if comm.poly.len() != threshold || !comm.verify() {
    bad_ids.push(*i);
} else {
    self.group_key += comm.poly[0];
}

Context Binding

Zero-knowledge proofs, commitments, and signatures are important building blocks of MPC protocols, especially in threshold cryptography, which is a major category of MPC. An adversary can try to replay or transplant such artifacts from one context into another: across separate runs of the protocol (sequential or concurrent), or within a single execution (e.g. across rounds, or claiming another party’s message as its own). To prevent this, cryptographic artifacts (transcripts, commitments, signed messages) must bind uniquely to their execution context (session, parties, role, statement), so that witnesses, openings, and proofs cannot be reused across contexts.

Challenge Hash Missing Prover’s Party Identity and Session Identifier

What can go wrong. In the Fiat-Shamir transformation, the verifier’s challenge is replaced by a challenge hash that, in the single-prover, single-session case, depends only on the public statement and the prover’s commitment. In a multi-prover or multi-session setting this is not enough, and the hash must also bind to the prover’s party identifier (pid) and to the session identifier (ssid). If the pid is missing, nothing in the hash input identifies which prover computed it, so honest $P_i$ and malicious $P_m$ obtain the same challenge on the same statement and commitment within a single session. A proof $\pi_i$ produced by $P_i$ can then be replayed verbatim by $P_m$, who claims knowledge of the underlying witness without ever holding it. If the ssid is missing, the hash produces the same challenge value across every session running the same statement. Two invocations of the proof, one in key-generation session $A$ and another in signing session $B$, differ only in the surrounding protocol context, which the hash does not see. The proof bytes from session $A$ therefore remain structurally valid in session $B$, allowing replay across sessions.

Security implication. In a DKG (Distributed Key Generation) protocol, a malicious party $P_m$ can adaptively choose its public key to match an honest party $P_i$’s ($X_m = X_i$), record $P_i$’s Schnorr proof, and submit it as its own round contribution, passing the proof-of-knowledge check without holding any secret. The same proof can also be replayed in later sessions.

How to avoid. Include the prover’s party identifier (pid: a public key or protocol-assigned role) in every FS challenge hash, and derive a session identifier ssid from every public parameter of the current run. In practice many libraries fold the party identifier into the ssid derivation (the participant set is included in the ssid).
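
A sketch of a context-bound challenge hash (the function name, field order, and encoding are illustrative, not any library's wire format): each input is length-prefixed so field boundaries are unambiguous, and changing either the ssid or the pid changes the challenge.

```python
import hashlib

def fs_challenge(ssid: bytes, pid: bytes, statement: bytes,
                 commitment: bytes, q: int) -> int:
    h = hashlib.sha512()
    for part in (ssid, pid, statement, commitment):
        h.update(len(part).to_bytes(8, "big"))  # length-prefix each field to
        h.update(part)                          # rule out concatenation ambiguity
    return int.from_bytes(h.digest(), "big") % q

q = 2**255 - 19
# Same statement and commitment: a different prover or session changes the challenge.
assert fs_challenge(b"ssid-A", b"P1", b"X", b"R", q) != \
       fs_challenge(b"ssid-A", b"P2", b"X", b"R", q)
assert fs_challenge(b"ssid-A", b"P1", b"X", b"R", q) != \
       fs_challenge(b"ssid-B", b"P1", b"X", b"R", q)
```

Production code should map the digest into $\mathbb{Z}_q$ by rejection sampling (as tss-lib's RejectionSample does) rather than naive reduction, which introduces a small bias.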

Example: Schnorr PoK in bnb-chain/tss-lib (CVE-2022-47930, PR #256, commit 1a14f3ac). The Schnorr PoK in bnb-chain/tss-lib lets party $P_i$ prove knowledge of its secret key share $x_i$ by sending $(R = g^k, s = k + c \cdot x_i)$, where $c$ is a Fiat-Shamir challenge. In v1.x the challenge was derived solely from the public key and the commitment (source):

// FILE: crypto/schnorr/schnorr_proof.go — bnb-chain/tss-lib v1.3.5 (vulnerable)

// NewZKProof constructs a new Schnorr ZK proof of knowledge of the discrete logarithm (GG18Spec Fig. 16)
func NewZKProof(x *big.Int, X *crypto.ECPoint) (*ZKProof, error) {
    if x == nil || X == nil || !X.ValidateBasic() {
        return nil, errors.New("ZKProof constructor received nil or invalid value(s)")
    }
    ec := X.Curve()
    ecParams := ec.Params()
    q := ecParams.N
    g := crypto.NewECPointNoCurveCheck(ec, ecParams.Gx, ecParams.Gy) // already on the curve.

    a := common.GetRandomPositiveInt(q)
    alpha := crypto.ScalarBaseMult(ec, a)

    var c *big.Int
    {
        // Challenge includes only public key X and commitment alpha — no session ID,
        // no party identity, no protocol context.
        cHash := common.SHA512_256i(X.X(), X.Y(), g.X(), g.Y(), alpha.X(), alpha.Y())
        c = common.RejectionSample(q, cHash)
    }
    t := new(big.Int).Mul(c, x)
    t = common.ModInt(q).Add(a, t)

    return &ZKProof{Alpha: alpha, T: t}, nil
}

As described in CVE-2022-47930, the Schnorr proof of knowledge does not utilize a session id, context, or random nonce in the generation of the challenge, allowing a malicious party to replay a proof generated by an honest party. The fix (PR #256, commit 1a14f3ac, merged August 23, 2023) added a Session []byte parameter, prepended to every proof challenge via the domain-separating SHA512_256i_TAGGED (source):

// FILE: crypto/schnorr/schnorr_proof.go — bnb-chain/tss-lib v2.0.0 (fixed)

// NewZKProof constructs a new Schnorr ZK proof of knowledge of the discrete logarithm (GG18Spec Fig. 16)
func NewZKProof(Session []byte, x *big.Int, X *crypto.ECPoint) (*ZKProof, error) {
    if x == nil || X == nil || !X.ValidateBasic() {
        return nil, errors.New("ZKProof constructor received nil or invalid value(s)")
    }
    ec := X.Curve()
    ecParams := ec.Params()
    q := ecParams.N
    g := crypto.NewECPointNoCurveCheck(ec, ecParams.Gx, ecParams.Gy) // already on the curve.

    a := common.GetRandomPositiveInt(q)
    alpha := crypto.ScalarBaseMult(ec, a)

    var c *big.Int
    {
        // Session is prepended via the domain-separating tagged hash, binding the
        // challenge to the protocol session (and, by convention, the participant set).
        cHash := common.SHA512_256i_TAGGED(Session, X.X(), X.Y(), g.X(), g.Y(), alpha.X(), alpha.Y())
        c = common.RejectionSample(q, cHash)
    }
    t := new(big.Int).Mul(c, x)
    t = common.ModInt(q).Add(a, t)

    return &ZKProof{Alpha: alpha, T: t}, nil
}

Challenge Transcript Missing Required Values (Weak Fiat-Shamir)

What can go wrong. In the Fiat-Shamir transformation, the verifier’s challenge is replaced by a hash. Soundness requires that the challenge $c$ be the hash of every value the verifier’s equation depends on: the public statement, the prover’s first-message commitment(s), and any auxiliary values that appear in the verification relation. Missing any of these lets the prover choose the omitted value after seeing the challenge, enabling forgery. The Aumasson–Shlomovits weak-FS analysis catalogues several such variants across threshold-wallet implementations.

Security implication. Depending on what is missing: (i) missing the public statement makes the proof valid for any statement with the same structural shape (a cross-statement replay); (ii) missing a commitment lets the prover pick a response first and solve for a consistent commitment backwards, producing a proof with no valid witness; (iii) missing a verification-equation input frees the prover to construct a value that satisfies the omitted constraint post hoc. In every case the verifier accepts a proof that no honest prover could have produced.
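
Variant (ii) can be demonstrated end to end in a few lines. This is a toy model, using a multiplicative group mod a prime in place of an elliptic-curve group; the parameters and the deliberately broken `challenge` function are illustrative, not taken from any library.

```python
import hashlib

p = 2**255 - 19   # a prime; Z_p^* stands in for the real group (toy parameters)
g = 5

def challenge(X):
    # BROKEN Fiat-Shamir: hashes the statement X but NOT the commitment R.
    return int.from_bytes(hashlib.sha256(str(X).encode()).digest(), "big")

X = pow(g, 123456789, p)  # public statement; the forger never uses the exponent

# Forgery: fix the response s first, then solve backwards for a commitment R.
s = 42
c = challenge(X)
R = pow(g, s, p) * pow(X, -c, p) % p  # R = g^s * X^(-c)

# The verifier's equation g^s == R * X^c (mod p) accepts this witness-free proof.
assert pow(g, s, p) == R * pow(X, c, p) % p
```

Had $R$ been hashed into $c$, solving for $R$ after choosing $s$ would change $c$ and break the equation; omitting it removes exactly that constraint.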

How to avoid. When implementing an FS transform, enumerate every value that appears in the verification equation (public statement, all first-round commitments, all auxiliary public inputs) and hash all of them into the challenge. Prepend a constant-length domain-separation tag identifying the specific proof type to prevent cross-proof-type substitutions.

Example: tss-lib ProofBobWC missing u in hash (Issue #42, PR #43). The MtA “Bob-with-check” range proof in bnb-chain/tss-lib involves a commitment $u = g^\alpha$ to the prover’s randomness. Pre-fix, the FS hash omitted u (source):

// crypto/mta/proof.go — bnb-chain/tss-lib (pre-PR #43, vulnerable)
// u is computed but NOT included in the challenge hash:
eHash = common.SHA512_256i(
    append(pk.AsInts(), X.X(), X.Y(), c1, c2, z, zPrm, t, v, w)...
    // MISSING: u.X(), u.Y() — the EC commitment to the witness randomness
)

Because $u$ is absent, the challenge $e$ is independent of the prover’s randomness commitment. A malicious party fixes a desired response, recomputes the challenge on values of its choosing, and solves for a consistent $u$ after the fact, forging a valid-looking proof without a witness.

The fix (PR #43, merged September 11, 2019) added u.X(), u.Y() to the hash input:

// Fixed: u (the EC commitment to witness randomness) is now in the hash
eHash = common.SHA512_256i(
    append(pk.AsInts(), X.X(), X.Y(), c1, c2, u.X(), u.Y(), z, zPrm, t, v, w)...
)

Missing Domain Separator Across Signing Contexts

What can go wrong. When the same signing key is used in multiple protocol roles, signing round-1 commitments vs round-2 packages in a DKG, authenticating API requests vs producing blockchain transactions, or tagging message types in a single protocol, each role must bind its messages to a unique domain-separation tag. If the tag is missing or identical across roles, a signature produced for one role is structurally valid for the other: the same bytes verify against the same key in both contexts. The tag can live at the signing primitive itself (a context string mixed into the hash, such as RFC 8032’s Ed25519ctx) or at the protocol layer (a per-method or per-key purpose marker that gates which API entry-point a key can serve).

Security implication. A malicious party who obtains a signature in role $A$ presents the same bytes as if they had been produced for role $B$. In an MPC threshold network that exposes both a generic sign() method and a specialized verify_foreign_transaction() method against the same distributed key, a bridge that calls verify_foreign_transaction() to confirm that a foreign-chain transaction was attested by the threshold network can be defeated by a caller who submits the same payload to sign() instead: the MPC network produces a valid threshold signature (since sign() is willing to sign arbitrary bytes), and the attacker replays the resulting signature into the bridge as evidence of a verified foreign transaction. The bridge has no way to tell the two apart: both signatures verify under the same threshold public key over the same bytes.

How to avoid. Bind every signature to its protocol role. Two complementary points of enforcement:

  • Primitive-level domain separation. Prepend a unique, version-bearing tag to the message before signing. For Ed25519, use RFC 8032’s Ed25519ctx with a non-empty context per role; for Schnorr or generic hash-then-sign, hash tag || message rather than message alone. Rotate tags when the protocol version changes so old-version signatures do not retroactively validate under a new role.
  • Protocol-level domain separation. Tag each distributed key with the purpose it is allowed to serve, and reject at the API entry-point any request that targets a key whose purpose does not match the call.
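
The primitive-level point can be sketched with a tagged hash (the tag strings and function name are illustrative): the length prefix on the tag guarantees that a digest produced under one role can never equal a digest of the same payload under another role, or of a shifted tag/message split.

```python
import hashlib

def tagged_digest(tag: bytes, message: bytes) -> bytes:
    # Length-prefix the tag so "a" + "bc" can never collide with "ab" + "c";
    # then hash tag || message rather than message alone.
    return hashlib.sha256(len(tag).to_bytes(1, "big") + tag + message).digest()

payload = b"transfer 100 tokens"
assert tagged_digest(b"myproto/v1/foreign-tx", payload) != \
       tagged_digest(b"myproto/v1/generic-sign", payload)
```

Signing `tagged_digest(role_tag, message)` instead of `message` makes a role-$A$ signature structurally invalid in role $B$, because the two roles never sign the same digest.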

Example: NEAR MPC DomainPurpose tagging (issue #2076, PR #2163). The NEAR MPC node exposes a threshold key to three different methods on the contract: sign() for arbitrary user-supplied payloads, verify_foreign_transaction() for foreign-chain (Bitcoin, Ethereum) transaction attestation used by bridges, and request_app_private_key() for confidential key derivation (CKD). All three call paths can route to the same set of distributed keys. Before the fix, the contract enforced only that the curve matched the call: any Secp256k1 key could back either sign() or verify_foreign_transaction(). A caller could therefore submit a foreign-chain transaction payload to the generic sign() method, collect a threshold signature, and then replay it to a bridge calling verify_foreign_transaction() against the same key; the bridge would accept the signature as proof that the foreign transaction had been attested.

The fix (PR #2163, merged February 19, 2026) introduces an explicit per-domain DomainPurpose enum:

// FILE: crates/contract-interface/src/types/state.rs — near/mpc (after PR #2163)
pub enum DomainPurpose {
    /// Domain is used by `sign()`.
    Sign,
    /// Domain is used by `verify_foreign_transaction()`.
    ForeignTx,
    /// Domain is used by `request_app_private_key()` (Confidential Key Derivation).
    CKD,
}

pub struct DomainConfig {
    pub id: DomainId,
    pub scheme: SignatureScheme,
    pub purpose: Option<DomainPurpose>, // new: purpose tag per domain
}

Each contract entry-point now requires the target domain to carry the matching purpose (crates/contract/src/lib.rs):

// FILE: crates/contract/src/lib.rs — near/mpc (after PR #2163)

// in sign(...)
if domain_config.purpose != Some(DomainPurpose::Sign) {
    env::panic_str(
        &InvalidParameters::WrongDomainPurpose { /* ... */ }
            .message("sign() may only target domains with purpose Sign")
            .to_string(),
    );
}

// in verify_foreign_transaction(...)
if domain_config.purpose != Some(DomainPurpose::ForeignTx) {
    env::panic_str(
        &InvalidParameters::WrongDomainPurpose { /* ... */ }
            .message("verify_foreign_transaction() requires a domain with purpose ForeignTx")
            .to_string(),
    );
}

// in request_app_private_key(...)
if domain_config.purpose != Some(DomainPurpose::CKD) {
    env::panic_str(
        &InvalidParameters::WrongDomainPurpose { /* ... */ }
            .message("request_app_private_key() may only target domains with purpose CKD")
            .to_string(),
    );
}

Rushing Adversary Copies an Honest Commitment

What can go wrong. In a commit-and-reveal protocol, each party sends a commitment during round 1 and opens it during round 2. If the commitment scheme does not bind each commitment to the identity of its opener (for example, by hashing in the party’s ID and session ID), a rushing adversary, one who observes honest parties’ messages before sending its own in the same round, can copy an honest party’s commitment byte-for-byte, then copy the opening during the reveal phase. Both parties end up revealing the same value.

Security implication. Consider a Blum coinflip: Alice and Bob commit to random bits $v_A, v_B$ and open to produce $v = v_A \oplus v_B$. A corrupt Bob who copies Alice’s commitment, then copies her opening, makes $v_B = v_A$, so the output is always $v_A \oplus v_A = 0$, the coin no longer flips. The same pattern breaks the SPDZ MAC-check sub-protocol in two-party settings: when parties commit to their $z_i$ shares and an honest $P_1$’s commitment is copied, the reconstructed $z = z_1 + z_1 = 0$ and the MAC check passes for any opened value $a'$, defeating the integrity guarantee on every wire of the circuit.

How to avoid. Bind every commitment to its opener’s identity (and to the session). Two standard constructions:

  • Hash-based commitment with opener ID and session ID: $c_i = H(\text{pid}_i \,\|\, \text{ssid} \,\|\, v_i \,\|\, r_i)$. A copied commitment has the wrong pid and cannot be reopened consistently.
  • Signed commitment: attach a signature over the commitment with a key uniquely tied to the opener; a copied commitment fails signature verification.

Either construction prevents the rushing-adversary copy because the opener’s identity is now part of what the commitment binds to.
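
A sketch of the first construction (function names and the field encoding are illustrative; production code should length-prefix each field rather than concatenate): the opener's pid is part of the hash preimage, so a copied commitment cannot be reopened under the copier's identity.

```python
import hashlib
import secrets

def commit(pid: bytes, ssid: bytes, value: bytes):
    # c = H(pid || ssid || value || r); fields should be length-prefixed
    # in production to rule out concatenation ambiguity.
    r = secrets.token_bytes(32)
    c = hashlib.sha256(pid + ssid + value + r).digest()
    return c, r

def open_ok(pid: bytes, ssid: bytes, value: bytes, r: bytes, c: bytes) -> bool:
    return hashlib.sha256(pid + ssid + value + r).digest() == c

c, r = commit(b"P1", b"sess-7", b"v_A")
assert open_ok(b"P1", b"sess-7", b"v_A", r, c)       # honest opening succeeds
assert not open_ok(b"P2", b"sess-7", b"v_A", r, c)   # rushing copier, claiming
                                                     # pid P2, fails at reveal
```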

Example: Fresco HashBasedCommitment (Issue #432, PR #433, commit fdada93b). In the SPDZ protocol, parties hold BDOZ MACs $[\alpha \cdot a]$ on every wire under a global MAC key $\alpha$. To verify that a reconstructed value $a'$ is correct, each party computes $z_i = a' \cdot \alpha_i - (\alpha \cdot a)_i$, commits to $z_i$, and opens; if the reconstructed $z = \sum z_i \ne 0$, they abort. SPDZ also uses the same commitment scheme in coin-tossing and input-sharing subprotocols.

Fresco’s HashBasedCommitment hashed only the value and the randomness, with no opener identity in the input, allowing a malicious party to replay it. Pre-fix commit method (source):

// FILE: tools/commitment/src/main/java/dk/alexandra/fresco/tools/commitment/HashBasedCommitment.java
// aicis/fresco @ 2dc80dca (vulnerable, pre-PR #433)

public byte[] commit(Drbg rand, byte[] value) {
  if (commitmentVal != null) {
    throw new IllegalStateException("Already committed");
  }
  // Sample a sufficient amount of random bits
  byte[] randomness = new byte[DIGEST_LENGTH];
  rand.nextBytes(randomness);
  // Construct an array to contain the bytes to hash
  byte[] openingInfo = new byte[value.length + randomness.length];
  System.arraycopy(value, 0, openingInfo, 0, value.length);
  System.arraycopy(randomness, 0, openingInfo, value.length,
      randomness.length);
  commitmentVal = digest.digest(openingInfo);
  return openingInfo;
}

Each party’s commitment is $c_i = H(z_i \,\|\, r_i)$, with no opener identity in the hash input. In a two-party SPDZ MAC check over $\mathbb{F}_{2^k}$, a corrupt $P_2$ copies $P_1$’s commitment byte-for-byte, then copies the opening $(z_1, r_1)$. Because the field has characteristic 2, the reconstructed $z = z_1 + z_1 = 0$ and the MAC check passes regardless of what $a'$ was reconstructed, breaking the MAC’s integrity guarantee on every wire of the circuit. The fix (PR #433, commit fdada93b, merged February 27, 2025) added the committer’s party ID as the first input to the hash and required the opener to supply a matching ID at open time (source):

// FILE: tools/commitment/src/main/java/dk/alexandra/fresco/tools/commitment/HashBasedCommitment.java
// aicis/fresco @ fdada93b (fixed)

public byte[] commit(int myId, Drbg rand, byte[] value) {
  if (commitmentVal != null) {
    throw new IllegalStateException("Already committed");
  }
  byte[] randomness = new byte[DIGEST_LENGTH];
  rand.nextBytes(randomness);
  // Party ID is now the first ID_LENGTH bytes of the hashed input.
  byte[] openingInfo = new byte[ID_LENGTH + value.length + randomness.length];
  System.arraycopy(integerToBytes(myId), 0, openingInfo, 0, ID_LENGTH);
  System.arraycopy(value, 0, openingInfo, ID_LENGTH, value.length);
  System.arraycopy(randomness, 0, openingInfo, value.length + ID_LENGTH,
      randomness.length);
  commitmentVal = digest.digest(openingInfo);
  return openingInfo;
}

Concurrency and State

Many protocols are proven secure in particular ‘models of execution,’ and security can fail when they are run in ways that do not conform to the proof. For instance, protocols proven secure for sequential sessions can break when concurrent sessions are allowed, or preprocessing (such as Beaver triples) can be accidentally reused because a party’s state was restored from a backup.

SPDZ Multi-Threaded MAC Check

What can go wrong. SPDZ (Damgård–Pastro–Smart–Zakarias, 2012) is a maliciously-secure MPC protocol with a dishonest majority, where up to $n-1$ out of $n$ parties can be actively corrupted by an adversary. Shared values are authenticated by an information-theoretic MAC under a global key $\alpha$ that no party knows individually, and openings are verified by a MAC check that aborts if the opened value was tampered with. SPDZ is proven secure in the UC framework, which guarantees security under “concurrent execution” with arbitrary independent protocols. However, this guarantee does not extend to a multithreaded SPDZ implementation, where all threads share the same $\alpha$. In particular, when an implementation runs two MAC check instances concurrently in different threads, a malicious party can cheat in one of them to leak the entire MAC key $\alpha$ and use it in the other to forge MACs on arbitrary values.

Security implication. The paper Rushing at SPDZ: On the Practical Security of Malicious MPC Implementations (IEEE S&P 2025) shows that a malicious party can exploit the multi-thread interleaving to cause one MAC-check thread to abort, leaking the global SPDZ MAC key $\alpha$. The adversary then uses the leaked key to manipulate a concurrent thread of the honest parties, e.g. forging MACs on tampered values at will. The paper analyzed three SPDZ implementations and found two, MP-SPDZ and SCALE-MAMBA, vulnerable to this multi-thread MAC interleaving attack. The example below walks through the patches in MP-SPDZ, one of the two.

How to avoid. Treat the MAC check sub-protocol as an atomic critical section across all threads. Three concrete rules:

  1. Mutual exclusion on the MAC check. A mutex or semaphore prevents two threads from executing overlapping MAC-check instances, including the possible abort path.
  2. Unconditional verification on every open. The MAC check() call must fire whenever secret values are opened, regardless of whether the opened values reach an output gate.
  3. Design-level isolation. Where possible, avoid sharing secret state across threads entirely. Fresco’s design-by-construction single-thread-per-session model is a useful reference point.
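
Rule 1 can be sketched as follows (the class and method names are my own, not MP-SPDZ's): every MAC-check instance, including its abort path, runs under a single lock shared by all threads, so no thread can observe half-verified state.

```python
import threading

class MacChecker:
    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []          # opened-but-unverified values

    def record_open(self, value):
        with self._lock:
            self._pending.append(value)

    def check(self, verify):
        # The entire check, including the abort path, runs under the lock,
        # so a concurrent thread never sees a partially verified batch.
        with self._lock:
            for v in self._pending:
                if not verify(v):
                    raise RuntimeError("MAC check failed: abort")
            self._pending.clear()
```

The lock serializes overlapping MAC-check instances across threads; the `verify` callback stands in for the actual MAC verification of an opened value.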

Example: MP-SPDZ POpen and Commit_And_Open_ race conditions. Two bugs were found and patched in MP-SPDZ.

Bug 1 — Missing MAC check in multi-threaded POpen (commit 5e714b2). The SubProcessor<T>::POpen() function opens secret values. The MAC verification call check() was only triggered by an explicit output-gate condition (inst.get_n()), so in multi-threaded programs, some opened values could be used without the MAC checks needed around the open:

// FILE: Processor/Processor.hpp — MP-SPDZ (vulnerable, prior to fix)
template <class T>
void SubProcessor<T>::POpen(const Instruction& inst)
{
    if (inst.get_n())
        check();    // ← MAC check only before the loop, only if inst.get_n() is truthy
    // ... batched open setup ...
    for (auto it = reg.begin(); it < reg.end(); it += 2)
        for (int i = 0; i < size; i++)
            C[*it + i] = MC.finalize_open();
    // ← no MAC check after the loop, even when nthreads > 0
}

The fix widens the pre-loop gate and adds a new post-loop MAC check with the same gate, so multi-threaded opens trigger both checks:

// FILE: Processor/Processor.hpp — MP-SPDZ (fixed, commit 5e714b2)
template <class T>
void SubProcessor<T>::POpen(const Instruction& inst)
{
    if (inst.get_n() or BaseMachine::s().nthreads > 0)
        check();    // ← gate widened to also fire under multi-threading
    // ... batched open setup ...
    for (auto it = reg.begin(); it < reg.end(); it += 2)
        for (int i = 0; i < size; i++)
            C[*it + i] = MC.finalize_open();
    if (inst.get_n() or BaseMachine::s().nthreads > 0)
        check();    // ← NEW: post-loop MAC check, same gate
}

Bug 2 — Race condition in Commit_And_Open_ (commit b86f29b). Inside Tools/Subroutines.cpp, a shared coordinator object lets one thread signal to the others that its commitment phase is complete. That signal was raised before the commitment-opening validation loop ran, so a second thread waiting on the coordinator could observe the “finished” state and proceed with values that had not yet been verified:

// FILE: Tools/Subroutines.cpp — MP-SPDZ (vulnerable)

P.Broadcast_Receive(Open_data);
coordinator.finished();                    // ← signals completion before verifying

for (int i = 0; i < P.num_players(); i++)
    if (!Open(datas[i], Comm_data[i], Open_data[i], i))
        throw invalid_commitment();

The fix moves the signal to after the validation loop:

// FILE: Tools/Subroutines.cpp — MP-SPDZ (fixed)

P.Broadcast_Receive(Open_data);
for (int i = 0; i < P.num_players(); i++)
    if (!Open(datas[i], Comm_data[i], Open_data[i], i))
        throw invalid_commitment();

coordinator.finished();                    // ← now after verifying

The attack exploits the race: a malicious party controlling Thread B observes that Thread A’s coordinator has finished and immediately proceeds to use the opened values in its own MAC-check instance, before A has confirmed those values are authenticated. By carefully timing two concurrent MAC-check instances, the adversary extracts information about $\alpha$ through the unauthenticated intermediate state, then uses it to forge MACs on arbitrary output values.

Threshold Presignature Reuse (Nonce Reuse)

What can go wrong. ECDSA produces signatures $(r, s)$ where

$s = k^{-1}(H(m) + r \cdot x) \bmod n$

with $k$ a fresh random nonce, $r = (k \cdot G)_x$, and $x$ the long-term signing key. This equation is linear in $x$ once $k$ and $r$ are fixed, so reusing the same $k$ across two different messages $m_1 \ne m_2$ produces a pair $(r, s_1), (r, s_2)$ from which any observer recovers $x$ in closed form: solve $k = (H(m_1) - H(m_2)) \cdot (s_1 - s_2)^{-1} \bmod n$, then $x = (s_1 \cdot k - H(m_1)) \cdot r^{-1} \bmod n$. The canonical real-world incident is the 2010 fail0verflow PlayStation 3 ECDSA break, where Sony reused a fixed nonce across game-code signatures and the master key fell out of two signed binaries.
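The closed-form recovery can be checked numerically. The sketch below works directly with the signing equation modulo the secp256k1 group order and fixes $r$ to an arbitrary value, since the recovery never touches the curve itself; the key, nonce, and hashes are toy values:

```python
# ECDSA private-key recovery from two signatures sharing a nonce k.
# We use only the signing equation s = k^{-1}(h + r*x) mod n, so r is
# fixed directly instead of computed as (k*G).x. Toy values throughout.

n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # secp256k1 order

x = 0x1CEB00DA  # long-term signing key (secret)
k = 0x5EED      # nonce, reused across two messages (the bug)
r = 0xABCDEF    # stand-in for (k*G).x; identical for both signatures

h1, h2 = 0x11111111, 0x22222222  # H(m1), H(m2) for two distinct messages

def sign(h):
    return pow(k, -1, n) * (h + r * x) % n

s1, s2 = sign(h1), sign(h2)

# Observer's recovery, using only (r, s1), (r, s2), h1, h2:
k_rec = (h1 - h2) * pow(s1 - s2, -1, n) % n
x_rec = (s1 * k_rec - h1) * pow(r, -1, n) % n

assert (k_rec, x_rec) == (k, x)  # nonce and long-term key recovered
```

The three-argument `pow(a, -1, n)` modular inverse requires Python 3.8 or later.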

Some threshold ECDSA protocols such as GG18, GG20, and CGGMP21 generate this nonce distributively as a presignature $(k, R = k \cdot G)$ before the message is known, consuming it once a message arrives. The set of unused presignatures is a stateful object, and implementations must ensure that no two executions consume the same presignature. If they do, two or more signatures share a nonce.

Security implication. When two signatures over different messages share a presignature, anyone who observes them can recover the long-term signing key $x$. In threshold deployments the reuse is both easy to trigger and hard to detect: a malicious party can abort a ceremony after the presignature is fixed and force a retry on a different message, or route two non-interactive signing requests to different honest subsets using the same presignature. Honest parties signing non-interactively have no way to notice that the same nonce is being consumed twice. The Aumasson–Shlomovits Attacking Threshold Wallets paper catalogues presignature reuse as a first-class threshold-wallet threat.

How to avoid. Consume the presignature atomically (across parallel sessions) before signing starts, so that it counts as spent whether or not the signing protocol completes. Upon failure, never retry signing with the same presignature; generate a fresh one. Beware lifecycle events that can resurrect a consumed presignature: backup-and-restore, process restarts, snapshots, and replication must not reintroduce a presignature that has already been used.
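The consume-before-use discipline can be sketched as follows; the pool API and names are illustrative, not any particular library's:

```python
# Sketch of single-use presignature consumption for an in-memory pool;
# PresigPool and sign_with_presig are illustrative names. The
# presignature is removed under a lock *before* signing begins, so no
# concurrent session or later retry can obtain the same one, and a
# failed signing run still spends it.
import threading

class PresigExhausted(Exception):
    pass

class PresigPool:
    def __init__(self):
        self._lock = threading.Lock()
        self._pool = {}  # presig_id -> presignature material

    def add(self, presig_id, presig):
        with self._lock:
            self._pool[presig_id] = presig

    def consume(self, presig_id):
        # Atomic check-and-delete: after this returns, the presignature
        # is gone for every session, including a retry of this one.
        with self._lock:
            try:
                return self._pool.pop(presig_id)
            except KeyError:
                raise PresigExhausted(presig_id) from None

def sign_with_presig(pool, presig_id, message, run_signing):
    presig = pool.consume(presig_id)  # spent even if run_signing fails
    # On failure the caller must generate a fresh presignature;
    # retrying with this one is impossible by construction.
    return run_signing(presig, message)
```

A durable deployment would make the pop and the signing-session record part of one database transaction, per the Builder Vault example below.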

Example: Blockdaemon Builder Vault warns against 2-of-3 presignature reuse (Builder Vault TSM docs). Builder Vault is Blockdaemon’s production MPC threshold-signing platform (powered by the Sepior TSM). Its developer documentation explains that each presignature contains shares of a random signing nonce, and that an MPC node enforces single-use by deleting the presignature in the same transaction in which it consumes its share. The docs additionally warn that backup-and-restore can reintroduce a previously-consumed presignature, turning a routine ops procedure into a key-extraction vector if mishandled. Operators are therefore instructed to delete all presignatures either before taking a database backup or upon restoring.

Insecure Subprotocols

Subprotocols assumed by the protocol design, such as broadcast channels and authenticated or confidential peer-to-peer transport, must be operationally realized by the deployment, not merely declared.

Multicast Masquerading as Broadcast

What can go wrong. Many MPC protocol proofs are written in the Universal Composability (UC) framework of Canetti (2001), which models the broadcast channel as an ideal functionality $\mathcal{F}_{\text{BC}}$: in a given round, every honest party receives the same message from the sender. Threshold protocols including GG18 and GG20 for ECDSA and FROST for Schnorr explicitly require this functionality for at least one round of key generation or signing. Realizing $\mathcal{F}_{\text{BC}}$ over an asynchronous network requires a reliable broadcast protocol such as Bracha (1987), which provides Byzantine agreement against $t < n/3$ corruptions. A library that cannot tell whether a given round was supposed to be broadcast or point-to-point cannot enforce that assumption. If the application instantiates “broadcast” as a loop of per-peer sends, a malicious sender can equivocate (send $v_1$ to one honest party and $v_2$ to another) and no honest participant can detect the split. Echo-broadcast (every party re-broadcasts what it received before accepting) provides only single-round local consistency, not full Byzantine agreement, so a malicious sender can shift the split into the second round.

Security implication. Honest parties end up with different views of the same protocol round. The composition-level guarantee the UC proof relied on (that the round fixed a single value across all honest views) no longer holds, and subsequent rounds run on diverging state. In threshold signing the practical consequences include key-generation concluding with honest parties disagreeing on the public key, silent denial-of-service by a single adversary, and (depending on which round is attacked) share exposure, proof forgeries, or permanently-inconsistent key material.

How to avoid. Implement a reliable broadcast protocol (not just echo-broadcast) for any round whose security proof requires Byzantine agreement. In settings with fewer than $n/3$ corruptions, Bracha broadcast provides the required guarantees. Enforce the per-round broadcast-vs-P2P classification at the library boundary using the protocol specification as reference, rather than delegating the decision to the caller.
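To make the guarantee concrete, here is a toy, synchronous simulation of the ECHO phase of Bracha broadcast under an equivocating sender ($n = 4$, $f = 1$); the READY-amplification phase and malicious echoes are omitted for brevity:

```python
# Single-shot simulation of the ECHO phase of Bracha reliable broadcast
# for n = 4, f = 1, showing why an equivocating sender cannot drive
# honest parties to deliver different values: no value can gather the
# required 2f+1 echoes. Synchronous sketch; real implementations are
# asynchronous and also run a READY-amplification phase.
from collections import Counter

n, f = 4, 1
ECHO_THRESHOLD = 2 * f + 1    # echoes required before a party sends READY

honest = ["A", "B", "C"]      # the fourth party is the malicious sender

# Round 1: the sender equivocates over point-to-point INIT messages.
init = {"A": "v1", "B": "v1", "C": "v2"}

# Round 2: every honest party echoes the value it received to everyone,
# so all honest parties observe the same multiset of echoes.
echoes = Counter(init[p] for p in honest)

# A party only moves toward delivery for a value with >= 2f+1 echoes.
deliverable = [v for v, c in echoes.items() if c >= ECHO_THRESHOLD]

# The 2-vs-1 split means neither value qualifies: the equivocation is
# neutralized (nothing delivers) instead of producing divergent honest
# outputs, unlike the naive per-peer-send "broadcast".
assert deliverable == []
```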

Example: GG18 resharing split-view attack (Kudelski, 2021). Kudelski’s audit of ING’s threshold-ECDSA library identified a communication-layer failure in the GG18 resharing protocol. The issue was a design-level mismatch: the resharing mitigation relies on all honest parties seeing the same final confirmation, but that assumption is not realized by sending separate point-to-point messages. ING attempted echo-broadcast as the mitigation; Kudelski noted it “might actually make things worse” without a true reliable-broadcast layer underneath. If an application realizes broadcast as $N$ separate point-to-point sends, a malicious sender can equivocate.

Kudelski’s example starts with four peers $(A, B, C, D)$ using a threshold of 3, and a resharing ceremony that adds a fifth peer $E$ while keeping the threshold at 3. At the end of the resharing protocol, malicious $E$ sends different final-round messages to different honest parties:

  • $E$ sends ACK to $A$ and $B$.
  • $E$ sends not ACK to $C$ and $D$.

$A$ and $B$ believe resharing succeeded, discard their old shares, and migrate to the new committee. $C$ and $D$ believe resharing failed, keep the old shares, and do not save the new shares. The honest parties are now split between incompatible old and new committee states. Neither honest subset has enough compatible shares to sign without $E$, so the single malicious participant can lock the wallet and blackmail the rest of the committee.

The attack is exactly the multicast-as-broadcast failure: every honest party received a message from $E$, but they did not receive the same message. The fix is not another local validation check inside the resharing round; the deployment needs a broadcast mechanism that gives all honest parties a consistent view of whether the final confirmation was sent.

Unauthenticated or Unencrypted Point-to-Point Channels

What can go wrong. Many MPC protocol proofs are written in the Universal Composability (UC) framework of Canetti (2001), which models the network as ideal functionalities: typically $\mathcal{F}_{\text{AUTH}}$ for authenticated channels (the recipient is guaranteed to learn the true sender) and $\mathcal{F}_{\text{SMT}}$ for secure message transmission (also confidential). The threshold-ECDSA protocols GG18 and GG20, for example, explicitly require authenticated and (for several rounds) confidential point-to-point channels between every pair of parties. When a protocol's proof assumes such a functionality, the deployment must operationally realize it, typically through mutual TLS, signed/encrypted application-level messages, or a Noise-protocol handshake. Implementations that hand-roll the transport layer (raw TCP, ad-hoc JSON over HTTP, implicit trust in a central coordinator that re-signs messages) routinely fail to realize these assumptions: the protocol proof assumes the channel prevents network-layer impersonation and eavesdropping, while the deployed transport does not.

Security implication. Without per-message authentication, a network attacker can impersonate parties and inject messages honest parties attribute to the wrong source; the victim of the attribution is then blamed for protocol violations it did not commit. Without confidentiality, intermediate values that the ideal functionality hides leak to the network, and downstream secret-dependent computations become vulnerable to offline analysis. In threshold signing this translates to rogue messages causing spurious aborts, silent share exposure, and key-extraction attacks that exploit observed intermediate values.

How to avoid. Instantiate the point-to-point channels with mutual TLS between each pair of parties, keyed to the specific participant set for this protocol run (certificate pinning at minimum; ideally session-scoped keys derived from a higher-level authenticated key-exchange). Never run the cryptographic protocol over unauthenticated transport, even “for testing”, since integration-test wiring often migrates into production unnoticed.
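A minimal sketch of per-message sender authentication, using HMAC under a pairwise key as a stand-in for mutual TLS; the envelope layout and all names are illustrative. The receiver verifies the tag before parsing, then rejects any inner sender claim that disagrees with the channel-level identity:

```python
# Per-message sender authentication for P2P channels (sketch). A keyed
# MAC stands in for the mTLS session; in production the key would come
# from an authenticated key exchange scoped to this protocol run.
import hashlib
import hmac
import json
import os

# One symmetric key per ordered pair of parties, established out of band.
pairwise_keys = {("alice", "bob"): os.urandom(32)}

def seal(sender, receiver, session_id, payload):
    body = json.dumps(
        {"from": sender, "sid": session_id, "payload": payload}
    ).encode()
    tag = hmac.new(pairwise_keys[(sender, receiver)], body, hashlib.sha256).digest()
    return body, tag

def open_envelope(sender, receiver, body, tag):
    key = pairwise_keys[(sender, receiver)]
    # Verify before parsing: reject anything this sender's key does not
    # authenticate.
    if not hmac.compare_digest(hmac.new(key, body, hashlib.sha256).digest(), tag):
        raise ValueError(f"message not authenticated as coming from {sender}")
    msg = json.loads(body)
    # Bind the protocol-level sender claim to the channel identity, so a
    # spoofed inner "from" field cannot attribute the message elsewhere.
    if msg["from"] != sender:
        raise ValueError("inner sender claim disagrees with channel identity")
    return msg

body, tag = seal("alice", "bob", "sid-1", {"round": 1})
assert open_envelope("alice", "bob", body, tag)["from"] == "alice"
```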

Example: axelarnetwork/tofnd accepts spoofed from field on the wire. Axelar’s tofnd is a Rust daemon implementing GG20 (Gennaro–Goldfeder, 2020), a threshold-ECDSA protocol widely deployed in MPC wallet implementations. It wraps each protocol message in a TrafficIn envelope that carries the transport-level sender identity (from_party_uid) alongside an inner MsgMeta that carries the protocol-level sender index (from: usize). Issue #60 describes the failure directly:

Currently, the sender of a tofnd message is not authenticated. Thus, malicious parties could spoof messages from other parties. […] It is easy for a malicious actor to dig into the binary payload and spoof this from field and therefore send messages on behalf of other parties.

The vulnerable handler discarded the transport identity and passed the raw payload straight to the cryptographic core (src/gg20/protocol.rs#L106-L117):

// FILE: src/gg20/protocol.rs — axelarnetwork/tofnd (pre-fix, lines 106–117)
while protocol.expecting_more_msgs_this_round() {
    let traffic = chan.receiver.next().await.ok_or(...)?;
    let traffic = traffic.unwrap();
    // Only `traffic.payload` is forwarded to tofn; the transport-level
    // `traffic.from_party_uid` is discarded. tofn then trusts the inner
    // `MsgMeta { from: usize, ... }` self-attribution.
    protocol.set_msg_in(&traffic.payload)?;
}

A malicious party Alice with subshares {0, 1} could craft a message with MsgMeta::from = 2 (Bob’s subshare index), and no consistency check linked that index back to the transport-authenticated from_party_uid. The fix is split across two repos: tofn (the cryptographic library tofnd wraps) had to first expose the from field in its public API (tofn #42) so tofnd could then enforce from_party_uid == MsgMeta::from before dispatch.

Example: coinbase/kryptology GG20 DKG ships secret shares unencrypted. GG20’s joint key-generation procedure (inherited from GG18) assumes the Round 2 P2P delivery of each Shamir share $x_{ij}$ runs over a confidential channel, instantiated in the GG18 paper with Paillier encryption keyed to the recipient. The Coinbase library’s GG20 implementation drops the encryption step and returns the share as a bare struct field (source):

// FILE: pkg/tecdsa/gg20/participant/dkg_round2.go — coinbase/kryptology

type DkgRound2P2PSend struct {
    xij *v1.ShamirShare  // raw share — no Paillier encryption applied
}
// ...
p2PSend[id] = &DkgRound2P2PSend{ xij: dp.state.X[id-1] }

An integrator filed issue #29 after having to fork the library to make xij exportable for transmission, noting it “feels unsafe to share in unencrypted form” and pointing out that Swingby’s tss-lib fork Paillier-encrypts the share at the equivalent round. The maintainer confirmed in the same thread: “You should encrypt everything sent between participants since the paper states it’s only secure in the presence of a secure channel.” The library nonetheless leaves channel confidentiality entirely to the application. Note that the kryptology repository has since been archived by Coinbase, with an explicit notice that the library “should not be used” and is not used by Coinbase itself.
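The omitted step can be sketched with textbook Paillier, following the GG18 paper's use of the recipient's Paillier key for the Round 2 share; toy 64-bit primes, for illustration only:

```python
# Sketch of the missing confidentiality step: Paillier-encrypt the
# Round 2 Shamir share x_ij under the recipient's public key so the
# share never crosses the wire in the clear. Textbook Paillier with
# g = n+1 and toy primes; real keys are 2048-bit or larger.
import math
import secrets

# Recipient's Paillier keypair (toy primes, both known to be prime):
p, q = 2**64 - 59, 2**63 - 25
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)   # private: lambda(n)
mu = pow(lam, -1, n)           # private: lambda^{-1} mod n (valid for g = n+1)

def encrypt(m):
    r = secrets.randbelow(n - 1) + 1        # random r in [1, n)
    return (1 + n * m) % n2 * pow(r, n, n2) % n2

def decrypt(c):
    u = pow(c, lam, n2)
    return (u - 1) // n * mu % n            # L(u) * mu mod n

x_ij = 0xDEADBEEFCAFEF00D   # the Shamir share to deliver
ct = encrypt(x_ij)          # this ciphertext is what goes on the wire
assert decrypt(ct) == x_ij
```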

Failure Recovery and Aborts

When a subprotocol detects a consistency failure, the implementation must surface that failure in a form the caller can act on. Structured terminal errors, diagnostic mismatch signals, and coordinated cancellation prevent honest parties from misdiagnosing configuration failures as attacks or retrying with compromised state.

Panic or Opaque Error Instead of Structured Abort

What can go wrong. The OT (oblivious transfer) consistency check is a verification step in OT-extension protocols that compares the parties’ transcripts of a batch of oblivious transfers and detects when a malicious sender or receiver deviated from the protocol. When this check fails, the failure can be reported in two equally damaging ways:

  • Panic / abort the thread: a Rust panic! (or any language’s equivalent “crash the thread” mechanism) unwinds the stack abruptly without surfacing a structured error to the application layer.
  • Opaque error return: a generic, untyped error value indistinguishable from benign failures (network timeout, decode error). A caller that catches it via a generic if err != nil { return err } cannot tell that a malicious peer just probed the protocol.

At minimum, both failures prevent graceful recovery of the broader system. For protocols that further require identifiable abort (such as DKLs23), they also discard the information about which peer triggered the failure, leaving honest parties unable to ban the cheater specifically without collateral damage.

Security implication. The application is left with two bad choices, both of which an adversary can turn into attacks:

  • Key destruction: treat the panic as evidence that some party cheated and destroy the key share. An adversary who can trigger the panic at will now has a griefing primitive that lets them destroy honest parties’ keys without needing to know any key material.
  • Retry without exclusion: treat the panic as a transient error and retry without banning the offending party. The offending party repeats the selective-abort probe across multiple retries, accumulating bits of the base-OT state.

How to avoid. Replace every adversary-reachable abort path (assertions, panics, generic errors) with a structured, terminal error — a typed sentinel the caller can pattern-match on (e.g., errors.Is(err, ErrConsistencyCheckFailed) in Go, a typed Result::Err variant in Rust). Treat the failure as protocol-terminal: zeroize any compromised key material (e.g., the base-OT seed) at the failure point so no retry with the same state is possible. Where the protocol requires identifiable abort, the error must additionally carry the offending party’s identifier so the caller can ban that party specifically and continue with the remaining honest peers.
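The recommendation can be transliterated into a short sketch (Python, with illustrative names): a typed terminal error carrying the offending party's index, with the compromised seed zeroized before the error propagates:

```python
# Structured, terminal abort (sketch). The error type names whom to
# ban, and the base-OT seed is zeroized at the failure point so no
# retry can reuse the compromised state.

class AbortProtocolAndBanParty(Exception):
    """Terminal protocol error: ban `party_idx`; never retry this state."""
    def __init__(self, party_idx):
        super().__init__(f"consistency check failed, ban party {party_idx}")
        self.party_idx = party_idx

class OtExtensionSession:
    def __init__(self, base_ot_seed: bytearray, peer_idx: int):
        self.base_ot_seed = base_ot_seed   # secret state a retry must never reuse
        self.peer_idx = peer_idx

    def verify_consistency(self, transcript_ok: bool):
        if not transcript_ok:
            # Zeroize before raising so no caller can continue or retry
            # with the (now presumed compromised) base-OT state.
            for i in range(len(self.base_ot_seed)):
                self.base_ot_seed[i] = 0
            raise AbortProtocolAndBanParty(self.peer_idx)

session = OtExtensionSession(bytearray(b"\x42" * 32), peer_idx=3)
try:
    session.verify_consistency(transcript_ok=False)
except AbortProtocolAndBanParty as e:
    assert e.party_idx == 3                       # caller knows whom to ban
    assert session.base_ot_seed == bytearray(32)  # seed zeroized
```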

Example: Silence Laboratories dkls23 (TOB-SILA-12). Trail of Bits’ February 2024 review of Silence Laboratories’ DKLs23 library identified TOB-SILA-12, a high-severity finding titled “Implementation mishandles selective abort attacks”. The pairwise multiplication (MtA) layer called the underlying COTe (correlated oblivious transfer with errors) sender via .expect(...), which panics on Err without surfacing which party caused the failure. The audit reproduces the offending lines (citing dkls23/src/sign/pairwise_mta.rs lines 278–282 in the pre-fix codebase):

// dkls23/src/sign/pairwise_mta.rs — silence-laboratories/dkls23 (vulnerable, pre-fix)
let (cot_sender_shares, round2_output) = self
    .state
    .cot_sender
    .process((&round1_output, &alice_input))
    .expect("error while processing soft_spoken ot message round 1");
// ↑ panic on COTe Err; no party ID propagates to the caller

The audit’s Exploit Scenario section for TOB-SILA-12 spells out the selective-abort-accumulation attack: a malicious receiver causes the sender to panic on each session, the caller cannot identify which peer to exclude and continues new sessions with the same attacker, and over many sessions the receiver “recovers the base OT choices of the other participants” and uses them to “(retroactively) recover the input of these other participants in another signing session, one of which corresponds to their private key share” (audit, p. 42).

The fix is two-layer: the SoftSpokenOT primitive returns a structured AbortProtocolAndBanReceiver error when the consistency check fails; the DSG (signing) layer catches that error and re-emits it as AbortProtocolAndBanParty(party_idx) with the specific peer’s identity attached so the library caller can denylist them:

// SoftSpokenOT layer — returns AbortProtocolAndBanReceiver on consistency-check failure
//   (sl-crypto/crates/sl-oblivious/src/soft_spoken/...)
//
// DSG (signing) layer — propagates with the party identifier:
.map_err(|_| SignError::AbortProtocolAndBanParty(party_idx as u8))?;

The two-layer separation is a useful lesson: the OT primitive identifies that the receiver cheated, while the higher protocol layer attaches which peer that receiver was, so the library user can abort all concurrent sessions with that peer and refuse future participation.

Session-ID Disagreement or Non-Uniqueness Not Detected Early

What can go wrong. In the Universal Composability framework of Canetti (2001), every protocol instance is parameterized by a session identifier (ssid) that uniquely names that run. Honest parties feed the same ssid into every sub-protocol (OT extension, MAC checks, DLN proofs, Fiat-Shamir transcripts) both to detect cheaters whose contributions don’t match the agreed-upon session and to ensure that artifacts from one session cannot be replayed in another. Threshold-ECDSA protocols such as GG18, GG20, and CGGMP21 all depend on this binding. The discipline only works if every honest party derives the same ssid.

This discipline can fail in two ways. Disagreement: two honest parties derive different ssids (a bug in derivation, a clock skew, a protocol version mismatch, or a subtle string-encoding difference) and the sub-protocol’s consistency check fires looking like a malicious-peer attack. In this case neither party can tell that the cause was configuration rather than cheating. Non-uniqueness: parties agree on ssid, but the value is a constant placeholder, only partially derived, or static across runs, so within-session checks pass, yet the same ssid ends up shared across distinct sessions, eroding session isolation and enabling cross-session transcript or OT-state confusion.

Both failures share the same diagnostic invisibility: from the protocol’s perspective, disagreement is indistinguishable from an attack, and non-unique agreement looks like a perfectly healthy session.

Security implication. In the disagreement mode: in protocols with identifiable abort, each party concludes the peer is malicious and may permanently blacklist them. In other words, a single ssid-derivation bug can cause honest parties to ban each other. In retry-on-abort protocols, each retry consumes additional preprocessed randomness, and enough retries exhaust a precomputed OT extension pool, forcing an expensive re-setup. The underlying protocol never completes while looking, from the inside, exactly like it is under attack. In the non-uniqueness mode: parties agree on ssid, so within-session checks pass locally, but artifacts from one run (transcripts, OT seeds, MACs, Fiat-Shamir challenges) can be replayed into or confused with another run, eroding session isolation.

How to avoid. Define session-identifier derivation as a well-specified, version-tagged function of public protocol inputs (participant set, epoch, caller-supplied nonce). Detect mismatches at the earliest possible moment (ideally in a dedicated handshake before any cryptographic sub-protocol runs) and when a consistency check fails include a diagnostic code that distinguishes “mismatched ssid” from “MAC or transcript inconsistent under a shared ssid” so operators can tell configuration errors apart from attacks.
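A sketch of such a derivation and early handshake check, with illustrative field names and version tag:

```python
# Version-tagged ssid derivation over public protocol inputs, plus a
# handshake check that fires before any cryptographic sub-protocol and
# distinguishes configuration errors from cheating.
import hashlib

SSID_VERSION = b"myproto-ssid-v1"   # bump on any change to the derivation

def derive_ssid(participants, epoch, nonce):
    h = hashlib.sha256()
    h.update(SSID_VERSION)
    for pid in sorted(participants):           # order-independent
        h.update(len(pid).to_bytes(2, "big"))  # length-prefixed, unambiguous
        h.update(pid.encode())
    h.update(epoch.to_bytes(8, "big"))
    h.update(nonce)                            # caller-supplied, fresh per run
    return h.digest()

def handshake_check(my_ssid, peer_ssid):
    # A dedicated diagnostic code: "SSID_MISMATCH" points operators at
    # configuration, not at a malicious peer.
    if my_ssid != peer_ssid:
        raise ValueError("SSID_MISMATCH: check derivation inputs and version")

a = derive_ssid({"alice", "bob", "carol"}, epoch=7, nonce=b"\x01" * 16)
b = derive_ssid({"carol", "bob", "alice"}, epoch=7, nonce=b"\x01" * 16)
assert a == b          # same inputs agree regardless of iteration order
handshake_check(a, b)  # passes silently
```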

Example: BitGo sdk-lib-mpc DKLS retrofit hardcoded final_session_id to zeros. BitGo’s institutional MPC SDK wraps Silence Laboratories’ DKLS WASM bindings to perform threshold-ECDSA key generation. The DKLS protocol uses final_session_id (a 32-byte value supplied at retrofit time) to bind the OT-extension transcript to a specific keygen session. Without uniqueness here, the OT-setup transcript is constant across sessions and the protocol’s session-isolation guarantee collapses. The retrofit code path in modules/sdk-lib-mpc/src/tss/ecdsa-dkls/dkg.ts shipped with the value hardcoded to all zeros, so every retrofit wallet across the entire deployment shared the same ssid. The fix landed in PR #8496:

// FILE: modules/sdk-lib-mpc/src/tss/ecdsa-dkls/dkg.ts — BitGo/BitGoJS

// pre-fix — every retrofit wallet on the server shared this ssid
final_session_id: Array(32).fill(0),

// fix — bind the ssid to wallet-specific public material
final_session_id: Array.from(
    createHash('sha256')
        .update(Buffer.from(this.retrofitData.xShare.y, 'hex'))           // pubkey
        .update(Buffer.from(this.retrofitData.xShare.chaincode, 'hex'))   // chaincode
        .digest()
),

The PR description spells out the protocol-level impact: “This weakens DKLS protocol transcript binding and could allow cross-session confusion when multiple retrofit wallets sign simultaneously on the same server.” The bug was invisible from inside the protocol (no consistency check fires for “my ssid matches my neighbour’s ssid, but they’re both the wrong constant”) and nothing in the type system prevented the placeholder zero-array from reaching production. Detection required reasoning about the DKLS spec rather than reading the code.

Example: tss-lib ssid semantics are unspecified, leaving each integrator to invent their own. The library’s README instructs callers to “wrap each message with a session ID” but does not specify the derivation, the wire format, or which sub-protocol identifiers must agree. Issue #292 is a representative misconfiguration question, asked with no maintainer reply on file:

“Does that mean adding additional session id data to a message like below? { sessionId: out-of-band-id, msg: round-message }. Or would it be ok to just using https for communication and its session id?”

The two interpretations the integrator floats (application-supplied out-of-band ID vs. HTTPS session ID) produce mutually unintelligible deployments: two parties built by different teams will derive ssid differently and their consistency checks will fail with no diagnostic distinguishing “configuration mismatch” from “cheating peer”. This is not hypothetical: issue #228 (“Keygen Freezing and Session ID Problem”) is a downstream report of exactly that: random keygen freezes traced by the integrator back to lacking any way to “[map] session ID to a single run of keygen round.” The library exposes no exported method to read the round of an inbound message, so the ssid-to-round binding the proof assumes cannot be enforced from outside.

Adaptive Inputs

When a party can choose its protocol contribution after observing honest parties' messages, it can bias, cancel, or copy those contributions. Commit-before-reveal, proofs of knowledge, and binding contributions to party/session context prevent this adaptivity from changing shared protocol state.

Rogue-Key Attack: No Commit-Before-Reveal and No Proof of Knowledge

What can go wrong. A distributed key generation (DKG) protocol lets $n$ parties jointly produce a public key whose corresponding secret is shared among them, with no trusted dealer. In a Feldman-based DKG (the joint-Feldman construction of Pedersen, 1991; the underlying VSS primitive is from Feldman, 1987), each party $P_i$ broadcasts $A_{i,0} = g^{a_{i,0}}$, which is a commitment to its secret contribution $a_{i,0}$. A shared public key is then defined as $Y = \prod_i A_{i,0}$. If the protocol neither requires parties to commit to their first-round messages before seeing others’ contributions nor requires each party to prove knowledge of $a_{i,0}$, a malicious party or coalition may wait to see the honest parties’ commitments and then choose its public contribution as a function of theirs.

Note that at the aggregate-key level, this lets the attacker try to force the shared public key to be a key it controls. In a full Joint-Feldman DKG, the malicious contribution must also pass share verification, which is why the concrete Drand attack below requires a coalition in the relevant threshold regime.

Security implication. Let $Y^\star = g^x$ be the adversary’s target (a key for which it holds the discrete log $x$). After observing $A_{1,0}, \dots, A_{n-1,0}$, $P_m$ announces $A_{m,0} = Y^\star \cdot \left(\prod_{i \ne m} A_{i,0}\right)^{-1}$. Multiplying all commitments yields $\prod_i A_{i,0} = Y^\star$. The shared “threshold” key is now under $P_m$’s sole control. As a consequence, reconstruction is not required and the protocol’s threshold property no longer holds.
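The cancellation can be checked numerically in a toy prime-field group; $p$, $g$, and the secrets below are illustrative, and the algebra carries over unchanged to elliptic-curve groups:

```python
# Rogue-key cancellation: after seeing the honest commitments, the
# adversary P_m picks A_m = Y* * (prod of honest A_i)^{-1}, so the
# aggregate key is exactly its own Y* = g^x. Toy parameters.

p = 2**64 - 59                 # toy prime modulus (group Z_p^*)
g = 5                          # illustrative generator

# Honest parties' round-1 commitments A_i = g^{a_i}:
honest_secrets = [111, 222, 333]
honest_commits = [pow(g, a, p) for a in honest_secrets]

# Adversary's target key, with known discrete log x:
x = 0x5EC12E7
Y_star = pow(g, x, p)

# Adaptive choice: cancel out everyone else's contribution.
prod_honest = 1
for A in honest_commits:
    prod_honest = prod_honest * A % p
A_m = Y_star * pow(prod_honest, -1, p) % p

# The "shared" key is the adversary's key: threshold property destroyed.
shared = prod_honest * A_m % p
assert shared == Y_star
```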

How to avoid. Either of the following two mitigations is sufficient; most deployments use both:

  • Commit-before-reveal: each party first broadcasts a commitment to its round-1 package, and only reveals the package after every other party’s commitment has been seen. The attacker cannot choose its $A_{m,0}$ as a function of the others because the commitment binds it before any other party has opened.

  • Proof of Knowledge: each round-1 package includes a Schnorr proof of knowledge of $a_{i,0}$, binding the commitment to the sender’s identity and the current session. An attacker that chose $A_{m,0}$ adversarially cannot produce a valid proof without knowing the discrete log.

Example: Drand DKG Threshold Constraint (Sigma Prime, 2020). Drand’s protocol specification describes it as a distributed randomness beacon using DKG and threshold BLS, with a threshold above half the participants under its security model. Sigma Prime showed that when the polynomial degree $t$ exceeds $n/2$ (that is, a $(t+1)$-of-$n$ reconstruction threshold), a coalition of $m \ge n - t + 1$ parties can mount a rogue-key attack: after seeing the honest parties’ public commitments, the colluding parties choose their own constant-term commitments so the final public key becomes an attacker-chosen $Y^\star = g^x$. The attacker then knows the discrete log of the group public key.

The post proposes an initial hash commit-before-reveal phase over each party’s polynomial commitments, for example $\text{Hash}(A_{i,0} \| A_{i,1} \| \dots \| A_{i,t})$, before any commitment values are revealed. Drand instead lowered the configured threshold closer to $n/2$, so the rogue-key attack would require a coalition outside the assumed fault bound.
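A minimal sketch of that commit-before-reveal phase, with illustrative serialization (small integers standing in for group elements):

```python
# Hash commit-before-reveal over each party's polynomial commitments
# (sketch): parties first broadcast only H(A_{i,0} || ... || A_{i,t}),
# and reveal the commitments after all hashes are collected.
import hashlib

def commit(poly_commitments):
    h = hashlib.sha256()
    for A in poly_commitments:
        h.update(A.to_bytes(32, "big"))   # fixed-width, unambiguous
    return h.digest()

def verify_reveal(commitment_hash, revealed):
    if commit(revealed) != commitment_hash:
        raise ValueError("reveal does not match round-1 commitment")

# Round 1a: every party broadcasts only the hash of its commitments.
party_polys = {"P1": [11, 12, 13], "P2": [21, 22, 23]}
hashes = {pid: commit(poly) for pid, poly in party_polys.items()}

# Round 1b: reveals happen only after all hashes are in, so a party can
# no longer choose its A_{m,0} as a function of the others' values.
for pid, poly in party_polys.items():
    verify_reveal(hashes[pid], poly)
```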

Cryptographic Primitives

The preceding classes concern how an MPC protocol wires its primitives together. The pitfalls here concern the primitives themselves: a modulus that is not a safe prime, a Paillier key with small factors, a hash used where it offers no domain separation, randomness drawn from too small a space. Each is a failure in the choice or construction of a building block, independent of the protocol wrapping it. We collect them here because the fix is local to the primitive, and the same checklist applies regardless of which protocol is being audited.