Stratumn.com

Zero-Knowledge Proof Systems in Anonymous, Publicly Verifiable Boardroom Elections

Two zero-knowledge approaches to the problem are explained, analyzed, and compared.

In our first post in this series, “Anonymous, Publicly Verifiable Boardroom Elections”, we outlined some design goals in building a decentralized electronic boardroom voting protocol, and then sketched out a couple of different solutions to the problem using zero-knowledge proof techniques.

In our second post, we designed the protocol with Stratumn Chainscript and provided some implementation details of the solution using the Indigo Framework.

In this post, we delve deeper into the two solutions proposed in our first post — Hidden Who and Hidden What — and then highlight some of the tradeoffs involved between them.

Note that in both solutions a registration phase and a voting phase are necessary. It is in the registration phase of each approach that “verifiable anonymity” is established through zero-knowledge techniques. In the first approach, Hidden Who, a zkSNARK allows the shareholder to create a shadow identity, who can then vote openly without fear of identification. In the second approach, Hidden What, zero-knowledge proofs ensure the validity of the voter’s vote, even though individual votes are never seen, and counting can only happen after all votes are cast.

1. Hidden Who

In the registration phase of Hidden Who solution, we calculate a shadow identity for each share and assign it to that share’s owner (or a proxy of their choice). The integrity of this approach hinges on solving the following problem:

“How do we prove that a shadow identity corresponds to a valid share identity, without revealing which share identity it corresponds to?”

Election Proposal

Either during the setup phase of the network, or in the initial Propose message, the shareholder calling the election must establish the parameters of the proof system, so each subsequent shareholder can prove their shadow identities are indeed legitimate.

The most important pieces of election-specific information to provide in the Proposal message are:

  • \(ElectionID\): A name or identification number of the election, and
  • \(L\): The set of public keys corresponding to the ownership shares (all the \(Share_{public}\)).

In addition, the Propose message would provide proving and verification algorithms for potential voters to call in order to prove the legitimacy of the shadow identities they have created.

Shadow Identity Proof System

Each shareholder who would like to participate in the election must:

  • generate a shadow identity key pair (\(Shadow_{public}\), \(Shadow_{private}\)), and
  • prove their shadow identity key pair corresponds to a valid \(Share_{public}\).

In addition, the proof system should also guarantee that each \(Share_{private}\) deterministically produces exactly one valid \(Shadow_{public}\), so each share corresponds to only one shadow identity, and therefore only one vote.

Thus, proving the validity of a shadow identity actually implies two different sub-proofs, namely:

  • \(Share_{private}\) corresponds to a \(Share_{public}\) contained in the list of authorized shares, \(L\), defined in the Propose message.
  • \(Shadow_{public}\) is correctly calculated from \(Shadow_{private}\), which is correctly calculated from \(Share_{private}\).

Using the glory of zkSNARKs (a little more background?, through a library like libsnark, we could translate the following function into a system of constraints usable for proving and verifying in zero-knowledge:

function validate_shadow_identity(ElectionID, L, Shadow_public, [Share_private])  
  Share_public := PUBLIC(Share_private)
  proof_a := (Share_public ∈ L)

  Shadow_private := PRIVATE(HASH(ElectionID | Share_private))
  proof_b := (Shadow_public == PUBLIC(Shadow_private))

  return proof_a && proof_b

This function, after being translated into a system of constraints, is used as the basis of the proving and verification algorithms specified in the Propose message.

Each would-be voter would then have to prove that he or she knew inputs that would satisfy the above function. Note that three of the inputs (\(ElectionID\), \(L\), and \(Shadow_{public}\)) are public, while the fourth input (\(Share_{private}\)) is private to the prover.

Since a valid \(Share_{private}\) cannot be generated without knowledge of a \(Shadow_{private}\), we know each valid proof corresponds to a valid shareholder. And since the \(Shadow_{public}\) is deterministically generated from a \(Share_{private}\), we know that we would see duplicate \(Shadow_{public}\) if a shareholder attempted to vote multiple times.

Implementing such a circuit would require us to first implement the following cryptographic primitives in a way our proof system (in this case, libsnark) could understand:

# Message digest
HASH:                 Cryptographic hash function

# Key generation
PRIVATE(seed):        Generate private key from seed  
PUBLIC(private_key):  Generate public key from private key  

Participant Registration

Once the proving and verifying infrastructure has been established during the setup-phase of the network, every time a shadow entity needs to be generated, the relevant share owner generates a proof using the above circuit, and posts their proof as part of their Register message. Of course the Register message must be posted anonymously and not include any information linking the \(Shadow_{public}\) to a \(Share_{public}\), whose owners are assumed to be known.

Anyone wanting to verify that the shadow identity had been correctly generated could then verify the proof with the provided inputs and the verification algorithm specified in the propose message.

Voting

In the voting phase, each voter anonymously submits his or her Vote message to the network, signed with their \(Shadow_{public}\) key. Any participant who wishes to verify the validity of a Vote message merely needs to check if:

  • the vote message has been signed by a valid \(Shadow_{public}\), and
  • the \(Shadow_{public}\) used to sign the Vote message had already been used.

The vote itself can be in plaintext, as it is totally unlinked from the real identity of the voter.

Election Outcome

As the votes are in plaintext, tallying is easy for everybody. The winner is decided according to the social choice function included in the Propose message.

2. Hidden What

In the Hidden What solution, instead of obfuscating the identity of the voter, we make the vote itself illegible until all votes have been cast. We base our example protocol and explanation on a paper published by Hao, Ryan, and Zielinski. Following the cited article, we take an example election where each participant has the option to vote either yes (1) or no (0). The protocol can be expanded to handle multiple options and candidates with some increase in complexity.

The key feature of the Hidden What approach is that the votes can immediately be counted only when all registered participants have voted, and not before.

Setup

The following election-specific parameters should be included in the propose message, to be used by all participants later in the protocol:

  • a prime \(q\), representing the order of a finite cyclic group \(G\), and
  • a generator \(g\) of \(G\).

Registration

In the registration phase, each of the \(n\) participants picks a secret key \(x_i\), and then sends a message with the following information to the network:

  • \(Commitment_{i} = g^{x_i}\), and
  • A zero-knowledge proof that the participant knows \(x_i\), using a Schnorr Signature or any other non-interactive technique.

Since the discrete-log problem is considered hard in the group \(G\), once a participant has registered, we are assured that the participant, and only the participant, knows \(x_i\). Meanwhile, everyone in the network knows the public commitments \(g^{x_i}\).

Voting

During the voting phase, each participant calculates a personalized blinding factor, that incorporates all of the other participants’ commitments, as follows:

$$BlindingFactor_{i} = \Pi_{j=1}^{i-1} Commitment_{j} / \Pi_{j=i+1}^{n} Commitment_{j}$$

Each participant’s blinding factor embeds every other participant’s commitment. The blinding factors, while personalized for each participant, can even be pre-calculated and published before voting begins, as they are constructed only from public commitments submitted in the registration phase.

Each participant then calculates their vote:
$$Vote_{i} = g^{v_i}$$

And then publishes their obfuscated vote:
$$HiddenVote_{i} = BlindingFactor_{i}^{x_i} * Vote_{i}$$

The \(HiddenVote\) can only be calculated by the participant, as it requires the participant’s secret, \(x_i\). In order to prove the validity of \(Vote_i\), each participant should also submit a zero-knowledge proof that \(Vote_i\) is either yes (1) or no (0). This is a specialized version of a standard non-interactive zero knowledge proof of set membership.

Counting

Once the voting messages have all been sent, anyone can count the total number of yes votes by calculating the product of the published hidden votes:

$$TotalYesVotes = \Pi_{i} HiddenVote_{i} = g^{\Sigma_{i} Vote_{i}}$$

This equality holds because all the other terms in the \(HiddenVote\)s cancel each other out, as can be seen in the following identity:

$$\Pi_{i=1}^{n} BlindingFactor_{i}^{x_i} = 1$$

Of course, since the discrete log problem is hard in \(G\), we must exhaustively search for the actual number of total yes votes, \(t\) until we find \(g^t = TotalYesVotes\). For boardroom purposes, the total number of votes should be much, much smaller than the size of \(G\), and should not present a computational problem.

Tradeoffs

As explained above, each of the approaches attempts to obscure a different facet of the voting process, resulting in verifiably anonymity. In the Hidden Who approach, a zkSNARK is used to create shadow identities, obscuring the true identity of the voters associated with plaintext votes. In the Hidden What approach, multiparty computation and zero-knowledge commitments are used to hide the content of each voter’s vote, which can only be revealed when the election is over, through the cooperation of all participants.

Each of these approaches enjoys its own difficulties, both theoretical and practical. While both have been designed to satisfy our design criteria of Anonymity, Verifiability, and Franchise, their performance differs by other important metrics.

Fairness

We use fairness to describe the notion that everyone’s voting experience is equal, and no participant has more information than another.

In the Hidden Who protocol, as all the votes are in plaintext, every participant knows the current total at all times. Some participants may decide their vote based on the current status of the election, or hold off participation until a later time to see how the election evolves, thus violating the fairness criterion.

Similarly, in the Hidden What protocol, the last voting member can calculate the result of the election before casting her vote. This breach is much easier to fix, by either adding a “phantom participant” who votes last, or adding a commitment round, as suggested by Khader, et al.

Robustness

The fairness of Hidden What protocol comes at a cost: any participant may choose to sabotage the election by registering to vote and then not voting. As every registrant’s participation is needed to decrypt the votes, this would be an effective technique for participants who are afraid of losing, or simply thought none of the options were promising. Khader, et al have also outlined a simple modification to address this concern.

The Hidden Who example is robust as described, and the plaintext votes can simply be tallied after the timeout specified in the Propose message.

Multiple Candidates

As described, the election example is a referendum: the participants either vote yes or no. More useful elections would allow the participants to choose from a field of candidates. While this is trivial in Hidden Who, it is more complicated in the Hidden What example. While they are many modifications to the above protocol which address this issue (some outlined in Hao et al), all add complexity to the protocol.

Freedom of Social Choice

While adding multiple candidates to Hidden What is hard, adding different methods of deciding the victor of an election (the social choice function) is practically impossible. This is because we only ever see the sum of the votes. While one can imagine techniques to integrate approval voting (allowing participants to vote for as many candidates as they like), they would rapidly increase the difficulty of the exhaustive exploration in the counting step, especially in mid-size elections.

In the Hidden Who approach, as the votes are in plaintext, any social choice function embedded in the Propose segment could be easily adopted and run by any and all verifiers.

Implementation

As the Hidden Who approach relies on a general framework for generating zero-knowledge proofs from arbitrary code, it necessitates much more memory and processing time than many small distributed systems may have. In addition, the libraries used are not considered to be suitable for production applications by their authors. ZkSNARKs are a rapidly emerging field, and it seems quite likely that they will have been suitably analyzed and optimized for production applications in the near future.

Tradeoff Summary
Hidden Who
(zkSNARKs)
Hidden What
(Hao, et al + Khader et al)
Anonymity + +
Verifiability + +
Franchise + +
Fairness - +
Robustness + +
Multiple Candidates + Hard
Freedom of Social Choice + -
Implementation Hard Easy

Conclusion

At Stratumn, we are engaged in conversations with multiple clients around building a reliable secure platform for boardroom voting. The choice of which approach to use, either Hidden Who, Hidden What, or something else entirely, depends on the specific needs of the situation and which design criteria are most pressing. We look forward to building a publicly available plugin for Indigo in the near future that makes anonymous, publicly-verifiable boardroom voting easy to integrate into proof of process networks.