Stratumn.com

Zero Knowledge Proof of Age Using Hash Chains

We demonstrate a zero-knowledge proof of age protocol that could be easily implemented on a proof of process network.

In a recent post, we outlined a technique of proving that a hidden value was greater than a given threshold without revealing it: “proof of balance”. Our method used an implementation of a general zero-knowledge framework, called a zkSNARK (courtesy of the pepper project). While immensely powerful, those of you who have played with zkSNARKs are likely quite familiar with its drawbacks in terms of efficiency and ease. Here we present a much simpler protocol to prove in zero-knowledge that an integer (such as age or balance) is greater than a known threshold.

The larger insight for the zero-knowledge aficionado is this:

There are lots of efficient, specialized protocols for proving something in zero knowledge. If you can’t find (or develop) one for your specific need, zkSNARKs can come to the rescue. However, most of the time, it's worth looking for that specialized solution. At least twice.

Proof of Age Using Hash Chains

Let’s say Milena (who listens to 80s music and is clearly in her 30s) has been riding around town on her brand new yellow spotted tandem bicycle. Paula is dying to cruise around town on Milena's bicycle and asks Milena out on a date. Milena is interested, but unsure of the legality of their potential adventure. If Milena and Paula have access to a trusted source of digital identification documents (their local government, for example), we would like them to have an easy “challenge and response” protocol, like the following:

  • Milena sends Paula a challenge: “Prove you are at least 18”

  • Paula sends Milena the response: 90d17d7dcd91b4cd4a3e740c15cabac368e32381f68f9d221b7135d38a6845a7

  • Milena verifies the response is correct: “Ok, I am convinced”

We could implement such a system with the following protocol, from a paper by Sebastian Angel and Michael Walfish. An overview of the protocols follows, but if you don't care about the details, just skip to the end for the demo.

Protocol Overview

Setup

  1. In the jurisdiction where Paula and Milena live, assume there is a trusted central document authority that provides proof services. Paula has requested a secret \(S\) from this authority, for use in proving her age. While the format of \(S\) is publicly known, each individual’s \(S\) should be kept secret. For this example, S is 256-bit string: 128 zero bits followed by 128 randomly generated bits. Let’s say Paula’s (S) is:
    0000000000000000000000000000000027ae41e4649b934ca495991b7852b855
  2. The Authority then calculates a hash-chain one link greater than Paula’s age:
    $$EncryptedAge = HASH^{ActualAge+1}(S)$$ If Paula is 19, the Authority would compute 20 iterative hashes to go from \(S\) to \(EncryptedAge\). As good cryptographic hash functions are not easily inverted, it would be difficult for anyone to deduce \(S\) or the \(ActualAge\) from the \(EncryptedAge\). This means we can distribute the \(EncryptedAge\) as needed, without fear that our \(ActualAge\) will be discovered. If we choose SHA-256 as our hash function, the intermediate links in the hash chain would be:
    4e1818b7cd1e72507c8ebf971f2bd8bbfa560a5cf84eef266b1e116ddaa0e6b8
    e7481bd7efa1f31aa4cf6be006ce1056fac57a0b5b342b2b39676015a9bca33d
    d289bf5cf27327fc523bc57d73c66c9498c3236cca421d9fc2c97385ab32ab8e
    40373f27714f237f996c879bc437cdd731834ee73abc63cd77264780543f867a
    71ee023df72331465e2159a63ba66ab471ead101fb4521641c2fdf61941e1fe5
    2bc37b76eeb40e7a44eabf5939e16af42b87fd5cda2f01e426f8f5173dde7a33
    26e35921f9d23b60b565bd745754fdc6c5dd7f6d42d741ee03564aae5e14f3a9
    1c0ff0ae6a812c3877493d1331179b756b64829a317b4e511135eba418558f9a
    427e0f352248a954ea88ba4da22e95866d1b61387583a116a56614407ab92593
    5549d4ac82ea11887f21cb41d1eea51ca121d229c7c6d3cdc377b7bcaa031527
    8b0b617a130a2cc6a15ed2db62e66ff7ba1827b2743231d1689dd52a547d5748
    eccc3327c86c36d21472b854a8f6e76096710621f96c3140f97fc8461b98eb81
    061b5bd19e635c190dde454f613bd3699632bcb361f9619cd6a66c87007de277
    cbf20848bfbea3debb1bae707d12679b6d5db6c959e121769eae2e501633298c
    7368d202186123781060583577877027be5fae147d1bd1e08cf8fdebcb24d4a1
    007a0e171ce5857094957e7c8a191c9feb3cf6ee89ebc57e10123b2abee513b8
    0abb61617926c9ae1207529dfdc158f04168da4326f4bc8297cf64eeb90eee2e
    651e303cd4d4be5a6ac4f3d40577b50e8526bf5edb5fbc574f1cbfa7350cdbda

and the final link in the chain, the \(EncryptedAge\), would be:

6e6e1d4af1752b9de688c00036f5915aa471ba9d6f0884b2375044f331677c35

  1. Finally, the Authority sends Paula a Proving Kit, consisting of the following fields:
    1. Paula’s name in plaintext
    2. A timestamp
    3. Paula's \(EncryptedAge\)

Challenge

  1. Paula, who is 19, receives Milena’s request to prove she is at least 18 years of age (The \(AgeToProve\) )

Construction

  1. As Paula is eager to go for a ride on Milena’s tandem bicycle, she immediately calculates:
    $$Proof = HASH^{1+ActualAge-AgeToProve}(S) = HASH^{2}(S)$$ In this case, Paula only calculates 2 links in the hash chain to get \(Proof\): 4e1818b7cd1e72507c8ebf971f2bd8bbfa560a5cf84eef266b1e116ddaa0e6b8
  2. Paula then sends Milena the following response:
    1. The Authority’s signed Proving Kit (which contains her \(EncryptedAge\) )
    2. The \(Proof\) she calculated

Verification

  1. Milena first verifies the Authority’s signature on the Proving Kit, and then opens it. She then verifies that the name is correct (Paula) and that the timestamp is valid (perhaps more important when seeking to verify an upper bound on someone’s age).
  2. Milena extracts Paula’s \(EncryptedAge\) from the Authority’s message.
  3. Milena then calculates:
    $$VerifiedAge = HASH^{AgeToProve}(Proof)$$ That is, she finishes the hash chain that Paula started.
  4. Milena checks if \(VerifiedAge = EncryptedAge\). If so, she is satisfied.

Milena, of course, could just as easily use this protocol to evaluate the sufficiency of Paula’s monthly salary or the size of Paula’s vinyl record collection, if they could agree on a trusted certification authority. Either way, the protocol should guarantee its users completeness, soundness, and zero-knowledge, all of which we will see in the next section.

Some Desirable Properties

This protocol is only useful if it

  1. Always convinces Milena when Paula is old enough (completeness)
  2. Never convinces Milena if Paula is not old enough (soundness)
  3. Leaks no information about Paula’s actual age to Milena (zero-knowledge)

Completeness

For any age \(AgeToProve \leq ActualAge\):

$$VerifiedAge = HASH^{AgeToProve}(Proof)$$ $$VerifiedAge = HASH^{AgeToProve}(HASH^{1+ActualAge-AgeToProve}(S))$$ $$VerifiedAge = HASH^{1+ActualAge}(S)$$ $$VerifiedAge = EncryptedAge$$

Soundness

For any age \(AgeToProve > ActualAge\), Paula would have to invert \(HASH(S)\) to send a convincing \(Proof\). The exception to this is for \(AgeToProve = ActualAge +1\), in which case Paula could just send \(S\). However, as the format of \(S\) is publicly known, a watchful verifier would detect the fraud (and, in addition, be able to answer future challenges on Paula’s behalf)

Zero-Knowledge

For Milena to learn \(S\), she would have to know \(ActualAge\) and be able to reverse \(HASH\).
For Milena to learn \(ActualAge\) she would have to reverse \(HASH\) or know \(S\) and generate its hash chain until she found \(Proof\).

Note that even if \(ActualAge=AgeToProve\), Milena would learn nothing, as the \(Proof=HASH(S)\)

And now, for the Live Demo!

We have put together a javascript implementation of this protocol for recreational purposes. The code is available on github.

\(S\) :

Setup

Challenge and Construction

Verification

If you don't trust the "verification" step above, you can complete the hash chain yourself by (repeatedly) using either of the following tools with the hexadecimal proof value from the demo:

Further

Trust in human society is a complex issue and even a good cryptographic hash function won’t solve the entirety of the problem. However, in the context of signed messages from trusted sources, Angel and Walfish’s approach can help share only the desired property about our hidden data (in this case, whether the age is above a threshold) without revealing the data itself.

For other numbers (like a bank balance), significantly more computation may be involved, as the hash chains could have millions of links (or more). While specialized hardware for hashing does exist, the relevant Authority could also encode metadata in its signed message (the units of the encrypted value for instance) that optimize for the desired use-case.

At Stratumn, we are excited to be developing real-world applications to help individuals and organizations share their data intelligently, relevantly, and securely. This demo is another link in the chain that we are building to connect these extraordinary cryptographic tools to real-world applications.

Photo by Emmanuel Huybrechts - Source