Suppose there is a group of N people some of whom possess a specific property. For example, their wealth is above or below a threshold, they voted for a particular candidate, they have a certain disease, etc. The group wants to find out how many of its members posses the property -- but without revealing the identities. Unless of course it turns out that all members do or do not have the attribute of interest. However, in all other cases the counting algorithm should guarantee that nobody can find out if a particular individual possesses the property unless all the other N-1 members of the group collude.
One particular instance of the above problem is Dr. Ecco's CentiMillionaire problem, published in Dr. Dobb's Journal, May 2000 issue.
We have 20 people around the world. They are all Centimillionaires, but none have more than $10 billion. They are rich enough, however, that they all round their wealth to the nearest $100 million. Several may have the same number of centimillions. They want to know how many centimillions the richest and poorest among the 20 have and how big those rich and poor subgroups are. Nobody should learn who else is in any subgroup (unless all or all but one are in some subgroup or there is large-scale collusion). You can assume that they will be honest in the information they give and in any calculations they are asked to do... They must communicate by phone.
Liane's solution requires that all 20 people communicate with each other at least once, nearly 200 phone calls. This present web page describes a method to determine the distribution of wealth among 20 centimillionaires with only 58 calls. This solution offers the same collusion protection: 19 people must collude to infer the 20th person's wealth. The last nineteen calls are just to announce the result, and can be replaced with a single conference call if allowed. Actually all the communication can be made public without revealing anything about the wealth of any particular centimillionaire.
The solution is a combination of a Diffie-Hellman key exchange and RSA algorithms. See an excellent web page Cryptography and Number Theory for Digital Cash by J. Orlin Grabbe for background information and references. The solution relies on the fact that a discrete log problem is very hard. Unlike RSA, each CentiMillionaire distributes p and q but keeps both encryption and decryption exponents private. If both p and q are chosen to be of the form 2^k+1, then every odd number less than (p-1)*(q-1) will do as an encryption exponent ei. Thus it appears one can't easily guess which exponent another CentiMillionaire has chosen.
The counting algorithm does not require any trusted third parties, and all communications between centimillionaires can be conducted in public.
Here is how twenty centimillionaires can determine how many of them fall within a particular wealth bucket.
Let centimillionaire C1 choose:
for each wealth bucket (1 through 100):
Let centimillionaire C2:
for each wealth bucket (1 through 100):
Let centimillionaire C20:
for each wealth bucket (1 through 100):
Round 1 (20 calls) is ended
Centimillionaire C1:
for each wealth bucket (1 through 100):
Centimillionaire C2:
for each wealth bucket (1 through 100):
Centimillionaire C20:
for each wealth bucket (1 through 100):
Suppose Centimillionaires C1 and C3 conspire to learn the wealth of
C2. They know:
x^e1, (x^e1)^e2, x^(e2*...e20), x^(w*e3...e20)
where w is either 1 or 2 (depending if C2's wealth falls within the
bucket under consideration). Alas, the only way they can figure w is
to find out e2 -- by solving a discrete logarithm problem. If e2 is
chosen large enough, the solution is not feasible. This also shows
that all conversation above can be conducted in public. Even knowledge
of intermediate results does not compromise each centimillionaire's
privacy.
oleg-at-okmij.org