The Homomorphic Interpreter: Building Iron-clad Program Obfuscation

In a previous post, I discussed how achieving truly secure computation might require a mix of technologies. I suggested that that mixing verifiable computation, (differential) privacy, and homomorphic encryption could yield a powerful layer of defenses. However, I mentioned that that scheme had one weakness, which was that the compute node could track the computations that it was asked to perform homomorphically. (Here and in the rest of this post, when I say “homomorphic computation,” I mean fully homomorphic and not partially homomorphic computation.)

It seems that an important extra primitive in secure computation will be the notion of a “homomorphic interpreter.” This is a new bit of jargon, so let me try to explain what I mean here. In homomorphic computation, traditionally the data is encrypted. A user sends a homomorphic computation to someone (the compute node) which has a copy of the homomorphically encrypted data. The compute node executes this computation. This scheme keeps the raw data secret, but not the program. The compute node can store the actual program and maintain a copy in perpetuity.

In some sense, this isn’t crazy. Smart contracts typically have their source or bytecode posted onto the blockchain, so this model could perhaps be viable at scale. At the same time, it feels like a hole in the homomorphic armor. Shouldn’t there be a way to prevent the leakage of the source code?

Let’s take a quick pause and review some basics of fully homomorphic encryption. Note that in theory, arbitrary computations can be performed in a homomorphic scheme. The challenge in practice is that the homomorphic encryption has to be periodically “refreshed.” In general, there’s something like a budget for multiplication operations, and once the budget has been used up, an expensive “refresh” operation is needed. As an extra bit of useful terminology, there is also a function, Evaluate which is used to run a computation on homomorphically encrypted data.

With this context in place, the core idea of a homomorphic interpreter is as follows: The Evaluate function takes in a homomorphic program (“circuit”) and encrypted data and outputs an encrypted answer. Encode this function as a homomorphic program that accepts an encrypted program and encrypted data and outputs an encrypted answer. This is the homomorphic interpreter. A compute node that runs a homomorphic interpreter can accept encrypted programs and data from different parties, execute it locally and return the answer, all without ever learned what program was run on what data.

This idea seems fairly straightforward. Why hasn’t it been done already? The main challenge is likely the fact that constructing complex homomorphic computations is still very challenging. Usually, homomorphic computations compute simple calculations which don’t require many refresh cycles. On the other hand, the Evaluate function probably requires a fairly deep circuit to implement, making it challenging to implement. Considerable work on improving the efficiency of homomorphic encryption will be needed before a homomorphic interpreter becomes feasible. At the same time, it seems such a useful primitive that it’s worth the effort.

1 Like

I found the following related links that might be useful:
https://news.ycombinator.com/item?id=13450015
https://blog.cryptographyengineering.com/2014/02/21/cryptographic-obfuscation-and/
https://reverseengineering.stackexchange.com/questions/45/usage-of-fhe-in-obfuscation
https://www.cs.auckland.ac.nz/~cthombor/Students/wzhu/wzhuthesis.pdf
https://link.springer.com/chapter/10.1007/11734628_18
https://arxiv.org/pdf/1710.01139.pdf

1 Like