A group of researchers has published a study that once again raises alarm bells about privacy when using AI. They have demonstrated that it is possible to recover the exact prompt a user typed when asking a chatbot something, and that puts AI companies in a delicate position: they can, more than ever, know everything about us.
A terrifying study. If someone tells you that "Language models are injective and, therefore, invertible," you will probably just shrug. But that is the title of a study by European researchers explaining that large language models (LLMs) have a major privacy problem. And they have it because the transformer architecture is built that way: each distinct prompt corresponds to a distinct "embedding" in the model's latent space.
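To make that injectivity claim concrete, here is a minimal sketch (my own illustration, not code from the paper) using the small open model GPT-2 via the Hugging Face transformers library: two different prompts end up at measurably different points in the model's latent space. The prompts and the choice of model are arbitrary assumptions for the example; the study's contribution is proving that this separation holds in general, not just checking it empirically.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def last_hidden_state(prompt: str) -> torch.Tensor:
    # Hidden state of the final token at the last transformer layer.
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model(**inputs, output_hidden_states=True)
    return outputs.hidden_states[-1][0, -1]

h1 = last_hidden_state("What is my bank account balance?")
h2 = last_hidden_state("What is my bank balance?")

# Two different prompts land on measurably different points in latent space.
print(torch.dist(h1, h2).item())  # non-zero distance for distinct prompts
```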
A sneaky algorithm. In developing their theory, the researchers created an algorithm called SIPIT (Sequential Inverse Prompt via ITerative updates). It reconstructs the exact input text from the model's hidden activations (its internal states), with a guarantee that it does so in linear time. In other words: you can get the model to "talk" easily and quickly.
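The paper's actual SIPIT procedure is more refined than anything shown here, but the core idea can be sketched as follows (a hedged reconstruction under my own assumptions, not the authors' code): because a causal transformer's hidden state at position t depends only on the tokens up to t, an attacker who holds the leaked hidden states can recover the prompt one token at a time, which is why the cost grows only linearly with the prompt's length. The brute-force search over the whole vocabulary below is impractically slow and purely illustrative, and the example prompt is invented.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def hidden_states(ids: list[int]) -> torch.Tensor:
    # Last-layer hidden states for a token sequence, shape (seq_len, dim).
    out = model(torch.tensor([ids]), output_hidden_states=True)
    return out.hidden_states[-1][0]

# The "leaked" hidden states of a secret prompt: the only thing the attacker sees.
secret_ids = tokenizer("my password is hunter2").input_ids
leaked = hidden_states(secret_ids)

# Recover the prompt token by token: at each position, keep the vocabulary token
# whose hidden state best matches the leaked one (brute force, illustration only).
recovered: list[int] = []
for t in range(len(secret_ids)):
    best_tok, best_err = None, float("inf")
    for v in range(tokenizer.vocab_size):
        err = torch.dist(hidden_states(recovered + [v])[t], leaked[t]).item()
        if err < best_err:
            best_tok, best_err = v, err
    recovered.append(best_tok)

print(tokenizer.decode(recovered))  # should reconstruct the original prompt
```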
What does this mean. What all this means is that it is possible to find out exactly what you asked an AI model. In reality, it is not the visible answer that gives you away, but the hidden states or embeddings the model computes on its way to that final answer. That's a problem, because AI companies store these internal states, which would theoretically allow them to recover the input prompt with absolute accuracy.
But many companies already save the prompts. That's true, but this "injectivity" creates an additional privacy risk. Many embeddings or internal states are stored for caching, for monitoring and diagnostics, and for personalization. If a company deletes only the plain-text conversation but not those stored embeddings, the prompt remains recoverable from them. The study shows that any system storing hidden states is effectively handling the input text itself.
Legal impact. There is also a worrying legal component here. Until now, regulators and companies argued that internal states did not count as "recoverable personal data," but this invertibility changes the rules of the game. If an AI company tells you "don't worry, we don't save your prompts" but it does save the hidden states, that theoretical privacy guarantee is worth very little.
Possible data leaks. A priori it does not look easy for a potential attacker to pull this off, because they would first need access to those embeddings. But a security breach that leaks a database of those internal hidden states could no longer be treated as an exposure of "abstract" or "encrypted" data; it would amount to a plain-text source from which, for example, financial details or passwords that a company or user had typed into the AI model could be extracted.
Right to be forgotten. The injectivity of LLMs also complicates compliance with data protection rules such as the GDPR and its "right to be forgotten." If a user requests the complete deletion of their data from a company like OpenAI, that company must make sure it deletes not only the visible chat logs but also all internal representations (embeddings). If any hidden state persists in a log or cache, the original prompt remains potentially recoverable.
Image | Levart Photographer