
June 30, 2025

AI Embeddings: Not Necessarily Secure Anymore

We discuss recent research reverse engineering AI embeddings.


AI tools today rely heavily on "embeddings": long lists of numbers that represent text, images, or other data. Whether your team is building personalized chatbots, improving search relevance, or analyzing customer sentiment, you're likely using embeddings generated by models from vendors like OpenAI.

A paper published in May 2025, Harnessing the Universal Geometry of Embeddings, discovered something remarkable: behind the scenes, most embedding models share an almost identical "latent geometry." Think of embedding models as experts translating text into numerical vectors. Each expert gives slightly different answers based on their particular training, but all capture similar meanings and structures. Just as two skilled experts might choose different wordings to express the same idea, different embedding models produce different numerical vectors for the same text. Underneath, they still rely on a common web of meanings and concepts.
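A toy sketch can make the shared-geometry idea concrete (this is an illustration, not the paper's method): if two hypothetical embedding models were simply different orthogonal rotations of one underlying "meaning" space, their vectors for the same texts would look entirely different, yet the pairwise similarity structure would be identical. All names and dimensions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: shared underlying "meaning" vectors for four texts.
latent = rng.normal(size=(4, 8))

def random_rotation(dim):
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix.
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q

# Two hypothetical "vendors" embed the same texts with different rotations...
emb_a = latent @ random_rotation(8)
emb_b = latent @ random_rotation(8)

def cosine_matrix(x):
    # Pairwise cosine similarities between all rows of x.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T

# ...so the raw vectors differ, but the geometry is identical.
print(np.allclose(cosine_matrix(emb_a), cosine_matrix(emb_b)))  # True
```

Because orthogonal rotations preserve angles and distances, the two simulated models disagree on every coordinate while agreeing perfectly on which texts are similar to which.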

The researchers harnessed this insight to build a tool called vec2vec, which can translate embeddings between entirely different models, even models created independently by separate companies, without ever needing the original sentences or documents the vectors represent. Imagine converting information from Google's embedding model into OpenAI's embedding model without having to reprocess the original documents.
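As a simplified illustration of translating between embedding spaces, the sketch below aligns two toy spaces with orthogonal Procrustes, a classical technique that requires paired examples; the actual vec2vec method is unsupervised and works without any pairs, so this is only a conceptual stand-in. Dimensions and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: model B's embedding space is an unknown orthogonal
# transform of model A's space.
emb_a = rng.normal(size=(100, 16))
true_map = np.linalg.qr(rng.normal(size=(16, 16)))[0]
emb_b = emb_a @ true_map

# Orthogonal Procrustes: find the rotation W minimizing ||A W - B||,
# given paired embeddings of the same texts from both models.
u, _, vt = np.linalg.svd(emb_a.T @ emb_b)
w = u @ vt

# Translate a previously unseen model-A embedding into model-B space.
new_a = rng.normal(size=(1, 16))
translated = new_a @ w
print(np.allclose(translated, new_a @ true_map))  # True
```

Once such a map is learned, any model-A vector can be carried into model-B's space without touching the source text, which is exactly why embeddings stored in one vendor's format offer little isolation from tools built against another's.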

This breakthrough brings serious new privacy risks. The team demonstrated that they could extract sensitive details, such as names, dates, and financial specifics, from embedding vectors alone. They showcased the vulnerability by recovering personal information directly from the embeddings of Enron employee emails.

This also challenges the widespread assumption that embeddings inherently obscure or anonymize information. In reality, embedding vectors can leak confidential details if an attacker employs advanced techniques like vec2vec.

For executives, this means it's critical to revisit how your organization treats embeddings. They should no longer be considered safe to share publicly or store unencrypted. Compliance and risk management teams will now need to classify embeddings similarly to protected customer data, ensuring appropriate governance, access controls, and encryption.
