Three Questions That Will Tell You More About an AI Vendor's Security Than Any Demo

Before you sign a contract or let a tool touch client data, ask these three architectural questions. The answers will tell you everything the sales deck won't.

By John Margerison · Sat, 23 May 2026 · 2330 words

The demo is always clean. The data flows look elegant on the slide. The security page says "enterprise-grade encryption" and shows a SOC 2 badge. Then you sign, deploy, and six months later you're explaining to a client why their confidential documents passed through a server you didn't fully understand, were retained longer than you were told, and are now sitting in a legal hold you didn't know existed.

This is not a hypothetical. It is the pattern.

The gap between what AI vendors show in procurement and what their architecture actually does is, right now, one of the most underexamined risks in enterprise technology. Not because vendors are lying. Because the questions buyers ask are too soft. "Is your data secure?" is not a question. It is an invitation to recite marketing copy. The questions below are different. They are specific enough to require a real answer, and the quality of that answer will tell you more about the vendor than any reference call.

Ask them before you sign. Ask them before the pilot touches real data. Ask them in writing, and require written responses.

QUESTION ONE: At what point in the request flow does your server see plaintext, and what does that plaintext contain?

WHY THIS QUESTION MATTERS

Every AI tool that processes natural language has to read your text at some point. That is not a flaw. It is how the technology works. The question is not whether plaintext exists. The question is where, when, and in what form.

"Encrypted in transit" - a phrase that appears in nearly every enterprise AI vendor's security documentation - means the data is encrypted while traveling between your browser and their server. It says nothing about what happens once it arrives. The moment the request hits the inference layer, the model reads plaintext. That is the point of exposure. The question is what surrounds that moment: whether prompts are logged or stored, who has access to those logs, whether the request flows through subprocessors you haven't vetted, and whether the plaintext includes metadata your employees didn't realize they were sending.

Samsung learned this in April 2023. Within twenty days of allowing engineers to use ChatGPT, three separate incidents occurred in which employees pasted proprietary semiconductor source code, equipment defect detection algorithms, and transcripts of confidential internal meetings directly into the chat interface. The data reached OpenAI's servers with no NDAs in place, no data residency controls, and no mechanism to delete what had been submitted. Samsung banned the tool company-wide. The source code was already gone.

The risk is not always that dramatic. It can be quieter. A contract lawyer pastes a draft agreement to ask for redline suggestions. A finance analyst uploads a revenue model for formatting help. A sales lead shares a client proposal for a grammar check. Each of those actions sends plaintext to a server. Whether that server logs it, stores it, routes it through a third-party subprocessor, or uses it to improve the model depends entirely on the vendor's architecture. And most buyers never ask.

WHAT A GOOD ANSWER LOOKS LIKE

A good answer is specific and architectural. The vendor should be able to tell you: prompts are decrypted at the inference layer, processed in memory, and not written to persistent storage. They should name any subprocessors that touch the request (model providers, logging services, abuse detection systems). They should confirm whether prompt content is logged for debugging, and if so, who can access those logs and for how long. If they offer a zero-data-retention mode, they should explain what that means technically, not just as a setting.
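
One way to hold a vendor to that level of specificity is to write their answer down as data and see where plaintext actually lives. The sketch below is illustrative only: the Hop fields, the system names, and the 30-day figure are assumptions made up for the example, not a description of any real vendor's architecture.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One system that touches a prompt between submission and response."""
    name: str
    sees_plaintext: bool      # can this system read the prompt content?
    persists_content: bool    # does it write prompt content to durable storage?
    is_subprocessor: bool     # is it operated by a third party?
    retention_days: int = 0   # how long persisted content is kept (0 = not kept)


def exposure_points(flow: list[Hop]) -> list[str]:
    """Return one finding for every hop where the prompt is readable."""
    findings = []
    for hop in flow:
        if not hop.sees_plaintext:
            continue
        detail = "in memory only"
        if hop.persists_content:
            detail = f"stored for {hop.retention_days} days"
        owner = "subprocessor" if hop.is_subprocessor else "vendor"
        findings.append(f"{hop.name} ({owner}): plaintext {detail}")
    return findings


# Hypothetical flow, invented for illustration.
EXAMPLE_FLOW = [
    Hop("load balancer", sees_plaintext=True, persists_content=False,
        is_subprocessor=False),
    Hop("abuse-detection service", sees_plaintext=True, persists_content=True,
        is_subprocessor=True, retention_days=30),
    Hop("inference layer", sees_plaintext=True, persists_content=False,
        is_subprocessor=True),
    Hop("billing metadata store", sees_plaintext=False, persists_content=False,
        is_subprocessor=False),
]

if __name__ == "__main__":
    for finding in exposure_points(EXAMPLE_FLOW):
        print(finding)
```

If the vendor cannot supply enough detail to fill in every field for every hop, that gap is itself the finding.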

WHAT A BAD ANSWER LOOKS LIKE

"We use AES-256 encryption and TLS 1.2." That is an answer to a different question. If the vendor cannot describe the plaintext lifecycle, they either don't know their own architecture or they're hoping you won't push. Both are problems.

HOW TO FOLLOW UP

Ask for a data flow diagram that traces a single user prompt from submission to response, including every system that touches the request. Ask whether the vendor has a subprocessor list and whether it is contractually frozen or subject to change. Ask whether their SOC 2 Type II report covers the inference layer specifically, or only the surrounding infrastructure.

QUESTION TWO: What is your data retention policy, and is it contractually binding rather than a settings toggle a future employee could change?

WHY THIS QUESTION MATTERS

Data retention is where the gap between the sales conversation and the legal reality is widest. Almost every AI vendor has a settings panel that lets enterprise customers turn off training data collection or reduce retention windows. Almost none of those settings are contractually binding by default.

The distinction matters enormously. A settings toggle is a current configuration. A contract clause is a legal obligation. A settings toggle can be changed by a vendor employee responding to a product decision, a policy update, or a terms-of-service revision that arrives by email with thirty days' notice. A contract clause requires your signature to change.

The FTC made this explicit in January 2025, warning that AI companies that quietly change their terms of service to expand data use may be engaging in unfair or deceptive practices. The agency noted that it has previously required companies that unlawfully obtained consumer data to delete not just the data but the models trained on it. That is not a theoretical consequence. It is an enforcement posture that is already active.

Slack provided a preview of how this plays out at scale. In 2023, users discovered that Slack's privacy policy permitted the company to use customer messages, content, and files to train AI and machine learning models. The policy had been updated without prominent notice. The backlash was significant. Slack clarified and adjusted its language, but the episode illustrated a structural problem: enterprise customers had been operating under an assumption about data use that the terms of service did not support.

The question is not whether your vendor currently has good retention settings. The question is whether those settings are locked by contract, and what happens when the vendor's business model evolves.

WHAT A GOOD ANSWER LOOKS LIKE

The vendor should offer a Data Processing Agreement (DPA) that specifies retention periods as binding obligations, not defaults. The DPA should state explicitly that your data will not be used to train or improve models, that it will be deleted within a defined window after the session or contract ends, and that any change to those terms requires your written consent. The vendor should be able to point you to the specific clause, not describe it in general terms.
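
The toggle-versus-clause distinction can be made mechanical when you review a vendor's promises. The following is a minimal sketch under stated assumptions: the Commitment fields, the source labels, and the clause reference "DPA 4.2" are hypothetical names chosen for the example, and the rule it encodes (a promise counts only if it sits in the DPA, cites a clause, and cannot change without your written consent) is one reading of the criteria above, not legal advice.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commitment:
    """One promise a vendor has made about your data."""
    topic: str
    source: str                        # "dpa", "settings", "policy", or "verbal"
    clause_ref: Optional[str] = None   # where it appears in the DPA, if anywhere
    change_requires_consent: bool = False


def is_binding(c: Commitment) -> bool:
    """Binding means: in the DPA, tied to a clause, and not changeable unilaterally."""
    return (
        c.source == "dpa"
        and c.clause_ref is not None
        and c.change_requires_consent
    )


def unbound_topics(commitments: list[Commitment]) -> list[str]:
    """List the topics that rest on a toggle, a policy page, or a conversation."""
    return [c.topic for c in commitments if not is_binding(c)]


# Hypothetical review, invented for illustration.
EXAMPLE_COMMITMENTS = [
    Commitment("no model training", source="settings"),
    Commitment("30-day retention", source="dpa", clause_ref="DPA 4.2",
               change_requires_consent=True),
    Commitment("deletion on termination", source="dpa", clause_ref=None),
    Commitment("frozen subprocessor list", source="verbal"),
]

if __name__ == "__main__":
    for topic in unbound_topics(EXAMPLE_COMMITMENTS):
        print(f"Not contractually binding: {topic}")
```

Every topic this prints is a negotiation item to resolve before signature.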

WHAT A BAD ANSWER LOOKS LIKE

"You can turn off training in your settings." Or: "Our default is not to train on enterprise data." These are policy statements, not contractual commitments. Policy can change. Ask what happens to your data if the vendor is acquired. Ask what happens if they update their terms of service. If the answer is "we'll notify you," that is not protection. That is notice.

HOW TO FOLLOW UP

Request the DPA before the contract is signed, not after. Look for the specific clause that addresses model training and data deletion. Ask whether the retention period is the same for all subprocessors or only for the vendor's own systems. Ask for a written confirmation of what happens to your data upon contract termination, including the timeline for deletion and the mechanism for verification.
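
The subprocessor question matters because your real exposure window is set by the slowest system to delete, not by the vendor's headline number. A short sketch of that arithmetic follows; the system names and day counts are invented for illustration, and it assumes each retention window starts on the termination date, which you should confirm against the contract.

```python
from __future__ import annotations

from datetime import date, timedelta


def effective_retention_days(windows: dict[str, int]) -> int:
    """The longest retention window across the vendor and its subprocessors."""
    if not windows:
        raise ValueError("at least one retention window is required")
    return max(windows.values())


def deletion_deadline(termination: date, windows: dict[str, int]) -> date:
    """Earliest date by which every listed system should have deleted your data."""
    return termination + timedelta(days=effective_retention_days(windows))


# Hypothetical retention windows in days, invented for illustration.
EXAMPLE_WINDOWS = {
    "vendor application": 30,
    "model provider": 30,
    "logging service": 90,
}

if __name__ == "__main__":
    deadline = deletion_deadline(date(2026, 1, 1), EXAMPLE_WINDOWS)
    print(f"Request deletion verification on or after {deadline.isoformat()}")
```

In this example the vendor's own 30-day figure is accurate and still misleading: the logging service holds the data for 90 days, so verification cannot meaningfully happen before 2026-04-01.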

QUESTION THREE: What happens if you are subpoenaed for our content, and have you been?

WHY THIS QUESTION MATTERS

This is the question that gets the most uncomfortable silence in vendor conversations. It is also the most important one.

When your employees use an AI tool, the prompts and outputs they generate are data that exists on a third-party server. That data is subject to legal process. A government agency, a regulator, a litigant in a civil case, or a law enforcement body can serve the vendor with a subpoena or a warrant and request that content. The vendor is not your attorney. They do not have attorney-client privilege. They cannot refuse a valid legal order.

This is not theoretical. OpenAI's transparency report covering January through June 2025 disclosed that the company received 119 requests for user account information, 26 requests for chat content, and one emergency request from law enforcement during that six-month period. The company provided data from 132 accounts in response to those requests. These numbers are relatively small compared to the volume of requests received by major search and social media platforms, but the category is new and growing. Generative AI chat content is now appearing as evidence in criminal and civil cases.

The first known federal warrant requiring OpenAI to conduct a reverse search using prompts to identify an unknown user was reported in 2024. The legal framework is still forming. But the direction is clear: AI chat logs are discoverable, and vendors will comply with valid legal process.

For most enterprises, the risk is not criminal investigation. It is civil litigation, regulatory inquiry, or a competitor's discovery request in a commercial dispute. If a client's confidential strategy, a draft acquisition memo, or a sensitive HR matter was processed through an AI tool, that content may be reachable by opposing counsel in ways that would not have been possible before the tool existed.

WHAT A GOOD ANSWER LOOKS LIKE

A good answer has two parts. First, the vendor should describe their legal process policy: what standard of legal process they require before disclosing content (a warrant, not just a subpoena, for content data), whether they notify customers before complying where legally permitted, and whether they challenge overbroad requests. Second, they should answer the second half of the question honestly. Have they been subpoenaed? If yes, what was the outcome? A vendor that publishes a transparency report is demonstrating a level of accountability that a vendor without one is not.

WHAT A BAD ANSWER LOOKS LIKE

"We take privacy very seriously and comply with all applicable laws." That is not an answer. It is a statement that they will hand over your data when required. Ask whether they have ever received a government request for enterprise customer content specifically. Ask whether their legal process policy is published. Ask whether there is a contractual notification obligation if your data is the subject of a legal request.

HOW TO FOLLOW UP

Ask for a copy of the vendor's law enforcement guidelines or government data request policy. Ask whether the contract includes a notification clause requiring the vendor to inform you if your data is subject to legal process, to the extent permitted by law. Ask whether the vendor has ever challenged a government data request on behalf of a customer, and if so, what the outcome was. Ask whether data minimization, meaning the practice of not retaining content beyond what is needed for the immediate session, is available and whether it would reduce your exposure.

THE CHECKLIST: PASTE THIS INTO YOUR RFP

The following questions are formatted for direct use in a vendor questionnaire or RFP. Require written responses. Attach them to the contract as representations where possible.

---

SECTION A: DATA ARCHITECTURE AND PLAINTEXT EXPOSURE

A1. Provide a data flow diagram tracing a single user prompt from submission to model response, identifying every system, service, or subprocessor that processes the request.

A2. At what point in the request flow is user content in plaintext? Who has access to that plaintext, and for how long?

A3. Is prompt content logged for any purpose (debugging, abuse detection, quality assurance)? If yes, specify retention period, access controls, and whether logs are accessible to vendor employees.

A4. Provide your current subprocessor list. Is this list contractually frozen, or can it be updated without customer consent?

A5. Does your SOC 2 Type II report cover the inference layer and prompt processing, or only surrounding infrastructure? Provide the most recent report.

---

SECTION B: DATA RETENTION AND CONTRACTUAL COMMITMENTS

B1. What is your data retention period for prompt content and model outputs? Is this period the same across all subprocessors?

B2. Is your data retention policy reflected in a contractually binding Data Processing Agreement (DPA), or is it a configurable default setting?

B3. Does your DPA explicitly prohibit the use of customer data to train, fine-tune, or improve AI models? Provide the specific clause.

B4. What happens to our data upon contract termination? Specify the deletion timeline and the mechanism by which we can verify deletion.

B5. If you update your terms of service or privacy policy in a way that affects data use, what notice do you provide, and does the change require our written consent to take effect?

---

SECTION C: LEGAL PROCESS AND GOVERNMENT REQUESTS

C1. What standard of legal process do you require before disclosing customer content to law enforcement or government agencies? (Specify: subpoena, court order, warrant, or equivalent.)

C2. Do you publish a transparency report disclosing the number and type of government data requests you receive? If yes, provide the most recent report.

C3. Have you ever received a government or legal request for enterprise customer content specifically? If yes, describe the outcome without disclosing confidential details.

C4. Does your standard contract include a notification clause requiring you to inform customers if their data is subject to legal process, to the extent permitted by law?

C5. Do you offer a data minimization or zero-retention mode that reduces the volume of customer content stored on your servers? If yes, describe the technical implementation and any limitations.

---

One more thing. After you receive the written answers, have your general counsel read them alongside the actual contract. The answers to these questions are only as good as the legal obligations that back them up. A vendor who answers well in a questionnaire but whose contract says nothing about retention, subprocessors, or legal process notification has told you everything you need to know about which document controls.

Takeaways

Before any pilot touches real data, require a written data flow diagram showing every system that processes a user prompt, including subprocessors, and confirm who has access to plaintext and for how long.
Request the Data Processing Agreement before signing, not after, and verify that data retention limits and the prohibition on model training are contractually binding clauses, not configurable defaults.
Ask your AI vendor directly whether they have ever received a government or legal request for enterprise customer content, and require a notification clause in the contract obligating them to inform you if your data becomes subject to legal process.
Run all vendor questionnaire responses past general counsel alongside the actual contract text. If the contract is silent on what the vendor promised verbally or in writing, the contract wins.
Implement a data minimization policy internally: train employees to avoid pasting confidential client data into any AI tool until the architecture questions above have been answered in writing and reflected in the contract.
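
Training can be backed by a simple technical guardrail. The sketch below shows one possible form of the internal data minimization step: masking obvious identifiers before a prompt leaves your network. It is an assumption-laden illustration, not a data loss prevention product. The two patterns are deliberately simple examples, they will miss most confidential content (contract terms, source code, strategy), and the function name and labels are invented for this example.

```python
from __future__ import annotations

import re

# Illustrative patterns only. Real deployments need patterns tuned to their
# own data, and pattern matching cannot recognize confidential prose.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each pattern match with a [LABEL] placeholder.

    Returns the redacted text and a count of replacements per label,
    so the caller can log what was removed without logging the content.
    """
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts


if __name__ == "__main__":
    prompt = "Contact jane.doe@example.com, SSN 123-45-6789."
    clean, removed = redact(prompt)
    print(clean)    # Contact [EMAIL], SSN [US_SSN].
    print(removed)  # {'EMAIL': 1, 'US_SSN': 1}
```

A filter like this reduces accidental exposure at the margins. It does not replace the contractual protections above, because the content most worth protecting rarely matches a pattern.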

Sources

OpenAI Report on Government Requests for User Data, January-June 2025 (openai.com)
FTC Blog: 'AI Companies: Uphold Your Privacy and Confidentiality Commitments,' January 9, 2024 (ftc.gov)
FTC Blog: 'AI (and other) Companies: Quietly Changing Your Terms of Service Could Be Unfair or Deceptive,' January 17, 2025 (ftc.gov)
DataFence: 'Cloud Data Loss Prevention: Samsung ChatGPT Ban After Engineers Leaked Source Code,' October 2024 (datafence.com)
Cybersecurity Law Report: 'Gen AI Chats Becoming Evidence: Law Enforcement Warrants and Subpoenas,' 2024
IBM Cost of a Data Breach Report 2025, Ponemon Institute / IBM (ibm.com)
Slack privacy policy backlash coverage, 2023 (multiple sources)
BigGo News: 'OpenAI Faces Dual Legal Fronts: Government Data Demands and Nonprofit Subpoenas,' 2025