Serious question.
I use ChatGPT somewhat regularly, signed in with my Google login. I ran a query yesterday about a new but fairly specific topic (nothing weird, just some questions about getting better at [skill]).
A day later, my Facebook is chock-a-block with ads for [skill]-related products and content. This is not something I search for or had explored prior to the ChatGPT conversation.
Just curious if this is a thing now, and whether my ChatGPT usage and content are going to fuel more "personalized marketing" everywhere else?
ChatGPT privacy policy
I've gone through the latest ChatGPT privacy update and read the whole policy. It's nice to know that we can opt our data out of training in account settings; I immediately did that.
So, from what I understand, they collect personally identifiable information, prompt data, and the responses. They supposedly use this for business, legal, training, and fraud-detection purposes.
When we request data deletion or delete our entire account, the policy says that not everything is deleted: some data is retained indefinitely for legal purposes.
What I couldn't gather from the policy are the answers to these questions, so it would be helpful if anyone could pitch in their thoughts:
What data is perpetually retained post-deletion?
What's the best strategy for giving OpenAI as little data as possible, like opening an account in the EU by spoofing location for better data rights?
After a session, does prompting "Delete your memory now" help in any way?
I know data under NDA shouldn't be shared with ChatGPT, but are there other ways of framing prompts that help with sensitive data handling?
Are there alternative GPTs that do better on privacy?
I've sent my ID over twice now through the portal, but OpenAI keeps blocking my request (see image). Any advice on next steps?
When you send a privacy request through OpenAI's portal, they send you a government ID verification request via Stripe. I have scanned my passport twice now and sent it over via this service. The first time it was rejected, I thought maybe the picture was too blurry (grasping at straws for a reason, basically, as it was clear anyway), so I took extra care with the second image. I followed the guidelines, and yet again it's been rejected.
I tried emailing OpenAI about this, and a chatbot (assumed) called Hetvi did not read my email and sent me generic advice about unticking the box to prevent ChatGPT from learning from your chats. I already know this (now). They didn't address my question, which was: is there a technical fault at play, or did you really not receive my ID? I've sent it twice now and something feels off…
It's a known strategy for companies with murky privacy procedures to make the process of getting a data request through more difficult or complex. I have no doubt in my mind that this is what's happening, so now I need a plan B.
I could contact the ICO, OpenAI (again), or Stripe for clarification. If anyone has been through this process before or has tips on how I can get my data request over the line, it would be really helpful!
Imagine, given the hype the project has seen, how much data they have. Almost everyone has tried it at some point, and even if you haven't, you've probably tried something that uses the ChatGPT API, so needless to say they have access to a ton of data. OpenAI is a company with employees and bills to pay, so they must make money somehow. It seems like they charge you for generated AI responses, but that's a minute amount that isn't going to break the bank. Not to mention, they hand out 5 USD of credit for free. This makes me feel like they must be monetizing your data one way or another. What do you think they do with the data you provide, other than training the AI?
Our organization is planning to build a chatbot using the OpenAI API, with a text document and a database as the knowledge base. My concern is that this data (the text document and database) contains sensitive information, such as emails. Will this data be exposed? Can I be assured we are safe using OpenAI's API?
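One mitigation worth considering regardless of what the policy guarantees: redact obvious identifiers before any text leaves your systems, so the API only ever sees placeholders. A minimal sketch, assuming simple regex-based redaction (the patterns and placeholder tokens are illustrative; real PII detection should use a dedicated library or service):

```python
import re

# Illustrative patterns only; these will miss plenty of real-world PII.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Replace obvious identifiers with placeholders before the text
    is embedded, indexed, or sent to any external API."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

# Only the redacted version leaves your infrastructure.
print(redact("Contact jane.doe@example.com or +1 555 123 4567"))
# -> Contact [EMAIL] or [PHONE]
```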
Hello, I want my account removed, all my data completely wiped, and I want everything about me to be completely forgotten.
OpenAI support: No way to do that.
Guys, even if you delete your account, your phone number stays in their system, and your Apple ID or Google account is permanently linked.
They link your account to more permanent identifiers: your phone number and your Apple and Google accounts.
All they do is profiling.
You were worried about the NSA collecting your metadata? Ha ha, OpenAI is doing just that, building a profile of you with every single prompt.
THERE IS NO DELETING.
We thought Google was evil; OpenAI is making Google look like a rookie.
I'd like to search and analyze my personal notes using OpenAI's models. With the data usage policy as it is, I don't feel comfortable doing that.
My concerns with the current policy:
- OpenAI employees and third-party contractors are allowed to read my private data without asking my permission.
- My private data is stored on the server for 30 days, making it vulnerable to data breaches.
What a better policy could look like
As an example, consider the privacy policy that applies when using Google Sheets:
The content you save on Google Docs, Sheets, & Slides is private to you, from others, unless you choose to share it. [...] Google respects your privacy. We access your private content only when we have your permission or are required to by law.
Unlike OpenAI, the service provider does not read my private data, giving me extra peace of mind when organizing and analyzing my data. Though the data is retained, that's technically necessary to provide the service.
For data retention, I would ask that it isn't retained except when technically necessary (such as to provide history in the playground). In particular, prompts I send with the API shouldn't be retained.
Addressing abuse concerns
As far as I understand, the reason for the weak privacy of the current policy is "to investigate and verify suspected abuse". To address this concern in a privacy-respecting way, perhaps the moderation endpoint could be made a prerequisite for receiving the stronger privacy guarantees. For my use case, that shouldn't be a problem, even with false positives, since I don't mind simply omitting some notes.
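To make that concrete, here is a minimal sketch of what a moderation-gated workflow could look like on the client side, using OpenAI's Python SDK. The gating policy (silently skip any flagged note) and the model name are my own illustrative assumptions, not something OpenAI offers today in exchange for privacy guarantees:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def analyze_note(note: str) -> str | None:
    """Run the note through the moderation endpoint first; only clean
    notes are forwarded to the completion model."""
    mod = client.moderations.create(input=note)
    if mod.results[0].flagged:
        # For this use case, silently skipping a note is acceptable,
        # even when the flag is a false positive.
        return None
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": f"Summarize this note:\n{note}"}],
    )
    return resp.choices[0].message.content
```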
Other use cases where better privacy would be good
I wanted to focus on my own use case since that's what I understand best, but here are some other scenarios where better privacy would be helpful:
- A company searching and analyzing its internal documents
- The personal use of the API or ChatGPT as a journal/confidante/therapist/etc.
I am making an app that will take user inputs and return some helper questions using AI.
I wanted to use the OpenAI API, but now I am concerned about how they handle data, as my users would be sharing some private stuff.
I don't want someone to sue me because of this...
Can someone help me, or recommend some other solution?
I am developing an app for a CRM client that uses a proprietary database with customer and sales data. They’re exploring a conversational AI interface, like OpenAI's ChatGPT, to enable users to query the database naturally.
The plan is to integrate this feature into platforms like WhatsApp. However, the client is concerned about protecting their proprietary data and ensuring it isn’t exposed or incorporated into OpenAI’s training models.
Could you provide guidance on safeguards or best practices to address this concern? Thank you
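One pattern worth evaluating, sketched below under my own assumptions (the schema and model name are made up): send the model only the database schema and the user's question, let it generate SQL, and execute that SQL entirely inside the client's own infrastructure, so customer and sales rows never reach OpenAI.

```python
from openai import OpenAI

client = OpenAI()

# Only the schema is shared with the model, never the row data.
SCHEMA = """
customers(id, name, region, created_at)
sales(id, customer_id, amount, sold_at)
"""

def question_to_sql(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system",
             "content": "Translate the user's question into a single "
                        f"read-only SQL query for this schema:\n{SCHEMA}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

# The generated SQL is then run against the client's own database;
# the results stay in-house instead of going back through the API.
sql = question_to_sql("What were total sales per region last month?")
```

OpenAI currently states that API data is not used for training by default, but a schema-only design limits exposure even if that changes; verify against the policy in force rather than taking this sketch's word for it.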
i’ve always wondered how private chatgpt truly is, so i spent a few hours reading their privacy policy.
here's what I found:
1. your chats become content that openai can use: by default, your messages, uploads, and feedback are treated as “user content” and can be used to improve and train models, unless you turn this off in data controls or use certain business plans.
2. conversations (and lots of metadata) are stored by default: chatgpt saves your chats, plus account details and usage/technical data (ip, device info, timestamps, etc.) to run and improve the service. this is the default behavior for regular users.
3. employees at openai or its vendors can review some data
content can be accessed by employees or service providers for safety checks, debugging, abuse detection, and other operational needs, so it is not something only "the model" ever sees.
4. your data can be shared with third-party service providers
openai explicitly shares personal information with vendors like cloud hosts, analytics tools, and customer-support providers, who also process and store parts of your data under openai's instructions.
5. it can be handed over to governments or courts
openai’s policy allows sharing your data with law enforcement or other authorities when required by law or to protect its rights and systems. So legal requests can override your expectations of privacy.
6. deleted or “temporary” chats aren’t a hard privacy guarantee
under normal policy, deleted and temporary chats are supposed to be removed from openai’s systems after about 30 days, but legal orders (like the new york times lawsuit) have forced openai to retain even deleted chats for much longer / indefinitely for many users.
7. long-term “memory” stores facts about you
chatgpt now has a memory system that can remember things like where you live, preferences, and ongoing projects across chats, unless you turn this off or manually clear memories. That's convenient, but it also means more persistent profiling of you on their servers.
8. consumer vs enterprise: very different guarantees
enterprise / team / edu / some api customers get stricter protections (no training on their data, controlled retention, etc.). ordinary free/plus users don't get that by default, so their experience is meaningfully less private.
9. plugins/tools/browsing can expose even more
when you use web browsing, custom gpts, or tools that call out to other services, extra data (urls, document content, third-party site data) can flow through multiple systems, expanding the number of parties that see your activity beyond just openai.
if privacy matters to you, the safest approach is to run models locally. if you don't have a gpu, use a privacy-focused platform like okara or lumo.
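for reference, "run models locally" can be as simple as the sketch below. it assumes Ollama is installed and a model has already been pulled (the model name is just an example); the request never leaves your machine:

```python
import requests

# assumes Ollama is running locally on its default port and that a
# model has been pulled beforehand, e.g. `ollama pull llama3`
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])  # generated entirely on your own hardware
```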
Last night I asked ChatGPT about something I don't usually ask about, and after that my Instagram was all about the topic (exactly the niche I was asking about). As I mentioned, it's an unusual subject and my Instagram was NEVER about it (I mostly get animal videos). I understand that once I watch one, the algorithm will keep pushing the same kind of videos at me, but it's just strange how that first one slid into my feed and was EXACTLY what I asked ChatGPT about. Do you guys have the same experience?
From OpenAI's "Privacy" Policy (emphasis mine):
"Data Submitted through the API (API Customers & End Users) - We store and process any data you chose to submit through the API in order to provide you with our API services. We may also use that data to ensure quality, secure and improve the API (and any related software, models, and algorithms), and to detect and prevent misuse. Some additional details:
How we may use it: In addition to processing submitted data to generate output for you, other ways we may use such data include: spot checks and analysis to detect bias or offensive content and improve ways to reduce such occurrences; large scale analysis of anonymized submissions and completions to generally improve the API and underlying models; train and refine semantic search models to improve our search capabilities; train and refine classifier models to identify such things as bias, sentiment, and the like; train and refine aligned models (e.g., the instruct series models) to generally improve future versions of the models; and analysis to troubleshoot technical issues.
How we do not use it: We do not use the data you submit through the API to train generative custom models for other API Customers. For example, we will not create a dataset of submissions to your chatbot application and use that data to train a chatbot for another customer.
Who can see it: Your data is only visible to a limited set of employees of OpenAI and its affiliates working on providing services and support to the API Customer, as well as to a small team of such employees monitoring for potential misuse.
How long we store it: We store your data only as long as is needed to provide you with our API services and monitor misuse."
Note how this is for the API overall, which doesn't give a wooden nickel about how an individual application classifies it. Also, this is complementary to but separate from OpenAI's Terms of Use, whether boilerplate or individualized.
Speaking of Terms of Use, from OpenAI's general Terms of Use (emphasis mine):
"(c) Submission of Content. OpenAI does not acquire any ownership of any intellectual property rights in the content that you submit to our APIs through your Application, except as expressly provided in these Terms. For the sole purpose of enabling OpenAI and its affiliates to provide, secure, and improve the APIs (and related software, models, and algorithms), you give OpenAI and its affiliates a perpetual, irrevocable, worldwide, sublicensable, royalty-free, and non-exclusive license to use, host, store, modify, communicate and publish all content submitted, posted or displayed to or from the APIs through your Application. When permissible under applicable Privacy Laws, the foregoing license survives consumer requests for deletion of personal data or Personal Information for the sole purpose of enabling OpenAI and its affiliates to provide, secure, and improve the APIs. Before you submit content to our APIs through your Application, you will ensure that you have the necessary rights (including rights from your end users) to grant us the license."
This is corp-speak for "if we legally can, we are going to completely ignore requests for personal information deletion from users :o)". Don't be fooled by the qualifying phrase "for the sole purpose of enabling OpenAI and its affiliates to provide, secure, and improve the APIs": it's so broad and vague that it basically means nothing beyond "we'll mostly keep it in-house, probably".
Now note the following, with the context that applications approved on a case-by-case basis (well, technically all applications, but particularly case-by-case ones), such as AI Dungeon, are apparently required to be reviewed by OpenAI (similar to how Apple reviews apps) whenever they intend to add new functionality.
From OpenAI's Use Case Guidelines:
"What happens if I don’t go through the production review process?
If you deploy without submitting a review, your API key may be immediately revoked. In some cases, loss of an API key may be permanent. If you deployed accidentally without realizing this policy, please do contact Support and we will work out the best way to get your application compliant.
How long does the production review process take?
Typically our production team will respond to your production review request within 3 business days.
Do I have to submit a production review every time I add a feature to my use case/application?
If you are adding a new capability to your use case, you do need to submit a production review request (eg you were approved for Google ads and now want to add blog ideas). If you are just updating your current application's interface or the like, a production review request is not required. A good heuristic here is that if you are adding a new prompt or positioning something as a new feature to users, you need to submit a production review request."
So basically if OpenAI is grumpy and you're a case-by-case developer, your attempt to add a new feature can suddenly become "actually you need to add these risk mitigation functions to your application now or we'll revoke your API key :o)"
But why the silence? Why the lack of communication?
From OpenAI's Terms of Use (emphasis mine):
"4. Confidentiality (a) You may be given access to certain non-public confidential or proprietary information of OpenAI, its affiliates and other third parties, including, software and specifications related to the APIs and OpenAI’s and its affiliates’ algorithms, software, models, and systems, or other business information (collectively "Confidential Information"). Confidential Information includes any information that OpenAI or its affiliates consider confidential or would normally be considered confidential under the circumstances. You may use Confidential Information only as necessary in exercising your rights under these Terms. You may not disclose any Confidential Information to any third party without our prior written consent, and you agree that you will protect this Confidential Information from unauthorized use, access, or disclosure in the same manner that you would use to protect your own confidential and proprietary information of a similar nature and in any event with no less than a reasonable degree of care."
Looks like production review decisions, and possibly even the fact that a decision was made ("other business information" is a disgustingly broad and vague term), are confidential unless OpenAI says otherwise. We wouldn't want other developers to know what they can or can't get away with now, would we? And since OpenAI unilaterally defines confidentiality, an OpenAI spokesperson is able to comment more than the developers can.
All this stated for all to see, I end with this (emphasis mine):
"(b) Confidential Information does not include any information that: (i) is or becomes generally available to the public through no fault of yours; (ii) you already possess without any confidentiality obligations when you received it under these Terms; (iii) was or is later rightfully disclosed to you by a third party without any confidentiality obligations; (iv) we approved for release in writing; or (v) you independently developed without using or referencing any Confidential Information. You may disclose Confidential Information when required by law or the valid order of a court or other governmental authority if you give reasonable prior written notice to OpenAI of the disclosure."
Now, given that what I've stated is, in fact, "information", that it has, in fact, by virtue of this posting become "generally available to the public", and that I am unassociated with the developers, so that this happened "through no fault of yours" (theirs), the fact that the situation is the product of OpenAI decision and policy is now excluded from confidentiality, which opens the potential for at least limited developer comment.
A friend saw me using ChatGPT while signed in; he never does that and suggested I use it without an account. Is it concerning to use it signed in?
He says the AI will gather too much information and build a kind of profile of you: whatever you ask is kept and keeps accumulating for years.
I didn't fully get his point, but is it something to be concerned about?
In a prototype phase, we want to language-process confidential data that should not be accessible to anyone external...
There was a great post on this exact issue a year ago, where Azure OpenAI was the most recommended solution.
They, however, have a content filtering/abuse monitoring policy in place which, when content is flagged, allows Microsoft personnel to access prompts and responses. There is a way to opt out of that, but only for "managed customers", and it's apparently not possible for ordinary mortals to become one.
Since we're in the prototyping phase, self-hosting is too expensive based on my current research: either renting a dedicated GPU (>$1,000/mo) or running our own hardware, which is a whole different rabbit hole and advised against for non-hobby applications...
A potential alternative could be usage-priced services that run open-source models (e.g. Llama), like HuggingFace with their Serverless Inference API (https://huggingface.co/docs/api-inference/en/security) or Together.ai (whose policy doesn't explain well what happens with user data). But these just feel less "legit" than Azure with its huge enterprise context.
Does anyone have experience with such a provider, or knows of any alternative options?
What's the opinion here on Azure with the "employees can check" loophole vs. a less enterprise-focused service that promises "user data and tokens are not stored"?
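For comparison, the HuggingFace route mentioned above looks roughly like this. A minimal sketch assuming the huggingface_hub client; the model name is illustrative, the token is a placeholder, and the retention claims should be checked against their linked security docs rather than taken from here:

```python
from huggingface_hub import InferenceClient

# Token comes from your HuggingFace account settings; the model name
# is illustrative, any hosted open-weights chat model works.
client = InferenceClient(
    model="meta-llama/Llama-3.1-8B-Instruct",
    token="hf_...",  # placeholder
)

resp = client.chat_completion(
    messages=[{"role": "user", "content": "Summarize: the meeting moved to 3pm."}],
    max_tokens=50,
)
print(resp.choices[0].message.content)
```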
OpenAI has introduced the ability to turn off chat history in ChatGPT. If you start a conversation when chat history is disabled, that chat won’t be used to train and improve our models, and it won’t appear in the history sidebar. These controls can be found in ChatGPT’s settings and can be changed at any time. When chat history is disabled, OpenAI will retain new conversations for 30 days and review them only when needed to monitor for abuse, before permanently deleting.
Learn more here: https://openai.com/blog/new-ways-to-manage-your-data-in-chatgpt
And check out the Help Center if you have any other questions: https://help.openai.com/en/articles/7730893-data-controls-faq
I just learned today about the ongoing partnership OpenAI has with Reddit. For two years now, every comment and post has served to train their AI models. I am absolutely horrified by this. I have to assume there is no possible way to opt out of this, but if there is, would anyone please let me know.