Snapchat recently rolled out My AI, a chatbot built on OpenAI’s GPT technology, to Snapchat+ subscribers, but drew backlash over some of the chatbot's controversial responses. The company has now released a revised version with safety enhancements and other improvements.
Snapchat has also learned about potential avenues for misuse, many of them discovered when people tried to trick the chatbot into providing responses that do not conform to its guidelines. As part of its ongoing work to improve My AI, Snapchat is sharing an update on some of these safety enhancements and learnings, along with new tools the platform plans to implement.
My AI’s Approach to Data
The platform handles data from conversations between friends on Snapchat differently from data tied to broadcast content, which Snapchat holds to a higher standard and requires to be moderated because it reaches a large audience. Since My AI is a chatbot and not a real friend, the company has been deliberate in treating the associated data differently: it uses the conversation history to keep making My AI more fun, useful, and safer, the platform says.
All messages with My AI will be retained unless users delete them. Being able to review these early interactions with My AI has helped identify which guardrails are working and which need to be made stronger. To help assess this, Snapchat has been running reviews of the My AI queries and responses that contain “non-conforming” language, which Snapchat defines as any text that includes references to violence, sexually explicit terms, illicit drug use, child sexual abuse, bullying, hate speech, derogatory or biased statements, racism, misogyny, or marginalizing underrepresented groups. All of these categories of content are explicitly prohibited on Snapchat.
The most recent analysis found that only 0.01% of My AI’s responses were deemed non-conforming. Examples of the most common non-conforming My AI responses included My AI repeating inappropriate words in response to Snapchatters’ questions.
Snapchat says it will continue to use these learnings to improve My AI. The data will also help it deploy a new system to limit misuse of My AI. Snapchat is adding OpenAI’s moderation technology to its existing toolset, which will allow it to assess the severity of potentially harmful content and temporarily restrict Snapchatters’ access to My AI if they misuse the service.
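Conceptually, a severity-based restriction system like the one described maps per-category moderation scores to escalating actions. The following is a minimal, hypothetical sketch; the category names, thresholds, and function name are illustrative assumptions, not Snapchat's or OpenAI's actual implementation:

```python
# Hypothetical sketch of severity-based access restriction.
# Category names and thresholds below are illustrative assumptions;
# the real system relies on OpenAI's moderation technology and
# Snapchat's own internal policies.

def restriction_action(category_scores: dict[str, float],
                       warn_threshold: float = 0.5,
                       restrict_threshold: float = 0.9) -> str:
    """Map moderation scores (0.0-1.0 per category) to an action."""
    # Treat the highest category score as the overall severity.
    severity = max(category_scores.values(), default=0.0)
    if severity >= restrict_threshold:
        return "temporarily_restrict"  # severe misuse: pause access to My AI
    if severity >= warn_threshold:
        return "warn"                  # borderline content
    return "allow"

# Example with made-up scores:
scores = {"violence": 0.2, "hate": 0.95}
print(restriction_action(scores))  # temporarily_restrict
```

The key design idea, as described, is that responses are graded by severity rather than treated uniformly, so only sustained or serious misuse triggers a temporary restriction.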
Age-Appropriate Experiences
Since launching My AI, Snapchat states it has worked to improve its responses to inappropriate Snapchatter requests, regardless of a Snapchatter’s age. It also uses proactive detection tools to scan My AI conversations for potentially non-conforming text and take action.
Snapchat has also implemented a new age signal for My AI that uses a Snapchatter’s birthdate, so even if a Snapchatter never tells My AI their age in a conversation, the chatbot consistently takes their age into consideration when responding.
My AI in Family Center
Snapchat offers parents and caregivers visibility into which friends their teens are communicating with, and how recently, through the in-app Family Center. In the coming weeks, it will provide parents with more insight into their teens’ interactions with My AI. This means parents will be able to use Family Center to see if their teens are communicating with My AI, and how often.