When you use Playground to generate text, the content is automatically checked by OpenAI’s moderation system to help detect potentially sensitive or unsafe outputs.
The moderation system is tuned conservatively: this reduces the risk of missing genuinely harmful content, but it means some safe outputs may occasionally be misclassified as unsafe (a false positive).
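The Playground's built-in check runs automatically and isn't configurable, but the public Moderations API performs the same kind of classification and can help you understand what a flag means. Below is a minimal sketch using the official `openai` Python SDK; the model name and example text are placeholders, and the exact models and thresholds the Playground uses may differ.

```python
# Minimal sketch: classifying text with OpenAI's public Moderations API.
# Assumes the `openai` Python SDK (v1+) is installed and OPENAI_API_KEY is set;
# the Playground's built-in check may use different models or thresholds.
from openai import OpenAI

client = OpenAI()

response = client.moderations.create(
    model="omni-moderation-latest",          # general-purpose moderation model
    input="Example output text to check.",   # placeholder text
)

result = response.results[0]
if result.flagged:
    # List the categories (e.g. harassment, violence) that triggered the flag.
    flagged = [name for name, hit in result.categories.model_dump().items() if hit]
    print("Flagged categories:", flagged)
else:
    print("No issues detected.")
```

Running your own text through the endpoint this way can show which category scored highly, which is often enough to tell whether a flag was a borderline call or a clear false positive.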
What to Do if an Output Is Misclassified
If you believe Playground incorrectly flagged a safe output:
Click the thumbs-down icon next to the moderation warning.
Optionally, include feedback explaining why you believe the output was flagged incorrectly.
Why We Prioritize Caution
Moderation systems are tuned to minimize the chance of missing harmful content.
It's better to catch too much than to miss a genuinely unsafe response.