Technical Report: Performance and baseline evaluations of gpt-oss-safeguard-120b and gpt-oss-safeguard-20b

Posted by Source Author | Oct 28, 2025 | Tech Company News | 0 |

This post was originally published by Source Author on Open AI Blog.

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models post-trained from the gpt-oss models and trained to reason from a provided policy in order to label content under that policy. In this report, we describe gpt-oss-safeguard’s capabilities and provide our baseline safety evaluations on the gpt-oss-safeguard models, using the underlying gpt-oss models as a baseline. For more information about the development and architecture of the underlying gpt-oss models, see the original gpt-oss model model card⁠.

Technical Report: Performance and baseline evaluations of gpt-oss-safeguard-120b and gpt-oss-safeguard-20b

About The Author

Source Author

Leave a reply Cancel reply

Recent Posts

Recent Comments

Technical Report: Performance and baseline evaluations of gpt-oss-safeguard-120b and gpt-oss-safeguard-20b

About The Author

Source Author

Related Posts

Doppel’s AI defense system stops attacks before they spread

The next chapter of the Microsoft–OpenAI partnership

Introducing gpt-oss-safeguard

Knowledge preservation powered by ChatGPT

Leave a reply Cancel reply

Recent Posts

Recent Comments