XBai-o4

[ICLR2026] Test-Time Scaling with Reflective Generative Model.

Source: GitHub

Pricing: Open Source

Tech stack

Python

Shell

About This Project

XBai o4 is trained with our proposed reflective generative form, which unifies “Long-CoT Reinforcement Learning” and “Process Reward Learning” in a single training form. This lets one model both reason deeply and select high-quality reasoning trajectories.
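
To make the idea of reasoning trajectory selection concrete, here is a minimal sketch of best-of-N selection over candidate trajectories. It is not taken from the XBai-o4 code base: the `Trajectory` class, the per-step scores, and the geometric-mean aggregation are all assumptions chosen for illustration, standing in for whatever process-reward signal the model actually produces.

```python
import math
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One sampled reasoning trajectory (hypothetical structure)."""
    answer: str
    step_scores: list[float]  # assumed per-step process-reward scores in (0, 1]


def trajectory_score(traj: Trajectory) -> float:
    """Aggregate step scores with a geometric mean (an assumed rule)."""
    if not traj.step_scores:
        return 0.0
    # Clamp to avoid log(0) when a step score is exactly zero.
    log_sum = sum(math.log(max(s, 1e-12)) for s in traj.step_scores)
    return math.exp(log_sum / len(traj.step_scores))


def select_best(trajectories: list[Trajectory]) -> Trajectory:
    """Return the candidate with the highest aggregated score."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    return max(trajectories, key=trajectory_score)


# Illustrative data only; these scores are made up.
candidates = [
    Trajectory(answer="A", step_scores=[0.9, 0.8, 0.95]),
    Trajectory(answer="B", step_scores=[0.99, 0.2, 0.9]),
    Trajectory(answer="C", step_scores=[0.7, 0.7, 0.7]),
]
best = select_best(candidates)
print(best.answer)
```

A geometric mean penalizes a trajectory that contains one weak step more than an arithmetic mean would, which is why candidate B loses here despite two high-scoring steps. Other aggregation rules (minimum, last-step score) would fit the same interface.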
