tl;dr: gpt-oss-20b is a strong model for competitive coding, but it is >3x less sample-efficient than deepseek-r1-0528.
What prompt did you use? What were the inference settings (e.g., temperature, top-p)?
The temperature is 0.6 and top-p is 0.95.
The prompts are the official prompts from LiveCodeBench.
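For anyone trying to reproduce the settings, here is a minimal sketch of what temperature 0.6 and top-p 0.95 do at sampling time. The function names `nucleus_indices` and `sample_token` are hypothetical, written only for illustration; they are not taken from LiveCodeBench or any inference library, and a real serving framework would apply these settings internally.

```python
import math
import random


def nucleus_indices(logits, temperature=0.6, top_p=0.95):
    """Return (kept token indices, their probabilities) after temperature and top-p."""
    # Temperature < 1 sharpens the distribution before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of most-likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept, [probs[i] for i in kept]


def sample_token(logits, temperature=0.6, top_p=0.95, rng=random):
    """Draw one token index from the truncated distribution."""
    kept, weights = nucleus_indices(logits, temperature, top_p)
    # random.choices renormalises the weights over the kept tokens.
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -5.0]  # made-up values, not model output
    print(sample_token(toy_logits))
```

With the toy logits above, only the two most likely tokens survive the top-p cutoff, so low-probability tokens are never sampled.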
Thanks for the info. Can you share a link to the prompt, or maybe an example?