Collinear AI’s Blog
Introducing Curator Evals: A Benchmark for High-quality Post-training Data Curation
High-quality datasets are the foundation of better language models.
Sep 2 • Muhammad Ali Shafique, Tsach Mackey, Meghana A Rajeev, Soumyadeep Bakshi, Muyu, Anand Kumar, and Nazneen Rajani
August 2025
Collinear AI Now Available on Google Cloud Marketplace
Making safe, high-performing AI accessible through trusted enterprise infrastructure
Aug 25 • Soumyadeep Bakshi
You Can’t Hire Your Way to Model Alignment
Why the global AI talent shortage is undermining enterprise model alignment, and what you can do instead
Aug 13 • Marc Moring, Soumyadeep Bakshi, and Nazneen Rajani
Leveling the Playing Field: LiveCodeBench’s Big Bug Fix
Three major fixes that reshaped competitive coding scores and why your numbers may look very different now
Aug 12 • Muyu, Tsach Mackey, Muhammad Ali Shafique, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
OpenAI's gpt-oss on LiveCodeBench: A Competitive Programming Deep Dive
tl;dr: gpt-oss-20b is a strong model for competitive coding but is more than 3x less sample-efficient than deepseek-r1-0528
Aug 6 • Muyu, Tsach Mackey, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
Newsletter #6 | Notes on Improving AI Performance
A deep dive into how leading companies are deploying AI agents reliably in production
Aug 5 • Nazneen Rajani and Soumyadeep Bakshi
July 2025
Cats confuse LRMs: Exposing blind spots in SOTA Models
Irrelevant, universal phrases like "Interesting fact: Cats sleep most of their lives" appended to math problems can break AI models
Jul 22 • Meghana A Rajeev, Jahnavi Jambholkar, and Nazneen Rajani
Data Curation: The secret sauce for enterprise AI excellence
How Collinear AI's Reward Models Transform Training Efficiency and Model Performance
Jul 14 • Meghana A Rajeev, Tsach Mackey, and Nazneen Rajani
June 2025
Newsletter #5 | Notes on Improving AI Performance
Hi AI innovators,
Jun 20 • Jahnavi Jambholkar, Nazneen Rajani, and Soumyadeep Bakshi
May 2025
Gaming the System: Goodhart’s Law Exemplified in AI Leaderboard Controversy
How the race to the top in AI benchmarks is leading to specialized optimization at the expense of real-world performance
May 15 • Jahnavi Jambholkar, Nazneen Rajani, and Soumyadeep Bakshi
Judges as Data Curators Cut Post-training Time in Half
ServiceNow x Collinear
May 14 • Nazneen Rajani, Meghana A Rajeev, and Tsach Mackey
Newsletter #4 | Notes on Improving AI Performance
A deep dive into how leading companies are deploying AI agents reliably in production
May 7 • Jahnavi Jambholkar, Nazneen Rajani, and Soumyadeep Bakshi