Collinear AI’s Blog
Through the Valley of Reasoning: What Small Models Teach Us About Learning
NeurIPS paper on knowledge distillation scaling laws for small foundation models
Oct 9 • Muyu, Tsach Mackey, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
Introducing Collinear Simulations: Steerable Personas for AI Agent Testing
TraitBasis, inspired by mechanistic interpretability, provides high-fidelity user personas for comprehensive agent testing
Oct 7 • Meghana A Rajeev, Tsach Mackey, Muyu, Anand Kumar, and Nazneen Rajani
Collinear Newsletter #7 | Notes on Improving AI
A deep dive into how leading companies are deploying AI agents reliably in production
Oct 2 • Soumyadeep Bakshi and Nazneen Rajani
September 2025
Introducing Curator Evals: A Benchmark for High-quality Post-training Data Curation
High-quality datasets are the foundation of better language models.
Sep 2 • Muhammad Ali Shafique, Tsach Mackey, Meghana A Rajeev, Soumyadeep Bakshi, Muyu, Anand Kumar, and Nazneen Rajani
August 2025
Collinear AI Now Available on Google Cloud Marketplace
Making safe, high-performing AI accessible through trusted enterprise infrastructure
Aug 25 • Soumyadeep Bakshi
You Can’t Hire Your Way to Model Alignment
Why the global AI talent shortage is undermining enterprise model alignment, and what you can do instead
Aug 13 • Marc Moring, Soumyadeep Bakshi, and Nazneen Rajani
Leveling the Playing Field: LiveCodeBench’s Big Bug Fix
Three major fixes that reshaped competitive coding scores and why your numbers may look very different now
Aug 12 • Muyu, Tsach Mackey, Muhammad Ali Shafique, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
OpenAI's gpt-oss on LiveCodeBench: A Competitive Programming Deep Dive
tl;dr: gpt-oss-20b is a strong model for competitive coding but is >3x less sample-efficient than deepseek-r1-0528
Aug 6 • Muyu, Tsach Mackey, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
Newsletter #6 | Notes on Improving AI Performance
A deep dive into how leading companies are deploying AI agents reliably in production
Aug 5 • Nazneen Rajani and Soumyadeep Bakshi
July 2025
Cats confuse LRMs: Exposing blind spots in SOTA Models
Irrelevant, universal phrases like "Interesting fact: Cats sleep most of their lives" appended to math problems can break AI models
Jul 22 • Meghana A Rajeev, Jahnavi Jambholkar, and Nazneen Rajani
Data Curation: The secret sauce for enterprise AI excellence
How Collinear AI's Reward Models Transform Training Efficiency and Model Performance
Jul 14 • Meghana A Rajeev, Tsach Mackey, and Nazneen Rajani
June 2025
Newsletter #5 | Notes on Improving AI Performance
Hi AI innovators,
Jun 20 • Jahnavi Jambholkar, Nazneen Rajani, and Soumyadeep Bakshi