Collinear AI’s Blog
RL Infrastructure for AI Agents: Why Environment-as-a-Service is the Missing Piece
Reinforcement learning for large language models is more of a systems problem than an ML problem.
Nov 18 • Nazneen Rajani
Announcing Spider: a lightweight tool to craft post-training data recipes
TL;DR Spider is a single client interface that turns messy distillation and ablation experiments into a simple, configurable workflow.
Nov 6 • Soumyadeep Bakshi
Collinear Newsletter #8 – Notes on Improving AI
Hi AI innovators,
Nov 4 • Soumyadeep Bakshi
October 2025
The case for simulations
Unlocking model uplift through better evaluations
Oct 23 • Soumyadeep Bakshi and Grant Griffith
Through the Valley of Reasoning: What Small Models Teach Us About Learning
NeurIPS paper on knowledge distillation scaling laws for small foundation models
Oct 9 • Muyu, Tsach Mackey, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
Introducing Collinear Simulations: Steerable Personas for AI Agent Testing
TraitBasis, inspired by mech interp, gives high-fidelity user personas for comprehensive agent testing
Oct 7 • Meghana A Rajeev, Tsach Mackey, Muyu, Anand Kumar, and Nazneen Rajani
Collinear Newsletter #7 | Notes on Improving AI
A deep dive into how leading companies are deploying AI agents reliably in production
Oct 2 • Soumyadeep Bakshi and Nazneen Rajani
September 2025
Introducing Curator Evals: A Benchmark for High-quality Post-training Data Curation
High-quality datasets are the foundation of better language models.
Sep 2 • Muhammad Ali Shafique, Tsach Mackey, Meghana A Rajeev, Soumyadeep Bakshi, Muyu, Anand Kumar, and Nazneen Rajani
August 2025
Collinear AI Now Available on Google Cloud Marketplace
Making safe, high-performing AI accessible through trusted enterprise infrastructure
Aug 25 • Soumyadeep Bakshi
You Can’t Hire Your Way to Model Alignment
Why the global AI talent shortage is undermining enterprise model alignment, and what you can do instead
Aug 13 • Marc Moring, Soumyadeep Bakshi, and Nazneen Rajani
Leveling the Playing Field: LiveCodeBench’s Big Bug Fix
Three major fixes that reshaped competitive coding scores and why your numbers may look very different now
Aug 12 • Muyu, Tsach Mackey, Muhammad Ali Shafique, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani
OpenAI's gpt-oss on LiveCodeBench: A Competitive Programming Deep Dive
tl;dr: gpt-oss-20b is a strong model for competitive coding but is >3x less sample-efficient than deepseek-r1-0528
Aug 6 • Muyu, Tsach Mackey, Anand Kumar, Meghana A Rajeev, and Nazneen Rajani