After visiting the recent NGB conference, we found ourselves thinking about a question that probably keeps many OR practitioners up at night: How do we use AI in our optimisation work without losing what makes OR valuable in the first place?
Two presentations crystallised this tension for us: one showed how AI can democratise optimisation modelling, the other how AI can discover algorithms we'd never think of ourselves. Both are impressive. Both also raise questions we need to answer before deploying them in production.
The promise: optimisation without the PhD
Madeleine Udell’s OptiMUS system lets you describe an optimisation problem in plain English and get working MILP model code back. The results are striking: a 67% success rate on complex problems where naive LLM prompting manages maybe 33%. What takes days of back-and-forth with business stakeholders can now happen in an afternoon.
Here’s the catch: the system doesn’t understand your problem, it predicts from training data. The LLM sometimes invents constraints that are redundant or irrelevant. More dangerously, it can miss real constraints entirely: your model runs and produces a solution but silently ignores critical requirements. Variables might look right but encode the wrong thing.
The practitioner’s dilemma: these tools save time, but they shift the verification burden. Instead of debugging your own formulation, you’re auditing an AI’s work. Is that really easier?
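What that audit can look like in practice: a minimal sketch, with an entirely hypothetical toy problem, of checking an AI-generated model's solution against the requirements the business actually stated. Here the "AI-generated" model silently drops a machine-hours constraint, and an independent validation step catches it.

```python
# Auditing an AI-generated formulation on a toy production-planning problem,
# solved by brute force. All names and numbers are hypothetical.
from itertools import product

PROFIT = {"A": 3, "B": 5}   # profit per unit of each product
HOURS = {"A": 1, "B": 2}    # machine hours per unit
MAX_HOURS = 8               # a real requirement the generated model omits
MAX_UNITS = 6               # per-product production limit

def ai_generated_model():
    """Stand-in for AI output: maximises profit but silently
    omits the machine-hours constraint."""
    best, best_plan = -1, None
    for a, b in product(range(MAX_UNITS + 1), repeat=2):
        profit = PROFIT["A"] * a + PROFIT["B"] * b
        if profit > best:
            best, best_plan = profit, {"A": a, "B": b}
    return best_plan

def validate(plan):
    """Check the solution against the stated requirements,
    independently of how the model was built."""
    hours = sum(HOURS[p] * n for p, n in plan.items())
    violations = []
    if hours > MAX_HOURS:
        violations.append(f"machine hours {hours} > {MAX_HOURS}")
    return violations

plan = ai_generated_model()
print(plan, validate(plan))  # the audit flags the missing constraint
```

The point is the shape of the workflow, not the toy model: validation lives outside the generated code, written from the requirements rather than from the formulation.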
The alternative: when AI invents the algorithm
Hiverge’s approach is radically different. Instead of modelling your problem for a classical solver, AI generates entirely new algorithms through evolutionary search. Their results on the Airbus Beluga loading problem: 10,000× speedup compared to existing approaches. Real customers are using this in production.
But there’s a fundamental trade-off. These are heuristics, fast heuristics, but heuristics nonetheless. You get performance often unreachable with classical methods, algorithms tailored to your specific problem, and interpretable code. However, you have no optimality guarantees, limited transferability, and you need granular automated evaluation.
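To make the "granular automated evaluation" requirement concrete, here is a toy sketch of the idea (not Hiverge's actual system): a (1+1) evolutionary search that evolves a single knob of a greedy knapsack heuristic, judged only by an automated score over a batch of instances. The instance generator and scoring rule are illustrative assumptions.

```python
# Toy evolutionary algorithm discovery: evolve the scoring rule of a
# greedy knapsack heuristic, selected purely by automated evaluation.
import random

random.seed(0)

def make_instance(rng, n=12):
    # items are (value, weight) pairs; capacity is fixed for simplicity
    items = [(rng.randint(1, 20), rng.randint(1, 20)) for _ in range(n)]
    return items, 40

def greedy(items, capacity, alpha):
    # score = value / weight**alpha; alpha is the evolved "algorithm" knob
    order = sorted(items, key=lambda it: it[0] / it[1] ** alpha, reverse=True)
    total_v = total_w = 0
    for v, w in order:
        if total_w + w <= capacity:
            total_v += v
            total_w += w
    return total_v

def evaluate(alpha, instances):
    # the automated evaluation: total value achieved across the benchmark
    return sum(greedy(it, cap, alpha) for it, cap in instances)

instances = [make_instance(random.Random(i)) for i in range(50)]
alpha, score = 0.0, evaluate(0.0, instances)
for _ in range(200):                     # (1+1) evolutionary search
    cand = alpha + random.gauss(0, 0.3)  # mutate
    cand_score = evaluate(cand, instances)
    if cand_score > score:               # select if strictly better
        alpha, score = cand, cand_score
print(alpha, score)
```

Notice that nothing here proves optimality: the evolved heuristic is only as good as the benchmark you evaluate it on, which is exactly why a granular, automated evaluation is a hard prerequisite.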
This makes sense for well-defined, repeatedly solved problems where “good enough, fast” beats “provably optimal, slow”: scheduling, routing, and packing problems that you solve hundreds of times daily. It is not appropriate for strategic decisions, or anywhere you need to explain to customers why this is the best solution.
What problem are you actually solving?
Both approaches force us to confront an uncomfortable truth: the bottleneck in OR is often not the solver, it’s the messy human process around it. We spend weeks eliciting requirements, formalising models, debugging code, validating results, and maintaining everything as requirements change.

AI can compress the formalisation and implementation steps dramatically. But it doesn’t solve requirements gathering, validation, or maintenance. In fact, it might make them harder. AI accepts ambiguous natural language, but ambiguity in means garbage out. You still need a good problem definition. When the model works but you didn’t build it yourself, validation becomes more critical, not less. And when business rules change, can you edit AI-generated code confidently?
Our practical take: LLM-based modelling is great for prototyping, but review everything and never deploy without human validation. Don’t use it for safety-critical constraints or genuinely novel problems.
For evolved algorithms, keep classical methods as a baseline and run both. Perfect for high-volume similar instances where you have automated evaluation. Skip it when each problem is unique or you need provable optimality.
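The "run both" discipline can be as simple as this sketch: compare a fast heuristic against an exact classical baseline on instances small enough to solve exactly, and report the optimality gap. The knapsack instance below is a standard illustrative example, not a client problem.

```python
# Measuring a heuristic's optimality gap against an exact baseline
# on a small knapsack instance where greedy is known to underperform.
from itertools import combinations

items = [(60, 10), (100, 20), (120, 30)]  # (value, weight)
capacity = 50

def exact(items, capacity):
    # brute-force baseline; only viable for small instances
    best = 0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            if sum(w for _, w in combo) <= capacity:
                best = max(best, sum(v for v, _ in combo))
    return best

def heuristic(items, capacity):
    # greedy by value density: fast, no optimality guarantee
    total_v = total_w = 0
    for v, w in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if total_w + w <= capacity:
            total_v += v
            total_w += w
    return total_v

opt = exact(items, capacity)       # 220: take the two heaviest items
heur = heuristic(items, capacity)  # 160: greedy commits to the densest item
print(f"exact={opt} heuristic={heur} gap={(opt - heur) / opt:.1%}")
```

Even when production instances are too large to solve exactly, keeping a classical baseline on downscaled instances gives you a running estimate of how much optimality the heuristic is trading away.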
What this means
Here’s what this means for our profession. Code generation is becoming abundant. Fast heuristics are becoming abundant. But judgement about problem formulation, validation skills, and building trust, these are becoming scarce.
Both approaches share a fundamental limitation, they optimise the problem you give them, not the problem you actually have. If your requirements are incomplete, your model will be incomplete. AI makes bad formulations faster, it doesn’t make them good.
Where we’re landing
The above-mentioned tools are real and production-ready for specific use cases: not experimental, not “5 years away”. The craft of OR becomes more important, not less. Knowing when to trust AI, how to validate its output, and which problems are appropriate becomes the core skill. Transparency matters: you still need to explain solutions to stakeholders, and “the AI said so” won’t cut it.
The question isn’t whether AI will replace OR practitioners, it’s which OR practitioners will thrive. We think it’s those who combine AI speed with OR rigour, who know which tool to use, when to trust it, and how to validate the result. In fact, no different from yesterday.
If you’re exploring these AI tools for your organisation, be sure to get help. Unvalidated AI-generated models carry real risks: involve someone who understands both the possibilities and the pitfalls before putting anything in production. We’re here to help!


