Crosby releases Redline Bench to evaluate AI models for legal work

Business Insider
Crosby introduces the Redline Bench, a tool to assess AI performance in legal tasks like contract review.

Summary

Crosby, a startup that operates as a law firm, has launched the Redline Bench to address the difficulty of evaluating AI models for legal work. Unlike coding, where success is largely binary (the code either works or it does not), legal work is subjective, which makes "good" work hard to define. Crosby's team of engineers and lawyers, including experts from Stripe and Sullivan & Cromwell, built the benchmark from weighted criteria derived from simulated software deal negotiations. It scores AI-generated contract redlines against these lawyer-defined standards. In the initial results, ChatGPT 5.5 leads with a score of 50.5%, followed by Gemini 3.5 Flash at 45.1% and Claude Opus at 44.4%. Crosby aims to offer a transparent, public yardstick that helps lawyers decide how far to trust AI tools, at a time when billions of dollars are being invested on the promise that AI will lower legal costs.
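
To make the idea of scoring against weighted criteria concrete, here is a minimal Python sketch. It is an illustration only: the article does not say how Redline Bench combines its criteria, so the aggregation rule (share of total weight satisfied), the `Criterion` type, the `weighted_score` function, and the example criteria and weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    description: str  # what a reviewing lawyer expects the redline to do
    weight: float     # relative importance (hypothetical values)


def weighted_score(criteria, met):
    """Return the percentage of total criterion weight that a redline satisfied.

    Assumption: each criterion is judged pass/fail and the score is the
    weighted share of passes. The real benchmark may aggregate differently.
    """
    if len(criteria) != len(met):
        raise ValueError("criteria and met must have the same length")
    total = sum(c.weight for c in criteria)
    if total <= 0:
        raise ValueError("total weight must be positive")
    earned = sum(c.weight for c, ok in zip(criteria, met) if ok)
    return 100.0 * earned / total


# Hypothetical criteria for one simulated software deal.
criteria = [
    Criterion("Caps liability at fees paid", 3.0),
    Criterion("Makes the confidentiality clause mutual", 2.0),
    Criterion("Corrects an inconsistent defined term", 1.0),
]

# One model's redline met the first and third criteria only.
print(f"{weighted_score(criteria, [True, False, True]):.1f}%")
```

Under a rule like this, missing a heavily weighted criterion costs more than missing a minor one, which is one way a subjective review can be reduced to a single comparable number.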

(Source: Business Insider)
