Trend in Digitals

New study shows AI isn’t ready for office work

By Moinak Pal
Published January 24, 2026

It has been nearly two years since Microsoft CEO Satya Nadella predicted that generative AI would take over knowledge work, but if you look around a typical law firm or investment bank today, the human workforce is still very much in charge. Despite all the hype about “reasoning” and “planning,” a new study from training-data company Mercor explains exactly why the robot revolution is stalled: AI just can’t handle the messiness of real work.

Mercor released a new benchmark called APEX-Agents, and it is brutal. Unlike the usual tests that ask AI to write a poem or solve a math problem, this one uses actual queries from lawyers, consultants, and bankers. It asks the models to do complete, multi-step tasks that require jumping between different types of information.

The results? Even the absolute best models on the market—we are talking about Gemini 3 Flash and GPT-5.2—couldn’t crack a 25% accuracy rate. Gemini led the pack at 24%, with GPT-5.2 right behind it at 23%. Most others were stuck in the teens.

Mercor CEO Brendan Foody points out that the issue isn’t raw intelligence; it’s context. In the real world, answers aren’t served up on a silver platter. A lawyer has to check a Slack thread, read a PDF policy, look at a spreadsheet, and then synthesize all that to answer a question about GDPR compliance.

Humans do this context-switching naturally. AI, it turns out, is terrible at it. When you force these models to hunt for information across “scattered” sources, they get confused, give the wrong answer, or just give up entirely.

For anyone worried about their job security, this is a bit of a relief. The study suggests that right now, AI functions less like a seasoned professional and more like an unreliable intern who gets things right about a quarter of the time.

That said, the progress is terrifyingly fast. Foody noted that just a year ago, these models were scoring between 5% and 10%. Now they are hitting 24%. So, while they aren’t ready to take the wheel yet, they are learning to drive much faster than we expected. For now, though, the “knowledge work” revolution is on hold until the bots learn how to multitask.
