Nexpressdaily.com
Finance

It’s getting harder to tell which company is winning the AI race, Hugging Face co-founder says

Nexpressdaily
Last updated: May 7, 2025 10:51 am

  • Hugging Face’s Thomas Wolf says that it’s getting harder to tell which AI model is the best as traditional AI benchmarks become saturated. Going forward, Wolf said the AI industry could rely on two new benchmarking approaches: agency-based and use-case-specific.

Thomas Wolf, co‑founder and chief scientist at Hugging Face, thinks we may need new ways to measure AI models.

Wolf told the audience at Brainstorm AI in London that as AI models get more advanced, it’s becoming increasingly difficult to tell which one is performing the best.

“It’s getting hard to tell what the best model is,” he said, pointing to the nominal differences between recent releases from OpenAI and Google. “They all seem to be, actually, very close.”

“The world of benchmarks has evolved a lot. We used to have this very academic benchmark that we mostly measured the knowledge of the model on—I think the most famous was MMLU (Massive Multitask Language Understanding), which was basically a set of graduate‑level or PhD‑level questions that the model had to answer,” he said. “These benchmarks are mostly all saturated right now.”

Over the past year, there has been a growing chorus of voices from academia, industry, and policy claiming that common AI benchmarks, such as MMLU, GLUE, and HellaSwag, have reached saturation, can be gamed, and no longer reflect real‑world utility.

In February, researchers at the European Commission’s Joint Research Centre published a paper called “Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation” that found “systemic flaws in current benchmarking practices,” including misaligned incentives, construct-validity failures, gaming of results, and data contamination.

Wolf said the AI industry should rely on two main types of benchmarks going into 2025: one that assesses the agency of models, where LLMs are expected to carry out tasks, and another tailored to each specific use case.

Hugging Face is already working on the latter.

The company’s new program, “YourBench,” aims to help users determine which model to use for a specific task. Users feed a few documents into the program, which automatically generates a benchmark specific to that type of work. Users can then run different models against that benchmark to see which one performs best for their use case.
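The workflow described above (documents in, a custom benchmark out, then a comparison of models on it) can be illustrated with a toy sketch. This is not Hugging Face's implementation or API. The function names `build_benchmark`, `score`, and `rank_models` are hypothetical, the fill-in-the-blank questions are a simple stand-in for whatever questions a real tool would generate, and the two "models" are plain Python functions used only for illustration.

```python
import re
from typing import Callable, Dict, List, Tuple

# A question is (prompt with a blank, expected answer). Hypothetical format.
Question = Tuple[str, str]


def build_benchmark(documents: List[str], min_len: int = 6) -> List[Question]:
    """Turn each sentence into a fill-in-the-blank question by masking its longest word."""
    questions: List[Question] = []
    for doc in documents:
        # Naive sentence split on whitespace that follows ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
            words = re.findall(r"[A-Za-z]+", sentence)
            if not words:
                continue
            target = max(words, key=len)
            if len(target) < min_len:
                continue  # skip sentences with no sufficiently distinctive word
            prompt = sentence.replace(target, "____", 1)
            questions.append((prompt, target.lower()))
    return questions


def score(model: Callable[[str], str], benchmark: List[Question]) -> float:
    """Fraction of questions the model answers with an exact (case-insensitive) match."""
    if not benchmark:
        return 0.0
    correct = sum(model(prompt).strip().lower() == answer for prompt, answer in benchmark)
    return correct / len(benchmark)


def rank_models(
    models: Dict[str, Callable[[str], str]], benchmark: List[Question]
) -> List[Tuple[str, float]]:
    """Score every model on the same benchmark and sort best-first."""
    results = [(name, score(fn, benchmark)) for name, fn in models.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy documents and toy "models" (assumptions for illustration only).
documents = [
    "Invoices must be approved by the controller. "
    "Refunds are processed within fourteen days."
]
models = {
    "model_a": lambda prompt: "controller" if "approved" in prompt else "processed",
    "model_b": lambda prompt: "controller",
}

benchmark = build_benchmark(documents)
ranking = rank_models(models, benchmark)
print(ranking)
```

In this sketch both models would look identical on a generic test that neither was built for, but the document-specific questions separate them, which is the point Wolf makes about use-case benchmarks.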

“Just because these models are all working the same on this academic benchmark doesn’t really mean that they’re all exactly the same,” Wolf said.

Open‑source’s ‘ChatGPT moment’

Founded by Wolf, Clément Delangue, and Julien Chaumond in 2016, Hugging Face has long been a champion of open‑source AI.

Often referred to as the GitHub of machine learning, the company provides an open‑source platform that enables developers, researchers, and enterprises to build, share, and deploy machine‑learning models, datasets, and applications at scale. Users can also browse models and datasets that others have uploaded.

Wolf told the Brainstorm AI audience that Hugging Face’s “business model is really aligned with open source” and the company’s “goal is to have the maximum number of people participating in this kind of open community and sharing models.”

Wolf predicted that open-source AI would continue to thrive, especially after the success of DeepSeek earlier this year.

After its launch late last year, the Chinese‑made AI model DeepSeek R1 sent shockwaves through the AI world when testers found that it matched or even outperformed American closed‑source AI models.

Wolf said DeepSeek was a “ChatGPT moment” for open‑source AI.

“Just like ChatGPT was the moment the whole world discovered AI, DeepSeek was the moment the whole world discovered there was kind of this open society,” he said.

This story was originally featured on Fortune.com
