A.I. Has a Measurement Problem

There's a problem with leading artificial intelligence tools like ChatGPT, Gemini and Claude: We don't really know how smart they are.

That's because, unlike companies that make cars or drugs or baby formula, A.I. companies aren't required to submit their products for testing before releasing them to the public. There's no Good Housekeeping seal for A.I. chatbots, and few independent groups are putting these tools through their paces in a rigorous way.

Instead, we're left to rely on the claims of A.I. companies, which often use vague, fuzzy phrases like "improved capabilities" to describe how their models differ from one version to the next. And while there are some standard tests given to A.I. models to assess how good they are at, say, math or logical reasoning, many experts have doubts about how reliable those tests really are.

This might sound like a petty gripe. But I've become convinced that a lack of good measurement and evaluation for A.I. systems is a major problem.

For starters, without reliable information about A.I. products, how are people supposed to know what to do with them?

I can't count the number of times I've been asked in the past year, by a friend or a colleague, which A.I. tool they should use for a certain task. Does ChatGPT or Gemini write better Python code? Is DALL-E 3 or Midjourney better at generating realistic images of people?


See the rest here:
A.I. Has a Measurement Problem - The New York Times
