
Six Takeaways from Stanford University’s 2024 Report on the State of AI

AEIdeas

April 30, 2024

The 2024 Artificial Intelligence Index, released by Stanford’s Institute for Human-Centered Artificial Intelligence (HAI), delivers crucial insights into AI’s development and influence. This detailed yearly report presents a data-driven overview of AI’s progress across significant areas, including research, ethics, policy, public perception, and economics. A few highlights from the report are below.

AI Beats Humans on Some Tasks, but Not All

AI has achieved levels of performance that match or exceed human capabilities in certain tasks, including image classification, visual reasoning, and English understanding. It still lags behind on more complex tasks like competition-level mathematics, visual commonsense reasoning, and planning. However, the most important trend is the incredible pace of improvement. Many of today’s limitations are likely to be addressed in the very near future.

The US Is Leading in AI Models

The United States leads the world with 61 notable machine learning models, thanks to its strong AI research infrastructure, significant research and development investment, and major tech companies and research institutions. These models excel in areas like natural language processing and computer vision, often setting new performance standards and frequently being open-sourced, enabling global innovation.

Foundation models, such as GPT-4, Claude 3, and Llama 3, represent a groundbreaking class of AI systems trained on vast, diverse datasets, enabling them to acquire a broad understanding of language, reasoning, and problem-solving. Their adaptability allows them to be fine-tuned for specific tasks with minimal additional training, enhancing efficiency and cost-effectiveness. In 2023, the US solidified its lead with 109 foundation models, far outpacing China’s 20.
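To make the fine-tuning idea concrete, here is a minimal sketch, assuming the Hugging Face transformers and datasets libraries, of adapting a small pretrained checkpoint to a sentiment-classification task. The checkpoint and dataset names are illustrative stand-ins, not anything cited in the report.

```python
# Illustrative sketch: adapting a pretrained model to a specific task with a
# modest amount of labeled data (assumes Hugging Face transformers and datasets).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "distilbert-base-uncased"  # small stand-in for a much larger foundation model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

dataset = load_dataset("imdb")  # public movie-review sentiment dataset

def tokenize(batch):
    # Pad/truncate to a fixed length so the default collator can batch examples.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-sentiment", num_train_epochs=1,
                           per_device_train_batch_size=8),
    # A few thousand labeled examples are often enough to adapt a pretrained model.
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].shuffle(seed=42).select(range(500)),
)
trainer.train()
```

Fixed-length tokenization is used here so the default batching works without a custom collator; the point is simply that task adaptation reuses a pretrained model rather than training one from scratch.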

The High Costs of Training

The report emphasizes the staggering increase in the computational resources and financial investment required to train the latest AI models. To put this into perspective, the original Transformer model, introduced in 2017 and forming the basis for nearly all modern large language models (LLMs), cost approximately $900 to train. In 2023, leading foundation models like OpenAI’s GPT-4 and Google’s Gemini Ultra incurred estimated training costs of $78 million and $191 million, respectively.

To put the required computational power into perspective, the original Transformer needed about 7,400 petaFLOPs (a measure of total floating-point operations) to train. In contrast, Google’s Gemini Ultra required an astonishing 50 billion petaFLOPs, roughly 6.8 million times more training compute than the original Transformer from just six years earlier.
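As a quick sanity check, that ratio can be reproduced directly from the two figures quoted above:

```python
# Compute the training-compute ratio from the AI Index figures quoted above.
transformer_pflops = 7.4e3    # original Transformer (2017): ~7,400 petaFLOPs
gemini_ultra_pflops = 5.0e10  # Gemini Ultra (2023): ~50 billion petaFLOPs

ratio = gemini_ultra_pflops / transformer_pflops
print(f"Gemini Ultra used about {ratio:,.0f} times the training compute")
# -> Gemini Ultra used about 6,756,757 times the training compute
```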

Training advanced models requires significant computing power and financial resources, limiting development to a handful of well-funded companies. An exception is Meta’s Llama 3, released after the HAI report. Despite benchmark performance rivaling closed models such as OpenAI’s GPT-4 Turbo, Meta has open-sourced Llama 3. By absorbing the high training costs, Meta enables developers to leverage a cutting-edge LLM for innovative applications and services.

AI’s Impact on Workers

Recent studies provide evidence of AI’s impact on productivity across sectors. A Microsoft meta-review found that workers using AI productivity tools completed tasks 26-73 percent faster. A Harvard study showed that consultants with GPT-4 access experienced a 12.2 percent productivity increase, a 25.1 percent speed boost, and a 40 percent quality improvement on selected tasks compared to those without AI. Finally, a National Bureau of Economic Research study revealed that call center agents using AI handled 14.2 percent more calls per hour. These findings suggest that the growing integration of AI into the economy over the past five years has the potential to boost productivity significantly. This recently released paper by my AEI colleagues Brent Orrell and David Veldran takes a deeper dive into AI’s implications for the workforce.

AI Driving New Breakthroughs in Science and Medicine

One chapter was dedicated entirely to some of the ways AI is driving breakthroughs in science and medicine, including GNoME for materials discovery and GraphCast for improved weather forecasting. In medicine, AI systems like SynthSR transformed brain scans for advanced analysis, while AlphaMissense assisted in classifying genetic mutations. The FDA also approved an increasing number of AI-related medical devices, over 45 times more than were approved in 2012.

While it is commonly believed that LLMs require extensive fine-tuning on domain-specific data to excel in specialized fields like medicine, a 2023 study from Microsoft challenges this assumption. Using a suite of prompt engineering techniques called Medprompt, GPT-4 became the first model to surpass 90 percent accuracy on the MedQA benchmark. This suggests that prompt engineering can be an effective alternative to fine-tuning, demonstrating that LLMs can be adapted to specialized domains without extensive training on domain-specific data.
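To give a flavor of what such prompt engineering looks like in code, here is a minimal sketch combining few-shot worked examples, step-by-step reasoning, and majority voting over shuffled answer options. It illustrates the general ideas rather than Microsoft’s actual Medprompt implementation, and call_llm is a hypothetical placeholder for whatever model API is used.

```python
# Simplified illustration of prompt-engineering ideas similar in spirit to Medprompt:
# few-shot examples with reasoning, plus majority voting over shuffled answer options.
# `call_llm` is a hypothetical placeholder, not a real API.
import collections
import random

LETTERS = "ABCDE"

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; should return text ending in an option letter."""
    raise NotImplementedError("wire this up to the model API of your choice")

def build_prompt(question: str, options: list[str], few_shot: list[dict]) -> str:
    """Prepend worked examples (with reasoning) to the new multiple-choice question."""
    parts = [
        f"Question: {ex['question']}\nReasoning: {ex['reasoning']}\nAnswer: {ex['answer']}"
        for ex in few_shot
    ]
    choices = "\n".join(f"{LETTERS[i]}. {opt}" for i, opt in enumerate(options))
    parts.append(
        f"Question: {question}\n{choices}\n"
        "Think step by step, then answer with the single letter of the best option."
    )
    return "\n\n".join(parts)

def answer_with_voting(question, options, few_shot, n_votes=5):
    """Query several times with the options shuffled and return the majority answer."""
    votes = collections.Counter()
    for _ in range(n_votes):
        shuffled = random.sample(options, k=len(options))
        reply = call_llm(build_prompt(question, shuffled, few_shot))
        letter = reply.strip()[-1].upper()  # assumes the reply ends with a letter
        if letter in LETTERS[: len(shuffled)]:
            votes[shuffled[LETTERS.index(letter)]] += 1  # map back to option text
    return votes.most_common(1)[0][0] if votes else None
```

Shuffling the options before each query and voting over the results is one way to reduce a model’s sensitivity to answer ordering, which is part of why such techniques can close the gap with fine-tuned systems.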

The Steady Increase of US Regulations

In 2023, the Executive Office of the President and the Department of Commerce led AI regulation, each issuing five AI-related rules. The Department of Health and Human Services followed with four. The number of agencies involved in AI regulation increased from 17 in 2022 to 21 in 2023, a trend likely to accelerate following the Biden administration’s executive order and Office of Management and Budget directives encouraging regulatory agencies to explore AI governance in their sectors.