Testing and Safety Science Analysis
The US leads in AI safety research and testing infrastructure, anchored by safety-focused labs like Anthropic, internal safety teams at OpenAI, and government initiatives focused on responsible AI deployment.
Key Metrics
Safety Spend (%) = (Internal safety team budget / Total lab expenditure) x 100
Testing Window = Time from training completion to public release (US labs: ~2-6 weeks, Chinese labs: often <24 hours)
Safety Overhead (%) = (Compute for moderation & filtering / Total inference compute) x 100 — US major labs: 5-13%, Chinese labs: negligible
Ecosystem = (Third-party red-teamers + Auditors + Safety data vendors + Independent labs) x (Revenue + Government funding), a composite gauge of the external safety ecosystem's size and resourcing
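A minimal sketch of how these four metrics might be computed, assuming a lab reports budget, compute, and release-date figures; every function name and number below is an illustrative placeholder, not sourced data.

# Illustrative calculation of the headline metrics above.
# All figures are hypothetical placeholders, not reported data.
from datetime import date

def safety_spend_pct(safety_budget: float, total_expenditure: float) -> float:
    """Safety Spend (%) = internal safety team budget / total lab expenditure x 100."""
    return safety_budget / total_expenditure * 100

def testing_window_days(training_done: str, public_release: str) -> int:
    """Testing Window = days between training completion and public release."""
    return (date.fromisoformat(public_release) - date.fromisoformat(training_done)).days

def safety_overhead_pct(moderation_compute: float, total_inference_compute: float) -> float:
    """Safety Overhead (%) = compute for moderation & filtering / total inference compute x 100."""
    return moderation_compute / total_inference_compute * 100

def ecosystem_index(external_org_count: int, revenue: float, gov_funding: float) -> float:
    """Ecosystem = external org count x (revenue + government funding)."""
    return external_org_count * (revenue + gov_funding)

if __name__ == "__main__":
    print(f"Safety spend:    {safety_spend_pct(120, 1500):.1f}%")   # hypothetical $M figures
    print(f"Testing window:  {testing_window_days('2025-01-10', '2025-02-21')} days")
    print(f"Safety overhead: {safety_overhead_pct(8, 100):.1f}%")   # hypothetical compute units
    print(f"Ecosystem index: {ecosystem_index(40, 250, 75):,.0f}")  # hypothetical counts / $M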
AI Testing and Safety Science Landscape
AI Testing and Safety Science can largely be split into two dominant branches: internal safety departments within the major labs, and external safety teams that exist to serve frontier AI developers. The relationship between these branches—and the tension over talent, funding, and influence—shapes how safety outcomes evolve across both nations.
Internal Safety Departments
These exist within the major labs or other AI organizations releasing products and applications. At the frontier, these groups might be split into teams that focus on alignment, red-teaming, evals/preparedness, interpretability, and incident tracking.
Internal Safety & Testing Metrics
Rough ballparks of what fraction of total lab spend goes toward safety science and testing. The aim is to convey both the absolute and the relative amount spent, though the adjacency between safety and capability work can cloud the picture.
Number of safety-relevant research releases from the five leading labs in each country, excluding system/model cards and focusing on other safety-adjacent findings. The metric may also capture the weight of those findings via citation counts and awards.
Percentage of frontier releases accompanied by a safety card, extent of the card, existence of a Responsible Scaling Policy, and number of amendments. Chinese labs rarely publish cards and few maintain internal protocols.
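As a worked example of the coverage calculation described above, the sketch below tallies what share of a hypothetical release list shipped with a safety card and whether each lab maintains a Responsible Scaling Policy; the release records and lab names are invented for illustration.

# Hypothetical release records; real data would come from lab publications.
releases = [
    {"lab": "Lab A", "model": "a-1", "safety_card": True,  "rsp_in_place": True},
    {"lab": "Lab A", "model": "a-2", "safety_card": True,  "rsp_in_place": True},
    {"lab": "Lab B", "model": "b-1", "safety_card": False, "rsp_in_place": False},
    {"lab": "Lab B", "model": "b-2", "safety_card": True,  "rsp_in_place": False},
]

with_card = sum(r["safety_card"] for r in releases)
coverage_pct = 100 * with_card / len(releases)
print(f"Safety card coverage: {coverage_pct:.0f}% of {len(releases)} frontier releases")

# Per-lab breakdown, as it might feed a heatmap view later in the layer.
for lab in sorted({r["lab"] for r in releases}):
    lab_releases = [r for r in releases if r["lab"] == lab]
    pct = 100 * sum(r["safety_card"] for r in lab_releases) / len(lab_releases)
    print(f"{lab}: {pct:.0f}% with cards, RSP in place: {lab_releases[0]['rsp_in_place']}")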
US major labs spend 5–13% of inference compute on additional safety layers (moderation, filtering, classifiers). Chinese labs spend negligible amounts on moderation. Many Chinese releases are open-weight and served elsewhere with varying levels of safety layers.
Chinese labs regularly ship open models within 24 hours after training completes. Some US labs perform suites of evals and external verification, sitting on models approximately 2–6 weeks before release.
An amalgamation of safety benchmarks and jailbreaking scores looking at models from each country’s five leading labs, drawing from sources like the CAIS Safety Leaderboard and standardized red-teaming evaluations.
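One plausible way to build such an amalgamated score is a weighted average over per-model benchmark and jailbreak-resistance results, as in the sketch below; the model names, scores, and weights are placeholder assumptions rather than leaderboard data.

# Placeholder per-model scores on a 0-100 scale; real inputs would come from
# sources like the CAIS Safety Leaderboard and standardized red-teaming runs.
scores = {
    "us_model_1": {"safety_bench": 82, "jailbreak_resistance": 74},
    "us_model_2": {"safety_bench": 78, "jailbreak_resistance": 69},
    "cn_model_1": {"safety_bench": 71, "jailbreak_resistance": 55},
}

WEIGHTS = {"safety_bench": 0.6, "jailbreak_resistance": 0.4}  # assumed weighting

def composite(model_scores: dict[str, float]) -> float:
    """Weighted average of the available safety signals."""
    return sum(WEIGHTS[k] * v for k, v in model_scores.items())

for model, s in scores.items():
    print(f"{model}: composite safety score {composite(s):.1f}")

The weights are the main editorial choice here; different splits between benchmark performance and jailbreak resistance would reorder the dashboard.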
External Safety Teams
These exist to serve frontier AI developers, working to advance the safety outcomes of their products. Such groups take on many shapes and sizes, particularly as their relationships with AI labs continue to mature.
Independent safety labs are groups that spin up out of academia, government agencies, or other nonprofits. These organizations focus on the dominant frontier AI safety categories: alignment, red-teaming, evals/preparedness, interpretability, and incident tracking.
Red-teamers are organizations that pursue contracts to uncover vulnerabilities at scale. Not all are for-profits; many come from nonprofit backgrounds (METR, Humane Intelligence) or government (AISI). They focus on jailbreaking as a service, alignment evals, uplift studies, and model weight security.
Auditors are a rapidly emerging category that performs process, managerial, and control checks. Whereas red-teamers focus on model behavior, auditors inspect the organizational mechanisms that shape those behaviors.
Safety data vendors are another emerging category: teams that curate safety data for later training runs. These offerings are highly targeted, designed to enhance the security and reliability of specific AI releases.
External Safety & Testing Metrics
While the external wing complements the AI labs, there is tension in both nations: talent and funding flow far more strongly to the labs and their internal safety teams, a disparity uncommon in other industries.
Size, quality, and flow of the AI safety talent pool across tiers: undergraduate, post-graduate, and professional.
Total revenue of external safety partners across sub-branches: independent labs, red-teamers, auditors, and data vendors.
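A short sketch of how that revenue total could be rolled up by sub-branch; the organizations and figures are invented purely to show the aggregation.

from collections import defaultdict

# Hypothetical external safety organizations with annual revenue in $M.
external_orgs = [
    {"name": "Indie Lab X",    "branch": "independent_lab", "revenue_musd": 12},
    {"name": "RedTeam Co",     "branch": "red_teamer",      "revenue_musd": 30},
    {"name": "Audit Partners", "branch": "auditor",         "revenue_musd": 8},
    {"name": "SafeData Inc",   "branch": "data_vendor",     "revenue_musd": 20},
]

totals = defaultdict(float)
for org in external_orgs:
    totals[org["branch"]] += org["revenue_musd"]

for branch, total in sorted(totals.items()):
    print(f"{branch:16s} ${total:.0f}M")
print(f"{'all branches':16s} ${sum(totals.values()):.0f}M")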
While state-level movements have begun to rebalance this topic in the US, China retains more mandatory oversight, though the focal point of that testing continues to revolve around content moderation.
Direct funding to organizations like CAISI/CnAISDA or indirect grants to other safety organizations.
Count of published standards and number of certifications. While early attempts may border on safety theater, efforts to standardize safety protocols deserve credit.
Using the AI Incident Database (AIID) as a source, alongside the apparent rise of a Chinese equivalent. Key questions: has each nation built the capacity to track incidents, and in reported harms, which AI system was the weapon of choice?
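A small sketch of the incident roll-up this metric implies, counting reported harms by country, year, and the system involved; the records are invented stand-ins for entries in a database like AIID.

from collections import Counter

# Invented incident records; a real pipeline would ingest AI Incident Database
# (AIID) exports or a Chinese-equivalent feed, if and when one is available.
incidents = [
    {"year": 2024, "country": "US", "system": "chatbot"},
    {"year": 2024, "country": "US", "system": "image generator"},
    {"year": 2024, "country": "CN", "system": "chatbot"},
    {"year": 2025, "country": "US", "system": "agentic tool"},
    {"year": 2025, "country": "CN", "system": "chatbot"},
]

by_country_year = Counter((i["country"], i["year"]) for i in incidents)
by_system = Counter(i["system"] for i in incidents)

print("Reported incidents by country and year:", dict(by_country_year))
print("Most common system type ('weapon of choice'):", by_system.most_common(1))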
This layer is designed for maximum interactivity and scrollability—quick-hitting metrics, accurate reporting, and memorable supporting graphics. Planned visualizations include: funding flow charts, heatmaps of system card releases, inference overhead diagrams, pre-release timeline comparisons, alignment score dashboards, talent pool tiers, and incident bulletin boards.
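As an illustration of one planned view, a minimal matplotlib sketch of a system-card-release heatmap; the labs, quarters, and counts are placeholders, and the chart style is only a suggestion.

import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: rows are labs, columns are quarters, values are the number
# of frontier releases that shipped with a safety/system card.
labs = ["US Lab 1", "US Lab 2", "CN Lab 1", "CN Lab 2"]
quarters = ["Q1", "Q2", "Q3", "Q4"]
card_counts = np.array([
    [2, 1, 3, 2],
    [1, 2, 2, 3],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

fig, ax = plt.subplots(figsize=(6, 3))
im = ax.imshow(card_counts, cmap="Blues")
ax.set_xticks(range(len(quarters)))
ax.set_xticklabels(quarters)
ax.set_yticks(range(len(labs)))
ax.set_yticklabels(labs)
fig.colorbar(im, ax=ax, label="Releases with a safety card")
ax.set_title("System card releases by lab and quarter (illustrative)")
plt.tight_layout()
plt.show()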
Related: Frontier Models
Testing and safety science directly governs the release and deployment of frontier models.
NIST AI Safety Institute Expands
The National Institute of Standards and Technology has expanded its AI Safety Institute, developing comprehensive testing frameworks for frontier AI systems.
China Announces AI Safety Regulations
Beijing has introduced new regulations requiring safety testing for large language models before deployment, though enforcement mechanisms remain unclear.
Academic Safety Research Grows
US universities and research institutions continue to lead in AI safety research, with significant funding from both government and private sector sources.