Testing and Safety Science Analysis
The US leads in AI safety research and testing infrastructure, anchored by safety-focused labs like Anthropic, internal safety teams at OpenAI, and government initiatives focused on responsible AI deployment.
Key Metrics
Safety Spend (%) = (Internal safety team budget / Total lab expenditure) x 100
Testing Window = Time from training completion to public release (US labs: ~2-6 weeks, Chinese labs: often <24 hours)
Safety Overhead (%) = (Compute for moderation & filtering / Total inference compute) x 100 — US major labs: 5-13%, Chinese labs: negligible
Ecosystem = (Third-party red-teamers + Auditors + Safety data vendors + Independent labs) x (Revenue + Government funding), a composite gauge of the external safety ecosystem's size and resourcing
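A minimal sketch of how these four metrics might be computed, assuming a lab reports budget, compute, and release-date figures; every function name and number below is an illustrative placeholder, not sourced data.

# Illustrative calculation of the headline metrics above.
# All figures are hypothetical placeholders, not reported data.
from datetime import date

def safety_spend_pct(safety_budget: float, total_expenditure: float) -> float:
    """Safety Spend (%) = internal safety team budget / total lab expenditure x 100."""
    return safety_budget / total_expenditure * 100

def testing_window_days(training_done: str, public_release: str) -> int:
    """Testing Window = days between training completion and public release."""
    return (date.fromisoformat(public_release) - date.fromisoformat(training_done)).days

def safety_overhead_pct(moderation_compute: float, total_inference_compute: float) -> float:
    """Safety Overhead (%) = compute for moderation & filtering / total inference compute x 100."""
    return moderation_compute / total_inference_compute * 100

def ecosystem_index(external_org_count: int, revenue: float, gov_funding: float) -> float:
    """Ecosystem = external org count x (revenue + government funding)."""
    return external_org_count * (revenue + gov_funding)

if __name__ == "__main__":
    print(f"Safety spend:    {safety_spend_pct(120, 1500):.1f}%")   # hypothetical $M figures
    print(f"Testing window:  {testing_window_days('2025-01-10', '2025-02-21')} days")
    print(f"Safety overhead: {safety_overhead_pct(8, 100):.1f}%")   # hypothetical compute units
    print(f"Ecosystem index: {ecosystem_index(40, 250, 75):,.0f}")  # hypothetical counts / $M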
AI Testing and Safety Science Landscape
AI Testing and Safety Science can largely be split into two dominant branches: internal safety departments within the major labs, and external safety teams that exist to serve frontier AI developers. The relationship between these branches—and the tension over talent, funding, and influence—shapes how safety outcomes evolve across both nations.
Internal Safety Departments
These exist within the major labs or other AI organizations releasing products and applications. At the frontier, these groups might be split into teams that focus on alignment, red-teaming, evals/preparedness, interpretability, and incident tracking.
Internal Safety & Testing Metrics
Rough ballparks of what fraction of total lab spend goes toward safety science and testing. The aim is to convey both the absolute and the relative amount spent, though the adjacency between safety and capability work can cloud the picture.
Number of safety-relevant research releases from the five leading labs in each country, excluding system/model cards and focusing on other safety-adjacent findings. The metric may also capture the weight of those findings via citation counts and awards.
Percentage of frontier releases accompanied by a safety card, extent of the card, existence of a Responsible Scaling Policy, and number of amendments. Chinese labs rarely publish cards and few maintain internal protocols.
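As a worked example of the coverage calculation described above, the sketch below tallies what share of a hypothetical release list shipped with a safety card and whether each lab maintains a Responsible Scaling Policy; the release records and lab names are invented for illustration.

# Hypothetical release records; real data would come from lab publications.
releases = [
    {"lab": "Lab A", "model": "a-1", "safety_card": True,  "rsp_in_place": True},
    {"lab": "Lab A", "model": "a-2", "safety_card": True,  "rsp_in_place": True},
    {"lab": "Lab B", "model": "b-1", "safety_card": False, "rsp_in_place": False},
    {"lab": "Lab B", "model": "b-2", "safety_card": True,  "rsp_in_place": False},
]

with_card = sum(r["safety_card"] for r in releases)
coverage_pct = 100 * with_card / len(releases)
print(f"Safety card coverage: {coverage_pct:.0f}% of {len(releases)} frontier releases")

# Per-lab breakdown, as it might feed a heatmap view later in the layer.
for lab in sorted({r["lab"] for r in releases}):
    lab_releases = [r for r in releases if r["lab"] == lab]
    pct = 100 * sum(r["safety_card"] for r in lab_releases) / len(lab_releases)
    print(f"{lab}: {pct:.0f}% with cards, RSP in place: {lab_releases[0]['rsp_in_place']}")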
US major labs spend 5–13% of inference compute on additional safety layers (moderation, filtering, classifiers). Chinese labs spend negligible amounts on moderation. Many Chinese releases are open-weight and served elsewhere with varying levels of safety layers.
Chinese labs regularly ship open models within 24 hours after training completes. Some US labs perform suites of evals and external verification, sitting on models approximately 2–6 weeks before release.
An amalgamation of safety benchmarks and jailbreaking scores looking at models from each country’s five leading labs, drawing from sources like the CAIS Safety Leaderboard and standardized red-teaming evaluations.
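One plausible way to build such an amalgamated score is a weighted average over per-model benchmark and jailbreak-resistance results, as in the sketch below; the model names, scores, and weights are placeholder assumptions rather than leaderboard data.

# Placeholder per-model scores on a 0-100 scale; real inputs would come from
# sources like the CAIS Safety Leaderboard and standardized red-teaming runs.
scores = {
    "us_model_1": {"safety_bench": 82, "jailbreak_resistance": 74},
    "us_model_2": {"safety_bench": 78, "jailbreak_resistance": 69},
    "cn_model_1": {"safety_bench": 71, "jailbreak_resistance": 55},
}

WEIGHTS = {"safety_bench": 0.6, "jailbreak_resistance": 0.4}  # assumed weighting

def composite(model_scores: dict[str, float]) -> float:
    """Weighted average of the available safety signals."""
    return sum(WEIGHTS[k] * v for k, v in model_scores.items())

for model, s in scores.items():
    print(f"{model}: composite safety score {composite(s):.1f}")

The weights are the main editorial choice here; different splits between benchmark performance and jailbreak resistance would reorder the dashboard.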
External Safety Teams
These exist to serve frontier AI developers, working to advance the safety outcomes of their products. Such groups take on many shapes and sizes, particularly as their relationships with AI labs continue to mature.
Independent safety labs are groups that spin up out of academia, government agencies, or other nonprofits. These organizations focus on the dominant frontier AI safety categories: alignment, red-teaming, evals/preparedness, interpretability, and incident tracking.
Red-teamers are organizations that pursue contracts to uncover vulnerabilities at scale. Not all are for-profits; many come from nonprofit backgrounds (METR, Humane Intelligence) or government (AISI). They focus on jailbreaking as a service, alignment evals, uplift studies, and model weight security.
Auditors are a rapidly emerging category that performs process, managerial, and control checks. Whereas red-teamers focus on model behavior, auditors inspect the organizational mechanisms that shape those behaviors.
Safety data vendors are another emerging category: teams that curate safety data for later training runs. These offerings are highly targeted, designed to enhance the security and reliability of specific AI releases.
External Safety & Testing Metrics
While the external wing complements the AI labs, there is tension in both nations: talent and funding flow far more strongly to the labs and their internal safety teams, a disparity uncommon in other industries.
Size, quality, and flow of the AI safety talent pool across tiers: undergraduate, post-graduate, and professional.
Total revenue of external safety partners across sub-branches: independent labs, red-teamers, auditors, and data vendors.
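A short sketch of how that revenue total could be rolled up by sub-branch; the organizations and figures are invented purely to show the aggregation.

from collections import defaultdict

# Hypothetical external safety organizations with annual revenue in $M.
external_orgs = [
    {"name": "Indie Lab X",    "branch": "independent_lab", "revenue_musd": 12},
    {"name": "RedTeam Co",     "branch": "red_teamer",      "revenue_musd": 30},
    {"name": "Audit Partners", "branch": "auditor",         "revenue_musd": 8},
    {"name": "SafeData Inc",   "branch": "data_vendor",     "revenue_musd": 20},
]

totals = defaultdict(float)
for org in external_orgs:
    totals[org["branch"]] += org["revenue_musd"]

for branch, total in sorted(totals.items()):
    print(f"{branch:16s} ${total:.0f}M")
print(f"{'all branches':16s} ${sum(totals.values()):.0f}M")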
While state-level movements have begun to rebalance this topic in the US, China retains more mandatory oversight, though the focal point of that testing continues to revolve around content moderation.
Direct funding to organizations like CAISI/CnAISDA or indirect grants to other safety organizations.
Count of published standards and number of certifications. While early attempts may border on safety theater, efforts to standardize safety protocols deserve credit.
Using the AI Incident Database (AIID) as a source, alongside the apparent rise of a Chinese equivalent. Key questions: has each nation built the capacity to track incidents, and in reported harms, which AI system was the weapon of choice?
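A small sketch of the incident roll-up this metric implies, counting reported harms by country, year, and the system involved; the records are invented stand-ins for entries in a database like AIID.

from collections import Counter

# Invented incident records; a real pipeline would ingest AI Incident Database
# (AIID) exports or a Chinese-equivalent feed, if and when one is available.
incidents = [
    {"year": 2024, "country": "US", "system": "chatbot"},
    {"year": 2024, "country": "US", "system": "image generator"},
    {"year": 2024, "country": "CN", "system": "chatbot"},
    {"year": 2025, "country": "US", "system": "agentic tool"},
    {"year": 2025, "country": "CN", "system": "chatbot"},
]

by_country_year = Counter((i["country"], i["year"]) for i in incidents)
by_system = Counter(i["system"] for i in incidents)

print("Reported incidents by country and year:", dict(by_country_year))
print("Most common system type ('weapon of choice'):", by_system.most_common(1))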
This layer is designed for maximum interactivity and scrollability—quick-hitting metrics, accurate reporting, and memorable supporting graphics. Planned visualizations include: funding flow charts, heatmaps of system card releases, inference overhead diagrams, pre-release timeline comparisons, alignment score dashboards, talent pool tiers, and incident bulletin boards.
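As an illustration of one planned view, a minimal matplotlib sketch of a system-card-release heatmap; the labs, quarters, and counts are placeholders, and the chart style is only a suggestion.

import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: rows are labs, columns are quarters, values are the number
# of frontier releases that shipped with a safety/system card.
labs = ["US Lab 1", "US Lab 2", "CN Lab 1", "CN Lab 2"]
quarters = ["Q1", "Q2", "Q3", "Q4"]
card_counts = np.array([
    [2, 1, 3, 2],
    [1, 2, 2, 3],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

fig, ax = plt.subplots(figsize=(6, 3))
im = ax.imshow(card_counts, cmap="Blues")
ax.set_xticks(range(len(quarters)))
ax.set_xticklabels(quarters)
ax.set_yticks(range(len(labs)))
ax.set_yticklabels(labs)
fig.colorbar(im, ax=ax, label="Releases with a safety card")
ax.set_title("System card releases by lab and quarter (illustrative)")
plt.tight_layout()
plt.show()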
Related: Frontier Models
Testing and safety science directly governs the release and deployment of frontier models.
NIST AI Safety Institute Expands
The National Institute of Standards and Technology has expanded its AI Safety Institute, developing comprehensive testing frameworks for frontier AI systems.
China Announces AI Safety Regulations
Beijing has introduced new regulations requiring safety testing for large language models before deployment, though enforcement mechanisms remain unclear.
Academic Safety Research Grows
US universities and research institutions continue to lead in AI safety research, with significant funding from both government and private sector sources.