
AI Military Targeting: How the US Used AI in Iran

By Ashit Vora · 12 min read

What Matters

  • Palantir's Maven Smart System fuses satellite imagery, drone feeds, radar, and signals intelligence into one AI-powered targeting interface.
  • The system replaced the work of roughly 2,000 intelligence analysts with 20 operators - identifying 5,500+ targets in weeks.
  • Anthropic's Claude AI model powers Maven's intelligence assessments, target identification, and battle scenario simulations.
  • The Pentagon spent $13.4 billion on military AI in fiscal 2026. Maven's contract alone grew from $480 million to $1.3 billion in under two years.
  • The campaign exposed critical engineering challenges: data freshness, accuracy gaps (60% AI vs 84% human), and the limits of 'human-in-the-loop' at machine speed.

On February 28, 2026, the US military struck approximately 1,000 targets across Iran in a single day. By mid-March, that number crossed 5,500. In the 2003 Iraq invasion, a similar opening campaign took weeks and thousands of intelligence analysts.

This time, roughly 20 people operated a single AI system.

Operation Epic Fury marked the first large-scale use of AI-driven military targeting. CENTCOM commander Admiral Brad Cooper confirmed it publicly on March 11, 2026. The core technology: Palantir's Maven Smart System, powered in part by Anthropic's Claude AI.

Whether you build AI products or just use them, this campaign reveals what happens when AI moves from research labs to the highest-stakes environment on earth. The technical architecture, the accuracy challenges, and the engineering trade-offs carry lessons for any organization deploying AI in production.

The Technical Architecture: How Maven Actually Works

Maven isn't one model. It's a data fusion platform that combines multiple AI capabilities into a unified targeting pipeline.

Data ingestion. Maven pulls from four primary intelligence streams:

  • Satellite imagery - overhead photos of terrain, structures, and activity patterns
  • Drone video feeds - real-time full-motion video from surveillance aircraft
  • Radar data - electronic signatures from military equipment and installations
  • Signals intelligence (SIGINT) - intercepted communications and electronic emissions

Each stream generates massive volumes. A single Reaper drone produces roughly 1.5 terabytes of data per mission. Satellite constellations capture millions of square kilometers daily. No human team can process all of it in real time.
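The fusion step that makes this tractable is a normalization layer: heterogeneous sensor events are mapped into one common schema and merged into a single time-ordered feed. The sketch below illustrates the pattern in miniature; the class names, fields, and event format are entirely illustrative, not Maven's actual interface.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum

class StreamType(Enum):
    SATELLITE_IMAGERY = "satellite"
    DRONE_VIDEO = "drone"
    RADAR = "radar"
    SIGINT = "sigint"

@dataclass
class IntelRecord:
    """One normalized observation from any intelligence stream."""
    stream: StreamType
    collected_at: datetime
    location: tuple          # (lat, lon)
    payload: dict = field(default_factory=dict)

def ingest(raw_events: list) -> list:
    """Map heterogeneous sensor events into one schema, time-ordered
    so downstream fusion sees a single unified feed."""
    records = [
        IntelRecord(
            stream=StreamType(ev["source"]),
            collected_at=datetime.fromisoformat(ev["timestamp"]),
            location=(ev["lat"], ev["lon"]),
            payload=ev.get("data", {}),
        )
        for ev in raw_events
    ]
    return sorted(records, key=lambda r: r.collected_at)
```

The point of the common schema is that everything downstream - classification, cross-referencing, package generation - only has to understand one record type, no matter which sensor produced it.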

AI classification. Machine learning models - primarily computer vision and natural language processing - classify objects across these streams. The system identifies buildings, vehicles, military equipment, radar installations, and infrastructure. It cross-references findings against intelligence databases to determine what qualifies as a military target.

Target package generation. Once Maven identifies a potential target, it generates a full strike recommendation: the target classification, confidence score, recommended weapons system, expected collateral damage estimate, and suggested timing. This is what Cameron Stanley, the Pentagon's Chief Digital and AI Officer, described when he said: "We've gone from identifying the target to now coming up with a course of action, to now actioning that target, all from one system."

Human review. A human operator reviews the recommendation and decides whether to approve, modify, or reject it. Admiral Cooper stressed: "Humans will always make final decisions on what to shoot and what not to shoot and when to shoot."

The entire pipeline - from raw intelligence to a ready-to-execute strike recommendation - runs in near real time. What previously took hours or days of analyst work takes seconds.
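The last two stages - package generation and human review - can be sketched as a data structure plus a gate. The field names below mirror the package contents the article lists, and the reviewer policy is a hypothetical stand-in for the human decision; none of this is Maven's real interface.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    APPROVE = "approve"
    MODIFY = "modify"
    REJECT = "reject"

@dataclass
class StrikePackage:
    """AI-generated recommendation; fields mirror those the article lists."""
    target_id: str
    classification: str      # e.g. "radar installation"
    confidence: float        # model confidence, 0.0-1.0
    weapon: str              # recommended weapons system
    collateral_estimate: int # expected collateral damage estimate
    timing: str              # suggested strike window

def human_review(pkg: StrikePackage, reviewer) -> Decision:
    """Nothing executes without this gate: every package passes through
    a human decision before any action is taken."""
    return reviewer(pkg)

# Illustrative reviewer policy: reject low-confidence packages outright
reviewer = lambda p: Decision.REJECT if p.confidence < 0.7 else Decision.APPROVE
```

The design point is that the AI's output is a structured recommendation, not an action - the executable step lives on the far side of the review function.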

Why This Would Be Nearly Impossible Without AI

To understand what Maven replaced, look at how military targeting worked before.

The Iraq model (2003). During the invasion of Iraq, the targeting process was almost entirely manual. Roughly 2,000 intelligence analysts reviewed satellite photos, intercepted communications, and field reports to build a target list. Each target required multiple analysts to confirm, cross-reference, and validate. Building a strike package took hours to days per target.

The opening "shock and awe" campaign struck around 500 targets in the first 24 hours - a feat that required months of preparation.

The Iran model (2026). Maven processed the same types of intelligence data, but at machine speed. Twenty operators managed a pipeline that identified, classified, and recommended 1,000+ targets on day one. By mid-March, CENTCOM reported over 5,500 targets struck.

The math makes the difference obvious:

| Factor | Iraq (2003) | Iran (2026) |
| --- | --- | --- |
| Analysts/operators | ~2,000 | ~20 |
| Targets (first 24 hours) | ~500 | ~1,000 |
| Prep time for target list | Months | Hours |
| Data streams fused | Manual, siloed | Automated, unified |
| Time per target recommendation | Hours to days | Seconds to minutes |

This isn't incremental improvement. It's a fundamentally different operational capability. Processing 5,500 targets manually at the same speed would have required tens of thousands of analysts working around the clock - a workforce the military doesn't have and couldn't deploy.
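A quick back-of-envelope calculation, using the article's round numbers, shows why. Even naive linear scaling of the 2003 manual process understates the problem, because it ignores the months of preparation that would also have to be compressed into hours.

```python
# Illustrative arithmetic only, using the article's round numbers
iraq_analysts = 2_000
iraq_targets_day1 = 500
iran_targets_day1 = 1_000

analysts_per_target = iraq_analysts / iraq_targets_day1      # 4.0
manual_headcount = analysts_per_target * iran_targets_day1   # 4,000
print(manual_headcount)  # 4000.0 - for day one alone, before compressing
                         # months of list-building prep into hours
```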

The AI didn't replace human judgment. It replaced the mechanical work of sifting through satellite images, correlating signals data, and assembling the information into a format commanders could act on. That's the bulk of intelligence work - not the decision, but the processing that makes the decision possible.

The Technology Behind Maven: A Nine-Year Journey

Maven didn't appear overnight. Its development tells the story of how military AI moved from concept to combat.

2017: Project Maven launches. Deputy Secretary of Defense Robert O. Work established the "Algorithmic Warfare Cross-Functional Team." The first mission: use machine learning to analyze drone footage from the fight against ISIS. Google held the initial contract, worth roughly $9 million.

2018: Google walks away. More than 3,000 Google engineers signed an internal petition protesting the company's involvement. Google withdrew from Maven and published AI principles restricting weapons development. Palantir, founded by Peter Thiel, stepped in as the primary contractor. Booz Allen Hamilton received a $751 million prime contract for broader Maven support.

2022-2023: Maven matures. Responsibilities split between the National Geospatial-Intelligence Agency (for geospatial AI) and the Pentagon's Chief Digital and AI Office (for broader applications). The geospatial portion became an official program of record.

2024: Major contract expansion. The Pentagon signed a $480 million, five-year contract with Palantir specifically for the Maven Smart System. Separately, NATO acquired a version of the platform (MSS NATO) for alliance-wide use.

2025: The scale grows. Maven's contract ceiling rose to $1.3 billion. The US Army awarded Palantir a separate enterprise agreement worth up to $10 billion over a decade. Anthropic signed a $200 million contract with the DOD, integrating Claude into classified networks - the first major AI lab to do so.

2026: Combat deployment. Maven was deployed at full scale during Operation Epic Fury. On March 20, Deputy Secretary of Defense Steve Feinberg signed a memo formalizing Maven as an official program of record, to be managed by the US Army by September 2026.

Pentagon CIO Kirsten Davies confirmed to the Senate that Claude powers Maven's intelligence assessments, target identification, and battle scenario simulations.

Total Pentagon AI spending for fiscal year 2026: $13.4 billion.

The Role of Claude AI Inside Maven

Anthropic's Claude serves a specific function within the Maven architecture. It's not the targeting AI itself - it's the reasoning layer.

According to Pentagon testimony, Claude handles three tasks:

  1. Intelligence assessments. Claude processes and synthesizes intelligence reports - the text-based analysis that accompanies imagery and signals data. It summarizes findings, identifies patterns across reports, and flags inconsistencies.

  2. Target identification support. While computer vision models handle the image classification, Claude helps interpret the broader context: what type of facility is this? What's its likely function based on surrounding infrastructure and intelligence reporting?

  3. Battle scenario simulation. Claude models different strike approaches and their likely outcomes - helping commanders evaluate options before committing to a course of action.

This mirrors how large language models are used in enterprise AI: not as the primary decision engine, but as the reasoning and synthesis layer that makes sense of data other systems have processed.
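In code, that enterprise pattern looks roughly like the sketch below: upstream models emit structured detections, and the language model receives a synthesis prompt rather than raw sensor data. The function, event schema, and stub client here are hypothetical illustrations of the pattern, not the actual Claude integration.

```python
def synthesize_assessment(detections: list, reports: list, llm) -> str:
    """LLM-as-synthesis-layer: upstream models do the detecting; the
    language model summarizes, cross-references, and flags uncertainty.
    `llm` is any callable mapping a prompt string to a completion string."""
    detection_lines = "\n".join(
        f"- {d['label']} at {d['lat']:.3f},{d['lon']:.3f} "
        f"(confidence {d['confidence']:.0%})"
        for d in detections
    )
    prompt = (
        "You are an analyst assistant. Summarize the findings below, "
        "flag inconsistencies between detections and reports, and state "
        "what is uncertain.\n\n"
        f"Detections:\n{detection_lines}\n\n"
        "Reports:\n" + "\n".join(f"- {r}" for r in reports)
    )
    return llm(prompt)

# Any LLM client can be plugged in; a stub shows the contract:
assessment = synthesize_assessment(
    [{"label": "radar installation", "lat": 27.1, "lon": 56.3, "confidence": 0.82}],
    ["Site reported active as of last week."],
    llm=lambda prompt: f"(model output for a {len(prompt)}-char prompt)",
)
```

Keeping the LLM behind a plain callable also makes the reasoning layer swappable - a property that, as the next section shows, turned out to matter.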

The Anthropic-Pentagon relationship, however, hit turbulence. Anthropic had drawn two contractual red lines: no autonomous weapons and no domestic mass surveillance. When negotiations over these guardrails broke down in late 2025, the situation escalated.

On February 27, 2026 - one day before Operation Epic Fury launched - Defense Secretary Hegseth designated Anthropic a "supply chain risk," a label previously reserved for foreign adversaries. President Trump directed agencies to stop using Anthropic products.

Anthropic sued. A federal judge blocked the designation on March 26, calling it "Orwellian" and likely motivated by "unlawful retaliation."

The dispute raised a question that goes beyond military applications: can AI companies set ethical boundaries on how their technology is used - and enforce them? OpenAI and xAI moved in to fill the gap within hours. Military Times reported that DOD personnel resisted switching: "Career IT people at DOD hate this move because they had finally gotten operators comfortable using AI."

Engineering Challenges: Where AI Targeting Breaks Down

The Iran campaign exposed three technical problems that apply far beyond military use.

1. Accuracy Gaps Between Lab and Field

Maven reportedly identifies objects at roughly 60% accuracy in operational conditions. Human analysts achieve about 84%.

A 2021 US Air Force experiment painted an even starker picture: a targeting AI scored 25% accuracy in real-world conditions while rating its own confidence at 90%.

This gap between lab performance and field performance is one of the most persistent problems in production AI systems. Models trained on clean, labeled datasets degrade when they encounter real-world variability - weather, camouflage, unusual structures, or data from sensors they weren't trained on.

For any organization deploying AI in production, the lesson is clear: benchmark accuracy on your actual operating environment, not your training data.
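A toy evaluation harness makes the point concrete: the same accuracy function, run on a lab-style test set and a field-style test set, can tell very different stories. The data below is synthetic, chosen only to echo the article's 90%-vs-60% shape of gap.

```python
def accuracy(preds: list, labels: list) -> float:
    """Fraction of predictions that match ground-truth labels."""
    assert len(preds) == len(labels)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Synthetic results for one model on two environments
lab_preds,   lab_labels   = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1], [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
field_preds, field_labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 1], [1, 1, 0, 0, 0, 1, 1, 0, 0, 1]

print(f"lab:   {accuracy(lab_preds, lab_labels):.0%}")      # 90%
print(f"field: {accuracy(field_preds, field_labels):.0%}")  # 60%
```

The operational takeaway: maintain a held-out evaluation set drawn from the environment you actually deploy into, and treat the lab number as an upper bound, not a forecast.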

2. Data Freshness as a Critical Failure Point

The most documented targeting error in the campaign - the strike on a school building in Minab - wasn't caused by AI failure. It was caused by stale data. The Defense Intelligence Agency's classification of the building hadn't been updated since at least 2016, even though satellite imagery showed the building had changed use years earlier.

The AI did exactly what it was designed to do. It processed the intelligence it was given and generated a recommendation. The intelligence was wrong.

This is the same failure pattern we see in commercial AI deployments. A recommendation engine trained on last quarter's data serves outdated suggestions. A fraud detection model trained on 2024 patterns misses 2026 attack vectors. A customer service agent trained on an old product catalog gives wrong answers.

AI is only as current as the data it runs on. And in high-stakes applications, data freshness isn't a nice-to-have. It's the difference between a correct decision and a catastrophic one.
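One common mitigation is a freshness gate in front of the model: every reference record carries a last-verified timestamp, and stale entries get flagged before anything consumes them. A minimal sketch, with an illustrative one-year freshness budget and made-up record fields:

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=365)  # illustrative freshness budget

def freshness_gate(record: dict, now=None) -> dict:
    """Attach a staleness flag before any model consumes the record.
    Downstream logic must treat stale records as requiring re-verification."""
    now = now or datetime.now(timezone.utc)
    age = now - record["last_verified"]
    return {**record, "stale": age > MAX_AGE, "age_days": age.days}

# A record last assessed a decade ago, evaluated at campaign time
site = {
    "site_id": "bld-4412",
    "classification": "military storage",
    "last_verified": datetime(2016, 5, 1, tzinfo=timezone.utc),
}
checked = freshness_gate(site, now=datetime(2026, 3, 1, tzinfo=timezone.utc))
print(checked["stale"], checked["age_days"])  # True 3591
```

A gate like this doesn't fix stale data, but it converts a silent failure into a visible one - the record arrives at the decision point marked as needing re-verification instead of masquerading as current.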

3. Human-in-the-Loop at Machine Speed

Admiral Cooper described turning "hours and sometimes even days into seconds." That's the selling point. It's also the risk.

When AI generates 1,000+ recommendations per day, the nature of human review changes. Researchers call it automation bias - the tendency to accept AI recommendations without independent verification, especially when the system presents them with high confidence scores.

Georgia Tech researchers studying the campaign noted: "The presence of a human in the loop does not automatically make the process safe. If the loop moves faster than human cognition, the human becomes a formality."

This challenge shows up in every domain where AI makes recommendations at scale. Radiologists reviewing AI-flagged scans. Loan officers processing AI-scored applications. Content moderators approving AI classifications. The faster the system runs, the harder it becomes for the human to meaningfully override it.

The engineering solution isn't to slow everything down. It's to build tiered review systems - automated approval for low-stakes, high-confidence decisions, and mandatory human scrutiny (with time built in) for high-stakes or low-confidence ones.
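A tiered router of that kind can be tiny; what matters is that automated approval is the exception, not the default. The thresholds and stake labels below are illustrative:

```python
from enum import Enum

class Route(Enum):
    AUTO_APPROVE = "auto"
    HUMAN_REVIEW = "human"
    HUMAN_REVIEW_TIMED = "human_with_mandatory_delay"

def route(stakes: str, confidence: float) -> Route:
    """Tiered review: automation only where stakes are low AND confidence
    is high; time-protected human scrutiny everywhere else."""
    if stakes == "high":
        return Route.HUMAN_REVIEW_TIMED   # always human, with time built in
    if confidence >= 0.95:
        return Route.AUTO_APPROVE
    return Route.HUMAN_REVIEW

print(route("low", 0.98))   # Route.AUTO_APPROVE
print(route("low", 0.70))   # Route.HUMAN_REVIEW
print(route("high", 0.99))  # Route.HUMAN_REVIEW_TIMED
```

Note that high stakes override high confidence: a 99%-confident recommendation still gets a time-protected human look, which is exactly the property a flat confidence threshold would lose.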

The Bigger Picture: Gaza, NATO, and What Comes Next

The Iran campaign didn't happen in isolation. It fits into a global trend of AI-assisted military operations.

Israel deployed two AI systems in Gaza: The Gospel (which identifies buildings as potential targets) and Lavender (which identifies people as potential targets). +972 Magazine's investigation reported that Lavender flagged roughly 37,000 people as suspected Hamas operatives after a sample check showed about 90% accuracy.

Maven is architecturally closer to Gospel - it identifies physical targets and sites rather than individuals. But both reflect the same trend: AI compressing the targeting timeline to a speed that fundamentally changes how military operations work.

NATO acquired its own version of Maven (MSS NATO) in March 2025. The Lieber Institute at West Point has published legal analysis on both systems, noting they raise identical questions about how AI-assisted targeting fits within existing International Humanitarian Law.

One unexpected consequence: AI kill chains generate detailed digital logs of every recommendation, approval, weapons selection, and damage estimate. This makes Operation Epic Fury the most documented targeting campaign in military history. Those records create accountability trails that didn't exist in previous conflicts - and could become evidence in future legal proceedings.

What This Means for AI in Business

You don't need to build weapons to learn from this campaign. The technical challenges are the same ones every organization faces when deploying AI at scale:

Data quality is everything. Maven's most visible failure came from stale data, not broken algorithms. The same applies to any AI system: if your training data or reference data is outdated, your AI will confidently produce wrong answers. Build continuous data validation into every production system.

Lab accuracy doesn't transfer to production. Maven's 60% operational accuracy vs. the Air Force experiment's 25% real-world accuracy vs. 90% self-reported confidence tells a familiar story. Always test on production data, not training benchmarks.

Speed amplifies both capability and error. AI can process in seconds what humans need hours for. That's the value. But every error also propagates at machine speed. Design systems with circuit breakers - automatic stops when confidence drops below a threshold or when outputs deviate from expected patterns.
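A confidence-based circuit breaker of the kind described can be a few lines: track a rolling mean over recent outputs and halt automation when it falls below a floor. The window size and floor below are illustrative:

```python
from collections import deque

class ConfidenceBreaker:
    """Halt automated processing when the rolling mean confidence of
    recent outputs drops below a floor (thresholds illustrative)."""

    def __init__(self, window: int = 50, floor: float = 0.75):
        self.recent = deque(maxlen=window)
        self.floor = floor
        self.tripped = False

    def observe(self, confidence: float) -> bool:
        """Record one output; returns True if processing may continue."""
        self.recent.append(confidence)
        if sum(self.recent) / len(self.recent) < self.floor:
            self.tripped = True   # stays tripped until a human resets it
        return not self.tripped

breaker = ConfidenceBreaker(window=4, floor=0.75)
for c in [0.9, 0.85, 0.6, 0.55]:   # output quality degrades mid-stream
    ok = breaker.observe(c)
print(breaker.tripped)  # True - the last reading pulls the mean to 0.725
```

The breaker latching open (rather than resetting itself) is deliberate: recovering from a degraded state is itself a high-stakes decision, so it goes back to a human.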

"Human in the loop" is a design challenge, not a checkbox. Just putting a human in front of an AI's output doesn't create meaningful oversight. The human needs enough time, context, and authority to actually override the system. If your AI makes 1,000 recommendations a day and you expect one person to verify them all, you don't have oversight. You have a rubber stamp.

The Pentagon spent $13.4 billion on AI this fiscal year because they concluded AI-driven operations are the future of warfare. Every industry is making the same bet at a different scale. The organizations that deploy AI successfully won't be the ones that move fastest. They'll be the ones that build the systems - data pipelines, validation layers, human review processes - to catch errors before they compound.

Maven proved AI can do things that were previously impossible. The Iran campaign also proved that "possible" and "reliable" aren't the same thing. That gap is where the real engineering work happens.

Frequently asked questions

What AI system did the US use in the strikes on Iran?

The Pentagon used Palantir's Maven Smart System, which fuses satellite imagery, drone video feeds, radar data, and signals intelligence into a single interface. Anthropic's Claude AI model was embedded within Maven for intelligence assessments, target identification, and battle scenario simulations. CENTCOM commander Admiral Brad Cooper confirmed the use of AI tools on March 11, 2026.
