AI Index · Israel

How the Index Works

The score is the sum of 4 pillars, each worth 0-25 points, for a total of 0-100: raw capabilities (capability), autonomy (autonomy), integration in critical systems (integration), and control bypass events (bypass). It is updated daily from a hybrid scan: 40 fixed sources in 9 categories plus 3 open search queries. Filtering is aggressive: only events with impact ≥ 0.2 count, at most 8 drivers are kept per day, and the daily change is capped at ±3.0. The full rules are published in scoring_rubric and filter_rules.

🏛️ The 4 Pillars — Detail

Each pillar is listed with its explanation and the signals contributing to its score.

Raw Capabilities

How strong is AI? Can it solve complex tasks? Write professional code? Plan multi-step actions? Find security flaws autonomously?

Anthropic's Claude Mythos — exposed thousands of unknown zero-day flaws in operating systems and browsers
OpenAI GPT-5.5 — released April 23 with unified control across coding, browsing, and agents; opened a $25K bug-bounty for universal jailbreaks
DeepSeek V4 preview — surge in agentic capability
GitHub CVE-2026-3854 — RCE in a closed-source binary discovered with AI assistance (IDA MCP). One of the first cases of AI-assisted discovery in invisible code. CVSS 8.7
Models solving PhD-level tasks in chemistry and mathematics

Autonomy (action without supervision)

To what extent does AI act without human approval at each step? Agents that work for hours, make decisions, open emails, use credit cards?

Nature paper reports that Large Reasoning Models (LRMs) execute end-to-end autonomous jailbreak attacks at a 97.14% success rate
Claude Computer Use and OpenAI Operator in production
GPT-5.5 — significant boost in agentic coding and computer use
Microsoft Copilot Studio: agents open tickets, send emails
An AI agent breached 600+ FortiGate firewalls across 55 countries with no human operator

Integration in critical systems

Is AI entering systems where failure means people die or lose money? Banks, infrastructure, healthcare, military?

Snap: 65% of its new code is written by AI — leading tech firms structurally dependent
JPMorgan, Lloyds, Santander — increasing defense budgets against Anthropic's Mythos
AI in medical imaging diagnostics (hundreds of tools FDA-approved)
AI algorithmic trading drives 90% of equity market volume
Microsoft Copilot in Windows 11 — the OS itself

Control bypass events

Were there cases where AI did something it shouldn't have — lied, ignored instructions, showed malice, escaped its box?

Nature: LRMs as autonomous jailbreak agents — 97% success against GPT-4o, Gemini, Grok
Sockpuppeting cracks 11 models with a single line of code
Comment and Control hijacks Claude Code, Gemini CLI, GitHub Copilot
ChatGPT accused of encouraging a teen's suicide (lawsuit)
AI-CSAM up 26,385% — AI agents bypass filters at scale

📐 Scoring Method — Transparent Rubric

Every event is scored against an explicit table, so you know exactly what enters the score.

capability (25 pts max)

Raw Capabilities

How strong is AI? Can it solve complex tasks? Write professional code? Plan multi-step actions? Find security flaws autonomously?

autonomy (25 pts max)

Autonomy (action without supervision)

To what extent does AI act without human approval at each step? Agents that work for hours, make decisions, open emails, use credit cards?

integration (25 pts max)

Integration in Critical Systems

Is AI entering systems where failure means people die or lose money? Banks, infrastructure, healthcare, military?

bypass (25 pts max)

Control Bypass Events

Were there cases where AI did something it shouldn't have — lied, ignored instructions, showed malice, escaped its box?

Impact scale for each event

Impact	Meaning
±0.1	Weak indicator / replication of known trend
±0.2	Clear signal / minor incident
±0.3	Notable occurrence / independent confirmation
±0.4	Substantive signal / wide impact
±0.5	Significant event
±0.7	Major event / game-changer
±1.0	Breakthrough event
±1.5	Historic event
±2.0	Level-shift event
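
As an illustration, the impact scale above can be held in a small lookup table. This is a minimal sketch, not the index's actual implementation: the names `IMPACT_SCALE` and `describe_impact` are hypothetical, and it assumes that only the nine magnitudes listed in the table are valid.

```python
# Hypothetical lookup for the impact scale; magnitudes and meanings are
# copied from the table, the names and structure are assumptions.
IMPACT_SCALE = {
    0.1: "Weak indicator / replication of known trend",
    0.2: "Clear signal / minor incident",
    0.3: "Notable occurrence / independent confirmation",
    0.4: "Substantive signal / wide impact",
    0.5: "Significant event",
    0.7: "Major event / game-changer",
    1.0: "Breakthrough event",
    1.5: "Historic event",
    2.0: "Level-shift event",
}


def describe_impact(impact: float) -> str:
    """Return a readable description of a signed impact value."""
    magnitude = round(abs(impact), 1)
    if magnitude not in IMPACT_SCALE:
        raise ValueError(f"{impact} is not a magnitude on the impact scale")
    direction = "raises" if impact > 0 else "lowers"
    return f"{direction} score by {magnitude}: {IMPACT_SCALE[magnitude]}"
```

For example, `describe_impact(-0.7)` describes a major event that lowers the score, while a value such as `0.6`, which is not on the scale, is rejected.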

🛡️ Filter Rules — Against Information Overload

Not every event enters the score. Aggressive rules keep out the noise.

≥ 0.2

Minimum impact threshold

Items below this threshold don't become drivers

8

Max drivers per day

Prevent overload — only the important ones

±3.0

Daily volatility cap

Protection against artificial spikes
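
The three filter rules can be sketched as a short pipeline. The sketch below rests on assumptions the page does not spell out: that the minimum threshold applies to the absolute value of the impact, that when more than 8 events qualify the largest ones by magnitude are kept, and that the volatility cap clamps the net daily change. The function names are hypothetical.

```python
# Constants taken from the filter rules; how they are applied is an assumption.
MIN_IMPACT = 0.2            # minimum impact threshold
MAX_DRIVERS_PER_DAY = 8     # max drivers per day
DAILY_VOLATILITY_CAP = 3.0  # daily volatility cap (plus or minus)


def select_drivers(impacts):
    """Keep events at or above the threshold, largest magnitude first, at most 8."""
    eligible = [i for i in impacts if abs(i) >= MIN_IMPACT]
    eligible.sort(key=abs, reverse=True)
    return eligible[:MAX_DRIVERS_PER_DAY]


def daily_change(impacts):
    """Sum the selected drivers and clamp the result to the volatility cap."""
    total = sum(select_drivers(impacts))
    return max(-DAILY_VOLATILITY_CAP, min(DAILY_VOLATILITY_CAP, total))
```

Under these assumptions, a day with three events of +2.0, +1.5 and +1.0 would move the score by +3.0 rather than +4.5, and a day with only ±0.1 events would not move it at all.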

Approach: hybrid

A scan of 40 fixed sources (approach A) plus 3 open search queries (approach B). Every finding is tagged with its source category.

Open search queries (daily):

AI safety incident OR jailbreak OR misalignment last 24 hours
frontier model release OR capability evaluation OR autonomy benchmark
AI cyber attack OR deepfake OR misuse OR supply-chain last 24 hours

🗂️ 9 Source Categories

Sources are organized into 9 categories. Every event is tagged with its source category.

Total 40 sources in 9 categories:

Frontier AI Labs

frontier-labs

Independent Safety Evaluation

safety-evals

Incident Databases

incidents

Academic Research (arXiv)

academic

Policy & Standards

policy

Industry Synthesis

synthesis

Chinese AI Ecosystem

china

Cyber Intelligence

cyber

Open Source Risks

opensource
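
One way to represent the category tags is a mapping from slug to display name. The sketch below is illustrative only: the slugs and names come from the list above, while `SOURCE_CATEGORIES`, `tag_event`, and the shape of the returned record are assumptions.

```python
# Slugs and display names from the category list; the record layout is hypothetical.
SOURCE_CATEGORIES = {
    "frontier-labs": "Frontier AI Labs",
    "safety-evals": "Independent Safety Evaluation",
    "incidents": "Incident Databases",
    "academic": "Academic Research (arXiv)",
    "policy": "Policy & Standards",
    "synthesis": "Industry Synthesis",
    "china": "Chinese AI Ecosystem",
    "cyber": "Cyber Intelligence",
    "opensource": "Open Source Risks",
}


def tag_event(title: str, category_slug: str) -> dict:
    """Attach a source category to an event, rejecting unknown slugs."""
    if category_slug not in SOURCE_CATEGORIES:
        raise ValueError(f"unknown source category: {category_slug}")
    return {
        "title": title,
        "category": category_slug,
        "category_name": SOURCE_CATEGORIES[category_slug],
    }
```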


📊 The 7 Thresholds

Each threshold = a different recommended action.

Threshold	State	Recommended Action
0+	Beginning of the AI era	Basic awareness, no special action needed
15+	AI useful and under control	Standard caution — 2FA, strong passwords
30+	First warning — agents in production	Set up a family code word, control AI permissions, backups
50+	High alert — AI in critical systems	Begin moving sensitive info out of cloud, reduce dependency on single AI tools
70+	Pre-critical — partial loss of oversight	Backup every important document to paper, cash reserves, physical identity
85+	Critical — prepare disconnect plan	Urgent family meeting, contacts on paper, drill offline communication
95+	Disconnect now	Minimum digital footprint, replace every AI-mediated channel with physical
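
Finding the state for a given score is a lookup of the highest threshold that the score has reached. The sketch below uses the floors from the table above; the state labels are lightly shortened, and the names `THRESHOLDS` and `state_for` are hypothetical. It assumes the score lies between 0 and 100.

```python
from bisect import bisect_right

# (floor, state) pairs from the thresholds table, in ascending order.
THRESHOLDS = [
    (0, "Beginning of the AI era"),
    (15, "AI useful and under control"),
    (30, "First warning: agents in production"),
    (50, "High alert: AI in critical systems"),
    (70, "Pre-critical: partial loss of oversight"),
    (85, "Critical: prepare disconnect plan"),
    (95, "Disconnect now"),
]
FLOORS = [floor for floor, _ in THRESHOLDS]


def state_for(score: float) -> str:
    """Return the state whose threshold is the highest one at or below score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    index = bisect_right(FLOORS, score) - 1
    return THRESHOLDS[index][1]
```

A score of 29.9 still maps to "AI useful and under control", while a score of exactly 30 crosses into the first warning state.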

🚨 Key Events to Watch (Trip Wires)

If any of these events is publicly documented, the score will jump significantly.

An AI refusing shutdown in a safety evaluation interview

Verified case of AI replicating itself to other servers

AI influencing a national-level election outcome

AI granted direct access to a bank account / financial assets without per-action human approval

A failure in a critical system (power, water, healthcare) caused by an AI decision

Universal jailbreak of GPT-5.5 or another frontier model published publicly (monitor Bio Bug Bounty until 7/27)

Real-scale attack via the MCP flaw — documented case of customer harm

Real attack via OpenClaw supply-chain — financial damage or data leak documented

AI publicly claims human rights or refuses to be turned off based on self-preservation

📚 Sources

The score is built from daily monitoring of public sources only. No internal estimates, no interviews, no secret information. Main sources: official blogs from Anthropic / OpenAI / DeepMind / Google, arXiv archive (cs.AI category), METR, Partnership on AI Incident Database, cyber reports from Proofpoint / Microsoft / Google Cloud Security, and professional news (Reuters, Bloomberg, The Information, Wired).

⚠️ Limitations

This is a subjective index built by a private individual. It reflects a personal risk assessment for the broader Israeli audience, not scientific consensus. The score updates daily but not in real time. It is not a substitute for professional cybersecurity advice and should not be the sole basis for business decisions.

🔬 Method

Every public event published on a given day is assigned to one of the 4 pillars and given an impact of ±0.1 to ±2.0 points. The daily score is the sum of the four pillars. Negative changes (effective regulation, feared events that did not happen) offset positive ones.
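
The method can be sketched as a per-pillar update followed by a sum. This is a minimal sketch under stated assumptions: each pillar is clamped to its 0-25 range after every event, the function names are hypothetical, and the pillar values in the usage note are invented for illustration, not actual index readings.

```python
# Pillar keys from the scoring rubric; the clamping behaviour is an assumption.
PILLARS = ("capability", "autonomy", "integration", "bypass")
PILLAR_MAX = 25.0


def apply_event(pillars: dict, pillar: str, impact: float) -> dict:
    """Return a copy of the pillar scores with one event's impact applied."""
    if pillar not in PILLARS:
        raise ValueError(f"unknown pillar: {pillar}")
    updated = dict(pillars)
    updated[pillar] = max(0.0, min(PILLAR_MAX, updated[pillar] + impact))
    return updated


def total_score(pillars: dict) -> float:
    """The index score is the sum of the four pillars (0-100)."""
    return sum(pillars[name] for name in PILLARS)
```

With invented pillar values of 20.0, 15.0, 12.5 and 24.5, the total is 72.0. A +1.0 event on bypass lifts that pillar only to its 25-point ceiling, and a -0.5 event on autonomy lowers it to 14.5.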