Claude 3.7 is here, hot on the heels of Grok 3 and a host of other developments, but how good is it really? And what does it say about the next few months in AI? I’ve read the papers, played with the model for hours, and benchmarked it on SimpleBench. Things aren’t slowing down. Plus the latest in humanoid robots, led by Helix, and the rather unnerving Protoclone. And reports of GPT 4.5 and DeepSeek R2.
GraySwan Competition! https://app.grayswan.ai/arena/challenge/agent-red-teaming
https://x.com/GraySwanAI/status/1894084923260043282
AI Insiders ($9!): https://www.patreon.com/AIExplained
Chapters:
00:00 - Introduction
01:25 - Claude 3.7 New Stats/Demos
05:22 - 128k Output
06:13 - Pokemon
06:58 - Just a tool?
09:54 - DeepSeek R2
10:20 - Claude 3.7 System Card/Paper Highlights
17:18 - Simple Record Score/Competition
20:37 - Grok 3 + Redteaming prizes
22:26 - Google Co-scientist
24:02 - Humanoid Robot Developments
3.7 Release Notes: https://www.anthropic.com/news/claude-3-7-sonnet
vs o3 and Grok 3: https://x.com/12exyz/status/1891723056931827959
Extended Thinking: https://www.anthropic.com/research/visible-extended-thinking?s=09
System Prompt: https://docs.anthropic.com/en/release-notes/system-prompts#feb-24th-2025
System Card: https://assets.anthropic.com/m/785e231869ea8b3b/original/claude-3-7-sonnet-system-card.pdf
Unfaithful CoT: https://arxiv.org/pdf/2305.04388
Original Constitution: https://www.anthropic.com/news/claudes-constitution
Responsible Scaling Policy: https://assets.anthropic.com/m/24a47b00f10301cd/original/Anthropic-Responsible-Scaling-Policy-2024-10-15.pdf
Amodei and Hassabis: https://www.youtube.com/watch?v=4poqjZlM8Lo
SimpleBench: https://simple-bench.com/
400M Weekly Users: https://x.com/bradlightcap/status/1892579908179882057
Grok 3 Jailbroken: https://x.com/LinusEkenstam/status/1893832876581380280
Google Co-Scientist: https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
But Hassabis Says Years Away: https://www.youtube.com/watch?v=yr0GiSgUvPU&t=156s
DeepSeek R2 Reuters: https://www.reuters.com/technology/artificial-intelligence/deepseek-rushes-launch-new-ai-model-china-goes-all-2025-02-25/
Protoclone: https://www.reddit.com/r/interestingasfuck/comments/1it9rpp/protoclone_the_worlds_first_bipedal/
Helix: https://www.figure.ai/news/helix
TechTrance: https://www.youtube.com/@TheTechTrance/videos
GPT 4.5 Soon: https://www.theverge.com/notepad-microsoft-newsletter/616464/microsoft-prepares-for-openais-gpt-5-model
Altman roadmap: https://x.com/sama/status/1889755723078443244
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/