Claude 3.7 is More Significant than its Name Implies (ft DeepSeek R2 + GPT 4.5 coming soon)

AI Explained 83,706 1 day ago


Claude 3.7 is here, hot on the heels of Grok 3 and a host of other developments, but how good is it really? And what does it say about the next few months in AI? I’ve read the papers, played with the model for hours, and benched it on Simple. Things aren’t slowing down. Plus the latest in humanoid robots, led by Helix and freaked out by Protoclone. And reports of GPT 4.5 and DeepSeek R2.

GraySwan Competition!
https://app.grayswan.ai/arena/challenge/agent-red-teaming
https://x.com/GraySwanAI/status/1894084923260043282

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:25 - Claude 3.7 New Stats/Demos
05:22 - 128k Output
06:13 - Pokemon
06:58 - Just a tool?
09:54 - DeepSeek R2
10:20 - Claude 3.7 System Card/Paper Highlights
17:18 - Simple Record Score/Competition
20:37 - Grok 3 + Redteaming prizes
22:26 - Google Co-scientist
24:02 - Humanoid Robot Developments

3.7 Release Notes: https://www.anthropic.com/news/claude-3-7-sonnet
vs o3 and Grok 3: https://x.com/12exyz/status/1891723056931827959
Extended Thinking: https://www.anthropic.com/research/visible-extended-thinking?s=09
System Prompt: https://docs.anthropic.com/en/release-notes/system-prompts#feb-24th-2025
System Card: https://assets.anthropic.com/m/785e231869ea8b3b/original/claude-3-7-sonnet-system-card.pdf
Unfaithful CoT: https://arxiv.org/pdf/2305.04388
Original Constitution: https://www.anthropic.com/news/claudes-constitution
Responsible Scaling Policy: https://assets.anthropic.com/m/24a47b00f10301cd/original/Anthropic-Responsible-Scaling-Policy-2024-10-15.pdf
Amodei and Hassabis: https://www.youtube.com/watch?v=4poqjZlM8Lo
https://simple-bench.com/
400 Weekly Users: https://x.com/bradlightcap/status/1892579908179882057
Grok 3 Jailbroken: https://x.com/LinusEkenstam/status/1893832876581380280
Google Co-Scientist: https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
But Hassabis Says Years Away: https://www.youtube.com/watch?v=yr0GiSgUvPU&t=156s
DeepSeek R2 Reuters: https://www.reuters.com/technology/artificial-intelligence/deepseek-rushes-launch-new-ai-model-china-goes-all-2025-02-25/
Protoclone: https://www.reddit.com/r/interestingasfuck/comments/1it9rpp/protoclone_the_worlds_first_bipedal/
Helix: https://www.figure.ai/news/helix
TechTrance: https://www.youtube.com/@TheTechTrance/videos
GPT 4.5 Soon: https://www.theverge.com/notepad-microsoft-newsletter/616464/microsoft-prepares-for-openais-gpt-5-model
Altman roadmap: https://x.com/sama/status/1889755723078443244

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
