Anthropic shipped Opus 4.8 in 41 days, and that's the problem

10 min · Yesterday

Description

Anthropic just dropped a new flagship model 41 days after the last one. That is not a cadence, it is a correction, and the gap between those two words is the whole story. Opus 4.8 genuinely leads the field on SWE-Bench Pro at 69.2% versus GPT-5.5's 58.6%, and the codebase migration claim is the most operationally significant thing any AI lab has shipped this year if it holds in production. But the company is also sitting on its most advanced model over unspecified cybersecurity concerns, has not explained what broke in Opus 4.7, and is now asking enterprise teams to run hundreds of parallel subagents on production code, so the benchmark table is not the whole question here.

All episodes

54 episodes

Anthropic shipped Opus 4.8 in 41 days, and that's the problem

Yesterday · 10 min

80 federal fusion centers are collecting town hall attendance as extremism data

A nonprofit posted a YouTube video about a Georgia data center's noise and water use, no threats, no calls to action, and it ended up in federal intelligence files next to murder investigations. WIRED obtained over a thousand pages of unpublished DHS, FBI, and fusion center reports showing a surveillance category called "anti-tech violent extremism" that has no published definition, no published threshold, and is being applied to people who show up to budget meetings. The firebombings were real, the murder charges are real, but so is the fact that a municipal treasurer in Michigan just resigned in tears over data center threats while the NYPD is formally classifying concern about AI safety as a precursor ideology to extremism.

May 27, 2026 · 10 min

Apple spent 5 years on M5 security. Mythos broke it in 5 days.

Anthropic's AI model Claude Mythos Preview just cracked a 27-year-old bug in OpenBSD and broke Apple's brand-new M5 hardware security feature in five days flat — the one Apple spent five years and billions building. Over 99% of what Mythos has found so far is still unpatched, and that number is not a glitch in the data, that is the entire story. The patching backlog was already a disaster before this model existed, and now it has a power tool feeding it around the clock.

May 25, 2026 · 12 min

OpenAI's CFO said no IPO in 2026. Seven days later, the company filed anyway.

OpenAI's CFO said the company was not ready to go public in 2026, and seven days later, multiple outlets reported it is preparing to file for an IPO anyway. In the window between those two events, a headline surfaced about OpenAI making billions by promising future purchases from suppliers, which is exactly the kind of move that makes a balance sheet look healthier right before investors see it. Someone in that building decided the IPO window mattered more than the readiness, and the calendar got moved to match a decision that was probably already made.

May 20, 2026 · 11 min

Google built 30-second video AI and won't release it

Google built a video AI that can already generate long-form clips and then deliberately shipped it capped at 10 seconds, not because the model can't do more, but because the pricing, compute costs, and liability exposure around longer video aren't figured out yet. The "most users won't want longer videos" explanation sounds like product strategy, but it's really a placeholder while Google works out what this thing actually costs to run at scale and who's legally responsible when a verified user deepfakes someone. The real story from Google I/O isn't the capability, it's that the capability race everyone's covering is just the warmup, and the actual competition starts when the cap lifts.

May 20, 2026 · 10 min
