Own Your Mind

059: Why DeAI: Why Self-Host Your AI?

7 min · 29 May 2026

Description

I tried local AI inference on an M4 Max, failed with dense models, then hit 94 tok/s with MoE architecture via MLX. Here is the honest case for self-hosting, and where it actually beats the API path.

All episodes

47 episodes

025: Tokenomics: Bittensor Halving

The December 2025 halving's impact on TAO. Emission reduction, price action, dTAO interaction — what the 3-month retrospective tells us about Bittensor's token economics.

1 Jun 2026 · 9 min

024: Gensyn

Gensyn is verifiable AI compute — proving ML work happened correctly. Freedom Score 68, Returns Score 45. We cover the proof system, why verification matters, and the pre-token risk profile.

31 May 2026 · 8 min

059: Why DeAI: Why Self-Host Your AI?

I tried local AI inference on an M4 Max, failed with dense models, then hit 94 tok/s with MoE architecture via MLX. Here is the honest case for self-hosting, and where it actually beats the API path.

29 May 2026 · 7 min

058: Why DeAI: What Is DePIN?

DePIN turns idle hardware into coordinated infrastructure using token incentives. It is the physical layer that makes decentralised AI possible. What it actually means, how it connects to AI, and why the framing matters.

27 May 2026 · 7 min

004: Venice: Private AI With Verifiable E2EE

Venice is the rare DeAI project that's actually a product people use, with billions of LLM tokens processed daily through private, uncensored inference. It now offers verifiable E2EE and TEE privacy modes powered by NEAR and Phala. It was built by Erik Voorhees with no VC funding and no presale. We cover the four privacy modes, the E2EE encryption architecture, uncensored models, VVV token economics, and the centralisation trade-offs.

26 May 2026 · 8 min