Own Your Mind

059: Why DeAI: Why Self-Host Your AI?

7 min · 29 May 2026

Description

I tried local AI inference on an M4 Max, failed with dense models, then hit 94 tok/s with MoE architecture via MLX. Here is the honest case for self-hosting, and where it actually beats the API path.

All episodes

47 episodes

025: Tokenomics: Bittensor Halving

The December 2025 halving's impact on TAO. Emission reduction, price action, dTAO interaction — what the 3-month retrospective tells us about Bittensor's token economics.

1 June 2026 · 9 min

024: Gensyn

Gensyn is verifiable AI compute — proving ML work happened correctly. Freedom Score 68, Returns Score 45. We cover the proof system, why verification matters, and the pre-token risk profile.

31 May 2026 · 8 min

059: Why DeAI: Why Self-Host Your AI?

I tried local AI inference on an M4 Max, failed with dense models, then hit 94 tok/s with MoE architecture via MLX. Here is the honest case for self-hosting, and where it actually beats the API path.

29 May 2026 · 7 min

058: Why DeAI: What Is DePIN?

DePIN turns idle hardware into coordinated infrastructure using token incentives. It is the physical layer that makes decentralised AI possible. What it actually means, how it connects to AI, and why the framing matters.

27 May 2026 · 7 min

004: Venice: Private AI With Verifiable E2EE

Venice is the rare DeAI project that's actually a product people use, with billions of LLM tokens processed daily through private, uncensored inference. Now with verifiable E2EE and TEE privacy modes powered by NEAR and Phala. Built by Erik Voorhees with no VC funding, no presale. We cover the four privacy modes, the E2EE encryption architecture, uncensored models, VVV token economics, and the centralisation trade-offs.

26 May 2026 · 8 min