Inference
The State of Local and Affordable Inference in October 2025
·1379 words·7 mins
An overview of the current landscape of GPUs and AI compute for local inference as of October 2025, from NVIDIA and AMD to Intel, Apple, and the cloud.
NVIDIA DGX Spark: Underwhelming and Late to the Party
·890 words·5 mins
NVIDIA’s DGX Spark arrives late as an AI inference system whose performance lags behind the competition. With low-speed unified memory, immature software optimization, and heavy competition from Apple, AMD, and Intel, the Spark exposes how little remains of NVIDIA’s CUDA moat.
Ollama Environment Configuration
·66 words·1 min
Key environment settings for running Ollama efficiently on Windows and Linux.
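As a quick taste of what configuration like this looks like, here is a minimal Python sketch that launches `ollama serve` with a few of Ollama's documented environment variables. The specific values are illustrative assumptions, not recommendations from the post; tune them to your hardware.

```python
import os
import subprocess

# Launch the Ollama server with a few commonly documented settings.
# The values below are illustrative assumptions, not tuned recommendations.
env = os.environ.copy()
env["OLLAMA_KEEP_ALIVE"] = "10m"        # keep models resident 10 minutes after last use
env["OLLAMA_MAX_LOADED_MODELS"] = "1"   # cap the number of concurrently loaded models
env["OLLAMA_NUM_PARALLEL"] = "2"        # parallel requests served per loaded model

subprocess.run(["ollama", "serve"], env=env)
```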
Force Ollama to Use Only a Single NVIDIA GPU
·163 words·1 min
Selecting which GPUs are visible to Ollama on Windows.
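The general mechanism, sketched below, is that Ollama honors the standard `CUDA_VISIBLE_DEVICES` variable, so restricting it to a single index hides the other GPUs. The index `0` and the launch style are assumptions for illustration, not necessarily the post's exact steps.

```python
import os
import subprocess

# Expose only the first NVIDIA GPU to Ollama; CUDA enumerates devices
# by index, and "0" here is an assumed choice for a single-GPU setup.
env = os.environ.copy()
env["CUDA_VISIBLE_DEVICES"] = "0"

# On Windows the same variable can instead be set persistently,
# e.g. `setx CUDA_VISIBLE_DEVICES 0`, followed by restarting Ollama.
subprocess.run(["ollama", "serve"], env=env)
```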
Base Mac Mini M4: The Alternative to Low-End NVIDIA Hardware for Inference
·741 words·4 mins
How the Mac Mini M4 has enabled affordable local LLM inference, making VRAM-starved NVIDIA cards obsolete for low-power, low-cost builds.
