
LLM

Planning Power Consumption with Respect to Energy Prices
·197 words·1 min
I built a system with DuckDB and Streamlit to monitor energy prices, for better control and planning of LLM jobs during winter.
Testing DeepSeek-OCR: Vision Text Compression for LLMs
·466 words·3 mins
Notes from testing DeepSeek-OCR as a local vision-language model for OCR and text compression on a large archive of screenshots. Includes observations on model performance, visual-token compression, and multilingual results.
Gemini Pro 2.5 in October 2025: decent text, shaky coding, tricky tradeoffs
·903 words·5 mins
A brief look at Gemini Pro 2.5 compared with ChatGPT 5 and Claude Opus/Sonnet, plus notes on Gemini 2.5 variants, NotebookLM, and mobile privacy concerns.
LLM false metric generation
·654 words·4 mins
LLMs are generating a lot of synthetic data, and some of it includes false metrics.
The Problem With Proprietary LLM Providers: Removing Model Access Without Recourse
·414 words·2 mins
OpenAI’s removal of GPT-4o, o3, and other models after GPT-5’s launch breaks fundamental MLOps principles. Without model versioning and control, data science workflows become unreliable. Local LLMs offer a better alternative for maintaining consistency.
Base Mac Mini M4: An Alternative to Low-End NVIDIA Hardware for Inference
·741 words·4 mins
How the Mac Mini M4 has made affordable local LLM inference possible, rendering VRAM-starved NVIDIA cards obsolete for low-power, low-cost setups.
The Strawberry Challenge: When LLMs Need Tools to Count
·170 words·1 min
As of October 2024, the infamous “How many R’s are in strawberry?” question has become a fascinating litmus test for large language models, exposing a fundamental limitation in how these systems process text.
AI in Finance Workshop with a Live Demo
·292 words·2 mins
Artificial intelligence applications in financial services, with a live demo of local LLM invoice processing and automated billing code assignment.
Concurrency in LLMs: Why It Matters More Than Model Size
·321 words·2 mins
Understanding why handling multiple requests beats raw token speed for my local LLM deployments.