Optimizing My Local LLM Setup for Batch Tasks
In one recent run I had local LLMs going for more than 28 hours, chewing through 437 markdown files. The model was qwen3.5:122b, served via Ollama on two Mac Studios on a private LAN. The task was unglamorous: read a file, return a JSON object with a two-sentence summary, the named persons mentioned, the topic tags, and the key ideas the file argues. Repeat. Aggregate across files. Write per-entry scaffolds.
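For a rough idea of what one step of that loop looks like, here is a minimal sketch in Python. It is not my exact script. The field names (`summary`, `persons`, `tags`, `key_ideas`), the prompt wording, and the helper names are all made up for illustration, and the URL assumes Ollama's default local port rather than my actual LAN setup. The request shape follows Ollama's `/api/generate` endpoint with `stream` off and `format` set to `json`.

```python
import json
import urllib.request

MODEL = "qwen3.5:122b"  # the model named above
# Assumption: Ollama's default local address, not the real LAN hosts.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical field names for the per-file JSON.
REQUIRED_KEYS = ("summary", "persons", "tags", "key_ideas")


def build_payload(text: str) -> dict:
    """Build the request body for one markdown file."""
    prompt = (
        "Read the document below and reply with a JSON object with these keys: "
        "summary (two sentences), persons (list of named persons), "
        "tags (list of topic tags), key_ideas (list of ideas the file argues).\n\n"
        + text
    )
    # stream=False asks for one complete reply; format="json" asks for valid JSON.
    return {"model": MODEL, "prompt": prompt, "stream": False, "format": "json"}


def parse_result(raw: str) -> dict:
    """Parse the model's reply and check that every expected key is present."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"model reply is missing keys: {missing}")
    return data


def extract(text: str, url: str = OLLAMA_URL) -> dict:
    """Send one file's text to Ollama and return the validated JSON."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # With stream off, the generated text arrives in the "response" field.
    return parse_result(body["response"])
```

Keeping `build_payload` and `parse_result` separate from the network call means the prompt and the validation can be checked without a model running, which matters when a full run takes more than a day.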