Ollama local non-thinking LLM models as of 2025 for a MacBook >= M1

Local LLMs run via Ollama deliver very good results. Since the advent of reasoning models, plenty of strong reasoning models have become available on Ollama. However, for quick questions where you expect a quick answer, non-thinking models deliver the best results, as they do not flood you with lengthy chains of thinking steps before getting to the point. Many non-thinking models are also available through Ollama and run smoothly on a modern MacBook with an M1 or newer chip. I found that different models have different strengths, so I tend to use the following models for the given tasks (a small routing sketch follows the list):

  • qwen2.5-coder-14b: for code
  • gemma3: for creative writing, RAG, or math
  • phi4: for all other technical tasks
  • mistral-small3.1: for German-English translations or other non-English language-specific queries
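
To avoid picking the model by hand every time, a small routing helper can map a task category to the right model. Below is a minimal sketch using the official `ollama` Python package; the exact model tags (e.g. `qwen2.5-coder:14b`) are assumptions based on the names above and may differ from the tags you have actually pulled, so check `ollama list` first.

```python
# Minimal sketch: route a quick question to the non-thinking model that fits the task.
# Requires the `ollama` Python package (pip install ollama) and a running Ollama daemon.
import ollama

# Assumed model tags; verify against `ollama list` on your machine.
MODEL_BY_TASK = {
    "code": "qwen2.5-coder:14b",        # code
    "creative": "gemma3",               # creative writing, RAG, math
    "technical": "phi4",                # all other technical tasks
    "translation": "mistral-small3.1",  # German-English and other non-English queries
}

def ask(task: str, prompt: str) -> str:
    """Send a single prompt to the model chosen for the given task category."""
    model = MODEL_BY_TASK.get(task, "phi4")  # fall back to the general technical model
    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]

if __name__ == "__main__":
    print(ask("code", "Write a Python one-liner that reverses a string."))
```

The same routing idea works just as well from the shell by calling `ollama run <model>` directly; the Python wrapper is only convenient if you want to plug the models into scripts or small tools.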