
A few weeks ago we wrote a blog post on how you can run an LLM locally on your machine. Since then we've tried out two more models. Our primary goal was to check the performance of each model on a basic CPU-only machine; no GPUs anywhere. How would each model hold up when used by a less technically inclined individual? We installed models from the smallest to the mid-sized; the really huge models were skipped, since we assumed few average users would sacrifice gigabytes of disk space for them. The machine we used: a base Core i7 with 32GB of RAM and 1TB of disk space. The models we ran: DeepSeek R1, Qwen 3.5, and Gemma 3.
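If you are wondering whether your own machine can handle a given model, a common rule of thumb (a sketch, not an exact measurement: it assumes roughly 4-bit quantization at about half a byte per parameter, plus a couple of gigabytes of overhead) can be expressed in a few lines:

```python
# Rough sizing check for running quantized LLMs on CPU.
# ASSUMPTIONS: ~0.5 bytes per parameter (4-bit quantization) and ~2GB
# of overhead for the runtime and context; these are rules of thumb.

def model_fits_in_ram(n_params_billions: float, ram_gb: float,
                      bytes_per_param: float = 0.5,
                      overhead_gb: float = 2.0) -> bool:
    """Estimate whether a quantized model fits in RAM with headroom."""
    needed_gb = n_params_billions * bytes_per_param + overhead_gb
    return needed_gb <= ram_gb

# On a 32GB machine like ours, small models fit easily,
# while the "really huge" ones do not:
print(model_fits_in_ram(1.5, 32))   # a 1.5B model -> True
print(model_fits_in_ram(70, 32))    # a 70B model  -> False
```

This is only a back-of-the-envelope check; actual memory use depends on the quantization format and context length you run with.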
We chose these models because they do not consume much disk space. NOTE: this is not in-depth technical testing; we are simply approaching things as an average computer user would.
Now, for those who do not want to read through the entire article, here is what we found about each model:

- DeepSeek R1: the strongest reasoner; it thought its way to the most condensed solutions, but was the slowest to respond.
- Qwen 3.5: the strictest instruction follower; it started reasoning faster than DeepSeek R1 and was slightly lighter on resources.
- Gemma 3: the fastest and lightest; it skips the 'thinking' step entirely, but sometimes loses the thread in multi-turn conversations.
Now, on to the meat of the article. We prompted each model with a question about RLC circuits, their uses, and the ordinary differential equations that describe them. The results were stunning and revealed how each model approached the problem.
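For background (this is standard circuit theory, included only so you can judge the models' answers; it is not taken from any model's output): a series RLC circuit driven by a source voltage V(t) obeys a second-order linear ODE in the charge q(t) on the capacitor:

```latex
L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = V(t)
```

Whether the response rings or decays smoothly depends on how the damping factor R/(2L) compares with the natural frequency 1/sqrt(LC): underdamped, critically damped, or overdamped. A good answer to our prompt should walk through exactly this.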
DeepSeek R1 is an AI model created by the Chinese startup DeepSeek. It uses an "innovative Mixture-of-Experts (MoE) architecture, which allows for efficient inference while maintaining high performance". What really makes DeepSeek R1 interesting is its visible reasoning step: it thinks through the prompt before answering. (To be clear, a local model's weights are fixed; it does not actually learn from your prompts over time.) What we noticed is that it can reason its way to the most condensed and simplest solution, while the other two (especially Qwen) would sometimes get stuck in endless loops of meta-thinking.
DeepSeek's response was brilliant. Its 'thinking' mirrored almost human-like reasoning. It paused for a few seconds before it began to think, spent some time working through the problem, and finally streamed the solution. The smaller 1.5b model was the quickest, but its output was not good enough for highly technical prompts. The larger DeepSeek models responded more slowly but produced better results.
Qwen 3.5 is an Alibaba product. It does not advertise the 'self-learning' capability that DeepSeek markets, but that does not handicap it in any way. It responded quite well to our RLC circuit question, complete with a 'how-to' for solving the circuit's ODEs. One thing we noticed is that its 'thinking' follows a checklist: Problem -> Constraint -> Solution. It took less time to start reasoning than DeepSeek R1, and its resource usage was a bit lighter.
One more thing: when it came to following instructions strictly, Qwen 3.5 beat both DeepSeek R1 and Gemma 3.
Gemma 3 is a family of open-weight models built by Google on Gemini technology. Gemma 3's speed of response amazed us; it was pretty fast. It lacks the explicit 'thinking' step of the previous two models, and it put less strain on our machine's resources while in use. Its deficiencies become apparent in multi-turn contexts, where it loses the train of thought, so watch out for that.
In some instances it produced subpar responses, and it sometimes failed to follow instructions correctly. Because it lacks a visible 'thinking' step, we could not tell how it reached an answer unless we challenged it in a follow-up prompt.
So where can each of these models shine? What would be the most appropriate use case for each?
First, while these smaller models can be used for coding tasks, they are not the best at it, and we do not recommend them for coding. They are, however, very good at content generation, quick references and answers, scoping out ideas, and the like.
There is still a lot more that we will need to uncover as we try them out in our day-to-day activities.