After 8 months of running everything local, ive accepted the productivity tools also have to be local

Quick context: M3 max 64gb, currently running llama 3.3 70b q4 as my daily driver via ollama, qwen3 coder 30b for code (switched from qwen2.5 earlier this year), mlx for the smaller stuff. tried llama 4 scout earlier this year but 64gb is too tight at q4 once you leave headroom for context, sticking with 3.3 70b until i upgrade hardware. moved off the claude api around september last year. the cost wasnt the main reason, the data was. i do contract work with NDAs and i was getting uncomfortable about how much code and how many internal docs were going through anthropics servers.