Actual comparison between locally run Qwen-3.6-27B and proprietary models

Hey y'all! I recently wrote a post in Russian about my experience comparing Qwen-3.6-27B with lower-tier cloud models on hard tasks. I wanted to share a translation of it, since I found the results interesting and surprising. It might break Rule 3, since it's an evaluation of LLM-written code, but whatever: my methodology is handcrafted and the results are still non-trivial. Sorry for the translation, my English is not that good.

I once had a server with a 3090 and a Xeon from AliExpress, and I used to run local models on it.