Sustained dense 72B inference on M5 Max 128GB: how much does 14” vs 16” matter for thermal throttling under continuous load?

I’m considering the 14” or 16” M5 Max 128GB for a workload that runs continuous inference on a dense 72B model (Qwen 2.5 72B Base, Q4_K_M, MLX) at 32K context. Not batch jobs. Not occasional prompts. A continuous loop of 30-second cycles, running for hours to days at a time. The burst benchmarks I found in another thread look great, but those are 128-token generations. I need to know what happens after 2+ hours of sustained load on the 14” form factor. Specific questions: 1.