Best image + audio -> video long form (>10 mins)?

I'm fairly new to this. I currently use HeyGen but would like to switch to a better self-hosted model that I'd run in the cloud. What is the best model for long-form video, and can LTX 2.3 generate long-form videos?

Use case: I need to make videos for a non-profit, and all of the videos feature only me. I'm wondering whether there is either:
- a video-to-video tool where I provide an AI-generated image of someone else's face and swap my face with it, or
- an image-to-video tool where I use my own audio and an AI-generated image to create the videos.