I Taught a Virtual Camera to Behave Like a Human Operator: How a Face Tracking Algorithm for Shorts/Reels Works

In the previous article I described my “anime factory” in detail — a pipeline that automatically turns episodes into finished Shorts. But inside that system there is one especially important module that deserves a separate deep dive: a virtual camera for automatic reframing.
In this article, I will break down not just an “auto-crop function,” but a full virtual camera algorithm for vertical video. This is exactly the kind of task that looks simple at first glance: you have a horizontal video, you need to turn it into 9:16, keep a person in frame, and avoid making the result look like a jittery autofocus camera from the early 2010s.
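Before getting into the full algorithm, it helps to see the naive core of the task. The sketch below is my illustration, not code from the pipeline: it computes a 9:16 crop window centered on a face, clamped to the frame. The function name and parameters are hypothetical; everything beyond this one-liner geometry (tracking, smoothing, shot detection) is what the rest of the article is about.

```python
def crop_window_9x16(frame_w, frame_h, face_cx):
    """Return (x, y, w, h) of a 9:16 crop centered on face_cx (pixels)."""
    # Use the full frame height; derive the crop width from the 9:16 aspect.
    crop_h = frame_h
    crop_w = int(crop_h * 9 / 16)
    # Center the window on the face, then clamp it inside the frame.
    x0 = int(face_cx - crop_w / 2)
    x0 = max(0, min(x0, frame_w - crop_w))
    return x0, 0, crop_w, crop_h
```

Feeding the raw per-frame face center straight into this function is exactly what produces the jittery "early-2010s autofocus" look: every pixel of detection noise moves the crop.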
But as soon as you try to build it for a real pipeline rather than a demo, the engineering problems show up immediately: