I Taught a Virtual Camera to Behave Like a Human Operator: How a Face Tracking Algorithm for Shorts/Reels Works

In the previous article I described my “anime factory” in detail — a pipeline that automatically turns episodes into finished Shorts. But inside that system there is one especially important module that deserves a separate deep dive: a virtual camera for automatic reframing.
In this article, I will break down not just an “auto-crop function,” but a full virtual camera algorithm for vertical video. This is exactly the kind of task that looks simple at first glance: you have a horizontal video, you need to turn it into 9:16, keep a person in frame, and avoid making the result look like a jittery autofocus camera from the early 2010s.
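To make the task concrete, here is a minimal sketch of the naive baseline: a full-height 9:16 crop centered on a face and clamped to the frame, plus one step of exponential smoothing to damp jitter. This is not the author's algorithm. The function names `crop_window` and `smooth`, the `face_cx` input (the horizontal face center in pixels, from any detector), and the `alpha` value are all assumptions made for illustration.

```python
def crop_window(frame_w: int, frame_h: int, face_cx: float) -> tuple[int, int]:
    """Return (left, width) of a full-height 9:16 crop centered on face_cx.

    Hypothetical helper: face_cx is assumed to come from some face detector.
    """
    # Width of a 9:16 window that uses the full frame height.
    crop_w = min(frame_w, frame_h * 9 // 16)
    # Center the window on the face, then clamp it inside the frame.
    left = int(face_cx - crop_w / 2)
    left = max(0, min(left, frame_w - crop_w))
    return left, crop_w


def smooth(prev: float, target: float, alpha: float = 0.1) -> float:
    """One step of exponential smoothing: move a fraction alpha toward target."""
    return prev + alpha * (target - prev)


if __name__ == "__main__":
    # A 1920x1080 frame with the face drifting to the right.
    cam_x = 960.0
    for face_x in (960.0, 1000.0, 1400.0):
        cam_x = smooth(cam_x, face_x, alpha=0.5)
        print(crop_window(1920, 1080, cam_x))
```

A baseline like this is enough for a demo: the window follows the face and never leaves the frame. It has no notion of missed detections, multiple faces, or scene cuts, so it should be read only as a starting point.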
But as soon as you try to build it not for a demo, but for a real pipeline, engineering problems immediately show up:

You might think that the color of the Moon and the Sun as seen from space is such a simple question for modern science that, in this century, the answer should pose no problem at all. We are talking specifically about colors as observed from space, because the atmosphere changes the apparent color through Rayleigh scattering. "Surely some encyclopedia covered this long ago, in detail and with numbers," you will say. Well, try searching the Internet for it. Did it work? Most likely not. The most you will find is a couple of words saying that the Moon has a brownish tint and the Sun a reddish one. You will not find whether these tints are visible to the human eye, let alone color values in RGB or at least color temperatures. What you will find is plenty of photos and videos in which the Moon from space is completely gray, mostly in photos from the American Apollo program, and in which the Sun from space is shown as white or even blue.