That is a huge clipping distance of a couple thousand, set against an extremely miniature person with the focal point only millimeters away. That big pyramid you see is basically the size of the camera’s lens, at the end. That’s an extreme closeup. The focal distance is practically inside it.
I tried working on it a bit, and I’ll probably try more, but I don’t suppose you considered doing it the way they did it in the old days (and, I think, still do in a lot of new movies): rendering the far distance, the middle distance and the close distance as a series of mattes? They could even be animated mattes. You could even store them as separate scenes in the same file and layer them with the compositor. This is just an amateur talking, so I have no practical experience, but it seems some of the things you do tend towards brute force, where, so far as I can see, they could be worked around with relatively little effort.
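To make the layering idea a bit more concrete, here is a rough sketch in plain Python of what stacking the mattes amounts to for a single pixel. This is not Blender's API. The names `over` and `composite_layers` and the sample pixel values are made up for illustration, and I am assuming straight (non-premultiplied) alpha. In Blender itself this would be the job of the compositor's Alpha Over node, fed by one render per scene.

```python
def over(fg, bg):
    """Composite one straight-alpha RGBA pixel over another (the 'over' operator)."""
    fa, ba = fg[3], bg[3]
    # Combined coverage of the front pixel plus whatever shows through it.
    out_a = fa + ba * (1.0 - fa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    # Blend each colour channel, weighted by how much each layer contributes.
    rgb = tuple(
        (fg[i] * fa + bg[i] * ba * (1.0 - fa)) / out_a
        for i in range(3)
    )
    return rgb + (out_a,)


def composite_layers(layers):
    """Stack pixels given in far-to-near order, nearest layer ending up on top."""
    result = layers[0]
    for nearer in layers[1:]:
        result = over(nearer, result)
    return result


# Hypothetical values for one pixel of each matte.
far_matte = (0.2, 0.4, 0.8, 1.0)     # opaque sky
middle_matte = (0.0, 0.5, 0.0, 0.0)  # fully transparent at this pixel
near_matte = (1.0, 0.0, 0.0, 0.5)    # half-transparent foreground

pixel = composite_layers([far_matte, middle_matte, near_matte])
print(pixel)
```

The point is that each distance only has to be rendered with a camera and clipping range that suits it, and the expensive part (the far background) does not have to share one enormous clipping range with the tiny closeup.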