As you say, it's likely the photo. I've tried several photos and the results don't line up, even when I have the known dimensions of a room, simply because the photo has either been tweaked in Photoshop or was taken with a correcting lens.
There are a couple of reasons this picture is a little problematic:
1/ it's not that apparent which lines are mutually orthogonal (e.g. the roof of the left building is not parallel to the main building, the street is inclined, etc.)
2/ I'd say this is a 2-point perspective (although the vertical lines seem to be rotated a little bit). In such a case you shouldn't ask fSpy to find the principal point from a third vanishing point, because it won't work.
What you should do instead is set the principal point to Manual. Because the vertical lines are mutually parallel in the image, their vanishing point lies at infinity, which means the principal point is located on the line connecting the two remaining vanishing points (the horizon).
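
To make the geometry a bit more concrete, here is a rough Python sketch of that constraint. It is not how fSpy does it internally. The function names and the sample coordinates are made up for illustration, and picking the point on the horizon closest to the image centre is just one reasonable guess for the manual principal point. The focal length uses the standard relation for two vanishing points of orthogonal directions, f² = -(vp1 - p)·(vp2 - p), in pixel units.

```python
import math

def project_onto_horizon(vp1, vp2, point):
    """Orthogonally project `point` onto the line through vp1 and vp2."""
    dx, dy = vp2[0] - vp1[0], vp2[1] - vp1[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("the two vanishing points coincide")
    # Parameter of the closest point along the horizon line.
    t = ((point[0] - vp1[0]) * dx + (point[1] - vp1[1]) * dy) / length_sq
    return (vp1[0] + t * dx, vp1[1] + t * dy)

def focal_length(vp1, vp2, principal):
    """Focal length in pixels, assuming vp1 and vp2 belong to orthogonal directions."""
    dot = ((vp1[0] - principal[0]) * (vp2[0] - principal[0])
           + (vp1[1] - principal[1]) * (vp2[1] - principal[1]))
    if dot >= 0:
        raise ValueError("principal point must lie between the vanishing points")
    return math.sqrt(-dot)

# Hypothetical pixel coordinates, not taken from the photo in this thread.
vp_left = (-1000.0, 400.0)
vp_right = (2000.0, 400.0)
image_centre = (500.0, 300.0)

principal = project_onto_horizon(vp_left, vp_right, image_centre)
f = focal_length(vp_left, vp_right, principal)
print(principal, f)
```

If the focal length you get this way looks unreasonable for the camera, that usually points back to problem 1/: the lines you picked are probably not really orthogonal.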