I think the best bet would be to use compositing techniques to achieve this. Render two or three separate Blender shots at varying distances, then blend them together in Blender’s sequencer (or compositor).
Clouds and similar effects could also be done in post-processing. Volumetrics add a considerable amount of time to even a single image render, which puts animations out of the question without a huge render farm, but similar (possibly better-looking) effects can be achieved with intelligent compositing.
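To make the "fog instead of volumetrics" idea concrete, here is a rough sketch in plain Python of what the compositing boils down to: a linear mix between the rendered pixel and a fog colour, with the mix factor ramped up and back down over a few frames. The function names, frame numbers, and colours are all made up for illustration; in Blender you would more likely keyframe the factor of a Mix node or the opacity of a strip than write this by hand.

```python
def mix(base, overlay, factor):
    """Linear mix of two RGB colours; factor 0 gives base, 1 gives overlay."""
    factor = max(0.0, min(1.0, factor))
    return tuple((1.0 - factor) * b + factor * o for b, o in zip(base, overlay))


def fog_factor(frame, enter, peak, leave, max_density=0.9):
    """Fog strength for a frame: ramps up from `enter` to `peak`, then down to `leave`.

    Assumes enter < peak < leave. All values are hypothetical frame numbers.
    """
    if frame <= enter or frame >= leave:
        return 0.0
    if frame <= peak:
        return max_density * (frame - enter) / (peak - enter)
    return max_density * (leave - frame) / (leave - peak)


# Hypothetical example: the camera enters a cloud at frame 100,
# is deepest inside it at frame 120, and is clear again by frame 140.
terrain_pixel = (0.2, 0.4, 0.1)
fog_colour = (0.9, 0.9, 0.9)
for frame in (100, 110, 120, 130, 140):
    print(frame, mix(terrain_pixel, fog_colour, fog_factor(frame, 100, 120, 140)))
```

A soft, slightly animated noise texture for the fog colour will probably look better than the flat grey used here.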
At most, you need 5 shots:
High Orbit: Nearly half of the Earth’s surface will be visible, as well as atmospheric falloff. Terrain detail is not visible.
Low Orbit: A much smaller area of the Earth’s surface will be visible, but in greater detail. The same planet-wide image maps may not be sufficiently detailed. Take a look at NASA’s Blue Marble images - they have huge 1.3 GB downloads containing very high-res satellite imagery. They’re split into 8 sectors, too, making them easier to load on a computer. You’ll probably only need one, but depending on your location you may need to splice up to four of these together to center it on your area of interest. Depending on camera angle, you might also have to worry about how the atmosphere at the horizon looks. Shading from large terrain features (such as large mountain ranges) might be visible.
High Altitude: Again, more and more detail in a smaller area. The look of the atmosphere and clouds will start to change, and terrain features will be more visible. The camera will pass through high-altitude clouds (cirrus wisps), but they’re too faint to show up much as you pass through.
Low Altitude: The terrain really starts to pop out, and the area you’re zooming into will start to be visible. The camera will pass through the most clouds on this layer. Compositing just a cloudy fog over the camera view will give the impression of passing through clouds without the insane render times of volumetrics.
Landscape: This is the final shot. It starts off as a wide establishing shot, then finally tracks into your final location. The clouds and sky are firmly overhead, and can be ignored, or just done as a 2D image on a skybox.
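If it helps when deciding how much texture detail each shot needs, the amount of surface in view can be estimated with simple sphere geometry. This is only a back-of-the-envelope sketch: it treats the Earth as a perfect sphere, ignores atmospheric refraction and the camera's field of view, and the altitudes in the loop are just example values I picked, not recommendations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, treating the Earth as a sphere


def visible_fraction(altitude_km, radius_km=EARTH_RADIUS_KM):
    """Fraction of the sphere's surface above the horizon from a given altitude.

    Spherical-cap geometry gives h / (2 * (R + h)); it only approaches 0.5
    as the camera moves very far away.
    """
    return altitude_km / (2.0 * (radius_km + altitude_km))


def horizon_distance_km(altitude_km, radius_km=EARTH_RADIUS_KM):
    """Straight-line distance from the camera to the horizon."""
    return math.sqrt(altitude_km * (2.0 * radius_km + altitude_km))


# Example altitudes (assumptions): a far orbit, a low orbit, high and low altitude.
for altitude in (35786.0, 400.0, 30.0, 3.0):
    print(
        f"{altitude:>8.0f} km: "
        f"{visible_fraction(altitude) * 100:.3f}% of surface, "
        f"horizon at {horizon_distance_km(altitude):.0f} km"
    )
```

The takeaway is that the visible area shrinks very quickly once you leave orbit, which is why the planet-wide maps stop being detailed enough for the lower shots.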
Of course, you can leave out whichever shots you don’t think you need.
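For blending the shots together, one simple approach is to render each shot with a few frames of overlap and crossfade across the overlap. Below is a rough Python sketch of the per-frame blend weights. The shot names, frame ranges, and the 24-frame overlap are hypothetical; in practice you would probably get the same result with a Cross effect strip in the sequencer or a keyframed Mix node in the compositor.

```python
def shot_weights(frame, shots, overlap):
    """Return {shot_name: blend_weight} for one frame of the final sequence.

    shots: ordered list of (name, start_frame, end_frame), where each shot
    starts `overlap` frames before the previous one ends.
    """
    weights = {}
    last = len(shots) - 1
    for i, (name, start, end) in enumerate(shots):
        if frame < start or frame > end:
            continue
        # The first shot never fades in and the last shot never fades out.
        fade_in = 1.0 if i == 0 else min(1.0, (frame - start) / overlap)
        fade_out = 1.0 if i == last else min(1.0, (end - frame) / overlap)
        weights[name] = min(fade_in, fade_out)
    total = sum(weights.values())
    # Normalise so the weights of the active shots always sum to 1.
    return {n: w / total for n, w in weights.items()} if total > 0 else {}


# Hypothetical timeline: five shots, each overlapping the next by 24 frames.
OVERLAP = 24
SHOTS = [
    ("high_orbit", 1, 100),
    ("low_orbit", 76, 200),
    ("high_altitude", 176, 300),
    ("low_altitude", 276, 400),
    ("landscape", 376, 500),
]

for frame in (50, 88, 100, 288, 500):
    print(frame, shot_weights(frame, SHOTS, OVERLAP))
```

For the crossfade to hide the seam, the camera framing at the end of one shot should roughly match the start of the next; if you drop a shot from the list, just adjust the frame ranges so neighbouring shots still overlap.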