With CUDA 8, NVIDIA is making good on an old promise of providing unified virtual memory for the GPU, allowing allocations to spill over transparently into system RAM (at least for static data).
In this way, the hard GPU memory limit is essentially lifted.
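To give a rough idea of what this looks like in code, here is a minimal sketch that allocates more managed memory than the GPU physically has and touches all of it from a kernel. The kernel name `scale` and the 1.5x oversubscription factor are made up for illustration. It assumes a Pascal-class GPU on a platform where oversubscription is supported, and a host with enough RAM to back the allocation; on older GPUs the allocation is expected to fail instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element in place (grid-stride loop).
__global__ void scale(float *data, size_t n, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (; i < n; i += stride) {
        data[i] *= factor;
    }
}

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    if (cudaMemGetInfo(&free_bytes, &total_bytes) != cudaSuccess) {
        fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }

    // Assumption for illustration: ask for 1.5x the physical GPU memory.
    size_t bytes = total_bytes + total_bytes / 2;
    size_t n = bytes / sizeof(float);

    // One pointer, valid on both host and device.
    float *data = nullptr;
    cudaError_t err = cudaMallocManaged(&data, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Initialize on the host; pages start out in system RAM.
    for (size_t i = 0; i < n; ++i) {
        data[i] = 1.0f;
    }

    // The kernel touches every page; the driver migrates pages on demand.
    scale<<<256, 256>>>(data, n, 2.0f);
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(data);
        return 1;
    }

    printf("first = %f, last = %f\n", data[0], data[n - 1]);
    cudaFree(data);
    return 0;
}
```

Note that there is no explicit `cudaMemcpy` anywhere: the same pointer is used on both sides, and the migration cost is paid at page-fault time.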
The bad news is that it’s going to be available on the Pascal architecture only. NVIDIA has so far not announced any consumer or workstation products with this architecture, but if you’re currently planning any major GPU purchases, you may want to hold off for a while.
On page 15 of this CUDA presentation, NVIDIA shows an (eventually constant) performance hit of about 40% on a large-scale fluid simulation once the memory footprint grows beyond 70GB (on a 16GB GPU).