I Think My GPU Broke: Severe Viewport Lag for the Last 2 Nights

For the past two nights, Blender’s viewport has been unusably laggy.
For reference, I rebooted the PC about two days ago, and it was working fine before that reboot.
The problem happens in Cycles viewport render mode.

I’d recently updated the video driver to 366.36, and because the lag occurs in both 4.3.2 and 4.1, I assumed the new Studio driver from nVidia was at fault, so I rolled it back to 366.14 last night. That didn’t solve the problem.

Even stranger, the normally slower mode, Vulkan (experimental), is now faster than OpenGL, which is usually the fastest.

So after the rollback didn’t fix it, I upgraded to the 366.36 driver again. That’s when Blender started working normally with no lag again.

I made a screen recording of the problem, but the upload feature here is broken tonight (sorry an error occurred please try again).

So I uploaded the screen capture to YouTube:

But I’ll describe it:
I pan the camera around and get about 0.5 FPS. I open Preferences and turn off the RTX 3090 under CUDA, and then I can move the camera fluidly.
I turn the GPU back on in Preferences under CUDA, and now the dialog box is hard to drag across the screen; it repaints only about once per second.
I go to the render settings in the right-hand panel and turn off GPU compute, and I can move the camera view fluidly again. Turn GPU compute back on and it’s lagging just awful.

I searched threads about this, and someone fixed the problem by rebooting their PC. My case is the opposite: it was working until I rebooted 2 days ago. I rebooted again just now, but it did not help. I tried running only Blender before starting any other programs. No good. Normally Blender runs great with 50 other programs going in the background, so this is weird.

I’m starting to suspect my GPU has developed a defect and needs to be returned for service/repair, but before I do that, are there any other things I should check? I reinstalled Blender, updated it to 4.3.2 and it didn’t solve the problem. What else could have broken my system two days ago? And why did it work okay after rollback and then roll forward of the video driver for one night? I don’t want to have to roll back/forward the video driver every time I want to use Blender.

I noticed when I render in Blender, I don’t hear the GPU fan speeding up anymore.
But it’s not overheating. And the utilization tops out at only 10% now. Used to go to 100% and the fan would kick up to loud levels.

Adding to tonight’s mystery of poor viewport performance, my GPU normally turns up the fans to a loud level when Blender is rendering to file. Tonight it’s silent. And the GPU usage isn’t hitting 100% like it normally would during render. It’s peaking at maybe 10%. Idle temperature normally would be 46°, but rendering it’s maxing out at 37° where in the past, it would hit over 85°. Something is off here. GPU-Z shows the fans at 53%. And right now the PC is idle and it’s emitting a lot of HEAT.

GPU chip drawing only 10 watts during render?? This is a 4096x4096 Cycles render of a 1.5 million polygon interior scene with indirect lighting. Have I entered the Twilight Zone?

These numbers don’t add up. The GPU clock DROPPED to 200MHz during the 30 second render of a 4K image. Notice the clock dropped while the load went up. Normally the clock will go to 1700MHz during render. None of this makes any sense to me.
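
For anyone who wants to log these readings (clock, power, temperature, load) instead of eyeballing GPU-Z, here is a rough Python sketch that reads them from `nvidia-smi`. It assumes `nvidia-smi` is on the PATH and that the card reports every field as a number; some cards return `[N/A]` for power draw, which this sketch does not handle. The function names are made up for illustration.

```python
import subprocess

# Fields requested from nvidia-smi, in this order.
FIELDS = ["pstate", "clocks.gr", "power.draw", "temperature.gpu", "utilization.gpu"]


def parse_sample(csv_line):
    """Turn one 'csv,noheader,nounits' line into a dict of readings."""
    values = [v.strip() for v in csv_line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    # The performance state is text such as "P8"; everything else is numeric.
    sample = {"pstate": values[0]}
    for name, raw in zip(FIELDS[1:], values[1:]):
        sample[name] = float(raw)
    return sample


def read_gpu(index=0):
    """Query the live GPU once (requires nvidia-smi on the PATH)."""
    result = subprocess.run(
        ["nvidia-smi", f"--id={index}",
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_sample(result.stdout.strip().splitlines()[0])
```

Calling `read_gpu()` once per second while a render runs would show whether the core clock (`clocks.gr`, in MHz) really falls while utilization rises.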

  1. Don’t render with CPU+GPU. Enable only the GPU.
    ※ CPU+GPU is only worth using when the two devices perform similarly.

Selecting a GPU in the Preferences hardware list doesn’t force Cycles to use it. If CPU is selected in the render settings, the render still runs on the CPU.

  2. RTX cards perform best with Optix. (Optix is the option for RTX GPUs.)

  3. If this is a GPU driver problem, you will need to completely uninstall the driver and then reinstall it.
    Installing over the existing driver may not resolve the issue.

A render test is a good way to check for hardware problems.
Compare the CPU and GPU render times for the same scene.
If they come out similar, there is a problem.
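
One way to run that comparison is from the command line, so the viewport is out of the picture. The sketch below builds a background-render command using Blender's `--cycles-device` argument and times it. It assumes `blender` is on the PATH and that the .blend file renders with Cycles; the helper names and the file name in the usage note are hypothetical.

```python
import subprocess
import time


def render_command(blend_file, device, frame=1, blender="blender"):
    """Build a background-render command for one frame on the given Cycles device."""
    # Arguments after "--" are passed through to Cycles, e.g. CPU, CUDA or OPTIX.
    return [blender, "-b", blend_file, "-E", "CYCLES",
            "-f", str(frame), "--", "--cycles-device", device]


def time_render(blend_file, device, **kwargs):
    """Run the render and return wall-clock seconds (this includes scene load time)."""
    start = time.perf_counter()
    subprocess.run(render_command(blend_file, device, **kwargs), check=True)
    return time.perf_counter() - start


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run was than the CPU run."""
    return cpu_seconds / gpu_seconds
```

For example, `speedup(time_render("scene.blend", "CPU"), time_render("scene.blend", "OPTIX"))` coming out close to 1 would match the "similar times" warning sign described above. Note that `-f` writes the rendered frame to the scene's output path.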

Add…

The other issue I found is the Denoise setting.
For RTX, select Optix, but the option currently selected uses the CPU.

With this setting, the GPU renders the image and the CPU does the denoising, which can cause delays.

To add to what @oo_1942 said: You can use DDU to completely remove your GPU drivers. Make sure to run it in safe mode with your internet disconnected so that Windows doesn’t automatically download the drivers.

I currently have ONLY the GPU selected.

But Optix is greyed out in both viewport and render menus.

The viewport lags when in Cycles render mode, even on a blank project!

The GPU is running COLD. Normally it idles 10° HOTTER than it’s running during a render. Also GPU chip using only 10.5 watts… nVidia developers say 570 watts is normal for full GPU utilization. So something’s very wrong here.

I think what I’m going to try is a full system image restore. I did the last image backup just 10 days ago when everything worked properly.

I did the DDU driver removal and reinstall.
That seems to have fixed Blender’s issues, but created a whole host of other issues with mapping of my tablet, and rearranging of all desktop icons. I think I have most/all of it sorted.
The nagging question is “what went wrong?” How did the driver get into that state?
My GPU temperature is now idling at 40°. During render, GPU is showing 3-5% utilization and temperature goes to 60° now. Render time for the 4K image is normal based on past benchmarks.
Does seem odd that utilization is so low though.

Clocks look more normal now:

The GPU fans don’t speed up when rendering like they used to though. Used to be I could tell when it was working hard by the sound of the fans getting very loud. I don’t hear them at all now.

I noticed that DDU didn’t uninstall MSI Afterburner, which I was using for the past 4 years to run the fans at lower temperatures to extend life of GPU. I found it’s still running, but blank as in not detecting the GPU. So I reinstalled it and got the fan controls back, but then Blender became unresponsive again in the viewport. I don’t get it. Afterburner worked flawlessly as a fan speed booster for four years, and now suddenly it’s interfering with Blender? And on render, instead of 1900MHz, it’s dropping the clock to 200MHz. I don’t get it.

I removed MSI Afterburner and rebooted, and now Blender and the clock speeds are normal.

But how can I get a more aggressive fan setting like the one I had with Afterburner’s fan controls?

The GPU runs MUCH hotter with the fan maxing out at only 53%. I am concerned that GPU life will be short at high temps.

I’m thinking that Afterburner is not compatible with 366.36 nVidia driver. It used to work well on older versions. All the problems started 3 days ago when I updated the nVidia driver.

What other methods of applying a custom fan curve can I use to keep the GPU cool under load without affecting the clocks? I don’t like 100° C for full fan speed, which seems to be the factory default.

I’ve found Asus GPU Tweak III, which is compatible with my Asus ROG Strix RTX3090, and it does a decent job controlling the fans with custom curves.

The only remaining issue is that I’m wasting about 80 watts of electricity due to the high idle clock speed. I haven’t been able to get that down. It only idled down ONCE, right after I quit Blender, but on subsequent quits it just stays at 1740MHz. Wattage at idle should be around 10 watts, not 100 watts or more.

This is a big deal because electricity is $0.52/kWh here, which is why I’m on solar with batteries. That extra 100 watts can drain the battery bank significantly overnight.
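
To put rough numbers on that, here is a small Python sketch of the arithmetic. It uses 90 W of extra draw (about 100 W at idle instead of about 10 W) and the $0.52/kWh rate from this thread; the 12-hour "overnight" window is an assumption picked for illustration.

```python
def extra_energy_kwh(extra_watts, hours):
    """Energy used by the extra draw, in kilowatt-hours."""
    return extra_watts * hours / 1000.0


def extra_cost(extra_watts, hours, price_per_kwh):
    """What that energy would cost at a given grid price."""
    return extra_energy_kwh(extra_watts, hours) * price_per_kwh


EXTRA_WATTS = 90.0      # ~100 W idle now vs ~10 W before (figures from the thread)
PRICE_PER_KWH = 0.52    # grid price quoted in the thread
OVERNIGHT_HOURS = 12.0  # assumed length of a night on battery

overnight_kwh = extra_energy_kwh(EXTRA_WATTS, OVERNIGHT_HOURS)
daily_cost = extra_cost(EXTRA_WATTS, 24.0, PRICE_PER_KWH)
print(f"Overnight: {overnight_kwh:.2f} kWh from the battery bank")
print(f"Per 24 h at grid price: ${daily_cost:.2f}")
```

Under those assumptions the idle GPU alone pulls a bit over 1 kWh out of the batteries each night.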

I’ve watched half a dozen videos on the topic, tried a bunch of ‘fixes’ but none of them had any effect. If I’m at the desktop with no apps running, the GPU should be at idle, should it not?

At least Blender is working fine.

Fact discovered: MSI Afterburner is NOT compatible with 366.36 Studio driver from nVidia.

If you overclock with MSI Afterburner, Blender does not work properly.
MSI Afterburner can also be used to down-clock when Blender malfunctions because of clock speed.

※ Overclocking the CPU or GPU causes Blender to malfunction.
Overclocking with MSI Afterburner degrades performance if you don’t find the right combination of settings.

I wasn’t using Afterburner for overclocking. I was using it for fan control, because the fans would only start to ramp up when the GPU reached 100°C. Now they ramp up at 60°C and GPU will last longer.
It worked until I updated the video driver to 366.36 four days ago. Then I started having problems.
I’ve switched to Asus GPU Tweak III for a fan profile that gives adequate cooling and Blender is happy.
The only problem I have now is that when the system is idle, it’s using too much electricity. GPU clock used to drop to 210MHz when Blender was closed. Now it’s always at 1740MHz and using over 100 watts instead of 10 watts when system is idle. My solar battery storage ran down to 43% overnight, much more than usual and the PC is blowing warm air instead of cool air out the vent on top. I’ve researched the problem all evening and tried every suggestion, but the clock remains at 1740 idle. It only idled one time after the first use of Blender and when I shut it down, it went to 210MHz and the card temps went into the low 30s with power at 10 watts. That extra 90 watts is adding up quick.

There is a problem if the GPU temperature goes up to 100°C.
Clean the dust inside first.
If it rises to 100°C even after cleaning, it should be inspected.

The maximum temperature listed in the manual for the 3090 is 93°C.

※ High temperatures degrade hardware performance.

Add…

Reducing the clock with Afterburner does not cause problems in Blender.
To reduce power consumption, try Afterburner.
Afterburner lets you save your settings, so you can easily switch between them.

A reduced clock does not give maximum performance, but it is also a way to reduce malfunctions.

Without the fan control software, if left to its own, it doesn’t start to spin up the fans until it gets into the low 90s. IMHO, the fans should not be loafing when the GPU is under load. It’s dust free. The card is fairly new, but when I was investigating the other problem I did clean the inside of the computer with compressed air.
Idling at 46°C and the fans running at 30% does not seem like a good fan policy, so I used the software tool to make my own fan curves.

This keeps it running MUCH cooler and actually spins the fans to 70% when rendering, maintaining about 50°C on a long render. Without the fan curves, the default fan programming would have it over 90°C when rendering.

As I mentioned, AB is not compatible with Studio driver 366.36, so I cannot use it unless I roll back to a much older driver version.

I’m not sure about the exact cause.
I think 3090 is basically high power usage and heat generation. :thinking:

The hardware temperature rise can also be monitored using software that can check temperature/power for all parts.

The measurement program I use is HWiNFO (free).

I’ve had hardware issues with my own system, including endless reboot loops and overheating, so I check the temperature and voltage periodically (it’s currently resolved).
The fault in my system turned out to be a bad contact in a motherboard voltage connector.

※ For GPU drivers, if you’re not a game user, you don’t need to update them periodically.

Add…

Please do not rely on the Windows Task Manager graph for GPU utilization.
Check the usage and clock readings in Afterburner instead.

Rendering in Blender doesn’t seem to show up in Task Manager’s graph.
I think that graph probably shows game usage.

I use Studio drivers because I’m a content creator. But updates to programs like OBS require me to update the graphics driver fairly often. Same with Blender. To take advantage of certain features of Open Image Denoise in Blender 4.3.2, it is necessary to run the latest driver, according to the Blender.org website on Blender 4.4 features.

I don’t believe there’s a hardware problem, just a poor choice of fan curves by Asus, probably trying to make the card silent. It doesn’t even turn the fans on until it reaches 45°C. Now I’ve got them starting at 35°C and hitting max speed at 80°C. The GPU runs MUCH cooler.
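
As an illustration of what a curve like that works out to, here is a minimal Python sketch. The 35°C start and 80°C full-speed points come from the post above; the straight-line ramp between them and the 30% starting duty are assumptions, not the actual GPU Tweak III settings.

```python
def fan_percent(temp_c, start_c=35.0, full_c=80.0, start_percent=30.0):
    """Fan duty (0-100 %) for a GPU temperature, using a straight-line ramp."""
    if temp_c < start_c:
        return 0.0       # fans stay off below the start temperature
    if temp_c >= full_c:
        return 100.0     # full speed at or above the top of the curve
    # Linear interpolation between (start_c, start_percent) and (full_c, 100 %).
    fraction = (temp_c - start_c) / (full_c - start_c)
    return start_percent + fraction * (100.0 - start_percent)


for temp in (30, 35, 50, 65, 80):
    print(f"{temp} C -> {fan_percent(temp):.0f} % fan")
```

The real tool lets you place several points by hand, so an actual curve does not have to be a single straight line like this one.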

Then the next thing I did was a ‘hack’, but it works:

Using this utility, I can force GPU into state 8, which is lowest power.

This gets me down to 420MHz. Still not the 210MHz that it used to idle down to, but power consumption of the GPU is now 30W instead of 110W. I’ve got my PC AC mains draw down to 357W idling now.

It’s a workaround and not a perfect fix, but at least it prevents me from draining my solar battery system too low overnight.

I also did some tests between Optix and Open Image Denoise with GPU option enabled. A scene rendered in 21 seconds with Optix and 16 seconds with Open Image Denoise.