Tesla K80 24GB x3 rendering

Retrotrollet · March 21, 2021, 2:02pm

I BIOS-modded and fine-tuned all my Tesla cards yesterday to about 1200 MHz, so I don't need MSI Afterburner anymore.

M-Alien_Inxthink · March 23, 2021, 8:38am

Hello, I am still waiting for various liquid-cooling components. I think I will have the whole computer assembled within the next month or two.

I have never taken so long to set up a PC. I have been collecting parts since September of last year, and it is not easy to build this computer because there are many parameters to consider.

For example, it has two independent liquid-cooling circuits in push/pull mode, with 16 120 mm fans and 4 200 mm fans… Then there is the economic investment I have been making: the liquid cooling alone has cost me approximately €1000, and the best of all is that I just quit my architectural modeling job 🥲

You won’t have a job for me? 🥲

Retrotrollet · March 31, 2021, 5:09pm

My build up and running

Lucas-pl · April 9, 2021, 6:17pm

About basic Tesla overclocking:
I managed to easily overclock mine to run at 810 MHz instead of the default 732 MHz using the nvidia-smi (NVSMI) tool, which is installed by default with the NVIDIA drivers. 810 MHz is the highest supported GPU clock speed listed by the tool. I see I can change the memory clock speed too, but there is only one supported value on the list.

==============NVSMI LOG==============

Timestamp                                 : Fri Apr  9 19:51:52 2021
Driver Version                            : 461.40
CUDA Version                              : 11.2

Attached GPUs                             : 2
GPU 00000000:07:00.0
    Supported Clocks
        Memory                            : 2600 MHz
            Graphics                      : 810 MHz
            Graphics                      : 784 MHz
            Graphics                      : 758 MHz
            Graphics                      : 731 MHz
            Graphics                      : 692 MHz
            Graphics                      : 666 MHz
            Graphics                      : 640 MHz
        Memory                            : 324 MHz
            Graphics                      : 324 MHz

I suppose I can still change those values to whatever I want, but here is the question: what can go wrong if I set some “unsupported” clock speed for the memory? Or set the GPU clock speed even higher than 810 MHz?

I’m not talking about excessive heat or system stability under high load (the temperature is still fine, I have some room for more heat) - I’m talking about the potential risk of irreversibly damaging or breaking my Tesla.

Fun fact - I couldn’t overclock the GPU clock speed using Afterburner (which seemed to be the easiest way), but I could overclock the memory speed :)
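
For reference, a list like the one above is what nvidia-smi -q -d SUPPORTED_CLOCKS prints, and a listed pair can be applied with nvidia-smi -ac <memory>,<graphics> (nvidia-smi -rac resets it). What follows is only a minimal Python sketch, not something posted in this thread: it parses an abridged copy of the log and builds the command for the highest listed pair. The helper names and the GPU index 0 are assumptions made up for illustration, and setting clocks normally needs administrator rights.

```python
import re


def parse_supported_clocks(log_text):
    """Map each memory clock (MHz) to the graphics clocks (MHz) listed under it."""
    clocks = {}
    current_memory = None
    for line in log_text.splitlines():
        match = re.match(r"\s*(Memory|Graphics)\s*:\s*(\d+)\s*MHz", line)
        if not match:
            continue  # skip headers and any other lines
        kind, mhz = match.group(1), int(match.group(2))
        if kind == "Memory":
            current_memory = mhz
            clocks[current_memory] = []
        elif current_memory is not None:
            clocks[current_memory].append(mhz)
    return clocks


def application_clock_command(clocks, gpu_index=0):
    """Build (but do not run) the nvidia-smi call for the highest listed pair."""
    memory = max(clocks)
    graphics = max(clocks[memory])
    return ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{memory},{graphics}"]


# Abridged from the NVSMI log in the post above.
SAMPLE = """\
    Supported Clocks
        Memory                            : 2600 MHz
            Graphics                      : 810 MHz
            Graphics                      : 784 MHz
        Memory                            : 324 MHz
            Graphics                      : 324 MHz
"""

clocks = parse_supported_clocks(SAMPLE)
print(clocks)
print(" ".join(application_clock_command(clocks)))
```

The sketch only prints the command; passing the list to subprocess.run would actually apply it. It says nothing about values outside the supported list, which is the open question in the post.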

Retrotrollet · April 9, 2021, 7:10pm

Hi Lucas!
I managed to BIOS-mod all my Tesla cards (K80/K40) to 1160 MHz GPU and 3000 MHz memory.
No need for Afterburner.
At 1160 MHz with my vcore I can render safely.

The K40 has the GTX 780 core, so if someone can BIOS-mod the K40’s vcore higher, to something like 1.3 V, it can do 1350-1500 MHz.
I can bucket-render at higher clocks, but it’s slower in reality.
The GTX 780 has a lighter BIOS that is easy to volt-mod.
The K40’s BIOS is bigger, with a lot of different vcore maps.
The K80 BIOS is like the K40’s: it is tricky to get the voltage over 1.092 V.

The memory speed is 3000 MHz on all similar GTX cards without a memory cooler.
So 3000 MHz is safe.
I can send you my BIOS files if you want?

Retrotrollet · April 9, 2021, 7:19pm

You can read my BIOS map for the K40 and apply the settings to your K20 BIOS.
I can put up pictures of it so you can mod your card.

Retrotrollet · April 9, 2021, 8:11pm

Some BIOS settings to copy for your GPU.

Retrotrollet · April 9, 2021, 8:24pm

Try memory at 3000 MHz and GPU at 850 MHz to begin with; skip the volt mod and power settings.
You can flash the BIOS many times.

Lucas-pl · April 9, 2021, 11:15pm

Thank you, I’m still a bit scared of flashing the BIOS (last time I flashed a mobo BIOS to the latest version I ended up bricking it…) but I managed to overclock with Afterburner and nvidia-smi. As I mentioned, changing the clock speed value in Afterburner didn’t actually change it, but it turned out it did change the range of supported speeds in nvidia-smi.
That way I managed to get 984 MHz for the GPU and 3200 MHz for the memory, but finally ended up at 870 MHz and 3000 MHz. 3200 MHz seemed OK for short renders (BMW scene) but became unstable for long computing (Folding@home). Higher GPU speeds generated too much heat - I got a stable 90C but wanted to have a 1-2C safety threshold.

Retrotrollet · April 10, 2021, 12:02am

870 MHz and 3000 MHz is great.
I had 7 GTX 780s in one PC and flashed the BIOS to the wrong card.
3 cards were useless for a couple of years.
Then I found the CH341A USB BIOS/EEPROM programmer, which you can use to restore a graphics card’s BIOS.

Lucas-pl · April 13, 2021, 8:18pm

Doing some more research and getting new ideas :) - the K20 doesn’t have any sophisticated way of throttling according to temperature (or at least I don’t know how to enable it), and when the temperature reaches 90C the clock speed just drops to the minimum level.
Since nvidia-smi lets me change the clock speed “on the fly”, I managed to script this in the same way I control my case fan speed using HWiNFO - I have 950 MHz set for temperatures below 85C and 898 MHz above 85C.
So far this works very well for me - in my case, the GPU temperature under load can vary from 75 to 90C depending on the room temperature, the kind and duration of the rendering/computing job, etc., so this is a simple way to get the maximum performance without overheating and dropping to the low clock speed.
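
The script itself was not posted, so the following is only a rough Python sketch of the idea described above, under stated assumptions: it polls the temperature with nvidia-smi --query-gpu=temperature.gpu and switches application clocks with nvidia-smi -ac. The 950/898 MHz values and the 85C switch point come from the post; the 3000 MHz memory value, the 5-second polling interval, the function names, and treating exactly 85C as "hot" are assumptions. The original was driven by HWiNFO rather than a polling loop.

```python
import subprocess
import time

THRESHOLD_C = 85       # switch point from the post
CLOCK_COOL_MHZ = 950   # graphics clock below the threshold (from the post)
CLOCK_HOT_MHZ = 898    # graphics clock at or above the threshold (from the post)
MEMORY_MHZ = 3000      # assumption: must be a memory clock the card lists as supported


def pick_graphics_clock(temp_c):
    """Return the graphics clock (MHz) to use for a given GPU temperature."""
    return CLOCK_COOL_MHZ if temp_c < THRESHOLD_C else CLOCK_HOT_MHZ


def read_temperature(gpu_index=0):
    """Read the GPU temperature in degrees C through nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index),
         "--query-gpu=temperature.gpu", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def set_application_clocks(graphics_mhz, gpu_index=0):
    """Apply a memory,graphics application clock pair (needs admin rights)."""
    subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{MEMORY_MHZ},{graphics_mhz}"],
        check=True,
    )


def run_governor(gpu_index=0, interval_s=5.0):
    """Poll forever, changing clocks only when the target actually changes."""
    current = None
    while True:
        target = pick_graphics_clock(read_temperature(gpu_index))
        if target != current:
            set_application_clocks(target, gpu_index)
            current = target
        time.sleep(interval_s)
```

Calling run_governor() starts the loop. With a single threshold the clock can flip back and forth when the temperature hovers around 85C; a small gap between the "step down" and "step up" temperatures would be one way to damp that.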

Retrotrollet · April 14, 2021, 5:08am

It’s great for air cooling, and close to 1000 MHz 👍

futc · May 19, 2021, 11:38am

Hey there, I recently bought a K80 and I am having some really weird problems: for some reason my K80 is quite a bit slower than it should be. I even flashed the BIOS files that you uploaded on TechPowerUp, @Retrotrollet, but I still get less performance than a stock K80 should get. According to opendata.blender.org, a single K80 core renders the BMW benchmark scene in about 3 minutes, but a single K80 core on my card takes 4 minutes. Even with the overclocked BIOS this is way longer than it should be. My temps are okay and the card is going up to the 1000 MHz clock. Do you guys have any idea what could be causing this problem? I am running Windows 10 on a Ryzen 3900X, ASUS B550-F system. I already tried different driver/CUDA versions but there was no difference in performance. I also get the same slow speeds when using Pop!_OS. Any help would be greatly appreciated.

Lucas-pl · May 19, 2021, 11:44am

Since the temperature and clock speed are OK, maybe the PCIe bus speed is the bottleneck? On my motherboard, the secondary GPU slot is slower than the primary; the difference in render time is not significant but still noticeable.

futc · May 19, 2021, 11:55am

I believe PCIe speeds don’t really make a difference in Blender rendering. The lower PCIe slot the K80 is in has only 4 lanes, which is not a lot, but in Blender this shouldn’t make a big difference. I have seen tests online where a GPU has the same performance with x16 lanes and x1 lane.

Edit: maybe the problem is that the PCIe slot is connected to the chipset rather than the CPU?
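
To put rough numbers on the x4 versus x16 question, here is a small Python sketch. It assumes both slots run at PCIe 3.0 and uses approximate per-lane figures (about 0.985 GB/s per lane per direction for Gen3); the sample link readings "3, 4" and "3, 16" are hypothetical, written in the "generation, width" form that nvidia-smi --query-gpu=pcie.link.gen.current,pcie.link.width.current --format=csv,noheader prints. The function names are made up for illustration.

```python
# Approximate usable bandwidth per lane, per direction, in GB/s.
PER_LANE_GBPS = {1: 0.25, 2: 0.5, 3: 0.985}

# Query that reports the link each GPU has currently negotiated.
LINK_QUERY = [
    "nvidia-smi",
    "--query-gpu=pcie.link.gen.current,pcie.link.width.current",
    "--format=csv,noheader",
]


def parse_link(csv_line):
    """Turn a line such as '3, 4' into (generation, lane_count)."""
    gen, width = (int(field) for field in csv_line.split(","))
    return gen, width


def link_bandwidth_gbps(gen, width):
    """Estimate one-direction link bandwidth in GB/s."""
    return PER_LANE_GBPS[gen] * width


# Hypothetical readings for the two slots discussed above.
for label, line in (("chipset x4 slot", "3, 4"), ("CPU x16 slot", "3, 16")):
    gen, width = parse_link(line)
    print(f"{label}: Gen{gen} x{width} ~ {link_bandwidth_gbps(gen, width):.2f} GB/s")
```

That works out to roughly 3.9 GB/s against 15.8 GB/s. Whether the gap matters for a given render is a separate question, and the reported generation can drop while the card idles, so the reading is best taken under load.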

Retrotrollet · May 19, 2021, 12:08pm

Hi!
I really don’t know what the problem is.
I believe the K80 needs a little more bandwidth than a single-GPU card.
So I have my K80s at x16 and x8.
My 2x K40 work at x4.
I tested a K80 externally at x4 with no luck.

Lucas-pl · May 19, 2021, 12:21pm

“I have seen tests online where a GPU has the same performance with x16 lanes and x1 lane”

Well, I heard that too, but testing on a GTX 1660 gave me render times that differed by 3-5% depending on the PCIe slot - not great, not terrible. I don’t know how that would affect the Tesla, because mine fits in only one of those slots ;).

I doubt it would help much, but you might also try disabling ECC - ECC has a slight performance cost and is totally useless in rendering.
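
For anyone checking this, ECC can be switched off with nvidia-smi -e 0, and the change only takes effect after a reboot or GPU reset. The Python sketch below is an illustration under assumptions rather than a tested recipe: it builds the commands and interprets one line of the "current, pending" output from nvidia-smi --query-gpu=ecc.mode.current,ecc.mode.pending --format=csv,noheader. The sample lines and the function name are hypothetical.

```python
# Command that requests ECC off (needs admin rights and a reboot to apply).
ECC_DISABLE = ["nvidia-smi", "-e", "0"]

# Query that reports the active ECC mode and the mode pending after reboot.
ECC_QUERY = [
    "nvidia-smi",
    "--query-gpu=ecc.mode.current,ecc.mode.pending",
    "--format=csv,noheader",
]


def ecc_status(csv_line):
    """Summarise one 'current, pending' line from the ECC query."""
    current, pending = (field.strip() for field in csv_line.split(","))
    if current != pending:
        return f"ECC is {current} now, {pending} after the next reboot"
    return f"ECC is {current}"


# Hypothetical query results before and after the reboot.
print(ecc_status("Enabled, Disabled"))
print(ecc_status("Disabled, Disabled"))
```

On these cards, disabling ECC also frees the part of the memory that was reserved for the error-correction data.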

Retrotrollet · May 19, 2021, 12:31pm

I have ECC off.
Someone else on the forum told me to turn it off.
It probably gains a little.
I have 9 modded GPUs in the same PC, so I am happy it works.
I made a new driver kit for my PC without NVIDIA telemetry and shit; it contains only GTX 1080 Ti/K80/K40 support, and it works far better and is more stable.

futc · May 19, 2021, 12:34pm

I already disabled ECC; it didn’t make any difference, sadly. Hmm, maybe I need to test the K80 in the top x16 slot on my motherboard; the slot with x4 lanes connected to the chipset might be too little. But the problem is I don’t want to have my 6700 XT in the bottom slot because it would definitely destroy its gaming performance.

Retrotrollet · May 19, 2021, 1:39pm

AMD and NVIDIA in the same PC.
I never tested that; it’s kind of trouble, I think.
Test with two NVIDIA cards and you will probably get better performance from the K80.