fermi 470 performance problem

Are you on a laptop? Then yes, it really is that much of a difference. Laptop GPUs normally run at much lower clocks, which results in a huge performance loss.

Here are some specs to compare:
9800M GS
Pipelines: 64 - unified
Core Speed: 530 MHz
Shader Speed: 1325 MHz
Memory Speed: 800 MHz

9800GTX (the values in brackets are the clocks I'm running this card at; yes, it's overclocked)
Pipelines: 128 - unified
Core Speed: 675 (@ 800) MHz
Shader Speed: 1688 (@ 1890) MHz
Memory Speed: 1100 (@ 1166) MHz

This is why your card doesn't have the same performance… and I should see at least the same difference going to the Fermi…
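To put a rough number on that gap, here is a minimal sketch that multiplies the pipeline count by the shader clock for each card. That product is only a crude proxy I'm assuming for illustration (it ignores memory bandwidth, drivers and everything else), and the dictionary layout and the `shader_throughput` helper are made up for this post.

```python
# Spec-sheet numbers copied from the comparison above.
cards = {
    "9800M GS": {"pipelines": 64, "shader_mhz": 1325},
    "9800GTX (stock)": {"pipelines": 128, "shader_mhz": 1688},
    "9800GTX (overclocked)": {"pipelines": 128, "shader_mhz": 1890},
}


def shader_throughput(card):
    """Crude proxy (an assumption, not a benchmark): pipelines x shader clock."""
    return card["pipelines"] * card["shader_mhz"]


# Express every card relative to the laptop part.
baseline = shader_throughput(cards["9800M GS"])
ratios = {name: shader_throughput(card) / baseline for name, card in cards.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the 9800M GS")
```

By this estimate the desktop card has roughly two and a half times the raw shader throughput at stock clocks, and a bit more when overclocked, so a big gap in the benchmark is not surprising.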

OK, so you think the 470 is crap.
Are you going to keep filling a thread with benchmarks that show this again and again, or are you going to ask for your money back or trade it?
What do you expect to achieve here? We are trying to be helpful, but it is not possible.
Re the vertical sync: it is something I know makes a good difference in SolidWorks.
I thought it might help you too; sorry if it doesn't perform miracles.
I am not a graphics expert, so I can't enter into a discussion with you about the relative effects of this setting on wire vs. solid.
Happy blending :smiley:

I'm not expecting any help, because there is nothing that can be done. It's a driver problem, and it is clearly on NVIDIA's side now to do the work.
I'm just updating this thread with benchmarks of newer drivers in case anybody is interested to know when (or if) it's worth upgrading to one of these cards. I don't think there is anything wrong with that.

My GTX 470 is performing very poorly too… :frowning:

Benchmark Results, Screen Size 1333 x 874

Overall Score (FPS)
gl : 18.0654 fps
render : 6.15 sec

Spin wireframe view, subsurf monkey, 4 subsurf levels : 60.0142 fps
Spin solid view, subsurf monkey, 4 subsurf levels : 29.9448 fps
Spin solid view, 1000 monkes : 9.7860 fps
Spin wire view, 1000 monkes : 18.8456 fps
OpenGL image load & free, 256x256 px : 329.2491 fps
OpenGL image load & free, 512x512 px : 87.4408 fps
OpenGL image load & free, 1024x1024 px : 22.3460 fps
OpenGL image load & free, 2048x2048 px : 5.6084 fps
Raytracing with AO and area light, 2 threads : 4.53 sec
Shadowbuffer light, 1 threads : 7.76 sec

Has anyone tried benchmarking the GTX 280 against the GTX 4xx series to see if the 280 might actually be faster with things like viewport speed (since it’s the most powerful GPU not based on Fermi)?

If Fermi slows down outside of apps optimized specifically for it, to the point of only matching the GTX 280, then it would make more sense for people to get a 280, because a 470 would be very similar in performance while putting out a lot more heat and consuming more power.

I have the same problem with a GTX 465.

Edit: meanwhile, the NVIDIA CUDA driver with the GTX 265 (used for Octane Render) works!

Much like Big Fan, I can’t explain why either, but I saw a good difference in this benchmark by disabling Vertical Sync. Has anyone actually tried it? Also, has anyone tried this benchmark with this card in Linux? Linux always gives me better viewport performance than Windows. Drastically better.

Drastically the same.

It may be drivers, or it could be unhelpful settings, or both, but it might also be due to the architecture of the 4xx cards.
Maybe it tessellates very well but is not so hot at handling a ‘static’ mesh, although I’m not sure why that should be. It's hard to imagine they would knowingly make a design that crippled performance for Max, Blender, etc.
Perhaps you should ask your questions on the NVIDIA forum, where someone with a better technical understanding can give you some answers.

I already asked there, and there are also some others using other 3D apps who have performance problems, but no one from NVIDIA answers in those forums, so we are left on our own to figure out why this happens.
I can't imagine that this is an architecture problem; otherwise games with high poly counts like Crysis would have the same performance problems, which they don't (I might be wrong). But shouldn't this card be strong in OpenGL applications? Maybe there is a problem with OpenGL, and that's why there are no Quadros with the new architecture yet o_O

I would not be too surprised if that was built in on purpose, since they also detect whether you have an ATI card in your system and disable PhysX if you do…

The problem is that as long as we don't have anything official, it's all speculation :/

My GTX 260:

Benchmark Results, Screen Size 1920 x 1027

Overall Score (FPS)
gl : 37.9693 fps
render : 3.89 sec

Spin wireframe view, subsurf monkey, 4 subsurf levels : 120.8355 fps
Spin solid view, subsurf monkey, 4 subsurf levels : 59.8274 fps
Spin solid view, 1000 monkes : 12.2334 fps
Spin wire view, 1000 monkes : 25.4109 fps
OpenGL image load & free, 256x256 px : 1504.4672 fps
OpenGL image load & free, 512x512 px : 404.0249 fps
OpenGL image load & free, 1024x1024 px : 89.6773 fps
OpenGL image load & free, 2048x2048 px : 19.8733 fps
Raytracing with AO and area light, 1 threads : 2.92 sec
Shadowbuffer light, 4 threads : 4.85 sec

The drivers for the GTX 4xx need a little work, but then you have about 250% more than me.
Cheers, mib

Edit: Has anybody tried the new beta driver 256.35?
http://www.nvidia.com/Download/Find.aspx?lang=en-us

OpenGL 4.1 is out, and NVIDIA announced support with the next driver (so did AMD).
For all we know, NVIDIA may simply have neglected to optimize the OpenGL integration for the GF100 in the latest drivers and is going to do it with the next driver.

All there is to do is to wait and see. :smiley:

They also announced the new Quadro cards based on the new architecture, so they have a reason to make it work with OpenGL.

http://developer.nvidia.com/object/opengl_driver.html New OpenGL drivers are out, let's see how they perform :O

258.96

Benchmark Results, Screen Size 1600 x 1181

Overall Score (FPS)
gl : 27.6538 fps
render : 2.87 sec

Spin wireframe view, subsurf monkey, 4 subsurf levels : 137.6389 fps
Spin solid view, subsurf monkey, 4 subsurf levels : 38.6761 fps
Spin solid view, 1000 monkes : 10.3462 fps
Spin wire view, 1000 monkes : 34.5445 fps
OpenGL image load & free, 256x256 px : 646.3310 fps
OpenGL image load & free, 512x512 px : 163.9601 fps
OpenGL image load & free, 1024x1024 px : 37.7219 fps
OpenGL image load & free, 2048x2048 px : 10.3721 fps
Raytracing with AO and area light, 4 threads : 2.44 sec
Shadowbuffer light, 4 threads : 3.29 sec

259.09

Benchmark Results, Screen Size 1600 x 1181

Overall Score (FPS)
gl : 28.5355 fps
render : 2.75 sec

Spin wireframe view, subsurf monkey, 4 subsurf levels : 135.3129 fps
Spin solid view, subsurf monkey, 4 subsurf levels : 39.4178 fps
Spin solid view, 1000 monkes : 10.5529 fps
Spin wire view, 1000 monkes : 34.7798 fps
OpenGL image load & free, 256x256 px : 657.5368 fps
OpenGL image load & free, 512x512 px : 175.0782 fps
OpenGL image load & free, 1024x1024 px : 42.1098 fps
OpenGL image load & free, 2048x2048 px : 10.7411 fps
Raytracing with AO and area light, 4 threads : 2.36 sec
Shadowbuffer light, 1 threads : 3.14 sec

Not much difference… I'm going to do a clean install and see if that speeds things up a bit more.
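For anyone who wants to compare two of these result dumps without eyeballing them, here is a minimal sketch that parses the pasted benchmark lines and prints the percentage change per test. The function names (`parse`, `percent_change`) are made up for this post, and it assumes every result line has the `label : value unit` shape shown above. Only tests whose label matches exactly in both runs are compared, so the Shadowbuffer line, which reports a different thread count in each run, is skipped.

```python
RESULTS_258_96 = """
gl : 27.6538 fps
render : 2.87 sec
Spin wireframe view, subsurf monkey, 4 subsurf levels : 137.6389 fps
Spin solid view, subsurf monkey, 4 subsurf levels : 38.6761 fps
Spin solid view, 1000 monkes : 10.3462 fps
Spin wire view, 1000 monkes : 34.5445 fps
OpenGL image load & free, 256x256 px : 646.3310 fps
OpenGL image load & free, 512x512 px : 163.9601 fps
OpenGL image load & free, 1024x1024 px : 37.7219 fps
OpenGL image load & free, 2048x2048 px : 10.3721 fps
Raytracing with AO and area light, 4 threads : 2.44 sec
Shadowbuffer light, 4 threads : 3.29 sec
"""

RESULTS_259_09 = """
gl : 28.5355 fps
render : 2.75 sec
Spin wireframe view, subsurf monkey, 4 subsurf levels : 135.3129 fps
Spin solid view, subsurf monkey, 4 subsurf levels : 39.4178 fps
Spin solid view, 1000 monkes : 10.5529 fps
Spin wire view, 1000 monkes : 34.7798 fps
OpenGL image load & free, 256x256 px : 657.5368 fps
OpenGL image load & free, 512x512 px : 175.0782 fps
OpenGL image load & free, 1024x1024 px : 42.1098 fps
OpenGL image load & free, 2048x2048 px : 10.7411 fps
Raytracing with AO and area light, 4 threads : 2.36 sec
Shadowbuffer light, 1 threads : 3.14 sec
"""


def parse(block):
    """Turn 'label : value unit' lines into {label: (value, unit)}."""
    results = {}
    for line in block.strip().splitlines():
        label, value = line.rsplit(" : ", 1)
        number, unit = value.split()
        results[label.strip()] = (float(number), unit)
    return results


def percent_change(old, new):
    """Percent change for every test whose label appears in both runs."""
    changes = {}
    for label, (old_value, _unit) in old.items():
        if label in new:
            changes[label] = (new[label][0] - old_value) / old_value * 100.0
    return changes


old_run = parse(RESULTS_258_96)
new_run = parse(RESULTS_259_09)
changes = percent_change(old_run, new_run)

for label, change in changes.items():
    unit = old_run[label][1]
    # For fps higher is better; for sec lower is better.
    print(f"{change:+6.2f}% ({unit})  {label}")
```

Keep in mind that a positive change is good for the fps lines and bad for the sec lines, and that runs at different screen sizes are not directly comparable this way.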

There’s a discussion on the Luxology boards about the 480, and people report that everything in modo works just fine for them… :spin:

Then again, Luxology is in the partner program with NVIDIA, so mayhaps they know something the rest of us don’t. :confused:

I think I have a similar problem with an ENGTS250 1GB… I know it is not comparable with the 465, but it should be more or less the same as the 9800GTX, and with more than 1 million polygons it starts to lag hard… I will try the new drivers too.

How can I run that benchmark you are posting results of? EDIT: Whoops… found the link! Pardon me! :smiley:

thanks!

Your 250 should outperform my 9800GTX or at least (if mine is not overclocked) give similar results.

Yes, the test gives me 30-36 fps (depending on the settings in the NVIDIA control panel), so it looks like it is the same as the 9800GTX.

The thing that I don’t understand is that if I add multires level 6 (about 2 million polygons) to Suzanne, I think I get about 1 fps or less… Blender cannot be used like this, while I have heard of people who manage to reach higher polygon counts without these performance problems.

I also heard that in 2.53 the polygon management has been greatly improved, yet I get exactly the same performance in 2.49 and 2.53.

So maybe the performance is normal and I'm just expecting too much, but I was expecting to be able to handle at least 1.5 million polygons (and the system is already lagging a lot with this amount).

For better sculpt performance some settings may help:
Change Window Draw Method and VBO in User Prefs > System from Automatic to another value. For me it is Triple Buffer and VBO = on.
Disable Double Sided in Mesh Settings.

You also need a sculpt build from graphicall.org or a regular build compiled without OpenMP, since OpenMP slows down sculpt performance.
In current builds OpenMP is disabled for sculpting; in 2.53 beta it is enabled.
I could work with up to 30 million faces; then my system starts caching and lags.

I hope it helps a little, cheers, mib

Thanks for the hints, mib! I changed the settings, but I cannot find a no-OpenMP build on GraphicAll… I have never been good at searching the GraphicAll archives :D Could you point me to one, please?

I don’t know what OS you use, so look for the GSoC builds by jwilkins and/or Nicholas Bishop.
They are made specially for sculpting. For Linux they are easier to find.

Cheers, mib