Windows-64 impact on rendering

I was reading the tech news today and found a processor comparison between AMD and Intel on Windows XP (32 and 64 bit). The results will surprise you…

Of course SiSoft Sandra is synthetic, while POV-Ray is real-world and relevant to what we do here. The next test shows the generation of a Mandelbrot fractal (AKA procedural texturing).

Of course the performance will be tweaked by both AMD and Intel, possibly with various firmware drivers to help their CPU execute even faster. If I have neglected any rendering benchmarks, please point them out. You can find this and the other benchmarks at X-Bit Labs.

Doesn’t surprise me. Athlon came first in x86-64, and Intel copied AMD’s instruction set almost verbatim a few months down the road.

Kind of funny actually, AMD was originally just trying to take market share from Intel many years ago by making cheap, sub-par chips. Nowadays they’re the big kid on the block.

Rather ironic.

Is there any current thought toward leveraging 64-bit computing in Blender? I’ve noticed that LightWave is coming out with a 64-bit version (I read it in a recent edition of 3D World Magazine, I think…) that has some performance gains on large projects. I’m mostly ignorant about this stuff, but would like to see speed improvements, as always… :wink:

that would be great, with my new proc; does anyone have any ideas?

Does going from 32 to 64 bit require code changes? Or is this something that the compiler does by itself, since the architecture is somewhat similar?

i think the compiler should do that

i will check on that.

both

your application can be built to run better on 64 bit architectures, but mostly I see the gain as not being limited to 4 GB of memory
[supposedly windows xp 64 limits each process to 4 GB… using 64 bit addressing you ought to be able to address about 18 exabytes of memory]

but your application also has to be able to deal with longs and pointers being 64 bits [blender can, x86_64 isn’t the only 64 bit platform]

the compiler ought to take care of the rest, but as with sse there are probably other things that can be done
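
To put rough numbers on the address-space point above, here is a minimal Python sketch. It only does the arithmetic for 32-bit and 64-bit pointers and reports the pointer width of whatever interpreter runs it. The helper name `address_space_bytes` is made up for illustration and has nothing to do with Blender's code.

```python
import ctypes


def address_space_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can hold."""
    return 2 ** pointer_bits


# 32-bit pointers: 4 GiB of addressable memory.
print(address_space_bytes(32) / 2 ** 30, "GiB")

# 64-bit pointers: 16 EiB, which is roughly 18.4 exabytes in decimal units.
print(address_space_bytes(64) / 10 ** 18, "EB")

# Pointer width of the Python build running this script (32 or 64).
print(ctypes.sizeof(ctypes.c_void_p) * 8, "bit pointers")
```

This is only the theoretical ceiling; what a single process can actually use depends on the operating system and the hardware.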

would be nice to see a benchmark between a

3600 Athlon
and a 3.6Ghz P4

rather than the 3800… i don’t care about real clock speed, but IMO the marketing and cost comparison between AMD and Intel would say that these two chips should be compared side by side (since clock speed is not comparable between any of the platforms, including Apple).

i think the intel on 32 bits would beat the AMD on 64 (but hey, i won’t know till the benchmarkers get their acts together, or AMD rethinks their labelling).

Alltaken

There are plans to release a 64-bit Blender version, and some work has already been done on this.
Unfortunately there is currently no developer with a 64-bit system and enough memory (at least more than 4 GB) to test and further develop this. Ton is trying to get a system like that, or just the memory, available to a developer though, since it would be nice for marketing!

Not if you have a pentium :slight_smile:

In the “testing builds” forum on Blender.org you can find a 64-bit build of Blender.

I don’t know what the fuss surrounding 64-bit processors is all about. My SGI has got 64-bit processors (yes, that’s processors plural - there are four of them! :o ) and it’s eight years old, so they are nothing new. It’s also got 1 GB of RAM and runs Blender smoother than my Pentium-powered desktop. And it cost less than my desktop as well!! Damn thing’s still waaaay too noisy though :smiley:

Personally, I can’t wait until Blender gets multi-processor support.

Me, too… that will give me a boost, and it’s not far away!

Also, the multi-core designs that have been tossed around recently in processor design news may be more of a performance boost than a 64-bit switch, especially with a multi-threaded implementation…

The latest CVS build has “threads” to support dual-processor machines:

http://blender3d.org/cms/Render_changes.515.0.html

It can also speed up the render a bit if you have only one Pentium, but with HT technology.

I’d like to see a similar comparison running some 64-bit Linux distribution (maybe Gentoo). Since basically everything can be recompiled under Linux, I’m guessing you’d see a bigger speedup.

Hmmm… sounds interesting. I’m not sure if this is just a Pentium thing though, or if it will work with multi-processor SGIs (or any other multi-processor system, e.g. the new Macs have 2 processors, don’t they?).

But why bother supporting SGIs? Surely they are rare as hen’s teeth/expensive/obsolete? Well no, not really. Keep an eye on eBay and you will find dual-processor Octanes for around £200-£300 or so. I have a quad-processor Onyx which has cost me £400 to get up and running. Don’t be put off by the apparently low clock speeds of the CPUs. They are almost all 64-bit (only the early stuff is 32-bit) and they have a radically different architecture to desktop PCs, which means that they are surprisingly fast. Another bonus is that systems like Onyx 2s and Origins can scale up to have as many CPUs as you can afford. A massive three-rack Origin system went for about £1500 on eBay recently. It had 10 dual-processor systems in the racks, if I recall. Cheap rendering horsepower or what?!?!?!? If you avoid the early stuff like Indys and early Indigos, and the rare stuff like Challenges and Irises, you’ll get a very stable, usable system which still has plenty of life left in it.

A big bonus is that many SGIs come with composite and S-video in and out, analogue and digital audio in and out, and multi-monitor support as standard. And many can also do things that PCs simply can’t. For example, my Onyx display settings are 1600 pixels by 1200 pixels with 48 bits per pixel (12 bits per component, RGBA). That’s about 68 billion colours!!! Pretty handy for film and video work. 8)
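
For anyone wondering where the 68 billion comes from, here is a quick arithmetic sketch in Python. It assumes the alpha channel is not counted as colour, so only the three 12-bit RGB components contribute; `colour_count` is just an illustrative helper name.

```python
def colour_count(bits_per_channel, colour_channels=3):
    """Distinct colours when each colour channel has the given bit depth."""
    return 2 ** (bits_per_channel * colour_channels)


# 12 bits each for R, G and B; the remaining 12 of the 48 bits are alpha.
print(colour_count(12))  # 68719476736

# A typical 8-bit-per-channel display, for comparison.
print(colour_count(8))   # 16777216
```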

If it isn’t obvious already, I do love my SGIs, but I am not blind to their faults. You can’t buy SGIs new (well, not unless you’ve got serious amounts of money), and if yours breaks it will be hard, if not impossible, to get it fixed. You can’t just stroll down to PC World and buy spares for them, and PC engineers will run off in terror if they have to work on one (SGIs do tend to be complex in the extreme. For example, the display system on my Onyx consists of 6 separate PCBs, each about a foot square - the geometry engine alone has 10 custom processors!!!) :slight_smile: SGI used their own type of memory, which can be hard to find, and the hard discs are various flavours of SCSI mounted on their own type of sled (which varies from model to model and is, like most SGI spare parts, fairly rare). Also, while there is a large amount of quality free software, there is nowhere near as much as there is for PCs.

AFAIK, the current status of the multithreading support is that it launches the scanline passes in two different threads so you’d only get half of the benefit of your quad processor system. It’s not architecture specific but it goes through the SDL thread support, so assuming that is developed correctly for your platform, it will work.

Martin

What are the hurdles the programmers are facing for an unspecified number of threads instead of being limited to 2? I was hoping it would be as simple as setting up a loop that creates new threads for each processor and assigning them a ‘border render’ to define what they should work on. Are they having to hard-code in each thread handler?
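
A rough sketch of that idea in Python, to make the question concrete: one worker per processor, each handed its own horizontal "border" of the frame. Everything here is hypothetical. `render_border` is a stand-in that just fills in dummy pixel values, and none of this reflects how Blender's renderer is actually organised.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_borders(height, parts):
    """Split rows 0..height into `parts` contiguous (start, end) strips."""
    base, extra = divmod(height, parts)
    borders, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        borders.append((start, end))
        start = end
    return borders


def render_border(width, border):
    """Hypothetical stand-in for a renderer: shade each pixel in the strip."""
    start, end = border
    return [[(x + y) % 256 for x in range(width)] for y in range(start, end)]


def render_frame(width, height, workers=None):
    """Render one frame by giving each worker thread its own border."""
    workers = workers or os.cpu_count() or 1
    borders = split_into_borders(height, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        strips = pool.map(lambda b: render_border(width, b), borders)
        # map() yields results in submission order, so strips stay top to bottom.
        return [row for strip in strips for row in strip]


image = render_frame(width=8, height=10, workers=4)
print(len(image), "rows rendered")
```

Note that in CPython, threads running pure Python code do not execute in parallel, so this shows the structure of the loop rather than a real speedup. It also sidesteps the hard part: any work that has to happen for the whole scene before the strips can be rendered independently.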

umm, rasterization is a good thing to split across processors

but what about the transformation of the vertices? until they are transformed we wouldn’t be able to quickly tell which ones are necessary to determine the triangles visible in a particular portion of the render

now, this isn’t an issue at all when raytracing, but for a scanline renderer I guess it would be if you wanted peak performance rendering a single frame.

I guess the best you might be able to hope for is to split each stage of rendering [animation’s deformations, transformation, lighting calculation [shadow buffers, radiosity], rasterization and raytracing] into separate threads that fill buffers used by the next step. Also, don’t render frames entirely sequentially [don’t finish rendering frame one before starting frame 2]… I guess ideally the os would be able to schedule this, but I think it is a bit extreme [and I don’t code this stuff so my ideas should be taken with 60 pounds or so of salt]
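
The staged idea above could be sketched roughly like this in Python: one thread per stage, with queues acting as the buffers between them, so frame 2 can enter the first stage while frame 1 is still further down the line. The stage functions are placeholders that only tag a string, and the names `stage`, `run_pipeline` and `DONE` are made up for this sketch.

```python
import queue
import threading

DONE = object()  # sentinel that tells the next stage to stop


def stage(work, inbox, outbox):
    """Run `work` on every item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)
            return
        outbox.put(work(item))


def run_pipeline(frames, stages):
    """Push frames through the stages, one thread per stage."""
    # Small bounded queues play the role of the buffers between stages.
    queues = [queue.Queue(maxsize=2) for _ in range(len(stages) + 1)]

    def feed():
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(DONE)

    threads = [threading.Thread(target=feed)]
    for i, work in enumerate(stages):
        threads.append(
            threading.Thread(target=stage, args=(work, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    # Drain the last buffer on the main thread until the sentinel arrives.
    results = []
    while True:
        item = queues[-1].get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


# Placeholder stages: each one just records that it ran.
stages = [
    lambda f: f + " > transform",
    lambda f: f + " > light",
    lambda f: f + " > rasterize",
]
print(run_pipeline(["frame 1", "frame 2", "frame 3"], stages))
```

Each stage handles frames in order, so the output order matches the input order. A pipeline like this is only as fast as its slowest stage, which is one reason it may not beat simply splitting the image.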

I have a Sun 2100z dual 2.4 Opteron (Nvidia Quadro 3000) and a PowerMac G5 dual 2.5 at work. The Sun box blows the Mac out of the water as far as rendering with Blender goes. I’m guessing it’s either the video card or Linux (not using as many resources).