Page 1 of 4
Results 1 to 20 of 63
  1. #1
    Marc Driftmeyer (Member) | Join Date: Sep 2014 | Location: Pacific Northwest WA | Posts: 45

    2.8 Branch update with Cycles CPU/GPGPU rendering together

    https://developer.blender.org/D2873

    CPU rendering will be restricted to a BVH2, which is not ideal for raytracing
    performance but can be shared with the GPU. Decoupled volume shading will be
    disabled to match GPU sampling.
    Thread priority or GPU sync tweaks are likely needed to improve performance,
    but might as well post the patch for testing already. Perfect scaling is not
    going to happen due to BVH2 usage though.
    Go to User Preferences > System to enable the CPU to render alongside the GPU.






  2. #2
    Looks promising. Let's see how it goes.



  3. #3
    Can we have a link to download the branch, please?



  4. #4
    Is this compatible with SMP (dual-Xeon platforms, for example)?



  5. #5
    Marc Driftmeyer (Member) | Join Date: Sep 2014 | Location: Pacific Northwest WA | Posts: 45
    It's been approved: https://developer.blender.org/D2873

    I have yet to see it when pulling from master. I don't build branches with specific diffs applied, so I don't know exactly how they branch off such changes for testing and merge them back. I'm not interested in brushing up on git merge and juggling several local copies.

    I would assume that once they recognize the CPU, its limitations would dictate the threads available for OpenCL-based CPU kernels to leverage.


        /* Fallback to standard device name API. */
        if(name.empty()) {
            name = get_device_name(device_id);
        }

        /* Distinguish from our native CPU device. */
        if(get_device_type(device_id) & CL_DEVICE_TYPE_CPU) {
            name += " (OpenCL)";
        }

        return name;
    }



  6. #6
    Will this be merged in current master? Or only 2.8?



  7. #7
    Marc Driftmeyer (Member) | Join Date: Sep 2014 | Location: Pacific Northwest WA | Posts: 45
    Originally Posted by juang3d
    Will this be merged in current master? Or only 2.8?
    I believe it is going into master first, since the source code being affected is in the 2.79/master branch. I'm seeing nothing in properties.py reflecting this feature.

    Even though it's gotten the green light, I still don't see it merged.



  8. #8
    Marc Driftmeyer (Member) | Join Date: Sep 2014 | Location: Pacific Northwest WA | Posts: 45
    From what I can tell, this is going into the next 2.79 revision but is still being refined at the moment.

    Code:
    /** \file BKE_blender_version.h
     *  \ingroup bke
     */
    
    /* these lines are grep'd, watch out for our not-so-awesome regex
     * and keep comment above the defines.
     * Use STRINGIFY() rather than defining with quotes */
    #define BLENDER_VERSION         279
    #define BLENDER_SUBVERSION      2
    /* Several breakages with 270, e.g. constraint deg vs rad */
    #define BLENDER_MINVERSION      270
    #define BLENDER_MINSUBVERSION   6
    
    /* used by packaging tools */
    /* can be left blank, otherwise a,b,c... etc with no quotes */
    #define BLENDER_VERSION_CHAR
    /* alpha/beta/rc/release, docs use this */
    #define BLENDER_VERSION_CYCLE   alpha
    
    extern char versionstr[]; /* from blender.c */
    
    #endif  /* __BKE_BLENDER_VERSION_H__ */
    Right now it's the following issue:

    Viewport rendering of the BMW scene from the official benchmark pack takes 12 seconds on a 1080 Ti, 20 seconds on a Vega 64, and 16 seconds using both. With F12 render it's the opposite: the Vega is faster at 82 seconds (at 128x128, best time), the 1080 Ti takes 93 seconds (at 16x16, best time), and both together take 44 seconds, using latest master with initial_num_samples at 5000.
    To sum up:

    • The viewport seems really slow in latest master. With OpenCL, 2.78c with selective node compilation for the viewport renders nearly 2x faster on Vega 64. It's not due to SSS or volumes, as those are not compiled into the viewport kernel either. I can investigate that.
    • Multi-device rendering is slower in viewport/progressive rendering than the fastest device alone. The logical expectation would be to wait for the slower half to finish, which would be around 10 seconds for Vega?
    I'm glad it's not going into 2.8 first. Let us bang on it and file bugs to test, and then later roll it up into the big future release.
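
    As a rough sanity check on the timings quoted above, here is a minimal Python sketch (not part of the patch or of Cycles) that compares two simple models: an equal split, where each device renders half of the work, and an ideal split, where the throughputs of the two devices simply add up. The function names are made up for illustration, and the F12 estimate is only approximate because the two best single-device times were measured at different tile sizes.

```python
def equal_split_time(t_a, t_b):
    """Each device renders exactly half of the work.

    The render is done when the slower device finishes its half.
    """
    return max(t_a, t_b) / 2.0


def ideal_combined_time(t_a, t_b):
    """Perfect load balancing: device throughputs (1 / time) add up."""
    return 1.0 / (1.0 / t_a + 1.0 / t_b)


# Single-device timings in seconds, taken from the post above.
viewport = {"1080 Ti": 12.0, "Vega 64": 20.0}
f12 = {"1080 Ti": 93.0, "Vega 64": 82.0}

for label, times in (("viewport", viewport), ("F12", f12)):
    t_a, t_b = times.values()
    print(label,
          "equal split: %.1f s," % equal_split_time(t_a, t_b),
          "ideal: %.1f s" % ideal_combined_time(t_a, t_b))
```

    Under these assumptions the reported F12 time of 44 seconds is close to the ideal estimate of about 43.6 seconds, while the 16-second viewport time is well above both the 10-second equal split and the 7.5-second ideal.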



  9. #9
    Member | Join Date: Dec 2007 | Location: Wroclaw, Poland | Posts: 327
    That is quite promising.. my system will finally get a full workout when rendering...
    https://blenderartists.org/forum/sho...69-UEA-Pelican
    2x E5-2687w :: 32GB :: 4x RX 480 :: SSD goodness



  10. #10
    Originally Posted by Grzesiek
    That is quite promising.. my system will finally get a full workout when rendering...
    Crossfit for computers.



  11. #11
    Originally Posted by Marc Driftmeyer
    From what I can tell, this is going into the next 2.79 revision
    Anything currently being committed to master has no guarantee whatsoever of ending up in 2.79a/b/c. When the time comes for 2.79a, we will look through all commits, cherry-pick the needed bug fixes, and transfer them over to the 2.79 branch. New features, and especially risky and/or compatibility-breaking changes, generally don't make the cut. But really, until we sit down and sift through the commits, nobody can know for sure.



  12. #12
    I just did a test with a BB build and the result is AMAZING!!

    In a test scene I got 9:18 for the GPU-only render (GTX 1080) and 5 minutes (I don't remember the seconds right now) for GPU+CPU... this is AWESOME!!!
    Around a 40% improvement.

    Here is the test scene rendered with GPU+CPU:
    CPU_GPU_result.jpg



  13. #13



  14. #14
    Member | Join Date: Oct 2013 | Location: Tellus | Posts: 367
    Originally Posted by razin
    Can we have a link to download the branch, please?
    It's here: https://builder.blender.org/download...113d-win64.zip (sorry for the double post)



  15. #15
    Member | Join Date: Oct 2013 | Location: Tellus | Posts: 367
    Hmm, if I am not mistaken, the point of combining CPU + GPU is not to render faster; technically the render time should be about the same. I think the combination is for saving memory. I often run out of memory with my 4GB card; with CPU + GPU combined this should no longer be a problem, and the render time should be the same as with GPU only.

    Correct me if I am wrong please.



  16. #16
    SunBurn (Member) | Join Date: Aug 2012 | Location: Greece | Posts: 229
    I think esimacio is right.

    The 2.79 GPU + CPU builds always give me slower results (sometimes close to GPU-only, but only if I set my tile size to 32x32).

    But on scenes where my GTX 970's memory limits a GPU-only render, I can go for all three (970, 1060 + i7) using CPU+GPU.

    Unfortunately, today I get weird pinkish results (only with CPU+GPU), but I'll investigate further.



  17. #17
    SterlingRoth (Member) | Join Date: Mar 2006 | Location: Portland, OR | Posts: 2,000
    I'm also getting a 40% speed boost on most scenes I throw at it, and most of those aren't very memory intensive.



  18. #18
    It's only faster with smaller tiles, and that is not optimal for AMD cards, which use big tiles.
    I can only give suggestions, personal opinions and constructive critique, but it is your decision what you do with it.



  19. #19
    Originally Posted by esimacio
    Hmm, if I am not mistaken, the point of combining CPU + GPU is not to render faster; technically the render time should be about the same. I think the combination is for saving memory. I often run out of memory with my 4GB card; with CPU + GPU combined this should no longer be a problem, and the render time should be the same as with GPU only.

    Correct me if I am wrong please.
    This is wrong; it's about improving render time only. You are still limited by the memory of the GPU, and solving that is a separate problem.

    As others explained, you need to use small tiles to get the render time reduction. We have done some optimizations to render small tiles faster and more are planned, with the goal of eventually removing the manual tile size setting entirely to better balance CPU and GPU work.
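
    To illustrate why tile size matters when a fast and a slow device share one render, here is a toy Python simulation. It assumes a shared tile queue where whichever device becomes free first takes the next tile; that is an assumption for illustration, not a description of how Cycles schedules work. The 4:1 speed ratio, the work units, and the function name are hypothetical, and the model ignores per-tile overhead and the fact that GPUs tend to be less efficient on small tiles.

```python
import heapq


def simulate_render(total_work, num_tiles, device_speeds):
    """Return the finish time of a render split into equal tiles.

    Greedy shared queue: the device that becomes free first takes the
    next tile. device_speeds are in work units per second (hypothetical).
    """
    tile_work = total_work / num_tiles
    # Min-heap of (time at which the device becomes free, device index).
    free_at = [(0.0, i) for i in range(len(device_speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_tiles):
        start, dev = heapq.heappop(free_at)
        done = start + tile_work / device_speeds[dev]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, dev))
    return finish


gpu, cpu = 4.0, 1.0  # hypothetical: GPU is four times faster than CPU
print("GPU only, 4 big tiles:    ", simulate_render(100.0, 4, [gpu]))
print("GPU+CPU, 4 big tiles:     ", simulate_render(100.0, 4, [gpu, cpu]))
print("GPU+CPU, 100 small tiles: ", simulate_render(100.0, 100, [gpu, cpu]))
```

    In this toy model the big-tile GPU+CPU run is no faster than the GPU alone, because the render has to wait for the CPU to finish its single large tile, while the small-tile run approaches the combined throughput of both devices.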



  20. #20
    Yep, around 40% speed boost for me too.

    Test scene with 1280 Samples:
    Only GPU - 8min 53sec
    GPU+CPU - 5min 29sec
    WIN 7
    750TI
    Xeon E3 1230 4x3.3GHz

    Tilesize Test:
    128 Samples
    64 Tiles = 39sec
    16 Tiles = 55sec

    Same scene, only changing samples to 1280:
    64 Tiles = 5min 45sec
    16 Tiles = 5min 29sec
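
    For reference, the "around 40%" figure can be checked from the times above with a short Python sketch. The helper names are made up for illustration. Note that a reduction in render time and a speedup factor are two different ways of stating the same result.

```python
def to_seconds(minutes, seconds):
    """Convert a 'Xmin Ysec' timing to seconds."""
    return minutes * 60 + seconds


def time_reduction(baseline_s, new_s):
    """Fraction of the baseline render time that was saved."""
    return (baseline_s - new_s) / baseline_s


gpu_only = to_seconds(8, 53)  # Only GPU, 1280 samples
gpu_cpu = to_seconds(5, 29)   # GPU+CPU, 1280 samples

print(f"time reduction: {time_reduction(gpu_only, gpu_cpu):.1%}")
print(f"speedup factor: {gpu_only / gpu_cpu:.2f}x")
```

    With these numbers the render time drops by roughly 38%, which corresponds to a speedup factor of about 1.62x.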


