Screen coordinates to World coordinates

I’m working on a drag and drop script for the game engine. We need it
to move objects around on the screen. I have handled all of the
collision math and logic for moving objects around but there is one
last thing that I haven’t been able to figure out…

How do we convert Screen (window) Coordinates to World Coordinates

And

How do we convert World Coordinates to Screen Coordinates

(No, I’m not interested in using a ray sensor. All of the movable
objects are read in from external libraries. We haven’t been able to
figure out how to attach a logic brick to them using a script)

I’ve been able to piece together most of it from Letter Rip’s code and
from the MouseOverSensor source code. The one thing I know I haven’t
been able to find is how to properly convert the depth coordinate into
a normalized screen coordinate.

I’ve included a simple script. Place a mouse click sensor on an object
and connect it to a python controller running this script. In game mode,
when you click on the screen, the script will print the conversion of
the screen coordinates to world coordinates, and of the world
coordinates (of the object) to screen coordinates.

There are two behaviors:

Orthogonal views work OK, except for the depth values. I think I just
need help with the normalized device coordinates to fix that.

Perspective works well at the center of the screen, and errors
increase the further away from the center of the screen you get.




import Blender
from Blender import Window, Mathutils
from Blender.Mathutils import *
import GameLogic
import Rasterizer
import math

#dehomogenizes a vector
def dehom(v):
    ret = Vector(v[0]/v[3],v[1]/v[3],v[2]/v[3])
    return ret

#takes the inverse of a matrix and returns it
def inverseMatrix(m):
    ret = Matrix([0,0,0,0],[0,0,0,0],[0,0,0,0],[0,0,0,0])
    det = m.determinant()

    for i in range(4):
        for j in range(4):
            temp = Matrix([0,0,0],[0,0,0],[0,0,0])
            col = 0
            for x in range(4):
                if x != i:
                    row = 0
                    for y in range(4):
                        if y != j:
                            temp[col][row] = m[x][y]
                            row = row +1
                    col = col +1
            tdet = temp.determinant()
            total = i+j
            if (total % 2):
                sign = -1
            else:
                sign = 1
                
            #i and j are flipped
            ret[j][i] = (sign * tdet) / det
    return ret



#move a 4x4 list into a matrix
def M2M(M):
    return (Matrix(M[0],M[1],M[2],M[3]))

#move a 3x3 list into a matrix
def M2MD3(M):
    return (Matrix(M[0],M[1],M[2]))


#transforms the world cordinates to screen (window) coordinates
def transWC2SC(wc_x,wc_y,wc_z):

  #clip taken from letter rip's code--have to assume that the vals are ok
  for win3d in Window.GetScreenInfo(Window.Types.VIEW3D):
      # we search all 3dwins for the one containing the point
      #(screen_x, screen_y) (could be the mousecoords for example) 
      win_min_x, win_min_y, win_max_x, win_max_y = win3d['vertices']
      # calculate a few geometric extents for this window
      
      mid_x = (win_max_x + win_min_x)/2.0
      mid_y = (win_max_y + win_min_y)/2.0
      width = (win_max_x - win_min_x + 1.0)
      height = (win_max_y - win_min_y + 1.0)
      
  pm = Window.GetPerspMatrix()
  coords = Vector(wc_x,wc_y,wc_z,1.0)
  #print "object location coords", coords

  val = coords*pm    

  #now dehomogenize
  val = dehom(val)

  #convert from Normalized Device Coordinates to screen coordinates
  val[0] = mid_x+ val[0]*width/2.0
  val[1] = mid_y + val[1]*height/2.0
  val[2] =  (val[2]+1)/2 #This isn't right, but it's the only thing I've
                         #seen that seems to work sometimes
  return val


def transSC2WC(screen_x,screen_y,screen_z):

  found = 0
  #clip taken from letter rips code
  for win3d in Window.GetScreenInfo(Window.Types.VIEW3D):
      # we search all 3dwins for the one containing the point
      #(screen_x, screen_y) (could be the mousecoords for example) 
      win_min_x, win_min_y, win_max_x, win_max_y = win3d['vertices']
      # calculate a few geometric extents for this window
      
      mid_x = (win_max_x + win_min_x)/2.0
      mid_y = (win_max_y + win_min_y)/2.0
      width = (win_max_x - win_min_x + 1.0)
      height = (win_max_y - win_min_y + 1.0)
      
      # check if screencoords (screen_x, screen_y) are within the 3dwin 
      if (win_max_x > screen_x > win_min_x) and (  win_max_y > screen_y > win_min_y):
          found = 1
          break
  if(not found):
      print "Not Found!"
      return 0, 0, 0


  NDC = Vector( 2* ( screen_x - mid_x)/ width,
        2*(screen_y - mid_y)/height,
        2*screen_z - 1,         #can't believe this seems to work
        1.0)

  persp = Window.GetPerspMatrix()
  invpersp = inverseMatrix(persp)

  coords = NDC


  newcoords = coords*invpersp #This works for orthogonal!
  newcoords = dehom(newcoords)

  return newcoords


#found this in the documentation, using it to find the depth
#for converting screen coordinates to window coordinates
def getObjDepth(pos):
      depth =         pos[0]*cam.world_to_camera[2][0]
      depth = depth + pos[1]*cam.world_to_camera[2][1]
      depth = depth + pos[2]*cam.world_to_camera[2][2]
      depth = depth + cam.world_to_camera[2][3]
      return depth

#distance between two points (in world coordinates)
def distanceCamToObj(pos,cpos):
    dist = [cpos[0]-pos[0],cpos[1]-pos[1],cpos[2]-pos[2]]
    ret = dist[0]*dist[0]+dist[1]*dist[1]+dist[2]*dist[2]
    return math.sqrt(ret)



def main():

    
    screen_x, screen_y=Window.GetMouseCoords()
    pos = owner.getPosition()
    depth = getObjDepth(pos)
    print "\nScreen Coords", screen_x, screen_y, -depth
    print "Translate Position (World) to Screen Coords"
    print transWC2SC(pos[0], pos[1], pos[2]), "\n"

    print "Position Coords",pos
    print "Translate Screen Coords to World Coords"
    print transSC2WC(screen_x,screen_y,-depth)




#define globals
controller=GameLogic.getCurrentController()
owner = controller.getOwner()
scene = GameLogic.getCurrentScene()
cam = scene.active_camera


#Initialize the script
try:
    owner.init
except AttributeError:
    owner.init = 1    
    Rasterizer.showMouse(1)

main()



I plan on placing the result on the forums for the next person who
asks this question. Let’s put this one to rest!

Centre Prof

I added a function modified from LetterRip’s to Blender 2.42’s Python script bundle. It’s in BPyWindow, and it does exactly what you want. In BPyMesh there are some functions that intersect the ray with a mesh object to get the face and the intersection point.

This is it (not sure if I added it to CVS):
BPyWindow.py


import Blender 
from Blender import Mathutils, Window, Scene, Draw, Mesh 
from Blender.Mathutils import CrossVecs, Matrix, Vector, Intersect, LineIntersect 
 
 
# DESCRIPTION: 
# screen_x, screen_y - the origin point of the pick ray; either the mouse 
#     location or, if useMid is True, the midpoint of the current 3d view 
# localMatrix - pass this if you want the returned values in an object's 
#     local space (useful when dealing with an object's data such as verts) 
# returns: 
# Origin - the origin point of the pick ray 
# Direction - the direction vector of the pick ray 
# both in global coordinates 
epsilon = 1e-3 # just a small value to account for floating point errors 
 
def mouseViewRay(screen_x, screen_y, localMatrix=None, useMid = False): 
     
    # Constant function variables 
    p = mouseViewRay.p 
    d = mouseViewRay.d 
     
    for win3d in Window.GetScreenInfo(Window.Types.VIEW3D): # we search all 3dwins for the one containing the point (screen_x, screen_y) (could be the mousecoords for example)  
        win_min_x, win_min_y, win_max_x, win_max_y = win3d['vertices'] 
        # calculate a few geometric extents for this window 
 
        win_mid_x  = (win_max_x + win_min_x + 1.0) * 0.5 
        win_mid_y  = (win_max_y + win_min_y + 1.0) * 0.5 
        win_size_x = (win_max_x - win_min_x + 1.0) * 0.5 
        win_size_y = (win_max_y - win_min_y + 1.0) * 0.5 
 
        #useMid is for projecting the coordinates when we subdivide the screen into bins 
        if useMid: # == True 
            screen_x = win_mid_x 
            screen_y = win_mid_y 
         
        # if the given screencoords (screen_x, screen_y) are within the 3dwin we found the right one... 
        if (win_max_x > screen_x > win_min_x) and (  win_max_y > screen_y > win_min_y): 
            # first we handle all pending events for this window (otherwise the matrices might come out wrong) 
            Window.QHandle(win3d['id']) 
             
            # now we get a few matrices for our window... 
            # sorry - i cannot explain here what they all do 
            # - if you're not familiar with all those matrices take a look at an introduction to OpenGL... 
            pm    = Window.GetPerspMatrix()   # the perspective matrix 
            pmi  = Matrix(pm); pmi.invert() # the inverted perspective matrix 
             
            if (1.0 - epsilon < pmi[3][3] < 1.0 + epsilon): 
                # pmi[3][3] is 1.0 if the 3dwin is in ortho-projection mode (toggled with numpad 5) 
                hms = mouseViewRay.hms 
                ortho_d = mouseViewRay.ortho_d 
                 
                # ortho mode is a bit strange - actually there's no definite location of the camera ... 
                # but the camera could be displaced anywhere along the viewing direction. 
                 
                ortho_d.x, ortho_d.y, ortho_d.z = Window.GetViewVector() 
                ortho_d.w = 0 
                 
                # all rays are parallel in ortho mode - so the direction vector is simply the viewing direction 
                #hms.x, hms.y, hms.z, hms.w = (screen_x-win_mid_x) /win_size_x, (screen_y-win_mid_y) / win_size_y, 0.0, 1.0 
                hms[:] = (screen_x-win_mid_x) /win_size_x, (screen_y-win_mid_y) / win_size_y, 0.0, 1.0 
                 
                # these are the homogeneous screencoords of the point (screen_x, screen_y) ranging from -1 to +1 
                p=(hms*pmi) + (1000*ortho_d) 
                p.resize3D() 
                d[:] = ortho_d[:3] 
                 
 
            # Finally we shift the position far away in
            # the viewing direction to make sure the camera is outside the scene
            # (this is actually a hack because this function 
            # is used in sculpt_mesh to initialize backface culling...) 
            else: 
                # PERSPECTIVE MODE: here everything is well defined - all rays converge at the camera's location 
                vmi  = Matrix(Window.GetViewMatrix()); vmi.invert() # the inverse viewing matrix 
                fp = mouseViewRay.fp 
                 
                dx = pm[3][3] * (((screen_x-win_min_x)/win_size_x)-1.0) - pm[3][0] 
                dy = pm[3][3] * (((screen_y-win_min_y)/win_size_y)-1.0) - pm[3][1] 
                 
                fp[:] = \ 
                pmi[0][0]*dx+pmi[1][0]*dy,\ 
                pmi[0][1]*dx+pmi[1][1]*dy,\ 
                pmi[0][2]*dx+pmi[1][2]*dy 
                 
                # fp is a global 3dpoint obtained from "unprojecting" the screenspace-point (screen_x, screen_y) 
                #- figuring out how to calculate this took me quite some time. 
                # The calculation of dxy and fp are simplified versions of my original code 
                #- so it's almost impossible to explain what's going on geometrically... sorry 
                 
                p[:] = vmi[3][:3] 
                 
                # the camera's location in global 3dcoords can be read directly from the inverted viewmatrix 
                #d.x, d.y, d.z =normalize_v3(sub_v3v3(p, fp)) 
                d[:] = p.x-fp.x, p.y-fp.y, p.z-fp.z 
                 
                #print 'd', d, 'p', p, 'fp', fp 
                 
             
            # the direction vector is simply the difference vector from the virtual camera's position 
            #to the unprojected (screenspace) point fp 
             
            # Do we want to return a direction in object's localspace? 
             
            if localMatrix: 
                localInvMatrix = Matrix(localMatrix) 
                localInvMatrix.invert() 
                p = p*localInvMatrix 
                d = d*localInvMatrix # normalize_v3 
                p.x += localInvMatrix[3][0] 
                p.y += localInvMatrix[3][1] 
                p.z += localInvMatrix[3][2] 
                 
            #else: # Worldspace, do nothing 
             
            d.normalize() 
            return True, p, d # Origin, Direction     
     
    # Mouse is not in any view, return None. 
    return False, None, None 
 
# Constant function variables 
mouseViewRay.d = Vector(0,0,0) # Perspective, 3d 
mouseViewRay.p = Vector(0,0,0) 
mouseViewRay.fp = Vector(0,0,0) 
 
mouseViewRay.hms = Vector(0,0,0,0) # ortho only 4d 
mouseViewRay.ortho_d = Vector(0,0,0,0) # ortho only 4d 
 
def spaceRect():
    '''
    Returns the space rect
    xmin,ymin,width,height
    '''
    
    __UI_RECT__ = Blender.BGL.Buffer(Blender.BGL.GL_FLOAT, 4)
    Blender.BGL.glGetFloatv(Blender.BGL.GL_SCISSOR_BOX, __UI_RECT__) 
    __UI_RECT__ = __UI_RECT__.list
    __UI_RECT__ = int(__UI_RECT__[0]), int(__UI_RECT__[1]), int(__UI_RECT__[2])-1, int(__UI_RECT__[3]) 
    
    return __UI_RECT__

def mouseRelativeLoc2d(__UI_RECT__= None):
    if not __UI_RECT__:
        __UI_RECT__ = spaceRect()
    
    mco = Window.GetMouseCoords()
    if    mco[0] > __UI_RECT__[0] and\
    mco[1] > __UI_RECT__[1] and\
    mco[0] < __UI_RECT__[0] + __UI_RECT__[2] and\
    mco[1] < __UI_RECT__[1] + __UI_RECT__[3]:
    
        return (mco[0] - __UI_RECT__[0], mco[1] - __UI_RECT__[1])
        
    else:
        return None

Cambo,

    Thanks for the response.    (You have an impressive collection of scripting work.)






This is the code that I've been studying for the past three weeks. It is literally the only thing out there.

This is listed in the cookbook as the way to find mouse coordinates, but that is slightly deceptive: this code finds a ray and a starting point. As you mentioned, you can then use ray casting to determine where the ray would hit. I’ve had a lot of difficulty getting the math to work from that code, and it is documented that it has been obfuscated for optimization purposes.

I feel that there are three reasons why we need to work out how to convert these values directly, using matrix math.

  1. As far as I can see there is no documentation on it. There are several matrices and no road map for how they fit together. Is there a difference between the matrices returned by calling:

getPerspMatrix, getWorldToCameraMatrix, getProjectionMatrix, getViewMatrix?

I’ve not been able to find any information on how the normalized Z screen coordinate is generated (even in the world->screen scenario). I feel it would be very good for the community if these things could be defined, in order to empower users with knowledge of what these functions do. This seems like a good time and place to do that.

  2. My script will run solely in the game engine. The Blender API is very nice and has all the hooks you could imagine; I do not doubt that anything you can accomplish by hand could also be programmed with a Python script.

But the game engine API is much less mature. There are parallel structures with limited functionality. It is not possible (please correct me if I’m wrong) to create a logic brick using only Python. A large part of our program relies on generating objects from a file, objects that will later have to react to the environment. Without logic bricks, many of the game engine’s capabilities are lost to us. (Please don’t read that as ranting; think of it as a feature request. Once again, if I’m wrong please tell me, it would make my life much easier:)

One of these limitations is that the object, along with its child mesh, is no longer updated; rather, the game_object with its KX_MeshProxy is updated. I could not find any ray cast/sensor functions that I can call (without a sensor logic brick). In fact the only information you can get out of the KX_Mesh is vertex locations (no edge data).

  3. Probably the silliest reason of all. I’ve developed an algorithm that has our computational geometrician interested. I would like to show a real-world application of this technology. (As I’m an academic, I also want to get a paper out of it with my student.)

I plan to release a working algorithm to the Blender community; it seems like it would be useful not only for what it does, but for helping people do things that are not currently possible.


If you can’t help me off the top of your head, could you direct me to the point in the Blender code where a world coordinate is transformed into a screen coordinate? If I can get the code for one direction, I can work out the other direction (I’ve done it before).

Centre Prof

Hi Cambo,
I looked at the OpenGL code and finally figured it all out. I’ll post code tomorrow. Thanks for your help.

Centre Prof

Blender (as with most graphics programs) uses matrix multiplication
to convert the coordinates of an object (local space) to coordinates
on the screen (window coordinates). Blender’s evolution has created
many matrices that are not defined in the API. I hope that this
thread will become a home for documentation on these matrices as
people experiment with them. (This information is valid for
Blender 2.41.)

If you are not familiar with matrix math, or with how matrices are
used to move between different “spaces”, I would recommend this
article about how OpenGL does these things:

http://trac.bookofhook.com/bookofhook/trac.cgi/wiki/MousePicking

The terminology used by Blender is different for the different spaces:

Local -> World -> View -> Projection -> NDC -> Screen

One of the frustrating things is that the commands to get these
matrices differ depending on whether you are using the standard
Blender API or the GameEngine API. Below we will cover which
matrices to use to get through each of the spaces.

Convert to Homogeneous coordinates
To convert X,Y,Z to homogeneous coordinates, simply change the
3D vector to a 4D vector with a 1 as the 4th element. This
allows the matrix multiplication to account for translation effects.
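Here is a small, Blender-free sketch of that trick in plain Python. Note that the scripts in this thread multiply the vector on the left (coords*pm), which is the row-vector convention; the sketch below uses the textbook column-vector convention, where the translation sits in the last column.

```python
# A 4x4 matrix multiply can carry translation because the extra 1 in w
# lets the last column reach the point; a 3x3 multiply cannot do this.
def to_hom(v):
    # (x, y, z) -> (x, y, z, 1)
    return (v[0], v[1], v[2], 1.0)

def mat_vec(m, v):
    # multiply a 4x4 row-major matrix by a 4-vector (column-vector convention)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

# a pure translation by (5, -2, 3): it only affects the point via the 1 in w
T = [[1, 0, 0, 5],
     [0, 1, 0, -2],
     [0, 0, 1, 3],
     [0, 0, 0, 1]]

moved = mat_vec(T, to_hom((1.0, 1.0, 1.0)))   # (6.0, -1.0, 4.0, 1.0)
```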

Local->World
Blender: use object.getMatrix() (convert X,Y,Z coordinates to homogeneous)

   GameEngine: No matrix exists.  You will have to make your own
   Use KX_Object.getOrientation(),KX_Object.getPosition() and
   KX_Object.scaling to create a matrix that can transform the
   local coordinates to World coordinates
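To make the GameEngine branch concrete, here is a minimal plain-Python sketch of assembling such a matrix. The helper names (local_to_world_matrix, transform) and the sample values are mine, not part of any API; in the BGE you would feed in the results of getOrientation(), .scaling and getPosition().

```python
# Hypothetical sketch: build a local->world 4x4 from a 3x3 orientation,
# a per-axis scale, and a position (column-vector convention).
def local_to_world_matrix(rot, scale, pos):
    # upper-left 3x3 = scaled orientation (column c scaled by scale[c]),
    # last column = translation, bottom row = (0, 0, 0, 1)
    m = [[rot[r][c] * scale[c] for c in range(3)] + [pos[r]] for r in range(3)]
    m.append([0.0, 0.0, 0.0, 1.0])
    return m

def transform(m, v):
    v = (v[0], v[1], v[2], 1.0)  # convert to homogeneous first
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(3))

# identity orientation, uniform scale 2, object located at (1, 0, 0)
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
M = local_to_world_matrix(I3, (2, 2, 2), (1.0, 0.0, 0.0))
world = transform(M, (1.0, 1.0, 1.0))   # (3.0, 2.0, 2.0)
```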

World->Projection
This works the same for both systems. Use
Window.GetPerspMatrix(). This matrix is actually the product
of the matrices used to convert from World to View and from View
to Projection. There are other matrices found in the Window
and KX_Camera objects, but you can ignore them.

Projection->NDC
This works the same in both systems. You now need to
de-homogenize your coordinates (from 4D to 3D): simply divide
the first 3 coordinates by the 4th coordinate.
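That step is small enough to show in isolation (plain Python, no Blender needed):

```python
# the de-homogenize step: divide x, y, z by w to get back to 3D
def dehom(v):
    w = v[3]
    return (v[0] / w, v[1] / w, v[2] / w)

ndc = dehom((2.0, -4.0, 1.0, 2.0))   # (1.0, -2.0, 0.5)
```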

NDC->Screen
This works the same in both systems. Screen values are
measured in pixels.

WinX = screen_mid_x+ NDC.x*screen_width/2.0
WinY = screen_mid_y + NDC.y*screen_height/2.0    
WinZ =  (NDC.z+1)/2
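Wrapped up as a function (the 640x480 window extents below are made-up sample values; in the scripts in this thread they come from Window.GetScreenInfo()):

```python
# the three NDC->Screen formulas as one function
def ndc_to_screen(ndc, mid_x, mid_y, width, height):
    win_x = mid_x + ndc[0] * width / 2.0
    win_y = mid_y + ndc[1] * height / 2.0
    win_z = (ndc[2] + 1.0) / 2.0   # depth: [-1, 1] -> [0, 1]
    return (win_x, win_y, win_z)

# a made-up 640x480 view with its lower-left corner at (0, 0)
screen = ndc_to_screen((0.5, -0.5, 0.0), 320.0, 240.0, 640.0, 480.0)
# screen == (480.0, 120.0, 0.5)
```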

There you have it. When you do this in reverse there are two things
that you will have to remember.

 1)You have to start by finding the screen depth.  This can be
   done using a BGL.glReadPixels() call.  (See example below)

 2)You will have to take the inverse of the matrices to convert
   the coordinates in the other direction.
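The screen->NDC half is just the three NDC->Screen formulas inverted; a quick round trip (with made-up window extents again) is a handy sanity check:

```python
# both directions of the NDC <-> Screen mapping, side by side
def ndc_to_screen(ndc, mid_x, mid_y, width, height):
    return (mid_x + ndc[0] * width / 2.0,
            mid_y + ndc[1] * height / 2.0,
            (ndc[2] + 1.0) / 2.0)

def screen_to_ndc(sc, mid_x, mid_y, width, height):
    # each line inverts the corresponding line above
    return (2.0 * (sc[0] - mid_x) / width,
            2.0 * (sc[1] - mid_y) / height,
            2.0 * sc[2] - 1.0)   # depth: [0, 1] back to [-1, 1]

win = (320.0, 240.0, 640.0, 480.0)
back = screen_to_ndc(ndc_to_screen((0.25, 0.75, -0.5), *win), *win)
# back == (0.25, 0.75, -0.5)
```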

Below are two pieces of example code to convert World Coordinates to
screen coordinates


#dehomogenizes a vector
def dehom(v):
    ret = Vector(v[0]/v[3],v[1]/v[3],v[2]/v[3])
    return ret

#transforms the world coordinates to screen (window) coordinates
def transWC2SC(wc_x,wc_y,wc_z):

  #clip taken from letter rip's code
  #Unless we can pass in screen coords to check that we have the right window,
  #we will have to assume that the vals here are ok


  for win3d in Window.GetScreenInfo(Window.Types.VIEW3D):
      # we search all 3dwins for the one containing the point
      #(screen_x, screen_y) (could be the mousecoords for example) 
      win_min_x, win_min_y, win_max_x, win_max_y = win3d['vertices']
      # calculate a few geometric extents for this window
      
      mid_x = (win_max_x + win_min_x)/2.0
      mid_y = (win_max_y + win_min_y)/2.0
      width = (win_max_x - win_min_x + 1.0)
      height = (win_max_y - win_min_y + 1.0)
      
  pm = Window.GetPerspMatrix()
  coords = Vector(wc_x,wc_y,wc_z,1.0)

  #order is important!
  val = coords*pm

  #now dehomogenize
  val = dehom(val)

  #convert from Normalized Device Coordinates to screen coordinates
  val[0] = mid_x+ val[0]*width/2.0
  val[1] = mid_y + val[1]*height/2.0
  val[2] =  (val[2]+1)/2 #this info was hard to find,
                         #converts [-1,1] to [0,1]
                         
  return val



And Screen (window) coordinates to World Coordinates


#dehomogenizes a vector
def dehom(v):
    ret = Vector(v[0]/v[3],v[1]/v[3],v[2]/v[3])
    return ret

#takes the inverse of a matrix and returns it
def inverseMatrix(m):
    ret = Matrix([0,0,0,0],[0,0,0,0],[0,0,0,0],[0,0,0,0])
    det = m.determinant()

    for i in range(4):
        for j in range(4):
            temp = Matrix([0,0,0],[0,0,0],[0,0,0])
            col = 0
            for x in range(4):
                if x != i:
                    row = 0
                    for y in range(4):
                        if y != j:
                            temp[col][row] = m[x][y]
                            row = row +1
                    col = col +1
            tdet = temp.determinant()
            total = i+j
            if (total % 2):
                sign = -1
            else:
                sign = 1
                
            #i and j are flipped
            ret[j][i] = (sign * tdet) / det
    return ret

#grabs the depth of a pixel from the depth buffer
def getPixelDepth(x,y):

    z = BGL.Buffer(BGL.GL_FLOAT, [1])

    BGL.glReadPixels(x, y, 1, 1,
              BGL.GL_DEPTH_COMPONENT, BGL.GL_FLOAT, z)

    print "value from depth buffer is", z
    return z[0]

#screen_x and screen_y are from Window.GetMouseCoords()
#screen_z is from a call to getPixelDepth(screen_x,screen_y)
def transSC2WC(screen_x,screen_y,screen_z):

  found = 0;
  #clip taken from letter rips code
  for win3d in Window.GetScreenInfo(Window.Types.VIEW3D):
      # we search all 3dwins for the one containing the point
      #(screen_x, screen_y) (could be the mousecoords for example) 
      win_min_x, win_min_y, win_max_x, win_max_y = win3d['vertices']
      # calculate a few geometric extents for this window
      
      mid_x = (win_max_x + win_min_x)/2.0
      mid_y = (win_max_y + win_min_y)/2.0
      width = (win_max_x - win_min_x + 1.0)
      height = (win_max_y - win_min_y + 1.0)
      
      # check if screencoords (screen_x, screen_y) are within the 3dwin 
      if (win_max_x > screen_x > win_min_x) and (  win_max_y > screen_y > win_min_y):
          found = 1
          break
  if(not found):
      print "Not Found!"
      return 0, 0, 0


  coords = Vector( 2* ( screen_x - mid_x)/ width,
        2*(screen_y - mid_y)/height,
        2*screen_z - 1,         
        1.0)

  persp = Window.GetPerspMatrix()
  invpersp = inverseMatrix(persp)

  newcoords = coords*invpersp 
  newcoords = dehom(newcoords)

  return newcoords



I hope this helps.

Centre Prof

Hi Centre Prof

I’m working on my bachelor thesis and facing a problem very similar to the one solved by your example code, so I’d like to know if I may incorporate that code into my project (with proper attribution, of course) and adapt it to fit my needs. Your explanation of which matrices represent the transformations between which Blender coordinate spaces has already helped me a lot; many thanks for posting it.

Is the mentioned paper available already? It’d be interesting to see what you are/were working on there.

das-g

Update for 2.5x
http://www.blender.org/documentation/blender_python_api_2_59_release/bpy_extras.view3d_utils.html
For code that may be edited for use in the BGE see: https://svn.blender.org/svnroot/bf-blender/trunk/blender/release/scripts/modules/bpy_extras/view3d_utils.py

For the BGE’s camera, can you use this to get a worldspace vector and convert it to a coordinate?
http://www.blender.org/documentation/blender_python_api_2_59_release/bge.types.html#bge.types.KX_Camera.getScreenVect
These may be useful too
http://www.blender.org/documentation/blender_python_api_2_59_release/bge.types.html#bge.types.KX_Camera.getScreenRay
http://www.blender.org/documentation/blender_python_api_2_59_release/bge.types.html#bge.types.KX_Camera.getScreenPosition