Kinect 3D Scanner development

Hi Kai,
are you still working on this?
Nothing has been heard in this thread for several days. I hope you didn’t get frustrated by it?

Yes, of course. I spent half of the week developing a structure-aware smoothing filter, which kind of works now. The noise resembles the speckle light pattern, which is probably the result of the resolution limit of the IR cam of 1600x1200. The higher the resolution, the better the depth precision would be. I have also started to implement a scanning method that uses the motor to move the camera while scanning. That’s also an attempt to fight the noise by doing subsampling while shifting the pattern along the surface. Unfortunately it’s hard to control accurately.
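The filter itself isn’t described in this thread, so the following is not Kai’s method. Purely as an illustration of what "structure-aware" smoothing can mean, here is a minimal sketch of a bilateral filter, one common edge-preserving technique. It assumes the depth image is a 2D array in meters with 0 marking invalid pixels; the function name, parameter names and default values are all hypothetical.

```python
import numpy as np

def bilateral_depth_filter(depth, radius=2, sigma_space=1.5, sigma_depth=0.03):
    """Edge-preserving smoothing of a depth image (illustrative sketch).

    depth: 2D array, assumed to be in meters, with 0 meaning "no reading".
    Neighbors contribute less the farther away they are in the image
    (sigma_space) and the more their depth differs from the center pixel
    (sigma_depth), so depth discontinuities are not blurred.
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    padded = np.pad(depth, radius, mode="edge")
    acc = np.zeros_like(depth)
    weight_sum = np.zeros_like(depth)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbor image shifted by (dy, dx), same shape as depth.
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            w_space = np.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
            w_depth = np.exp(-((shifted - depth) ** 2) / (2 * sigma_depth ** 2))
            # Invalid neighbors (0) get zero weight.
            wgt = w_space * w_depth * (shifted > 0)
            acc += wgt * shifted
            weight_sum += wgt
    out = np.divide(acc, weight_sum, out=np.zeros_like(depth),
                    where=weight_sum > 0)
    out[depth <= 0] = 0.0  # keep invalid pixels invalid
    return out
```

A smaller `sigma_depth` keeps edges sharper but removes less noise; the right value would depend on the actual noise level of the device at a given distance.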

Also RGB image alignment isn’t perfect yet, I’ll have to work on that too.

I hope I can post some images soon. That all takes longer than I thought.

Oh, and thank you to the few people who dared to donate. That really helps me to focus on this.

Kai, what about this idea:
You could use a tracking library like OpenCV or libmv to track points inside the color image. Thanks to the Kinect depth information, you can much more easily estimate how the points moved, and thus reconstruct the camera movement, e.g. if the user takes the Kinect in his hand and moves it around in space. Then you could much more easily connect the resulting meshes from each frame of the Kinect capture and build a mesh. It would look a bit like this video:

where the robot maps the space. Some part-fitting algorithm could also be useful to put the single frames together, e.g. the MeshLab alignment system (sample images here:). With such an algorithm, you wouldn’t actually need to track the camera position at all, since you could fit frames to each other really easily based on a few tracked color points, and the algorithm itself would provide the merging and averaging (smoothing) of the results.
Also, thanks for starting this development, I believe it will be something awesome soon.
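To make the idea above a bit more concrete: once a few tracked color points have a depth value in two frames, the camera movement between those frames can be estimated as a rigid transform. The sketch below uses the standard Kabsch (SVD) least-squares fit. It assumes the points are already matched between frames (the tracking itself is not shown), a simple pinhole camera model, and hypothetical intrinsics `fx`, `fy`, `cx`, `cy` that would have to come from a calibration. The function names are made up for illustration.

```python
import numpy as np

def backproject(u, v, z, fx, fy, cx, cy):
    """Turn pixel (u, v) with depth z into a 3D point (pinhole model).

    fx, fy, cx, cy are camera intrinsics (assumed known from calibration).
    """
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~ R @ src + t.

    src, dst: (N, 3) arrays of matched 3D points from two frames,
    with N >= 3 and the points not all on one line.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```

With real data the matches would contain outliers and depth noise, so in practice this fit would probably be wrapped in something like RANSAC and chained frame to frame, which is also where drift would accumulate.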

Hi there,

I specifically joined this forum because of this thread. Just wanted to point out that researchers would like to see an open-source, transportable, scalable 3D scanning device with a high spatial resolution (< 0.1 cm or better) and a reasonable temporal resolution (< 5 min).

I can see the Kinect is not this device, at least not yet. I tried to follow different threads on the net, but I thought I might just point out the specific interest here. Developers: do you think this is possible with the Kinect? If not, can you state why? I came across some posts on newsgroups stating that a combination of a stepper motor (like in our Breuckmann system) and a Kinect might be an idea. I also saw posts on using multiple devices at one time. Might this allow spatial interpolation for higher resolutions?

Background: in our department, we use a Microscribe System to get 3d point models for statistical analysis. We tried a Breuckmann 3D scanner, but we had the trouble that our objects literally wilt while scanning them. Nevertheless, we will use it again for specific questions. (There’s a master student working on this, I just happen to observe the results.)


OK, I think it’s time to post some images. I’m showing only geometry here, since the most complex part is to generate a good mesh. I used per vertex ambient occlusion from Meshlab to quickly produce nicely shaded results. The first two images are showing an old tiled stove. One as raw data and one with denoising filter active as comparison. All others have the filter on as well. This is all direct output from the scanner software.

I see problems with the precision everywhere. Even with good tracking, it’ll start to drift away each second. So the longer you scan, the more inaccurate it becomes.

As you can see on my images, the spatial resolution isn’t even close to < 0.1 cm. I don’t see how that can be improved much further without redesigning the Kinect device itself. The effective depth resolution can’t be higher than the point count of the structured light speckle pattern, which is considerably less than 640x480 even.

Anyway, the resolution is better than that of any TOF camera I have seen so far. But it surely isn’t on par with any professional laser scanner. For 150,- € however it’s a great piece of technology. 3d scanning for the masses so to speak.

very fun stuff! :slight_smile:

Kai, I really don’t know much about Kinect so this may make no sense… just an idea: would it be possible to use physical objects as a reference and then be able to move the Kinect device closer to the objects to gradually increase the resolution of the result?

For instance, put a few physical boxes, or just rectangles printed on paper, on a table, with the object we want to scan in the middle, and then move the Kinect around the object. That way the object would gradually gain resolution as points are added relative to the rectangles on the paper.

Something similar to augmented reality.

Eclectiel, it’s maybe a bit similar to what I proposed.

Kai, I think the aligning algorithm I talked about could also fix the drift caused by the low precision. I guess you could already take several scans from different sides and merge them manually in MeshLab with this algorithm. Sorry for just talking and not trying it myself, I don’t have a Kinect…

You’re right, sorry for that, I joined the thread after seeing the posted images. Should’ve checked before posting.

One thing I see differently (correct me if I’m wrong) is that the tracking I propose would be for gradually adding more detail, and your suggestion is more about joining the results (?).

Maybe a combination of both would be interesting: track the camera position and orientation, and then move the Kinect like some kind of “watering can”, gathering points. As you get closer to an object you can gather more points on its smaller features and increase the resolution of the result.

mh could you do the scanning with your face?

the resolution might not be the best (currently) but I can see this as a great tool
to capture basic proportions and model over them instead of using 2D blueprints.

cool job man kai


Great stuff Kai, as usual. Is there any script or something to try? Thanks.

Hi Kai, I have been thinking further about ways to do better scanning, and have come up with some more simple ideas:

- For static shots, you can average multiple images (quite a stupid, obvious idea…).
- You could test whether the Kinect device has a stable deviation at some points of the image. If you point the Kinect at a flat surface at a certain distance and average many frames, you could get a surface that would show whether there are stable imprecisions in the IR field/IR camera scanning (let’s call this the correction surface). Then you could compute a coefficient for each pixel in the 3D image like this:

  coef = correct flat surface distance at pixel / correction surface distance at pixel

  This could then be used to get better results. With many averaged frames you could maybe neutralize the noise a bit, using the following formula:

  vertexdistance = vertexdistance * coef

- A further thing to do would be to measure this coefficient at several distances, then average between them.

Still don’t have a Kinect, so these are just ideas.
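The averaging and the per-pixel coefficient described above can be sketched as follows. This is only a minimal illustration under a few assumptions that are not from the thread: depth frames are 2D arrays of distances along the camera axis, a value of 0 marks a pixel with no reading, and the flat surface is perpendicular to the camera so its true distance is the same for every pixel. All function names are hypothetical.

```python
import numpy as np

def average_depth(frames):
    """Per-pixel mean over several frames of a static scene.

    Readings of 0 are treated as invalid (an assumption) and skipped.
    Pixels that are invalid in every frame stay 0.
    """
    stack = np.asarray(frames, dtype=float)
    valid = stack > 0
    count = valid.sum(axis=0)
    total = np.where(valid, stack, 0.0).sum(axis=0)
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

def correction_coefficients(flat_frames, true_distance):
    """coef = correct flat surface distance / correction surface distance.

    flat_frames: frames of a flat surface at the known true_distance.
    Pixels without any valid reading get a neutral coefficient of 1.
    """
    correction_surface = average_depth(flat_frames)
    coef = np.ones_like(correction_surface)
    ok = correction_surface > 0
    coef[ok] = true_distance / correction_surface[ok]
    return coef

def apply_correction(depth, coef):
    """vertexdistance = vertexdistance * coef, applied per pixel."""
    return np.asarray(depth, dtype=float) * coef
```

Measuring `coef` at several distances, as suggested, would give one coefficient image per distance; blending between the two nearest ones for a given depth value would be one possible way to use them.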


out of curiosity

could you scan a face - I am interested to see the detail this scanner could capture.

I’m definitely interested. Can it be used to scan big objects like buildings, using track points?


you mean using it similarly to measured photography, to sample dimensions and proportions?

I gave it a try and that’s how it looks:

@Eclectiel & pildanovak:
Point cloud remeshing of multiple combined scans is much more complex than reconstructing the geometry of only one scan. I think it is possible, but that would need a much longer R&D period that I probably can’t provide for the moment. You can of course try the various advanced remeshing methods from MeshLab to combine multiple scans. But I have no idea whether this can improve details for parts of the mesh in general.

The scannable distance range is 0.60 to 10 meters. Anything closer than that will be cut off, and anything farther away is either very, very inaccurate or cut off too, because the light patterns can’t be recognized by the IR sensor anymore. It’s OK for interiors, but whole buildings from outside are too big for the small and low-powered device. I’ve tested it all, even at night and with a snow-covered landscape, which I expected to be more reflective, but there was no noticeable improvement for me.

Kai kostack

10 meters is enough, as long as you can use multiple scanning basepoints. I don’t think you’d have to remesh multiple scans, just join the scans together. However, identifying and recording those scanning points could be difficult for this kind of device. Maybe it can be combined with one of those cheap one-shot lasers used for building and as-built measurements, just for tracking basepoints.

Anyway, it is a very cool project, and I’m sure you will find help to develop it. A pro laser 3D scanner for as-built projects is a very expensive toy.

Kai, neat project! You may find some helpful info here on combining multiple scans (you’ve likely seen this already, tho.)

I can envision setting up a mall booth where grandparents can bring their grandkids in, get them scanned, come back in an hour for a 3D-printed model / portrait of their loved one to be treasured through the years… the makings of a big bux franchise! (Remember, you saw it here first!.. ©, ®, ™, etc… :smiley: )

Hi Kai

could the detail level be increased? The head misses some important details.
But I do not know if the Kinect scanner can reach or needs that resolution when
it is basically only a motion capture device.

Highly interesting.

This might be useful:

Hello, here is a new video of the kinect-3d-slam application, which accumulates multiple frames of a moving Kinect with the method I proposed:

The software can be found here:

The software doesn’t perform any kind of merging of the scans, but it’s realtime and could be a good help for future development.