Why not code a Machine Learning Remesher?

How about creating an ML remesher that can be trained by feeding it some human meshes with good topology, so that it can learn from those meshes? I think this kind of remesher would do well because it can only get better with time. What do you guys think about this? (PS: I can’t code, but I think this would work well.)

That sounds like a good idea. I think deep neural networks have not been applied to everything yet.
Who knows, one day I will have a device at home telling me the easiest way to get enough money into my bank account. :grinning:

If a neural network can denoise, why not remesh?
The only thing is that remeshing can have more than one valid solution.

We did a thread on this a month or so ago:

It would require a very good data set. The model would be very complicated to implement, but it would definitely be possible to create.

Overall, training an ML model would need A LOT of data to be useful. That would be the biggest obstacle.

@DeepBlender you are the only person I know of here with machine learning skills, so let me ask you: as a rough estimate, how difficult do you think it would be to achieve something like that?

Machine learning requires there to be a “correct” model to train against. There is no “correct” topology, and even if you took the topologies from a million production-quality human models, for instance, you’d likely get a million “correct” answers. You can train to resolve individual patches with varying boundary definitions, and this research has already been done:

https://igl.ethz.ch/projects/ddq/

but in my opinion, you could dump all of the money in the world into researching how to machine generate the “right” topology, and still only please 10% of use cases very specific to the training set, and even less pleasing to anyone creating models with a different purpose in mind. Remeshing without guiding the algorithm in some way is a wasted effort in all but the simplest of cases. It’s like running ZRemesher on a human sculpt. Sure, you might get some nice cylindrical topology down the arms and legs, but who’s to say that’s even what you want? We’re a long way off, even in academia, from a situation where artists in the professional field are going to choose an automatic remesher over something like R3DS Wrap for important work.

It probably is a ways off, but not too far. I can envision a model that is first trained to assign probabilities to the verts as to which body part(s) they belong to, or are closest to. The more traditional remesh algorithm would use that as a guide for loops and poles.
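
A minimal sketch of that first stage, assuming the trained model outputs one raw score per body part for each vertex. The label set `PARTS` and the score values are made up for illustration; only the softmax step that turns scores into probabilities is standard.

```python
import math

PARTS = ["head", "torso", "arm", "leg"]  # hypothetical label set

def softmax(scores):
    """Turn raw per-part scores for one vertex into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_vertices(per_vertex_scores):
    """Return (most likely part, probabilities) for every vertex."""
    labelled = []
    for scores in per_vertex_scores:
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        labelled.append((PARTS[best], probs))
    return labelled

# Made-up scores standing in for the output of a trained network.
scores = [[2.0, 0.1, 0.1, 0.1], [0.2, 0.3, 1.5, 1.4]]
labels = label_vertices(scores)
```

The second vertex gets nearly equal probability for "arm" and "leg", which is the kind of soft boundary information a conventional remesher could use when deciding where loops and poles go.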

It is a good point that there is never some absolute “correct” topology, making training targets tough or impossible to come by for a full end to end model.

The dev of ZRemesher was supposed to release an addon for Blender, but the thread dates from January 2019 and so far he doesn’t seem to be replying in his thread on Polycount here: https://polycount.com/discussion/208030/quadremesher-new-auto-retopo-plugin-for-maya-3dsmax#latest

This addon is not going to be free, I think, and I suppose that since he is also developing it for Max and Maya, he will prioritize those, since he will get a pile of gold!

TLDR: Dealing with topologies like meshes is not yet well enough researched in machine learning that we could take solutions from comparable problems and expect to get something meaningful. Due to the increased amount of research with similar problems (graph neural/convolutional networks), it may become a viable problem within the next few years.

There are two main difficulties from my point of view:

  • For every machine learning problem, we have to find a data representation which is usable for a machine learning model. When it comes to images, they can usually be used quite directly. When we are talking about a mesh, we first need to investigate how the information is best represented, such that we can deal with an arbitrary number of vertices, edges and faces. An unfortunate consequence of this is that a “simple” neural network would most likely no longer be sufficient. A recurrent neural network would probably be required, and those are a lot more complicated to deal with.
  • The second difficulty is the problem which @m9105826 already described. There is no absolute best answer and we don’t know how to reliably detect reasonable topologies from an artistic point of view. Nevertheless, we can compute how closely two meshes match each other. In fact there are plenty of ways to do that and we would need to figure out which ones are (most) suitable. Certainly, none of them is simple. That would clearly be a starting point.
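
As one example of such a measure, here is a minimal sketch of the symmetric Hausdorff distance between two meshes, treated as plain sets of vertex positions. That simplification is an assumption of the sketch: a serious comparison would sample points across the surfaces rather than use vertices only, and the example data is made up.

```python
import math

def nearest_distance(point, points):
    """Distance from one point to the closest point of a set."""
    return min(math.dist(point, q) for q in points)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the worst-case nearest-neighbour gap."""
    a_to_b = max(nearest_distance(p, b) for p in a)
    b_to_a = max(nearest_distance(q, a) for q in b)
    return max(a_to_b, b_to_a)

# A unit square and a copy of it lifted by 0.5 along z.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
lifted = [(x, y, z + 0.5) for x, y, z in square]
gap = hausdorff(square, lifted)
```

This measures only how far apart the shapes are. It says nothing about edge flow or pole placement, which is the part that is hard to score.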

This topic is quite far away from being solved and practically usable (with machine learning), even for the simplest case. We don’t even know whether we could come up with a machine learning approach which is better than conventional solutions.

How I would try to tackle this problem
Nevertheless, it is very interesting and I believe it is becoming viable to have a look at the static mesh case. As a starting point, we would need something that produces somewhat reasonable results. Once that is achieved, it could be iteratively improved and it could be made more general.
As training data, I would likely take a few hundred objects, subdivide them, add a little noise and apply a decimate modifier. That’s just to ensure we are not only dealing with perfectly clean mesh data.
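
A minimal sketch of the first two steps on a triangle mesh, assuming simple midpoint subdivision as a stand-in for whatever subdivision would really be used. The decimation step is left out; in Blender it would be handled by the Decimate modifier. The function names and the noise level are illustrative only.

```python
import random

def subdivide(vertices, faces):
    """Split every triangle into four by inserting edge midpoints."""
    vertices = list(vertices)
    midpoint_of = {}  # edge (i, j) with i < j -> index of its midpoint vertex

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_of:
            a, b = vertices[i], vertices[j]
            vertices.append(tuple((x + y) / 2 for x, y in zip(a, b)))
            midpoint_of[key] = len(vertices) - 1
        return midpoint_of[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_faces

def add_noise(vertices, sigma, seed=0):
    """Jitter every coordinate with Gaussian noise of standard deviation sigma."""
    rng = random.Random(seed)
    return [tuple(x + rng.gauss(0.0, sigma) for x in v) for v in vertices]

# A tetrahedron as a tiny stand-in for a real training object.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 1, 2), (0, 3, 1), (0, 2, 3), (1, 3, 2)]
dense_vertices, dense_faces = subdivide(tetra_vertices, tetra_faces)
noisy_vertices = add_noise(dense_vertices, sigma=0.01)
```

Shared edges get a single midpoint vertex, so the subdivided mesh stays connected instead of falling apart into separate triangles.
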
As an initial task, I would train it to predict a topology with just half the number of vertices. That’s just because we need a starting point. It would be naive to start with more difficult tasks, because first, we need something that somewhat works. We need some sort of prototype or proof of concept, just to know whether the problem is solvable at all.
In the past one or two years, there has been more research around graph neural networks which are likely the best way to represent meshes in neural networks. I am confident that there are a few variants which might fit for this problem.
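
To give a feel for why graph networks suit meshes, here is a minimal sketch of one message passing step over mesh vertices. It assumes a very reduced layer with two scalar weights, `w_self` and `w_nbr`; real graph networks learn weight matrices, and this is not any specific published architecture.

```python
def vertex_neighbors(num_vertices, faces):
    """Collect, for every vertex, the vertices it shares an edge with."""
    neighbors = [set() for _ in range(num_vertices)]
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return neighbors

def message_passing_step(features, neighbors, w_self, w_nbr):
    """One layer: new_h = relu(w_self * h + w_nbr * mean of neighbour h)."""
    updated = []
    for i, h in enumerate(features):
        nbrs = neighbors[i]
        if nbrs:
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(len(h))]
        else:
            mean = [0.0] * len(h)  # isolated vertex: nothing to aggregate
        updated.append([max(0.0, w_self * x + w_nbr * m) for x, m in zip(h, mean)])
    return updated

# Two triangles sharing an edge; vertex positions double as input features.
faces = [(0, 1, 2), (1, 3, 2)]
features = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
neighbors = vertex_neighbors(len(features), faces)
hidden = message_passing_step(features, neighbors, w_self=0.5, w_nbr=0.5)
```

The same two weights are applied at every vertex regardless of how many neighbours it has, so the layer works on meshes of any size and connectivity.
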
Next, we would need to figure out what the output of the neural network should be. Does it need to directly predict vertex coordinates or does it learn better when it can predict locations on triangles from the input mesh? Edges and faces are likely not that complicated (but I may as well underestimate it!).
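
A minimal sketch of the second option, assuming the network outputs a face index plus two barycentric weights `u` and `v` for each new vertex. That output format is an assumption made for illustration, not an established design.

```python
def point_on_triangle(a, b, c, u, v):
    """Map barycentric weights (u, v, 1 - u - v) to a 3D point on triangle abc."""
    w = 1.0 - u - v
    if min(u, v, w) < 0.0:
        raise ValueError("weights must describe a point inside the triangle")
    return tuple(u * pa + v * pb + w * pc for pa, pb, pc in zip(a, b, c))

def decode_prediction(vertices, faces, face_index, u, v):
    """Turn a (face, u, v) prediction into a position on the input mesh."""
    ia, ib, ic = faces[face_index]
    return point_on_triangle(vertices[ia], vertices[ib], vertices[ic], u, v)

# One input triangle and one hypothetical prediction.
vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
faces = [(0, 1, 2)]
new_vertex = decode_prediction(vertices, faces, face_index=0, u=0.25, v=0.25)
```

Predicting this way keeps every new vertex on the input surface by construction, whereas free 3D coordinates can drift away from it.
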
Now we need a way to calculate the difference between the input mesh and the predicted mesh, such that the neural network has something to learn.
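
One common candidate is the Chamfer distance, sketched below on vertex positions only. Using it here is an assumption of the sketch; a real training loss would more likely be computed on points sampled from the surfaces, inside a framework that can differentiate it.

```python
import math

def chamfer(a, b):
    """Symmetric Chamfer distance: mean squared nearest-neighbour gap, both ways."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)

# The input mesh vertices and a hypothetical prediction offset by 0.5 along z.
input_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
predicted_points = [(x, y, z + 0.5) for x, y, z in input_points]
loss = chamfer(input_points, predicted_points)
```

The value drops to zero when the two point sets coincide, and it does not require the two meshes to have the same number of vertices.
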
With all that in place, we could finally start to train the neural network. It would likely take a few weeks or months of playing around with ideas before anything reasonable is produced (with toy examples!). That would now be the starting point to improve every single aspect, such that we could slowly approach a solution which might be practically viable.

This process may sound incredibly frustrating and demotivating. In this case, it is mostly because a solution first needs to be found which can be used as a starting point. But even if someone figured that out, getting those kinds of neural networks to work is quite a challenge. Usually, you hear about the deep learning/machine learning hype and how computers can do almost everything. What you basically never hear about is the amount of frustration and hard work related to it.

Indeed, the reason that ML image processing has been so successful is that the Monte Carlo algorithm eventually converges to something very close to correct, and it produces traceable and repeatable data. So you can teach a machine that if X bundle of pixels has data like this at 1/4/8 samples per pixel, and has data like this at 1024/2048/4096 samples per pixel, then this other image generated with a Monte Carlo algorithm probably follows similar rules. Train that on a dozen movies’ worth of frames and you get some pretty reasonable results.
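
A toy illustration of that convergence, assuming a "pixel" whose true value is 0.5 and whose samples are uniform random numbers. This is not a renderer, only the averaging behaviour that makes low and high sample counts a usable training pair.

```python
import random

def render_pixel(samples, rng):
    """Toy 'render': average noisy samples whose true mean is 0.5."""
    return sum(rng.random() for _ in range(samples)) / samples

def mean_abs_error(samples, pixels=200, seed=0):
    """Average error against the known true value over many toy pixels."""
    rng = random.Random(seed)
    return sum(abs(render_pixel(samples, rng) - 0.5) for _ in range(pixels)) / pixels

error_low = mean_abs_error(4)      # noisy input, like a 4 spp render
error_high = mean_abs_error(1024)  # near-converged target, like a 1024 spp render
```

The error at 1024 samples comes out far smaller than at 4 samples, so the high sample image can serve as a known good target for the noisy one. Remeshing has no equivalent of simply running the algorithm longer to obtain the target.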

You are absolutely right about the denoising topic. In general, it is not necessary for machine learning that the input is already very close to the ground truth. And you don’t even need to have a unique ground truth.
To give some examples, there are plenty of ways to achieve style transfer, where you have an image and a style image, with the goal of preserving the content from the image but rendering it in the style of the style image. It works remarkably well and is, in my opinion, close to becoming practically viable. The produced image is usually quite far away from both the image and the style image, but it certainly resembles them. The resulting image is also not the only possible solution, just as it would not be for a hypothetical retopo neural network.
In machine learning, there are plenty of techniques to train neural networks without having an absolute true result. The reason why I didn’t mention any of those is that they are more difficult to train. In such a situation, it is necessary to first know you have a solid basis that is capable of solving the task at hand. Without that, it doesn’t make sense to try fancy stuff.

There is some recent work that defines CNN variants specifically for triangular meshes, such as MeshCNN or MeshNet. Maybe digging into such model architectures can help solve the retopology problem.

I just read through the MeshCNN paper. It defines the pooling operator which does edge collapse related to the task. It also defines the unpooling operator accordingly. It seems that we can reframe mesh retopology as a generative task or a style transfer task which does feature matching in some intermediate pooling layers.
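
A minimal sketch of the geometric operation behind that pooling, a single edge collapse on a triangle mesh. It is not MeshCNN’s implementation: it ignores the learned features and how the edge to collapse is chosen, and placing the merged vertex at the edge midpoint is an assumption of the sketch.

```python
def collapse_edge(vertices, faces, i, j):
    """Merge vertex j into vertex i, moving i to the midpoint of the edge."""
    vertices = list(vertices)
    vertices[i] = tuple((a + b) / 2 for a, b in zip(vertices[i], vertices[j]))
    new_faces = []
    for face in faces:
        face = tuple(i if v == j else v for v in face)
        if len(set(face)) == 3:  # drop triangles that became degenerate
            new_faces.append(face)
    # Vertex j stays in the list but is no longer referenced by any face.
    return vertices, new_faces

# A small strip of three triangles.
strip_vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0),
                  (3.0, 1.0, 0.0), (2.0, 2.0, 0.0)]
strip_faces = [(0, 1, 2), (1, 3, 2), (2, 3, 4)]
pooled_vertices, pooled_faces = collapse_edge(strip_vertices, strip_faces, 0, 1)
```

Each collapse removes one vertex from use along with the triangles that contained the edge, so repeating it is a way to step a mesh down to a lower resolution.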

Those are definitely interesting papers. But let’s put them into perspective right away for everyone without an in-depth understanding: those papers still belong in the basic research category and are quite far away from being practically relevant.

There seems to be quite some potential in the MeshCNN paper when it comes to retopology. Do you think that, with lots of tweaking of those ideas, it could come close to hand-crafted solutions? How far do you think the authors went to optimize it?

Just want to place it here

(text adaptation for Russian language: media-xyz com /ru/ articles/1643-kak-rabotaet-instrument-dlia-protsedurnoi-gen)

Also check out Nanite in UE5. Right now I can only suggest retopoflow + z or quadremesher, or quadremesher from 3dsMax.

UPD:

Huh, it seems to me that this tool is already here, and for free

https://www.simplygon.com/

wow, I need to check it further

Lots of detailed, thoughtful answers already.

I am no technical expert on any of this, but what seems clear to me is that auto topo and remeshing are going to be worked on and will just keep getting better and better. There is a clear desire for it and it feels like the obvious future. Painstaking manual retopo is such a drain on time and resources in 3D asset production.

A good blend between auto retopo and manual, artist-chosen edge loop guides for deforming organic animated assets seems the ideal. Of course this is already quite far along for organic assets, and mesh morphing is now also widely used for assets that share similar topologies, combined with auto rigging systems. Could this be a case for creative, artist-driven, guided machine learning to speed things up even more?

But the next really big breakthroughs will need to come with hard surface retopo. For hard surface, the non-destructive boolean and bevel driven workflows, as seen in Blender with Hard Ops and Fluent for instance, just have so much to offer in speed, ease of use and simple enjoyment of working. So many of us now also do hard surface modelling in sculpting-based apps and workflows, with ZBrush and 3D Coat using similar boolean cut and paste workflows, and use hybrid approaches between poly modelling and full high-resolution sculpting. It’s just a much more natural, flexible and relaxed way of working for most creatives than traditional quad-based and loop cut box modelling.

I wonder if the boolean, bevel and cutting workflow, which has been there since the early days but was always seen as the quick and dirty shortcut compared to clean quad-based box modelling for production, might soon come to dominate as auto topo methods improve. Perhaps in a similar way to how simple polygon quads supplanted NURBS patch modelling in 3D animation production once subdivision systems had sufficiently matured. There is quite a way to go yet, so I don’t want to get too carried away with broad predictions.

I mean all this in the context of art-based media and entertainment rather than structural engineering and industrial design. Hope I am not going too far off topic into broader hypotheticals.

Yes, things beyond our imagination can be done by smart people and good technology