
How to generate 3D assets with a higher number of faces? #58

Open
supersyz opened this issue Dec 12, 2024 · 20 comments

@supersyz

Hi, I really appreciate your great open-source work!
I notice the output objects have a small number of faces. How can I generate 3D assets with more faces?

@kitcheng

In the example.py and app.py files, search for the term "simplify" and set it to a fixed value of 0.

This way, it will not reduce the number of faces.

[Screenshot of the code to change.]
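For reference, the call being changed looks roughly like this (a sketch based on the repo README's usage of postprocessing_utils.to_glb; names and defaults may differ slightly in your checkout):

```python
from trellis.utils import postprocessing_utils

# 'outputs' is the dict returned by pipeline.run(...).
# simplify is the ratio of triangles removed during decimation;
# 0.0 skips simplification and keeps every face of the extracted mesh.
glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0],  # Gaussian (appearance) output
    outputs['mesh'][0],      # extracted mesh
    simplify=0.0,
    texture_size=1024,
)
glb.export("sample.glb")
```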

@JackDainzh

Better to just edit the Simplify Gradio slider in app.py so it can reach the desired minimum.
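In app.py that amounts to lowering the slider's minimum, something like this (a sketch; the exact variable name, label, and defaults in app.py may differ):

```python
import gradio as gr

# Let the UI turn simplification off entirely by extending the
# slider's minimum down to 0.0.
mesh_simplify = gr.Slider(0.0, 0.98, label="Simplify", value=0.95, step=0.01)
```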

@visualbruno

I changed app.py: added a display of the number of vertices, increased the sampling steps to 100, changed the mesh-simplification slider range to 0 to 0.98, and increased the max texture size to 4096.

[Screenshot of the modified UI.]

Download my app.py file here
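Reconstructed from the description above, the widened control definitions would look something like this (illustrative values, not the actual file):

```python
import gradio as gr

# Hypothetical ranges matching the changes described above.
ss_sampling_steps = gr.Slider(1, 100, label="Sampling Steps", value=100, step=1)
mesh_simplify = gr.Slider(0.0, 0.98, label="Simplify", value=0.95, step=0.01)
texture_size = gr.Slider(512, 4096, label="Texture Size", value=1024, step=512)
```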

@realisticdreamer114514

realisticdreamer114514 commented Dec 14, 2024

@visualbruno what guidance strengths are the best for details/fidelity in the 2 stages? Do we really need 100 steps for this level of detail?

@visualbruno

> @visualbruno what guidance strengths are the best for details/fidelity in the 2 stages? Do we really need 100 steps for this level of detail?

Hi. I'm not sure whether 100 steps is better than 50; sometimes 20 is not good enough.
With the guidance strength at 10 in both stages, it follows the image more closely.

Check the screenshots:
1st screenshot: guidance of 0
2nd screenshot: guidance of 10

In the 2nd screenshot, the fidelity is very good.

[Screenshots: guidance strength of 0 vs. guidance strength of 10.]

@cjjkoko

cjjkoko commented Dec 15, 2024

> Hi. I'm not sure whether 100 steps is better than 50; sometimes 20 is not good enough. With the guidance strength at 10 in both stages, it follows the image more closely. […]

My params:

simplify: 0.7
texture_size: 2048
seed=20,
sparse_structure_sampler_params={
    "steps": 100,
    "cfg_strength": 7.5,
},
slat_sampler_params={
    "steps": 100,
    "cfg_strength": 3.5,
},

It works fine, but I'm still working on better parameters, including modifying some that aren't exposed.
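Assembled into a complete call, those parameters would look like this (a sketch following the pipeline usage in the repo README; file paths are placeholders):

```python
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import postprocessing_utils

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

image = Image.open("input.png")
outputs = pipeline.run(
    image,
    seed=20,
    # Stage 1: sparse structure sampler
    sparse_structure_sampler_params={"steps": 100, "cfg_strength": 7.5},
    # Stage 2: structured latent (SLAT) sampler
    slat_sampler_params={"steps": 100, "cfg_strength": 3.5},
)

glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0], outputs['mesh'][0],
    simplify=0.7, texture_size=2048,
)
glb.export("sample.glb")
```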

@visualbruno

I think the biggest issue is the input picture resolution, which is resized to 518x518 in trellis_image_to_3d.py.
I tested with a higher resolution like 2058x2058, but the result was horrible. They probably trained the model at this low resolution.
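For anyone looking for the spot, the preprocessing in trellis_image_to_3d.py scales the (cropped) input to a fixed size before it reaches the image encoder, roughly like this (a sketch; the surrounding code differs):

```python
from PIL import Image

image = Image.open("input.png").convert("RGB")
# This hard-coded 518x518 is the value to change when experimenting
# with other input resolutions.
image = image.resize((518, 518), Image.Resampling.LANCZOS)
```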

@cjjkoko

cjjkoko commented Dec 16, 2024

> I think the biggest issue is the input picture resolution, which is resized to 518x518 in trellis_image_to_3d.py. […]

[Two screenshots.] Maybe postprocessing will give a great result.

@realisticdreamer114514

@cjjkoko What kind of postprocessing do you use?

@cjjkoko

cjjkoko commented Dec 16, 2024

> @cjjkoko What kind of postprocessing do you use?

trimesh and open3d, using Laplacian smoothing.
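A minimal sketch of that kind of Laplacian post-smoothing, assuming the mesh was already exported to disk (file names and iteration counts are illustrative):

```python
import trimesh
import open3d as o3d

# Option 1: trimesh's in-place Laplacian filter.
mesh = trimesh.load("output.glb", force="mesh")
trimesh.smoothing.filter_laplacian(mesh, iterations=10)
mesh.export("output_smoothed.glb")

# Option 2: open3d's Laplacian smoothing.
o3d_mesh = o3d.io.read_triangle_mesh("output.obj")
o3d_mesh = o3d_mesh.filter_smooth_laplacian(number_of_iterations=10)
o3d.io.write_triangle_mesh("output_smoothed.obj", o3d_mesh)
```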

@realisticdreamer114514

realisticdreamer114514 commented Dec 16, 2024

> trimesh and open3d

These only smooth the output meshes; they don't improve quality during generation.

You can see that even with a high-resolution input image and a high texture size, e.g. 4096, much of the detail in the final meshes' texture map is lost when it should be preserved (I tested with some character cosplay photos and found this). It might be better if someone with the GPU power trained/finetuned the I23D model to work at an input resolution of, say, 770^2 or 1036^2, since, as visualbruno points out, the pipeline is set to the resolution the official model was trained on (518^2), and this downsizing might explain the detail loss.

@cjjkoko

cjjkoko commented Dec 16, 2024

> These only smooth the output meshes; they don't improve quality during generation. […]

Yes, but there is no specific date for the training.

@realisticdreamer114514

> there is no specific date for the training

Even at the current default resolution, the official I23D checkpoint seems quite undertrained (not sure if that's the right way to put it), so it doesn't adhere to the input image closely enough and tends to distort details that are still clear after downscaling. Finetuning on this framework can't come soon enough...

@cjjkoko

cjjkoko commented Dec 18, 2024

> Even at the current default resolution, the official I23D checkpoint seems quite undertrained, so it doesn't adhere to the input image closely enough and tends to distort details. […]

Hmm, expect big breaking updates in the next release. At this point, you can only adjust the seed to fit each image; I'm currently doing that, and it's very painful.

@visualbruno

I played with many parameters: the input image resizing, the number of sampling steps, the texture resolution, and the "number of views" used in postprocessing. So I modified all the main scripts and app.py to experiment with them.

The best result I got is with:

  • Input image resized to 770, instead of 518.
  • Number of sampling steps: 500
  • Texture resolution: 2048
  • Postprocessing "number of views": 120, instead of 100 (it slightly reduces artifacts on the texture)

Of course, with these values it takes much more time to generate the model.
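Roughly how this recipe maps onto the code (a sketch following the README's pipeline usage; the 518-to-770 resize and the postprocessing view count are hard-coded in trellis_image_to_3d.py and postprocessing_utils.py, so those two changes are edits to the source files, not arguments; the cfg values below are the repo defaults, which this comment doesn't specify):

```python
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import postprocessing_utils

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

# Assumes trellis_image_to_3d.py was edited to resize to 770x770.
image = Image.open("input.png")
outputs = pipeline.run(
    image,
    sparse_structure_sampler_params={"steps": 500, "cfg_strength": 7.5},
    slat_sampler_params={"steps": 500, "cfg_strength": 3.0},
)

# Assumes postprocessing_utils.py was edited to bake the texture from
# 120 rendered views instead of 100.
glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0], outputs['mesh'][0],
    simplify=0.0, texture_size=2048,
)
glb.export("sample.glb")
```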

I tried increasing the input image resolution to 1036 and above, but the results were worse.
I tried 800 and 1000 sampling steps, but that did not improve the result either.
A texture resolution of 4096 is not better than 2048.
I tried 200 for the postprocessing "number of views"; it did not improve the texture much, and the rendering time was multiplied by 10.

I tested with 2D anime pictures and they never render very well, probably because this kind of picture is flat and lacks relief and depth.

With Merlin from Seven Deadly Sins, input picture:
[image]
Result:
[image]

With Cleopatra, input picture:
[image]
Result:
[image]

With Knight, input picture:
[image]
Result:
[image]

@QuantumLight0

> I played with many parameters: the input image resizing, the number of sampling steps, the texture resolution, and the "number of views" used in postprocessing. […]

I believe the model is undertrained for anime models. Anime models have flat normals, so no depth; I believe a model needs to be trained on anime characters' faces in order to understand them. I do believe, however, that the multidiffusion mode has potential in this area, and I will provide a sample of why.
https://github.com/user-attachments/assets/bd2bf57b-0c55-42ad-b28c-2f8b2e5d84f9
[three images]

@visualbruno

@QuantumLight0 What parameters did you use to generate this model?

@QuantumLight0

> @QuantumLight0 What parameters did you use to generate this model?

I set everything to max.

[Screenshot of the settings.]

@visualbruno

@QuantumLight0 I hadn't seen that they updated the repository with the multi-image algorithm. I will play with it.
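For reference, the multi-image entry point looks roughly like this (a sketch based on the repo's multi-image example; check example_multi_image.py for the authoritative version, including whether the 'multidiffusion' mode name matches your checkout):

```python
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

# Several views of the same object condition a single generation.
images = [Image.open(p) for p in ["view1.png", "view2.png", "view3.png"]]
outputs = pipeline.run_multi_image(
    images,
    seed=1,
    sparse_structure_sampler_params={"steps": 12, "cfg_strength": 7.5},
    slat_sampler_params={"steps": 12, "cfg_strength": 3.0},
    mode="multidiffusion",  # blends per-view predictions during denoising
)
```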

@cjjkoko

cjjkoko commented Dec 25, 2024

Any new breakthroughs?

@YuDeng added the good first issue (Good for newcomers) label Dec 25, 2024