schirrmacher committed · verified
Commit 9fb8f60 · 1 Parent(s): 15086d1

Upload ./README.md with huggingface_hub

Files changed (1): README.md (+5 −27)
README.md CHANGED
@@ -29,15 +29,9 @@ python utils/inference.py
 
 ## Training
 
-The model was trained with the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans).
-
-After 10.000 iterations with a single NVIDIA GeForce RTX 4090, the following results were achieved:
-
-- Training Time: 8 hours
-- Training Loss: 0.1179
-- Validation Loss: 0.1284
-- Maximum F1 Score: 0.9928
-- Mean Absolute Error: 0.005
+The model was trained with the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), which was created with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and [IC-Light](https://github.com/lllyasviel/IC-Light).
+
+The model was trained for 10.000 iterations on an NVIDIA GeForce RTX 4090.
 
 Output model: `/models/ormbg.pth`.
 
@@ -81,25 +75,9 @@ python utils/pth_to_onnx.py
 
 # Research
 
-Synthetic datasets have limitations for achieving great segmentation results: artificial lighting, occlusion, scale, or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290). However, hybrid training approaches seem promising and can even improve segmentation results.
-
-## Ideas
-
-Currently I am researching how to close this gap with the resources I have.
-
-- Apply ControlNet with LayerDiffuse for creating segmented humans in nearly realistic environments
-  - ✅ very promising
-  - ❌ ControlNet occasionally adds hair and body parts and extends the segmented area, which might confuse the model
-- Apply LayerDiffuse for foreground and background creation
-  - ✅ easy to perform
-  - ❌ quality not optimal, only every 20th image usable (yet)
-- Consider pose estimations as additional training data, see [Cross-Domain Complementary Learning Using Pose for Multi-Person Part Segmentation (2019)](https://arxiv.org/pdf/1907.05193)
-- Create 3D models of humans
-  - ❌ hair styles, clothes, and environments are difficult to generate realistically, and it is a lot of work
-- Segment photos with commercial tools and use them for training
-  - ❌ costs; the model would never be better than competitors
-- Manually segment photos
-  - ❌ huge amount of work
+Synthetic datasets have limitations for achieving great segmentation results: artificial lighting, occlusion, scale, or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
+
+Currently I am researching how to close this gap. The latest approach is to create segmented humans with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and then apply [IC-Light](https://github.com/lllyasviel/IC-Light) for realistic lighting effects and shadows.
 
 ## Support
 
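
The metrics dropped from the Training section (maximum F1 score, mean absolute error) can be recomputed for any segmentation output. A minimal sketch with NumPy, assuming binary 0/1 masks (the helper names are illustrative, not from the repository's `utils/`):

```python
import numpy as np


def f1_score(pred: np.ndarray, target: np.ndarray) -> float:
    """F1 score for binary masks, where 1 marks foreground (the human)."""
    tp = np.sum((pred == 1) & (target == 1))
    fp = np.sum((pred == 1) & (target == 0))
    fn = np.sum((pred == 0) & (target == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def mean_absolute_error(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error between predicted and ground-truth masks."""
    return float(np.mean(np.abs(pred.astype(np.float64) - target.astype(np.float64))))


# Toy example: one false positive out of four pixels.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(round(f1_score(pred, target), 3))   # 0.667
print(mean_absolute_error(pred, target))  # 0.25
```

For soft (probability) masks, the F1 score is usually computed after thresholding, while the MAE can be taken directly on the raw predictions.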