Error: returned non-zero exit status 1. Something went wrong #2956

Open
WilliamLehmus opened this issue Nov 29, 2024 · 6 comments
Comments

@WilliamLehmus

This error happens whenever I try to start the training.
I've been trying to work out what's going wrong, but I can't figure it out on my own.
It seems to be coming from the library and its dependencies. I've tried every combination of settings, and the error persists
even with the default settings.

Training the UNet...
'########:'########:::::'###::::'####:'##::: ##:'####:'##::: ##::'######:::
... ##..:: ##.... ##:::'## ##:::. ##:: ###:: ##:. ##:: ###:: ##:'##... ##::
::: ##:::: ##:::: ##::'##:. ##::: ##:: ####: ##:: ##:: ####: ##: ##:::..:::
::: ##:::: ########::'##:::. ##:: ##:: ## ## ##:: ##:: ## ## ##: ##::'####:
::: ##:::: ##.. ##::: #########:: ##:: ##. ####:: ##:: ##. ####: ##::: ##::
::: ##:::: ##::. ##:: ##.... ##:: ##:: ##:. ###:: ##:: ##:. ###: ##::: ##::
::: ##:::: ##:::. ##: ##:::: ##:'####: ##::. ##:'####: ##::. ##:. ######:::
:::..:::::..:::::..::..:::::..::....::..::::..::....::..::::..:::......::::

Traceback (most recent call last):
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 803, in <module>
    main()
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 640, in main
    accelerator.init_trackers("dreambooth", config=vars(args))
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 1067, in init_trackers
    tracker.store_init_configuration(config)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 152, in store_init_configuration
    self.writer.add_hparams(values, metric_dict={})
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/writer.py", line 330, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict, hparam_domain_discrete)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/summary.py", line 270, in hparams
    ssi.hparams[k].string_value = v
  File "/usr/local/lib/python3.10/dist-packages/google/protobuf/internal/containers.py", line 70, in __getitem__
    return self._values[key]
TypeError: list indices must be integers or slices, not str
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/content/diffusers/examples/dreambooth/train_dreambooth.py', '--external_captions', '--image_captions_filename', '--train_only_unet', '--save_starting_step=500', '--save_n_steps=0', '--Session_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/bs-johannes', '--pretrained_model_name_or_path=/content/stable-diffusion-custom', '--instance_data_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/bs-johannes/instance_images', '--output_dir=/content/models/bs-johannes', '--captions_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/bs-johannes/captions', '--instance_prompt=', '--seed=858353', '--resolution=512', '--mixed_precision=fp16', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--use_8bit_adam', '--learning_rate=2e-06', '--lr_scheduler=linear', '--lr_warmup_steps=0', '--max_train_steps=1500']' returned non-zero exit status 1.
Something went wrong
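
Editor's note: the first traceback ends in `return self._values[key]` with "list indices must be integers or slices, not str", which suggests that `_values` is a plain Python list being indexed with a string key. The sketch below only illustrates how that message arises. `RepeatedContainerSketch` and `try_lookup` are hypothetical names invented for this illustration; they are not protobuf's real classes, and the sketch does not identify the root cause.

```python
class RepeatedContainerSketch:
    """Hypothetical list-backed container; NOT protobuf's real implementation."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, key):
        # Same expression as the failing line in the traceback.
        return self._values[key]


def try_lookup(container, key):
    """Return the item, or the error text if the lookup raises TypeError."""
    try:
        return container[key]
    except TypeError as exc:
        return f"TypeError: {exc}"


hparams = RepeatedContainerSketch(["a", "b"])
print(try_lookup(hparams, 0))                # integer index works
print(try_lookup(hparams, "learning_rate"))  # string key fails like the traceback
```

An integer index succeeds, while a string key produces the same `TypeError` text as the log, which is consistent with code that expects a mapping receiving a list-like container instead.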

@MatthysTruter

I keep getting the same error.

@ParkHangah

ParkHangah commented Dec 2, 2024

Me too. :(
I successfully completed a Dreambooth training run a month ago,
but now it runs smoothly until the very last step, where it unexpectedly fails with an error.
It is the same error as in this issue.
Since it's not an error in the Colab notebook file itself,
I have no choice but to rely on the developer ;ㅁ;
[Screenshot: Dreambooth Unet Error 2024-12-02 195746]

@ParkHangah

I have been researching this problem.
In the traceback ('most recent call last'), the frame where the error is raised is 'File "/usr/local/lib/python3.10/dist-packages/google/protobuf/internal/containers.py", line 70, in __getitem__
return self._values[key]'.

This file is part of Google’s Protocol Buffers (protobuf) library and serves as one of its internal modules.

Notably, this error did not occur a month ago.

I have asked Google to investigate whether any changes or updates made to this file or the library in the past month might have caused this issue.
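
Editor's note: to check whether a library changed between a working run and a failing one, it can help to record the installed versions in the notebook before training starts. The following is a minimal sketch using only the standard library. The package list is an assumption taken from the paths in the traceback (`protobuf`, `tensorboard`, `torch`, `accelerate`), and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Assumed list of distributions, based on the paths seen in the traceback.
PACKAGES = ("protobuf", "tensorboard", "torch", "accelerate")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output from a session that worked with one that fails would show which of these packages, if any, moved to a different version.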

@akshay88apps

  File "/usr/local/lib/python3.10/dist-packages/google/protobuf/internal/containers.py", line 70, in __getitem__
    return self._values[key]
TypeError: list indices must be integers or slices, not str
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/content/diffusers/examples/dreambooth/train_dreambooth.py', '--image_captions_filename', '--train_only_unet', '--save_starting_step=500', '--save_n_steps=500', '--Session_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/ghconfap', '--pretrained_model_name_or_path=/content/stable-diffusion-v2-512', '--instance_data_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/ghconfap/instance_images', '--output_dir=/content/models/ghconfap', '--captions_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/ghconfap/captions', '--instance_prompt=', '--seed=471428', '--resolution=512', '--mixed_precision=fp16', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--use_8bit_adam', '--learning_rate=2e-06', '--lr_scheduler=linear', '--lr_warmup_steps=0', '--max_train_steps=1500']' returned non-zero exit status 1.
Something went wrong

@Talionic

Talionic commented Dec 4, 2024

I'm having the same problem all of a sudden.
